What is the Plugable Thunderbolt 5 AI Enclosure (TBT5-AI)?
Product Owners | January 05, 2026
Overview
The Plugable Thunderbolt 5 AI Enclosure (TBT5-AI) combines the bandwidth of Thunderbolt 5, a bundled high-performance graphics processing unit (GPU), and a custom, unified open source software environment. That environment leverages Microsoft Foundry Local and Google’s MCP Toolbox for Databases to provide secure, private, on-premises local AI infrastructure.
Features and Specifications
The TBT5-AI enclosure hardware has the following features and specifications:
- Bundled graphics processing unit (GPU) from Nvidia, AMD or Intel (models vary depending on customer preference)
- 850W internal power supply, providing a maximum of 600W to the internal GPU
- 1x 80cm Thunderbolt 5 cable
- 1x upstream Thunderbolt 5 port that can provide up to 96W of USB Power Delivery to the host system
- 1x downstream Thunderbolt 5 port capable of providing up to 15W of USB Power Delivery for Thunderbolt or USB-C peripherals
- 1x downstream 10Gbps USB-C port providing up to 15W of USB Power Delivery for USB-C peripherals
- 3x downstream 10Gbps USB-A ports each providing up to 7.5W of power for USB-A peripherals
- 1x 2.5Gbps Ethernet network port
Capabilities
At its core, the TBT5-AI is an external PCI Express enclosure that adds a high-performance graphics processing unit (GPU) to a Windows 11 host computer that supports Thunderbolt 5 (or Thunderbolt 4, with reduced performance).
The 80Gbps of bandwidth provided by Thunderbolt 5, combined with a high-performance dedicated GPU from Nvidia, AMD, or Intel, makes it possible to host and operate local AI infrastructure that the host system could not otherwise support.
A custom-developed open source application, ‘Plugable Chat’, simplifies the use of Microsoft Foundry Local to run large language models (LLMs) on the TBT5-AI hardware, delivering excellent model performance.
Microsoft Foundry Local provides developers with a familiar, enterprise-grade interface for versioning and running models directly on the hardware. The benefits are threefold:
- A unified model catalog offering one-click deployment of models such as Phi, OpenAI GPT-OSS, and Mistral
- Privacy-first operation, ensuring that data never leaves the enclosure
- Optimized performance via ONNX Runtime GenAI, maximizing the throughput of the bundled GPU without the overhead of complex CUDA compilation chains
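As a minimal sketch of the interaction model, the snippet below builds the JSON body for an OpenAI-compatible chat completion request, which is the style of local REST API that Foundry Local serves. The endpoint URL, port, and model alias are illustrative assumptions, not documented TBT5-AI or Plugable Chat values.

```python
import json

# Foundry Local exposes an OpenAI-compatible REST API on localhost.
# NOTE: the port and model alias here are illustrative assumptions,
# not documented TBT5-AI defaults.
ENDPOINT = "http://localhost:5273/v1/chat/completions"


def build_chat_request(model: str, prompt: str) -> str:
    """Build the JSON body for an OpenAI-compatible chat completion call.

    Because the model runs on the enclosure's GPU, both the prompt and
    the generated response stay on the local machine.
    """
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return json.dumps(body)


if __name__ == "__main__":
    payload = build_chat_request("phi-3.5-mini", "Summarize this quarter's sales data.")
    print(payload)
```

Because the API surface is OpenAI-compatible, existing client libraries and tooling can typically point at the local endpoint with no code changes beyond the base URL.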
For local AI to be viable in an on-premises setting, it must be able to access business data without exposing it. The TBT5-AI and the Plugable Chat software leverage the Model Context Protocol (MCP), via Google’s MCP Toolbox for Databases, to bridge the gap between the AI model and internal data, providing the following capabilities:
- Private SQL Analysis: The system can securely query local PostgreSQL, Oracle, or SQL Server databases to generate insights, ensuring the raw data never moves.
- Local File Context: Using RAG (Retrieval Augmented Generation), users can securely analyze internal documents, such as PDFs and text files, that reside strictly on the local network.
- Zero-Trust Architecture: The MCP layer enforces strict, read-only permissions, ensuring the AI accesses only what it is explicitly permitted to see.
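To illustrate how such an MCP bridge is typically configured, the fragment below is a hypothetical `tools.yaml` in the style used by Google’s MCP Toolbox for Databases: it registers a local PostgreSQL source and exposes a single parameterized, read-only query as a tool. All names, credentials, and the table schema are illustrative, not part of the TBT5-AI product.

```yaml
# Hypothetical tools.yaml sketch for Google's MCP Toolbox for Databases.
# Source names, credentials, and the table schema are illustrative.
sources:
  local-pg:
    kind: postgres
    host: 127.0.0.1
    port: 5432
    database: sales_db
    user: readonly_user        # a read-only role enforces least privilege
    password: example-password
tools:
  monthly-revenue:
    kind: postgres-sql
    source: local-pg
    description: Return total revenue for a given month.
    parameters:
      - name: month
        type: string
        description: Month in YYYY-MM format.
    statement: |
      SELECT SUM(amount) AS revenue
      FROM orders
      WHERE to_char(created_at, 'YYYY-MM') = $1;
```

Because each tool exposes only a fixed, parameterized SQL statement, the model can request insights without ever receiving raw table access, which is how a read-only, least-privilege posture is enforced at the MCP layer.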
Related Articles
- Plugable Introduces TBT5-AI at CES: Secure, Local AI Powered by Thunderbolt 5
- Understanding RAG (Retrieval Augmented Generation) and MCP (Model Context Protocol)
- Introduction to Microsoft Foundry Local and Supported Models in Foundry Local
- Why Local AI? The Case for Running Large Language Models at Home or in the Office