Bring Your Own Agent
OpenAI-compatible · Intel TDX
BYOA · MCP · CrewAI · LangChain · OpenAI SDK

Bring Your Own Agent —
Run It Inside Intel TDX.

Point CrewAI, LangChain, OpenAI SDK, and MCP clients at api.voltagegpu.com/v1. Same code, sovereign infrastructure.

One base_url swap routes every prompt, tool call and embedding through Intel TDX hardware enclaves we operate in the EU. Provider-blind by design.

WHY BYOA MATTERS IN 2026

MCP became the agent standard

Model Context Protocol unlocks tool-calling across IDEs, desktops and CI. Confidential MCP servers seal that traffic in TDX.

CrewAI vertical agents exploded

Legal, finance, supply-chain crews need provider-blind LLMs. Drop-in base_url switch keeps your existing crew code intact.

92% privacy-driven AI infra

Buyers cite confidentiality as the #1 reason to leave hyperscalers. EU jurisdiction plus an Article 28 DPA clears the procurement hurdle.

Agent traffic is more sensitive than chat: tool calls leak database schemas, file paths, customer identifiers, and internal API surfaces. Hyperscaler endpoints expose that traffic to a foreign jurisdiction. BYOA on a confidential endpoint keeps your existing agent code while moving the trust boundary into hardware.
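To see why, consider the tool definition an agent attaches to every request. The sketch below is purely illustrative (none of these names come from a real system): the JSON schema alone reveals table names, column names and internal ID formats before the model has said a word.

```python
import json

# Illustrative tool definition. The schema an agent ships with every
# request exposes internal naming: tables, columns, ID conventions.
lookup_tool = {
    "type": "function",
    "function": {
        "name": "query_customer_ledger",
        "description": "Look up invoices in the internal billing database.",
        "parameters": {
            "type": "object",
            "properties": {
                "customer_id": {"type": "string", "description": "Internal CRM ID"},
                "table": {"type": "string", "enum": ["invoices", "credit_notes"]},
            },
            "required": ["customer_id"],
        },
    },
}

# Everything below travels to the inference endpoint in plaintext
# unless the endpoint itself is confidential.
print(json.dumps(lookup_tool, indent=2))
```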

OpenAI-compatible — drop in

Full API reference

Existing OpenAI SDK code works unchanged: switch base_url to https://api.voltagegpu.com/v1 and pass a VoltageGPU API key. Streaming, tool calls, structured outputs, JSON mode, embeddings — all OpenAI semantics preserved.

Python · OpenAI SDK · /v1/chat/completions
PYTHON
# Drop-in: same OpenAI SDK, sovereign endpoint
from openai import OpenAI

client = OpenAI(
    base_url="https://api.voltagegpu.com/v1",
    api_key="vg-...",  # https://app.voltagegpu.com/settings/api-keys
)

response = client.chat.completions.create(
    model="Qwen3-235B-A22B-Instruct-2507-TEE",
    messages=[
        {"role": "system", "content": "You are a regulated-industry assistant."},
        {"role": "user", "content": "Summarize this MSA section..."},
    ],
)

print(response.choices[0].message.content)

CrewAI integration

Private CrewAI guide

CrewAI's LLM wrapper accepts an OpenAI-compatible base URL. Construct one pointing at VoltageGPU and pass it to every agent in the crew. Your crews, tasks, planners and tools stay local; only token traffic crosses the enclave boundary.

Python · CrewAI · custom LLM
PYTHON
# CrewAI pointed at confidential VoltageGPU inference
from crewai import Agent, Crew, Task, LLM

confidential_llm = LLM(
    model="openai/Qwen3-235B-A22B-Instruct-2507-TEE",
    base_url="https://api.voltagegpu.com/v1",
    api_key="vg-...",
    temperature=0.2,
)

analyst = Agent(
    role="Senior Compliance Analyst",
    goal="Flag regulatory risks in vendor contracts",
    backstory="Trained on EU AI Act and DORA requirements.",
    llm=confidential_llm,
    allow_delegation=False,
)

review = Task(
    description="Review {contract} for Article 28 GDPR gaps.",
    expected_output="A bulleted list of clause-level findings.",
    agent=analyst,
)

crew = Crew(agents=[analyst], tasks=[review], verbose=True)
with open("msa.txt") as f:
    result = crew.kickoff(inputs={"contract": f.read()})

LangChain integration

All integrations

ChatOpenAI from langchain_openai exposes base_url and api_key. Wire it up once and every chain, agent, RAG pipeline, or LangGraph node runs against confidential inference.

Python · LangChain · ChatOpenAI
PYTHON
# LangChain ChatOpenAI -> TDX-sealed inference
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

llm = ChatOpenAI(
    model="Qwen3-32B-TEE",
    base_url="https://api.voltagegpu.com/v1",
    api_key="vg-...",
    temperature=0,
)

messages = [
    SystemMessage(content="You are a sovereign legal assistant."),
    HumanMessage(content="Identify auto-renewal clauses in this NDA..."),
]

reply = llm.invoke(messages)
print(reply.content)

MCP — MODEL CONTEXT PROTOCOL

Host MCP servers sealed in Intel TDX

Tool calls, resource reads, and prompt templates stay encrypted in CPU memory. Connect from Claude Desktop, Cursor, Continue, or any MCP-aware client.

MCP setup walkthrough
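As one illustration, clients that read a JSON config (Claude Desktop's claude_desktop_config.json, for instance) can reach a remote MCP server over stdio via the community mcp-remote bridge. The server URL below is a placeholder, not a published VoltageGPU endpoint:

```json
{
  "mcpServers": {
    "confidential-tools": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://mcp.example.com/sse"]
    }
  }
}
```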

PRICING — PER 1M TOKENS

BALANCED · 128K ctx

Qwen3-32B-TEE

Fast multilingual inference, ideal for high-volume agent loops and lightweight tools.

INPUT

$0.50

OUTPUT

$1.50

FLAGSHIP · 262K ctx

Qwen3-235B-A22B-Instruct-2507-TEE

Long-context reasoning. Recommended for contract review crews and document-heavy chains.

INPUT

$1.20

OUTPUT

$3.50

REASONING · 128K ctx

DeepSeek-R1-0528-TEE

Deep reasoning model. Use for tool-using agents that must justify each step.

INPUT

$1.80

OUTPUT

$5.40

No commitment, no platform fee. Billed per token. Volume contracts available for > 100M tokens / mo.
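For budgeting, the table above folds into a one-line estimator. Prices are hard-coded from this page; this is a sketch, not a billing API:

```python
# Per-1M-token prices (USD), input and output, from the table above.
PRICES = {
    "Qwen3-32B-TEE": (0.50, 1.50),
    "Qwen3-235B-A22B-Instruct-2507-TEE": (1.20, 3.50),
    "DeepSeek-R1-0528-TEE": (1.80, 5.40),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost for one request at the listed per-1M rates."""
    p_in, p_out = PRICES[model]
    return (input_tokens * p_in + output_tokens * p_out) / 1_000_000

# A 50K-token contract in, 2K-token findings out, on the flagship model:
print(round(estimate_cost("Qwen3-235B-A22B-Instruct-2507-TEE", 50_000, 2_000), 4))
# → 0.067
```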

HOW IT STAYS CONFIDENTIAL

Provider-blind inference

Prompts decrypt only inside the Intel TDX trust domain. AES-256 memory encryption keeps RAM unreadable to the hypervisor and to VoltageGPU operators.

ECDSA attestation per request

Each completion can be paired with a signed attestation report proving which TDX module, base model and version processed your request.

Zero retention, zero training

Prompts, tool calls and outputs are never logged, replayed or used to train models. A GDPR Article 28 DPA is available natively, without negotiation.

EU jurisdiction

VOLTAGE EI is a French company (SIREN 943 808 824). Your data stays under EU controller / processor law and EU contract enforcement.

EXPLORE FURTHER

Confidential MCP server

Tool calls sealed in TDX

Private CrewAI deployment

Multi-agent workflows

Sovereign agentic AI

Architectural overview

EU AI Act compliance

Article-by-article mapping

Public API reference

OpenAPI spec

SDK reference

Python / TS / Go

All integrations

Frameworks & tools

Get an API key and ship your first BYOA call

No credit card to start. Free tier covers ~250K tokens for evaluation.

Create account