API Documentation

Confidential AI API

OpenAI-compatible. Change one URL and every request runs inside Intel TDX hardware enclaves. Works with Python, Node.js, LangChain, CrewAI, or any OpenAI SDK.

Quick Start

1. Get your API key

Create an account, then go to Dashboard → API Keys → Create Key. Your key starts with vgpu_.

2. Set the base URL

Replace your OpenAI base URL with:

https://api.voltagegpu.com/v1/confidential

3. Choose a model

Use any agent slug as the model ID (see table below), or Qwen/Qwen2.5-7B-Instruct for general use.

Code Examples

cURL

bash
curl https://api.voltagegpu.com/v1/confidential/chat/completions \
  -H "Authorization: Bearer vgpu_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "contract-analyst",
    "messages": [
      {"role": "user", "content": "Review this NDA clause: The Receiving Party shall not disclose any Confidential Information for 5 years..."}
    ],
    "max_tokens": 2048,
    "stream": true
  }'

Python (OpenAI SDK)

python
from openai import OpenAI

# One line to change — same SDK, same code, hardware-encrypted
client = OpenAI(
    base_url="https://api.voltagegpu.com/v1/confidential",
    api_key="vgpu_YOUR_API_KEY",
)

# Placeholder document; substitute the real NDA text
nda_text = "The Receiving Party shall not disclose any Confidential Information for 5 years..."

# Use any of the 8 agents as model ID
response = client.chat.completions.create(
    model="contract-analyst",  # or: financial-analyst, compliance-officer, etc.
    messages=[
        {"role": "user", "content": "Review this NDA and flag non-standard terms:\n\n" + nda_text}
    ],
    max_tokens=2048,
)

print(response.choices[0].message.content)

Node.js / TypeScript

typescript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.voltagegpu.com/v1/confidential',
  apiKey: 'vgpu_YOUR_API_KEY',
});

// Placeholder data; substitute the real financial statement
const financialData = 'Revenue, COGS, operating expenses...';

const response = await client.chat.completions.create({
  model: 'financial-analyst',
  messages: [
    { role: 'user', content: 'Analyze this P&L for red flags:\n\n' + financialData }
  ],
  stream: true,
});

for await (const chunk of response) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

LangChain

python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://api.voltagegpu.com/v1/confidential",
    api_key="vgpu_YOUR_API_KEY",
    model="compliance-officer",
)

response = llm.invoke("Assess GDPR compliance gaps in our AI usage policy")
print(response.content)

Available Agents

Use the agent slug as the model parameter. Each agent has a specialized system prompt that activates automatically — you just send your document.

Model ID              | Agent                   | Industry   | Best for
contract-analyst      | Contract Analyst        | Legal      | NDA review, clause risk, liability analysis
financial-analyst     | Financial Analyst       | Finance    | P&L analysis, fraud detection, audit findings
compliance-officer    | Compliance Officer      | GRC        | GDPR gaps, policy review, regulatory risk
medical-analyst       | Medical Records Analyst | Healthcare | Patient records, drug interactions, clinical trials
due-diligence         | Due Diligence Analyst   | M&A        | Target assessment, concentration risk, valuation
cybersecurity-analyst | Cybersecurity Analyst   | Security   | Incident triage, threat analysis, response plans
hr-analyst            | HR & Workplace Analyst  | HR         | Investigation analysis, compliance, policy review
tax-analyst           | Tax & Transfer Pricing  | Tax        | Transfer pricing review, tax exposure, structure analysis

You can also list agents programmatically: GET /v1/confidential/models
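Because the API is OpenAI-compatible, the models endpoint presumably returns the standard list shape ({"object": "list", "data": [...]}). A minimal parsing sketch under that assumption; the sample payload below is illustrative, not real API output:

```python
def agent_slugs(models_payload: dict) -> list:
    # Extract agent IDs from an OpenAI-style list-models response.
    return [m["id"] for m in models_payload.get("data", [])]

# Illustrative payload, shaped like a standard OpenAI /models response:
sample = {
    "object": "list",
    "data": [{"id": "contract-analyst"}, {"id": "financial-analyst"}],
}
```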

API Reference

POST /v1/confidential/chat/completions

Send messages and get a response from a confidential agent. Supports streaming.

Request body

Parameter   | Type    | Required | Description
model       | string  | Yes      | Agent slug (e.g. contract-analyst) or model ID
messages    | array   | Yes      | Array of {role, content} objects. Max 100 messages, 200K chars total.
max_tokens  | integer | No       | Max response tokens. Default: agent-specific (typically 4096).
temperature | float   | No       | 0.0-1.0. Default: agent-specific (0.08-0.12 for precision).
stream      | boolean | No       | If true, response is streamed as SSE. Default: false.
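Putting these parameters together, a complete request body might look like the sketch below (values are illustrative, not canonical defaults):

```python
import json

body = {
    "model": "contract-analyst",
    "messages": [
        {"role": "user", "content": "Summarize the indemnification clause."}
    ],
    "max_tokens": 1024,    # optional; agent default is typically 4096
    "temperature": 0.1,    # optional; agents default to 0.08-0.12
    "stream": False,       # set True for SSE streaming
}
payload = json.dumps(body)  # send as the POST body with Content-Type: application/json
```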

Response headers

Header         | Description
X-Confidential | true if processed in a TDX enclave
X-Agent        | Agent slug used
X-Provider     | Infrastructure provider (targon)

GET /v1/confidential/models
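A client can use these headers to verify enclave processing before trusting a response. A hypothetical check (the function name and sample headers are illustrative):

```python
def processed_in_enclave(headers: dict) -> bool:
    # Trust the response only when the API confirms TDX processing
    # via the X-Confidential header.
    return headers.get("X-Confidential", "").lower() == "true"

# Illustrative headers, shaped like the table above (not real API output):
sample_headers = {
    "X-Confidential": "true",
    "X-Agent": "contract-analyst",
    "X-Provider": "targon",
}
```

If you use the OpenAI Python SDK, raw response headers are reachable through `client.chat.completions.with_raw_response.create(...)`, which exposes `.headers` alongside `.parse()` for the body.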

List all available confidential agents and models. No authentication required.

Authentication

All requests require an API key in the Authorization header:

header
Authorization: Bearer vgpu_YOUR_API_KEY
  • Keys start with vgpu_
  • Create keys at Dashboard → API Keys
  • Max 10 keys per account
  • Keys are hashed (SHA-256) before storage — we never store your key in plaintext
  • Revoke a key instantly from the dashboard
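The stored digest described above takes nothing but the standard library to reproduce; a server-side sketch (the function name is hypothetical, the scheme is as documented: SHA-256 over the key, plaintext never stored):

```python
import hashlib

def hash_api_key(api_key: str) -> str:
    # Only this 64-character hex digest is kept server-side;
    # the plaintext key cannot be recovered from it.
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()
```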

Error Codes

Code | Meaning                    | What to do
401  | Invalid or missing API key | Check your Authorization header
402  | Insufficient balance       | Top up at voltagegpu.com/billing
429  | Rate limit exceeded        | Wait or upgrade your plan (Developer: 60/min, Team: 300/min)
503  | TDX enclave starting up    | Retry after 30-60s

No fallback to non-encrypted infrastructure. If TDX is unavailable, we return 503 rather than silently routing to a non-confidential backend. Your data is either hardware-encrypted or not processed at all.
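Since 503 signals enclave warm-up rather than a hard failure, clients should retry. A hypothetical backoff helper for 429 and 503 responses (not part of the API):

```python
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    # Exponential backoff with jitter: roughly 0.5-1s, 1-2s, 2-4s, ...
    # capped at 60s to cover the documented 30-60s warm-up window.
    return min(cap, base * (2 ** attempt)) * random.uniform(0.5, 1.0)
```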

Pricing

Plan       | Price                        | Includes
Developer  | $0.008 / 1K tokens           | 1 API key, 60 req/min
Team       | $499/mo + $0.007 / 1K tokens | 10 API keys, 300 req/min, audit log
Enterprise | Custom volume pricing        | Unlimited keys, dedicated enclave, SLA

Every request runs in an Intel TDX enclave — no premium tier required for confidential compute. Billing is per-token, debited from your account balance.
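Per-token billing is easy to estimate; a small sketch using the Developer-tier rate above (the helper name is hypothetical):

```python
def request_cost_usd(tokens: int, rate_per_1k_usd: float = 0.008) -> float:
    # $0.008 per 1K tokens on the Developer tier; pass 0.007 for Team.
    return tokens / 1000 * rate_per_1k_usd
```

A 2,048-token exchange on the Developer tier thus costs about $0.016.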

Security Architecture

Transport

TLS 1.3 for all API calls. API keys hashed with SHA-256 before storage.

Compute

Intel TDX enclaves on NVIDIA H200 GPUs. Data encrypted in CPU memory during inference. Operated by Manifold Labs (Bittensor Subnet 4).

Data retention

Zero retention. Documents are destroyed after analysis. Encrypted RAM is flushed. We store usage metadata (tokens, cost) but never your content.

Compliance

EU-based company (France). GDPR Art. 25 (data protection by design). DPA available on request.

Start integrating

Create an account, get your API key, change one URL. Every AI request runs in a hardware enclave from that moment on.