Transcribe audio and video with Whisper Large V3 on VoltageGPU. OpenAI-compatible API. Up to 10x cheaper than alternatives.
VoltageGPU runs OpenAI Whisper Large V3 and other speech-to-text models on GPU-accelerated infrastructure for fast, accurate transcription. Process hours of audio in minutes, support 99+ languages, and get word-level timestamps. Our OpenAI-compatible API makes migration effortless, and our pricing is up to 10x lower than hosted alternatives.
The most accurate open-source speech recognition model. 99%+ accuracy on clean audio in English.
Transcribe audio in over 99 languages with automatic language detection. No model switching needed.
Use the same OpenAI SDK and API format. Migrate from the OpenAI Whisper API by changing a single base URL.
Get precise word-level timestamps for subtitle generation, content navigation, and searchable audio.
Whisper API on VoltageGPU costs ~$0.003/min vs $0.006/min on OpenAI. Even cheaper for bulk processing.
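For a sense of scale, here is a back-of-the-envelope comparison at those per-minute rates (the 100-hour workload is an illustrative example, not a quota or benchmark):

```python
# Per-minute rates quoted above (USD).
VOLTAGE_RATE = 0.003
OPENAI_RATE = 0.006

hours = 100  # example workload: 100 hours of audio
minutes = hours * 60

voltage_cost = minutes * VOLTAGE_RATE
openai_cost = minutes * OPENAI_RATE

print(f"VoltageGPU: ${voltage_cost:.2f}")                # VoltageGPU: $18.00
print(f"OpenAI:     ${openai_cost:.2f}")                 # OpenAI:     $36.00
print(f"Savings:    ${openai_cost - voltage_cost:.2f}")  # Savings:    $18.00
```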
Transcribe hundreds of hours of audio in parallel. Ideal for podcast archives, call centers, and media companies.
from openai import OpenAI

# Initialize VoltageGPU client
client = OpenAI(
    base_url="https://api.voltagegpu.com/v1",
    api_key="YOUR_VOLTAGE_API_KEY",
)

# Transcribe an audio file
with open("interview.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-large-v3",
        file=audio_file,
        response_format="verbose_json",
        timestamp_granularities=["word"],
    )

print(f"Transcription: {transcript.text}")
print(f"Language: {transcript.language}")
print(f"Duration: {transcript.duration}s")

# Access word-level timestamps
for word in transcript.words:
    print(f"  [{word.start:.2f}s - {word.end:.2f}s] {word.word}")

# Translation (any language to English)
with open("french_podcast.mp3", "rb") as audio_file:
    translation = client.audio.translations.create(
        model="whisper-large-v3",
        file=audio_file,
    )

print(f"English translation: {translation.text}")

$5 free credit. No credit card required.
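Word-level timestamps map directly onto subtitle formats. A minimal sketch that groups words into SRT cues — the `words_to_srt` helper and the seven-word group size are illustrative, and `words` is the `transcript.words` list from the example above:

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, e.g. 3.5 -> '00:00:03,500'."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def words_to_srt(words, group_size=7):
    """Group word timestamps into numbered SRT cues of up to group_size words."""
    cues = []
    for i in range(0, len(words), group_size):
        chunk = words[i:i + group_size]
        start = srt_timestamp(chunk[0].start)
        end = srt_timestamp(chunk[-1].end)
        text = " ".join(w.word for w in chunk)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)


# Usage with the transcript from above:
# print(words_to_srt(transcript.words))
```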