API Documentation
Integrate AudioToTextAI transcription into your applications with our RESTful API.
Authentication
All API requests require authentication using an API key. Include your API key in the Authorization header:
Authorization: Bearer YOUR_API_KEY
You can find your API key in your Account Settings.
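As a minimal sketch, the header can be built once and reused for every request. The helper name `auth_headers` and the environment variable `AUDIOTOTEXTAI_API_KEY` are assumptions made for illustration; neither is defined by the API.

```python
import os


def auth_headers(api_key=None):
    """Return the Authorization header in the Bearer format shown above.

    AUDIOTOTEXTAI_API_KEY is a hypothetical environment variable name,
    used here only so the key does not have to be hard-coded.
    """
    key = api_key or os.environ.get("AUDIOTOTEXTAI_API_KEY", "")
    if not key:
        raise ValueError("No API key provided")
    return {"Authorization": f"Bearer {key}"}
```

Reading the key from the environment keeps it out of source control.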
Transcribe Audio
Upload an audio file for transcription by sending a POST request to /api/v1/transcribe/. The API returns a job ID that you can use to poll for results.
Request Body (multipart/form-data)
| Parameter | Type | Description |
|---|---|---|
| file (required) | file | Audio file (MP3, WAV, M4A, FLAC, or OGG) or video file |
| model | string | Model to use: whisper-large-v3 (default) or whisper-turbo |
| language | string | ISO 639-1 language code (e.g., "en", "es"). Leave empty for auto-detection. |
| diarization | boolean | Enable speaker diarization (default: false) |
| timestamps | boolean | Include segment timestamps (default: true) |
| word_timestamps | boolean | Include word-level timestamps (default: false) |
| summary | boolean | Generate AI summary (default: false) |
| vocabulary | array | Custom vocabulary for improved accuracy (JSON array or comma-separated) |
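The parameters above could be assembled into multipart form fields along these lines. This is a sketch, not an official client: `build_form_fields` is a hypothetical helper, booleans are sent as the strings "true" and "false" to match the curl example, and the vocabulary is serialized as a JSON array (one of the two accepted forms).

```python
import json


def _flag(value):
    """Encode a boolean the way the curl example does ("true"/"false")."""
    return "true" if value else "false"


def build_form_fields(model=None, language=None, diarization=False,
                      timestamps=True, word_timestamps=False,
                      summary=False, vocabulary=None):
    """Build the non-file form fields; defaults mirror the table above."""
    fields = {
        "diarization": _flag(diarization),
        "timestamps": _flag(timestamps),
        "word_timestamps": _flag(word_timestamps),
        "summary": _flag(summary),
    }
    if model:
        fields["model"] = model
    if language:
        # Omitted entirely when empty so the API auto-detects the language.
        fields["language"] = language
    if vocabulary:
        fields["vocabulary"] = json.dumps(list(vocabulary))
    return fields
```

The returned dictionary can be sent alongside the `file` part of the multipart request.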
Example Request
curl -X POST https://audiototextai.com/api/v1/transcribe/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@audio.mp3" \
  -F "diarization=true" \
  -F "language=en"
Response
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "processing",
  "estimated_credits": 5.5
}
Transcribe from URL
Transcribe audio from a URL by sending a POST request to /api/v1/transcribe/url/. Direct audio/video URLs and YouTube links are supported.
Request Body (JSON)
{
  "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "diarization": true,
  "summary": true
}
Example Request
curl -X POST https://audiototextai.com/api/v1/transcribe/url/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/podcast.mp3", "diarization": true}'
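The same request can be sketched in Python using only the standard library. The helper names `build_url_request` and `send` are illustrative assumptions; the endpoint, headers, and body fields are taken from the curl example above.

```python
import json
import urllib.request

BASE_URL = "https://audiototextai.com/api/v1"


def build_url_request(api_key, media_url, **options):
    """Prepare (but do not send) a POST to the transcribe/url/ endpoint.

    Extra keyword arguments such as diarization=True are merged into
    the JSON body next to "url".
    """
    payload = {"url": media_url, **options}
    return urllib.request.Request(
        f"{BASE_URL}/transcribe/url/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send a prepared request and return the parsed JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `send(build_url_request("YOUR_API_KEY", "https://example.com/podcast.mp3", diarization=True))` would submit the job. Building and sending are kept separate so the request can be inspected without network access.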
Get Transcription Results
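Retrieve the status and results of a transcription job by sending a GET request to /api/v1/transcribe/{id}/, where {id} is the job ID returned when the job was submitted.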
Query Parameters
| Parameter | Type | Description |
|---|---|---|
| segments | boolean | Include detailed segments with timestamps (default: false) |
Response (Processing)
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "processing",
  "progress": 45
}
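While a job reports "processing", a client typically polls until it finishes. The sketch below bounds that loop with a timeout. `wait_for_job` and its parameters are hypothetical, and `fetch_status` stands for any caller-supplied function that returns the parsed status response. The "failed" status and the `error_message` field are taken from the code examples at the end of this page.

```python
import time


def wait_for_job(fetch_status, timeout=600.0, interval=5.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until the job completes, fails, or times out.

    fetch_status: zero-argument callable returning the parsed JSON body
    of the status response (a dict with at least a "status" key).
    """
    deadline = clock() + timeout
    while True:
        result = fetch_status()
        status = result.get("status")
        if status == "completed":
            return result
        if status == "failed":
            raise RuntimeError(result.get("error_message", "transcription failed"))
        if clock() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout} seconds")
        sleep(interval)
```

Injecting `sleep` and `clock` keeps the loop easy to test without real delays.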
Response (Completed)
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "transcript": "Hello and welcome to today's podcast...",
  "duration_seconds": 1847.5,
  "language": "en",
  "credits_used": 31.2,
  "speakers": [
    {"id": "SPEAKER_00", "label": "Host", "speaking_time": 920.5},
    {"id": "SPEAKER_01", "label": "Guest", "speaking_time": 927.0}
  ],
  "summary": "This podcast discusses the future of AI...",
  "topics": ["artificial intelligence", "technology", "future"],
  "segments": [
    {
      "start": 0.0,
      "end": 5.2,
      "text": "Hello and welcome to today's podcast.",
      "speaker": "Host",
      "confidence": 0.98
    }
  ]
}
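A completed response like the one above can be post-processed in a few lines. The following sketch assumes the field names shown in the example response; the helper names `speaker_shares` and `format_segments` are illustrative.

```python
def speaker_shares(result):
    """Map each speaker label to its fraction of total speaking time."""
    speakers = result.get("speakers", [])
    total = sum(s["speaking_time"] for s in speakers)
    if total == 0:
        return {}
    return {s["label"]: s["speaking_time"] / total for s in speakers}


def format_segments(result):
    """Render each segment as '[start-end] speaker: text'."""
    lines = []
    for seg in result.get("segments", []):
        speaker = seg.get("speaker", "?")  # absent when diarization is off
        lines.append(
            f"[{seg['start']:.1f}-{seg['end']:.1f}] {speaker}: {seg['text']}"
        )
    return lines
```

For the example response, `format_segments` yields the single line `[0.0-5.2] Host: Hello and welcome to today's podcast.`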
List Supported Languages
Get a list of all supported languages for transcription.
Response
{
  "languages": [
    {"code": "", "name": "Auto-detect"},
    {"code": "en", "name": "English"},
    {"code": "es", "name": "Spanish"},
    {"code": "fr", "name": "French"},
    ...
  ]
}
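One use for this list is validating a language code before submitting a job. The sketch below assumes the response shape shown above, where an empty code means auto-detection; `supported_codes` and `normalize_language` are hypothetical helper names.

```python
def supported_codes(languages_response):
    """Return the set of non-empty language codes in the response."""
    return {
        lang["code"]
        for lang in languages_response["languages"]
        if lang["code"]
    }


def normalize_language(code, languages_response):
    """Return a code safe to send as `language`, or "" for auto-detection."""
    code = (code or "").strip().lower()
    if not code:
        return ""
    if code not in supported_codes(languages_response):
        raise ValueError(f"unsupported language code: {code!r}")
    return code
```

Rejecting unknown codes on the client avoids spending a request on a job that would fail validation.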
Webhooks
For long-running transcriptions, you can provide a webhook URL to receive results when processing completes.
Add the webhook_url parameter to your transcription request:
curl -X POST https://audiototextai.com/api/v1/transcribe/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@audio.mp3" \
  -F "webhook_url=https://yourapp.com/transcription-complete"
When transcription completes, we'll POST the results to your webhook URL:
POST https://yourapp.com/transcription-complete
Content-Type: application/json
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "transcript": "...",
  ...
}
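A receiver for this callback might look like the following sketch, built on Python's standard `http.server`. This page does not say what response the service expects or whether deliveries are signed, so the 204 acknowledgement, the 400 on malformed bodies, the port, and the names `parse_webhook`, `WebhookHandler`, and `serve` are all assumptions.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_webhook(body):
    """Decode the webhook JSON body into (job_id, status, transcript)."""
    payload = json.loads(body)
    return payload["id"], payload["status"], payload.get("transcript")


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            job_id, status, _ = parse_webhook(self.rfile.read(length))
        except (ValueError, KeyError, TypeError):
            # Malformed or unexpected body: reject it (assumed behaviour).
            self.send_response(400)
            self.end_headers()
            return
        print(f"job {job_id} finished with status {status}")
        self.send_response(204)  # assumed: any 2xx acknowledges delivery
        self.end_headers()


def serve(port=8000):
    """Run the receiver until interrupted; the port is an arbitrary choice."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

In production the handler would usually hand the payload to a queue and return quickly, since the sender may time out on slow responses.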
Error Handling
The API uses standard HTTP status codes and returns error details in JSON format.
| Status | Description |
|---|---|
| 400 | Bad Request - Invalid parameters or missing required fields |
| 401 | Unauthorized - Invalid or missing API key |
| 402 | Payment Required - Insufficient credits |
| 404 | Not Found - Transcription not found |
| 500 | Server Error - Something went wrong on our end |
Error Response Format
{
  "error": "insufficient_credits",
  "message": "You have 10 credits remaining but this transcription requires 30 credits."
}
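Client code can turn these responses into exceptions. The sketch below is one possible approach: `ApiError` and `raise_for_error` are hypothetical names, the fallback descriptions are copied from the status table, and the "unknown_error" code is an invented placeholder for bodies without an `error` field.

```python
class ApiError(Exception):
    """Carries the HTTP status plus the error and message fields."""

    def __init__(self, status_code, code, message):
        super().__init__(f"{status_code} {code}: {message}")
        self.status_code = status_code
        self.code = code
        self.message = message


# Fallback descriptions, copied from the status table above.
STATUS_TEXT = {
    400: "Bad Request",
    401: "Unauthorized",
    402: "Payment Required",
    404: "Not Found",
    500: "Server Error",
}


def raise_for_error(status_code, body):
    """Raise ApiError for non-2xx responses; body is the parsed JSON or None."""
    if 200 <= status_code < 300:
        return
    body = body if isinstance(body, dict) else {}
    raise ApiError(
        status_code,
        body.get("error", "unknown_error"),  # placeholder, not an API value
        body.get("message", STATUS_TEXT.get(status_code, "Unexpected error")),
    )
```

Callers can then branch on `ApiError.code`, for example prompting the user to buy credits when it equals "insufficient_credits".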
Code Examples
Python
import time

import requests

API_KEY = "your-api-key"
BASE_URL = "https://audiototextai.com/api/v1"

# Upload and transcribe
with open("audio.mp3", "rb") as f:
    response = requests.post(
        f"{BASE_URL}/transcribe/",
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"file": f},
        data={"diarization": "true"},
    )
response.raise_for_status()  # stop on 4xx/5xx before reading the job ID
job = response.json()
print(f"Job ID: {job['id']}")

# Poll for results every 5 seconds until the job finishes
while True:
    result = requests.get(
        f"{BASE_URL}/transcribe/{job['id']}/",
        headers={"Authorization": f"Bearer {API_KEY}"},
        params={"segments": "true"},
    ).json()
    if result["status"] == "completed":
        print(result["transcript"])
        break
    elif result["status"] == "failed":
        print(f"Error: {result['error_message']}")
        break
    time.sleep(5)
JavaScript (Node.js)
const fs = require('fs');
const axios = require('axios');
const FormData = require('form-data');

const API_KEY = 'your-api-key';
const BASE_URL = 'https://audiototextai.com/api/v1';

async function transcribe(filePath) {
  // Upload the file as multipart/form-data
  const form = new FormData();
  form.append('file', fs.createReadStream(filePath));
  form.append('diarization', 'true');

  const { data: job } = await axios.post(
    `${BASE_URL}/transcribe/`,
    form,
    {
      headers: {
        'Authorization': `Bearer ${API_KEY}`,
        ...form.getHeaders()
      }
    }
  );
  console.log(`Job ID: ${job.id}`);

  // Poll for results every 5 seconds until the job finishes
  while (true) {
    const { data: result } = await axios.get(
      `${BASE_URL}/transcribe/${job.id}/`,
      { headers: { 'Authorization': `Bearer ${API_KEY}` } }
    );
    if (result.status === 'completed') {
      return result;
    } else if (result.status === 'failed') {
      throw new Error(result.error_message);
    }
    await new Promise(r => setTimeout(r, 5000));
  }
}

// Log the result, or the error if the upload or transcription fails
transcribe('./audio.mp3').then(console.log).catch(console.error);