Deepgram - AI Audio Tools Tool

Overview

Deepgram is a production-grade speech AI platform that provides cloud APIs and SDKs for automatic speech recognition (ASR) and text-to-speech (TTS). It targets applications that need real-time or batch processing, such as call analytics, captioning, and voice assistants, and offers configurable models, speaker diarization, timestamps, and tooling for deployment and integration. Deepgram supports multiple languages and provides punctuation, profanity filtering, and speaker labeling to help turn audio into structured, searchable transcripts.

Designed for integration at scale, Deepgram exposes REST and streaming (WebSocket) endpoints plus official SDKs for common server environments (Node.js, Python, Go). Developers can choose prebuilt models optimized for settings such as phone calls or meetings and extend base capabilities with custom vocabularies and model configuration. The platform also includes neural TTS voices and configurable synthesis options suitable for IVR and voice agents. Authentication is via API key. Typical workflows include low-latency streaming for live transcription and asynchronous batch jobs for large archives of recorded audio.

Key Features

Real-time streaming ASR over low-latency WebSocket endpoints, plus REST endpoints for recorded audio
Batch transcription for large audio archives with asynchronous processing
Speaker diarization to label and separate multiple speakers in transcripts
Word-level timestamps and confidence scores for fine-grained analysis
Multilingual models and automatic language detection
Neural text-to-speech with configurable voices and SSML support
Custom vocabulary and model configuration for domain-specific terms
Official SDKs for Node.js, Python, and Go plus WebSocket streaming
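
The diarization and word-level timestamp features listed above can be combined into readable speaker turns. The following is a minimal sketch, assuming a response whose words carry `word`, `start`, `end`, `confidence`, and `speaker` fields under `results.channels[0].alternatives[0].words`; the sample payload and the `speaker_turns` helper are illustrative and not part of any Deepgram SDK.

```python
from itertools import groupby

# Illustrative payload only: the field layout is an assumption about the shape
# of a diarized transcription response, and the values are made up.
SAMPLE_RESPONSE = {
    "results": {
        "channels": [
            {
                "alternatives": [
                    {
                        "transcript": "hello there hi",
                        "words": [
                            {"word": "hello", "start": 0.10, "end": 0.42, "confidence": 0.99, "speaker": 0},
                            {"word": "there", "start": 0.45, "end": 0.80, "confidence": 0.97, "speaker": 0},
                            {"word": "hi", "start": 1.20, "end": 1.40, "confidence": 0.95, "speaker": 1},
                        ],
                    }
                ]
            }
        ]
    }
}


def speaker_turns(response):
    """Group consecutive words by speaker into (speaker, start, end, text) tuples."""
    words = response["results"]["channels"][0]["alternatives"][0]["words"]
    turns = []
    # groupby only merges adjacent items, which is what a "turn" means here.
    for speaker, group in groupby(words, key=lambda w: w.get("speaker")):
        group = list(group)
        text = " ".join(w["word"] for w in group)
        turns.append((speaker, group[0]["start"], group[-1]["end"], text))
    return turns


for speaker, start, end, text in speaker_turns(SAMPLE_RESPONSE):
    print(f"[{start:.2f}-{end:.2f}] speaker {speaker}: {text}")
```

Grouping only adjacent words keeps the turns in chronological order, so a speaker who talks twice produces two separate turns rather than one merged block.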

Code Examples

Python

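import requests

API_KEY = "YOUR_DEEPGRAM_API_KEY"
AUDIO_PATH = "audio.wav"

# The API key is sent in a "Token" authorization header.
headers = {
    "Authorization": f"Token {API_KEY}",
    "Content-Type": "audio/wav",
}
# Query parameters select transcription options.
params = {
    "punctuate": "true",
    "language": "en-US",
}

# Send the audio file as the raw request body.
with open(AUDIO_PATH, "rb") as f:
    resp = requests.post(
        "https://api.deepgram.com/v1/listen",
        headers=headers,
        params=params,
        data=f,
        timeout=300,
    )

print(resp.status_code)
resp.raise_for_status()  # raise on 4xx/5xx before trying to use the body
print(resp.json())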
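
The Python request above prints the raw JSON. As a hedged follow-up, the sketch below pulls the transcript and any low-confidence words out of such a payload without raising on missing keys. The `summarize` helper, the 0.8 threshold, and the example payload are assumptions made for illustration; the field layout is assumed to be `results.channels[0].alternatives[0]`.

```python
def summarize(payload, min_confidence=0.8):
    """Return (transcript, low_confidence_words) from the first alternative."""
    channels = payload.get("results", {}).get("channels", [])
    if not channels or not channels[0].get("alternatives"):
        # Nothing was transcribed, or the payload has an unexpected shape.
        return "", []
    best = channels[0]["alternatives"][0]
    doubtful = [
        w["word"]
        for w in best.get("words", [])
        if w.get("confidence", 1.0) < min_confidence
    ]
    return best.get("transcript", ""), doubtful


# Hypothetical payload with made-up values, standing in for resp.json().
example = {
    "results": {
        "channels": [
            {
                "alternatives": [
                    {
                        "transcript": "call me back",
                        "words": [
                            {"word": "call", "confidence": 0.98},
                            {"word": "me", "confidence": 0.91},
                            {"word": "back", "confidence": 0.62},
                        ],
                    }
                ]
            }
        ]
    }
}

transcript, doubtful = summarize(example)
print(transcript)
print(doubtful)
```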

Curl

DEEPGRAM_API_KEY="YOUR_DEEPGRAM_API_KEY"
AUDIO_FILE="audio.wav"

# Quote the file argument so paths containing spaces are passed intact.
curl -X POST "https://api.deepgram.com/v1/listen?language=en-US&punctuate=true" \
  -H "Authorization: Token ${DEEPGRAM_API_KEY}" \
  -H "Content-Type: audio/wav" \
  --data-binary "@${AUDIO_FILE}"

Javascript

// Uses the `Deepgram` class interface of @deepgram/sdk; other major versions
// of the SDK may expose a different client API.
const { Deepgram } = require('@deepgram/sdk');
const fs = require('fs');

const DEEPGRAM_API_KEY = 'YOUR_DEEPGRAM_API_KEY';
const deepgram = new Deepgram(DEEPGRAM_API_KEY);

async function transcribe() {
  // Pass a mimetype with buffer sources so the audio format is explicit.
  const audio = { buffer: fs.readFileSync('audio.wav'), mimetype: 'audio/wav' };
  const options = { punctuate: true, language: 'en-US' };

  const response = await deepgram.transcription.preRecorded(audio, options);
  console.log(JSON.stringify(response.results, null, 2));
}

transcribe().catch(console.error);

API Overview

Authentication: API key
Base URL: https://api.deepgram.com/v1
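
As a small illustration of how the API key and base URL fit together, the sketch below assembles the URL and headers for a `/listen` request without sending anything. `build_listen_request` is a hypothetical helper, not an SDK function; the `Token` header scheme and the `language` and `punctuate` parameters are taken from the examples on this page.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.deepgram.com/v1"


def build_listen_request(api_key, content_type="audio/wav", **options):
    """Return (url, headers) for a transcription request to the /listen endpoint."""
    # Booleans become lowercase strings, matching the query strings shown above.
    query = {
        key: str(value).lower() if isinstance(value, bool) else str(value)
        for key, value in sorted(options.items())
    }
    url = f"{BASE_URL}/listen"
    if query:
        url += "?" + urlencode(query)
    headers = {
        "Authorization": f"Token {api_key}",
        "Content-Type": content_type,
    }
    return url, headers


url, headers = build_listen_request(
    "YOUR_DEEPGRAM_API_KEY", language="en-US", punctuate=True
)
print(url)  # https://api.deepgram.com/v1/listen?language=en-US&punctuate=true
```

Keeping request construction separate from the network call makes the parameters easy to inspect and test before any audio is uploaded.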

Last Refreshed: 2026-01-09

Key Information

Category: Audio Tools
Type: AI Audio Tools Tool

Related Tools

WhisperX

Retrieval-based Voice Conversion WebUI

Replica

ClearerVoice-Studio

GPT-SoVITS

VCClient Real-time Voice Changer