| 1 | +# AssemblyAI Python SDK |
| 2 | + |
| 3 | +Speech-to-text and audio intelligence SDK. Supports pre-recorded transcription, real-time streaming, and audio analysis features. |
| 4 | + |
| 5 | +## Quick start |
| 6 | + |
| 7 | +```bash |
| 8 | +pip install -U assemblyai |
| 9 | +``` |
| 10 | + |
| 11 | +```python |
| 12 | +import os |
| 13 | +import assemblyai as aai |
| 14 | + |
| 15 | +aai.settings.api_key = os.environ["ASSEMBLYAI_API_KEY"] |
| 16 | + |
| 17 | +transcript = aai.Transcriber().transcribe( |
| 18 | + "https://example.com/audio.mp3", |
| 19 | + config=aai.TranscriptionConfig( |
| 20 | + speech_models=["universal-3-pro", "universal-2"], |
| 21 | + speaker_labels=True, |
| 22 | + ), |
| 23 | +) |
| 24 | + |
| 25 | +print(transcript.text) |
| 26 | +for utterance in transcript.utterances: |
| 27 | + print(f"Speaker {utterance.speaker}: {utterance.text}") |
| 28 | +``` |
| 29 | + |
| 30 | +## Auth |
| 31 | + |
| 32 | +Set the `ASSEMBLYAI_API_KEY` environment variable, or assign the key in code: |
| 33 | + |
| 34 | +```python |
| 35 | +aai.settings.api_key = "your-key" |
| 36 | +``` |
| 37 | + |
| 38 | +## Key classes |
| 39 | + |
| 40 | +- `aai.Transcriber` — Transcribe files, URLs, or streams. Methods: `transcribe()`, `transcribe_async()`, `submit()`, `list_transcripts()` |
| 41 | +- `aai.TranscriptionConfig` — All transcription options: `speech_models`, `speaker_labels`, `sentiment_analysis`, `entity_detection`, `auto_chapters`, `content_safety`, `language_detection`, `summarization`, `word_boost`, `disfluencies` |
| 42 | +- `aai.Transcript` — Result object with `.text`, `.status`, `.utterances`, `.words`, `.chapters`, `.entities`, `.sentiment_analysis`. Methods: `get_sentences()`, `get_paragraphs()`, `export_subtitles_srt()`, `export_subtitles_vtt()` |
| 43 | +- `assemblyai.streaming.v3.StreamingClient` — Real-time streaming with event-based API |
| 44 | + |
| 45 | +## Common patterns |
| 46 | + |
| 47 | +**Transcribe a local file:** |
| 48 | +```python |
| 49 | +transcript = aai.Transcriber().transcribe("./recording.mp3") |
| 50 | +``` |
| 51 | + |
| 52 | +**With multiple features:** |
| 53 | +```python |
| 54 | +config = aai.TranscriptionConfig( |
| 55 | + speech_models=["universal-3-pro", "universal-2"], |
| 56 | + speaker_labels=True, |
| 57 | + sentiment_analysis=True, |
| 58 | + entity_detection=True, |
| 59 | + auto_chapters=True, |
| 60 | + language_detection=True, |
| 61 | +) |
| 62 | +transcript = aai.Transcriber().transcribe(audio_url, config=config) |
| 63 | +``` |
| 64 | + |
| 65 | +**PII redaction** (configured through a setter method, not a constructor argument): |
| 66 | +```python |
| 67 | +config = aai.TranscriptionConfig() |
| 68 | +config.set_redact_pii( |
| 69 | + policies=[aai.PIIRedactionPolicy.email_address, aai.PIIRedactionPolicy.phone_number], |
| 70 | + substitution=aai.PIISubstitutionPolicy.hash, |
| 71 | +) |
| 72 | +``` |
| 73 | + |
| 74 | +**Streaming v3:** |
| 75 | +```python |
| 76 | +from assemblyai.streaming.v3 import ( |
| 77 | + StreamingClient, StreamingClientOptions, |
| 78 | + StreamingParameters, StreamingEvents, |
| 79 | +) |
| 80 | + |
| 81 | +client = StreamingClient(StreamingClientOptions( |
| 82 | + api_key=os.environ["ASSEMBLYAI_API_KEY"], |
| 83 | + api_host="streaming.assemblyai.com", |
| 84 | +)) |
| 85 | +client.on(StreamingEvents.Turn, lambda turn: print(turn.text)) |
| 86 | +client.connect(StreamingParameters( |
| 87 | + sample_rate=16000, |
| 88 | + speech_model="u3-rt-pro", |
| 89 | +)) |
| 90 | +``` |
| 91 | + |
| 92 | +**Retrieve existing transcript:** |
| 93 | +```python |
| 94 | +transcript = aai.Transcript.get_by_id("transcript-id") |
| 95 | +``` |
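A retrieved transcript may have finished with an error, in which case `.text` is `None`. The guard below is a minimal sketch of one way to fail loudly instead. `TranscriptionFailed` and `text_or_raise` are hypothetical names, not part of the SDK, and the function only assumes an object exposing `.status`, `.text`, and (optionally) `.error`.

```python
class TranscriptionFailed(RuntimeError):
    """Raised when a transcript finished with an error status (hypothetical helper)."""


def text_or_raise(transcript, error_status="error"):
    """Return transcript.text, or raise if the transcript failed.

    `transcript` is any object exposing `.status` and `.text`.
    `error_status` is the value that marks a failure; with the SDK,
    pass aai.TranscriptStatus.error here.
    """
    if transcript.status == error_status:
        # `.error` is assumed to hold a failure message when one is available.
        detail = getattr(transcript, "error", None) or "no error detail available"
        raise TranscriptionFailed(detail)
    return transcript.text
```

With the SDK, this would be called as `text_or_raise(aai.Transcript.get_by_id("transcript-id"), aai.TranscriptStatus.error)`.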
| 96 | + |
| 97 | +## Important gotchas |
| 98 | + |
| 99 | +- **Always check status**: test `transcript.status == aai.TranscriptStatus.error` before using the result; `.text` on a failed transcript is `None` rather than raising an exception |
| 100 | +- **`speech_models` takes a list** with fallback ordering: `["universal-3-pro", "universal-2"]` |
| 101 | +- **PII redaction uses `set_redact_pii()`**, not a constructor parameter |
| 102 | +- **Streaming v3 is a separate module**: `assemblyai.streaming.v3`, not the legacy `RealtimeTranscriber` |
| 103 | +- **Microphone streaming needs extras**: `pip install "assemblyai[extras]"` for `pyaudio` |
| 104 | +- **`transcribe_async()` returns a `concurrent.futures.Future`**, not an asyncio coroutine |
| 105 | +- **Timestamps are in milliseconds** throughout the SDK |
| 106 | +- **Minimum Python version**: 3.8 |
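Because `transcribe_async()` returns a `concurrent.futures.Future`, it cannot be awaited directly inside asyncio code. The sketch below shows one way to bridge the two with the standard library's `asyncio.wrap_future`. `fake_transcribe_async` is a hypothetical stand-in that returns the same kind of future, so the example runs without an API key or network access.

```python
import asyncio
from concurrent.futures import Future, ThreadPoolExecutor

_executor = ThreadPoolExecutor(max_workers=1)


def fake_transcribe_async(audio: str) -> Future:
    """Hypothetical stand-in for a call that returns a concurrent.futures.Future."""
    return _executor.submit(lambda: f"transcript of {audio}")


async def transcribe_in_event_loop(audio: str) -> str:
    # A concurrent.futures.Future is not awaitable on its own;
    # asyncio.wrap_future adapts it into an awaitable asyncio.Future.
    future = fake_transcribe_async(audio)
    return await asyncio.wrap_future(future)


result = asyncio.run(transcribe_in_event_loop("./recording.mp3"))
print(result)
```

In synchronous code, calling `.result()` on the returned future blocks until the work finishes.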
| 107 | + |
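Since timestamps are millisecond offsets, displaying them usually takes a small conversion. The helper below is an illustrative sketch; `format_ms` is a hypothetical name, not an SDK function.

```python
def format_ms(ms: int) -> str:
    """Format a millisecond offset as HH:MM:SS.mmm."""
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


print(format_ms(83_450))  # 00:01:23.450
```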
| 108 | +## Dependencies |
| 109 | + |
| 110 | +`httpx`, `pydantic`, `typing-extensions`, `websockets`. Optional: `pyaudio` via `[extras]`. |
| 111 | + |
| 112 | +## Docs |
| 113 | + |
| 114 | +- [Full documentation](https://www.assemblyai.com/docs) |
| 115 | +- [API reference](https://www.assemblyai.com/docs/api-reference) |
| 116 | +- [llms-full.txt](https://www.assemblyai.com/docs/llms-full.txt?lang=python) (Python-filtered docs for LLMs) |