AssemblyAI Python SDK Reference

Installation

pip install assemblyai

Authentication

The SDK sends your API key as the raw value of the Authorization header, with no Bearer prefix. Set your API key before making any calls:

import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"
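
Hard-coding the key is fine for a quick test. For anything shared, a minimal sketch of reading it from the environment might look like the following. The helper `load_api_key` and the variable name `ASSEMBLYAI_API_KEY` are assumptions made for illustration; they are not taken from this reference.

```python
import os


def load_api_key(env=None, var_name="ASSEMBLYAI_API_KEY"):
    """Return the API key from the environment, or raise if it is missing.

    `var_name` is an assumed name; adjust it to match your deployment.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key


# Usage (requires the assemblyai package):
#   import assemblyai as aai
#   aai.settings.api_key = load_api_key()
```

Passing a plain dict as `env` makes the helper easy to exercise in tests without touching the real environment.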

1. Basic Transcription

From a URL

import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.com/audio.mp3")

print(transcript.text)

From a local file

transcript = transcriber.transcribe("/path/to/local/audio.mp3")

The SDK automatically uploads the local file to AssemblyAI's servers before transcription. No separate upload step is needed.

With TranscriptionConfig and speech_models fallback

Use speech_models to specify a preferred model with automatic fallback:

config = aai.TranscriptionConfig(
    speech_models=["universal-3-pro", "universal-2"]
)

transcriber = aai.Transcriber(config=config)
transcript = transcriber.transcribe("https://example.com/audio.mp3")
print(transcript.text)

If the first model in the list cannot process the audio, the next model is used as a fallback.


2. Error Handling

Always check transcript.status after transcription:

transcript = transcriber.transcribe("https://example.com/audio.mp3")

if transcript.status == aai.TranscriptStatus.error:
    print(f"Transcription failed: {transcript.error}")
else:
    print(transcript.text)
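
If you would rather fail loudly than branch at every call site, the status check can be wrapped in a small helper. This is only a sketch: `TranscriptionFailed` and `text_or_raise` are hypothetical names, and the stand-in objects below replace real transcripts so the example runs offline. It also assumes the status value compares equal to the string "error".

```python
from types import SimpleNamespace


class TranscriptionFailed(RuntimeError):
    """Hypothetical exception type, defined here for illustration only."""


def text_or_raise(transcript):
    """Return transcript.text, or raise if the transcript ended in an error."""
    # Assumption: status compares equal to the string "error". With the SDK
    # you can compare against aai.TranscriptStatus.error instead.
    if transcript.status == "error":
        raise TranscriptionFailed(transcript.error)
    return transcript.text


# Stand-in objects so the sketch runs without network access.
ok = SimpleNamespace(status="completed", text="hello world", error=None)
failed = SimpleNamespace(status="error", text=None, error="Download error")

print(text_or_raise(ok))
try:
    text_or_raise(failed)
except TranscriptionFailed as exc:
    print(f"Transcription failed: {exc}")
```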

3. Speaker Diarization

Enable speaker labels and iterate over utterances:

config = aai.TranscriptionConfig(speaker_labels=True)

transcriber = aai.Transcriber(config=config)
transcript = transcriber.transcribe("https://example.com/audio.mp3")

for utterance in transcript.utterances:
    print(f"Speaker {utterance.speaker}: {utterance.text}")
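
For longer recordings it can help to prefix each utterance with its start time. The sketch below assumes each utterance exposes a `start` attribute in milliseconds, which this reference does not state; the helpers `format_ms` and `format_utterances` are illustrative names, and the sample objects stand in for real SDK results.

```python
from types import SimpleNamespace


def format_ms(ms):
    """Convert milliseconds to an MM:SS string."""
    minutes, seconds = divmod(ms // 1000, 60)
    return f"{minutes:02d}:{seconds:02d}"


def format_utterances(utterances):
    """Render one '[MM:SS] Speaker X: text' line per utterance."""
    # Treat a missing utterance list (None) as empty.
    return [
        f"[{format_ms(u.start)}] Speaker {u.speaker}: {u.text}"
        for u in (utterances or [])
    ]


# Stand-in utterances; `start` is assumed to be in milliseconds.
sample = [
    SimpleNamespace(speaker="A", start=0, text="Hi there."),
    SimpleNamespace(speaker="B", start=65_400, text="Hello."),
]
for line in format_utterances(sample):
    print(line)
```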

4. PII Redaction

Basic redaction

config = aai.TranscriptionConfig(
    redact_pii=True,
    redact_pii_policies=[
        aai.PIIRedactionPolicy.person_name,
        aai.PIIRedactionPolicy.phone_number,
        aai.PIIRedactionPolicy.email_address,
        aai.PIIRedactionPolicy.credit_card_number,
        aai.PIIRedactionPolicy.ssn,
    ],
)

transcript = transcriber.transcribe("https://example.com/audio.mp3", config=config)
print(transcript.text)  # PII is replaced with ###

Substitution policy

Control how redacted text appears:

config = aai.TranscriptionConfig(
    redact_pii=True,
    redact_pii_policies=[
        aai.PIIRedactionPolicy.person_name,
    ],
    redact_pii_sub=aai.PIISubstitutionPolicy.hash,  # or .entity_name
)

Redacted audio

Get a version of the audio with PII bleeped out:

config = aai.TranscriptionConfig(
    redact_pii=True,
    redact_pii_audio=True,
    redact_pii_policies=[
        aai.PIIRedactionPolicy.person_name,
    ],
)

transcript = transcriber.transcribe("https://example.com/audio.mp3", config=config)
redacted_audio_url = transcript.redacted_audio_url

5. Audio Intelligence

Sentiment Analysis

config = aai.TranscriptionConfig(sentiment_analysis=True)
transcript = transcriber.transcribe("https://example.com/audio.mp3", config=config)

for result in transcript.sentiment_analysis:
    print(f"{result.text}: {result.sentiment}")  # POSITIVE, NEGATIVE, NEUTRAL
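
To get an overall picture rather than per-sentence output, the results can be tallied. A minimal sketch, using a hypothetical helper `tally_sentiments` and stand-in result objects so it runs without the SDK:

```python
from collections import Counter
from types import SimpleNamespace


def tally_sentiments(results):
    """Count how many sentences fall under each sentiment label."""
    # getattr(..., "value", ...) accepts both enum members and plain strings.
    return Counter(
        str(getattr(r.sentiment, "value", r.sentiment)) for r in results
    )


# Stand-in results; real ones come from transcript.sentiment_analysis.
sample = [
    SimpleNamespace(text="Great call.", sentiment="POSITIVE"),
    SimpleNamespace(text="The wait was long.", sentiment="NEGATIVE"),
    SimpleNamespace(text="Thanks a lot.", sentiment="POSITIVE"),
]
counts = tally_sentiments(sample)
print(counts.most_common())
```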

Entity Detection

config = aai.TranscriptionConfig(entity_detection=True)
transcript = transcriber.transcribe("https://example.com/audio.mp3", config=config)

for entity in transcript.entities:
    print(f"{entity.text} ({entity.entity_type})")

Auto Chapters

Generates chapters for sections of the audio, each with a headline, a summary, and a gist.

config = aai.TranscriptionConfig(auto_chapters=True)
transcript = transcriber.transcribe("https://example.com/audio.mp3", config=config)

for chapter in transcript.chapters:
    print(f"{chapter.headline}")
    print(f"  {chapter.summary}")
    print(f"  Gist: {chapter.gist}")

Note: auto_chapters and summarization are mutually exclusive. Do not enable both in the same config.

IAB Categories (Topic Detection)

config = aai.TranscriptionConfig(iab_categories=True)
transcript = transcriber.transcribe("https://example.com/audio.mp3", config=config)

for result in transcript.iab_categories.results:
    print(result.text)
    for label in result.labels:
        print(f"  {label.label} ({label.relevance:.2f})")
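
Topic detection can return many low-relevance labels. One way to keep only the strong ones is sketched below; `top_labels` and the 0.5 threshold are illustrative choices, and the sample labels are made up rather than taken from the IAB taxonomy.

```python
from types import SimpleNamespace


def top_labels(results, min_relevance=0.5):
    """Collect (label, relevance) pairs at or above the threshold, highest first."""
    pairs = [
        (label.label, label.relevance)
        for result in results
        for label in result.labels
        if label.relevance >= min_relevance
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Stand-in results; real ones come from transcript.iab_categories.results.
sample = [
    SimpleNamespace(labels=[
        SimpleNamespace(label="TopicA", relevance=0.31),
        SimpleNamespace(label="TopicB", relevance=0.64),
    ]),
    SimpleNamespace(labels=[
        SimpleNamespace(label="TopicC", relevance=0.92),
    ]),
]
for name, relevance in top_labels(sample):
    print(f"{name} ({relevance:.2f})")
```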

Content Safety Detection

config = aai.TranscriptionConfig(content_safety=True)
transcript = transcriber.transcribe("https://example.com/audio.mp3", config=config)

for result in transcript.content_safety.results:
    print(result.text)
    for label in result.labels:
        print(f"  {label.label} ({label.confidence:.2f})")

Summarization

config = aai.TranscriptionConfig(
    summarization=True,
    summary_model=aai.SummarizationModel.informative,
    summary_type=aai.SummarizationType.bullets,
)
transcript = transcriber.transcribe("https://example.com/audio.mp3", config=config)

print(transcript.summary)

Note: summarization and auto_chapters are mutually exclusive. Do not enable both in the same config.

Auto Highlights (Key Phrases)

config = aai.TranscriptionConfig(auto_highlights=True)
transcript = transcriber.transcribe("https://example.com/audio.mp3", config=config)

for result in transcript.auto_highlights.results:
    print(f"{result.text} (count: {result.count}, rank: {result.rank:.4f})")

6. Language Detection

Automatic language detection

config = aai.TranscriptionConfig(language_detection=True)
transcript = transcriber.transcribe("https://example.com/audio.mp3", config=config)

print(transcript.json_response["language_code"])
print(transcript.text)

Specifying a language code directly

config = aai.TranscriptionConfig(language_code="es")  # Spanish
transcript = transcriber.transcribe("https://example.com/audio.mp3", config=config)
print(transcript.text)

7. Prompting with Universal-3 Pro

Use prompt or keyterms_prompt to guide transcription. These two options are mutually exclusive — use one or the other, not both.
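
When config options are assembled dynamically, a small guard can catch conflicting settings before a request is made. This is a sketch only: `check_exclusive` is a hypothetical helper that works on a plain dict of keyword arguments, and it also covers the auto_chapters / summarization pair noted earlier in this reference.

```python
# Option pairs that this reference describes as mutually exclusive.
MUTUALLY_EXCLUSIVE = [
    ("prompt", "keyterms_prompt"),
    ("auto_chapters", "summarization"),
]


def check_exclusive(options, pairs=MUTUALLY_EXCLUSIVE):
    """Raise ValueError if both options in any pair are set to a truthy value."""
    for first, second in pairs:
        if options.get(first) and options.get(second):
            raise ValueError(
                f"{first} and {second} are mutually exclusive; set only one"
            )
    return options


options = {"keyterms_prompt": ["Kubernetes", "gRPC"], "prompt": None}
print(check_exclusive(options))

# Usage (requires the assemblyai package):
#   config = aai.TranscriptionConfig(**check_exclusive(options))
```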

Using prompt

A free-form natural language prompt to provide context:

config = aai.TranscriptionConfig(
    speech_models=[aai.SpeechModel.universal_3_pro],
    prompt="This is a medical consultation discussing cardiology and hypertension.",
)

transcript = transcriber.transcribe("https://example.com/audio.mp3", config=config)
print(transcript.text)

Using keyterms_prompt

A list of key terms to boost recognition accuracy:

config = aai.TranscriptionConfig(
    speech_models=[aai.SpeechModel.universal_3_pro],
    keyterms_prompt=["Kubernetes", "PostgreSQL", "gRPC", "Terraform"],
)

transcript = transcriber.transcribe("https://example.com/audio.mp3", config=config)
print(transcript.text)

Note on disfluencies: The disfluencies=True option (to include "ums" and "uhs") only works with Universal-2. For Universal-3 Pro, use a prompt to instruct the model to include disfluencies instead.


8. LLM Gateway Usage from Python

The LLM Gateway provides access to LLMs via AssemblyAI's infrastructure. Use requests to call the gateway endpoint directly. Do not use LeMUR — it is deprecated.

import requests

API_KEY = "YOUR_API_KEY"

response = requests.post(
    "https://llm-gateway.assemblyai.com/v1/chat/completions",
    headers={
        "Authorization": API_KEY,
        "Content-Type": "application/json",
    },
    json={
        "model": "claude-sonnet-4-20250514",
        "messages": [
            {
                "role": "user",
                "content": "Summarize the key themes from this transcript: ...",
            }
        ],
        "temperature": 0.5,
    },
)

response.raise_for_status()  # surface HTTP errors before parsing the body
result = response.json()
print(result["choices"][0]["message"]["content"])

The gateway follows the OpenAI-compatible chat completions format. The Authorization header uses the API key directly — no Bearer prefix.
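
To send a finished transcript through the gateway, it can be convenient to build the request in one place. The sketch below only constructs the URL, headers, and payload; `build_gateway_request` is a hypothetical helper, and joining the instruction and the transcript with a blank line is an assumption about prompt layout, not a gateway requirement.

```python
GATEWAY_URL = "https://llm-gateway.assemblyai.com/v1/chat/completions"


def build_gateway_request(
    api_key,
    transcript_text,
    instruction,
    model="claude-sonnet-4-20250514",
    temperature=0.5,
):
    """Return (url, headers, payload) for a chat completions call."""
    headers = {
        "Authorization": api_key,  # raw key, no Bearer prefix
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,
        "messages": [
            {"role": "user", "content": f"{instruction}\n\n{transcript_text}"}
        ],
        "temperature": temperature,
    }
    return GATEWAY_URL, headers, payload


url, headers, payload = build_gateway_request(
    "YOUR_API_KEY", "Speaker A: Hello.", "Summarize the key themes."
)
print(payload["messages"][0]["content"])

# Usage (requires the requests package and network access):
#   response = requests.post(url, headers=headers, json=payload, timeout=60)
```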


9. File Upload

The SDK handles file uploads automatically when you pass a local file path to transcribe():

transcript = transcriber.transcribe("/path/to/local/recording.wav")

Under the hood, the SDK uploads the file to AssemblyAI's servers and then submits the returned URL for transcription. No manual upload step is required.

If you need to upload manually (e.g., to reuse the URL across multiple transcriptions):

upload_url = transcriber.upload_file("/path/to/local/recording.wav")
transcript = transcriber.transcribe(upload_url)