10 changes: 3 additions & 7 deletions api-reference/openapi.json
@@ -1345,11 +1345,7 @@
"type": "string"
},
"train_mode": {
-"default": "full",
-"enum": [
-"fast",
-"full"
-],
+"const": "fast",
"title": "Train Mode",
"type": "string"
},
@@ -1647,7 +1643,7 @@
"type": "string"
},
"train_mode": {
-"default": "full",
+"default": "fast",
"enum": [
"fast",
"full"
@@ -4052,7 +4048,7 @@
"type": "string"
},
"train_mode": {
-"default": "full",
+"default": "fast",
"enum": [
"fast",
"full"
12 changes: 10 additions & 2 deletions developer-guide/core-features/emotions.mdx
@@ -60,9 +60,17 @@

<AdvancedEmotions />

-### Tone Markers (5 expressions)
+## Sound & Delivery Markers

-Control volume and intensity:
+These markers aren't emotions — they shape *how* a line is delivered, add natural human sounds, or layer in ambient effects. Combine them with the emotion cues above.
+
+### Tone Markers (6 expressions)
+
+Control volume, intensity, and emphasis. Place `[emphasis]` right before the word or phrase you want to stress:
+
+```text
+This is [emphasis] really important.
+```

<ToneMarkers />
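
As a rough illustration of the placement rule above, the sketch below inserts `[emphasis]` directly before a chosen phrase. The helper name `add_emphasis` is hypothetical and is not part of any Fish Audio SDK; it only does plain string handling on the text you would later send for synthesis.

```python
def add_emphasis(text: str, phrase: str) -> str:
    """Insert the [emphasis] marker directly before the first occurrence of phrase."""
    index = text.find(phrase)
    if index == -1:
        # Phrase not found: return the text unchanged rather than guessing a position.
        return text
    return f"{text[:index]}[emphasis] {text[index:]}"


print(add_emphasis("This is really important.", "really important"))
# This is [emphasis] really important.
```

Only the first occurrence is marked, which keeps the marker from piling up in short lines.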

@@ -159,7 +167,7 @@
- Use natural expressions when possible
- Space out emotional changes for realism

### Don'ts

- Don't overuse emotion tags in short text
- Don't mix conflicting emotions
@@ -246,7 +254,7 @@
| Whispered Secret | `[mysterious][whispering]` | "I have something to tell you..." |
| Angry Shout | `[angry][shouting]` | "Stop right there!" |
| Sad Sigh | `[sad][sighing]` | "I wish things were different. Sigh." |
| Excited Laugh | `[excited][laughing]` | "We did it! Ha ha!" |
| Nervous Question | `[nervous][uncertain]` | "Are you sure about this?" |

## S1 (legacy) syntax
14 changes: 13 additions & 1 deletion features/realtime-streaming.mdx
@@ -1,5 +1,5 @@
---
title: "Realtime Streaming"
description: "Stream audio as it generates for the lowest latency"
icon: "bolt"
---
@@ -41,11 +41,11 @@

<CodeGroup>
```python Python
from fishaudio import FishAudio

client = FishAudio() # reads FISH_API_KEY

with open("out.mp3", "wb") as f:
    for chunk in client.tts.stream(text="Streaming keeps latency low."):
        f.write(chunk)  # or send to a speaker / socket as it arrives

@@ -92,7 +92,7 @@

<CodeGroup>
```python Python
from fishaudio import FishAudio
from fishaudio.utils import play

client = FishAudio()
@@ -151,14 +151,26 @@

Both streaming paths take a `latency` mode:

-- `latency="balanced"` (default) — lowest time-to-first-audio. Use it for voice agents and live LLM output.
+- `latency="balanced"` (Python SDK default) — lowest time-to-first-audio. Use it for voice agents and live LLM output.
- `latency="normal"` — slightly higher latency, best audio quality. Use it for narration where you can afford a beat.

```python
for chunk in client.tts.stream_websocket(llm_tokens(), latency="balanced"):
    ...
```

+<Warning>
+**Set `latency` explicitly for real-time use.** The Python SDK defaults to `balanced`, but the raw HTTP/WebSocket API defaults to `normal`, which is tuned for quality and noticeably increases time-to-first-audio — you may wait several seconds for the first chunk. If you call the API directly, or through a third-party integration such as the LiveKit plugin, pass `balanced` (or `low`) for interactive latency.
+</Warning>
+
+The available modes differ slightly between the raw API and the SDK:
+
+| Mode | Raw HTTP/WebSocket API | Python SDK | Behavior |
+| ---------- | ---------------------- | ------------- | ---------------------------------------------- |
+| `low` | Supported | Not available | Lowest latency |
+| `balanced` | Supported | Default | Reduced latency — recommended for real-time |
+| `normal` | Default | Supported | Best quality, highest time-to-first-audio |
For finer control, pass a `TTSConfig` with chunk tuning. Smaller chunks emit audio sooner (lower latency); larger chunks give the model more context (smoother prosody):

```python
@@ -176,7 +188,7 @@

## Stream asynchronously

For asyncio apps, `AsyncFishAudio` exposes the same streaming methods. `stream_websocket` accepts an async generator, so you can pipe an async LLM client straight into speech.

```python
import asyncio
7 changes: 6 additions & 1 deletion features/text-to-speech.mdx
@@ -23,7 +23,7 @@

<CardGroup cols={2}>
<Card title="Voiceovers & narration" icon="film">
Audiobooks, explainers, ads, and video narration.
</Card>
<Card title="Conversational AI" icon="comments">
Speak an assistant's replies — pair with [streaming](/features/realtime-streaming) for low latency.
@@ -42,7 +42,7 @@

<CodeGroup>
```python Python
from fishaudio import FishAudio
from fishaudio.utils import save

client = FishAudio() # reads FISH_API_KEY
@@ -191,16 +191,16 @@

To reuse a voice across many requests, [clone it once](/features/voice-cloning) and pass the resulting `reference_id` instead.

### Format & bitrate

Pick a format for your delivery channel, and tune bitrate to trade size against quality:

| Format | Notes |
|---|---|
| `mp3` (default) | good size/quality balance; set `mp3_bitrate` to `64`, `128`, or `192` |
| `wav` | uncompressed, highest quality; set `sample_rate` (e.g. `44100`) |
| `pcm` | raw samples, no container — for low-latency playback and telephony pipelines |
| `opus` | efficient for streaming; bitrate is automatic (`opus_bitrate=-1000`) |

```python
from fishaudio.types import TTSConfig
@@ -215,10 +215,15 @@

`latency` trades stability for speed; `chunk_length` controls how much text the engine batches before it starts generating.

-- `latency="balanced"` (default) — lower time-to-first-audio (~300ms). Good for interactive use.
+- `latency="balanced"` (Python SDK default) — lower time-to-first-audio (~300ms). Good for interactive use.
- `latency="normal"` — most stable output, at slightly higher latency.
+- `latency="low"` (raw API only) — lowest latency.
- `chunk_length` (`100`–`300`, default `200`) — smaller chunks start audio sooner; larger chunks are more efficient for long text.

+<Note>
+The raw HTTP/WebSocket API defaults `latency` to `normal` (quality-tuned), while the Python SDK defaults to `balanced`. For real-time use over the raw API, set `latency` to `balanced` or `low` explicitly — see [Tune latency vs. quality](/features/realtime-streaming#tune-latency-vs-quality).
+</Note>
+
<CodeGroup>
```python Python
from fishaudio.types import TTSConfig
1 change: 1 addition & 0 deletions snippets/emotion-list-tones-s2.mdx
@@ -5,3 +5,4 @@
| Screaming | `[screaming]` | Very loud, panicked | Emergencies, fear |
| Whispering | `[whispering]` | Very soft, secretive | Secrets, quiet scenes |
| Soft | `[soft tone]` | Gentle, quiet | Comfort, lullabies |
+| Emphasis | `[emphasis]` | Stress a word/phrase | Highlighting key words |