24 changes: 14 additions & 10 deletions api-reference/openapi.json
@@ -2430,13 +2430,15 @@
  {
  "in": "header",
  "name": "model",
- "description": "Specify which TTS model to use. We recommend `s2-pro`.",
+ "description": "Specify which TTS model to use. Use `s2.1-pro-free` for the free developer tier.",
  "required": true,
  "schema": {
- "default": "s2-pro",
+ "default": "s2.1-pro-free",
  "enum": [
  "s1",
- "s2-pro"
+ "s2-pro",
+ "s2.1-pro",
+ "s2.1-pro-free"
],
"title": "Model",
"type": "string"
@@ -2584,12 +2586,12 @@
  {
  "lang": "bash",
  "label": "Single Speaker",
- "source": "curl --request POST \\\n --url https://api.fish.audio/v1/tts \\\n --header 'Authorization: Bearer <token>' \\\n --header 'Content-Type: application/json' \\\n --header 'model: s2-pro' \\\n --data '{\n \"text\": \"Hello! Welcome to Fish Audio.\",\n \"reference_id\": \"model-id\",\n \"temperature\": 0.7,\n \"top_p\": 0.7,\n \"prosody\": {\n \"speed\": 1,\n \"volume\": 0,\n \"normalize_loudness\": true\n },\n \"chunk_length\": 300,\n \"normalize\": true,\n \"format\": \"mp3\",\n \"sample_rate\": 44100,\n \"mp3_bitrate\": 128,\n \"latency\": \"normal\",\n \"max_new_tokens\": 1024,\n \"repetition_penalty\": 1.2,\n \"min_chunk_length\": 50,\n \"condition_on_previous_chunks\": true,\n \"early_stop_threshold\": 1\n }'"
+ "source": "curl --request POST \\\n --url https://api.fish.audio/v1/tts \\\n --header 'Authorization: Bearer <token>' \\\n --header 'Content-Type: application/json' \\\n --header 'model: s2.1-pro-free' \\\n --data '{\n \"text\": \"Hello! Welcome to Fish Audio.\",\n \"reference_id\": \"model-id\",\n \"temperature\": 0.7,\n \"top_p\": 0.7,\n \"prosody\": {\n \"speed\": 1,\n \"volume\": 0,\n \"normalize_loudness\": true\n },\n \"chunk_length\": 300,\n \"normalize\": true,\n \"format\": \"mp3\",\n \"sample_rate\": 44100,\n \"mp3_bitrate\": 128,\n \"latency\": \"normal\",\n \"max_new_tokens\": 1024,\n \"repetition_penalty\": 1.2,\n \"min_chunk_length\": 50,\n \"condition_on_previous_chunks\": true,\n \"early_stop_threshold\": 1\n }'"
  },
  {
  "lang": "bash",
- "label": "Multi Speaker (S2-Pro only)",
- "source": "curl --request POST \\\n --url https://api.fish.audio/v1/tts \\\n --header 'Authorization: Bearer <token>' \\\n --header 'Content-Type: application/json' \\\n --header 'model: s2-pro' \\\n --data '{\n \"text\": \"<|speaker:0|>Hello!<|speaker:1|>Hi there!\",\n \"reference_id\": [\"speaker-a-id\", \"speaker-b-id\"],\n \"temperature\": 0.7,\n \"top_p\": 0.7,\n \"prosody\": {\n \"speed\": 1,\n \"volume\": 0,\n \"normalize_loudness\": true\n },\n \"chunk_length\": 300,\n \"normalize\": true,\n \"format\": \"mp3\",\n \"sample_rate\": 44100,\n \"mp3_bitrate\": 128,\n \"latency\": \"normal\",\n \"max_new_tokens\": 1024,\n \"repetition_penalty\": 1.2,\n \"min_chunk_length\": 50,\n \"condition_on_previous_chunks\": true,\n \"early_stop_threshold\": 1\n }'"
+ "label": "Multi Speaker (S2.1 Pro only)",
+ "source": "curl --request POST \\\n --url https://api.fish.audio/v1/tts \\\n --header 'Authorization: Bearer <token>' \\\n --header 'Content-Type: application/json' \\\n --header 'model: s2.1-pro-free' \\\n --data '{\n \"text\": \"<|speaker:0|>Hello!<|speaker:1|>Hi there!\",\n \"reference_id\": [\"speaker-a-id\", \"speaker-b-id\"],\n \"temperature\": 0.7,\n \"top_p\": 0.7,\n \"prosody\": {\n \"speed\": 1,\n \"volume\": 0,\n \"normalize_loudness\": true\n },\n \"chunk_length\": 300,\n \"normalize\": true,\n \"format\": \"mp3\",\n \"sample_rate\": 44100,\n \"mp3_bitrate\": 128,\n \"latency\": \"normal\",\n \"max_new_tokens\": 1024,\n \"repetition_penalty\": 1.2,\n \"min_chunk_length\": 50,\n \"condition_on_previous_chunks\": true,\n \"early_stop_threshold\": 1\n }'"
Comment on lines 2591 to +2594

🎯 Functional Correctness | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

Fish Audio TTS multi-speaker support s2.1-pro-free vs s2.1-pro model

💡 Result:

Both the s2.1-pro and s2.1-pro-free models are variations of the same Fish Audio S2.1-Pro architecture [1][2], which supports multi-speaker dialogue synthesis [3][4]. The capability is a core feature of the S2-Pro series, not a differentiator between the paid and free tiers [3][5][4].

To utilize multi-speaker support, you must use the s2-pro model identifier [6][4] (which represents the S2.1-Pro series) and format your request as follows:

1. Text Formatting: Use inline speaker tags, such as <|speaker:0|> and <|speaker:1|>, within your text to indicate speaker changes [3][5][1].
2. Voice Mapping: Instead of providing a single reference_id or voice, you must provide an array of model IDs (e.g., ["speaker-a-id", "speaker-b-id"]) that correspond to the speaker indices defined in your text [3][5][1].

The model handles turn-taking, pacing, and natural pauses between the speakers automatically [5]. Note that the speech.voices (array) and speech.voice (singular) parameters are mutually exclusive; using the array format is required to enable the multi-speaker functionality [5][1]. The s2.1-pro-free tier provides access to these same technical capabilities, typically optimized for development, testing, or lower-latency needs [2].

Citations:


🏁 Script executed:

# Find and verify the openapi.json file
fd --type f openapi.json

# Check lines 2591-2594 and surrounding context
sed -n '2585,2600p' api-reference/openapi.json

Repository: fishaudio/docs

Length of output: 2370


Update the label to remove the "Pro only" restriction; multi-speaker is supported on both s2.1-pro and s2.1-pro-free.

The label reads "Multi Speaker (S2.1 Pro only)" but according to Fish Audio's documentation, multi-speaker synthesis is a core feature available on both the s2.1-pro and s2.1-pro-free models—it is not exclusive to the paid tier. The curl sample correctly uses s2.1-pro-free, but the label is misleading and should be updated to reflect that multi-speaker works on both tier variants.
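
One way to check this claim is to send the same multi-speaker body under both model IDs and compare the HTTP status codes. The sketch below is an illustration, not part of the PR: the `FISH_API_KEY` variable and both helper names are hypothetical, and the body is trimmed to `text`, `reference_id`, and `format` from the sample on the assumption (not verified here) that the remaining fields are optional. The request only runs when a key is set.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and FISH_API_KEY are hypothetical, not from the PR.

# Build a trimmed multi-speaker body from two reference IDs.
# The speaker tags and the reference_id array follow the PR's curl sample.
multi_speaker_body() {
  printf '{"text":"<|speaker:0|>Hello!<|speaker:1|>Hi there!","reference_id":["%s","%s"],"format":"mp3"}' "$1" "$2"
}

# POST the body under one model ID and print only the HTTP status code.
try_model() {
  curl --silent --output /dev/null --write-out '%{http_code}' \
    --request POST \
    --url https://api.fish.audio/v1/tts \
    --header "Authorization: Bearer ${FISH_API_KEY}" \
    --header 'Content-Type: application/json' \
    --header "model: $1" \
    --data "$(multi_speaker_body speaker-a-id speaker-b-id)"
}

# Only touch the network when a key is provided.
if [ -n "${FISH_API_KEY:-}" ]; then
  for model in s2.1-pro s2.1-pro-free; do
    echo "$model -> $(try_model "$model")"
  done
fi
```

If both model IDs return the same success status, that would support dropping the "Pro only" wording; the placeholder reference IDs would need to be replaced with real ones first.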

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api-reference/openapi.json` around lines 2591 - 2594, The label for the
multi-speaker bash example currently reads "Multi Speaker (S2.1 Pro only)" which
incorrectly restricts the feature to only the paid tier. Update this label to
remove the "Pro only" restriction and clarify that multi-speaker synthesis is
supported on both s2.1-pro and s2.1-pro-free models, since the example curl
command correctly demonstrates the feature working with s2.1-pro-free.
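
The "validate" step in the prompt above could be as small as a text check on the spec. The following is a minimal sketch under that assumption; the function name `label_is_fixed` is hypothetical, and it only looks for the exact old label string rather than parsing the JSON.

```shell
#!/usr/bin/env bash
# Sketch only: the function name is hypothetical, not from the PR.

# Succeed when the given file no longer contains the "Pro only" label.
# Note: a missing file also counts as "fixed", so check existence first.
label_is_fixed() {
  ! grep -qF '"label": "Multi Speaker (S2.1 Pro only)"' "$1"
}

# Run against the spec when it exists in the current checkout.
if [ -f api-reference/openapi.json ]; then
  if label_is_fixed api-reference/openapi.json; then
    echo "label updated"
  else
    echo "label still says 'Pro only'"
  fi
fi
```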

}
]
}
@@ -2606,13 +2608,15 @@
  {
  "in": "header",
  "name": "model",
- "description": "Specify which TTS model to use. We recommend `s2-pro`.",
+ "description": "Specify which TTS model to use. Use `s2.1-pro-free` for the free developer tier.",
  "required": true,
  "schema": {
- "default": "s2-pro",
+ "default": "s2.1-pro-free",
  "enum": [
  "s1",
- "s2-pro"
+ "s2-pro",
+ "s2.1-pro",
+ "s2.1-pro-free"
],
"title": "Model",
"type": "string"
@@ -3540,7 +3544,7 @@
  {
  "lang": "bash",
  "label": "Stream With Timestamps",
- "source": "curl --no-buffer --request POST \\\n --url https://api.fish.audio/v1/tts/stream/with-timestamp \\\n --header 'Authorization: Bearer <token>' \\\n --header 'Content-Type: application/json' \\\n --header 'model: s2-pro' \\\n --data '{\n \"text\": \"[happy] I can’t believe it’s been this long. It feels like forever since we last really talked. I’ve missed hearing your voice, your stories, even the little things you used to say. How have you been? I’ve thought about calling you so many times, but I never knew where to start. Seeing you again now makes me realize just how much I’ve missed you. We have so much to catch up on, and I don’t even know which part of my life to tell you about first.\",\n \"format\": \"opus\",\n \"normalize\": true,\n \"temperature\": 0.9,\n \"chunk_length\": 100,\n \"top_p\": 0.9,\n \"latency\": \"balanced\",\n \"sample_rate\": 48000,\n \"reference_id\": \"fbe02f8306fc4d3d915e9871722a39d5\"\n }'"
+ "source": "curl --no-buffer --request POST \\\n --url https://api.fish.audio/v1/tts/stream/with-timestamp \\\n --header 'Authorization: Bearer <token>' \\\n --header 'Content-Type: application/json' \\\n --header 'model: s2.1-pro-free' \\\n --data '{\n \"text\": \"[happy] I can’t believe it’s been this long. It feels like forever since we last really talked. I’ve missed hearing your voice, your stories, even the little things you used to say. How have you been? I’ve thought about calling you so many times, but I never knew where to start. Seeing you again now makes me realize just how much I’ve missed you. We have so much to catch up on, and I don’t even know which part of my life to tell you about first.\",\n \"format\": \"opus\",\n \"normalize\": true,\n \"temperature\": 0.9,\n \"chunk_length\": 100,\n \"top_p\": 0.9,\n \"latency\": \"balanced\",\n \"sample_rate\": 48000,\n \"reference_id\": \"fbe02f8306fc4d3d915e9871722a39d5\"\n }'"
}
]
}