add http sse cosyvoice tts api and omni createItem function by songguocola · Pull Request #204 · dashscope/dashscope-sdk-java

songguocola · 2026-04-14T09:14:36Z

No description provided.

gemini-code-assist

Code Review

This pull request introduces HTTP-based speech synthesis support, including both SSE streaming and non-SSE URL-based results, and adds tool calling functionality to the Qwen3 Omni realtime conversation. Additionally, it implements 60-second connection timeouts for WebSocket-based services. Review feedback highlights the need for consistent error handling in synthesis results, the importance of verifying connection status after timeouts to prevent silent failures, and concerns regarding memory management when using ThreadLocal storage or unconditional audio accumulation during streaming.

gemini-code-assist · 2026-04-14T09:16:59Z

+  private HttpSpeechSynthesisResult convertNonSSEResult(DashScopeResult dashScopeResult) {
+    HttpSpeechSynthesisResult result = new HttpSpeechSynthesisResult();


The convertNonSSEResult method lacks error checking for the API response. If the API returns an error (e.g., invalid parameters or API key issues), dashScopeResult.getCode() will be non-null, but this method will still try to parse the response as a successful result. This should be handled the same way as in convertResult, by throwing an ApiException.

    private HttpSpeechSynthesisResult convertNonSSEResult(DashScopeResult dashScopeResult) {
      if (dashScopeResult.getCode() != null && !dashScopeResult.getCode().isEmpty()) {
        String errorMsg =
            dashScopeResult.getMessage() != null ? dashScopeResult.getMessage() : "Unknown error";
        Status status =
            Status.builder()
                .statusCode(
                    dashScopeResult.getStatusCode() != null ? dashScopeResult.getStatusCode() : 400)
                .code(dashScopeResult.getCode())
                .message(errorMsg)
                .requestId(dashScopeResult.getRequestId())
                .build();
        throw new ApiException(status);
      }
      HttpSpeechSynthesisResult result = new HttpSpeechSynthesisResult();

gemini-code-assist · 2026-04-14T09:16:59Z

+    if (!connectLatch.get().await(60, TimeUnit.SECONDS)) {
+      throw new RuntimeException("Connection timed out after 60 seconds");
+    }


The connect() method waits for the connectLatch but does not verify if the connection was actually successful. Since onFailure also counts down the latch, the method might return normally even if the WebSocket failed to open. You should check the isOpen status after the latch is released.

    if (!connectLatch.get().await(60, TimeUnit.SECONDS)) {
      throw new RuntimeException("Connection timed out after 60 seconds");
    }
    if (!isOpen.get()) {
      throw new RuntimeException("Connection failed");
    }
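To illustrate why the extra check matters, here is a minimal, self-contained sketch of the pattern. It assumes a single latch released by both the open and the failure callback, as described above. The class `LatchConnectSketch` and its method names are hypothetical stand-ins, not the SDK's classes, and the latch is held directly rather than through a reference as in the PR.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a WebSocket-backed client; not the SDK's class.
class LatchConnectSketch {
  private final CountDownLatch connectLatch = new CountDownLatch(1);
  private final AtomicBoolean isOpen = new AtomicBoolean(false);

  // Invoked by the transport when the socket opens.
  void onOpen() {
    isOpen.set(true);
    connectLatch.countDown();
  }

  // Invoked by the transport on failure; it releases the same latch.
  void onFailure() {
    isOpen.set(false);
    connectLatch.countDown();
  }

  // Waits for either callback, then tells success apart from failure.
  void awaitConnected(long timeout, TimeUnit unit) throws InterruptedException {
    if (!connectLatch.await(timeout, unit)) {
      throw new IllegalStateException("Connection timed out");
    }
    // The latch alone cannot distinguish onOpen from onFailure.
    if (!isOpen.get()) {
      throw new IllegalStateException("Connection failed");
    }
  }

  public static void main(String[] args) throws InterruptedException {
    LatchConnectSketch client = new LatchConnectSketch();
    client.onFailure(); // simulate a handshake failure
    try {
      client.awaitConnected(1, TimeUnit.SECONDS);
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Without the second check, `awaitConnected` would return normally after `onFailure`, and the first error would surface only on a later send.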

gemini-code-assist · 2026-04-14T09:16:59Z

+    if (!connectLatch.get().await(60, TimeUnit.SECONDS)) {
+      throw new RuntimeException("Connection timed out after 60 seconds");
+    }


Similar to OmniRealtimeConversation, the connect() method should verify that the connection is open after the latch is released, as onFailure also triggers the countdown. Proceeding with a failed connection will lead to subsequent errors when attempting to send messages.

    if (!connectLatch.get().await(60, TimeUnit.SECONDS)) {
      throw new RuntimeException("Connection timed out after 60 seconds");
    }
    if (!isOpen.get()) {
      throw new RuntimeException("Connection failed");
    }

gemini-code-assist · 2026-04-14T09:16:59Z

+            log.error("Error sending audio from file: {}", filePath, e);
+        }
+        //wait 5 seconds for demo response done
+        Thread.sleep(5 * 1000);


The sample program might exit or close the conversation before processing tool calls that arrive after the audio file has been fully read. The current 5-second sleep does not check for or handle pending tool calls that might be triggered by the model's final processing of the audio stream.

    // wait for demo response done and handle any pending tool calls
    long startTime = System.currentTimeMillis();
    while (System.currentTimeMillis() - startTime < 5000) {
      if (handlePendingToolCalls(conversation)) {
        conversation.createResponse(
            null, Arrays.asList(OmniRealtimeModality.AUDIO, OmniRealtimeModality.TEXT));
      }
      Thread.sleep(200);
    }
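The polling idea can also be shown in isolation. The sketch below is a generic deadline-bounded drain loop and does not use the SDK: `PendingCallPollerSketch`, `drainUntilDeadline`, and the string queue are hypothetical. It uses `System.nanoTime()` rather than `System.currentTimeMillis()` so that wall-clock adjustments do not affect the wait, which is a substitution on my part and not something the PR does.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Consumer;

// Hypothetical helper: handles queued items until a deadline passes.
class PendingCallPollerSketch {

  // Polls the queue every pollMillis until timeoutMillis has elapsed.
  // Returns how many items were handed to the handler.
  static <T> int drainUntilDeadline(
      Queue<T> pending, Consumer<T> handler, long timeoutMillis, long pollMillis)
      throws InterruptedException {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    int handled = 0;
    while (System.nanoTime() - deadline < 0) {
      T call;
      // Handle everything that has arrived since the previous poll.
      while ((call = pending.poll()) != null) {
        handler.accept(call);
        handled++;
      }
      Thread.sleep(pollMillis);
    }
    return handled;
  }

  public static void main(String[] args) throws InterruptedException {
    Queue<String> pending = new ConcurrentLinkedQueue<>();
    pending.add("get_weather"); // illustrative tool-call name
    int handled =
        drainUntilDeadline(pending, name -> System.out.println("handling " + name), 300, 50);
    System.out.println("handled=" + handled);
  }
}
```

In the sample program, the handler would be the place to answer the tool call and request a new response, as in the suggestion above.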

gemini-code-assist · 2026-04-14T09:16:59Z

+@Slf4j
+public class HttpSpeechSynthesizer {
+
+  private final ThreadLocal<ByteBuffer> accumulatedAudioData = new ThreadLocal<>();


Using an instance-level ThreadLocal for accumulatedAudioData can lead to memory overhead and potential leaks in thread-pooled environments. Each instance of HttpSpeechSynthesizer creates a new ThreadLocal key. If many instances are created and used across a fixed set of threads, the ThreadLocalMap in those threads will accumulate many entries. Additionally, the thread-local nature of this data means it's only accessible from the specific thread that performed the synthesis, which can be confusing for users of the asynchronous streamCall API if they attempt to access it from a different thread. Consider refactoring this to store the accumulated data within the result object or the callback context.
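One possible shape for that refactoring is a per-call context object that owns the accumulated bytes, so any thread holding a reference can read them. This is only a sketch under that assumption: `SynthesisCallContextSketch` and its methods are hypothetical names and not part of the SDK.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical per-call context that replaces the ThreadLocal buffer.
class SynthesisCallContextSketch {
  private final ByteArrayOutputStream audio = new ByteArrayOutputStream();

  // Called from whichever thread delivers audio chunks.
  synchronized void append(byte[] chunk) {
    audio.write(chunk, 0, chunk.length);
  }

  // Returns a copy, so callers cannot mutate the internal buffer.
  synchronized byte[] snapshot() {
    return audio.toByteArray();
  }

  public static void main(String[] args) throws InterruptedException {
    SynthesisCallContextSketch context = new SynthesisCallContextSketch();
    // A worker thread plays the role of the streaming callback thread.
    Thread worker =
        new Thread(
            () -> {
              context.append(new byte[] {1, 2});
              context.append(new byte[] {3});
            });
    worker.start();
    worker.join();
    // The caller thread sees the data, unlike with a ThreadLocal.
    System.out.println(context.snapshot().length + " bytes visible to the caller");
  }
}
```

Because the buffer lives in an ordinary object, it becomes collectable as soon as the call's context is dropped, and no per-thread map entry is left behind.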

gemini-code-assist · 2026-04-14T09:16:59Z

+    accumulatedAudioData.remove();
+
+    SynchronizeHalfDuplexApi<HttpSpeechSynthesisParam> api = createApi(true);
+    ByteArrayOutputStream audioBuffer = new ByteArrayOutputStream();


The streamCall method unconditionally accumulates all received audio chunks into a ByteArrayOutputStream. This can lead to high memory usage or OutOfMemoryError for long synthesis tasks. Since the primary purpose of streamCall is to provide audio chunks in real-time via the callback, accumulation should be optional. Consider adding a configuration parameter to enable or disable audio accumulation.
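As a rough illustration of opt-in accumulation, the sketch below forwards every chunk to a callback and buffers it only when a flag is set. The class `ChunkForwarderSketch` and the `accumulate` flag are hypothetical; the PR does not define such a parameter.

```java
import java.io.ByteArrayOutputStream;
import java.util.function.Consumer;

// Hypothetical chunk forwarder with optional accumulation.
class ChunkForwarderSketch {
  private final Consumer<byte[]> callback;
  private final ByteArrayOutputStream buffer; // null when accumulation is off

  ChunkForwarderSketch(Consumer<byte[]> callback, boolean accumulate) {
    this.callback = callback;
    this.buffer = accumulate ? new ByteArrayOutputStream() : null;
  }

  // Always forwards the chunk; copies it into the buffer only if enabled.
  void onChunk(byte[] chunk) {
    callback.accept(chunk);
    if (buffer != null) {
      buffer.write(chunk, 0, chunk.length);
    }
  }

  // Returns the accumulated audio, or an empty array when disabled.
  byte[] accumulated() {
    return buffer == null ? new byte[0] : buffer.toByteArray();
  }

  public static void main(String[] args) {
    ChunkForwarderSketch streaming =
        new ChunkForwarderSketch(c -> System.out.println("chunk of " + c.length), false);
    streaming.onChunk(new byte[4]);
    System.out.println("accumulated=" + streaming.accumulated().length);
  }
}
```

With the flag off, memory use stays bounded by the size of a single chunk regardless of how long the synthesis runs.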

songguocola added 2 commits April 10, 2026 11:11

feat(model/omni):support item.create api

9ddc277

feat(model/cosyvoice):support http sse api

3c12c89

songguocola changed the title ~~Dev/omni 0331~~ add http sse cosyvoice tts api and omni createItem function Apr 14, 2026

gemini-code-assist Bot reviewed Apr 14, 2026

songguocola closed this Apr 14, 2026

songguocola deleted the dev/omni_0331 branch April 14, 2026 11:40

songguocola wants to merge 2 commits into dashscope:main from songguocola:dev/omni_0331