Skip to content

add http sse cosyvocie tts api and omni createItem function#202

Closed
songguocola wants to merge 2 commits into
dashscope:mainfrom
songguocola:dev/omni_0331
Closed

add http sse cosyvocie tts api and omni createItem function#202
songguocola wants to merge 2 commits into
dashscope:mainfrom
songguocola:dev/omni_0331

Conversation

@songguocola
Copy link
Copy Markdown
Contributor

No description provided.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces HTTP-based text-to-speech synthesis support through the new HttpSpeechSynthesizer class and provides comprehensive usage examples for both HTTP TTS and Qwen3 Omni tool calling. Key updates include the addition of metadata classes for audio results, enhanced WebSocket connection management with explicit timeouts, and the implementation of conversation item creation for real-time interactions. Review feedback highlights several critical improvement areas: addressing thread-safety concerns in the synthesizer, correcting a type mismatch for a protocol constant, and properly handling connection timeouts by checking await return values. Furthermore, recommendations were provided to refine exception handling to maintain specific error types and to refactor redundant logic for parsing usage statistics.

* @author songsong.sss
*/
@Slf4j
public class HttpSpeechSynthesizer {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The HttpSpeechSynthesizer class is not thread-safe. It maintains state in fields like accumulatedAudioData and modifies the shared serviceOption object (e.g., calling setIsSSE) during method calls. If multiple threads use the same instance concurrently, it will lead to race conditions and unpredictable behavior. Consider making the class stateless by moving request-specific data to local variables or using a per-request configuration.

public static final String PROTOCOL_EVENT_TYPE_CREATE_RESPONSE = "response.create";
public static final String PROTOCOL_EVENT_TYPE_CANCEL_RESPONSE = "response.cancel";
public static final String PROTOCOL_EVENT_TYPE_FINISH_SESSION = "session.finish";
public static final Object PROTOCOL_EVENT_TYPE_ITEM_CREATE = "conversation.item.create";
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The constant PROTOCOL_EVENT_TYPE_ITEM_CREATE is declared as Object, while all other protocol constants in this class are String. This inconsistency can lead to type mismatches and is likely a typo.

Suggested change
public static final Object PROTOCOL_EVENT_TYPE_ITEM_CREATE = "conversation.item.create";
public static final String PROTOCOL_EVENT_TYPE_ITEM_CREATE = "conversation.item.create";

websocktetClient = client.newWebSocket(request, this);
connectLatch.set(new CountDownLatch(1));
connectLatch.get().await();
connectLatch.get().await(60, java.util.concurrent.TimeUnit.SECONDS);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The return value of await is not checked. If the connection attempt times out after 60 seconds, the method will proceed as if it succeeded, which can lead to subsequent operations failing silently or throwing unexpected errors. You should verify if the connection was successful.

    if (!connectLatch.get().await(60, java.util.concurrent.TimeUnit.SECONDS)) {
      throw new RuntimeException("Connection timed out after 60 seconds");
    }

websocktetClient = client.newWebSocket(request, this);
connectLatch.set(new CountDownLatch(1));
connectLatch.get().await();
connectLatch.get().await(60, TimeUnit.SECONDS);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The result of the await call is ignored. If the connection fails to establish within the 60-second timeout, the client will proceed in an uninitialized state. It is safer to check the result and throw an exception on timeout.

    if (!connectLatch.get().await(60, TimeUnit.SECONDS)) {
      throw new RuntimeException("Connection timed out after 60 seconds");
    }

Comment on lines +90 to +93
} catch (Exception e) {
log.error("Speech synthesis failed", e);
throw new ApiException(e);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Catching a broad Exception and wrapping it in a new ApiException loses the specific type of checked exceptions like NoApiKeyException or InputRequiredException that are declared in the method signature. These should be caught specifically and rethrown to preserve the API contract.

    } catch (ApiException | NoApiKeyException | InputRequiredException e) {
      throw e;
    } catch (Exception e) {
      log.error("Speech synthesis failed", e);
      throw new ApiException(e);
    }

Comment on lines +382 to +391
if (dashScopeResult.getUsage() != null) {
try {
SpeechSynthesisUsage usage =
JsonUtils.fromJsonObject(
dashScopeResult.getUsage().getAsJsonObject(), SpeechSynthesisUsage.class);
result.setUsage(usage);
} catch (Exception e) {
log.debug("Failed to parse usage information", e);
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This block for parsing usage information is identical to the one in convertResult (lines 254-262). Consider refactoring this into a private helper method to reduce code duplication and improve maintainability.

@songguocola songguocola deleted the dev/omni_0331 branch April 14, 2026 09:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant