
fix summary language respects user primary language setting#6470

Merged
mdmohsin7 merged 4 commits into main from fix/summary-language-v2
Apr 13, 2026

Conversation

@krushnarout
Member

Summary

  • Summaries were generated in the conversation language instead of the user's primary language setting
  • Added output_language_code param to get_transcript_structure, get_reprocess_transcript_structure, extract_action_items, and get_message_structure
  • _get_structured now fetches the user's language preference and passes it as the output language — falls back to conversation language if no preference is set
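
The fallback described above can be sketched as follows. This is a hypothetical sketch, not the repo's actual code: `resolve_output_language` is an illustrative name, and the real `get_user_language_preference` in the PR reads from Firestore rather than a local dict.

```python
# Hypothetical sketch of the fallback logic this PR adds to _get_structured.
# Real signatures in backend/utils/conversations/process_conversation.py may differ.

def get_user_language_preference(uid: str) -> str:
    """Stand-in for the Firestore lookup; returns '' when no preference is set."""
    _prefs = {"user-a": "es"}
    return _prefs.get(uid, "")


def resolve_output_language(uid: str, conversation_language_code: str) -> str:
    """Return the language summaries should be written in."""
    user_language = get_user_language_preference(uid)
    # Fall back to the conversation's detected language when no preference exists.
    return user_language or conversation_language_code
```

With this in place, every downstream LLM helper receives the resolved value as its `output_language_code` argument.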

Demo

ScreenRecording_04-09-2026.08-51-56_1.MP4

🤖 Generated with Claude Code

krushnarout and others added 3 commits April 9, 2026 09:17
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@greptile-apps
Contributor

greptile-apps bot commented Apr 9, 2026

Greptile Summary

This PR fixes conversation summaries by honoring the user's primary language setting instead of always using the conversation's detected language. It adds an output_language_code parameter to get_transcript_structure, get_reprocess_transcript_structure, extract_action_items, and get_message_structure, and wires it up through _get_structured by fetching the user preference with get_user_language_preference. The core logic is sound: it falls back to conversation language when no preference is set, and all major processing paths are covered.

Confidence Score: 5/5

Safe to merge — all remaining findings are P2 style/consistency suggestions that don't affect correctness.

The fix is correct: user language preference is fetched with a proper fallback to conversation language, and all major processing paths are updated consistently. The two P2 findings (missing language in the other source path and the prompt-cache placement in get_transcript_structure) are minor inconsistencies that don't break functionality.

No files require special attention for merge safety.

Vulnerabilities

No security concerns identified. The language preference is fetched from Firestore using the authenticated uid, and the value is only used as a language instruction in LLM prompts — no injection risk.

Important Files Changed

  • backend/utils/conversations/process_conversation.py — Adds a get_user_language_preference call in _get_structured and passes output_language_code to all downstream LLM functions; the summarize_experience_text path is left without a language preference.
  • backend/utils/llm/conversation_processing.py — Adds output_language_code to three functions; extract_action_items correctly places the language instruction in the dynamic second message, but get_transcript_structure places it in the static first message, partially defeating per-conversation prompt caching.
  • backend/utils/llm/external_integrations.py — Adds output_language_code to get_message_structure with correct fallback logic; the prompt is updated to use a {response_language} variable.

Sequence Diagram

sequenceDiagram
    participant C as Caller
    participant GS as _get_structured
    participant DB as users_db
    participant LLM as LLM Functions

    C->>GS: uid, language_code, conversation
    GS->>DB: get_user_language_preference(uid)
    DB-->>GS: user_language (or '' if not set)
    GS->>GS: user_language = user_language or language_code

    alt audio source
        GS->>LLM: get_transcript_structure(..., output_language_code=user_language)
        GS->>LLM: extract_action_items(..., output_language_code=user_language)
    else message source
        GS->>LLM: get_message_structure(..., output_language_code=user_language)
    else other source
        GS->>LLM: summarize_experience_text(...) no language param
    else force_process
        GS->>LLM: get_reprocess_transcript_structure(..., output_language_code=user_language)
        GS->>LLM: extract_action_items(..., output_language_code=user_language)
    else normal transcript
        GS->>LLM: get_transcript_structure(..., output_language_code=user_language)
        GS->>LLM: extract_action_items(..., output_language_code=user_language)
    end

    LLM-->>GS: Structured in user_language
    GS-->>C: Structured, discarded

Comments Outside Diff (1)

  1. backend/utils/conversations/process_conversation.py, line 142-145 (link)

    P2 summarize_experience_text path skips user language preference

    The other source branch calls summarize_experience_text without any language argument, so conversations from this path will always be summarized in the LLM's default language rather than the user's preference — inconsistent with the audio and message branches fixed in this PR.

    summarize_experience_text would need an output_language_code parameter added (similar to the other functions) for full consistency.
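
A hypothetical sketch of that change, mirroring the pattern the other helpers use in this PR (the parameter name matches the PR; the function body here is illustrative, not the repo's real implementation):

```python
# Illustrative sketch only: shows how summarize_experience_text could accept
# the same optional output_language_code the other LLM helpers gained.

def summarize_experience_text(text: str, output_language_code: str = "") -> str:
    language_line = (
        f"You MUST respond entirely in {output_language_code}.\n"
        if output_language_code
        else ""
    )
    prompt = f"{language_line}Summarize the following experience:\n\n{text}"
    # The real function would send this prompt to the LLM; returning it here
    # keeps the sketch self-contained.
    return prompt
```

An empty `output_language_code` preserves today's behavior, so existing callers would not need to change.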

Reviews (1): Last reviewed commit: "fix add output_language_code param to ge..."

Comment on lines +644 to +645
  instructions_text = '''You are an expert content analyzer. Your task is to analyze the provided content (which could be a transcript, a series of photo descriptions from a wearable camera, or both) and provide structure and clarity.
- The content language is {language_code}. Use the same language {language_code} for your response.
+ The content language is {language_code}. You MUST respond entirely in {response_language}.

P2 Language instruction in static cache prefix defeats per-conversation caching

{language_code} and the new {response_language} both live in instructions_text, which is the first system message specifically commented as the "static prefix" for cross-conversation OpenAI prompt caching. Every distinct (language_code, response_language) pair produces a different first message, creating a separate cache entry.

extract_action_items correctly handles this by placing language instructions in the second (dynamic) context_message. Moving the language line here would restore the caching benefit for get_transcript_structure:

context_message = 'The content language is {language_code}. You MUST respond entirely in {response_language}.\n\nContent:\n{conversation_context}'
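
The caching point can be illustrated with a minimal message builder (names are illustrative, not the repo's actual code). OpenAI prompt caching keys on an identical leading prefix, so keeping all per-conversation variables in the second message leaves the first message byte-identical across conversations:

```python
# Illustrative sketch of the cache-friendly split the reviewer suggests:
# static system prefix first, all variable text in the second message.

def build_messages(language_code: str, response_language: str, conversation_context: str):
    # Static prefix: identical for every conversation, so the cached prefix is reused.
    instructions_text = (
        "You are an expert content analyzer. Your task is to analyze the "
        "provided content and provide structure and clarity."
    )
    # Dynamic message: every per-conversation variable lives here.
    context_message = (
        f"The content language is {language_code}. "
        f"You MUST respond entirely in {response_language}.\n\n"
        f"Content:\n{conversation_context}"
    )
    return [
        {"role": "system", "content": instructions_text},
        {"role": "user", "content": context_message},
    ]
```

Because `instructions_text` never varies, every `(language_code, response_language)` pair now shares one cache entry instead of fragmenting the cache.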

…aching

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@krushnarout krushnarout requested a review from mdmohsin7 April 9, 2026 14:01
@mdmohsin7 mdmohsin7 merged commit 762acd4 into main Apr 13, 2026
2 checks passed
@mdmohsin7 mdmohsin7 deleted the fix/summary-language-v2 branch April 13, 2026 13:36


Development

Successfully merging this pull request may close these issues.

App generates summary in conversation language instead of respecting primary language setting

2 participants