
[FEAT]: Stateful Resumption & Checkpointed Pipeline for LLM Extraction #131

@Cubix33

📝 Description

Currently, the LLM extraction pipeline in llm.py is entirely stateless. If a user is processing a long voice transcription for a complex 20-field incident report and the local Ollama container crashes, times out, or runs out of memory on field 19, the entire process must be restarted from scratch.

This feature introduces a Checkpointed Pipeline. By caching the contents of self._json locally after each field, the system can detect an interrupted session by its session_id upon restart, skip the fields that were already successfully extracted, and resume at the first field that is still missing.
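One way to make an interrupted session detectable on restart is to derive the session_id deterministically from the input, so the same transcript always maps to the same checkpoint file. The following is a minimal sketch under that assumption; the helper names, the 16-character truncated SHA-256 id, and the checkpoint directory are all hypothetical and not taken from the existing code.

```python
import hashlib
import tempfile
from pathlib import Path

# Hypothetical location; the project may prefer a path inside its own data directory.
CHECKPOINT_DIR = Path(tempfile.gettempdir()) / "fireform_checkpoints"


def session_id_for(transcript: str) -> str:
    """Same transcript -> same id, so a restart can find the earlier checkpoint."""
    return hashlib.sha256(transcript.encode("utf-8")).hexdigest()[:16]


def checkpoint_path(session_id: str, base_dir: Path = CHECKPOINT_DIR) -> Path:
    """Location of the (assumed) per-session JSON state file."""
    return base_dir / f"{session_id}.json"


def has_interrupted_session(transcript: str, base_dir: Path = CHECKPOINT_DIR) -> bool:
    """True if a checkpoint file from an earlier, unfinished run exists."""
    return checkpoint_path(session_id_for(transcript), base_dir).exists()
```

Hashing only the transcript means two different forms filled from the same transcript would share a checkpoint; mixing the template name or the list of target fields into the hash would avoid that.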

💡 Rationale

For emergency response tools, forced restarts are a dealbreaker. Running massive transcript context windows through local LLMs (like Mistral/Llama) is computationally expensive and slow. A transient error (like a momentary API connection drop) shouldn't cost the user 5 minutes of re-processing time. Implementing statefulness makes the core extraction engine production-ready, resilient, and significantly more respectful of local compute resources.

🛠️ Proposed Solution

  • Implement a lightweight state manager (e.g., a temporary JSON file tied to a session_id or document hash) that updates after every successful add_response_to_json() call.
  • Modify the main_loop() in llm.py to check for an existing state file at initialization.
  • If a state file exists, dynamically filter self._target_fields to exclude keys that already have valid, non-null values in the cached state before beginning the iteration.
  • Logic change in src/llm.py and src/main.py
  • Update to requirements.txt (none needed if using standard library json or sqlite3)
  • New prompt for Mistral/Ollama (none needed, purely architectural)
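The state manager and the field filter described above could look roughly like the sketch below. It uses only the standard library. CheckpointStore and remaining_fields are hypothetical names, and "valid" is simplified here to "not None"; the real check would depend on how llm.py validates extracted values.

```python
import json
import os
from pathlib import Path


class CheckpointStore:
    """Hypothetical per-session state file; not part of the current llm.py."""

    def __init__(self, path):
        self.path = Path(path)

    def load(self) -> dict:
        """Return cached field values, or an empty dict if no usable state exists."""
        try:
            return json.loads(self.path.read_text(encoding="utf-8"))
        except (FileNotFoundError, json.JSONDecodeError):
            return {}

    def save(self, state: dict) -> None:
        """Write to a temp file, then rename it over the checkpoint, so an
        interruption mid-write does not corrupt the existing checkpoint."""
        self.path.parent.mkdir(parents=True, exist_ok=True)
        tmp = self.path.with_name(self.path.name + ".tmp")
        tmp.write_text(json.dumps(state), encoding="utf-8")
        os.replace(tmp, self.path)

    def clear(self) -> None:
        """Remove the checkpoint; safe to call when none exists (Python 3.8+)."""
        self.path.unlink(missing_ok=True)


def remaining_fields(target_fields, cached: dict) -> list:
    """Keep only fields with no cached non-null value, preserving order."""
    return [f for f in target_fields if cached.get(f) is None]
```

In this sketch, main_loop() would call load() once at start-up, iterate over remaining_fields(self._target_fields, cached), and call save() after each successful add_response_to_json().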

✅ Acceptance Criteria

  • A script manually interrupted (e.g., via Ctrl+C) during extraction successfully resumes at the interrupted field upon restart.
  • Cached state files are properly cleaned up upon successful completion of the PDF generation.
  • Feature works seamlessly within the Docker container environment.
  • Documentation updated in docs/ explaining how session resumption works.
  • JSON output validates against the schema without duplicating fields.
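The first, second, and last criteria can be exercised without a real Ollama instance by simulating the interruption. The sketch below is an assumption-laden stand-in, not the project's test suite: run_extraction is a hypothetical simplification of the extraction loop, and the flaky callback replaces the LLM call.

```python
import json
import tempfile
from pathlib import Path


def run_extraction(fields, extract_one, state_path: Path) -> dict:
    """Extract each field once, checkpointing after every success (hypothetical helper)."""
    state = json.loads(state_path.read_text()) if state_path.exists() else {}
    for field in fields:
        if state.get(field) is not None:
            continue  # already extracted in an earlier run
        state[field] = extract_one(field)
        state_path.write_text(json.dumps(state))
    # Reached only when every field succeeded. In the real pipeline the
    # cleanup would happen after PDF generation rather than here.
    state_path.unlink(missing_ok=True)
    return state


def simulate_interrupt_and_resume():
    fields = ["name", "date", "location"]
    state_path = Path(tempfile.mkdtemp()) / "session.json"
    calls = []

    def flaky(field):
        calls.append(field)
        if field == "location" and calls.count("location") == 1:
            raise KeyboardInterrupt  # stands in for Ctrl+C or a crashed container
        return f"value of {field}"

    try:
        run_extraction(fields, flaky, state_path)
    except KeyboardInterrupt:
        pass
    assert state_path.exists()  # progress survived the interruption

    result = run_extraction(fields, flaky, state_path)
    return result, calls, state_path.exists()
```

After the second run, the fields completed before the interruption have each been extracted exactly once, the interrupted field has been retried, every key appears once in the result, and the state file is gone.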

📌 Additional Context

This directly addresses the underlying fragility of sequential API calls in the backend. It will also serve as a foundational layer if FireForm eventually supports pausing and resuming form-filling via a frontend UI.

Metadata

Labels: to-think (More time to think about, advantages and disadvantages of each)

Project status: Week X Done
