
Eng 1595 single prompt based extraction pass prompt encodes dg #963

Open
sid597 wants to merge 2 commits into eng-1592-accept-upload-of-exactly-one-pdf from eng-1595-single-prompt-based-extraction-pass-prompt-encodes-dg

Conversation

@sid597
Collaborator

@sid597 sid597 commented Apr 15, 2026



Summary by CodeRabbit

  • New Features
    • PDF extraction functionality is now operational with real-time state feedback
    • Users can dynamically configure node types for extraction
    • Button states update to reflect extraction progress ("Extracting…" vs "Re-Extract")
    • Extraction validation ensures required parameters are provided before processing
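
The button behavior listed above can be sketched as a small pure function. This is a minimal sketch, not the PR's code: `extractButtonState` and its return shape are hypothetical names, and disabling the button while a run is in progress is an assumption. Only the `canExtract` and `isExtracting` flags and the two labels come from the PR.

```typescript
// Hypothetical helper: derives the action button's label and disabled flag.
type ExtractButtonInput = {
  canExtract: boolean; // required inputs for extraction are present
  isExtracting: boolean; // an extraction request is currently in flight
};

type ExtractButtonState = {
  label: string;
  disabled: boolean;
};

const extractButtonState = ({
  canExtract,
  isExtracting,
}: ExtractButtonInput): ExtractButtonState => ({
  label: isExtracting ? "Extracting…" : "Re-Extract",
  // Assumption: block clicks when inputs are missing or a run is in progress.
  disabled: !canExtract || isExtracting,
});
```

Keeping this logic out of the JSX would make the label and disabled rules testable without rendering the sidebar.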

@supabase

supabase bot commented Apr 15, 2026

This pull request has been ignored for the connected project zytfjzqyijgagqxrzbmz because there are no changes detected in packages/database/supabase directory. You can change this behaviour in Project Integrations Settings ↗︎.


Preview Branches by Supabase.
Learn more about Supabase Branching ↗︎.

@sid597 sid597 changed the base branch from main to eng-1592-accept-upload-of-exactly-one-pdf April 15, 2026 06:37
@sid597
Collaborator Author

sid597 commented Apr 15, 2026

@coderabbitai full review

@coderabbitai
Contributor

coderabbitai bot commented Apr 15, 2026

✅ Actions performed

Full review triggered.

devin-ai-integration[bot]

This comment was marked as resolved.

@coderabbitai
Contributor

coderabbitai bot commented Apr 15, 2026

📝 Walkthrough

This PR implements dynamic extraction prompt generation based on selected node types, replacing a hard-coded default prompt. The extraction workflow is integrated into the page component with base64 PDF conversion and API integration, while the sidebar is updated to provide extraction control and status feedback.
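
To make the idea of a prompt built from selected node types concrete, here is a minimal sketch. It is not the code in `apps/website/app/prompts/extraction.ts`: the `NodeType` shape, the intro sentence, and the contents of `QUALITY_CRITERIA` and `FEW_SHOT_EXAMPLES` are placeholders assumed for illustration. Only the function name and the constant names come from the walkthrough.

```typescript
// Assumed shape of a node type; the real type in the PR may differ.
type NodeType = {
  label: string;
  definition: string;
};

// Placeholder text: the PR keeps these as internal constants whose
// contents are not shown in this conversation.
const QUALITY_CRITERIA = "Quality criteria: (placeholder)";
const FEW_SHOT_EXAMPLES = "Examples: (placeholder)";

const buildSystemPrompt = (nodeTypes: NodeType[]): string => {
  // Map each selected node type to one definition line.
  const definitions = nodeTypes
    .map((nodeType) => `- ${nodeType.label}: ${nodeType.definition}`)
    .join("\n");

  // Join the sections with blank lines so the model sees distinct blocks.
  return [
    "Extract the following node types from the document:",
    definitions,
    QUALITY_CRITERIA,
    FEW_SHOT_EXAMPLES,
  ].join("\n\n");
};
```

With this shape, deselecting a node type removes its definition line from the prompt, so the model is never told about types the user did not ask for.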

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| UI Extraction Component<br>`apps/website/app/(extract)/extract-nodes/components/Sidebar.tsx` | Removed hard-coded "Model" section and added extraction control props (onExtract, canExtract, isExtracting). Updated bottom action button with click handler, conditional disable state, and dynamic label reflecting extraction state. |
| Page-level Extraction Logic<br>`apps/website/app/(extract)/extract-nodes/page.tsx` | Implemented extraction workflow including base64 file conversion, state management for isExtracting, and handleExtract async function that posts to /api/ai/extract endpoint with constructed request payload containing provider, model, system prompt, and optional research question. |
| Extraction Prompt Generation<br>`apps/website/app/prompts/extraction.ts` | Replaced hard-coded DEFAULT_EXTRACTION_PROMPT with dynamic buildSystemPrompt(nodeTypes) function that constructs system prompt by mapping node types to definitions. Refactored prompt components into internal constants (QUALITY_CRITERIA, FEW_SHOT_EXAMPLES). |
| Extraction API Handler<br>`apps/website/app/api/ai/extract/route.ts` | Removed fallback to DEFAULT_EXTRACTION_PROMPT and changed to use systemPrompt directly from validated request without default fallback. |
| Extraction Type System<br>`apps/website/app/types/extraction.ts` | Made systemPrompt field in ExtractionRequestSchema required with non-empty validation (z.string().min(1)), changing from optional to mandatory. |
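
The effect of making `systemPrompt` required can be illustrated without the schema library. The sketch below is a dependency-free stand-in for the `z.string().min(1)` rule, not the actual `ExtractionRequestSchema`; `validateSystemPrompt`, `ValidationResult`, and the error strings are hypothetical.

```typescript
// Dependency-free stand-in for the rule z.string().min(1) on systemPrompt.
// The function name, result type, and messages are hypothetical.
type ValidationResult =
  | { ok: true; systemPrompt: string }
  | { ok: false; error: string };

const validateSystemPrompt = (body: unknown): ValidationResult => {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "request body must be an object" };
  }
  const systemPrompt = (body as Record<string, unknown>).systemPrompt;
  // Previously optional: a missing field now fails instead of using a default.
  if (typeof systemPrompt !== "string") {
    return { ok: false, error: "systemPrompt is required" };
  }
  // min(1) is a length check, so only the empty string is rejected here.
  if (systemPrompt.length < 1) {
    return { ok: false, error: "systemPrompt must not be empty" };
  }
  return { ok: true, systemPrompt };
};
```

Because the rule checks length only, a whitespace-only prompt such as `" "` still passes. Rejecting that case would need an extra trim step, which the walkthrough does not mention.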

Sequence Diagram

sequenceDiagram
    actor User
    participant Sidebar as Sidebar Component
    participant Page as Page Component
    participant PDFHandler as PDF Handler
    participant API as Extract API
    participant Anthropic as Anthropic API

    User->>Sidebar: Click "Re-Extract" button
    Sidebar->>Page: onExtract()
    Page->>Page: handleExtract()
    Page->>PDFHandler: readFileAsBase64(pdfFile)
    PDFHandler-->>Page: base64 string
    Page->>Page: buildSystemPrompt(selectedNodeTypes)
    Page->>Page: Construct request payload
    Page->>API: POST /api/ai/extract
    Note over API: Validate systemPrompt (required)
    API->>Anthropic: Send extraction request with base64, systemPrompt
    Anthropic-->>API: Extraction results
    API-->>Page: JSON response
    Page->>Page: isExtracting = false
    Page->>Sidebar: Update canExtract, isExtracting state
    Sidebar-->>User: Display results/re-enable button
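
The flow in the diagram can be sketched as one async function with its collaborators injected. This is an illustration under stated assumptions, not the `handleExtract` in `page.tsx`: the `ExtractDeps` and `ExtractConfig` types, the `pdfBase64` field name, and the use of `finally` to reset `isExtracting` are all hypothetical. The diagram only shows that the flag returns to false after the response.

```typescript
// Hypothetical request body; field names other than systemPrompt are assumed.
type ExtractPayload = {
  provider: string;
  model: string;
  pdfBase64: string;
  systemPrompt: string;
  researchQuestion?: string;
};

// Hypothetical dependencies, injected so the flow can run without a browser.
type ExtractDeps = {
  readFileAsBase64: () => Promise<string>;
  buildSystemPrompt: () => string;
  postExtract: (payload: ExtractPayload) => Promise<unknown>;
  setIsExtracting: (value: boolean) => void;
};

type ExtractConfig = {
  provider: string;
  model: string;
  researchQuestion?: string;
};

const handleExtract = async (
  deps: ExtractDeps,
  config: ExtractConfig,
): Promise<unknown> => {
  deps.setIsExtracting(true);
  try {
    const pdfBase64 = await deps.readFileAsBase64();
    const payload: ExtractPayload = {
      provider: config.provider,
      model: config.model,
      pdfBase64,
      systemPrompt: deps.buildSystemPrompt(),
    };
    // The research question is optional, so only attach it when present.
    if (config.researchQuestion) {
      payload.researchQuestion = config.researchQuestion;
    }
    return await deps.postExtract(payload);
  } finally {
    // Assumption: reset even when the request throws, so the button re-enables.
    deps.setIsExtracting(false);
  }
};
```

In the page component, `postExtract` would wrap the POST to `/api/ai/extract` and `setIsExtracting` would be the state setter. Passing them in keeps the sequence itself easy to exercise in a test.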

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • ENG-1602: Add PDF extraction API route #937 — Modifies the same extraction API route, prompt construction, and request schema, directly implementing the dynamic buildSystemPrompt and required systemPrompt validation.
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | The title is vague and contains unclear abbreviations ('Eng 1595', 'dg') that don't clearly convey the main change to someone unfamiliar with internal ticket systems. | Consider a more descriptive title like 'Make extraction prompt dynamic based on selected node types' that clearly explains the primary change without relying on jargon or ticket references. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


coderabbitai[bot]

This comment was marked as resolved.

Contributor

@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 1 new potential issue.

View 6 additional findings in Devin Review.

