Commit 1448f9c
docs: add text-to-sql dev note (#349)
* docs: add text-to-sql devnote
* add diagram, update content
* correct inconsistencies
* docs: address PR #349 feedback and add BIRD benchmark results
PR feedback fixes:
- Fix Window Functions contradiction: Key Takeaway #1 now uses
"Geospatial SQL" (Advanced) instead of "Window Functions" (Intermediate)
- Fix score-0 truthiness bug: use `is not none` instead of truthy check
in Jinja2 expression columns (inline example + production pipeline)
- Soften Code Sandbox language: "A natural next step would be..." instead
of "We are actively implementing..."
- Cut Gretel reference per mvansegbroeck: replaced with NVIDIA/Nemotron
team description
- Replace Qwen model references with Nemotron per mvansegbroeck: MODEL_NAME,
ASCII diagram labels, Pipeline Overview prose
- Rename sdg_qwen_235b.py -> sdg_ndd_text2sql.py per mvansegbroeck
- Fix Try It Yourself: use MODEL_ALIAS = "nvidia-text" with default
provider pattern (matches structured-outputs dev note), remove unused
explicit ModelConfig
- Remove placeholder dataset link (#), add "Dataset: Internal" note
New content:
- Add BIRD Benchmark Results section with bar chart (JPG), data table,
BIRD caveat paragraph, and Jocelyn Huang acknowledgement
(Nemotron Super EX: 26.77% -> 41.80%, +15 pts, beats GPT-OSS-120B)
- Replace "Looking Ahead: Code Sandbox" with broader "Next Steps":
Code Sandbox, RL on BIRD via NeMo Gym, schema representation, Spider 2.0
- Add Project Summary table at end of post
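The score-0 truthiness fix above is easy to sketch in plain Python (Jinja2 spells the same test `score is not none`, with lowercase `none`). The `judge_output` dicts below are hypothetical stand-ins for LLM judge responses, not the dev note's actual pipeline objects:

```python
# Minimal sketch of the score-0 truthiness bug fixed above.
# A truthy check silently drops a legitimate score of 0, because
# 0 is falsy in Python; an explicit `is not None` check keeps it.

def extract_score_truthy(judge_output):
    # Buggy: a valid score of 0 is treated the same as a missing score.
    score = judge_output.get("score")
    return score if score else None

def extract_score_correct(judge_output):
    # Fixed: only a genuinely missing score is treated as absent.
    score = judge_output.get("score")
    return score if score is not None else None

zero_score = {"score": 0}
print(extract_score_truthy(zero_score))   # None -- the bug
print(extract_score_correct(zero_score))  # 0 -- preserved
```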
* docs: address second round of PR #349 feedback
- Fix "EHR Systems" -> "Electronic Health Records" in Key Takeaway #1
to match the exact taxonomy string in the code example (greptile)
- Add admonition clarifying code snippets are illustrative, not
runnable, with link to Enterprise Text-to-SQL Recipe (nabinchha)
- Add context before score extraction snippet referencing the five
LLMJudgeColumnConfig columns and linking to full recipe (nabinchha)
- Add companion file note and recipe link to production pipeline
details block for prompts.py, rubrics.py, text2sql_seed.json (nabinchha)
* docs: address round 2 PR #349 feedback, replace production block with recipe
- Fix "EHR Systems" -> "Electronic Health Records" in Key Takeaway #1
to match the exact taxonomy string in the code example (greptile)
- Add admonition clarifying inline code snippets are illustrative,
with link to runnable Enterprise Text-to-SQL Recipe (nabinchha)
- Add context before score extraction snippet referencing the five
LLMJudgeColumnConfig columns and linking to full recipe (nabinchha)
- Replace production pipeline <details> block (230 lines with phantom
imports from prompts.py, rubrics.py, text2sql_seed.json) with
snippet include of enterprise_text_to_sql.py recipe — self-contained
and runnable, consistent with other merged dev notes (nabinchha)
* docs: polish Try It Yourself and Summary sections
- Wrap minimal inline example in collapsible <details> dropdown
- Rename "A Team Effort" section to "Summary"
- Remove redundant Scale/Dialects/Dataset line
* docs: add missing sql_dialect sampler to Step 1 code snippet
The Step 3/4 prompt templates reference {{ sql_dialect }} but the
Step 1 seeding code never defined it, leaving an unresolved Jinja2
variable for readers following along. Add the sql_dialect sampler
with a comment explaining the pipeline runs once per dialect.
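The failure mode this commit fixes can be sketched without the real pipeline. Here plain `str.format` stands in for the Jinja2 prompt template (Jinja2's default `Undefined` would instead render the missing variable as an empty string, silently producing a broken prompt); the template text and seed dicts are hypothetical:

```python
# Hypothetical, trimmed-down stand-in for a Step 3/4 prompt template
# that references a sql_dialect variable the Step 1 seeding code
# never defined.
PROMPT = "Write a {sql_dialect} query that {task}."

# Seed record before the fix: no sql_dialect field.
seed_without_dialect = {"task": "counts orders per customer"}
# Seed record after the fix: the sampler supplies the dialect
# (the pipeline runs once per dialect).
seed_with_dialect = {"task": "counts orders per customer",
                     "sql_dialect": "PostgreSQL"}

try:
    PROMPT.format(**seed_without_dialect)
except KeyError as exc:
    print(f"unresolved template variable: {exc}")

print(PROMPT.format(**seed_with_dialect))
```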
* fix ASCII diagram
* docs: fix BIRD score framing and MySQL dialect wording
- Remove specific "60-70%" BIRD claim from intro to avoid contradiction
with the 41.80%/38.25% direct-generation results shown later (those
higher figures come from specialized systems with schema linking)
- Reword MySQL "forbids" to "prompts exclude" -- REGEXP_REPLACE and
CONVERT_TZ are valid MySQL functions; the pipeline excluded them for
portability, not because the dialect forbids them
* docs: move text-to-sql images to assets/ convention and update refs
* docs: address text-to-sql devnote review comments
- Add devnote to mkdocs nav after Async All the Way Down
- Swap Recursive CTEs to Advanced, CASE Expressions to Intermediate (matches recipe)
- Fix score extraction truthy check to use 'is not none' (preserves score-0 values)
- Drop REPLACE() vs regexp_replace from dialect takeaway (REPLACE is cross-dialect)
- Tighten prose: remove 'The key insight:', use actual BIRD number, trim X-not-Y
- Fix knowledge dependency count: 8 -> 9 concepts (3x3 in recipe)
---------
Signed-off-by: Yev Meyer <ymeyer@nvidia.com>
Co-authored-by: Yev Meyer <ymeyer@nvidia.com>
1 parent 64f31bc · commit 1448f9c
6 files changed
Lines changed: 608 additions & 1 deletion
File tree
- docs
- assets/recipes/code_generation
- devnotes
- posts
- assets/text-to-sql