Commit 5893f89

committed
gnata-sqlite
1 parent 6375ab4 commit 5893f89

102 files changed

Lines changed: 12319 additions & 1997 deletions

.gitignore

Lines changed: 7 additions & 0 deletions
@@ -1,3 +1,10 @@
11
*.wasm
2+
*.dylib
3+
*.so
4+
gnata_jsonata.h
5+
.agents
6+
skills-lock.json
27
.secret
38
.DS_Store
9+
node_modules/
10+
editor/codemirror/dist/

AGENTS.md

Lines changed: 106 additions & 31 deletions
@@ -2,50 +2,123 @@
22

33
This file provides guidance to AI agents (Claude Code, GitHub Copilot, Cursor, etc.) when working with code in this repository.
44

5-
## What Is Gnata
6-
7-
Gnata is a full JSONata 2.x implementation in Go, built for production streaming workloads. JSONata is a query and transformation language for JSON data. Gnata provides two-tier evaluation: fast-path (GJSON zero-copy) for simple expressions and full AST evaluation for complex ones, plus a lock-free `StreamEvaluator` for high-throughput batched evaluation.
5+
## Project Overview
6+
7+
gnata-sqlite is a fork of [RecoLabs/gnata](https://github.com/RecoLabs/gnata) — a full JSONata 2.x implementation in Go. This fork extends it with:
8+
9+
- **SQLite C extension** (`sqlite/`) — registers `jsonata()`, `jsonata_query()`, and `jsonata_each` as SQL functions/virtual tables via CGo
10+
- **Query planner** (`internal/planner/`) — decomposes JSONata expressions for streaming SQL aggregation
11+
- **Editor/LSP** (`editor/`) — language server and WASM entry point for browser-based editing
12+
- **CodeMirror plugin** (`editor/codemirror/`) — TypeScript CodeMirror 6 language support
13+
14+
Module path: `github.com/rbbydotdev/gnata-sqlite`
15+
16+
## Package Map
17+
18+
| Package | Purpose |
19+
|---------|---------|
20+
| Root (`gnata`) | Core JSONata 2.x engine — lexer, parser, evaluator, streaming. Entry points: `gnata.go`, `stream.go` |
21+
| `functions/` | 55+ built-in JSONata functions (string, array, numeric, datetime, etc.). Registered via `functions.RegisterAll` |
22+
| `internal/evaluator/` | AST evaluation dispatch — binary ops, functions, chains, transforms. One file per eval category (`eval_binary.go`, `eval_function.go`, `eval_chain.go`, etc.) |
23+
| `internal/parser/` | Pratt parser, AST nodes, fast-path analysis (`parser.AnalyzeFastPath`) |
24+
| `internal/lexer/` | Tokenizer for JSONata expression strings |
25+
| `internal/planner/` | Query planner — decomposes JSONata for streaming SQL aggregation. Extracts paths, predicates, accumulators |
26+
| `sqlite/` | SQLite C extension via CGo — registers `jsonata()`, `jsonata_query()`, `jsonata_each` virtual table. Routes aggregates through planner |
27+
| `sqlite/tinygo/` | TinyGo-compatible eval subset for WASM builds |
28+
| `sqlite/benchmarks/` | SQL benchmark files for SQLite extension performance testing |
29+
| `editor/` | LSP server (native) + TinyGo WASM entry point — completions, hover, diagnostics via JSON-RPC 2.0 over stdin/stdout |
30+
| `editor/codemirror/` | npm package — CodeMirror 6 language support (TypeScript) |
31+
| `wasm/` | WASM entry point for browser playground. Exports `gnataEval`, `gnataCompile`, `gnataEvalHandle` |
832

933
## Development Commands
1034

1135
```sh
12-
# Lint (from package root)
36+
# Run all tests (includes CGo sqlite tests)
37+
go test ./...
38+
39+
# Run tests excluding sqlite (no CGo needed)
40+
go test $(go list ./... | grep -v '/sqlite')
41+
42+
# Lint
1343
golangci-lint run
1444

15-
# Run all tests (1,273 JSON test cases from official jsonata-js suite + unit tests)
16-
go test ./...
45+
# Build sqlite extension (macOS)
46+
go build -buildmode=c-shared -o gnata_jsonata.dylib ./sqlite
47+
48+
# Build sqlite extension (Linux)
49+
go build -buildmode=c-shared -o gnata_jsonata.so ./sqlite
1750

18-
# Run a single test
19-
go test -run TestName
51+
# Build WASM LSP
52+
tinygo build -o gnata-lsp.wasm -target wasm ./editor
53+
54+
# Build WASM playground
55+
GOOS=js GOARCH=wasm go build -ldflags="-s -w" -trimpath -o gnata.wasm ./wasm
56+
57+
# Build CodeMirror package
58+
cd editor/codemirror && npm install && npm run build
59+
60+
# Build native LSP server
61+
go build -o gnata-lsp ./editor/
2062

2163
# Run benchmarks
2264
go test -bench=. -benchmem
2365
```
2466

67+
## CI
68+
69+
GitHub Actions (`.github/workflows/ci.yml`) runs on push/PR to `main`:
70+
- `go test -race -count=1 ./...`
71+
- `golangci-lint` via `golangci/golangci-lint-action@v9`
72+
2573
## Architecture
2674

2775
### Compilation Pipeline
2876

29-
Lexer → Parser → AST Processing → Fast-Path Analysis → Expression
77+
Lexer -> Parser -> AST Processing -> Fast-Path Analysis -> Expression
3078

3179
1. **Lexer** (`internal/lexer/`) — Tokenizes JSONata expression strings
3280
2. **Parser** (`internal/parser/`) — Pratt (top-down operator precedence) parser producing AST nodes
3381
3. **AST Processing** (`parser.ProcessAST`) — Normalizes and optimizes the AST
3482
4. **Fast-Path Analysis** (`parser.AnalyzeFastPath`) — Classifies expressions into:
3583
- Pure-path fast path (e.g., `Account.Name`) — uses GJSON zero-copy
3684
- Comparison fast path (e.g., `a.b = "x"`) — zero allocations
85+
- Function fast path (e.g., `$exists(a.b)`) — direct GJSON evaluation
3786
- Full AST evaluation required
3887

3988
### Two-Tier Evaluation
4089

4190
- `Eval(ctx, any)` — Evaluate against pre-parsed Go values via full AST walk
4291
- `EvalBytes(ctx, json.RawMessage)` — Fast-path expressions use GJSON directly on raw JSON bytes; full-path falls back to unmarshal + Eval
4392

44-
### StreamEvaluator (stream.go)
93+
### StreamEvaluator (`stream.go`)
4594

4695
Batch-evaluates multiple expressions against events. Schema-keyed `GroupPlan` caching deduplicates field extraction across expressions. Lock-free reads via `atomic.Pointer` snapshot; writes serialized by `sync.Mutex`. Single JSON scan per event via `gjson.GetManyBytes`.
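
The lock-free read, mutex-serialized write pattern described above can be sketched with the standard library alone. This is a minimal illustration, not gnata's code: the `registry` type, its methods, and the use of plain strings for expressions are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// registry is a hypothetical stand-in for the evaluator's expression list.
type registry struct {
	mu   sync.Mutex               // serializes writers only
	snap atomic.Pointer[[]string] // readers load this without locking
}

func newRegistry() *registry {
	r := &registry{}
	empty := []string{}
	r.snap.Store(&empty)
	return r
}

// Add copies the current slice, appends to the copy, and publishes it.
// Snapshots handed out earlier are never modified.
func (r *registry) Add(expr string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	old := *r.snap.Load()
	next := make([]string, len(old), len(old)+1)
	copy(next, old)
	next = append(next, expr)
	r.snap.Store(&next)
}

// Snapshot returns the current view; callers must treat it as read-only.
func (r *registry) Snapshot() []string {
	return *r.snap.Load()
}

func main() {
	r := newRegistry()
	before := r.Snapshot()
	r.Add("Account.Name")
	r.Add(`a.b = "x"`)
	fmt.Println(len(before), len(r.Snapshot())) // prints "0 2"
}
```

Readers pay one atomic load per call, while writers pay for a full copy, which suits workloads where evaluation vastly outnumbers registration.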
4796

48-
### Evaluator Dispatch (internal/evaluator/)
97+
### Query Planner (`internal/planner/`)
98+
99+
Decomposes JSONata expressions into `QueryPlan` for SQL streaming aggregation. Each plan contains:
100+
- `Accumulators` — fed per row via `StepBatch` (single GJSON scan per row)
101+
- `FinalExpr` — evaluated once at finalization
102+
- `Predicates` — deduplicated, evaluated once per row; accumulators reference by index
103+
- Used by `jsonata_query()` SQL aggregate function in the SQLite extension
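
To make the accumulator, predicate, and final-expression split concrete, here is a rough sketch of how such a plan could be stepped and finalized. All names (`plan`, `accumulator`, `stepRow`, `finalize`, `newAvgPaidPlan`) and the pre-extracted `row` struct are hypothetical; the real `QueryPlan` reads raw JSON via GJSON and its fields may differ.

```go
package main

import "fmt"

// row is a hypothetical pre-extracted record.
type row struct {
	status string
	amount float64
}

// accumulator folds one value per row, gated by a shared predicate index.
type accumulator struct {
	pred  int // index into plan.predicates, or -1 for "always"
	step  func(acc float64, r row) float64
	value float64
}

type plan struct {
	predicates   []func(row) bool
	accumulators []accumulator
	final        func(vals []float64) float64
}

// stepRow evaluates every predicate once, then feeds matching accumulators.
func (p *plan) stepRow(r row) {
	hits := make([]bool, len(p.predicates))
	for i, pred := range p.predicates {
		hits[i] = pred(r)
	}
	for i := range p.accumulators {
		a := &p.accumulators[i]
		if a.pred < 0 || hits[a.pred] {
			a.value = a.step(a.value, r)
		}
	}
}

// finalize runs the final expression once over the accumulated values.
func (p *plan) finalize() float64 {
	vals := make([]float64, len(p.accumulators))
	for i, a := range p.accumulators {
		vals[i] = a.value
	}
	return p.final(vals)
}

// newAvgPaidPlan models "average amount where status is paid":
// a sum and a count share predicate 0, and the final step divides them.
func newAvgPaidPlan() *plan {
	return &plan{
		predicates: []func(row) bool{
			func(r row) bool { return r.status == "paid" },
		},
		accumulators: []accumulator{
			{pred: 0, step: func(acc float64, r row) float64 { return acc + r.amount }},
			{pred: 0, step: func(acc float64, _ row) float64 { return acc + 1 }},
		},
		final: func(v []float64) float64 {
			if v[1] == 0 {
				return 0
			}
			return v[0] / v[1]
		},
	}
}

func main() {
	p := newAvgPaidPlan()
	for _, r := range []row{{"paid", 10}, {"open", 99}, {"paid", 30}} {
		p.stepRow(r)
	}
	fmt.Println(p.finalize()) // prints 20
}
```

Because both accumulators reference predicate 0 by index, the status check runs once per row rather than once per accumulator.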
104+
105+
### SQLite Bridge (`sqlite/`)
106+
107+
CGo extension that registers SQL functions with SQLite:
108+
- `jsonata(expr, json [, bindings])` — scalar function, evaluates JSONata expression against JSON
109+
- `jsonata_query(expr, json)` — aggregate function, routes through query planner for streaming aggregation
110+
- `jsonata_each` — virtual table for iterating JSONata results as rows
111+
- Entry point: `extension.go` with C bridge in `bridge.c` / `bridge.h`
112+
113+
### Editor/LSP (`editor/`)
114+
115+
Shared Go code compiled to either:
116+
- Native LSP server (`main_lsp.go`, build tag `!js`) — JSON-RPC 2.0 over stdin/stdout
117+
- TinyGo WASM (`main_wasm.go`, build tag `js`) — for browser integration
118+
119+
Supports: `textDocument/didOpen`, `textDocument/didChange` (diagnostics), `textDocument/completion` (schema-aware)
120+
121+
### Evaluator Dispatch (`internal/evaluator/`)
49122

50123
`evaluator.Eval(node, input, env)` dispatches by `node.Type`. Each eval category is in its own file:
51124
- `eval_binary.go` — Binary operators, subscripts, filtering
@@ -58,32 +131,38 @@ Batch-evaluates multiple expressions against events. Schema-keyed `GroupPlan` ca
58131
- `eval_regex.go` — Regex compilation and matching
59132
- `eval_unary.go` — Unary operators (negation, array constructor)
60133

61-
### Standard Library (functions/)
134+
### Fast-Path Byte Evaluation (`func_fast.go`)
62135

63-
55+ built-in JSONata functions across categorized files: `string_funcs.go`, `string_match_replace.go`, `string_format_number.go`, `string_format_integer.go`, `string_encoding.go`, `numeric_funcs.go`, `array_funcs.go`, `object_funcs.go`, `hof_funcs.go`, `boolean_funcs.go`, `datetime_funcs.go`, `datetime_format.go`, `datetime_parse.go`. All registered via `functions.RegisterAll`.
136+
Dispatch-map of `funcFastHandlers` maps each `FuncFastKind` to a standalone handler function (e.g., `evalFuncContains`, `evalFuncString`). Each handler operates directly on `gjson.Result` for zero-copy evaluation.
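
A dispatch map of this shape might look like the sketch below. It is illustrative only: `fastKind`, `fastHandlers`, and `evalFast` are invented names, and the handlers take plain strings instead of `gjson.Result` so the example needs nothing beyond the standard library.

```go
package main

import (
	"fmt"
	"strings"
)

// fastKind is a hypothetical analogue of FuncFastKind.
type fastKind int

const (
	kindContains fastKind = iota
	kindUppercase
	kindLength
)

// handler receives the extracted field value and an optional argument.
type handler func(value, arg string) any

// fastHandlers maps each kind to a standalone handler function.
var fastHandlers = map[fastKind]handler{
	kindContains:  func(v, arg string) any { return strings.Contains(v, arg) },
	kindUppercase: func(v, _ string) any { return strings.ToUpper(v) },
	kindLength:    func(v, _ string) any { return len([]rune(v)) },
}

// evalFast reports ok=false when no handler is registered,
// so a caller can fall back to a slower general evaluator.
func evalFast(k fastKind, value, arg string) (result any, ok bool) {
	h, found := fastHandlers[k]
	if !found {
		return nil, false
	}
	return h(value, arg), true
}

func main() {
	out, ok := evalFast(kindContains, "gnata-sqlite", "sql")
	fmt.Println(out, ok) // prints "true true"
}
```

Leaving a kind out of the map is a simple way to route it to the full evaluator without adding a special case to the dispatcher.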
64137

65-
### Fast-Path Byte Evaluation (func_fast.go)
138+
## Key Types
66139

67-
Dispatch-map of `funcFastHandlers` maps each `FuncFastKind` to a standalone handler function (e.g., `evalFuncContains`, `evalFuncString`). Each handler operates directly on `gjson.Result` for zero-copy evaluation. `FuncFastRound` is intentionally absent — it requires banker's rounding handled by the full evaluator.
140+
| Type | File | Description |
141+
|------|------|-------------|
142+
| `gnata.Expression` | `gnata.go` | Compiled, goroutine-safe JSONata expression with fast-path metadata |
143+
| `gnata.StreamEvaluator` | `stream.go` | Batch evaluator with copy-on-write expression slice + `BoundedCache` for schema plans |
144+
| `evaluator.Environment` | `internal/evaluator/env.go` | Lexical scope chain for variable bindings and function registry |
145+
| `parser.Node` | `internal/parser/node.go` | AST node types |
146+
| `planner.QueryPlan` | `internal/planner/planner.go` | Compiled execution plan for `jsonata_query` aggregate |
147+
| `BoundedCache` | `bounded_cache.go` | Lock-free FIFO ring-buffer cache (atomic pointer reads) |
148+
| `OrderedMap` | `internal/evaluator/ordered_map.go` | Insertion-ordered map preserving JSON field order |
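
As an illustration of the "lexical scope chain" idea behind `evaluator.Environment`, the sketch below resolves names by walking from the innermost scope outward. The `env` type and its `Bind` and `Lookup` methods are assumptions for this example and are not taken from `env.go`.

```go
package main

import "fmt"

// env is a hypothetical scope: local bindings plus a link to the enclosing scope.
type env struct {
	vars   map[string]any
	parent *env
}

func newEnv(parent *env) *env {
	return &env{vars: map[string]any{}, parent: parent}
}

// Bind defines or shadows a name in this scope only.
func (e *env) Bind(name string, v any) { e.vars[name] = v }

// Lookup walks outward until the name is found or the chain ends.
func (e *env) Lookup(name string) (any, bool) {
	for s := e; s != nil; s = s.parent {
		if v, ok := s.vars[name]; ok {
			return v, true
		}
	}
	return nil, false
}

func main() {
	root := newEnv(nil)
	root.Bind("x", 1)
	child := newEnv(root)
	child.Bind("x", 2) // shadows the outer binding without changing it
	inner, _ := child.Lookup("x")
	outer, _ := root.Lookup("x")
	fmt.Println(inner, outer) // prints "2 1"
}
```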
68149

69-
### Key Types
150+
## Testing
70151

71-
- **Expression** (`gnata.go`) — Compiled, goroutine-safe JSONata expression with fast-path metadata
72-
- **StreamEvaluator** (`stream.go`) — Copy-on-write expression slice + `BoundedCache` for schema plans
73-
- **BoundedCache** (`bounded_cache.go`) — Lock-free FIFO ring-buffer cache (atomic pointer reads)
74-
- **OrderedMap** (`internal/evaluator/ordered_map.go`) — Insertion-ordered map preserving JSON field order
75-
- **Environment** (`internal/evaluator/env.go`) — Lexical scope chain for variable bindings
152+
- Tests use separate `_test` packages (`gnata_test`, `lexer_test`, `parser_test`)
153+
- `testdata/groups/` contains 100+ test case groups ported from the jsonata-js official suite
154+
- `testdata/datasets/` contains JSON test fixtures
155+
- `suite_test.go` loads 1,200+ JSON test cases — each `.json` file has `expr`, `dataset`, `bindings`, and `result` fields
156+
- Key test files: `gnata_test.go`, `stream_test.go`, `func_fast_test.go`, `evaluator_test.go`, `suite_test.go`, `lexer_test.go`, `parser_test.go`, `analysis_test.go`
157+
- SQLite benchmarks: `sqlite/benchmarks/*.sql`
158+
- CI runs tests with `-race` flag
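
For orientation, a test case with the four fields listed above could be decoded along these lines. The `suiteCase` struct, the Go types chosen for each field, and the sample JSON are assumptions; the actual loader in `suite_test.go` may model them differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// suiteCase mirrors the four documented fields; the field types are guesses.
type suiteCase struct {
	Expr     string          `json:"expr"`
	Dataset  string          `json:"dataset"`
	Bindings map[string]any  `json:"bindings"`
	Result   json.RawMessage `json:"result"` // kept raw so any JSON value fits
}

// parseCase decodes one test case file's contents.
func parseCase(data []byte) (suiteCase, error) {
	var c suiteCase
	err := json.Unmarshal(data, &c)
	return c, err
}

func main() {
	// Hypothetical case content, not copied from testdata/groups/.
	raw := []byte(`{"expr":"a.b","dataset":"d1","bindings":{},"result":42}`)
	c, err := parseCase(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(c.Expr, c.Dataset, string(c.Result))
}
```

Keeping `result` as `json.RawMessage` defers the decision of how to compare expected and actual values until the evaluator's output type is known.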
76159

77160
## Dependencies
78161

79-
Only one direct dependency (pure Go, no CGo):
162+
The core has only one direct dependency (pure Go, no CGo):
80163
- `tidwall/gjson` — Zero-copy JSON field extraction for fast-path byte-level evaluation
81164

82-
Regex uses Go's standard `regexp` package (no external regex library).
83-
84-
## Testing
85-
86-
Tests use separate `_test` packages (`gnata_test`, `lexer_test`, `parser_test`). The primary test suite (`suite_test.go`) loads 1,200+ JSON test cases from `testdata/groups/` (100+ subdirectories) — each `.json` file is a case with `expr`, `dataset`, `bindings`, and `result` fields. Datasets live in `testdata/datasets/`. Additional unit tests in `evaluator_test.go` cover regression tests using table-driven test (TDT) style.
165+
The `sqlite/` package requires CGo (links against SQLite via `sqlite3ext.h`). Regex uses Go's standard `regexp` package.
87166

88167
## Custom Functions
89168

@@ -94,7 +173,3 @@ customFuncs := map[string]gnata.CustomFunc{
94173
}
95174
se := gnata.NewStreamEvaluator(nil, gnata.WithCustomFunctions(customFuncs))
96175
```
97-
98-
## WASM
99-
100-
`wasm/main.go` exports `gnataEval`, `gnataCompile`, `gnataEvalHandle` for browser use. Build with `GOOS=js GOARCH=wasm go build -ldflags="-s -w" -trimpath -o gnata.wasm ./wasm/`.

CODE_OF_CONDUCT.md

Lines changed: 0 additions & 83 deletions
This file was deleted.

CONTRIBUTING.md

Lines changed: 0 additions & 82 deletions
This file was deleted.

LICENSE

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
MIT License
22

3-
Copyright (c) 2026 RecoLabs Inc.
3+
Copyright (c) 2026 Robert Polana
44

55
Permission is hereby granted, free of charge, to any person obtaining a copy
66
of this software and associated documentation files (the "Software"), to deal
