feat(engine): persisted FQN index schema foundation (PR-01) #719

Open
shivasurya wants to merge 1 commit into main from shiva/rauth-pr-01-index-schema

@shivasurya

Owner

Stack: PR-01 of 14 (Rule Authoring Toolchain, Phase 0). First PR in the stack; nothing depends on it yet upstream.

What

Generalise the experimental Go-only SQLite analysis cache into a language-agnostic, persisted FQN index that later PRs populate (fqn_index in PR-02, call_sites in PR-03) and query (pathfinder fqn in PR-04). No new user-facing query surface ships here; this is the schema + location + versioning foundation.

Changes

  • schema.go (new): fqn_index and call_sites tables plus their indices (per tech-spec Section 2.1), created idempotently alongside the existing Go cache tables. applySchema / wipeDataTables helpers.
  • metadata.go (new): meta key/value accessors with a real "missing key" sentinel kept distinct from SQL errors, so an absent stamp is never confused with a backend failure.
  • index_path.go (new): default location moves out of the project tree to $HOME/.codepathfinder/<project-hash>.sqlite (16 hex chars of SHA-256 over the cleaned absolute root, so branches share one index). --index-path override and CODEPATHFINDER_INDEX_PATH env var; the parent directory is created for every resolved path.
  • analysis_cache.go: options-based constructor (OpenAnalysisCacheWithOptions), global schema_version + engine_version stamps, and open-time auto-rebuild on mismatch or --rebuild-index. An absent stamp (a DB written by an older binary) is upgraded in place rather than wiped, so existing warm Go caches survive the first upgrade. Engine version is only compared when known, so an auxiliary command cannot clobber another command's stamp.
  • scan.go: --index-path and --rebuild-index flags; the resolved index path is logged at debug verbosity.

OpenAnalysisCache(projectRoot) is retained as a thin wrapper so every existing caller and test keeps compiling. The per-table version mechanism, Go callgraph cache tables, and scan output are unchanged.
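
A thin wrapper over an options-based constructor usually looks like the sketch below. The lowercase stand-ins `openAnalysisCache` and `openAnalysisCacheWithOptions`, the `cacheOptions` struct, and all of its fields are hypothetical; the PR's real types and signatures are not shown on this page.

```go
package main

import (
	"errors"
	"fmt"
)

// cacheOptions is a hypothetical options struct; the field names are
// illustrative and not taken from the PR.
type cacheOptions struct {
	ProjectRoot   string
	IndexPath     string // empty means "use the default location"
	RebuildIndex  bool
	EngineVersion string // empty means "unknown, do not compare"
}

// analysisCache is a stand-in for the real cache handle.
type analysisCache struct {
	opts cacheOptions
}

// openAnalysisCacheWithOptions is the full constructor.
func openAnalysisCacheWithOptions(opts cacheOptions) (*analysisCache, error) {
	if opts.ProjectRoot == "" {
		return nil, errors.New("project root is required")
	}
	return &analysisCache{opts: opts}, nil
}

// openAnalysisCache keeps the old single-argument signature alive by
// delegating with zero-valued options, so existing callers compile unchanged.
func openAnalysisCache(projectRoot string) (*analysisCache, error) {
	return openAnalysisCacheWithOptions(cacheOptions{ProjectRoot: projectRoot})
}

func main() {
	c, err := openAnalysisCache("/tmp/project")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", c.opts)
}
```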

Reconciliations vs the spec's illustrative code

  • Reused the existing meta table instead of introducing a separate metadata table (renaming would orphan existing DBs for no benefit).
  • Rebuild guard skips when the stored stamp is absent (vs the spec's literal != current), so existing caches upgrade gracefully instead of being wiped on the first run of the new binary.
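
The rebuild guard described above can be expressed as a small pure function. This is a sketch under stated assumptions: versions are modelled as strings, an empty string stands for "absent" (stored) or "unknown" (current engine), and the `action` type and `decide` name are hypothetical rather than the PR's reconcileVersions implementation.

```go
package main

import "fmt"

type action int

const (
	actionNone    action = iota // stamps already match; reuse the warm cache
	actionStamp                 // a stamp is absent; upgrade in place by writing it
	actionRebuild               // wipe the data tables, then re-stamp
)

// decide returns what the open path should do with an existing database.
// A stored value of "" means the stamp is absent (DB written by an older
// binary). A currentEngine of "" means the caller does not know the engine
// version, so the engine stamp is neither compared nor treated as missing.
func decide(storedSchema, currentSchema, storedEngine, currentEngine string, forceRebuild bool) action {
	if forceRebuild {
		return actionRebuild
	}
	// Only a present-but-different stamp counts as a mismatch.
	if storedSchema != "" && storedSchema != currentSchema {
		return actionRebuild
	}
	if currentEngine != "" && storedEngine != "" && storedEngine != currentEngine {
		return actionRebuild
	}
	// Absent stamps are filled in without touching cached data.
	if storedSchema == "" || (currentEngine != "" && storedEngine == "") {
		return actionStamp
	}
	return actionNone
}

func main() {
	fmt.Println(decide("", "1", "", "2.1.1", false) == actionStamp)          // old DB, no stamps
	fmt.Println(decide("1", "1", "1.0.0", "2.1.1", false) == actionRebuild) // engine bump
	fmt.Println(decide("1", "1", "1.0.0", "", false) == actionNone)         // engine unknown
}
```

Comparing with a literal `stored != current` would send the first case down the rebuild path, which is the wipe-on-first-upgrade behaviour this reconciliation avoids.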

Verification

  • gradle buildGo clean; go test ./... green; golangci-lint run reports 0 issues.
  • New code unit-tested incl. SQL-error injection; schema.go and metadata.go at 100% line coverage.
  • End-to-end on a Go project: fresh run creates all tables with correct meta stamps; second run reuses the warm cache (no rebuild); --rebuild-index rebuilds; an engine-version bump triggers [cache] rebuilding index: engine 1.0.0 -> 2.1.1 and re-stamps.

Note: the repo lintGo gradle task chains lintPython, which is red on main due to pre-existing black formatting in three python-sdk/ files. This PR touches zero Python files.

🤖 Generated with Claude Code

@shivasurya added the enhancement and go labels on May 30, 2026
@shivasurya self-assigned this on May 30, 2026
@safedep

safedep Bot commented May 30, 2026

SafeDep Report Summary

Malicious packages: green. Vulnerable packages: green. Risky license: green.

No dependency changes detected. Nothing to scan.

This report is generated by SafeDep Github App

@code-pathfinder

code-pathfinder Bot commented May 30, 2026

Pathfinder Report

No security findings on the changed files. This pull request is clean.

Powered by Code Pathfinder.

@codecov

codecov Bot commented May 30, 2026

Codecov Report

❌ Patch coverage is 90.19608% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.49%. Comparing base (460d0d3) to head (7926bf4).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| sast-engine/cmd/scan.go | 16.66% | 10 Missing ⚠️ |
| ...t-engine/graph/callgraph/builder/analysis_cache.go | 96.34% | 1 Missing and 2 partials ⚠️ |
| sast-engine/graph/callgraph/builder/index_path.go | 92.00% | 1 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #719      +/-   ##
==========================================
+ Coverage   85.47%   85.49%   +0.02%     
==========================================
  Files         192      195       +3     
  Lines       27523    27605      +82     
==========================================
+ Hits        23524    23600      +76     
- Misses       3107     3113       +6     
  Partials      892      892              

Extend the experimental Go-only analysis cache into a language-agnostic
on-disk index that later phases populate and query. This is the Phase 0
foundation of the rule-authoring toolchain; it ships no new query surface
yet (that lands in PR-04).

- Add fqn_index and call_sites tables (per tech-spec Section 2.1) plus their
  indices, alongside the existing Go cache tables, via a new schema.go.
- Add a global schema_version stamp and an engine_version stamp in meta, with
  open-time auto-rebuild on mismatch. An absent stamp (a DB written by an older
  binary) is upgraded in place, not wiped, so warm Go caches survive the first
  upgrade. The engine version is only compared when known, so an auxiliary
  command cannot clobber another command's stamp.
- The open path is two clear steps: applySchema (idempotent DDL) then
  reconcileVersions (rebuild/wipe decision + stamp), which also makes the
  invalidation paths directly testable.
- ResolveIndexPath: default $HOME/.codepathfinder/<project-hash>.sqlite,
  --index-path override, CODEPATHFINDER_INDEX_PATH env var, parent dir created
  for every resolved path. Hash is the first 16 hex chars of SHA-256 over the
  cleaned absolute project root, so branches share one index.
- scan: --index-path and --rebuild-index flags wired through; the resolved
  index path is logged at debug verbosity.

The existing per-table version mechanism, Go callgraph cache tables, and scan
output are unchanged. Added-code line coverage is ~95% (schema.go and
metadata.go at 100%, index_path.go at 96%), with SQL-error paths exercised via
injection (closed DB, dropped tables, name collisions). The handful of
remaining lines are defensive error returns on a freshly opened DB. Behaviour
verified end-to-end against a Go project: fresh create, warm reuse, rebuild,
and engine-upgrade auto-rebuild.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@shivasurya force-pushed the shiva/rauth-pr-01-index-schema branch from db99dae to 7926bf4 on May 30, 2026 02:17