- Use `jq` to analyze JSON outputs from the AST.
- `poolmanager.json`: AST generated by solc (single-file build fixture for PoolManager).
Before working on any feature that touches goto-definition, references, call hierarchy, implementation, rename, or the caching system, you must understand how file IDs and node IDs work. Getting this wrong causes cross-build collisions that are extremely hard to debug (a function in one build silently maps to a completely different function in another build).
| Type | Defined in | Inner | Assigned by | Stable across compilations? |
|---|---|---|---|---|
| `FileId(u64)` | `types.rs:24` | unsigned 64-bit | `PathInterner` (canonical) | Yes: same path always gets the same ID |
| `NodeId(i64)` | `types.rs:4` | signed 64-bit | solc (per-compilation) | No: same function can get different IDs |
There is also SolcFileId(String) — a string wrapper used as HashMap keys matching
solc's JSON output (e.g. "0", "34"). It is the stringified form of a file ID.
Node IDs are signed because solc uses negative IDs for built-in symbols (-1 for
abi, -15 for msg, -18 for require, -28 for this).
Every AST node has a src field in the format "offset:length:fileId":
- `offset`: byte offset from the start of the source file
- `length`: byte length of the source range
- `fileId`: which source file this location belongs to
Parsed by SourceLoc::parse() in types.rs. After canonicalization, fileId is a
canonical FileId from the PathInterner, not solc's original per-compilation ID.
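
The `src` format above can be illustrated with a minimal parsing sketch. `SrcSpan` and `parse_src` are hypothetical names used only for illustration; they are not the real `SourceLoc::parse()` in `types.rs`, whose exact signature and error handling may differ.

```rust
/// Hypothetical stand-in for the real `SourceLoc` type in types.rs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SrcSpan {
    pub offset: usize,
    pub length: usize,
    pub file_id: u64,
}

/// Parse an "offset:length:fileId" string.
/// Returns None if a component is missing, extra, or not a non-negative integer.
/// (Assumption of this sketch: a negative file index is simply rejected.)
pub fn parse_src(src: &str) -> Option<SrcSpan> {
    let mut parts = src.split(':');
    let offset = parts.next()?.parse().ok()?;
    let length = parts.next()?.parse().ok()?;
    let file_id = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some(SrcSpan { offset, length, file_id })
}
```

Under these assumptions, `parse_src("120:45:3")` yields a span with offset 120, length 45, and file ID 3.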
Solc assigns file IDs sequentially based on input order. If you compile Foo.sol first,
it gets ID 0. If you compile Bar.sol first, Foo.sol gets a different ID. A single-file
build of PoolManager.sol produces different file IDs than a full project build that
includes all 160 source files.
PathInterner (types.rs:718) is a project-wide, append-only table that assigns canonical
FileId values from file paths. It lives on ForgeLsp behind Arc<RwLock<PathInterner>>.
The invariant: Once a path is interned, it keeps the same ID for the lifetime of the
session. Every CachedBuild::new() call (the only production constructor for fresh builds)
does this:
- Calls `interner.build_remap(&solc_id_to_path)`: for each file in this compilation, interns its path and builds a translation table `{solc_file_id → canonical_FileId}`.
- Calls `canonicalize_node_info()` on every `NodeInfo`: rewrites the `fileId` component in `src`, `name_location`, `name_locations`, and `member_location` strings.
- Rewrites `external_refs` keys the same way.
- Sets `id_to_path_map = interner.to_id_to_path_map()`, the canonical map.
After this, all builds share the same file-ID space. You can safely resolve any src
string from any build using any build's id_to_path_map. This is the foundation that
makes merging builds and cross-build src lookups safe.
Key code path: goto.rs → CachedBuild::new() → build_remap() + canonicalize_node_info()
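
The interning and remapping steps can be sketched roughly as follows. This is a simplified model, not the real `PathInterner`: `SketchInterner` and `canonicalize_src` are hypothetical names, paths are plain strings, and IDs are bare `u64` values rather than `FileId`.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified model of an append-only path interner.
#[derive(Default)]
pub struct SketchInterner {
    path_to_id: HashMap<String, u64>,
    paths: Vec<String>,
}

impl SketchInterner {
    /// Return the existing ID for a path, or append the path and assign the next ID.
    pub fn intern(&mut self, path: &str) -> u64 {
        if let Some(&id) = self.path_to_id.get(path) {
            return id;
        }
        let id = self.paths.len() as u64;
        self.paths.push(path.to_string());
        self.path_to_id.insert(path.to_string(), id);
        id
    }

    /// Build {solc_file_id -> canonical_id} for one compilation.
    pub fn build_remap(&mut self, solc_id_to_path: &HashMap<u64, String>) -> HashMap<u64, u64> {
        let mut remap = HashMap::new();
        for (&solc_id, path) in solc_id_to_path {
            remap.insert(solc_id, self.intern(path));
        }
        remap
    }
}

/// Rewrite the trailing fileId of an "offset:length:fileId" string.
/// Returns None if the string is malformed or the file ID is not in the remap.
pub fn canonicalize_src(src: &str, remap: &HashMap<u64, u64>) -> Option<String> {
    let (head, file) = src.rsplit_once(':')?;
    let canonical = remap.get(&file.parse::<u64>().ok()?)?;
    Some(format!("{head}:{canonical}"))
}
```

In this model, two compilations that list `Foo.sol` under different solc file IDs both remap it to the same canonical ID, which is the property the invariant above relies on.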
Solc assigns node IDs as a monotonically increasing counter during AST construction. The counter's value depends on how many nodes have been processed before a given declaration. When the compilation closure changes (different files in scope), the same function gets a different numeric ID.
Concrete example from our debugging:
- File build of `PoolManager.sol`: `swap` function = node ID 616
- Sub-cache build of a library: node ID 616 = a completely different function
- Searching all files for bare node ID 616 across builds would return the wrong function
This is explicitly documented in code:
- `references.rs:408`: "Node IDs are not stable across builds, but byte offsets within a file are."
- `lsp.rs:103`: "Each sub-cache has its own node ID space — matching across caches is done by absolute file path + byte offset, not by node ID."
The server uses (absolute_file_path, byte_offset) as the cross-build-safe identifier
for any source location. This pair is stable because:
- File paths don't change between compilations
- Byte offsets are properties of the source text, not the compilation
The pattern for cross-build lookups:
- Step 1: In the originating build, resolve to `(abs_path, byte_offset)` → `resolve_target_location()` in `references.rs:411`
- Step 2: In each target build, re-resolve to that build's node ID → `byte_to_id(build.nodes, abs_path, byte_offset)` in `references.rs:131`
byte_to_id() finds the innermost AST node at a byte position using span containment:
for every node in the file, checks offset <= position < offset + length, then picks the
narrowest (smallest length) match. This gives you the build-local NodeId for the same
source location.
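
The span-containment rule described above can be sketched as follows. `SketchNode` and `byte_to_id_sketch` are hypothetical names; the real `byte_to_id()` operates on `NodeInfo` values and typed IDs, and its tie-breaking for equal-length spans is not specified here.

```rust
use std::collections::HashMap;

/// Hypothetical minimal node: only the byte span matters for this lookup.
pub struct SketchNode {
    pub offset: usize,
    pub length: usize,
}

/// Find the narrowest node in `abs_path` whose span contains `position`.
/// Containment is half-open: offset <= position < offset + length.
pub fn byte_to_id_sketch(
    nodes: &HashMap<String, HashMap<i64, SketchNode>>,
    abs_path: &str,
    position: usize,
) -> Option<i64> {
    nodes
        .get(abs_path)?
        .iter()
        .filter(|(_, n)| n.offset <= position && position < n.offset + n.length)
        .min_by_key(|(_, n)| n.length) // narrowest span wins
        .map(|(&id, _)| id)
}
```

Because the lookup is keyed by path and byte position only, it returns whatever ID the given build assigned to that location, which is why it is safe to use across builds.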
Within a single build's data, node IDs are globally unique and safe to use freely. All of these are safe:
- `build.nodes[abs_path][node_id]`: lookup within one build
- `build.decl_index[node_id]`: typed declaration lookup within one build
- `node_info.referenced_declaration`: following a reference within one build
- `build.base_function_implementation[node_id]`: equivalence lookup within one build
- `find_node_info(&build.nodes, node_id)`: search all files within one build
Any time you hold a NodeId from build A and look it up in build B:
- `builds.iter().find_map(|b| find_node_info(&b.nodes, node_id))`: WRONG, leaks node IDs across builds. A sub-cache may have a completely different function at the same numeric ID.
- `other_build.decl_index.get(&node_id)`: WRONG unless you know both builds compiled the same file and solc assigned the same IDs (which is true for file build vs project build of the same file, but NOT for sub-caches).
A NodeId alone is ambiguous across builds, but a NodeId plus its NodeInfo
carries enough metadata to prove identity. Every node has:
- `name_location`: `"offset:length:canonicalFileId"`, a globally unique position
- The source text at that position: the node's name
Since canonical file IDs are stable (PathInterner) and byte offsets are properties of
the source text, checking (file_path, name_offset, name_text) is an O(1) identity
proof. This is implemented as verify_node_identity() in call_hierarchy.rs:
```rust
// O(1) identity check: does node_id in this build refer to the expected entity?
verify_node_identity(
    &build.nodes,
    node_id,
    expected_abs_path,    // which file
    expected_name_offset, // byte offset of name_location
    expected_name,        // function/modifier/contract name
) -> bool
```

The check is: look up build.nodes[abs_path][node_id], parse its name_location
offset, compare against the expected offset, then read the source bytes at that span
to confirm the name matches. If all three match (file + offset + name), this is
guaranteed to be the same source entity regardless of which compilation produced
the build.
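
A rough sketch of the three-part check (file, offset, name) follows. It is not the code in `call_hierarchy.rs`: `NamedNode` and `verify_identity_sketch` are hypothetical, the name location is stored pre-parsed, and the file's source text is passed in directly instead of being read by the function.

```rust
use std::collections::HashMap;

/// Hypothetical minimal node: a pre-parsed name_location (offset + length).
pub struct NamedNode {
    pub name_offset: usize,
    pub name_length: usize,
}

/// Does `node_id` in this build name `expected_name` at `expected_name_offset`
/// in `expected_abs_path`? `source` is that file's text (assumed to be supplied by the caller).
pub fn verify_identity_sketch(
    nodes: &HashMap<String, HashMap<i64, NamedNode>>,
    source: &str,
    node_id: i64,
    expected_abs_path: &str,
    expected_name_offset: usize,
    expected_name: &str,
) -> bool {
    // 1. The ID must exist under the expected file.
    let node = match nodes.get(expected_abs_path).and_then(|file| file.get(&node_id)) {
        Some(node) => node,
        None => return false,
    };
    // 2. The name must start at the expected byte offset.
    if node.name_offset != expected_name_offset {
        return false;
    }
    // 3. The source bytes at that span must spell the expected name.
    let end = node.name_offset + node.name_length;
    source.as_bytes().get(node.name_offset..end) == Some(expected_name.as_bytes())
}
```

Each step is a hash lookup or a bounded slice comparison, so the cost does not grow with the number of nodes in the build.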
When iterating multiple builds to find a target function, use
resolve_target_in_build() (call_hierarchy.rs):
```rust
for build in &builds {
    let ids = resolve_target_in_build(
        build, node_id, target_abs, target_name, target_name_offset,
    );
    // ids is empty if this build doesn't contain the target,
    // or contains the verified node ID(s) for the target.
}
```

This uses a two-tier strategy:

- Fast path (O(1)), `verify_node_identity()`: if the numeric ID exists and passes identity verification, accept it immediately.
- Slow path (O(n)), `byte_to_id()`: if the ID doesn't exist or fails verification (e.g. sub-cache with a different function at the same numeric ID), re-resolve by byte offset using span containment.
This replaces the older pattern of contains_key + inline name/position scan.
Used in both callHierarchy/incomingCalls and callHierarchy/outgoingCalls.
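
The two-tier strategy can be modeled in a few lines. This is a simplified sketch, not the real `resolve_target_in_build()`: `DeclNode` and `resolve_in_build_sketch` are hypothetical, the node stores its name directly instead of reading source bytes, and using the name offset as the position for the slow path is an assumption of this sketch.

```rust
use std::collections::HashMap;

/// Hypothetical minimal declaration node.
pub struct DeclNode {
    pub offset: usize,      // start of the full declaration span
    pub length: usize,      // length of the full declaration span
    pub name_offset: usize, // byte offset of the name
    pub name: String,
}

/// Return the build-local ID(s) for the target, or an empty Vec if absent.
pub fn resolve_in_build_sketch(
    nodes: &HashMap<String, HashMap<i64, DeclNode>>,
    node_id: i64,
    target_abs: &str,
    target_name: &str,
    target_name_offset: usize,
) -> Vec<i64> {
    let file = match nodes.get(target_abs) {
        Some(file) => file,
        None => return Vec::new(), // this build does not contain the file
    };
    // Fast path: the numeric ID exists and passes the identity check.
    if let Some(n) = file.get(&node_id) {
        if n.name_offset == target_name_offset && n.name == target_name {
            return vec![node_id];
        }
    }
    // Slow path: re-resolve by span containment at the name's byte offset.
    file.iter()
        .filter(|(_, n)| n.offset <= target_name_offset && target_name_offset < n.offset + n.length)
        .min_by_key(|(_, n)| n.length)
        .map(|(&id, _)| vec![id])
        .unwrap_or_default()
}
```

In this model, a sub-cache that reuses ID 616 for a different function fails the fast path and falls through to the positional lookup, which returns that build's own ID for the target.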
When the same function appears in multiple builds (file build + project build both
contain PoolManager.swap), the results will have different NodeIds but the same
source position. Always dedup by source position (e.g. selectionRange.start),
never by node ID.
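
Deduplicating by source position might look like the following sketch. `Hit` and `dedup_by_position` are hypothetical; in the server the key would come from the LSP result's range start rather than these illustrative fields.

```rust
use std::collections::HashSet;

/// Hypothetical result item: a build-local node ID plus its source position.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Hit {
    pub node_id: i64,
    pub abs_path: String,
    pub line: u32,
    pub character: u32,
}

/// Keep the first hit for each (path, line, character); ignore node IDs entirely.
pub fn dedup_by_position(hits: Vec<Hit>) -> Vec<Hit> {
    let mut seen = HashSet::new();
    hits.into_iter()
        .filter(|h| seen.insert((h.abs_path.clone(), h.line, h.character)))
        .collect()
}
```

Two hits for the same function from different builds collapse into one here even though their node IDs differ.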
TreeSitter operates on the live buffer text (including unsaved edits) and is completely independent of solc's AST and its IDs. It is used for:
- Dirty-file goto-definition fallback (when AST byte offsets are stale)
- Document symbols, semantic tokens, folding ranges, selection ranges
- Signature help (finding the enclosing call expression)
- Code actions, highlight, rename (identifier collection)
TreeSitter nodes are identified by byte ranges in the current buffer, not by any persistent ID. They are always re-parsed from the current text.
| Build type | Created by | Scope | Node ID space |
|---|---|---|---|
| File build | `get_or_fetch_build()` → `CachedBuild::new()` | Target file + its imports | Shared with project build for same file |
| Project build | `ensure_project_cached_build()` | All src + test + script files | Shared with file builds for overlapping files |
| Sub-cache | `load_lib_cache()` → `from_reference_index()` | Library sub-project files | Isolated: different IDs for same functions |
The builds vector in LSP handlers is typically:
```rust
let mut builds = vec![&file_build];
if let Some(ref pb) = project_build { builds.push(pb); }
for sc in sub_caches.iter() { builds.push(sc); }
```

Key rule: File build and project build share node IDs for the same file. Sub-caches do NOT; always use the scoped lookup pattern for sub-caches.
| Operation | Safe method | Unsafe method |
|---|---|---|
| Cross-build function lookup | `resolve_target_in_build()` (verify + fallback) | Bare `NodeId` across builds |
| Cross-build node identity | `verify_node_identity(nodes, id, path, offset, name)` | `contains_key()` without validation |
| Cross-build `src` resolution | Any build's `id_to_path_map` (canonical) | Raw solc file IDs |
| Dedup across builds | Source position (`Range.start`) | Node ID comparison |
| Sub-cache node lookup | `verify_node_identity()` → `byte_to_id()` fallback | `find_node_info()` across all files |
| Within single build | Free use of `NodeId` everywhere | N/A |
- `poolmanager.json`: single-file solc AST output for `PoolManager.sol`. Node IDs in this fixture are from a file-level build. In a full project build, the same functions will have the same IDs (same file), but a sub-cache build of a different library will have a completely different mapping of IDs to functions.
Use jq to explore the AST:
```sh
# Find all FunctionDefinition nodes and their IDs
jq '[.. | objects | select(.nodeType == "FunctionDefinition") | {id, name}]' poolmanager.json

# Find a specific node's referencedDeclaration targets
jq '.. | objects | select(.referencedDeclaration == 616)' poolmanager.json

# Show the source_id_to_path mapping (solc's per-compilation file IDs)
jq '.sources | to_entries | map({key: .value.ast.id, path: .key})' poolmanager.json
```

Always use lsp-bench as the first choice when you want to debug LSP methods and their output. The lsp-bench repo is https://github.com/mmsaki/lsp-bench (local clone path: ../lsp-bench).
There are many examples in ./benchmarks showing how to write a simple YAML config for your needs.
For features that depend on cross-file data (references, call hierarchy, implementation), you need a full project index. Add this to your benchmark config:
```yaml
initializeSettings:
  projectIndex:
    fullProjectScan: true
```

Then use `waitForProgressToken` to wait for the index to complete:

```yaml
- method: callHierarchy/incomingCalls
  waitForProgressToken: "solidity/projectIndexFull"
```

Phase 1 (`solidity/projectIndex`) covers src-only files. Phase 2 (`solidity/projectIndexFull`)
covers src + test + script. Cross-file incoming callers require phase 2.
Always build with the `--release` flag.
When adding or changing a struct field, LSP method, named data structure, or feature behavior in src/, update the corresponding reference page in docs/pages/reference/ in the same commit.
Also keep these files in sync with each other whenever LSP methods or features change:
- `FEATURES.md` (root) and `docs/pages/docs/features.md` must always match
- `CHANGELOG.md` (root) and `docs/pages/changelog.md` must always match
After any doc changes, run bun run docs:publish to deploy to Cloudflare Pages.