Commit bf5b013

Night1099 authored and Ekozmaster committed
indirect calls, switch tables, dataflow, CI
1 parent e057422 commit bf5b013

18 files changed

Lines changed: 1230 additions & 61 deletions

.claude/CLAUDE.md

Lines changed: 3 additions & 2 deletions
@@ -5,6 +5,7 @@
 **Never run static analysis tools directly.** Delegate to a `static-analyzer` subagent. Only exceptions — run these inline:
 - `sigdb.py identify` / `fingerprint` (single-function ID, <5s)
 - `context.py assemble` / `postprocess` (context gathering, <5s)
+- `dataflow.py --constants` / `--slice` (single-function analysis, <5s)
 - `readmem.py` (single typed read from PE, <5s)
 - `asi_patcher.py build` (build step, not analysis)
 - `pyghidra_backend.py status` (project existence check, <1s)
@@ -75,5 +76,5 @@ Each file reads as if it was always designed this way. Comments guide the next d
 
 - **Tool catalog, decision guide, and caveats**: @.claude/rules/tool-catalog.md
 - **Subagent workflow and delegation rules**: @.claude/rules/subagent-workflow.md
-- **Project workspace and knowledge base format**: @.claude/rules/project-workspace.md
-- **DX9 FFP proxy porting for RTX Remix**: @.claude/rules/dx9-ffp-port.md
+- **DX9 FFP proxy porting for RTX Remix**: `/dx9-ffp-port` skill
+- **Frida-based dynamic analysis**: `/dynamic-analysis` skill

.claude/rules/subagent-workflow.md

Lines changed: 8 additions & 5 deletions
@@ -41,6 +41,7 @@ When analyzing a binary for the first time (no existing or sparsely populated `p
 | Single function ID (`sigdb.py identify`, `fingerprint`) | Main agent -- fast (<5s) |
 | Context assembly (`context.py assemble`) | Main agent -- fast (<5s) |
 | Decompiler postprocess (`context.py postprocess`) | Main agent -- instant |
+| Dataflow: constants + backward slice (`dataflow.py`) | Main agent -- fast (<5s) |
 | File editing, patch specs, builds | Main agent — directly |
 | KB updates from subagent findings | `static-analyzer` writes to `kb.h`; main agent may refine |

@@ -82,15 +83,17 @@ For deep analysis tasks (finding subsystems, mapping call chains, understanding
 ## Examples
 
 **"Disable culling in game.exe"**
-1. Spawn `static-analyzer` #1 (r2ghidra): find `SetRenderState` calls with `D3DRS_CULLMODE`, string search for "cull", xrefs to render state functions. Uses `--backend pdg --types kb.h`. Writes to `findings_r2.md`.
+1. Spawn `static-analyzer` #1 (r2ghidra): find `SetRenderState` calls with `D3DRS_CULLMODE`, string search for "cull", xrefs --indirect to find vtable call sites. Uses `--backend pdg --types kb.h`. Writes to `findings_r2.md`.
 2. Spawn `static-analyzer` #2 (pyghidra): same search strategy but decompile with `pyghidra_backend.py decompile`. Writes to `findings.md`.
 3. Immediately tell the user: "Please launch the game — I'll need to attach with livetools to patch culling at runtime once I find the addresses"
-4. When both return, merge findings and use `livetools` to verify and patch: `mem write` to NOP the cull-enable instruction or force `D3DRS_CULLMODE` to `D3DCULL_NONE`
+4. While waiting, run `dataflow.py --constants` on any known render functions to see what cull mode constants flow in (e.g., `eax = 0x2` = D3DCULL_CW)
+5. When both return, merge findings and use `livetools` to verify and patch: `mem write` to NOP the cull-enable instruction or force `D3DRS_CULLMODE` to `D3DCULL_NONE`
 
 **"What does function 0x401000 do?"**
-1. Spawn `static-analyzer`: decompile with `--types kb.h`, get callgraph, xrefs
-2. Tell the user: "Static analysis is running. Want me to also trace this function live to see actual register values and call frequency?"
-3. If yes, attach with `livetools trace 0x401000 --count 20 --read`
+1. Spawn `static-analyzer`: decompile with `--types kb.h`, get callgraph --indirect, xrefs
+2. Run `dataflow.py 0x401000 --constants` inline — see what constants flow through
+3. Tell the user: "Static analysis is running. Want me to also trace this function live to see actual register values and call frequency?"
+4. If yes, attach with `livetools trace 0x401000 --count 20 --read`
 
 **"Find who writes to address 0x7A0000"**
 1. Spawn `static-analyzer`: `datarefs.py` for static references
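
Reviewer note on step 4 of the culling example: the `eax = 0x2` = D3DCULL_CW reading follows from the Direct3D 9 `D3DCULL` enumeration. The sketch below shows one way to turn a propagated constant into a readable name. The enum values are the documented Direct3D 9 ones; the helper `describe_cull_constant` is hypothetical and is not part of `retools`.

```python
from enum import IntEnum


class D3DCULL(IntEnum):
    """Cull modes as defined by the Direct3D 9 D3DCULL enumeration."""
    NONE = 1
    CW = 2
    CCW = 3


# D3DRENDERSTATETYPE value that selects the cull mode in SetRenderState.
D3DRS_CULLMODE = 22


def describe_cull_constant(value: int) -> str:
    """Hypothetical helper: name a constant that reaches the cull-mode argument."""
    try:
        return f"D3DCULL_{D3DCULL(value).name}"
    except ValueError:
        # Not a valid cull mode; report it rather than guess.
        return f"unknown cull mode 0x{value:x}"
```

Forcing the argument to `D3DCULL.NONE` (value 1) is what the final patch step aims for.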

.claude/rules/tool-catalog.md

Lines changed: 12 additions & 6 deletions
@@ -23,15 +23,20 @@ These are fast (<5s) and allowed inline:
 - "Get full context before reasoning about a function" → `python -m retools.context assemble $B $VA --project $P`
 - "Clean up decompiler output with known names" → pipe through `python -m retools.context postprocess`
 - "Read a typed value from the PE file" → `python -m retools.readmem $B $VA $TYPE`
+- "What constant flows into this register?" → `python -m retools.dataflow $B $VA --constants`
+- "Trace where this value comes from" → `python -m retools.dataflow $B $VA --slice TARGET_VA:REG`
 - "Build an ASI patch DLL" → `python -m retools.asi_patcher build spec.json`
 
 ### Delegate to `static-analyzer` subagent
 
 Everything else. Tell the subagent WHAT you need, not HOW to run it — it has the full tool catalog.
 
-- "What does this function do?" → decompile + callgraph + xrefs
+- "What does this function do?" → decompile + callgraph + xrefs + dataflow --constants
 - "Who calls this function?" → xrefs or callgraph --up
-- "What does this function call?" → callgraph --down
+- "What does this function call?" → callgraph --down (add --indirect for vtable calls)
+- "Who calls this virtual method?" → xrefs --indirect + filter by vtable slot offset
+- "What constant reaches this call?" → dataflow --constants or --slice VA:REG
+- "Resolve a switch/jump table" → cfg (auto-resolves MSVC switch patterns)
 - "Find a string and who uses it" → string search with xrefs
 - "Where is this global read/written?" → datarefs
 - "Where is struct field +0x54 used?" → structrefs
@@ -69,9 +74,10 @@ Everything else. Tell the subagent WHAT you need, not HOW to run it — it has t
 | `pyghidra_backend.py decompile $B $VA --project $P` | Decompile via saved Ghidra project | `pyghidra_backend.py decompile game.exe 0x401000 --project patches/MyGame` |
 | `pyghidra_backend.py status $B --project $P` | Check if Ghidra project exists | `pyghidra_backend.py status game.exe --project patches/MyGame` |
 | `funcinfo.py $B $VA` | Find function start/end, rets, calling convention, callees | `funcinfo.py binary.exe 0x401000` |
-| `cfg.py $B $VA` | Control flow graph (basic blocks + edges, text or mermaid) | `cfg.py binary.exe 0x401000 --format mermaid` |
-| `callgraph.py $B $VA` | Caller/callee tree (multi-level, --up/--down N) | `callgraph.py binary.exe 0x401000 --up 3` |
-| `xrefs.py $B $VA` | Find all calls/jumps TO an address | `xrefs.py binary.exe 0x401000 -t call` |
+| `cfg.py $B $VA` | Control flow graph (basic blocks + edges, text or mermaid). Resolves MSVC switch/jump tables automatically. `--switch-details` shows table info | `cfg.py binary.exe 0x401000 --format mermaid` |
+| `callgraph.py $B $VA` | Caller/callee tree (multi-level, --up/--down N). `--indirect` adds vtable/fptr calls to --down trees | `callgraph.py binary.exe 0x401000 --down 2 --indirect` |
+| `xrefs.py $B $VA` | Find all calls/jumps TO an address. `--indirect` also scans for `call [reg+offset]`, `call [reg]`, `call [addr]` | `xrefs.py binary.exe 0x401000 --indirect` |
+| `dataflow.py $B $VA` | Forward constant propagation (`--constants`) or backward register slice (`--slice VA:REG`) within a function | `dataflow.py binary.exe 0x401000 --constants` |
 | `datarefs.py $B $VA` | Find instructions that reference a global address (mem deref + `--imm` for push/mov constants) | `datarefs.py binary.exe 0x7A0000 --imm` |
 | `structrefs.py $B $OFF` | Find all `[reg+offset]` accesses (struct field usage) | `structrefs.py binary.exe 0x54 --base esi` |
 | `structrefs.py $B --aggregate` | Reconstruct C struct from all field accesses in a function | `structrefs.py binary.exe --aggregate --fn 0x401000 --base esi` |
@@ -92,7 +98,7 @@ Everything else. Tell the subagent WHAT you need, not HOW to run it — it has t
 | `sigdb.py scan $B` | Bulk signature scan against DB | `sigdb.py scan game.exe` |
 | `sigdb.py identify $B $VA` | Single function signature lookup (multi-tier) | `sigdb.py identify game.exe 0x401200` |
 | `sigdb.py fingerprint $B` | Identify compiler version (Rich header + markers + imports) | `sigdb.py fingerprint game.exe` |
-| `context.py assemble $B $VA --project $P` | Gather full analysis context for a function | `context.py assemble game.exe 0x401500 --project Warband` |
+| `context.py assemble $B $VA --project $P` | Gather full analysis context for a function. Includes forward constant propagation by default (`--no-dataflow` to skip) | `context.py assemble game.exe 0x401500 --project Warband` |
 | `context.py postprocess $B $VA --project $P` | Mechanically rename/annotate decompiler output (pipe) | `decompiler.py ... \| context.py postprocess ...` |
 | `sigdb.py build $MANIFEST` | Build/extend signature DB from manifest | `sigdb.py build sources.json` |
 | `sigdb.py pull` | Download signature DB from HuggingFace | `sigdb.py pull` or `sigdb.py pull --sources` |
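
Reviewer note on the new `dataflow.py` row: "forward constant propagation ... within a function" can be illustrated with a toy model. The sketch below is only an illustration of the general technique over straight-line code. The `(mnemonic, dst, src)` tuple encoding and the function name are assumptions made for the example, and nothing here reflects how `retools.dataflow` is actually implemented (it does not handle branches, memory, or flags).

```python
# Registers a cdecl/stdcall callee may overwrite, so their values are unknown after a call.
CALLER_SAVED = ("eax", "ecx", "edx")


def propagate_constants(insns):
    """Return the known register constants after each instruction.

    insns: list of (mnemonic, dst, src); src is an int immediate or a register name.
    This encoding is hypothetical and exists only for this sketch.
    """
    known = {}
    states = []
    for op, dst, src in insns:
        if op == "call":
            # A call invalidates the caller-saved registers.
            for reg in CALLER_SAVED:
                known.pop(reg, None)
            states.append(dict(known))
            continue

        # Resolve the source operand to a constant if one is known.
        value = src if isinstance(src, int) else known.get(src)

        if op == "xor" and dst == src:
            result = 0  # xor reg, reg always yields zero
        elif op == "mov":
            result = value
        elif op == "add" and dst in known and value is not None:
            result = (known[dst] + value) & 0xFFFFFFFF  # 32-bit wraparound
        else:
            result = None  # anything else makes dst unknown

        if result is None:
            known.pop(dst, None)
        else:
            known[dst] = result
        states.append(dict(known))
    return states
```

For example, feeding it `xor eax, eax` followed by `add eax, 2` yields `eax = 2` at the second instruction, which is the kind of fact the `--constants` output in the workflow examples refers to.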

.cursor/rules/subagent-workflow.mdc

Lines changed: 8 additions & 5 deletions
@@ -42,6 +42,7 @@ When analyzing a binary for the first time (no existing or sparsely populated `p
 | Single function ID (`sigdb.py identify`, `fingerprint`) | Main agent -- fast (<5s) |
 | Context assembly (`context.py assemble`) | Main agent -- fast (<5s) |
 | Decompiler postprocess (`context.py postprocess`) | Main agent -- instant |
+| Dataflow: constants + backward slice (`dataflow.py`) | Main agent -- fast (<5s) |
 | File editing, patch specs, builds | Main agent — directly |
 | KB updates from subagent findings | `static-analyzer` writes to `kb.h`; main agent may refine |
4748

@@ -83,15 +84,17 @@ For deep analysis tasks (finding subsystems, mapping call chains, understanding
 ## Examples
 
 **"Disable culling in game.exe"**
-1. Spawn `static-analyzer` #1 (r2ghidra): find `SetRenderState` calls with `D3DRS_CULLMODE`, string search for "cull", xrefs to render state functions. Uses `--backend pdg --types kb.h`. Writes to `findings_r2.md`.
+1. Spawn `static-analyzer` #1 (r2ghidra): find `SetRenderState` calls with `D3DRS_CULLMODE`, string search for "cull", xrefs --indirect to find vtable call sites. Uses `--backend pdg --types kb.h`. Writes to `findings_r2.md`.
 2. Spawn `static-analyzer` #2 (pyghidra): same search strategy but decompile with `pyghidra_backend.py decompile`. Writes to `findings.md`.
 3. Immediately tell the user: "Please launch the game — I'll need to attach with livetools to patch culling at runtime once I find the addresses"
-4. When both return, merge findings and use `livetools` to verify and patch: `mem write` to NOP the cull-enable instruction or force `D3DRS_CULLMODE` to `D3DCULL_NONE`
+4. While waiting, run `dataflow.py --constants` on any known render functions to see what cull mode constants flow in (e.g., `eax = 0x2` = D3DCULL_CW)
+5. When both return, merge findings and use `livetools` to verify and patch: `mem write` to NOP the cull-enable instruction or force `D3DRS_CULLMODE` to `D3DCULL_NONE`
 
 **"What does function 0x401000 do?"**
-1. Spawn `static-analyzer`: decompile with `--types kb.h`, get callgraph, xrefs
-2. Tell the user: "Static analysis is running. Want me to also trace this function live to see actual register values and call frequency?"
-3. If yes, attach with `livetools trace 0x401000 --count 20 --read`
+1. Spawn `static-analyzer`: decompile with `--types kb.h`, get callgraph --indirect, xrefs
+2. Run `dataflow.py 0x401000 --constants` inline — see what constants flow through
+3. Tell the user: "Static analysis is running. Want me to also trace this function live to see actual register values and call frequency?"
+4. If yes, attach with `livetools trace 0x401000 --count 20 --read`
 
 **"Find who writes to address 0x7A0000"**
 1. Spawn `static-analyzer`: `datarefs.py` for static references
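
Reviewer note on "xrefs --indirect to find vtable call sites": the catalog describes three operand shapes for indirect calls, `call [reg+offset]`, `call [reg]`, and `call [addr]`. The sketch below classifies a line of Intel-syntax 32-bit disassembly text into one of those shapes so that hits can be filtered by vtable slot offset. The regexes and the function name are assumptions for illustration; `xrefs.py` may well work on decoded instructions rather than text.

```python
import re

# 32-bit general-purpose registers only; an assumption for this sketch.
_REGS = "eax|ebx|ecx|edx|esi|edi|ebp|esp"
_CALL = re.compile(r"call\s+(?:dword ptr\s+)?\[(?P<body>[^\]]+)\]")
_REG_OPERAND = re.compile(rf"({_REGS})(?:\+(0x[0-9a-f]+|\d+))?")
_ABS_OPERAND = re.compile(r"0x[0-9a-f]+")


def classify_indirect_call(text):
    """Return (kind, register, value) for an indirect call, or None otherwise.

    kind is "reg+offset", "reg", or "absolute"; value is the offset or address.
    """
    match = _CALL.search(text.lower())
    if match is None:
        return None  # direct call or not a call at all
    body = match.group("body").replace(" ", "")

    reg_match = _REG_OPERAND.fullmatch(body)
    if reg_match:
        reg, off = reg_match.groups()
        if off is None:
            return ("reg", reg, 0)
        offset = int(off, 16) if off.startswith("0x") else int(off)
        return ("reg+offset", reg, offset)

    if _ABS_OPERAND.fullmatch(body):
        return ("absolute", None, int(body, 16))
    return None  # more complex addressing (scaled index etc.) is out of scope
```

Filtering for a particular virtual method then amounts to keeping the `"reg+offset"` results whose offset equals the slot's byte offset in the vtable.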

.cursor/rules/tool-catalog.mdc

Lines changed: 12 additions & 6 deletions
@@ -24,15 +24,20 @@ These are fast (<5s) and allowed inline:
 - "Get full context before reasoning about a function" → `python -m retools.context assemble $B $VA --project $P`
 - "Clean up decompiler output with known names" → pipe through `python -m retools.context postprocess`
 - "Read a typed value from the PE file" → `python -m retools.readmem $B $VA $TYPE`
+- "What constant flows into this register?" → `python -m retools.dataflow $B $VA --constants`
+- "Trace where this value comes from" → `python -m retools.dataflow $B $VA --slice TARGET_VA:REG`
 - "Build an ASI patch DLL" → `python -m retools.asi_patcher build spec.json`
 
 ### Delegate to `static-analyzer` subagent
 
 Everything else. Tell the subagent WHAT you need, not HOW to run it — it has the full tool catalog.
 
-- "What does this function do?" → decompile + callgraph + xrefs
+- "What does this function do?" → decompile + callgraph + xrefs + dataflow --constants
 - "Who calls this function?" → xrefs or callgraph --up
-- "What does this function call?" → callgraph --down
+- "What does this function call?" → callgraph --down (add --indirect for vtable calls)
+- "Who calls this virtual method?" → xrefs --indirect + filter by vtable slot offset
+- "What constant reaches this call?" → dataflow --constants or --slice VA:REG
+- "Resolve a switch/jump table" → cfg (auto-resolves MSVC switch patterns)
 - "Find a string and who uses it" → string search with xrefs
 - "Where is this global read/written?" → datarefs
 - "Where is struct field +0x54 used?" → structrefs
@@ -70,9 +75,10 @@ Everything else. Tell the subagent WHAT you need, not HOW to run it — it has t
 | `pyghidra_backend.py decompile $B $VA --project $P` | Decompile via saved Ghidra project | `pyghidra_backend.py decompile game.exe 0x401000 --project patches/MyGame` |
 | `pyghidra_backend.py status $B --project $P` | Check if Ghidra project exists | `pyghidra_backend.py status game.exe --project patches/MyGame` |
 | `funcinfo.py $B $VA` | Find function start/end, rets, calling convention, callees | `funcinfo.py binary.exe 0x401000` |
-| `cfg.py $B $VA` | Control flow graph (basic blocks + edges, text or mermaid) | `cfg.py binary.exe 0x401000 --format mermaid` |
-| `callgraph.py $B $VA` | Caller/callee tree (multi-level, --up/--down N) | `callgraph.py binary.exe 0x401000 --up 3` |
-| `xrefs.py $B $VA` | Find all calls/jumps TO an address | `xrefs.py binary.exe 0x401000 -t call` |
+| `cfg.py $B $VA` | Control flow graph (basic blocks + edges, text or mermaid). Resolves MSVC switch/jump tables automatically. `--switch-details` shows table info | `cfg.py binary.exe 0x401000 --format mermaid` |
+| `callgraph.py $B $VA` | Caller/callee tree (multi-level, --up/--down N). `--indirect` adds vtable/fptr calls to --down trees | `callgraph.py binary.exe 0x401000 --down 2 --indirect` |
+| `xrefs.py $B $VA` | Find all calls/jumps TO an address. `--indirect` also scans for `call [reg+offset]`, `call [reg]`, `call [addr]` | `xrefs.py binary.exe 0x401000 --indirect` |
+| `dataflow.py $B $VA` | Forward constant propagation (`--constants`) or backward register slice (`--slice VA:REG`) within a function | `dataflow.py binary.exe 0x401000 --constants` |
 | `datarefs.py $B $VA` | Find instructions that reference a global address (mem deref + `--imm` for push/mov constants) | `datarefs.py binary.exe 0x7A0000 --imm` |
 | `structrefs.py $B $OFF` | Find all `[reg+offset]` accesses (struct field usage) | `structrefs.py binary.exe 0x54 --base esi` |
 | `structrefs.py $B --aggregate` | Reconstruct C struct from all field accesses in a function | `structrefs.py binary.exe --aggregate --fn 0x401000 --base esi` |
@@ -93,7 +99,7 @@ Everything else. Tell the subagent WHAT you need, not HOW to run it — it has t
 | `sigdb.py scan $B` | Bulk signature scan against DB | `sigdb.py scan game.exe` |
 | `sigdb.py identify $B $VA` | Single function signature lookup (multi-tier) | `sigdb.py identify game.exe 0x401200` |
 | `sigdb.py fingerprint $B` | Identify compiler version (Rich header + markers + imports) | `sigdb.py fingerprint game.exe` |
-| `context.py assemble $B $VA --project $P` | Gather full analysis context for a function | `context.py assemble game.exe 0x401500 --project Warband` |
+| `context.py assemble $B $VA --project $P` | Gather full analysis context for a function. Includes forward constant propagation by default (`--no-dataflow` to skip) | `context.py assemble game.exe 0x401500 --project Warband` |
 | `context.py postprocess $B $VA --project $P` | Mechanically rename/annotate decompiler output (pipe) | `decompiler.py ... \| context.py postprocess ...` |
 | `sigdb.py build $MANIFEST` | Build/extend signature DB from manifest | `sigdb.py build sources.json` |
 | `sigdb.py pull` | Download signature DB from HuggingFace | `sigdb.py pull` or `sigdb.py pull --sources` |
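
Reviewer note on `--slice VA:REG`: a backward register slice answers "which earlier instructions contributed to this register at this point". The sketch below shows the idea on straight-line code using a hypothetical `(dst, srcs)` def-use encoding. It is a simplified illustration of the technique under those assumptions, not the algorithm `retools.dataflow` uses, and it ignores control flow and memory.

```python
def backward_slice(insns, target_index, reg):
    """Indices of instructions before target_index that feed `reg` there.

    insns: list of (dst, srcs) pairs, where dst is the register written (or None)
    and srcs is a tuple of registers read. Hypothetical encoding for this sketch.
    """
    wanted = {reg}   # registers whose definitions we still need to find
    picked = []
    for i in range(target_index - 1, -1, -1):
        dst, srcs = insns[i]
        if dst in wanted:
            picked.append(i)
            # This instruction defines the register, so stop looking for it
            # and start looking for whatever it read instead.
            wanted.discard(dst)
            wanted.update(srcs)
    return sorted(picked)
```

With `mov ecx, 5; mov eax, 1; mov ebx, ecx; mov edx, 7; add eax, ebx; push eax`, slicing on `eax` at the `push` keeps every instruction except `mov edx, 7`, since `edx` never reaches `eax`.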

.github/copilot-instructions.md

Lines changed: 2 additions & 1 deletion
@@ -18,7 +18,8 @@ Run `python verify_install.py` from the repo root before first use. If pyghidra/
 **Never run static analysis tools (`retools`) directly in sequence.** Delegate to the `static-analyzer` agent for all offline analysis. Exceptions — run these inline (all <5s):
 
 - `sigdb.py identify` / `fingerprint` — single-function ID or compiler detection
-- `context.py assemble` / `postprocess` — context gathering and decompiler annotation
+- `context.py assemble` / `postprocess` — context gathering and decompiler annotation (assemble now includes forward constant propagation by default; `--no-dataflow` to skip)
+- `dataflow.py --constants` / `--slice` — single-function constant propagation or backward register trace
 - `readmem.py` — single typed read from a PE file
 - `asi_patcher.py build` — build step, not analysis
 - `pyghidra_backend.py status` — project existence check (<1s)

.github/workflows/ci.yml

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+name: CI
+
+on:
+  push:
+    branches: [master]
+  pull_request:
+    branches: [master]
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ["3.10", "3.12"]
+    steps:
+      - uses: actions/checkout@v4
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Install dependencies
+        run: pip install -r requirements.txt pytest
+      - name: Run tests
+        run: pytest tests/ -v --tb=short
