Commit 2551e7b

refactor: consolidate to 2-file data model
Simplify from 4 JSON files to 2:
- upstream_versions.json: latest available versions (committed)
- local_state.json: machine-specific state (gitignored)

Removed:
- latest_versions.json: merged into upstream_versions.json
- tools_snapshot.json: now gitignored (generated on demand)

Updated all documentation references from latest_versions.json to upstream_versions.json.

Note: docs still reference the __hints__ feature, which was removed during prior refactoring; this needs a separate cleanup pass.
1 parent 9a196a9 commit 2551e7b

16 files changed

Lines changed: 579 additions & 1560 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -34,6 +34,7 @@ htmlcov/
3434

3535
# Machine-specific audit state (Phase 2.1 split files)
3636
local_state.json
37+
tools_snapshot.json
3738

3839
# Node.js
3940
node_modules/

README.md

Lines changed: 8 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -730,24 +730,17 @@ make audit
730730
- Some package managers (like Go) don't have built-in bulk update mechanisms - manual updates are required
731731
- The script gracefully handles missing package managers (skips them)
732732

733-
## Caching
733+
## Data Files
734734

735-
- Manual baseline (committed): `latest_versions.json` in this repo (override with `CLI_AUDIT_MANUAL_FILE`). Used as the primary source in offline mode; also used as a fallback when online lookups fail. Example content:
735+
The audit system uses two JSON files:
736736

737-
```json
738-
{
739-
"rust": "1.89.0",
740-
"jq": "jq-1.8.1",
741-
"parallel": "20240322"
742-
}
743-
```
744-
745-
- Auto-updates: when an online lookup succeeds, the tool writes the discovered latest value back into `latest_versions.json` (toggle with `CLI_AUDIT_WRITE_MANUAL=0`).
746-
- Offline behavior: set `CLI_AUDIT_OFFLINE=1` to use `latest_versions.json` exclusively.
747-
748-
### Lookup hints
737+
| File | Purpose | Git Tracked |
738+
|------|---------|-------------|
739+
| `upstream_versions.json` | Latest available versions from upstream sources | Yes (committed) |
740+
| `local_state.json` | Machine-specific installed tool versions | No (gitignored) |
749741

750-
To speed up future runs, the audit records which upstream retrieval method worked last per tool. These hints are stored inside `latest_versions.json` under the special key `"__hints__"`. They help prioritize the fastest working method on subsequent runs and are safe to edit or remove; they will be rebuilt.
742+
- **Offline mode**: Set `CLI_AUDIT_OFFLINE=1` to use cached `upstream_versions.json` without network calls
743+
- **Baseline refresh**: Run `python audit.py --update-baseline` to update upstream versions
751744

752745
## License
753746
MIT
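
For readers of this diff, the two-file data model described in the new README section could be loaded roughly as follows. This is a minimal sketch, not the project's code: the helper names `load_json`, `is_offline`, and `load_data_model` are hypothetical, and only the file names and the `CLI_AUDIT_OFFLINE` variable come from the README.

```python
import json
import os
from pathlib import Path

# File names taken from the README table; everything else is illustrative.
UPSTREAM_FILE = Path("upstream_versions.json")  # committed to the repository
LOCAL_STATE_FILE = Path("local_state.json")     # gitignored, machine-specific


def load_json(path: Path) -> dict:
    """Return the parsed JSON object, or an empty dict if the file is missing."""
    try:
        with path.open(encoding="utf-8") as fh:
            data = json.load(fh)
    except FileNotFoundError:
        return {}
    # Assumption: both files hold a JSON object at the top level.
    return data if isinstance(data, dict) else {}


def is_offline(env=os.environ) -> bool:
    """Hypothetical helper: treat CLI_AUDIT_OFFLINE=1 as offline mode."""
    return env.get("CLI_AUDIT_OFFLINE", "0") == "1"


def load_data_model(base_dir: Path = Path(".")) -> tuple[dict, dict]:
    """Load (upstream versions, local state) from base_dir."""
    upstream = load_json(base_dir / UPSTREAM_FILE)
    local = load_json(base_dir / LOCAL_STATE_FILE)
    return upstream, local
```

A missing `local_state.json` is treated as empty state here, which fits a file that is never committed and may not exist on a fresh checkout.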

docs/ARCHITECTURE.md

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -206,13 +206,13 @@ sleep_time = base * (2 ** attempt) + random.uniform(0, jitter)
206206

207207
**Cache Hierarchy (fastest → slowest):**
208208

209-
1. **Hints Cache** (`__hints__` in `latest_versions.json`)
209+
1. **Hints Cache** (`__hints__` in `upstream_versions.json`)
210210
- Stores which API method worked last per tool
211211
- Example: `"gh:BurntSushi/ripgrep": "latest_redirect"`
212212
- Purpose: Skip failed methods, try successful ones first
213213
- Lock: `HINTS_LOCK` (acquired after `MANUAL_LOCK`)
214214

215-
2. **Manual Cache** (`latest_versions.json`)
215+
2. **Manual Cache** (`upstream_versions.json`)
216216
- Committed to repository
217217
- Used in offline mode (`CLI_AUDIT_OFFLINE=1`)
218218
- Updated on successful upstream fetches (unless `CLI_AUDIT_WRITE_MANUAL=0`)
@@ -227,7 +227,7 @@ sleep_time = base * (2 ** attempt) + random.uniform(0, jitter)
227227
**Lock Ordering Rule:**
228228
```python
229229
with MANUAL_LOCK:
230-
# Update latest_versions.json
230+
# Update upstream_versions.json
231231
with HINTS_LOCK:
232232
# Update __hints__ section
233233
```
@@ -239,8 +239,8 @@ with MANUAL_LOCK:
239239

240240
```python
241241
# Global locks for cache writes
242-
MANUAL_LOCK = threading.Lock() # For latest_versions.json
243-
HINTS_LOCK = threading.Lock() # For __hints__ in latest_versions.json (nested)
242+
MANUAL_LOCK = threading.Lock() # For upstream_versions.json
243+
HINTS_LOCK = threading.Lock() # For __hints__ in upstream_versions.json (nested)
244244

245245
# ThreadPoolExecutor for parallel tool audits
246246
with ThreadPoolExecutor(max_workers=MAX_WORKERS) as executor:
@@ -369,7 +369,7 @@ Attempt 1: Try upstream API (with retries)
369369
↓ (on failure)
370370
Attempt 2: Check hints cache for alternative method
371371
↓ (on failure)
372-
Attempt 3: Use manual cache (latest_versions.json)
372+
Attempt 3: Use manual cache (upstream_versions.json)
373373
↓ (on failure)
374374
Result: Mark as UNKNOWN, continue audit
375375
```
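
The lock ordering rule in ARCHITECTURE.md (acquire `MANUAL_LOCK` first, then `HINTS_LOCK`) can be illustrated with a small self-contained sketch. The lock names come from the docs; the dictionaries, `record_success`, and `run_workers` are hypothetical stand-ins. The commit message notes that the `__hints__` feature may no longer exist in the code, so treat this purely as an illustration of the documented rule.

```python
import threading

# Lock names as documented; the data structures below are illustrative only.
MANUAL_LOCK = threading.Lock()
HINTS_LOCK = threading.Lock()

manual_versions = {}
hints = {}


def record_success(tool, version, hint_key, method):
    """Update both caches, always taking MANUAL_LOCK before HINTS_LOCK."""
    with MANUAL_LOCK:
        manual_versions[tool] = version
        # Nested acquisition in a fixed order: no thread ever holds
        # HINTS_LOCK while waiting for MANUAL_LOCK, so no deadlock cycle.
        with HINTS_LOCK:
            hints[hint_key] = method


def run_workers(count=8):
    """Run several concurrent updates and wait for all of them to finish."""
    threads = [
        threading.Thread(
            target=record_success,
            args=(f"tool{i}", "1.0.0", f"gh:example/tool{i}", "latest_redirect"),
        )
        for i in range(count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
```

Because every writer takes the locks in the same order, concurrent updates serialize cleanly instead of risking a deadlock.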

docs/ARCHITECTURE_DIAGRAM.md

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -486,18 +486,18 @@ get_available_version(tool, pm, cache_ttl)
486486
│ ├─→ SUCCESS:
487487
│ │ ├─→ Update in-memory cache
488488
│ │ ├─→ Update hints cache
489-
│ │ └─→ Write to latest_versions.json
489+
│ │ └─→ Write to upstream_versions.json
490490
│ │
491491
│ └─→ FAILURE:
492-
│ └─→ Fallback to latest_versions.json
492+
│ └─→ Fallback to upstream_versions.json
493493
│ └─→ Return cached or "UNKNOWN"
494494
495495
└─→ Returns: version string
496496
497497
Multi-Tier Cache Hierarchy:
498498
1. In-memory (fastest)
499499
2. Hints (method preference)
500-
3. Manual cache (latest_versions.json)
500+
3. Manual cache (upstream_versions.json)
501501
4. Snapshot (tools_snapshot.json)
502502
```
503503

@@ -511,7 +511,7 @@ Rule: Always acquire MANUAL_LOCK before HINTS_LOCK
511511
Prevents: Deadlocks
512512
513513
with MANUAL_LOCK:
514-
# Update latest_versions.json
514+
# Update upstream_versions.json
515515
with HINTS_LOCK:
516516
# Update __hints__ section
517517
```
@@ -606,7 +606,7 @@ Example: Version lookup failure
606606
└──────────────────────────────────────┘
607607
608608
├─→ No network requests
609-
├─→ Use committed cache (latest_versions.json)
609+
├─→ Use committed cache (upstream_versions.json)
610610
├─→ Use snapshot (tools_snapshot.json)
611611
└─→ Baseline version checking only
612612
```
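
The success/failure flow in the diagram above (try upstream, write back on success, fall back to the committed cache, otherwise report `UNKNOWN`) might look like this in code. It is a sketch under assumptions: `resolve_version` and its parameters are hypothetical and are not the signature of the project's `get_available_version`; the in-memory and hints tiers are omitted.

```python
from typing import Callable, Optional


def resolve_version(
    tool: str,
    fetch_upstream: Callable[[str], Optional[str]],
    manual_cache: dict,
    offline: bool = False,
) -> str:
    """Return the latest known version for a tool, or "UNKNOWN"."""
    if not offline:
        try:
            version = fetch_upstream(tool)
        except Exception:
            # Any upstream failure is treated as "no answer" so that one
            # broken lookup does not abort the whole audit.
            version = None
        if version:
            # SUCCESS branch: write the discovered version back to the cache.
            manual_cache[tool] = version
            return version
    # FAILURE or offline branch: fall back to the committed cache contents.
    return manual_cache.get(tool, "UNKNOWN")
```

Passing `offline=True` skips the network callable entirely, which mirrors the "No network requests" branch of the offline diagram.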

docs/CLI_REFERENCE.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -142,7 +142,7 @@ make audit # Render only
142142
### Offline Mode
143143

144144
```bash
145-
# Use only manual cache (latest_versions.json)
145+
# Use only manual cache (upstream_versions.json)
146146
CLI_AUDIT_OFFLINE=1 python3 cli_audit.py
147147

148148
# Offline + render from snapshot
@@ -190,7 +190,7 @@ CLI_AUDIT_OFFLINE=1 CLI_AUDIT_RENDER=1 python3 cli_audit.py
190190
| Variable | Type | Default | Description |
191191
|----------|------|---------|-------------|
192192
| `CLI_AUDIT_SNAPSHOT_FILE` | path | `tools_snapshot.json` | Snapshot file path |
193-
| `CLI_AUDIT_MANUAL_FILE` | path | `latest_versions.json` | Manual cache path |
193+
| `CLI_AUDIT_MANUAL_FILE` | path | `upstream_versions.json` | Manual cache path |
194194
| `CLI_AUDIT_WRITE_MANUAL` | bool | `1` | Auto-update manual cache |
195195
| `CLI_AUDIT_MANUAL_FIRST` | bool | `0` | Try manual cache before network |
196196

@@ -708,7 +708,7 @@ CLI_AUDIT_COLLECT=1 python3 cli_audit.py
708708
jq '.' tools_snapshot.json
709709

710710
# Rebuild from scratch
711-
rm tools_snapshot.json latest_versions.json
711+
rm tools_snapshot.json upstream_versions.json
712712
make update
713713
```
714714

docs/DEPLOYMENT.md

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -517,7 +517,7 @@ make audit-offline
517517

518518
**What Changes:**
519519
- No upstream API calls (GitHub, PyPI, crates.io, npm)
520-
- Uses `latest_versions.json` exclusively for version data
520+
- Uses `upstream_versions.json` exclusively for version data
521521
- Marks upstream method as `manual`
522522
- Displays `(offline)` in readiness summary
523523

@@ -532,12 +532,12 @@ make audit-offline
532532
1. **Update manual cache online:**
533533
```bash
534534
make update
535-
# Populates latest_versions.json with current data
535+
# Populates upstream_versions.json with current data
536536
```
537537

538538
2. **Commit manual cache:**
539539
```bash
540-
git add latest_versions.json
540+
git add upstream_versions.json
541541
git commit -m "chore: update manual version cache"
542542
```
543543

@@ -548,7 +548,7 @@ CLI_AUDIT_OFFLINE=1 python3 cli_audit.py --only python
548548

549549
### Offline Cache Management
550550

551-
**Cache File:** `latest_versions.json` (override with `CLI_AUDIT_MANUAL_FILE`)
551+
**Cache File:** `upstream_versions.json` (override with `CLI_AUDIT_MANUAL_FILE`)
552552

553553
**Structure:**
554554
```json
@@ -610,7 +610,7 @@ CLI_AUDIT_PROGRESS=0
610610

611611
# Paths
612612
CLI_AUDIT_SNAPSHOT_FILE=tools_snapshot.json
613-
CLI_AUDIT_MANUAL_FILE=latest_versions.json
613+
CLI_AUDIT_MANUAL_FILE=upstream_versions.json
614614

615615
# Cache
616616
CLI_AUDIT_WRITE_MANUAL=1
@@ -728,7 +728,7 @@ jobs:
728728
uses: actions/cache@v3
729729
with:
730730
path: tools_snapshot.json
731-
key: tools-snapshot-${{ runner.os }}-${{ hashFiles('latest_versions.json') }}
731+
key: tools-snapshot-${{ runner.os }}-${{ hashFiles('upstream_versions.json') }}
732732
restore-keys: |
733733
tools-snapshot-${{ runner.os }}-
734734
@@ -746,7 +746,7 @@ jobs:
746746
if: always()
747747
with:
748748
path: tools_snapshot.json
749-
key: tools-snapshot-${{ runner.os }}-${{ hashFiles('latest_versions.json') }}
749+
key: tools-snapshot-${{ runner.os }}-${{ hashFiles('upstream_versions.json') }}
750750
```
751751
752752
#### PR Comment Workflow

docs/DEVELOPER_GUIDE.md

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -54,7 +54,7 @@ make update && make audit
5454
### 5. Commit and Push
5555

5656
```bash
57-
git add cli_audit.py latest_versions.json
57+
git add cli_audit.py upstream_versions.json
5858
git commit -m "feat(tools): add support for new-tool"
5959
git push -u origin feature/add-new-tool
6060
```
@@ -143,7 +143,7 @@ CLI_AUDIT_JSON=1 python3 cli_audit.py --only your-tool | jq '.'
143143

144144
### Step 6: Update Manual Cache
145145

146-
Add initial version to `latest_versions.json`:
146+
Add initial version to `upstream_versions.json`:
147147

148148
```json
149149
{
@@ -185,7 +185,7 @@ python3 cli_audit.py --only deno | python3 smart_column.py -s "|" -t
185185

186186
**4. Commit:**
187187
```bash
188-
git add cli_audit.py latest_versions.json
188+
git add cli_audit.py upstream_versions.json
189189
git commit -m "feat(tools): add Deno runtime support"
190190
```
191191

@@ -251,7 +251,7 @@ When updating caches, always use locks:
251251
```python
252252
# ALWAYS acquire MANUAL_LOCK before HINTS_LOCK
253253
with MANUAL_LOCK:
254-
# Update latest_versions.json
254+
# Update upstream_versions.json
255255
set_manual_latest(tool_name, version)
256256

257257
with HINTS_LOCK:
@@ -313,7 +313,7 @@ CLI_AUDIT_DEBUG=1 CLI_AUDIT_TRACE=1 python3 cli_audit.py 2>&1 | grep "slow"
313313

314314
```bash
315315
# Ensure manual cache is valid JSON
316-
jq '.' latest_versions.json > /dev/null
316+
jq '.' upstream_versions.json > /dev/null
317317

318318
# Ensure snapshot is valid
319319
jq '.__meta__.schema_version' tools_snapshot.json
@@ -418,10 +418,10 @@ CLI_AUDIT_DEBUG=1 python3 cli_audit.py --only problematic-tool 2>&1 | tee debug.
418418

419419
```bash
420420
# View manual cache
421-
jq '.' latest_versions.json
421+
jq '.' upstream_versions.json
422422

423423
# View hints
424-
jq '.__hints__' latest_versions.json
424+
jq '.__hints__' upstream_versions.json
425425

426426
# View snapshot metadata
427427
jq '.__meta__' tools_snapshot.json
@@ -457,7 +457,7 @@ test(smoke): verify JSON output schema
457457
- [ ] Code passes `pyflakes` lint
458458
- [ ] Manual test passed for affected tools
459459
- [ ] Smoke test passed (`bash scripts/test_smoke.sh`)
460-
- [ ] Updated `latest_versions.json` if adding tools
460+
- [ ] Updated `upstream_versions.json` if adding tools
461461
- [ ] Updated documentation if changing behavior
462462
- [ ] Commit message follows conventions
463463

docs/FUNCTION_REFERENCE.md

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -25,7 +25,7 @@ Functions for reading/writing snapshot files that decouple data collection from
2525
Load snapshot from file.
2626

2727
**Parameters:**
28-
- `paths` (`Sequence[str] | None`) - Custom paths to try (default: `[SNAPSHOT_FILE, "latest_versions.json"]`)
28+
- `paths` (`Sequence[str] | None`) - Custom paths to try (default: `[SNAPSHOT_FILE, "upstream_versions.json"]`)
2929

3030
**Returns:**
3131
- `dict[str, Any]` - Document with `__meta__` and `tools` keys
@@ -543,7 +543,7 @@ Multi-tier caching system for offline operation and performance.
543543

544544
### `load_manual_versions() -> None`
545545

546-
Load manual cache from `latest_versions.json`.
546+
Load manual cache from `upstream_versions.json`.
547547

548548
**Side Effects:** Populates global `MANUAL_VERSIONS` dict
549549

@@ -591,12 +591,12 @@ Update manual cache with new version.
591591
- `tool_name` (`str`) - Tool name
592592
- `tag_or_version` (`str`) - Version string or tag
593593

594-
**Side Effects:** Writes to `latest_versions.json` (requires `MANUAL_LOCK`)
594+
**Side Effects:** Writes to `upstream_versions.json` (requires `MANUAL_LOCK`)
595595

596596
**Usage:**
597597
```python
598598
set_manual_latest("ripgrep", "14.1.1")
599-
# Updates latest_versions.json atomically
599+
# Updates upstream_versions.json atomically
600600
```
601601

602602
**See:** [API_REFERENCE.md#set_manual_latest](API_REFERENCE.md#set_manual_latest)
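
The docs say `set_manual_latest` updates `upstream_versions.json` atomically but do not show how. One common way to get that property is to write to a temporary file and rename it over the target. The sketch below assumes that approach; `write_json_atomic` is a hypothetical helper, not a function from this repository.

```python
import json
import os
import tempfile


def write_json_atomic(path, data):
    """Write data as JSON so readers never observe a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory as the target so the
    # final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh, indent=2, sort_keys=True)
            fh.write("\n")
            fh.flush()
            os.fsync(fh.fileno())
        # os.replace overwrites the destination in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

In a threaded audit, a call like this would still need to run while holding `MANUAL_LOCK`, since the rename protects readers of the file but not concurrent read-modify-write cycles.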
@@ -605,7 +605,7 @@ set_manual_latest("ripgrep", "14.1.1")
605605

606606
### `load_hints() -> None`
607607

608-
Load API method hints from `__hints__` in `latest_versions.json`.
608+
Load API method hints from `__hints__` in `upstream_versions.json`.
609609

610610
**Side Effects:** Populates global `HINTS` dict
611611

@@ -656,7 +656,7 @@ Store API method hint for future runs.
656656
- `key` (`str`) - Hint key
657657
- `value` (`str`) - Method that worked
658658

659-
**Side Effects:** Writes to `__hints__` in `latest_versions.json`
659+
**Side Effects:** Writes to `__hints__` in `upstream_versions.json`
660660

661661
**Lock Ordering:** Requires `MANUAL_LOCK` → `HINTS_LOCK`
662662

@@ -1152,7 +1152,7 @@ See [API_REFERENCE.md#environment-variables](API_REFERENCE.md#environment-variab
11521152

11531153
- **[API_REFERENCE.md](API_REFERENCE.md)** - Complete function signatures
11541154
- **Data Structures** - Tool dataclass, TOOLS registry
1155-
- **File Schemas** - tools_snapshot.json, latest_versions.json
1155+
- **File Schemas** - tools_snapshot.json, upstream_versions.json
11561156

11571157
### Developer Guide
11581158

docs/MIGRATION_GUIDE.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -80,7 +80,7 @@ cli_audit/
8080
### File Locations
8181

8282
**Unchanged:**
83-
- `latest_versions.json` (cache)
83+
- `upstream_versions.json` (cache)
8484
- `tools_snapshot.json` (snapshot)
8585
- `smart_column.py` (formatting)
8686
- `Makefile` and `Makefile.d/`

docs/PRD.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -58,7 +58,7 @@ AI CLI Preparation is a specialized environment audit and installation managemen
5858
- Multi-source API support (GitHub, PyPI, crates.io, npm, GNU FTP)
5959
- HTTP layer with retries, exponential backoff, rate limiting
6060
- Per-origin request caps (GitHub: 5/min, PyPI: 10/min, crates.io: 5/min)
61-
- Offline-first design with committed cache (latest_versions.json)
61+
- Offline-first design with committed cache (upstream_versions.json)
6262

6363
**Output Formats:**
6464
- Table view with status icons (✓ UP-TO-DATE, ↑ OUTDATED, ✗ NOT INSTALLED, ? UNKNOWN)
@@ -79,8 +79,8 @@ AI CLI Preparation is a specialized environment audit and installation managemen
7979
- Independent tool audits (failures isolated)
8080

8181
**Cache Hierarchy:**
82-
- **Hints**: Optimization hints for faster lookups (__hints__ in latest_versions.json)
83-
- **Manual**: User-committed versions (latest_versions.json)
82+
- **Hints**: Optimization hints for faster lookups (__hints__ in upstream_versions.json)
83+
- **Manual**: User-committed versions (upstream_versions.json)
8484
- **Upstream**: Live API queries (fallback)
8585

8686
**Resilience Patterns:**
