
Commit d4c4a7f

Andrey Golovanov and claude committed
Rewrite README: metrics pipeline + autoresearch
Previous README had wrong license (AGPL, now MIT), wrong API examples (compute_bac_metrics doesn't exist), and no mention of autoresearch.

New README covers both capabilities with working examples:
- Metrics pipeline: CLI and Python API with correct function names
- Autoresearch: three-stage pipeline, quick start, multi-cycle, persistence

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent b624307 commit d4c4a7f

1 file changed

Lines changed: 144 additions & 54 deletions

README.md

@@ -1,33 +1,14 @@
11
# NetLab
22

3-
Metrics aggregation and statistical analysis for [NetGraph](https://github.com/networmix/NetGraph) simulation results.
3+
Metrics and autonomous research tools for [NetGraph](https://github.com/networmix/NetGraph) network simulations.
44

5-
## Overview
5+
## What It Does
66

7-
NetLab processes NetGraph workflow outputs (JSON artifacts) to compute statistical metrics across random seeds and failure scenarios. Provides CLI and Python API for batch analysis, cross-seed aggregation, and visualization.
7+
NetLab has two capabilities:
88

9-
## Features
9+
1. **Metrics pipeline** — computes verified reliability metrics (BAC, latency, alpha) from ngraph simulation results. Per-direction, occurrence-count-weighted, hand-verified against 252 assertions.
1010

11-
### Metrics
12-
13-
- **BAC (Bandwidth Availability Curve)**: Delivered bandwidth quantiles, availability at thresholds, AUC normalization
14-
- **Latency**: Stretch distributions (p50/p99), tail degradation ratios, SLO compliance, WES (Weighted Excess Stretch)
15-
- **IterationOps**: Per-iteration SPF calls, flows created, reoptimization calls, placement iterations
16-
- **SPS (Structural Pair Survivability)**: Fraction of src-dst pairs meeting demand under failures
17-
- **MSD (Maximum Supported Demand)**: Alpha-star multiplier for traffic matrix scaling capacity
18-
- **CostPower**: CapEx/Power totals, USD/Watt per Gbit (offered and at p99.9 reliability)
19-
20-
### Cross-Seed Analysis
21-
22-
- Positional alignment of time-series data by iteration index
23-
- Median and IQR (interquartile range) computation
24-
- Variable-length series handling with NaN padding
25-
26-
### Visualization
27-
28-
- Cross-seed plots with median curves and IQR bands
29-
- Baseline-normalized delta comparisons
30-
- Statistical significance heatmaps (p-values)
11+
2. **Autoresearch** — LLM-driven topology exploration. Describe a connectivity idea in natural language, and the system generates a valid ngraph scenario, runs the simulation, computes metrics, and produces a structural interpretation. The LLM also proposes the next experiment, closing the research loop.
3112

3213
## Installation
3314

@@ -43,68 +24,177 @@ cd NetLab
4324
make dev
4425
```
4526

46-
## Usage
27+
## Metrics
4728

4829
### CLI
4930

5031
```bash
51-
# Compute metrics for scenarios
52-
netlab metrics tests/data/scenarios/
32+
# Compute metrics for all scenarios in a directory
33+
netlab metrics path/to/scenarios/
5334

54-
# With summary tables and plots
55-
netlab metrics tests/data/scenarios/ --summary
35+
# Summary tables only, no plots
36+
netlab metrics path/to/scenarios/ --no-plots
5637

5738
# Filter specific scenarios
58-
netlab metrics tests/data/scenarios/ --only small_clos,small_dragonfly
59-
60-
# Skip plot generation
61-
netlab metrics tests/data/scenarios/ --no-plots
39+
netlab metrics path/to/scenarios/ --only small_clos,small_dragonfly
6240
```
6341

6442
### Python API
6543

6644
```python
67-
from metrics.bac import compute_bac_metrics
68-
from metrics.aggregate import summarize_across_seeds
45+
from metrics.bac import compute_bac
46+
from metrics.latency import compute_latency_stretch
47+
from metrics.msd import compute_alpha_star
48+
49+
# Load ngraph results
50+
import json
51+
with open("scenario.results.json") as f:
52+
    results = json.load(f)
53+
54+
# Capacity
55+
alpha = compute_alpha_star(results)
56+
print(f"alpha_star: {alpha.alpha_star}")
57+
58+
# Bandwidth availability (aggregate + per direction)
59+
bac = compute_bac(results, step_name="tm_placement")
60+
print(f"BAC AUC: {bac.auc_normalized:.4f}")
61+
for label, pf in bac.per_flow.items():
62+
    print(f" {label}: AUC={pf.auc_normalized:.4f}")
63+
64+
# Latency stretch
65+
lat = compute_latency_stretch(results)
66+
print(f"baseline p99: {lat.baseline['p99']:.4f}")
67+
print(f"failure p99: {lat.failures['p99']:.4f}")
68+
```
69+
70+
### Metrics Reference
71+
72+
| Metric | What it measures |
73+
|--------|-----------------|
74+
| **BAC** | Delivered bandwidth distribution across failure iterations. AUC, quantiles, availability at thresholds, BW at probability levels. Per-direction breakdown. |
75+
| **Latency** | Volume-weighted stretch (cost / baseline cost). p50, p95, p99 percentiles, SLO compliance, WES (weighted excess stretch). |
76+
| **Alpha (MSD)** | Maximum demand multiplier the topology supports before saturation. |
77+
| **SPS** | Fraction of source-destination demand satisfied under failures. |
78+
| **CostPower** | CapEx and power normalized by offered demand and reliable bandwidth. |
79+
| **IterOps** | Failure iteration counts, unique pattern counts, timing. |
80+
81+
All metrics correctly handle ngraph's Monte Carlo deduplication (`occurrence_count` expansion).
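
To make the `occurrence_count` point concrete, here is a minimal sketch of a count-weighted quantile. It assumes each deduplicated failure pattern carries one value (for example, delivered bandwidth) and an integer `occurrence_count`. The function name `weighted_quantile` and the input layout are hypothetical and are not part of the NetLab API; the numbers in the usage note are made up for illustration.

```python
import math
from typing import Sequence


def weighted_quantile(values: Sequence[float], counts: Sequence[int], q: float) -> float:
    """Lower empirical quantile of `values`, where values[i] occurs counts[i] times.

    Gives the same answer as expanding every value counts[i] times, sorting,
    and reading the element at 1-based rank ceil(q * N), but without
    materializing the expanded list.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be in [0, 1]")
    if not values or len(values) != len(counts):
        raise ValueError("values and counts must be non-empty and the same length")
    total = sum(counts)
    if total <= 0:
        raise ValueError("total occurrence count must be positive")

    # 1-based rank of the target sample in the (virtual) expanded, sorted list.
    rank = max(1, math.ceil(q * total))

    cumulative = 0
    for value, count in sorted(zip(values, counts)):
        cumulative += count
        if cumulative >= rank:
            return value
    return max(values)  # not reached when counts are non-negative
```

With three unique patterns seen 17, 2, and 1 times, `weighted_quantile([100.0, 50.0, 0.0], [17, 2, 1], 0.5)` reads the 10th of 20 expanded samples. Treating the three patterns as equally likely would instead over-weight the rare failures.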
82+
83+
## Autoresearch
84+
85+
LLM-driven topology research with verified metrics. Three stages:
86+
87+
```
88+
Hypothesis (natural language)
89+
90+
[Generation Loop] LLM → ngraph YAML → inspect → validate → iterate
91+
92+
[Simulation] ngraph run (expensive, once)
93+
94+
[Analysis] Metrics pipeline (verified) → LLM interprets → proposes next hypothesis
95+
```
96+
97+
### Quick Start
98+
99+
```python
100+
from pathlib import Path
101+
from netlab.autoresearch.hypothesis_manager import HypothesisManager
102+
from netlab.autoresearch.backend import ClaudeCLIBackend
103+
import sys
104+
105+
manager = HypothesisManager(
106+
    project_dir=Path("/tmp/my_research"),
107+
    backend=ClaudeCLIBackend(model="sonnet"),
108+
    ngraph_bin=str(Path(sys.executable).parent / "ngraph"),
109+
)
110+
111+
cycle = manager.run_cycle("""
112+
2-site topology, 3 backbone planes, 100 Gbps cross-site per plane.
113+
Internal 500 Gbps. BB nodes with role: bb.
114+
Demands: 100 Gbps each direction, ECMP.
115+
Failure: single random BB node, 20 iterations.
116+
""")
117+
118+
print(cycle.analysis.metrics_report) # verified numbers
119+
print(cycle.analysis.interpretation) # LLM explanation
120+
print(cycle.analysis.next_hypothesis) # what to test next
121+
```
69122

70-
# Compute BAC for a workflow step
71-
bac = compute_bac_metrics(iterations, offered_bw, step_name="max_demand")
72-
print(f"p99 availability: {bac['q_pct'][0.99]:.2%}")
123+
### Multi-Cycle Research
73124

74-
# Aggregate across seeds
75-
summary = summarize_across_seeds(series_by_seed, label="latency_p99")
125+
```python
126+
hypothesis = "your initial idea..."
127+
for i in range(5):
128+
    cycle = manager.run_cycle(hypothesis)
129+
    print(f"Cycle {cycle.cycle_id}: {cycle.status}")
130+
    hypothesis = cycle.analysis.next_hypothesis  # LLM proposes next
76131
```
77132

133+
### What Gets Persisted
134+
135+
```
136+
project_dir/
137+
  cycle_log.jsonl              # one-line summary per cycle
138+
  cycles/001/
139+
    hypothesis.yml             # what was tested
140+
    scenario.yml               # generated ngraph YAML
141+
    results/                   # ngraph simulation output
142+
    metrics_report.md          # verified numbers (machine-generated)
143+
    interpretation.md          # structural explanation (LLM-generated)
144+
    next_hypothesis.md         # suggested next experiment (LLM-generated)
145+
    status.yml                 # analyzed | failed | skipped
146+
```
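
Given the layout above, a small helper can report which per-cycle artifacts are present. This is a sketch using only `pathlib`. The function name `missing_artifacts` is hypothetical, and it assumes, without confirmation from the source, that failed or skipped cycles may lack some of the files.

```python
from pathlib import Path

# File names taken from the persisted layout listed above.
EXPECTED_ARTIFACTS = (
    "hypothesis.yml",
    "scenario.yml",
    "metrics_report.md",
    "interpretation.md",
    "next_hypothesis.md",
    "status.yml",
)


def missing_artifacts(project_dir: Path) -> dict[str, list[str]]:
    """Map each cycle directory name to the expected files it does not contain."""
    cycles_root = project_dir / "cycles"
    if not cycles_root.is_dir():
        return {}
    report: dict[str, list[str]] = {}
    for cycle_dir in sorted(p for p in cycles_root.iterdir() if p.is_dir()):
        report[cycle_dir.name] = [
            name for name in EXPECTED_ARTIFACTS if not (cycle_dir / name).is_file()
        ]
    return report
```

For example, `missing_artifacts(Path("/tmp/my_research"))` returns an empty list for every cycle that has all six files.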
147+
148+
### Key Design Decision
149+
150+
The LLM never extracts numbers from results. The metrics pipeline (same code that passed 252 hand-calculated assertions) computes all numbers programmatically. The LLM receives verified metrics and provides only interpretation — connecting numbers to topology structure.
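
The separation described above can be sketched as follows: numbers are rendered into a report by ordinary code, and the LLM prompt embeds that report verbatim. The function names `render_metrics_report` and `build_interpretation_prompt`, the prompt wording, and the flat `dict` of metrics are illustrative assumptions, not the actual contents of `metrics_report.py` or `analysis_loop.py`.

```python
def render_metrics_report(metrics: dict[str, float]) -> str:
    """Render already-computed metrics as a markdown table (no LLM involved)."""
    lines = ["| Metric | Value |", "|--------|-------|"]
    for name in sorted(metrics):
        lines.append(f"| {name} | {metrics[name]:.4f} |")
    return "\n".join(lines)


def build_interpretation_prompt(hypothesis: str, report_md: str) -> str:
    """Build a prompt that passes the verified report through unchanged.

    Raw simulation output is deliberately not included, so the model has
    nothing to extract numbers from except the programmatic report.
    """
    return (
        "You are given verified metrics for a network topology experiment.\n"
        "Do not recompute or alter any number; explain the numbers in terms "
        "of topology structure.\n\n"
        f"## Hypothesis\n{hypothesis.strip()}\n\n"
        f"## Verified metrics\n{report_md}\n"
    )
```

A call such as `build_interpretation_prompt("3 backbone planes", render_metrics_report({"alpha_star": 1.5}))` uses a placeholder value, not a measured result.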
151+
78152
## Repository Structure
79153

80154
```
81-
metrics/ # Core metrics modules
82-
bac.py # Bandwidth availability
83-
latency.py # Latency percentiles
84-
iterops.py # Iteration analysis
85-
aggregate.py # Cross-seed aggregation
86-
plot_*.py # Visualization
87-
netlab/ # CLI
88-
cli.py # Command-line interface
89-
metrics_cmd.py # Metrics command implementation
90-
tests/ # Test suite
91-
lib/ # Config files
155+
metrics/ # Verified metrics pipeline
156+
common.py # Shared: expand_flow_results, canonical_dc
157+
bac.py # Bandwidth availability curve
158+
latency.py # Latency stretch analysis
159+
msd.py # Maximum supported demand
160+
sps.py # Structural pair survivability
161+
iterops.py # Iteration counts and timing
162+
aggregate.py # Cross-seed aggregation
163+
costpower.py # Cost and power normalization
164+
matrixdump.py # Per-pair placement matrices
165+
metrics_report.py # → autoresearch (in netlab/)
166+
netlab/
167+
cli.py # CLI entry point
168+
metrics_cmd.py # Metrics command orchestration
169+
autoresearch/
170+
generation_loop.py # Inner Loop 1: idea → validated YAML
171+
analysis_loop.py # Inner Loop 2: metrics → interpretation
172+
metrics_report.py # Programmatic metrics → markdown
173+
hypothesis_manager.py # Outer loop: hypothesis cycles + persistence
174+
backend.py # LLM backends (Claude CLI, OpenAI, mock)
175+
scenario_generator.py # DC-BB topology generator
176+
sweep.py # Parametric sweep runner
177+
tests/
178+
data/mini_dcbb.yaml # 10-node verification scenario
179+
test_mini_dcbb_verification.py # 252 hand-calculated assertions
92180
```
93181

94182
## Development
95183

96184
```bash
97185
make dev # Setup environment
98-
make check # Tests and linting
186+
make check # Pre-commit + tests + lint
99187
make test # Tests only
100188
make lint # Linting only
189+
make qt # Quick tests (skip slow)
101190
```
102191

103192
## Requirements
104193

105194
- Python 3.11+
106-
- Dependencies: numpy, pandas, matplotlib, seaborn, scipy, rich, ngraph
195+
- [ngraph](https://github.com/networmix/NetGraph) >= 0.21.0
196+
- [netgraph-core](https://github.com/networmix/NetGraph-Core) >= 0.7.0
107197

108198
## License
109199

110-
AGPL-3.0-or-later
200+
[MIT License](LICENSE)
