Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
42 changes: 41 additions & 1 deletion artifacts/features/FEAT-FALCON-rollout.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -2204,6 +2204,46 @@ artifacts:
- type: depends-on
target: FEAT-FALCON-v0.19.8

- id: FEAT-FALCON-v0.20
type: feature
title: "v0.20 — relay-pos thrust calibration + RC#3 estimator-divergence proof"
status: approved
description: >
LANDED. First rung of the roadmap to v1.0
(docs/ROADMAP-AUTONOMOUS-FLIGHT.md). Two honest outcomes:

(1) Thrust calibration (correct + necessary). relay-pos's
PosGains::DEFAULT.hover_thrust=0.5 targets a 500 g quad; the gz
falcon-quad is 2 kg and hovers at 0.72. Since the mixer floor is
0.5, hover_thrust=0.5 sat at the floor → the body could not climb.
The full-cascade bench now constructs PosController with
hover_thrust=0.72. The body climbs cleanly to the -2 m setpoint and
stays physically level (true_tilt < 2°).

(2) Decisive RC#3 proof (the falsification). With the body
physically level (true_tilt < 2°), the Mahony EKF attitude estimate
diverges to est_tilt=51° over ~5 s; the cascade tracks this phantom
attitude and drifts away (~206 m / 30 s). Position-hold is
IMPOSSIBLE on the attitude-only complementary filter — this mandates
the v0.21 Invariant EKF. "Estimation before control", proven not
asserted.

Ships: full-cascade hover_thrust calibration (0.72) + POS_DEBUG
true-vs-est tilt diagnostic; FV-FALCON-POS-002;
bench-evidence/gz-sim/2026-05-30-v0.20-*.md.

Honestly NOT claimed: position-hold (diverges). v0.20 is the thrust
fix + the RC#3 mandate, nothing more.
tags: [falcon, milestone, v0.20, pos, rc3, falsification, landed]
fields:
release-target: "relay-pos thrust calibration + RC#3 estimator-divergence proof"
bench-date: "2026-05-30"
bench-result: "body climbs + level (true_tilt<2°); EKF diverges to 51° → IEKF mandated"
falsification: "position-hold is not a tuning problem; Mahony filter drifts 50° from truth → needs IEKF (v0.21)"
links:
- type: depends-on
target: FEAT-FALCON-v0.19.9

- id: FEAT-FALCON-v1.0
type: feature
title: "v1.0 — six-domain credit dossier + airframe variants"
Expand Down Expand Up @@ -2240,4 +2280,4 @@ artifacts:
- type: implements
target: SYSREQ-FALCON-010
- type: depends-on
target: FEAT-FALCON-v0.19.9
target: FEAT-FALCON-v0.20
52 changes: 52 additions & 0 deletions artifacts/verification/FV-FALCON-POS-002.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,52 @@
artifacts:
- id: FV-FALCON-POS-002
type: sw-verification
title: "POS thrust calibration (gz 2 kg) + decisive RC#3 estimator-divergence evidence (v0.20)"
status: approved
description: >
Partially verifies SWREQ-FALCON-POS-P01 (the position cascade drives
the body toward its setpoint) and establishes the falsification that
gates the v0.21 Invariant-EKF keystone.

Thrust calibration (correct + necessary): relay-pos's
PosGains::DEFAULT.hover_thrust = 0.5 targets a 500 g quad; the gz
falcon-quad is 2 kg and hovers at collective 0.72. Since the v0.19.8
mixer floor is 0.5, hover_thrust=0.5 sat at the floor and the body
could not climb. v0.20 constructs the full-cascade bench's
PosController with hover_thrust=0.72. Effect: the body climbs cleanly
to the -2 m setpoint and stays physically level (true_tilt < 2°,
measured from the accelerometer) for the whole run.

Decisive RC#3 evidence (the reason position-hold still fails): with
POS_DEBUG, the EKF attitude estimate (est_tilt, from the EKF
quaternion) diverges from the true tilt:
t=2.5 est_tilt=0.2° ... t=5.0 4.2° ... t=6.5 44.5° ... t=7.0 51.3°
while true_tilt stays < 2°. The Mahony complementary filter
(relay-ekf) uses the accelerometer as its sole gravity reference and
drifts ~50° from truth during quasi-static hover; the cascade then
tracks this phantom attitude and the body drifts away (final_dist
~206 m over 30 s). Position-hold is impossible on the attitude-only
Mahony filter — this mandates the v0.21 Invariant EKF (full-state,
consistency-guaranteed). bench-evidence/gz-sim/2026-05-30-v0.20-*.md.

Verification:
- gz hover (full cascade): body climbs to -1.8 m, true_tilt < 2°
→ thrust calibration confirmed.
- gz hover: est_tilt diverges to 51° vs true 2° → RC#3 proven.
- cargo test --workspace → green.
- rivet validate → PASS.

Honestly NOT claimed:
- That v0.20 achieves position-hold. It does not (diverges). v0.20
ships only the thrust calibration + the RC#3 proof that drives
the IEKF.
tags: [falcon, pos, thrust-calibration, rc3, estimator, falsification, v0.20]
fields:
runs:
- "gz bench --scenario=hover (body climbs to -1.8 m, true_tilt < 2°; est_tilt diverges to 51° → RC#3) # bench-only"
- "cargo test --workspace (green)"
bench-result: "thrust fix correct (climbs, level); estimator diverges 51° vs true 2° → IEKF mandated (v0.21)"
falsification: "position-hold is NOT a tuning problem; the attitude-only Mahony filter drifts 50° from truth → needs IEKF"
links:
- type: verifies
target: SWREQ-FALCON-POS-P01
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
# v0.20 — relay-pos thrust calibration + decisive RC#3 estimator-divergence proof

Date: 2026-05-30
Backend: Gazebo Harmonic (gz sim 8.11.0), `gz sim -s -r`, `--home=47.3977,8.5456,488`
Scenario: `hover` (full cascade: EKF → POS → ATT → RATE → MIX)

## What v0.20 fixes (correct + necessary)

`relay-pos`'s `PosGains::DEFAULT.hover_thrust = 0.5` is documented for a
"500 g 10-inch quad". The gz falcon-quad is **2 kg** and hovers at
collective **0.72** (the value the alt-rate path uses). Since the v0.19.8
mixer floor is 0.5, a hover-thrust of 0.5 sits *at the floor* → the
position controller could not command enough collective to climb. v0.20
calibrates the full-cascade bench to construct
`PosController::with_gains(PosGains { hover_thrust: 0.72, ..DEFAULT })`.

Effect (trajectory, `POS_DEBUG=1`):

```
t=0.0 pos=[0.0, 0.0, -0.0] true_tilt=0.0° est_tilt=0.0° thrust=0.87
t=1.0 pos=[0.0, 0.0, -0.7] true_tilt=0.0° est_tilt=0.0° thrust=0.84
t=2.0 pos=[0.0,-0.0, -1.8] true_tilt=0.0° est_tilt=0.0° thrust=0.75 ← climbed to altitude
```

The body now climbs cleanly to the −2 m setpoint and is **physically
level** (`true_tilt`, from the accelerometer, stays < 2° for the whole
run). The thrust fix is correct and necessary.

## Why position-hold STILL fails — RC#3, proven

The same run, continued:

```
t=2.5 pos=[0.0,-0.0,-1.9] true_tilt=0.0° est_tilt=0.2°
t=4.5 pos=[0.1, 0.0,-1.8] true_tilt=0.1° est_tilt=1.7°
t=5.0 pos=[0.1, 0.0,-1.9] true_tilt=0.2° est_tilt=4.2°
t=6.0 pos=[0.5, 0.4,-2.2] true_tilt=0.6° est_tilt=19.8°
t=6.5 pos=[0.9, 0.9,-2.2] true_tilt=1.0° est_tilt=44.5°
t=7.0 pos=[1.4, 2.1,-2.1] true_tilt=2.0° est_tilt=51.3°
t=7.5 pos=[0.9, 5.8,-0.7] true_tilt=1.4° est_tilt=41.2° ← drifting away
verdict: final_dist=8.48 m (30 s run diverges to ~206 m)
```

**The body is level (`true_tilt` < 2°) the entire time, but the EKF
believes it is tilted up to 51°.** The Mahony complementary filter
(`relay-ekf`) uses the accelerometer as its sole gravity reference; its
attitude estimate drifts ~50° from truth during quasi-static hover. The
cascade then faithfully tracks this *phantom* attitude — it commands a
"level" correction that actually tilts and translates the real body,
which drifts away.

## Conclusion — the IEKF is mandatory (v0.21)

This is the decisive falsification of "position-hold is a tuning
problem":

- The thrust fix is correct (body climbs, physically stable, `true_tilt`
< 2°). Necessary, not sufficient.
- The wall is the **estimator**: `est_tilt` diverges to 51° while
`true_tilt` stays < 2°. The controllers are tracking a fiction.
- Position-hold is **impossible** on the attitude-only Mahony filter.
This mandates the v0.21 **Invariant EKF** (full-state, consistency-
guaranteed) — exactly the roadmap's "estimation before control"
keystone.

Flag for v0.21: the 50° divergence is *larger* than inherent-Mahony
limits at near-hover would predict — it hints at a relay-ekf gyro-bias /
integration issue as well, not just the structural accel-as-gravity
limitation. The IEKF rebuild and its consistency (NEES) checks will
resolve which, either way.

## NOT claimed

- That v0.20 achieves position-hold. It does not (diverges to ~206 m).
v0.20 ships the thrust calibration + this RC#3 proof, nothing more.
173 changes: 173 additions & 0 deletions docs/ROADMAP-AUTONOMOUS-FLIGHT.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,173 @@
# Roadmap — to perfect, formally-proven autonomous flight (v0.20 → v1.0)

Status: approved direction (2026-05-30). Supersedes the ad-hoc v0.19.x
stepping-stone plan. Authoritative source for the falcon cascade's
technical direction.

## The mission framing

PulseEngine's question is *"AI writes the code. Who proves it's safe?"*
This roadmap answers it for flight: reach **frontier flight capability**
(PX4-class autonomy, then Swift-class agility) while being the only stack
that **formally proves** the result — Kani + Verus + Lean + Rocq +
witness MC/DC + sigil attestation, end to end.

Two destinations, fused in the chosen order **certified floor → agile
ceiling**:
- **Certified autonomy** (v0.20–v0.27): provable PX4-class capability —
full nav, position-hold, missions, fault tolerance, dossier.
- **Verified agility** (v0.28+): a learned/NMPC agile policy flying
*inside a formally-verified runtime-assurance shield* that falls back
to the Lyapunov-certified geometric controller. The literal answer to
"who proves the AI is safe?"

## Honest starting point (v0.19.9)

The cascade is, architecturally, a **~2010-era stack missing its entire
navigation half**:

- **Estimation** — `relay-ekf` is a Mahony *complementary filter*,
**attitude only**. `position_ned`/`velocity_ned` are hardcoded zero
(`crates/relay-ekf/plain/src/lib.rs:95-98`). It trusts the
accelerometer as a gravity reference — true only at rest, which IS
root cause RC#3.
- **Control** — `relay-pos` (cascaded P-PI) and `relay-att`
(quaternion-error P) both use **small-angle linearization** (<~30°).
- **Why the bench "hovers"** — it feeds the controllers Gazebo's
ground-truth position. The aircraft cannot yet estimate where it is.
- **Verified** — proptest everywhere; Kani on `relay-mix-quad` (MIX-P05)
and `relay-arm` (ARM-P01). Lean (RATE-P03 Lyapunov) and Rocq (ATT-P01)
proofs are deferred — this roadmap cashes them.

So "make it fly perfectly" is **not** a tuning problem. It needs the
navigation half built, the controllers moved off small-angle onto
geometric SE(3), and the whole thing proven.

## State of the art we are targeting (2024–2026)

| Layer | Frontier reference | Why it's the target |
|---|---|---|
| Estimation | **Invariant EKF (IEKF)** on SE₂(3) — consistency-guaranteed Lie-group filter. PX4 **EKF2** = 24-state error-state EKF (att+pos+vel+biases+mag+wind, delayed-fusion horizon, multi-hypothesis yaw GSF). | IEKF leapfrogs EKF2 (2015-era): the invariant error is *group-affine* (state-independent error dynamics) → a clean geometric object Lean/Rocq prove well, sharing the SE(3) foundation with the controller. |
| Attitude/pos control | **Geometric SE(3) control** (Lee, Leok, McClamroch 2010) — almost-global, Lyapunov-provable. | Replaces small-angle linearization; the almost-global Lyapunov proof is the Lean track's crown jewel. |
| Inner loop / robustness | **INDI** (Incremental Nonlinear Dynamic Inversion) — sensor-based, model-light, robust; the basis of modern aggressive + fault-tolerant flight. | Rejects model uncertainty; enables rotor-loss survival. |
| Allocation | Pseudo-inverse + **sequential desaturation / active-set**, reconfigurable under rotor loss. | Replaces the static 4×4 matrix; handles saturation + faults + arbitrary airframes. |
| Agility | **Learned policy** (Swift, *Nature* 2023 — champion-level drone racing via deep RL, collective-thrust + body-rate interface) / NMPC. | The ceiling. Wrapped in a verified shield, not trusted bare. |
| Verification frontier | SMT-verified neural-Lyapunov certificates, reachability (ReachNN). | We go further: formal proofs (Lean/Kani/Verus), not SMT-on-NN alone. |

Sources: PX4 EKF2 architecture (DeepWiki / PX4 docs / PR #22262);
Invariant EKF (Barrau & Bonnabel; arXiv:2404.10665 IterIEKF; J. Field
Robotics 2025 Mars-quadrotor IEKF); Geometric SE(3) (arXiv:1003.2005,
Lee/Leok/McClamroch); INDI fault-tolerant (arXiv:2002.07837, Sun et al.);
Swift (Nature s41586-023-06419-4); neural-Lyapunov certification
(arXiv:2503.04129).

## Decisions locked (2026-05-30)

- **Estimator (v0.21): Invariant EKF (IEKF) — leapfrog.** Accept the
research risk for the consistency guarantee + the shared SE(3)
verification foundation. Not the pragmatic error-state EKF.
- **North star: both, sequenced** — certified floor (v0.20–27) then
agile ceiling (v0.28+), culminating in a v1.0 that is both and fully
attested.

## The 10-release climb

Each release pairs a capability jump with the SOTA tech AND its
verification gate (defense-in-depth) AND a falsification statement.

### v0.20 — Full closed-loop position-hold
- **Capability:** the full cascade holds a *position* (not just
attitude+altitude) in gz. Closes the SIMULATOR "hover" scenario.
- **Tech:** fix `relay-pos` hover-thrust normalisation (default 0.5 ≠ the
2 kg body's 0.72); differential-flatness feed-forward.
- **Gate:** gz hover ±0.5 m / 30 s, 4/4 reproducible; rivet POS artifact.
- **Falsify:** if the loop diverges, the position-bound assertion trips.
- **Risk:** low. *(Note: still uses gz-truth position — v0.21 removes
that crutch.)*

### v0.21 — ★ Full-state Invariant EKF (the keystone)
- **Capability:** onboard estimation of position + velocity + attitude +
IMU biases, fusing IMU + GPS + baro. Removes the gz-truth crutch.
**Kills RC#3** (proper predict/correct gravity handling).
- **Tech:** Invariant EKF on SE₂(3) (group-affine error dynamics).
- **Gate:** proptest + a **consistency proof** (the invariant error is a
Lie-group symmetry); NEES/consistency bench vs gz truth; defense-in-
depth Kani on the bounded update.
- **Falsify:** if NEES leaves its χ² envelope, the filter is
inconsistent — fail.
- **Risk:** HIGH (research-shaped). The crux release.

### v0.22 — Magnetometer + multi-hypothesis yaw
- **Capability:** drift-free heading; recover yaw without mag (GPS-only).
- **Tech:** mag fusion + Gaussian-Sum-Filter multi-hypothesis yaw.
- **Gate:** observability oracle; yaw-drift bound over a long bench.
- **Risk:** medium.

### v0.23 — Geometric SE(3) control + Lean Lyapunov proof
- **Capability:** almost-global attitude/position tracking; large-angle
manoeuvres the small-angle stack can't do.
- **Tech:** geometric control on SE(3) (Lee 2010).
- **Gate:** **Lean almost-global Lyapunov stability theorem** via
`rules_lean4` — finally discharges the deferred RATE-P03; Rocq ATT-P01.
- **Falsify:** publish the basin of attraction; if a trajectory inside
the proven basin diverges in sim, the model/proof is wrong.
- **Risk:** medium-high.

### v0.24 — Control allocation
- **Capability:** correct saturation behaviour; any airframe geometry.
- **Tech:** pseudo-inverse + sequential desaturation / active-set.
- **Gate:** Kani bounds (MIX-P05-style) on the allocator output set.
- **Risk:** low-medium.

### v0.25 — INDI inner loop
- **Capability:** robustness to model error / mass change; aggressive
flight.
- **Tech:** Incremental Nonlinear Dynamic Inversion (sensor-based).
- **Gate:** robustness-margin proptest; disturbance-rejection bench.
- **Risk:** medium.

### v0.26 — Fault-tolerant control (rotor loss)
- **Capability:** survive single / double-opposing rotor failure.
- **Tech:** INDI fault-tolerant + reconfigurable allocation.
- **Gate:** reachability/safety oracle; the relay-lc/sc fault chain.
Large dossier value (DO-178C / ISO-26262 fault cases).
- **Risk:** high.

### v0.27 — Trajectory generation + all SIMULATOR scenarios
- **Capability:** missions, 5 m step (<2 s, <20% overshoot), 10 m/s gust
rejection — **all four SIMULATOR.md scenarios green**, with RTL.
- **Tech:** differential-flatness / minimum-snap trajectories.
- **Gate:** the four scenario verdicts in the verification gate.
- **Risk:** medium. *This is the "certified autonomy floor" complete.*

### v0.28 — Learned/NMPC agility inside a verified shield
- **Capability:** Swift-class agile flight.
- **Tech:** RL policy (collective-thrust + body-rate) or NMPC, **wrapped
in a verified simplex / runtime-assurance shield**: if the policy
leaves the proven safe set, the v0.23 Lyapunov-certified geometric
controller takes over.
- **Gate:** formal proof of the shield's safe-set invariance + the
certified fallback. **The moat** — AI agility with a proven floor.
- **Risk:** high.

### v1.0 — Perfectly fly, formally proven
- **Capability:** the full picture — certified autonomy + verified
agility, hex/coax airframe variants.
- **Tech:** all of the above, fused.
- **Gate:** the complete stack — Kani + Verus + Lean + Rocq + witness
MC/DC + sigil attestation — and the six-domain credit dossier.

## Sequencing principles

1. **Estimation before control.** A geometric SE(3) controller on a
filter that can't localise is pointless. v0.21 (IEKF) is the keystone;
everything above is gated on it. v0.20 is a quick win on gz-truth to
keep momentum while v0.21 is built.
2. **Provable spine, learned tip.** The geometric controller (v0.23) is
the verified backbone with an almost-global Lyapunov proof. The
learned/NMPC layer (v0.28) sits *inside a shield that falls back to
that spine*. This inverts industry order (learn first, certify never)
into PulseEngine order (prove the floor, then let AI push the ceiling).
3. **One geometry, two uses.** IEKF and SE(3) control share the same
Lie-group foundation — the estimator's consistency proof and the
controller's stability proof rest on the same verified math.
Loading
Loading