pulseengine · avrabe · May 30, 2026 · May 30, 2026
diff --git a/artifacts/features/FEAT-FALCON-rollout.yaml b/artifacts/features/FEAT-FALCON-rollout.yaml
@@ -2204,6 +2204,46 @@ artifacts:
       - type: depends-on
         target: FEAT-FALCON-v0.19.8
 
+  - id: FEAT-FALCON-v0.20
+    type: feature
+    title: "v0.20 — relay-pos thrust calibration + RC#3 estimator-divergence proof"
+    status: approved
+    description: >
+      LANDED. First rung of the roadmap to v1.0
+      (docs/ROADMAP-AUTONOMOUS-FLIGHT.md). Two honest outcomes:
+
+      (1) Thrust calibration (correct + necessary). relay-pos's
+      PosGains::DEFAULT.hover_thrust=0.5 targets a 500 g quad; the gz
+      falcon-quad is 2 kg and hovers at 0.72. Since the mixer floor is
+      0.5, hover_thrust=0.5 sat at the floor → the body could not climb.
+      The full-cascade bench now constructs PosController with
+      hover_thrust=0.72. The body climbs cleanly to the -2 m setpoint and
+      stays physically level (true_tilt < 2°).
+
+      (2) Decisive RC#3 proof (the falsification). With the body
+      physically level (true_tilt < 2°), the Mahony EKF attitude estimate
+      diverges to est_tilt=51° over ~5 s; the cascade tracks this phantom
+      attitude and drifts away (~206 m / 30 s). Position-hold is
+      IMPOSSIBLE on the attitude-only complementary filter — this mandates
+      the v0.21 Invariant EKF. "Estimation before control", proven not
+      asserted.
+
+      Ships: full-cascade hover_thrust calibration (0.72) + POS_DEBUG
+      true-vs-est tilt diagnostic; FV-FALCON-POS-002;
+      bench-evidence/gz-sim/2026-05-30-v0.20-*.md.
+
+      Honestly NOT claimed: position-hold (diverges). v0.20 is the thrust
+      fix + the RC#3 mandate, nothing more.
+    tags: [falcon, milestone, v0.20, pos, rc3, falsification, landed]
+    fields:
+      release-target: "relay-pos thrust calibration + RC#3 estimator-divergence proof"
+      bench-date: "2026-05-30"
+      bench-result: "body climbs + level (true_tilt<2°); EKF diverges to 51° → IEKF mandated"
+      falsification: "position-hold is not a tuning problem; Mahony filter drifts 50° from truth → needs IEKF (v0.21)"
+    links:
+      - type: depends-on
+        target: FEAT-FALCON-v0.19.9
+
   - id: FEAT-FALCON-v1.0
     type: feature
     title: "v1.0 — six-domain credit dossier + airframe variants"
@@ -2240,4 +2280,4 @@ artifacts:
       - type: implements
         target: SYSREQ-FALCON-010
       - type: depends-on
-        target: FEAT-FALCON-v0.19.9
+        target: FEAT-FALCON-v0.20
diff --git a/artifacts/verification/FV-FALCON-POS-002.yaml b/artifacts/verification/FV-FALCON-POS-002.yaml
@@ -0,0 +1,52 @@
+artifacts:
+  - id: FV-FALCON-POS-002
+    type: sw-verification
+    title: "POS thrust calibration (gz 2 kg) + decisive RC#3 estimator-divergence evidence (v0.20)"
+    status: approved
+    description: >
+      Partially verifies SWREQ-FALCON-POS-P01 (the position cascade drives
+      the body toward its setpoint) and establishes the falsification that
+      gates the v0.21 Invariant-EKF keystone.
+
+      Thrust calibration (correct + necessary): relay-pos's
+      PosGains::DEFAULT.hover_thrust = 0.5 targets a 500 g quad; the gz
+      falcon-quad is 2 kg and hovers at collective 0.72. Since the v0.19.8
+      mixer floor is 0.5, hover_thrust=0.5 sat at the floor and the body
+      could not climb. v0.20 constructs the full-cascade bench's
+      PosController with hover_thrust=0.72. Effect: the body climbs cleanly
+      to the -2 m setpoint and stays physically level (true_tilt < 2°,
+      measured from the accelerometer) for the whole run.
+
+      Decisive RC#3 evidence (the reason position-hold still fails): with
+      POS_DEBUG, the EKF attitude estimate (est_tilt, from the EKF
+      quaternion) diverges from the true tilt:
+        t=2.5 est_tilt=0.2°  ... t=5.0 4.2° ... t=6.5 44.5° ... t=7.0 51.3°
+      while true_tilt stays < 2°. The Mahony complementary filter
+      (relay-ekf) uses the accelerometer as its sole gravity reference and
+      drifts ~50° from truth during quasi-static hover; the cascade then
+      tracks this phantom attitude and the body drifts away (final_dist
+      ~206 m over 30 s). Position-hold is impossible on the attitude-only
+      Mahony filter — this mandates the v0.21 Invariant EKF (full-state,
+      consistency-guaranteed). bench-evidence/gz-sim/2026-05-30-v0.20-*.md.
+
+      Verification:
+        - gz hover (full cascade): body climbs to -1.8 m, true_tilt < 2°
+          → thrust calibration confirmed.
+        - gz hover: est_tilt diverges to 51° vs true 2° → RC#3 proven.
+        - cargo test --workspace → green.
+        - rivet validate → PASS.
+
+      Honestly NOT claimed:
+        - That v0.20 achieves position-hold. It does not (diverges). v0.20
+          ships only the thrust calibration + the RC#3 proof that drives
+          the IEKF.
+    tags: [falcon, pos, thrust-calibration, rc3, estimator, falsification, v0.20]
+    fields:
+      runs:
+        - "gz bench --scenario=hover  (body climbs to -1.8 m, true_tilt < 2°; est_tilt diverges to 51° → RC#3)  # bench-only"
+        - "cargo test --workspace  (green)"
+      bench-result: "thrust fix correct (climbs, level); estimator diverges 51° vs true 2° → IEKF mandated (v0.21)"
+      falsification: "position-hold is NOT a tuning problem; the attitude-only Mahony filter drifts 50° from truth → needs IEKF"
+    links:
+      - type: verifies
+        target: SWREQ-FALCON-POS-P01
diff --git a/...ence/gz-sim/2026-05-30-v0.20-thrust-calibration-and-RC3-estimator-divergence.md b/...ence/gz-sim/2026-05-30-v0.20-thrust-calibration-and-RC3-estimator-divergence.md
@@ -0,0 +1,75 @@
+# v0.20 — relay-pos thrust calibration + decisive RC#3 estimator-divergence proof
+
+Date: 2026-05-30
+Backend: Gazebo Harmonic (gz sim 8.11.0), `gz sim -s -r`, `--home=47.3977,8.5456,488`
+Scenario: `hover` (full cascade: EKF → POS → ATT → RATE → MIX)
+
+## What v0.20 fixes (correct + necessary)
+
+`relay-pos`'s `PosGains::DEFAULT.hover_thrust = 0.5` is documented for a
+"500 g 10-inch quad". The gz falcon-quad is **2 kg** and hovers at
+collective **0.72** (the value the alt-rate path uses). Since the v0.19.8
+mixer floor is 0.5, a hover-thrust of 0.5 sits *at the floor* → the
+position controller could not command enough collective to climb. v0.20
+calibrates the full-cascade bench to construct
+`PosController::with_gains(PosGains { hover_thrust: 0.72, ..DEFAULT })`.
+
+Effect (trajectory, `POS_DEBUG=1`):
+
+```
+ t=0.0  pos=[0.0, 0.0, -0.0]  true_tilt=0.0°  est_tilt=0.0°   thrust=0.87
+ t=1.0  pos=[0.0, 0.0, -0.7]  true_tilt=0.0°  est_tilt=0.0°   thrust=0.84
+ t=2.0  pos=[0.0,-0.0, -1.8]  true_tilt=0.0°  est_tilt=0.0°   thrust=0.75   ← climbed to altitude
+```
+
+The body now climbs cleanly to the −2 m setpoint and is **physically
+level** (`true_tilt`, from the accelerometer, stays < 2° for the whole
+run). The thrust fix is correct and necessary.
+
+## Why position-hold STILL fails — RC#3, proven
+
+The same run, continued:
+
+```
+ t=2.5  pos=[0.0,-0.0,-1.9]  true_tilt=0.0°  est_tilt=0.2°
+ t=4.5  pos=[0.1, 0.0,-1.8]  true_tilt=0.1°  est_tilt=1.7°
+ t=5.0  pos=[0.1, 0.0,-1.9]  true_tilt=0.2°  est_tilt=4.2°
+ t=6.0  pos=[0.5, 0.4,-2.2]  true_tilt=0.6°  est_tilt=19.8°
+ t=6.5  pos=[0.9, 0.9,-2.2]  true_tilt=1.0°  est_tilt=44.5°
+ t=7.0  pos=[1.4, 2.1,-2.1]  true_tilt=2.0°  est_tilt=51.3°
+ t=7.5  pos=[0.9, 5.8,-0.7]  true_tilt=1.4°  est_tilt=41.2°   ← drifting away
+verdict: final_dist=8.48 m (30 s run diverges to ~206 m)
+```
+
+**The body is level (`true_tilt` < 2°) the entire time, but the EKF
+believes it is tilted up to 51°.** The Mahony complementary filter
+(`relay-ekf`) uses the accelerometer as its sole gravity reference; its
+attitude estimate drifts ~50° from truth during quasi-static hover. The
+cascade then faithfully tracks this *phantom* attitude — it commands a
+"level" correction that actually tilts and translates the real body,
+which drifts away.
+
+## Conclusion — the IEKF is mandatory (v0.21)
+
+This is the decisive falsification of "position-hold is a tuning
+problem":
+
+- The thrust fix is correct (body climbs, physically stable, `true_tilt`
+  < 2°). Necessary, not sufficient.
+- The wall is the **estimator**: `est_tilt` diverges to 51° while
+  `true_tilt` stays < 2°. The controllers are tracking a fiction.
+- Position-hold is **impossible** on the attitude-only Mahony filter.
+  This mandates the v0.21 **Invariant EKF** (full-state, consistency-
+  guaranteed) — exactly the roadmap's "estimation before control"
+  keystone.
+
+Flag for v0.21: the 50° divergence is *larger* than inherent-Mahony
+limits at near-hover would predict — it hints at a relay-ekf gyro-bias /
+integration issue as well, not just the structural accel-as-gravity
+limitation. The IEKF rebuild and its consistency (NEES) checks will
+resolve which, either way.
+
+## NOT claimed
+
+- That v0.20 achieves position-hold. It does not (diverges to ~206 m).
+  v0.20 ships the thrust calibration + this RC#3 proof, nothing more.
diff --git a/docs/ROADMAP-AUTONOMOUS-FLIGHT.md b/docs/ROADMAP-AUTONOMOUS-FLIGHT.md
@@ -0,0 +1,173 @@
+# Roadmap — to perfect, formally-proven autonomous flight (v0.20 → v1.0)
+
+Status: approved direction (2026-05-30). Supersedes the ad-hoc v0.19.x
+stepping-stone plan. Authoritative source for the falcon cascade's
+technical direction.
+
+## The mission framing
+
+PulseEngine's question is *"AI writes the code. Who proves it's safe?"*
+This roadmap answers it for flight: reach **frontier flight capability**
+(PX4-class autonomy, then Swift-class agility) while being the only stack
+that **formally proves** the result — Kani + Verus + Lean + Rocq +
+witness MC/DC + sigil attestation, end to end.
+
+Two destinations, fused in the chosen order **certified floor → agile
+ceiling**:
+- **Certified autonomy** (v0.20–v0.27): provable PX4-class capability —
+  full nav, position-hold, missions, fault tolerance, dossier.
+- **Verified agility** (v0.28+): a learned/NMPC agile policy flying
+  *inside a formally-verified runtime-assurance shield* that falls back
+  to the Lyapunov-certified geometric controller. The literal answer to
+  "who proves the AI is safe?"
+
+## Honest starting point (v0.19.9)
+
+The cascade is, architecturally, a **~2010-era stack missing its entire
+navigation half**:
+
+- **Estimation** — `relay-ekf` is a Mahony *complementary filter*,
+  **attitude only**. `position_ned`/`velocity_ned` are hardcoded zero
+  (`crates/relay-ekf/plain/src/lib.rs:95-98`). It trusts the
+  accelerometer as a gravity reference — true only at rest, which IS
+  root cause RC#3.
+- **Control** — `relay-pos` (cascaded P-PI) and `relay-att`
+  (quaternion-error P) both use **small-angle linearization** (<~30°).
+- **Why the bench "hovers"** — it feeds the controllers Gazebo's
+  ground-truth position. The aircraft cannot yet estimate where it is.
+- **Verified** — proptest everywhere; Kani on `relay-mix-quad` (MIX-P05)
+  and `relay-arm` (ARM-P01). Lean (RATE-P03 Lyapunov) and Rocq (ATT-P01)
+  proofs are deferred — this roadmap cashes them.
+
+So "make it fly perfectly" is **not** a tuning problem. It needs the
+navigation half built, the controllers moved off small-angle onto
+geometric SE(3), and the whole thing proven.
+
+## State of the art we are targeting (2024–2026)
+
+| Layer | Frontier reference | Why it's the target |
+|---|---|---|
+| Estimation | **Invariant EKF (IEKF)** on SE₂(3) — consistency-guaranteed Lie-group filter. PX4 **EKF2** = 24-state error-state EKF (att+pos+vel+biases+mag+wind, delayed-fusion horizon, multi-hypothesis yaw GSF). | IEKF leapfrogs EKF2 (2015-era): the invariant error is *group-affine* (state-independent error dynamics) → a clean geometric object Lean/Rocq prove well, sharing the SE(3) foundation with the controller. |
+| Attitude/pos control | **Geometric SE(3) control** (Lee, Leok, McClamroch 2010) — almost-global, Lyapunov-provable. | Replaces small-angle linearization; the almost-global Lyapunov proof is the Lean track's crown jewel. |
+| Inner loop / robustness | **INDI** (Incremental Nonlinear Dynamic Inversion) — sensor-based, model-light, robust; the basis of modern aggressive + fault-tolerant flight. | Rejects model uncertainty; enables rotor-loss survival. |
+| Allocation | Pseudo-inverse + **sequential desaturation / active-set**, reconfigurable under rotor loss. | Replaces the static 4×4 matrix; handles saturation + faults + arbitrary airframes. |
+| Agility | **Learned policy** (Swift, *Nature* 2023 — champion-level drone racing via deep RL, collective-thrust + body-rate interface) / NMPC. | The ceiling. Wrapped in a verified shield, not trusted bare. |
+| Verification frontier | SMT-verified neural-Lyapunov certificates, reachability (ReachNN). | We go further: formal proofs (Lean/Kani/Verus), not SMT-on-NN alone. |
+
+Sources: PX4 EKF2 architecture (DeepWiki / PX4 docs / PR #22262);
+Invariant EKF (Barrau & Bonnabel; arXiv:2404.10665 IterIEKF; J. Field
+Robotics 2025 Mars-quadrotor IEKF); Geometric SE(3) (arXiv:1003.2005,
+Lee/Leok/McClamroch); INDI fault-tolerant (arXiv:2002.07837, Sun et al.);
+Swift (Nature s41586-023-06419-4); neural-Lyapunov certification
+(arXiv:2503.04129).
+
+## Decisions locked (2026-05-30)
+
+- **Estimator (v0.21): Invariant EKF (IEKF) — leapfrog.** Accept the
+  research risk for the consistency guarantee + the shared SE(3)
+  verification foundation. Not the pragmatic error-state EKF.
+- **North star: both, sequenced** — certified floor (v0.20–27) then
+  agile ceiling (v0.28+), culminating in a v1.0 that is both and fully
+  attested.
+
+## The 10-release climb
+
+Each release pairs a capability jump with the SOTA tech AND its
+verification gate (defense-in-depth) AND a falsification statement.
+
+### v0.20 — Full closed-loop position-hold
+- **Capability:** the full cascade holds a *position* (not just
+  attitude+altitude) in gz. Closes the SIMULATOR "hover" scenario.
+- **Tech:** fix `relay-pos` hover-thrust normalisation (default 0.5 ≠ the
+  2 kg body's 0.72); differential-flatness feed-forward.
+- **Gate:** gz hover ±0.5 m / 30 s, 4/4 reproducible; rivet POS artifact.
+- **Falsify:** if the loop diverges, the position-bound assertion trips.
+- **Risk:** low. *(Note: still uses gz-truth position — v0.21 removes
+  that crutch.)*
+
+### v0.21 — ★ Full-state Invariant EKF (the keystone)
+- **Capability:** onboard estimation of position + velocity + attitude +
+  IMU biases, fusing IMU + GPS + baro. Removes the gz-truth crutch.
+  **Kills RC#3** (proper predict/correct gravity handling).
+- **Tech:** Invariant EKF on SE₂(3) (group-affine error dynamics).
+- **Gate:** proptest + a **consistency proof** (the invariant error is a
+  Lie-group symmetry); NEES/consistency bench vs gz truth; defense-in-
+  depth Kani on the bounded update.
+- **Falsify:** if NEES leaves its χ² envelope, the filter is
+  inconsistent — fail.
+- **Risk:** HIGH (research-shaped). The crux release.
+
+### v0.22 — Magnetometer + multi-hypothesis yaw
+- **Capability:** drift-free heading; recover yaw without mag (GPS-only).
+- **Tech:** mag fusion + Gaussian-Sum-Filter multi-hypothesis yaw.
+- **Gate:** observability oracle; yaw-drift bound over a long bench.
+- **Risk:** medium.
+
+### v0.23 — Geometric SE(3) control + Lean Lyapunov proof
+- **Capability:** almost-global attitude/position tracking; large-angle
+  manoeuvres the small-angle stack can't do.
+- **Tech:** geometric control on SE(3) (Lee 2010).
+- **Gate:** **Lean almost-global Lyapunov stability theorem** via
+  `rules_lean4` — finally discharges the deferred RATE-P03; Rocq ATT-P01.
+- **Falsify:** publish the basin of attraction; if a trajectory inside
+  the proven basin diverges in sim, the model/proof is wrong.
+- **Risk:** medium-high.
+
+### v0.24 — Control allocation
+- **Capability:** correct saturation behaviour; any airframe geometry.
+- **Tech:** pseudo-inverse + sequential desaturation / active-set.
+- **Gate:** Kani bounds (MIX-P05-style) on the allocator output set.
+- **Risk:** low-medium.
+
+### v0.25 — INDI inner loop
+- **Capability:** robustness to model error / mass change; aggressive
+  flight.
+- **Tech:** Incremental Nonlinear Dynamic Inversion (sensor-based).
+- **Gate:** robustness-margin proptest; disturbance-rejection bench.
+- **Risk:** medium.
+
+### v0.26 — Fault-tolerant control (rotor loss)
+- **Capability:** survive single / double-opposing rotor failure.
+- **Tech:** INDI fault-tolerant + reconfigurable allocation.
+- **Gate:** reachability/safety oracle; the relay-lc/sc fault chain.
+  Large dossier value (DO-178C / ISO-26262 fault cases).
+- **Risk:** high.
+
+### v0.27 — Trajectory generation + all SIMULATOR scenarios
+- **Capability:** missions, 5 m step (<2 s, <20% overshoot), 10 m/s gust
+  rejection — **all four SIMULATOR.md scenarios green**, with RTL.
+- **Tech:** differential-flatness / minimum-snap trajectories.
+- **Gate:** the four scenario verdicts in the verification gate.
+- **Risk:** medium. *This is the "certified autonomy floor" complete.*
+
+### v0.28 — Learned/NMPC agility inside a verified shield
+- **Capability:** Swift-class agile flight.
+- **Tech:** RL policy (collective-thrust + body-rate) or NMPC, **wrapped
+  in a verified simplex / runtime-assurance shield**: if the policy
+  leaves the proven safe set, the v0.23 Lyapunov-certified geometric
+  controller takes over.
+- **Gate:** formal proof of the shield's safe-set invariance + the
+  certified fallback. **The moat** — AI agility with a proven floor.
+- **Risk:** high.
+
+### v1.0 — Perfectly fly, formally proven
+- **Capability:** the full picture — certified autonomy + verified
+  agility, hex/coax airframe variants.
+- **Tech:** all of the above, fused.
+- **Gate:** the complete stack — Kani + Verus + Lean + Rocq + witness
+  MC/DC + sigil attestation — and the six-domain credit dossier.
+
+## Sequencing principles
+
+1. **Estimation before control.** A geometric SE(3) controller on a
+   filter that can't localise is pointless. v0.21 (IEKF) is the keystone;
+   everything above is gated on it. v0.20 is a quick win on gz-truth to
+   keep momentum while v0.21 is built.
+2. **Provable spine, learned tip.** The geometric controller (v0.23) is
+   the verified backbone with an almost-global Lyapunov proof. The
+   learned/NMPC layer (v0.28) sits *inside a shield that falls back to
+   that spine*. This inverts industry order (learn first, certify never)
+   into PulseEngine order (prove the floor, then let AI push the ceiling).
+3. **One geometry, two uses.** IEKF and SE(3) control share the same
+   Lie-group foundation — the estimator's consistency proof and the
+   controller's stability proof rest on the same verified math.