feat(pomdps): POMDPs.jl integration via package extension + pendulum demo #59
Open
jamgochiana wants to merge 5 commits into
jamgochiana (Collaborator, Author) commented May 27, 2026
Comment on lines +245 to +258
    dmodel = NonlinearDynamicsModel(pendulum_step, W)
    omodel = NonlinearObservationModel(observe_omega, V)
    ekf = ExtendedKalmanFilter(dmodel, omodel)
    updater = pomdps_updater(ekf) # wraps the EKF as a POMDPs.Updater
    pomdp = PendulumPOMDP()
    policy = ILQRPolicy()

    # ---------------------------------------------------------------------
    # Run it via POMDPs.simulate
    # ---------------------------------------------------------------------

    rng = MersenneTwister(0)
    hr = HistoryRecorder(rng = rng, max_steps = 60)
    hist = simulate(hr, pomdp, policy, updater)
Example usage with POMDPs.simulate
Member: Awesome 😎
059fb95 to b3bf81b
a3d554f to 3d7473a
Add a weak dependency on POMDPs.jl and a package extension (GaussianFiltersPOMDPsExt) that wires our AbstractFilter into the POMDPs.jl belief-updater interface:
- POMDPs.update(filter, b, a, o) dispatches to GaussianFilters.update
- POMDPs.initialize_belief(filter, b::GaussianBelief) is identity
Users who do not have POMDPs.jl loaded pay zero overhead: the extension only activates when both packages are present (requires Julia 1.9+).
Bumps julia compat to 1.9 to enable [weakdeps] and [extensions].
Includes:
- examples/pomdps_integration.jl demonstrating the extension end-to-end
- test/test_pomdps.jl regression test that POMDPs.update produces the same belief as GaussianFilters.update
Closes #38
… demo
Extension additions:
- initialize_belief(::AbstractFilter, ::AbstractMvNormal) extracts mean
and cov and returns a GaussianBelief. Lets callers initialize from any
multivariate normal distribution (POMDPs problems often expose initial
state as a Distribution).
End-to-end example (examples/pendulum_ekf_ilqr.jl):
- Closed-loop stabilization of POMDPModels.InvertedPendulum from a small
perturbation (theta0 = 0.3 rad).
- Partial observation: angle only; angular velocity must be inferred.
- An ExtendedKalmanFilter doubles as the POMDPs.Updater via the
extension installed in d03083c.
- A minimal certainty-equivalent iLQR (~80 LoC, LQR backward pass over
the dynamics linearized at the nominal trajectory) plans torques from
belief.μ. iLQR ignores belief.Σ per the standard
certainty-equivalent control split.
- Empirically reaches theta_f ~ 0 from theta0 = 0.3 within 6 seconds.
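The certainty-equivalent split in the list above can be sketched as a plain finite-horizon LQR backward pass whose first gain is applied to the belief mean. This is a simplified illustration, not the example's iLQR: it linearizes once at the upright equilibrium rather than along a nominal trajectory, uses an Euler-discretized pendulum, and all numeric values (`dt`, `g`, `l`, `m`, `Q`, `R`, the horizon) and the names `lqr_gains` and `rollout` are assumptions made for the sketch.

```julia
using LinearAlgebra

# Riccati backward pass for x[k+1] = A x[k] + B u[k] with stage cost
# x'Qx + u'Ru and terminal cost x'Qf x. Returns gains K[k] for u[k] = -K[k] x[k].
function lqr_gains(A, B, Q, R, Qf, N)
    P = Matrix{Float64}(Qf)
    gains = Vector{Matrix{Float64}}(undef, N)
    for k in N:-1:1
        K = (R + B' * P * B) \ (B' * P * A)
        P = Q + A' * P * (A - B * K)
        gains[k] = K
    end
    return gains
end

# Inverted pendulum linearized at upright (theta = 0), Euler step.
# Illustrative parameters, not taken from POMDPModels.
dt, g, l, m = 0.05, 9.81, 1.0, 1.0
A = [1.0 dt; (g / l) * dt 1.0]
B = reshape([0.0, dt / (m * l^2)], 2, 1)
Q = Matrix(1.0I, 2, 2)
R = fill(0.1, 1, 1)

K1 = lqr_gains(A, B, Q, R, Q, 50)[1]

# Certainty equivalence: feed the belief mean to the gain, ignore the covariance.
# Here the mean is propagated through the linear model as a stand-in for a filter.
function rollout(A, B, K, mean0, steps)
    x = copy(mean0)
    for _ in 1:steps
        u = -K * x
        x = A * x + B * u
    end
    return x
end

x_final = rollout(A, B, K1, [0.3, 0.0], 120)   # 120 steps = 6 s at dt = 0.05
println("final mean: ", x_final)
```

A full iLQR would repeat this backward pass around an updated nominal trajectory and add a forward pass; the sketch only shows why the controller needs nothing from the belief except its mean.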
The pendulum dynamics are re-expressed generically rather than calling
POMDPModels.euler directly, because the latter pins state to
Tuple{Float64,Float64} which blocks ForwardDiff dual numbers used both
inside the EKF and inside the iLQR linearization.
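To illustrate why a signature pinned to `Tuple{Float64,Float64}` blocks automatic differentiation, here is a minimal sketch using a hand-rolled dual number in place of ForwardDiff's. The `Dual` type, both step functions, and the constants are hypothetical stand-ins written for this illustration; they are not the code from POMDPModels or from the example script.

```julia
# Tiny forward-mode dual number: a value plus one derivative component.
# Stand-in for ForwardDiff.Dual, defined only for the operations used below.
struct Dual
    val::Float64
    der::Float64
end
Base.:+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
Base.:+(a::Dual, b::Real) = Dual(a.val + b, a.der)
Base.:*(c::Real, a::Dual) = Dual(c * a.val, c * a.der)
Base.sin(a::Dual) = Dual(sin(a.val), cos(a.val) * a.der)

const DT = 0.05          # illustrative time step
const G_OVER_L = 9.81    # illustrative g / l

# Type-pinned signature: only accepts plain Float64 state.
step_pinned(s::Tuple{Float64,Float64}, u::Float64) =
    (s[1] + DT * s[2], s[2] + DT * (G_OVER_L * sin(s[1]) + u))

# Generic signature: same arithmetic, no type restrictions.
step_generic(s, u) =
    (s[1] + DT * s[2], s[2] + DT * (G_OVER_L * sin(s[1]) + u))

# Seed d/dtheta at the upright state and push it through one step.
s_dual = (Dual(0.0, 1.0), Dual(0.0, 0.0))
next = step_generic(s_dual, 0.0)
domega_dtheta = next[2].der   # one Jacobian entry of the step

# The pinned version has no method for dual-number state.
pinned_fails = try
    step_pinned(s_dual, 0.0)
    false
catch err
    err isa MethodError
end
println(domega_dtheta, " ", pinned_fails)
```

The same reasoning applies to both consumers mentioned above: the EKF and the iLQR linearization each need Jacobians of the step function, so the step function has to accept non-`Float64` number types.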
Extension surface (ext/GaussianFiltersPOMDPsExt.jl):
- New GaussianFilterUpdater <: POMDPs.Updater wrapping any AbstractFilter
so it can be passed to simulators that dispatch on the Updater type
(HistoryRecorder, RolloutSimulator, etc).
- pomdps_updater(filter) construction helper, exported as a stub from
the main package and given a real implementation in the extension.
- Both the wrapper and the direct AbstractFilter dispatch are kept; the
direct form is convenient for simple calls, the wrapper is required
for POMDPs.simulate machinery.
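The stub-plus-wrapper arrangement described above can be sketched in a single file with stand-in types. Everything here is hypothetical and simplified: `Updater`, `AbstractFilter`, `ScalarFilter`, `FilterUpdater`, `native_update`, and `pomdp_update` only play the roles of the POMDPs.jl and GaussianFilters names, and in the real package the second half would live in the extension module rather than in the same file.

```julia
# Stand-ins for the real abstract types.
abstract type Updater end          # role of POMDPs.Updater
abstract type AbstractFilter end   # role of GaussianFilters.AbstractFilter

# Toy filter: the belief is one number, nudged toward each observation.
struct ScalarFilter <: AbstractFilter
    gain::Float64
end

# Role of GaussianFilters.update (the filter's native update).
native_update(f::ScalarFilter, b, a, o) = b + f.gain * (o - b)

# Main package side: a method-less stub. Until the extension adds a method,
# calling it raises a MethodError.
function pomdps_updater end

# Role of POMDPs.update (the generic function simulators call).
function pomdp_update end

# --- what the extension would contribute once both packages are loaded ---

# Wrapper so the filter can be passed where an Updater subtype is required.
struct FilterUpdater{F<:AbstractFilter} <: Updater
    filter::F
end

pomdps_updater(f::AbstractFilter) = FilterUpdater(f)

# Wrapper path and direct-dispatch path both forward to the native update.
pomdp_update(up::FilterUpdater, b, a, o) = native_update(up.filter, b, a, o)
pomdp_update(f::AbstractFilter, b, a, o) = native_update(f, b, a, o)

filt = ScalarFilter(0.5)
up = pomdps_updater(filt)
b_wrapped = pomdp_update(up, 0.0, nothing, 1.0)
b_direct = pomdp_update(filt, 0.0, nothing, 1.0)
println(up isa Updater, " ", b_wrapped == b_direct)
```

Keeping both paths costs two one-line methods, and the wrapper is the only one of the two that satisfies code written against the abstract updater type.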
Pendulum example refactored to a proper POMDPs.jl program:
- New PendulumPOMDP <: POMDP{SVector{2,Float64}, SVector{1,Float64},
SVector{1,Float64}} — continuous-state partial-observation POMDP that
also exercises the StaticArrays support added in the previous layer.
- Custom ILQRPolicy <: POMDPs.Policy holding only iLQR hyperparameters
and warm-start state (no belief, no EKF — those live elsewhere).
- POMDPs.action(p::ILQRPolicy, b::GaussianBelief) is a one-liner that
runs ilqr on belief.μ and shifts the warm start.
- iLQR upgraded from a single backward pass to multiple iterations with
backtracking line search and Levenberg-Marquardt regularization;
actuator saturation included. Empirically holds the pendulum within
~0.07 rad of upright at steady state from a 0.6 rad initial tilt.
- POMDPs.simulate (via HistoryRecorder) drives the closed loop instead
of a hand-rolled control loop.
Animation:
- Writes examples/outputs/pendulum_ekf_ilqr.gif (gitignored).
- Disable with GENERATE_GIF=false in the environment.
Tests: add pomdps_updater wrapper coverage to test_pomdps.jl (238/238).
Change the pendulum observation model from angle (θ) to angular velocity (ω), gyroscope-style. This makes the demo a genuine hidden-state inference problem: the angle is never directly observed, and the controller plans on belief.μ[1], which the EKF reconstructs from successive ω measurements. Empirically σ_θ shrinks from ~0.22 to ~0.05 over the first ~20 steps while σ_ω stays small throughout (it is the observed dimension).
The new 3-panel animation makes this visible:
- left: pendulum rod (true vs belief μ)
- top right: θ trajectory with belief μ ± 2σ ribbon
- bottom right: ω trajectory with belief μ ± 2σ ribbon
Layout tweaks: right-aligned ylabels and explicit bottom margin so the x-axis label is not clipped.
…certain)
Change the prior from N([0.6, 0.0], diag(0.05, 0.1)) to N([0.0, 0.0], diag(0.5², 0.1²)). The new prior is mean-upright with σ_θ = 0.5 rad.
Effects:
- The true initial state is drawn from the prior, so the angle varies meaningfully across seeds.
- The initial belief mean is 0 (upright), but the truth typically is not: the belief is wrong by ~0.3–0.7 rad initially. The first few control actions are based on an incorrect angle estimate; the filter then converges on the true angle via ω observations and the controller catches up.
- Empirically σ_θ drops 10× over the run (0.50 → 0.05). The animation ribbon visibly narrows over time, making the filter convergence the central visual story.
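The narrowing of σ_θ under ω-only observations can be reproduced qualitatively with a linear Kalman covariance recursion. This is a rough sketch under stated assumptions, not the example's EKF: the pendulum is linearized at upright and Euler-discretized, the noise covariances `W` and `V` and the parameters `dt`, `g`, `l` are made-up illustrative values, and `sigma_theta_history` is a hypothetical helper. Only the prior covariance diag(0.5², 0.1²) comes from the commit message.

```julia
using LinearAlgebra

dt, g, l = 0.05, 9.81, 1.0            # illustrative values
A = [1.0 dt; (g / l) * dt 1.0]        # linearized pendulum, state = (theta, omega)
H = [0.0 1.0]                         # gyroscope-style: observe omega only
W = Diagonal([1e-4, 1e-4])            # assumed process noise covariance
V = fill(0.05^2, 1, 1)                # assumed measurement noise covariance

# Track the standard deviation of theta through predict/update cycles.
# For a linear model the covariance does not depend on the measured values,
# so no simulated data is needed.
function sigma_theta_history(A, H, W, V, P0, steps)
    P = Matrix(P0)
    sig = [sqrt(P[1, 1])]
    for _ in 1:steps
        Pp = A * P * A' + W           # predict
        S = H * Pp * H' + V           # innovation covariance
        K = Pp * H' / S               # Kalman gain
        P = (I - K * H) * Pp          # measurement update
        push!(sig, sqrt(P[1, 1]))
    end
    return sig
end

P0 = Diagonal([0.5^2, 0.1^2])         # prior covariance from the commit message
sig = sigma_theta_history(A, H, W, V, P0, 60)
println("sigma_theta: first = ", sig[1], ", last = ", sig[end])
```

θ is never measured, but it drives the change in ω through the (g/l)·dt coupling term in `A`, which is what lets successive ω readings pin it down.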
b3bf81b to fbac0ec
3d7473a to d42dff7

Closes #38

Adds a POMDPs.jl package extension (weak dependency, requires Julia 1.9+) that wires GaussianFilters' `AbstractFilter` into the POMDPs belief-updater interface.

Extension surface (`ext/GaussianFiltersPOMDPsExt.jl`):
- `POMDPs.update(filter, b, a, o)` dispatches directly on `AbstractFilter`, which is convenient for simple integrations.
- `POMDPs.initialize_belief` accepts either a `GaussianBelief` (identity) or any `AbstractMvNormal` (extracts mean and cov).
- `GaussianFilterUpdater <: POMDPs.Updater` wrapper for simulators that dispatch on the abstract type (`HistoryRecorder`, `RolloutSimulator`, etc).
- `pomdps_updater(filter)` builder, exported as a stub from the main package and given a real implementation in the extension.

Users who do not have POMDPs.jl loaded pay zero overhead: the extension only activates when both packages are present.

Two example scripts:
- `examples/pomdps_integration.jl`: minimal demonstration of using a `KalmanFilter` through `POMDPs.update`.
- `examples/pendulum_ekf_ilqr.jl`: closed-loop stabilization of a noisy inverted pendulum observed only through its angular velocity (gyroscope-style). Defines a `PendulumPOMDP` (with `SVector` state/action/observation, exercising the StaticArrays support from #58, "refactor: parametric filter and model types + StaticArrays test suite"), an `ILQRPolicy <: POMDPs.Policy` that runs iterative LQR on the belief mean (certainty-equivalent control), wraps the EKF with `pomdps_updater`, and drives the whole closed loop through `POMDPs.simulate` via `HistoryRecorder`.

The iLQR implementation includes backward pass with Levenberg-Marquardt regularization, backtracking line search, and actuator saturation, in about 80 lines. The pendulum dynamics are pulled from `POMDPModels.InvertedPendulum` (parameters) but re-expressed generically so ForwardDiff dual numbers can pass through them (the original `euler` signature pins state to `Tuple{Float64,Float64}`).

The example produces an animation of the closed-loop trajectory with belief mean ± 2σ ribbons on both θ and ω.

Bumps julia compat to 1.9 to enable `[weakdeps]` and `[extensions]`. Adds `POMDPs` to test deps and adds a `test/test_pomdps.jl` regression covering both the direct-dispatch and `GaussianFilterUpdater`-wrapper paths.