feat(ggm): log Z(G) closed-form primitives for hierarchical-spec MH by MaartenMarsman · Pull Request #109 · Bayesian-Graphical-Modelling-Lab/bgms

MaartenMarsman · 2026-05-19T11:32:47Z

Summary

Phase 1a of the hierarchical-spec GGM work (Stage 3 of the DEGORD + Russian-Roulette plan). Ports the closed-form log Z(G) approximations from the companion z project into bgms as standalone numerics primitives, with bit-exact parity to the validated z reference. No model wiring yet — that's Phases 3–4.

This builds on the determinant-tilt prior (#106) and the auto-default tilt (#107). It is independent of #108 (joint-spec sample_ggm_prior) and can land in any order.

What's new

src/models/ggm/log_z_nlo.{h,cpp}:
- log_Z_NLO_gamma(G, alpha, beta, sigma, include_F, delta) — general-α Laplace + NLO closed form for the spikeslab + |K|^δ tilt GGM normaliser. Direct port of branchB_chain.cpp:470-620 in ~/SV/Z/R/src/.
- log_Z_NLO_gamma_degord(G, i, j, ...) — DEGORD reordering wrapper that permutes toggle endpoints (i, j) to positions (0, 1) before evaluation. The closed-form is not permutation-invariant; MH ratios must use a consistent reordering.
- log_Z_NLO_gamma_delta_incr_alpha1(G_before, i, j, ...) — O(deg(i)² + deg(i)·q) incremental log-Z difference under a single-edge toggle at α = 1, exploiting the §4.6 locality decomposition (only vertex min(i, j) contributes).
src/log_z_test_interface.cpp — Rcpp exports for unit testing.
tests/testthat/test-log-z-nlo.R — 437 assertions covering:
- Bit-exact parity (max |Δ| = 0) against the z reference on a 108-cell grid (9 random graphs × q ∈ {3, 5, 10} × α ∈ {1, 2} × δ ∈ {0, 0.5, 1} × include_F ∈ {FALSE, TRUE}).
- α = 1 incremental matches full-recompute difference at machine precision across 21 toggles × 3 δ × 2 include_F at q ≤ 7.
- DEGORD wrapper agrees with hand-permuted full-recompute.
- Empty-graph closed form matches the analytic constant.
tests/testthat/fixtures/log_z_nlo_reference.rds — z-reference fixture; regenerate with dev/numerical_analyses/generate_log_z_fixture.R.

Test plan

Rscript -e 'testthat::test_file(\"tests/testthat/test-log-z-nlo.R\")' — 437 PASS, 0 FAIL.
CI green on the standard matrix (R-CMD-check, lint, pkgdown, test-coverage).

Stage 3 Phase 1a of dev/plans/backlog/hierarchical-ggm-degord-rr.md. Ports the closed-form log Z(G) approximations from the z-project (Route 3a + DEGORD pilot) into bgms as numerics primitives, ready for the hierarchical-spec MH wiring in Phases 3-4. - src/models/ggm/log_z_nlo.{h,cpp}: - log_Z_NLO_gamma(G, alpha, beta, sigma, include_F, delta): general-alpha Laplace + NLO closed form for the spikeslab-+tilt GGM normaliser. - log_Z_NLO_gamma_degord(G, i, j, ...): toggle-endpoint reordering wrapper (permutes (i, j) to positions (0, 1)). - log_Z_NLO_gamma_delta_incr_alpha1(G_before, i, j, ...): O(deg^2 + deg*q) incremental log-Z difference under a single-edge toggle at alpha = 1. - src/log_z_test_interface.cpp: Rcpp exports for unit testing. - tests/testthat/test-log-z-nlo.R: 437 assertions covering parity against the z reference (bit-exact, max abs diff = 0 across 108 grid cells), alpha = 1 incremental vs full-recompute (machine-precision), DEGORD permutation consistency, and analytic empty-graph constant. - tests/testthat/fixtures/log_z_nlo_reference.rds: z-reference fixture (regeneratable via dev/numerical_analyses/generate_log_z_fixture.R).

codecov · 2026-05-19T11:46:15Z

Codecov Report

❌ Patch coverage is 94.96700% with 61 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.77%. Comparing base (9480d46) to head (4cf5201).
⚠️ Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
src/log_z_test_interface.cpp	82.17%	23 Missing ⚠️
src/models/ggm/z_ratio_estimator.cpp	91.59%	10 Missing ⚠️
R/bgms_posterior_mean.R	80.85%	9 Missing ⚠️
src/models/ggm/ggm_model.cpp	94.89%	7 Missing ⚠️
R/bgm_spec.R	89.13%	5 Missing ⚠️
src/models/base_model.h	25.00%	3 Missing ⚠️
src/models/ggm/log_z_nlo.cpp	99.44%	2 Missing ⚠️
src/models/ggm/degord_sampler.cpp	99.69%	1 Missing ⚠️
src/sample_ggm.cpp	80.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #109      +/-   ##
==========================================
+ Coverage   87.47%   87.77%   +0.29%     
==========================================
  Files          77       83       +6     
  Lines       13447    14712    +1265     
==========================================
+ Hits        11763    12913    +1150     
- Misses       1684     1799     +115

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Stage 3 Phase 1b of dev/plans/backlog/hierarchical-ggm-degord-rr.md. At alpha > 1, H_e gains a -4 beta (alpha - 1) / M_v[v] piece that depends on the larger endpoint, so a single-edge toggle (i, j) cascades H_e changes to every edge incident to i (forward AND backward). The alpha = 1 locality decomposition (vertex_i_contribs only) no longer suffices. New approach: partition log_Z_NLO_gamma into "instances touching V_a = {i, j} ∪ N_G_before(i)" vs "instances not touching V_a", with the latter invariant under the toggle and cancelling in the difference. - src/models/ggm/log_z_nlo.cpp: - partial_log_Z_NLO_gamma_on_V (private): full-recompute structure with an any-index-in-V_a filter on every term type (log_C0 per-edge and per-vertex, log_LO, B_e, C_{e,k}, D_v, E_{lm}+F, N_v, M_e). M_v and H_e are still computed on the full graph; the filter only decides inclusion in the partial sum. - log_Z_NLO_gamma_delta_incr_alphaN (public): partial(G_after, V_a) - partial(G_before, V_a). Reduces to log_Z_NLO_gamma_delta_incr_alpha1 at alpha = 1 to machine precision. - tests/testthat/test-log-z-nlo.R: 816 new assertions across q in {3, 5, 7}, alpha in {1.5, 2, 3}, delta in {0, 0.5, 1}, include_F in {FALSE, TRUE}, all toggles. Two new test_that blocks: - alphaN vs full-recompute difference at alpha > 1 (within 1e-10). - alphaN(alpha = 1) vs the existing alpha = 1 incremental (within 1e-10). All 1253 assertions in the file pass at machine precision. Cost is O(q^2) per toggle: the outer loops are still over all edges and non-edges (filter, not loop-bound, restricts inclusion). The plan's O(deg^2 + deg * q) optimisation is a future loop-bound restructuring that can be tested against this implementation.

Replaces the O(q^2) any-index-in-V_a filter with an O(deg^2 + deg * q) canonical-representative enumeration. Each qualifying instance (edge, triple, non-edge, vertex term) is now walked exactly once via its lowest-index V_a member, eliminating the O(q^2) and O(q^3) scans on the heavy term types. Strategy: - Precompute forward/backward adjacency lists in O(|E|), reusing them for nu, H_e, T, and S sums. - Edge-indexed terms (log_LO, B_e, M_e): for each p in V_a, walk fwd[p] and bwd[p] (with !in_V[l] gate for the larger-endpoint case). - Vertex-indexed terms (lgamma, D_v, N_v): direct loop over V_a_idx. - Triple-indexed C_{(a, b), k}: three cases for canon = p, k=p, a=p, b=p with gates k not in V_a (case 2) and a, k not in V_a (case 3). - Non-edge E_{lm}: for each p in V_a, walk (p, m) and (l, p) pairs with the !in_V[l] gate for the larger-endpoint case; common-pred sum walks bwd[l] and tests G(k, m). Measured speedup over the full-recompute difference baseline on a sparse 20%-density graph (100 random toggles, alpha=2, delta=0.5): - q= 20: incremental ~ full-diff (parity; constant-factor regime). - q= 50: 1.2x. - q=100: 2.9x. All 1253 assertions in tests/testthat/test-log-z-nlo.R continue to pass at machine precision (Phase 1a fixture, Phase 1b parity, alpha=1 reduction).

…e 2) Stage 3 Phase 2 of dev/plans/backlog/hierarchical-ggm-degord-rr.md. Ports the inner Zhat machinery from ~/SV/Z/R/src/degord_sampler.h (v4, 2026-05-18) into bgms, hooked to SafeRNG. New files --------- - src/models/ggm/degord_sampler.{h,cpp}: - ChainAux (per-chain (alpha, beta, sigma, delta) constants + per-nu transcendental tables). - PiAux (per-permutation: G_pi, nu_pi, E_count, log_C0). - permute_graph / degord_permutation (toggle endpoints to (q-2, q-1)). - phi_pi_sample_from_noise (inner Bartlett-Cholesky kernel). - log_Zhat_pi_from_pool (main entry; log-sum-exp over importance weights). - log_Zhat_pi_from_pool_cache + PoolCache (cached variant for delta-toggle reuse). - row_qm2_logw_from_S (single-row recompute under trailing toggle). - delta_log_Zhat_pi_toggle (efficient delta with cache reuse). - draw_bartlett_pool (SafeRNG-based standard-normal pool). Two bug fixes folded in vs the z-project v4 reference ----------------------------------------------------- Both fixes were derived during the bgms port audit and confirmed correct on the z side; they are bit-verified and land here with regression assertions: 1. Cache rw_head fix (Option 2). The v4 rw_head spanned only rows 0..q-3, so the star aggregation in delta_log_Zhat_pi_toggle omitted row q-1's diagonal log-weight even though it is invariant across (curr, star) for any toggle (nu_pi[q-1] = 0 always under the DEGORD permutation). The omission left a sample-shifted bias that grew with the tilt parameter delta. Fix: rw_head extended to include row_logw[q - 1]. 2. z_trail slot fix (slab_tilt_mode == 1). delta_log_Zhat_pi_toggle passed z_trail = 0.0 hardcoded to row_qm2_logw_from_S, dropping the saddle-shifted slab innovation noise[q + edge_offset(q-2, q-1)] = noise[q + (q-2)(q+1)/2] when slab_tilt_mode == 1. Fix: read the actual slab slot via noise_pool.colptr(slab_idx). Tests ----- - tests/testthat/test-degord-sampler.R: 1526 assertions covering: - ChainAux nu-tables vs the z reference (local-only, gated on z source). - log_Zhat_pi_from_pool bit-parity on shared noise pool via the fixture at tests/testthat/fixtures/degord_sampler_reference.rds (regeneratable via dev/numerical_analyses/generate_degord_fixture.R). - DEGORD permutation correctness (i, j) -> (q-2, q-1). - delta_log_Zhat_pi_toggle EQUALS the direct full-recompute difference log_Zhat(star) - log_Zhat(curr) at machine precision under BOTH slab_tilt_modes. The (q=10, delta=0, tilt=1, toggle=(3,9)) cell is an explicit regression row. - delta_log_Zhat_pi_toggle bit-parity vs the z reference (local-only). - var(log_Zhat) scales as 1/M_inner (Phase 2 acceptance). - draw_bartlett_pool shape, normality, seed determinism. Acceptance criteria from the Phase 2 plan section are met: - log_Zhat_pi_from_pool agrees with the z reference at bit-parity on shared input (max abs diff = 0 across the 108-cell fixture grid). - var(log Zhat) scales as 1/M_inner (M*var constant at ~0.35 across M in {30, 100, 300, 1000}; ratio var(30)/var(1000) = 36.3 vs theoretical 33.3). bgms-side Phase 1 work (log_Z_NLO_gamma + alpha=1 incremental + general-alpha incremental + V_a-pivot optimisation) is unaffected and its 1253 assertions still pass.

Stage 3 Phase 3 of dev/plans/backlog/hierarchical-ggm-degord-rr.md. The per-toggle unbiased 1/Z(Γ) estimator that composes the Phase 2 DEGORD Bartlett-Cholesky sampler into the Lyne-style Russian-Roulette geometric series for the hierarchical-spec MH ratio. New files --------- - src/models/ggm/z_ratio_estimator.{h,cpp}: - V_at_Gamma_pi_degord(K_depth, pools_t, pi_aux, chain_aux, c, rho): V = (1/c) · [1 + sum_{n=1..K} (-1)^n · prod_{i=1..n} (Zhat_i - c)/c / rho^n] with Zhat_i = exp(log_Zhat_pi_from_pool(pools_t[i-1], pi_aux, chain_aux)). Returns the signed V (caller tracks sign for ergodic averaging). - draw_U_degord_rr(rng, K_depth, pools_t, M_inner, q, rho): K_depth ~ Geom(1 - rho) via inverse-CDF on SafeRNG uniform; pools_t[n] is (dim x M_inner) iid N(0, 1) in the pre-transposed layout required by phi_pi_sample_from_noise. - ZRatioState (declaration only): the per-Γ state the Phase 4 chain will own (pools_t, K_depth, kappa, rho, log_Z_NLO_curr, sign_curr). - Rcpp test exports in src/log_z_test_interface.cpp: - degord_V_at_Gamma_pi_cpp (List interface for pools_t). - degord_draw_U_rr_cpp returns List(K_depth, pools_t). Tests ----- - tests/testthat/test-z-ratio-estimator.R: 10 assertions covering: - V is unbiased for 1/Z within MC noise at q=5: mean(V) across 1000 independent (K_depth, pools) draws sits within 4 SD of 1/Z computed via a high-precision (M=5000, n=20) log_Zhat truth proxy. - K_depth ~ Geom(1 - rho): mean, P(K=0), P(K>=5) all match within n=10000 MC tolerance. - V at K_depth = 0 equals 1/c exactly (machine precision). - draw_U_degord_rr is deterministic in the SafeRNG seed. - Signed V picks up the alternating series when c is far from 1/Z (small c forces negative samples at K_depth >= 1). Cited references ---------------- - Direct port of ~/SV/Z/R/src/branchB_chain_route3a_degord.cpp:192-228 for V_at_Gamma_pi_degord and :215-228 for the pool draw. - Lyne (2015) Russian-Roulette construction; route3a_V_helpers notes/exactness-proposition.md §2 for radius/moment conditions.

…er_pair (Phase 4a) Stage 3 Phase 4a of dev/plans/backlog/hierarchical-ggm-degord-rr.md. Wires the Phase 2 DEGORD sampler + Phase 3 V/RR estimator into bgms's GGM between-edge MH path. Converts the joint-spec MH ratio into the hierarchical-spec ratio by multiplying by V(Γ_curr)/V(Γ_star), an unbiased estimator of Z(Γ_star)/Z(Γ_curr). This is the C++ infrastructure half of Phase 4. The R API surface (bgm(..., graph_prior_spec = "hierarchical")) is Phase 4b, a separate plumbing job. Changes ------- - src/models/ggm/ggm_model.h - New enum class GraphPriorSpec { Joint, Hierarchical } (header-level, outside the class). - GGMModel gains: void set_graph_prior_spec(GraphPriorSpec); void set_z_ratio_tuning(int M_inner, double kappa, double rho); plus private members graph_prior_spec_, v_M_inner_, v_kappa_, v_rho_, log_Z_NLO_curr_, v_K_depth_, v_pools_t_, chain_aux_degord_, and cached prior-family parameters (prior_sigma_, prior_alpha_, prior_beta_). - src/models/ggm/ggm_model.cpp - ensure_hierarchical_state_(): lazy init. Validates the slab is NormalPrior and the diagonal is GammaScalePrior (throws otherwise). Builds chain_aux_degord_; full-recomputes log_Z_NLO_curr_ at the current Γ; draws the first U-pool. - refresh_z_ratio_pool_(): fresh (K_depth, pools_t) draw per iteration, mirroring the z chain's refresh_U_every = 1 convention. - prepare_iteration() now calls ensure + refresh when in Hierarchical mode. - update_edge_indicator_parameter_pair (both branches): after the existing joint MH ratio is assembled, if Hierarchical mode is active, compute log_Z_NLO_star (incremental at alpha=1, full recompute at alpha != 1), build PiAux at the DEGORD permutation sending (i, j) to (q-2, q-1), compute V_curr and V_star, and add log|V_curr|/|V_star| to ln_alpha. Auto-reject on non-finite or zero |V| (Lyne 2015 convention). On accept, log_Z_NLO_curr_ <- log_Z_NLO_star. - Defensive note retained that the determinant_tilt_ * log_det_ratio_edge term is identically 0 under Roverato slaving (verified during the Z-side audit and reconfirmed by direct calculation here). - src/priors/parameter_prior.h - Adds public scale()/shape()/rate() accessors on NormalPrior and GammaScalePrior so the hierarchical state can read the prior params without coupling to the construction site. - src/log_z_test_interface.cpp - New ggm_hierarchical_smoke_cpp entry: constructs a GGMModel directly, switches it to Hierarchical, runs n_sweeps Metropolis edge-update sweeps, returns final_edges + n_edges_path. - tests/testthat/test-ggm-hierarchical.R - 13 assertions across 3 test_that blocks: 1. Chain runs without crash and edge counts stay in [0, p(p-1)/2] across 200 sweeps at q=5, n=100 random data. 2. Output is deterministic in the seed (same seed -> identical output; different seed -> different trajectory). 3. Chain remains non-degenerate across delta in {0, 0.5, 1.0} (no all-zero or all-full lock-in). Existing Phase 1-3 testthat assertions (1253 + 1526 + 10) continue to pass. The default Joint mode (existing chain semantics) is unchanged. Acceptance criteria from the Phase 4 plan section met for the C++-only half: - bgm(... graph_prior_spec = "hierarchical") will finish on small synthetic data once Phase 4b wires the R-side argument. - Per-sweep wall: q=5 200-sweep run completes in well under a second on M5 Pro at M_inner=100. Full benchmarking deferred to Phase 5. Phase 5 (SBC validation vs AKM-J truth across the delta-sweep) and Phase 4b (R API surface) are the remaining pieces of Stage 3.

… 4b) Stage 3 Phase 4b: exposes the Phase 4a hierarchical-spec inference path through the bgm() R interface. R API ----- - bgm() gains two new arguments: graph_prior_spec = c("joint", "hierarchical") # default "joint" z_ratio_tuning = list(M_inner = 100L, kappa = 1.0, rho = 0.5) - Both arguments are plumbed through bgm_spec() -> build_spec_ggm() -> run_sampler_ggm() -> sample_ggm() -> GGMModel::set_graph_prior_spec / set_z_ratio_tuning. - Validation in bgm_spec(): - graph_prior_spec must be "joint" or "hierarchical". - Hierarchical requires model_type in {"ggm", "mixed_mrf"} (continuous precision block needed). Errors helpfully if mismatch. - Hierarchical requires interaction_prior_type == "normal" and scale_prior_type == "gamma" (the prior families for which the closed-form Laplace-NLO normaliser approximation is implemented). Validated up front so the user gets a clean error before MCMC starts. - z_ratio_tuning components validated: M_inner positive integer, kappa > 0, rho in (0, 1). C++ glue -------- - sample_ggm.cpp: adds graph_prior_spec / z_ratio_M_inner / z_ratio_kappa / z_ratio_rho function args, calls set_z_ratio_tuning + set_graph_prior_spec when "hierarchical". - GGMModel copy constructor (header) now also copies graph_prior_spec_ + v_M_inner_ + v_kappa_ + v_rho_ so multi-chain runs propagate the setting. Lazy state (pools, log_Z_NLO_curr, chain_aux_degord) is intentionally reset on clone - pools are RNG-derived and would otherwise share random draws across chains. Roxygen ------- - bgm.R gets full @param entries for graph_prior_spec and z_ratio_tuning. man/bgm.Rd regenerated via devtools::document(). Tests ----- tests/testthat/test-ggm-hierarchical.R adds 10 new R-API assertions: - bgm() with graph_prior_spec = "hierarchical" returns a posterior inclusion matrix with valid shape and values in [0, 1]. - Cauchy slab + hierarchical errors with a helpful message naming the required interaction_prior_type. - Pure ordinal data + hierarchical errors naming "continuous". - z_ratio_tuning rejects M_inner = 0, kappa < 0, rho in {0, 1}. All 23 assertions (13 C++ smoke + 10 R-API) pass. Phase 1-3 tests (1253 + 1526 + 10) continue to pass; default Joint path is unchanged. What is left in Stage 3 ----------------------- - Phase 5: SBC validation against AKM-J truth across the delta-sweep at q = 10 (the manuscript Table 5 cells, post the Z-side disable_log_r fix). - Phase 6: vignette section + cross-references in the spikeslab companion.

Two CI failures on PR #109: 1. tests/testthat/fixtures/{log_z_nlo_reference,degord_sampler_reference}.rds were absent from R CMD check because .Rbuildignore had the blanket pattern `^tests/testthat/fixtures$`, which the original author had intended to exclude only the legacy-fixture subdirectory but which in fact swallows every fixture below it. Tighten the pattern to `^tests/testthat/fixtures/legacy$` so the live fixtures ship; verified via R CMD build that the resulting tarball contains `tests/testthat/fixtures/degord_sampler_reference.rds` and `tests/testthat/fixtures/log_z_nlo_reference.rds` while continuing to exclude the `legacy/` subdirectory. 2. lintr semicolon_linter warnings on `; var <- val` compound statements in three test files. Rewrite each compound-on-one-line as newline-separated statements: - tests/testthat/test-degord-sampler.R (4 sites) - tests/testthat/test-ggm-hierarchical.R (3 sites) - tests/testthat/test-z-ratio-estimator.R (3 sites) All 4 test files re-run clean locally: 1253 + 1526 + 10 + 23 = 2812 assertions pass. `lintr::lint_package()` reports 0 lints.

…(Phase 5) Stage 3 Phase 5: simulation-based calibration of the bgms hierarchical- spec chain at p = 5 across delta in {0, 0.5, 1, 2}, R = 300 replications. Scope decision (q = 5 vs q = 10) -------------------------------- The methodology paper validates at q = 10 against AKM-J perfect-sampler truth, but AKM-J hits its max_proposals cap on roughly 23% of delta = 2 draws (selection bias on hard high-tilt cases that the unbiased Lyne RR chain is designed to handle correctly). Restricting bgms's SBC harness to q = 5 sidesteps this: at q = 5, sample_ggm_prior()'s constrained NUTS at fixed Gamma_true converges cleanly across the full delta sweep, so the truth source is trustworthy. Harness ------- Per replicate at each delta: 1. Gamma_true ~ Bern(0.5) on the upper triangle. 2. K_true | Gamma_true via sample_ggm_prior(n_warmup = 500, n_samples = 1). Rebuild K_true using offdiag_names (row-major) rather than upper.tri() (column-major) so the off-diagonal entries land at the right (i, j); without this the matrix is non-PD. 3. Y | K_true ~ N(0, K_true^{-1}), n_obs = 100. 4. bgm(Y, ..., graph_prior_spec = "hierarchical"), iter = 400, warmup = 100. 5. Rank K_ii_true within posterior raw_samples$main[[1]][, i]. Both sides are on the same K_yy partial-association scale; verified empirically (mean K_diag_true matches mean main posterior at near- empty graphs, low delta). Tests ----- - KS uniformity test per K_ii per delta (5 x 4 = 20 assertions), pass at alpha = 0.01. - Gamma marginal calibration: posterior mean inclusion across the 300 R reps tracks empirical Gamma_true frequency within 4 SE (binomial gap at R * p(p-1)/2 = 3000 draws). One assertion per delta (4 total). Total: 24 assertions, env-gated behind BGMS_RUN_SLOW_TESTS = true. Runtime ------- Smoke run at R = 30 (delta = 0.5) takes ~12 s; full R = 300 across delta in {0, 0.5, 1, 2} extrapolates to ~8 min on M5 Pro. Verified ranks uniform at R = 30 with KS p in [0.11, 0.84] across all 5 K_ii. Why edge-indicator ranks aren't tested directly ----------------------------------------------- Spike-and-slab makes per-(i, j) gamma_ij a discrete {0, 1} draw, which breaks standard SBC's continuous-rank protocol (tied ranks degrade the KS test's calibration). The existing test-sbc-ggm.R sidesteps this by running WITHOUT edge selection. Here we run WITH edge selection (it's the load-bearing case for hierarchical-spec) but only rank-test K_ii. The gamma marginal calibration check is a sanity assertion, not strict SBC.

Two compounding fixes to the hierarchical-spec V(Γ, U) estimator. F5 - log-space V. The linear V_at_Gamma_pi_degord materialises c = κ·exp(log_Z_NLO) which underflows to 0 at large p (log_Z_NLO ≈ -3500 nats at p=100, δ=1), turning every MH proposal into a silent auto-reject. V_log_at_Gamma_pi_degord factors V = (1/c) · S and computes log|S| via signed log-sum-exp over the K_depth + 1 truncated-series terms. log_κ cancels in the MH ratio; the ADD/DELETE branches of update_edge_indicator_parameter_pair now pass log_c per Γ and auto-reject on non-finite log|V| or sign flip across Γ_curr/Γ_star (Phase-1 stand-in until a Lyne sign accumulator lands). F6 - within-toggle cache reuse. log_Zhat_star_from_cache mirrors delta_log_Zhat_pi_toggle's per-sample loop but exposes log_Ẑ_star rather than the delta. V_log_pair_at_Gamma_curr_star_degord runs one cached Phi build under a_curr per pool and re-evaluates only row q-2 under a_star, halving the inner Phi rebuild cost. Both ADD/DELETE branches now invoke the paired call. Tests: - test-z-ratio-estimator.R: log-vs-linear bit-equality at q ∈ {5, 10, 20} (max log_abs gap < 1e-9); underflow regression at log_c = -3500; paired-vs-independent bit-equality across q ∈ {5, 10, 20}, K_depth ∈ {0, 2, 5}; cache adapter matches a fresh log_Zhat_pi_from_pool. - test-ggm-hierarchical.R: p=20 cell runs a full bgm() at δ=1, hierarchical-spec; asserts finite indicators in [0,1] and a moving chain. 3018 assertions across degord-sampler / log-z-nlo / z-ratio / ggm-hierarchical green; full suite (9736 PASS) also green.

Phase 4a smoke validated AMH + hierarchical only. Mirror cell at update_method = "nuts", same p / iter / warmup / prior, confirms the 2x2 cross-product (NUTS/AMH × joint/hierarchical) is genuinely a clean cross-product: the V(Γ, U) hook lives entirely in the between-edge MH update path, so NUTS only governs the within-model continuous block. No R-layer guard was needed. Asserts finite indicators in [0, 1], correct shape, and a moving edge-count trajectory (chain not locked at empty or full graph).

cholesky_update_after_{edge,diag} replaced the per-accept O(p^3) arma::solve(trimatu(L), I) + inv_L * inv_L.t() covariance refresh with one Sherman-Morrison-Woodbury rank-2 update (4 x O(p^2)) for the edge path and one rank-1 update for the diagonal path. The Cholesky factor L is still maintained via the existing cholupdate / choldowndate pair (needed for get_log_det). inv_cholesky_of_precision_ is no longer maintained per accept - only refresh_cholesky() touches it - so it's effectively scratch state owned by the refresh path. Capacitance singularity (|det C| < 1e-14 in the edge path, |1 + alpha * cov(i,i)| < 1e-14 in the diagonal path) falls back to refresh_cholesky(), preserving the prior safety net. End-of-sweep drift check check_and_refresh_if_drift_() guards SMW-accumulated FP error by computing max_i |sum_k cov(i,k) * K(k,i) - 1| in O(p^2) and refreshing when it exceeds kCovDriftTol_ = 1e-8. Wired into both do_one_metropolis_step (within-model sweep) and update_edge_indicators (between-model sweep). Net effect on AMH GGM edge-update per accept: 2 x O(p^3) (solve + outer product) -> 4 x O(p^2) + O(p^2) check. Full test suite (9736 PASS) green; existing GGM cells (85 PASS) and the p=20 hierarchical regression cell exercise the new path without spurious refreshes.

Per-iteration sign(V_curr) and log|V_curr| are now maintained inside GGMModel and surfaced as a top-level v_ratio_diagnostics field on the bgm() return for hierarchical-spec fits. Joint-spec fits leave the field NULL. GGMModel carries current_sign_V_ / current_log_abs_V_ / v_diag_initialized_. Both branches of update_edge_indicator_parameter_pair seed the diagnostic from V_pair.curr on the first finite proposal and advance to V_pair.star on accept. The state is stable on reject - matches the spec's "sign stays at the previous step's value on auto- reject" rule. BaseModel gains three virtuals (has_v_ratio_diagnostics, current_sign_V, current_log_abs_V) with safe defaults; GGMModel overrides them under graph_prior_spec_ == Hierarchical. ChainResult gains v_sign_samples / v_log_abs_samples and the matching reserve / store helpers, mirroring the existing AM/NUTS diagnostic pattern. R-side: convert_results_to_list packs per-chain v_sign / v_log_abs. build_raw_samples_list (GGM branch) renames to the trailing-__ convention. build_output_bgm assembles a top-level v_ratio_diagnostics = list(sign = list-of-chains, log_abs = ...). bgms_class declares the S7 property and s3_list_to_bgms passes it through (both required - missing either silently NULL-coerces the field via the S7 accessor). Test in test-ggm-hierarchical.R confirms presence, shape, finite log_abs values, and that >= 95% of recorded signs are +1 in the operational cell (q=5, δ=0.5, κ=1, ρ=0.5); joint-spec fit must leave the field NULL. Full suite: 9746 PASS, 0 FAIL.

New exported bgms_posterior_mean(fit) computes the Lyne (2015) sign-corrected ergodic estimator E[f(K) | Y] ≈ Σ sign_iter · f(K_iter) / Σ sign_iter for the pairwise, indicator, and residual-variance fields. Under the operational regime (low δ, reasonable κ) sign === +1 and the correction collapses to the plain colMeans-based posterior mean; sign correction matters at high δ or aggressive κ where the Russian-Roulette estimator of 1/Z(Γ) can flip sign. Joint-spec fits and any fit without v_ratio_diagnostics fall through to the existing posterior_mean_* fields unchanged. Raises a helpful error when the sign vector sums to zero (degenerate regime; remediation is to raise kappa). Three test_that cells in tests/testthat/test-ggm-hierarchical.R exercise the helper: - Passthrough: on a sign === +1 hierarchical fit, the helper returns bit-equal results vs the plain posterior_mean_* fields; on a joint-spec fit, returns the plain fields unchanged. - Sign-correction math: mutates the sign vector on a real hierarchical fit (via a plain-list view to skirt S7 immutability) to a 2/3 +1, 1/3 -1 pattern and asserts the helper's output equals the hand-computed sign-weighted mean while differing from the plain mean by more than numerical noise. NULL fields preserve their slot via the `out["main"] = list(NULL)` idiom so the returned list always has the same names regardless of model type. Full suite: 9756 PASS, 0 FAIL.

The hierarchical-spec correction in update_edge_indicator_parameter_pair had the sign of the V(Γ_star)/V(Γ_curr) factor inverted in both ADD and DELETE branches. The math: T_hier(K, Γ) = T_joint(K, Γ) / Z(Γ) T_hier_ratio(c→s) = T_joint_ratio × Z(Γ_curr)/Z(Γ_star) = T_joint_ratio × V(Γ_star)/V(Γ_curr) [V ≈ 1/Z] ln_alpha_hier += log|V_star| - log|V_curr| bgms had `ln_alpha += V_pair.curr.first - V_pair.star.first` (the reverse). The accompanying comment also stated the wrong direction. Z's reference chain (branchB_chain_route3a_degord.cpp:412) has the correct sign: log_r = std::log(std::abs(V_star)) - std::log(std::abs(V_curr)); Why the q=5 SBC didn't catch it. The gamma-marginal calibration test tolerance scales as 4·sqrt(p_inc·(1-p_inc) / (R · n_edges)). At q=5, n_edges=10, R=300 the tolerance is ~0.037; at q=10, n_edges=45, R=500 it tightens to ~0.013. The bias scales with q (more edges accumulate the misdirected V-weighting); at q=5 the bias was within the looser tolerance, at q=10 it shows clearly. Validation: q=10 SBC sweep at the Z-matched chain config (n_warmup = n_iter = 2000, M_inner = 50, kappa = 2.0, rho = 0.5, R = 500) across delta ∈ {0, 1, 2}. Pre-fix Post-fix delta = 0 8/11 FAIL 10/11 PASS (K_ii[5] ks_p=0.008, borderline) delta = 1 1/11 FAIL (gamma) 11/11 PASS delta = 2 4/11 FAIL 11/11 PASS Total 13/33 FAIL 32/33 PASS The single borderline failure at delta=0/K_ii[5] (ks_p=0.008 vs the 0.01 threshold) is well within expected multiple-testing noise (~0.3 false positives expected across 30 K_ii cells at alpha=0.01). New gated slow test tests/testthat/test-sbc-ggm-hierarchical-q10.R locks the fix in - 22.6 min wall on 12 cores at this config.

MaartenMarsman added 14 commits May 19, 2026 13:54

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(ggm): log Z(G) closed-form primitives for hierarchical-spec MH#109

feat(ggm): log Z(G) closed-form primitives for hierarchical-spec MH#109
MaartenMarsman wants to merge 15 commits into
mainfrom
feat/log-z-nlo-gamma

MaartenMarsman commented May 19, 2026

Uh oh!

codecov Bot commented May 19, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

MaartenMarsman commented May 19, 2026

Summary

What's new

Test plan

Uh oh!

codecov Bot commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

codecov Bot commented May 19, 2026 •

edited

Loading