
Locked splice bug fixes during channel reestablishment#4624

Merged
jkczyz merged 2 commits into lightningdevkit:main from wpaulino:splice-locked-reestablish-fixes
May 21, 2026

Conversation

@wpaulino
Contributor

This PR includes fixes for two bugs/edge cases when handling a locked splice during reestablishment. These were found by the chanmon_consistency fuzz target.

@wpaulino wpaulino added this to the 0.3 milestone May 19, 2026
@wpaulino wpaulino requested review from TheBlueMatt and jkczyz May 19, 2026 23:31
@wpaulino wpaulino self-assigned this May 19, 2026
@ldk-reviews-bot

ldk-reviews-bot commented May 19, 2026

👋 Thanks for assigning @jkczyz as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@ldk-claude-review-bot
Collaborator

ldk-claude-review-bot commented May 19, 2026

I've completed a thorough re-review of the entire PR diff, examining all changes across channel.rs, channelmanager.rs, and splicing_tests.rs. I verified:

  1. Lock ordering safety: The new _with_funded_channel methods acquire short_to_chan_info.write(), pending_events.lock(), config.read(), and best_block.read() while peer_state is held. This follows the established lock ordering pattern already used throughout the codebase (e.g., in do_chain_event at lines 16379, 16395).

  2. splice_locked vs inferred_splice_locked separation: splice_locked (what we need to re-send) flows through FundingTxSigned → handle_channel_resumption → SendSpliceLocked message event. inferred_splice_locked (what we infer the peer sent) flows through internal_splice_locked_with_funded_channel for local promotion. These are correctly distinct paths.

  3. funding_locked_txid_sent_in_reestablish filtering: Correctly prevents duplicate splice_locked when the txid was already conveyed via my_current_funding_locked in the reestablish. The field is transient (not serialized), set at reestablish-send time, and consumed via .take() at reestablish-receive time.

  4. Ordering of splice promotion vs holding cell freeing: internal_splice_locked_with_funded_channel runs inside the match arm (before check_free_peer_holding_cells). If the resulting monitor update is async, the channel's pending monitor flag prevents maybe_free_holding_cell_htlcs from releasing cells prematurely.

  5. Error propagation: try_channel_entry! returns from the function on error, so check_free_peer_holding_cells is correctly skipped when the channel is force-closed.

  6. All constructor sites: All 5 ReestablishResponses sites include splice_locked, all 4 ChannelContext constructors include funding_locked_txid_sent_in_reestablish: None.

No issues found.
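
The filtering described in point 3 can be sketched in isolation. This is a minimal illustration, not LDK's code: only the field name `funding_locked_txid_sent_in_reestablish` and the use of `.take()` come from the review above, while `Context`, `Txid`, `on_reestablish_sent`, and `on_reestablish_received` are hypothetical stand-ins.

```rust
// Hypothetical stand-in for a transaction id; LDK uses its own `Txid` type.
type Txid = [u8; 32];

// Hypothetical, heavily simplified stand-in for the channel context.
#[derive(Default)]
struct Context {
    // Transient (never serialized): the txid we already conveyed through
    // `my_current_funding_locked` in the reestablish we sent, if any.
    funding_locked_txid_sent_in_reestablish: Option<Txid>,
}

impl Context {
    // Record what our outgoing reestablish advertised.
    fn on_reestablish_sent(&mut self, my_current_funding_locked: Option<Txid>) {
        self.funding_locked_txid_sent_in_reestablish = my_current_funding_locked;
    }

    // Return the txid for which an explicit `splice_locked` is still owed,
    // given the funding txid we currently consider locked.
    fn on_reestablish_received(&mut self, locked_txid: Option<Txid>) -> Option<Txid> {
        // `.take()` consumes the transient value so it cannot leak into a
        // later reconnection.
        let already_sent = self.funding_locked_txid_sent_in_reestablish.take();
        // Only send explicitly if the reestablish did not already carry it.
        locked_txid.filter(|txid| Some(*txid) != already_sent)
    }
}
```

Under these assumptions, a splice that locks after our reestablish was built (so `on_reestablish_sent(None)` was recorded) yields `Some(txid)` on receipt, whereas a splice already advertised in the reestablish yields `None` and no duplicate message.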

@codecov

codecov Bot commented May 20, 2026

Codecov Report

❌ Patch coverage is 96.98795% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.62%. Comparing base (1060865) to head (5e14a3f).
⚠️ Report is 12 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| lightning/src/ln/channelmanager.rs | 96.52% | 1 Missing and 4 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4624      +/-   ##
==========================================
+ Coverage   86.59%   86.62%   +0.03%     
==========================================
  Files         159      159              
  Lines      110420   110568     +148     
  Branches   110420   110568     +148     
==========================================
+ Hits        95619    95784     +165     
+ Misses      12267    12250      -17     
  Partials     2534     2534              

| Flag | Coverage Δ |
|---|---|
| fuzzing-fake-hashes | 6.61% <0.00%> (+0.04%) ⬆️ |
| fuzzing-real-hashes | 23.26% <28.91%> (+0.10%) ⬆️ |
| tests | 86.23% <96.98%> (+<0.01%) ⬆️ |


Comment thread lightning/src/ln/channelmanager.rs
Comment thread lightning/src/ln/channelmanager.rs Outdated
@wpaulino wpaulino force-pushed the splice-locked-reestablish-fixes branch from 16fb802 to 24bac54 Compare May 20, 2026 18:44
@wpaulino wpaulino requested a review from jkczyz May 20, 2026 18:44
Comment on lines +10524 to +10525
let funding_locked_txid_sent_in_reestablish =
self.context.funding_locked_txid_sent_in_reestablish.take();
Contributor

Is there any situation where we'd get here before calling get_channel_reestablish?

Contributor Author

No, because get_channel_reestablish gets called as soon as the peer is tracked in the manager as connected.

Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment on lines +13356 to +13375
if let Some(channel_ready_msg) = need_lnd_workaround {
self.internal_channel_ready_with_peer_state(counterparty_node_id, &channel_ready_msg, peer_state)?;
}
};

self.handle_holding_cell_free_result(holding_cell_res);
// A reestablish may infer a missed `splice_locked`; apply it before freeing holding
// cells so we don't generate commitment updates against stale splice state.
let post_splice_locked_update = if let Some(splice_locked) = inferred_splice_locked {
self.internal_splice_locked_with_peer_state(counterparty_node_id, &splice_locked, peer_state)?
} else {
None
};

if let Some(channel_ready_msg) = need_lnd_workaround {
self.internal_channel_ready(counterparty_node_id, &channel_ready_msg)?;
}
let holding_cell_res = self.check_free_peer_holding_cells(peer_state);
(post_splice_locked_update, holding_cell_res)
};

if let Some(splice_locked) = inferred_splice_locked {
self.internal_splice_locked(counterparty_node_id, &splice_locked)?;
if let Some(data) = post_splice_locked_update {
self.handle_post_monitor_update_chan_resume(data);
}
self.handle_holding_cell_free_result(holding_cell_res);
Contributor

Should we do most of this inside of the hash_map::Entry::Occupied arm above? Was thinking we'd avoid the duplicate lookups, too.

Contributor Author

Even just doing this change wasn't really worth it, since this isn't a hot path.

Comment thread lightning/src/ln/channelmanager.rs Outdated
Comment on lines +13607 to +13608
mem::drop(peer_state_lock);
mem::drop(per_peer_state);
Contributor

Why don't we need to do the same in internal_channel_reestablish?

Contributor Author

Because there the peer state is declared within a nested scope, so its lock is released when that scope ends.
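
The difference between the two approaches can be sketched with the standard library's `Mutex`. This is an illustration only: `Manager`, `handle_result`, and both methods are hypothetical, and LDK uses its own lock wrappers rather than exactly this setup.

```rust
use std::sync::Mutex;

// Hypothetical manager holding per-peer state behind a lock.
struct Manager {
    peer_state: Mutex<Vec<u32>>,
}

impl Manager {
    // A follow-up step that takes the lock itself. Calling it while the
    // caller still holds the guard would deadlock or panic with std's Mutex.
    fn handle_result(&self, value: u32) {
        self.peer_state.lock().unwrap().push(value);
    }

    // The guard lives in the function's top-level scope, so it must be
    // dropped explicitly before the follow-up step runs.
    fn with_explicit_drop(&self) -> usize {
        let guard = self.peer_state.lock().unwrap();
        let value = guard.len() as u32;
        drop(guard);
        self.handle_result(value);
        self.peer_state.lock().unwrap().len()
    }

    // The guard lives in a nested block and is released automatically at
    // the closing brace, so no explicit drop is needed.
    fn with_nested_scope(&self) -> usize {
        let value = {
            let guard = self.peer_state.lock().unwrap();
            guard.len() as u32
        };
        self.handle_result(value);
        self.peer_state.lock().unwrap().len()
    }
}
```

Both methods behave identically; the nested block simply lets the compiler end the guard's lifetime where the explicit `drop` would otherwise go.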

wpaulino added 2 commits May 20, 2026 16:35
In most cases, we end up sending our `splice_locked` either implicitly
during reestablishment via
`ChannelReestablish::my_current_funding_locked`, or explicitly after
reestablishment. However, we did not consider that it's possible for the
node to be notified of the splice confirmation after connecting to their
peer but prior to reestablishing their channel. In such cases, we need
to explicitly send the `splice_locked` since it wasn't included in
`my_current_funding_locked`, but only after the channel has been
reestablished.

Found by the `chanmon_consistency` fuzz target.

Upon channel reestablishment, we free our holding cells to send any
pending updates to our peer. If we happened to implicitly lock a pending
splice during reestablishment, we want to make sure any updates we send
after the fact are considering the new channel state (post-splice), even
if the update was queued while the splice was still pending. Therefore,
we must always handle the inferred `splice_locked` first.

Found by the `chanmon_consistency` fuzz target.
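
The ordering requirement in the second commit can be sketched with a toy model. Everything here is hypothetical (`Funding`, `Channel`, `resume`, and representing queued updates as plain amounts); it only illustrates why the inferred `splice_locked` is applied before the holding cell is freed.

```rust
// Which funding state an outgoing update is built against (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Funding {
    PreSplice,
    PostSplice,
}

// Hypothetical, heavily simplified channel.
struct Channel {
    funding: Funding,
    // Updates queued while the channel could not send them.
    holding_cell: Vec<u64>,
}

impl Channel {
    // Promote the pending splice, as if the peer's `splice_locked` arrived.
    fn apply_inferred_splice_locked(&mut self) {
        self.funding = Funding::PostSplice;
    }

    // Release queued updates, each built against the current funding state.
    fn free_holding_cell(&mut self) -> Vec<(u64, Funding)> {
        let funding = self.funding;
        self.holding_cell.drain(..).map(|amt| (amt, funding)).collect()
    }
}

// Resume after reestablishment: handle the inferred lock first so that no
// released update targets stale, pre-splice state.
fn resume(chan: &mut Channel, inferred_splice_locked: bool) -> Vec<(u64, Funding)> {
    if inferred_splice_locked {
        chan.apply_inferred_splice_locked();
    }
    chan.free_holding_cell()
}
```

In this model, swapping the two steps inside `resume` would tag every released update with `Funding::PreSplice` even though the splice is locked, which is the kind of stale-state update the commit message describes.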
@wpaulino wpaulino force-pushed the splice-locked-reestablish-fixes branch from 24bac54 to 5e14a3f Compare May 20, 2026 23:36
@jkczyz jkczyz merged commit 9ce02b3 into lightningdevkit:main May 21, 2026
24 checks passed

5 participants