Conversation
|
👋 Thanks for assigning @tnull as a reviewer! |
|
🔔 1st Reminder Hey @tnull! This PR has been waiting for your review. |
|
🔔 2nd Reminder Hey @tnull! This PR has been waiting for your review. |
|
🔔 3rd Reminder Hey @tnull! This PR has been waiting for your review. |
|
🔔 4th Reminder Hey @tnull! This PR has been waiting for your review. |
|
🔔 5th Reminder Hey @tnull! This PR has been waiting for your review. |
|
🔔 6th Reminder Hey @tnull! This PR has been waiting for your review. |
There was a problem hiding this comment.
Hi @randomlogin, thanks for the work on this! I've reviewed the first two commits:
I've left a bunch of inline comments addressing configuration and public API, commit hygiene, testing infrastructure, and test flakiness.
In summary:
- A couple of items are exposed publicly that seem like they should be scoped to probing or gated for tests only (see
scoring_fee_paramsinConfigandscorer_channel_liquidityonNode). - The probing tests duplicate existing test helpers (
setup_node,MockLogFacadeLogger). Reusing and extending what's already intests/common/would reduce duplication and keep the test file focused on the tests themselves. test_probe_budget_blocks_when_node_offlinehas a race condition where the prober dispatches probes before the baseline capacity is measured, causing the assertion between the baseline and stuck capacities to fail. Details in the inline comment.- A few nits about commit hygiene, import structure, and suggestions for renaming stuff.
Also needs to be rebased.
src/probing.rs
Outdated
| pub struct HighDegreeStrategy { | ||
| network_graph: Arc<Graph>, | ||
| /// How many of the highest-degree nodes to cycle through. | ||
| pub top_n: usize, |
There was a problem hiding this comment.
Could top_n be renamed to num_top_nodes? The latter reads less generic to me but up to you to modify or not.
There was a problem hiding this comment.
I'd leave it as is (maybe top_k, as somehow it is more common in algorithms to describe the number of samplings).
What about top_node_count?
Personally I don't like 'num' as a short for 'number'
|
🔔 7th Reminder Hey @tnull! This PR has been waiting for your review. |
|
@enigbe, thanks for a review, the updates are incoming soon. |
436e4a3 to
07dfde4
Compare
Introduce a background probing service that periodically dispatches probes to improve the scorer's liquidity estimates. Includes two built-in strategies.
ff741c2 to
c31f1ce
Compare
tnull
left a comment
There was a problem hiding this comment.
Thanks for taking this on and excuse the delay here!
Did a first review pass and this already looks great! Here are some relatively minor comments, mostly concerning the API design.
| } | ||
|
|
||
| let scoring_fee_params = ProbabilisticScoringFeeParameters::default(); | ||
| let mut scoring_fee_params = ProbabilisticScoringFeeParameters::default(); |
There was a problem hiding this comment.
I do wonder if we should allow the user to set the entire ProbabilisticScoringFeeParameters and ProbabilisticScoringDecayParameters via the ProbingConfigBuilder mentioned above? Do you see any reason where that would conflict with other API design decisions?
There was a problem hiding this comment.
I think we should expose these settings (in NodeBuilder, not probing builder).
However these are for fine tuning and should be used only by advanced users. Also I would expose UserConfig, as for example user cannot decide what features do advertise.
I can add builder methods for scoring parameters, though maybe it should be in another PR aimed on exposing settings for an advanced user?
There was a problem hiding this comment.
Also I would expose UserConfig, as for example user cannot decide what features do advertise.
Yes, this is very intentional, as we want to provide a sane/safe API. For example letting user freely choose to set certain parameters if we don't implement them properly will just lead to a lot of footguns, in some circumstances even with the potential for money loss.
There was a problem hiding this comment.
In that case I think we don't need to expose ProbabilisticScoringFeeParameters and ProbabilisticScoringFeeParameters by the very reason you mentioned.
|
🔔 4th Reminder Hey @enigbe! This PR has been waiting for your review. |
Change cursor of top nodes from HighDegreeStrategy to use cac: Create src/util.rs Add probe HTLC maximal lower bound Fix styling (config argument order), explicit Arc::clone instead of .clone() Change tests open_channel to reuse existing code
tnull
left a comment
There was a problem hiding this comment.
Seems tests are failing right now:
thread 'exhausted_probe_budget_blocks_new_probes' (167312) panicked at tests/probing_tests.rs:381:5:
no probe dispatched within 15 s
failures:
exhausted_probe_budget_blocks_new_probes
probe_budget_increments_and_decrements
f99786b to
1e73e6e
Compare
6841a0c to
c470da6
Compare
|
2 of macOs builds has hanged, but not on probing tests; they have passed in another macOs build. |
| use crate::message_handler::NodeCustomMessageHandler; | ||
| use crate::payment::asynchronous::om_mailbox::OnionMessageMailbox; | ||
| use crate::peer_store::PeerStore; | ||
| use crate::probing; |
There was a problem hiding this comment.
nit: Please import the objects you're using directly, also avoiding the probing:: prefix in docs.
| logger: Arc::clone(&logger), | ||
| strategy, | ||
| interval: probing_cfg.interval, | ||
| liquidity_limit_multiplier: Some(config.probing_liquidity_limit_multiplier), |
There was a problem hiding this comment.
I do wonder if we should move the probing_liquidity_limit_multiplier config field to ProbingConfig now?
| keys_manager: Arc<KeysManager>, static_invoice_store: Option<StaticInvoiceStore>, | ||
| onion_messenger: Arc<OnionMessenger>, om_mailbox: Option<Arc<OnionMessageMailbox>>, | ||
| runtime: Arc<Runtime>, logger: L, config: Arc<Config>, | ||
| runtime: Arc<Runtime>, logger: L, config: Arc<Config>, prober: Option<Arc<Prober>>, |
There was a problem hiding this comment.
Please move the prober after om_mailbox here, as runtime, logger, and config are usually the trailing fields in most constructors, etc.
| #[tokio::test(flavor = "multi_thread")] | ||
| async fn probe_budget_increments_and_decrements() { | ||
| let (bitcoind, electrsd) = setup_bitcoind_and_electrsd(); | ||
| let chain_source = TestChainSource::Electrum(&electrsd); |
There was a problem hiding this comment.
Please use random_chain_source() for all tests that aren't chainsource specific.
| mod message_handler; | ||
| pub mod payment; | ||
| mod peer_store; | ||
| mod probing; |
There was a problem hiding this comment.
Let's make this a pub mod and expose Prober and related objects under crate::probing, not at the root level. Note that for uniffi to still build you might need to add corresponding re-exports to the list in the beginning of ffi/types.rs, as uniffi indeed expects all objects to live under the crate root.
tnull
left a comment
There was a problem hiding this comment.
Also asked Claude to do a review, some of this might be worth addressing:
1. Bug: locked_msat accounting mismatch for Probe::Destination (most significant)
In run_prober (src/probing.rs:593-609), for a Destination probe, amount_msat is added to locked_msat once. But send_spontaneous_preflight_probes
can dispatch multiple probe paths. Each resolved path triggers a separate ProbeSuccessful/ProbeFailed event, and
handle_probe_successful/handle_probe_failed each subtract path.hops.iter().map(|h| h.fee_msat).sum() — which is the amount plus routing fees for
that path.
Two problems compound:
- Multiple paths: added once, subtracted N times (one per path event). With saturating_sub this silently clamps to 0, but the budget will appear
to clear prematurely.
- Fee mismatch: even for a single path, the subtracted amount includes routing fees (amount_msat + fees), while only amount_msat was added. So
the subtraction is always ≥ what was added.
The PrebuiltRoute variant is fine — it consistently uses path.hops.iter().map(|h| h.fee_msat).sum() on both sides.
2. Bug: Fee calculation in RandomStrategy::try_build_path uses final amount instead of forwarded amount
At src/probing.rs:462:
let fee = update.fees.base_msat as u64
+ (amount_msat * update.fees.proportional_millionths as u64 / 1_000_000);
This computes each intermediate hop's fee based on the delivery amount_msat, but each hop actually charges fees on the amount it forwards
(delivery + all downstream fees). For routes > 2 hops, this underestimates intermediate fees. The total_outgoing check at the end could then let
through routes that are actually too large for the first-hop HTLC limit. This is minor for probing purposes since probes fail deliberately, but
it's still incorrect.
3. Doc inconsistency on HighDegreeStrategy
The struct-level doc comment says "Returns None (skips the tick) if all top nodes are on cooldown." But the implementation at line 276-286 does
the opposite — when all top nodes are on cooldown, it clears the cooldown map and returns the first node, never returning None on cooldown
exhaustion. The None return only happens when the graph is empty.
4. random_range overflow edge case
In src/util.rs:14, let range = max - min + 1 overflows to 0 when max - min == u64::MAX, causing a division-by-zero panic at u64::MAX % range.
This can't happen with current callers, but the function has no guard against it.
5. Race condition in budget check (minor/theoretical)
In run_prober, the budget check (locked_msat.load() + amount > max_locked_msat) and the subsequent fetch_add are not atomic. Since the prober
loop is single-threaded and the event handler only decreases locked_msat, this can't cause the budget to be exceeded — the worst case is a
slightly stale read that rejects a probe that would have fit. So this is benign, but worth noting.
6. scorer_channel_liquidity is very expensive
src/lib.rs:1090-1108 serializes the entire scorer to a Vec<u8> and then deserializes it as a ProbabilisticScorer just to call
estimated_channel_liquidity_range. This is O(scorer size) per call. It's only used in tests currently, but it's a public API — worth a doc
warning or a #[cfg(test)] gate if it's not intended for production use.
---
Issues #1 and #2 are real bugs. #1 in particular means the budget tracking for Destination probes is systematically wrong.
Added a probing service which is used to send probes to estimate channels' capacities.
Related issue: #765.
Probing is intended to be used in two ways:
For probing a new abstraction
Proberis defined and is (optionally) created during node building.Prober periodically sends probes to feed the data to the scorer.
Prober sends probes using a ProbingStrategy.
ProbingStrategy trait has only one method:
fn next_probe(&self) -> Option<Probe>; every tick it generates a probe, whereProberepresents how to send a probe.To accommodate two different ways the probing is used, we either construct a probing route manually (
Probe::PrebuiltRoute) or rely on the router/scorer (Probe::Destination).Prober tracks how much liquidity is locked in-flight in probes, prevents the new probes from firing if the cap is reached.
There are two probing strategies implemented:
Random probing strategy, it picks a random route from the current node, the route is probed via
send_probe, thus ignores scoring parameters (what hops to pick), it also ignoresliquidity_limit_multiplierwhich prohibits taking a hop if its capacity is too small. It is a true random route.High degree probing strategy, it examines the graph and finds the nodes with the biggest number of (public) channels and probes routes to them using
send_spontaneous_preflight_probeswhich uses the current router/scorer.The former is meant to be used on payment nodes, while the latter on probing nodes. For the HighDegreeStrategy to work it is recommended to set
probing_diversity_penalty_msatto some nonzero value to prevent routes reuse, however it may fail to find any available routes.There are three tests added:
Example output (runs for ~1 minute, needs
--nocaptureflag):For performance testing I had to expose the scoring data (
scorer_channel_liquidity).Also exposed
scoring_fee_params: ProbabilisticScoringFeeParameterstoConfig.TODOs: