
deposit fetch produces invalid aggregate signature when fewer than total operators submit partials #4525

@nickh-obol

🐞 Bug Report

Description

When charon deposit fetch is run after only a threshold-sized subset of operators (but not all operators) have run charon deposit sign, the aggregated signature returned by MarshalDepositData fails BLS verification with invalid deposit data signature: signature not verified, even though every partial signature is individually valid and signed over the same message.

Root cause: the client-side aggregation in app/obolapi/deposit.go derives each partial's share index from its position in the API response slice (rawSignatures[sigIdx+1] = sig) rather than looking it up in the cluster lock. The Obol API response does not pad missing operators with empty entries; it returns only the partials that were actually submitted, packed contiguously. So if, for example, the operators with share indices {2, 3, 4} signed (operator 1 has not), the client treats them as {1, 2, 3} and tbls.ThresholdAggregate runs with the wrong Lagrange coefficients. The output is a 96-byte blob that decodes successfully but fails BLS verification against the deposit message.
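To illustrate why mislabeled share indices break the aggregate, here is a toy sketch of Lagrange interpolation at zero over a small prime field. It is not Charon's tbls code: the modulus, the polynomial, and the helper names (evalPoly, interpolateAtZero, recoverSecret) are assumptions made for illustration, and real BLS threshold aggregation interpolates curve points rather than integers. The arithmetic on the coefficients is the same idea, though.

```go
package main

import (
	"fmt"
	"math/big"
)

// p is a small prime chosen only for this toy example (assumption); BLS
// threshold signatures use the BLS12-381 scalar field instead.
var p = big.NewInt(7919)

// evalPoly evaluates coeffs[0] + coeffs[1]*x + coeffs[2]*x^2 + ... mod p.
func evalPoly(coeffs []int64, x int64) *big.Int {
	res := big.NewInt(0)
	bx := big.NewInt(x)
	for i := len(coeffs) - 1; i >= 0; i-- {
		res.Mul(res, bx)
		res.Add(res, big.NewInt(coeffs[i]))
		res.Mod(res, p)
	}
	return res
}

// interpolateAtZero recovers f(0) from the points (xs[i], ys[i]) using
// Lagrange coefficients computed from xs.
func interpolateAtZero(xs []int64, ys []*big.Int) *big.Int {
	secret := big.NewInt(0)
	for i := range xs {
		num, den := big.NewInt(1), big.NewInt(1)
		for j := range xs {
			if i == j {
				continue
			}
			num.Mul(num, big.NewInt(-xs[j])) // (0 - x_j)
			num.Mod(num, p)
			den.Mul(den, big.NewInt(xs[i]-xs[j])) // (x_i - x_j)
			den.Mod(den, p)
		}
		inv := new(big.Int).ModInverse(den, p)
		term := new(big.Int).Mul(num, inv)
		term.Mul(term, ys[i])
		secret.Add(secret, term)
		secret.Mod(secret, p)
	}
	return secret
}

// recoverSecret creates shares at trueIdx, then interpolates them as if they
// belonged to assumedIdx, mimicking a client that mislabels share indices.
func recoverSecret(coeffs, trueIdx, assumedIdx []int64) int64 {
	ys := make([]*big.Int, len(trueIdx))
	for i, x := range trueIdx {
		ys[i] = evalPoly(coeffs, x)
	}
	return interpolateAtZero(assumedIdx, ys).Int64()
}

func main() {
	coeffs := []int64{1234, 166, 94} // degree 2, i.e. threshold 3; secret is 1234
	signed := []int64{2, 3, 4}

	fmt.Println(recoverSecret(coeffs, signed, []int64{2, 3, 4})) // 1234
	fmt.Println(recoverSecret(coeffs, signed, []int64{1, 2, 3})) // 1494
}
```

With the true indices the secret is recovered; with the positional labels {1, 2, 3} the same three valid shares combine into a different value, which is the toy analogue of a well-formed signature that does not verify.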

The continue for empty signatures at app/obolapi/deposit.go:131-134 suggests the original intent was for the API to return fixed positional slots, but in practice it does not.
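The difference between the two response shapes can be sketched as follows. This is a simplified stand-in, not the code in app/obolapi/deposit.go: indexByPosition and the string placeholders are hypothetical, and only the "position + 1, skip empty" pattern described above is modeled.

```go
package main

import "fmt"

// indexByPosition models the positional assumption: the entry at slice
// position i is treated as share index i+1, and empty entries are skipped.
func indexByPosition(partials []string) map[int]string {
	out := make(map[int]string)
	for i, sig := range partials {
		if sig == "" {
			continue // placeholder slot for an operator that has not signed
		}
		out[i+1] = sig
	}
	return out
}

func main() {
	// Fixed positional slots: operator 1 is present as an empty entry.
	padded := []string{"", "sig2", "sig3", "sig4"}
	// Shape observed in this report: only submitted partials, packed.
	packed := []string{"sig2", "sig3", "sig4"}

	fmt.Println(indexByPosition(padded)) // map[2:sig2 3:sig3 4:sig4]
	fmt.Println(indexByPosition(packed)) // map[1:sig2 2:sig3 3:sig4]
}
```

Only the padded shape yields the correct index for each signature; with the packed shape every signature is shifted down by one.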

Has this worked before in a previous version?

Unknown. Likely present since "cmd: add API partial deposit flow" (#4032). The symptom only manifests with a strict-threshold submission, i.e. when fewer than the total number of operators sign.

🔬 Minimal Reproduction

In a cluster with 4 operators / threshold 3:

  1. Operators 2, 3, and 4 each run:
    charon deposit sign \
      --validator-public-keys=0x<pubkey> \
      --withdrawal-addresses=0x010000000000000000000000<addr>
    
  2. Operator 1 does not submit.
  3. Any operator runs:
    charon deposit fetch --validator-public-keys=0x<pubkey>
    

Result: Application failed to start: invalid deposit data signature: signature not verified.

Confirmed against the live Obol API by inspecting GET /v1/deposit_data/<lockHash>/<valPubkey>: the response contains exactly 3 partials entries (no empty positional slot for the missing operator). Matching each partial_public_key against distributed_validators[].public_shares in the cluster lock shows that the actual share indices are non-contiguous from the client's point of view (e.g. {2, 3, 4}), not {1, 2, 3} as the aggregation code assumes.

Workaround: make sure all operators (not just a threshold) submit a partial deposit. Then slice positions and share indices coincide and aggregation works.

🔥 Error


INFO cmd        Fetching full deposit message
INFO cmd        Fetched full deposit message
ERRO cmd        Application failed to start: invalid deposit data signature: signature not verified
	tbls/herumi.go:20 .init

Suggested fix

Either:

  1. Client-side (preferred): In app/obolapi/deposit.go, derive each partial's share index by looking up its partial_public_key against the cluster lock's distributed_validators[].public_shares (the caller already has *cluster.Lock). This is robust regardless of API response shape and removes the implicit ordering contract.
  2. Server-side: Have the API return a fixed total-length slice with empty Partial{} entries for operators that have not submitted.

(1) is safer because it makes the client self-sufficient and would catch any future API ordering changes.
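A minimal sketch of option (1), under stated assumptions: the partial type, shareIndexFor, and indexPartials are hypothetical names, not existing Charon APIs, and the sketch assumes that the i-th entry of a validator's public_shares corresponds to share index i+1. In the real client the resulting map would then be passed to the threshold aggregation in place of the positional one.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// partial is a hypothetical, simplified view of one API response entry.
type partial struct {
	PubKey    []byte // partial_public_key from the API response
	Signature []byte // partial deposit signature
}

// shareIndexFor returns the 1-based share index whose public share equals
// partialPubKey. Assumption: publicShares is ordered by share index.
func shareIndexFor(publicShares [][]byte, partialPubKey []byte) (int, error) {
	for i, share := range publicShares {
		if bytes.Equal(share, partialPubKey) {
			return i + 1, nil
		}
	}
	return 0, errors.New("partial public key not found in lock public shares")
}

// indexPartials keys each signature by the share index found in the lock,
// independent of the order or padding of the API response.
func indexPartials(publicShares [][]byte, partials []partial) (map[int][]byte, error) {
	out := make(map[int][]byte, len(partials))
	for _, p := range partials {
		idx, err := shareIndexFor(publicShares, p.PubKey)
		if err != nil {
			return nil, err
		}
		if _, dup := out[idx]; dup {
			return nil, fmt.Errorf("duplicate partial for share index %d", idx)
		}
		out[idx] = p.Signature
	}
	return out, nil
}

func main() {
	// Placeholder byte strings stand in for real BLS public shares.
	shares := [][]byte{[]byte("pk1"), []byte("pk2"), []byte("pk3"), []byte("pk4")}
	resp := []partial{
		{PubKey: []byte("pk3"), Signature: []byte("sig3")},
		{PubKey: []byte("pk2"), Signature: []byte("sig2")},
		{PubKey: []byte("pk4"), Signature: []byte("sig4")},
	}

	byIdx, err := indexPartials(shares, resp)
	if err != nil {
		panic(err)
	}
	for idx := 1; idx <= len(shares); idx++ {
		if sig, ok := byIdx[idx]; ok {
			fmt.Printf("share %d -> %s\n", idx, sig)
		}
	}
}
```

Rejecting unknown keys and duplicates, rather than silently skipping them, would also surface a malformed response early instead of at BLS verification time.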

🌍 Your Environment

Operating System: Linux

What version of Charon are you running? `obolnetwork/charon:v1.10.0` (docker image)

Anything else relevant? Cluster: 4 operators, threshold 3, mainnet, non-compounding. Triggered while running charon deposit sign / charon deposit fetch to update withdrawal credentials before activation.

Labels: protocol (Protocol Team tickets)
