smartcontract/telemetry: implement RFC 19 - User BGP Status #3461

@juan-malbeclabs

Summary

Persist the real BGP session state of each user onchain so that consumers can distinguish users who are configured but never connected from users with healthy sessions. Full design: rfcs/rfc19-user-bgp-status.md.

Tasks

PR 1 - Instruction scaffolding

Register the new SetUserBGPStatus instruction (variant 106) in the serviceability program so the instruction number and dispatch table are locked in before the account layout changes land.

  • Add BGPStatus enum (Unknown, Up, Down) to the user state module
  • Create SetUserBGPStatusArgs struct with a bgp_status field
  • Add the instruction variant to the DoubleZeroInstruction enum
  • Wire up the dispatch in the program entrypoint (stub processor - returns InvalidInstructionData until PR 3)
  • Add a unit test verifying that variant 106 round-trips through unpack
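
The round-trip test in the last bullet can be illustrated with a minimal sketch. It is written in Python for brevity, not in the serviceability program itself. The two-byte wire layout (variant byte followed by status byte), the numeric enum values (taken from the order listed above), and the pack/unpack helper names are assumptions for illustration only; the real program's serialization may differ.

```python
from dataclasses import dataclass
from enum import IntEnum

SET_USER_BGP_STATUS_VARIANT = 106  # instruction number named in this issue


class BGPStatus(IntEnum):
    # Numeric values are assumed from the order listed in the task above.
    UNKNOWN = 0
    UP = 1
    DOWN = 2


@dataclass(frozen=True)
class SetUserBGPStatusArgs:
    bgp_status: BGPStatus


def pack(args: SetUserBGPStatusArgs) -> bytes:
    # Assumed layout: [variant: u8][bgp_status: u8].
    return bytes([SET_USER_BGP_STATUS_VARIANT, int(args.bgp_status)])


def unpack(data: bytes) -> SetUserBGPStatusArgs:
    # Reject anything that is not exactly the assumed two-byte encoding.
    if len(data) != 2 or data[0] != SET_USER_BGP_STATUS_VARIANT:
        raise ValueError("invalid instruction data")
    # BGPStatus(...) raises ValueError for an unknown status byte.
    return SetUserBGPStatusArgs(BGPStatus(data[1]))
```

A round-trip check then packs each status and asserts that unpack returns an equal args value.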

PR 2 - Account changes & SDK updates

Add the three new fields to the User struct and propagate the change to all read-only SDKs. Must ship before PR 3 so that consumers can read the new fields without deserialization errors.

  • Add bgp_status (1 byte), last_bgp_up_at (8 bytes), and last_bgp_reported_at (8 bytes) to the User struct in the serviceability program
  • Update the internal Go SDK (smartcontract/sdk/go/serviceability) to deserialize the new fields
  • Update the external Go SDK (sdk/serviceability/go) to deserialize the new fields and expose BGPStatus constants
  • Update the TypeScript SDK (sdk/serviceability/typescript) to include the new fields
  • Update the Python SDK (sdk/serviceability/python) to include the new fields
  • Regenerate the binary/JSON fixtures (make generate-fixtures)
  • Update compat tests in all three sdk/serviceability SDKs (Go, TypeScript, Python) to assert on the new fields
  • Add a backward-compatibility test in each SDK: an account in the old (shorter) layout deserializes the new fields as zero/unknown
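
The backward-compatibility requirement in the last bullet can be sketched as follows. This is a hypothetical reader, not the SDK's actual code: it assumes the three fields are appended at the end of the account data, are little-endian, and that the two 8-byte timestamps are unsigned. The `offset` parameter stands in for wherever the existing User fields end, which this issue does not specify.

```python
import struct
from dataclasses import dataclass

# 1 + 8 + 8 = 17 bytes; "<" means little-endian with no padding (assumed).
BGP_TAIL = struct.Struct("<BQQ")


@dataclass(frozen=True)
class BGPFields:
    bgp_status: int = 0  # 0 is assumed to mean Unknown
    last_bgp_up_at: int = 0
    last_bgp_reported_at: int = 0


def read_bgp_fields(account_data: bytes, offset: int) -> BGPFields:
    """Read the trailing BGP fields at `offset`, defaulting them if absent."""
    if len(account_data) < offset + BGP_TAIL.size:
        # Old (shorter) layout: report zero/unknown instead of failing.
        return BGPFields()
    return BGPFields(*BGP_TAIL.unpack_from(account_data, offset))
```

A test in this style feeds the same prefix with and without the 17-byte tail and asserts that the short account yields the default values.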

PR 3 - Processor implementation

Replace the stub from PR 1 with the real processor logic. Depends on PR 2.

  • Implement the full SetUserBGPStatus processor:
    • Validate that the signer is the device's metrics_publisher_pk
    • Validate that user.device_pk matches the device account
    • Update bgp_status and last_bgp_reported_at on every write
    • Update last_bgp_up_at only when the new status is Up
    • Account reallocation (17 bytes: 1 + 8 + 8 for the three new fields) is handled automatically on first write
  • Integration tests:
    • First write Up: account resized, all three fields set correctly
    • First write Down: last_bgp_up_at remains zero
    • Multiple writes: last_bgp_up_at advances only on writes with status Up; last_bgp_reported_at advances on every write
    • Wrong signer: expect NotAllowed
    • User/device mismatch: expect NotAllowed
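
The validation and update rules above can be modeled in a few lines. This is a sketch in Python, not the onchain processor: the Device and User classes, the string keys, the `now` timestamp argument, and the NotAllowed exception are stand-ins, and account reallocation is left out.

```python
from dataclasses import dataclass

UNKNOWN, UP, DOWN = 0, 1, 2  # numeric values are assumed


class NotAllowed(Exception):
    """Stand-in for the program's NotAllowed error."""


@dataclass
class Device:
    pubkey: str
    metrics_publisher_pk: str


@dataclass
class User:
    device_pk: str
    bgp_status: int = UNKNOWN
    last_bgp_up_at: int = 0
    last_bgp_reported_at: int = 0


def set_user_bgp_status(user: User, device: Device, signer: str,
                        new_status: int, now: int) -> None:
    # Only the device's metrics publisher may report BGP status.
    if signer != device.metrics_publisher_pk:
        raise NotAllowed("signer is not the device's metrics publisher")
    # The user must belong to the device whose agent is reporting.
    if user.device_pk != device.pubkey:
        raise NotAllowed("user does not belong to this device")
    # Status and report time change on every write.
    user.bgp_status = new_status
    user.last_bgp_reported_at = now
    # The last-up time changes only when the new status is Up.
    if new_status == UP:
        user.last_bgp_up_at = now
```

Under this model a first write of Down leaves last_bgp_up_at at zero, and a later Down after an Up keeps the earlier last_bgp_up_at while last_bgp_reported_at moves forward.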

Phase 4 - Telemetry agent (separate plan, after PR 3 is deployed)

Extend the telemetry agent to submit SetUserBGPStatus onchain after each BGP socket collection tick.

  • Fetch all activated users for the device from the serviceability program on each collection tick
  • Map each user to its BGP peer IP from TunnelNet
  • Determine per-user status: Up if an established socket with that peer IP exists, Down otherwise
  • Submit SetUserBGPStatus only when the computed status differs from the last known onchain value, or when the last write is older than a configurable interval (to keep last_bgp_reported_at fresh)
  • Implement submissions via a non-blocking background worker with retry so a single RPC error does not block the collection tick
  • Wire the existing metrics publisher keypair and Solana RPC client into the new submitter
  • Add configuration flags: periodic refresh interval (default 6h) and Down grace period
  • Unit tests for the IP-mapping logic, submission filtering, and periodic refresh behavior
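
The status determination and submission filter described above might look like the following. This is a rough sketch in Python, not the telemetry agent's code: the function names, the string status values, and the use of seconds for timestamps are assumptions. The Down grace period is not modeled because this issue does not define its semantics.

```python
from typing import Dict, Iterable

UP, DOWN = "Up", "Down"
DEFAULT_REFRESH_S = 6 * 3600  # 6h default refresh interval from this issue


def compute_statuses(user_peer_ips: Dict[str, str],
                     established_peer_ips: Iterable[str]) -> Dict[str, str]:
    """Map each user to Up if an established socket to its peer IP exists."""
    established = set(established_peer_ips)
    return {user: (UP if ip in established else DOWN)
            for user, ip in user_peer_ips.items()}


def should_submit(computed: str, onchain_status: str, last_write_at: float,
                  now: float, refresh_s: float = DEFAULT_REFRESH_S) -> bool:
    """Submit on a status change, or when the last write is too old."""
    if computed != onchain_status:
        return True
    return now - last_write_at > refresh_s
```

Keeping both functions free of I/O means the unit tests listed above can exercise the IP mapping and the filtering without an RPC client; the background worker would call should_submit per user and enqueue only the writes that pass.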
