2 changes: 1 addition & 1 deletion .gitignore
@@ -16,6 +16,6 @@ bin/
# Environment files
.env
.env.*

.DS_Store
# XML files (repomix output)
*.xml
2 changes: 1 addition & 1 deletion Dockerfile
@@ -1,4 +1,4 @@
FROM golang:1.24.5-alpine AS builder
FROM golang:1.25-alpine AS builder

WORKDIR /app

67 changes: 65 additions & 2 deletions README.md
@@ -2,13 +2,27 @@

This service acts as a dedicated bootstrap node for the PowerLoom decentralized sequencer network. Its primary purpose is to provide a stable, well-known entry point for other libp2p nodes (like snapshotters and validators) to discover and connect to the network.

By connecting to this bootstrap node, new peers can quickly find other participants in the network, facilitating efficient peer discovery and message propagation for Gossipsub topics.
By connecting to this bootstrap node, new peers obtain a stable dial target and DHT routing assistance to find other participants; gossipsub mesh formation happens directly between snapshotters and validators.

## Features

- **Stable Entry Point:** Provides a consistent multiaddress for new nodes to join the network.
- **Peer Discovery:** Helps other nodes discover more peers in the network via libp2p's DHT.
- **Lightweight:** Designed to be a simple, robust, and long-running service with minimal overhead.
- **Discovery-only:** Does **not** run gossipsub. Snapshotters and validators carry mesh traffic; the bootstrap node only accepts dial-ins and serves DHT routing.
- **Lightweight defaults:** Tight connection limits, bounded libp2p memory, no per-connection info logs unless opted in.

## Resource model (why older builds used 2+ GiB / 300% CPU)

A bootstrap node only needs TCP listen + Kademlia DHT. Prior versions also started **full gossipsub** (700ms heartbeats, peer scoring, flood publish) on every inbound peer, enabled **circuit relay** by default, allowed **500–2000** connections, and logged **every** connect/disconnect at info. That behaves like a mesh participant, not a rendezvous point — the `fix/memory-leak` peerstore GC did not change that.

| Setting | Default (new) | Typical old prod `.env` |
|---|---|---|
| `CONN_MANAGER_HIGH_WATER` | `128` | `800` |
| Gossipsub | off | on (unused, no topic join) |
| `ENABLE_RELAY_SERVICE` | `true` (capped) | relay on, unbounded |
| `RELAY_MAX_RESERVATIONS` | `256` | — |
| `LOG_PEER_CONNECTIONS` | `false` | info log per peer |
| `RCMGR_MEMORY_LIMIT_MB` | `512` | unlimited |
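
The defaults in the table above imply that each setting falls back to a built-in value when the variable is absent. The following is a minimal sketch of that fallback pattern, not the repository's actual `config.LoadConfig`; the helper name `intSetting` and its injectable `lookup` parameter are hypothetical and exist only for illustration. The default values passed in `main` are the ones listed in the table.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// intSetting returns the integer value of key, or def when the variable is
// unset, empty, or not a valid integer. The lookup function is injectable so
// the helper can be exercised without touching the real environment.
// NOTE: hypothetical helper, not part of the bootstrap node's code.
func intSetting(lookup func(string) (string, bool), key string, def int) int {
	raw, ok := lookup(key)
	if !ok || raw == "" {
		return def
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return def
	}
	return n
}

func main() {
	// Defaults copied from the "Default (new)" column of the table.
	highWater := intSetting(os.LookupEnv, "CONN_MANAGER_HIGH_WATER", 128)
	memLimitMB := intSetting(os.LookupEnv, "RCMGR_MEMORY_LIMIT_MB", 512)
	fmt.Printf("high_water=%d rcmgr_mem_mb=%d\n", highWater, memLimitMB)
}
```

Falling back silently on a malformed value is a design choice made here for brevity; a production loader might prefer to log or fail fast instead.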

## Build

@@ -74,6 +88,15 @@ To ensure a consistent Peer ID and multiaddress for your bootstrap node, you sho

```dotenv
PRIVATE_KEY=your_generated_private_key_here
PUBLIC_IP=your.vps.public.ip
# NAT snapshotters without PUBLIC_IP use bootstrap as AutoRelay static relay (default on)
# ENABLE_RELAY_SERVICE=true
# RELAY_MAX_RESERVATIONS=256
# RELAY_MAX_RESERVATIONS_PER_IP=32
# CONN_MANAGER_HIGH_WATER=256
# RCMGR_MEMORY_LIMIT_MB=512
# LOG_PEER_CONNECTIONS=false
# LIBP2P_LOGGING=warn
```

## Run (Local Executable)
@@ -107,6 +130,46 @@ INFO[2025-07-10T17:16:04+05:30] Listening on addresses: [/ip4/127.0.0.1/tcp/4001
From the example above, a full multiaddress to use for other nodes would be:
`/ip4/127.0.0.1/tcp/4001/p2p/12D3KooWCNsSau1o9MeMVpHudvHaZRLESRcaGVK9FPKhdLU36BtF`
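
As a rough illustration of how the full multiaddress above is assembled, the sketch below joins a listen address and a Peer ID with the `/p2p/` component. It uses plain string handling to stay self-contained; the function name `fullMultiaddr` is hypothetical, and a real node would more likely build and validate the address with a multiaddr library.

```go
package main

import (
	"fmt"
	"strings"
)

// fullMultiaddr appends the /p2p/<peer-id> component to a listen address.
// A trailing slash on the listen address is tolerated.
// NOTE: illustrative helper only; it performs no multiaddr validation.
func fullMultiaddr(listenAddr, peerID string) string {
	return strings.TrimRight(listenAddr, "/") + "/p2p/" + peerID
}

func main() {
	addr := fullMultiaddr(
		"/ip4/127.0.0.1/tcp/4001",
		"12D3KooWCNsSau1o9MeMVpHudvHaZRLESRcaGVK9FPKhdLU36BtF",
	)
	fmt.Println(addr)
	// prints /ip4/127.0.0.1/tcp/4001/p2p/12D3KooWCNsSau1o9MeMVpHudvHaZRLESRcaGVK9FPKhdLU36BtF
}
```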

## Debugging Memory / Performance

### Periodic Status Logs

Every 60 seconds the node logs a status line with key metrics:

```
Status: connected=150 peerstore=152 dht_rt=20 goroutines=45 heap_alloc=28MB heap_inuse=32MB sys=55MB
```

| Metric | What to watch for |
|---|---|
| `peerstore` growing >> `connected` | Peerstore GC not cleaning fast enough |
| `dht_rt` growing unbounded | DHT routing table accumulating entries |
| `goroutines` growing | Goroutine leak |
| `heap_alloc` growing while others stable | Leak in libp2p internals (gossipsub, relay, etc.) |
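
The checks in the table above can be automated by parsing the status line. The sketch below assumes the `key=value` layout shown in the example line; the function names `parseStatus` and `peerstoreLagging` and the factor of 2 are assumptions made for illustration, not part of the node.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseStatus extracts the integer metrics from a status line such as
// "Status: connected=150 peerstore=152 ... heap_alloc=28MB".
// A trailing "MB" unit is stripped; tokens that are not key=<integer>
// are skipped.
func parseStatus(line string) map[string]int {
	metrics := make(map[string]int)
	for _, field := range strings.Fields(line) {
		key, value, ok := strings.Cut(field, "=")
		if !ok {
			continue
		}
		n, err := strconv.Atoi(strings.TrimSuffix(value, "MB"))
		if err != nil {
			continue
		}
		metrics[key] = n
	}
	return metrics
}

// peerstoreLagging reports whether the peerstore holds more than factor
// times as many peers as are currently connected.
// NOTE: the threshold is an assumption; tune it to your deployment.
func peerstoreLagging(m map[string]int, factor int) bool {
	return m["peerstore"] > factor*m["connected"]
}

func main() {
	line := "Status: connected=150 peerstore=152 dht_rt=20 goroutines=45 heap_alloc=28MB heap_inuse=32MB sys=55MB"
	m := parseStatus(line)
	fmt.Println(m["connected"], m["peerstore"], m["heap_alloc"], peerstoreLagging(m, 2))
	// prints 150 152 28 false
}
```

Comparing successive samples of `goroutines` or `heap_alloc` with the same parser gives a quick trend check before reaching for pprof.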

### pprof Endpoint

Set `PPROF_PORT=6060` in your `.env` file to enable the Go pprof debug server. The port is already wired in `docker-compose.yaml`.

```bash
# Heap profile — what's using memory right now
go tool pprof http://localhost:6060/debug/pprof/heap

# Allocations — what's been allocating the most over time
go tool pprof -alloc_space http://localhost:6060/debug/pprof/heap

# Compare two snapshots to find what grew (most useful)
curl -o heap1.pb.gz http://localhost:6060/debug/pprof/heap
# ... wait 30 min ...
curl -o heap2.pb.gz http://localhost:6060/debug/pprof/heap
go tool pprof -base heap1.pb.gz heap2.pb.gz

# Goroutine dump
curl 'http://localhost:6060/debug/pprof/goroutine?debug=2'
```

The pprof diff (`-base`) is the most powerful — it shows exactly which allocations grew in the window, narrowing down whether the source is peerstore, DHT, gossipsub, relay, or something else.

## Usage with Other Nodes

To configure other libp2p nodes (like the `snapshotter-lite-local-collector` or `submission-topic-watcher`) to use this bootstrap node, you typically pass its full multiaddress via a command-line flag or environment variable (e.g., `--bootstrap` flag for the watcher, or `BOOTSTRAP_NODE_ADDR` environment variable for the collector).
12 changes: 11 additions & 1 deletion build-docker.sh
@@ -1,4 +1,14 @@
#!/bin/bash

# Detect docker compose command (docker compose plugin vs docker-compose standalone)
if docker compose version >/dev/null 2>&1; then
DOCKER_COMPOSE="docker compose"
elif docker-compose version >/dev/null 2>&1; then
DOCKER_COMPOSE="docker-compose"
else
echo "Error: Neither 'docker compose' nor 'docker-compose' found. Please install Docker Compose."
exit 1
fi

# Build the Docker image using docker-compose
docker-compose build
$DOCKER_COMPOSE build
37 changes: 32 additions & 5 deletions cmd/main.go
@@ -6,8 +6,11 @@ import (
"encoding/hex"
"flag"
"fmt"
"net/http"
_ "net/http/pprof"
"os"
"os/signal"
"runtime"
"strconv"
"submissions-bootstrap-node/pkg/config"
"submissions-bootstrap-node/pkg/service"
@@ -100,8 +103,10 @@ func main() {
return
}

// Load config
cfg := config.LoadConfig()
log.Infof("Bootstrap config: conn_water=%d/%d relay=%t relay_slots=%d rcmgr_mem_mb=%d log_peer_conns=%t",
cfg.ConnManagerLowWater, cfg.ConnManagerHighWater,
cfg.EnableRelayService, cfg.RelayMaxReservations, cfg.RcmgrMemoryLimitMB, cfg.LogPeerConnections)

// Create a context that is canceled on a graceful shutdown signal
ctx, cancel := context.WithCancel(context.Background())
@@ -117,17 +122,36 @@ func main() {
log.Infof("🚀 Bootstrap node started. ID: %s", node.Host.ID().String())
log.Infof("🌍 Listening on addresses: %s", node.Host.Addrs())

// Start periodic peer logging
// Start pprof debug server if PPROF_PORT is set
if pprofPort := os.Getenv("PPROF_PORT"); pprofPort != "" {
go func() {
addr := ":" + pprofPort
log.Infof("Starting pprof server on %s", addr)
if err := http.ListenAndServe(addr, nil); err != nil {
log.Errorf("pprof server failed: %v", err)
}
}()
}

// Start periodic peer and memory logging
go func() {
ticker := time.NewTicker(60 * time.Second) // Log every 10 seconds
ticker := time.NewTicker(60 * time.Second)
defer ticker.Stop()
for {
select {
case <-ctx.Done():
return
case <-ticker.C:
peers := node.Host.Network().Peers()
log.Infof("Connected peers: %d", len(peers))
peerstoreSize := len(node.Host.Peerstore().Peers())
dhtSize := node.DHT.RoutingTable().Size()

var m runtime.MemStats
runtime.ReadMemStats(&m)

log.Infof("Status: connected=%d peerstore=%d dht_rt=%d goroutines=%d heap_alloc=%dMB heap_inuse=%dMB sys=%dMB",
len(peers), peerstoreSize, dhtSize, runtime.NumGoroutine(),
m.HeapAlloc/1024/1024, m.HeapInuse/1024/1024, m.Sys/1024/1024)
for _, p := range peers {
log.Debugf(" - %s", p.String())
}
@@ -139,8 +163,11 @@ func main() {
sigs := make(chan os.Signal, 1)
signal.Notify(sigs, syscall.SIGINT, syscall.SIGTERM)
<-sigs
signal.Stop(sigs)

fmt.Println()
log.Info("Shutting down bootstrap node...")
node.Host.Close()
if err := node.Close(); err != nil {
log.Errorf("Error during shutdown: %v", err)
}
}
10 changes: 8 additions & 2 deletions docker-compose.yaml
@@ -1,12 +1,11 @@
version: '3.8'

services:
bootstrap-node:
build:
context: .
dockerfile: Dockerfile
ports:
- "${BOOTSTRAP_PORT:-4001}:${BOOTSTRAP_PORT:-4001}"
- "${PPROF_PORT:-6060}:${PPROF_PORT:-6060}"
env_file:
- ./.env
restart: unless-stopped
@@ -17,3 +16,10 @@ services:
environment:
- LOG_FILE=/app/logs/bootstrap-node.log
- LOG_LEVEL=${LOG_LEVEL:-info}
- PPROF_PORT=${PPROF_PORT:-6060}

P1: Keep pprof disabled unless explicitly configured

Setting `PPROF_PORT=${PPROF_PORT:-6060}` makes `PPROF_PORT` non-empty even when operators do not set it, because Compose `:-` interpolation injects the default when the variable is unset or empty. Since `cmd/main.go` starts pprof whenever `PPROF_PORT` is non-empty, this change enables and publishes the debug profiler by default, exposing runtime profiling endpoints in environments that expected pprof to stay opt-in.

P1: Do not enable pprof by default

In the Docker Compose path, the README says `PPROF_PORT` must be set to enable pprof, but this line injects `6060` even when `.env` omits it; combined with the newly added port mapping, every default compose deployment starts `http.ListenAndServe` on `:6060` and exposes `/debug/pprof/*` on the host. For public bootstrap nodes this leaks heap, goroutine, and cmdline profiling data and can run expensive profiles without opt-in, so keep `PPROF_PORT` unset unless it is explicitly configured.
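
One possible way to harden the Go side against the issue described in these comments is sketched below. It is an assumption, not code from this PR: the helper `pprofListenAddr` is hypothetical, and binding to loopback is a choice made here for illustration. It keeps pprof off when the port is empty or invalid, and otherwise binds to `127.0.0.1` rather than all interfaces. It does not by itself fix the Compose default, which would still need to leave `PPROF_PORT` unset.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"strconv"
)

// pprofListenAddr returns the address the debug server should bind to and
// whether it should start at all. An empty or invalid port keeps pprof off;
// a valid port is bound to loopback only.
// NOTE: hypothetical helper, not part of cmd/main.go in this PR.
func pprofListenAddr(port string) (string, bool) {
	if port == "" {
		return "", false
	}
	n, err := strconv.Atoi(port)
	if err != nil || n < 1 || n > 65535 {
		return "", false
	}
	return net.JoinHostPort("127.0.0.1", strconv.Itoa(n)), true
}

func main() {
	addr, enabled := pprofListenAddr(os.Getenv("PPROF_PORT"))
	fmt.Printf("pprof enabled=%t addr=%q\n", enabled, addr)
}
```

A loopback bind inside a container is not reachable through a published port, so with this approach profiles would have to be collected from inside the container or through a tunnel. That trade-off favors safety on public nodes over convenience.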

logging:
driver: "json-file"
options:
max-size: "10m"
max-file: "5"
compress: "true"
4 changes: 0 additions & 4 deletions go.mod
@@ -6,9 +6,7 @@ require (
github.com/ipfs/go-log/v2 v2.8.1
github.com/libp2p/go-libp2p v0.43.0
github.com/libp2p/go-libp2p-kad-dht v0.34.0
github.com/libp2p/go-libp2p-pubsub v0.14.2
github.com/multiformats/go-multiaddr v0.16.1
github.com/powerloom/snapshot-sequencer-validator v0.0.0-20250901113836-e70d23e8c3cd
github.com/sirupsen/logrus v1.9.3
gopkg.in/natefinch/lumberjack.v2 v2.2.1
)
@@ -24,12 +22,10 @@ require (
github.com/francoispqt/gojay v1.2.13 // indirect
github.com/go-logr/logr v1.4.3 // indirect
github.com/go-logr/stdr v1.2.2 // indirect
github.com/gogo/protobuf v1.3.2 // indirect
github.com/google/gopacket v1.1.19 // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/gorilla/websocket v1.5.3 // indirect
github.com/hashicorp/golang-lru v1.0.2 // indirect
github.com/hashicorp/golang-lru/v2 v2.0.7 // indirect
github.com/huin/goupnp v1.3.0 // indirect
github.com/ipfs/boxo v0.34.0 // indirect
github.com/ipfs/go-cid v0.5.0 // indirect