Add row miner refresh RPC by flesher · Pull Request #437 · block/proto-fleet

flesher · 2026-06-11T20:37:44Z

This PR adds a row-scoped miner refresh path so operators can ask Fleet to re-read one miner immediately instead of waiting for the normal telemetry poll. Architecturally, it introduces FleetManagementService.RefreshMiners, which accepts explicit device IDs, validates org scope, invokes the telemetry collector for each device with bounded concurrency, flushes pending status writes, and returns the latest MinerStateSnapshot rows alongside per-device errors.

The data flow is: row action -> useRefreshMiners -> generated Connect client -> fleetmanagement handler/service -> telemetry RefreshDevice -> plugin-backed miner collection/status update -> snapshot reload -> client snapshot merge plus current-page/error refetch. The UI adds a row Refresh action, status spinner, telemetry skeletons, and preserves failure visibility by showing per-device refresh errors while still merging any fresh snapshot returned by the server.

The telemetry layer now has an explicit flush path for status results so row refreshes can reflect offline/recovered state without waiting for the writer batch interval, while normal polling and scheduler behavior remain intact. The PR also includes a tiny dev TCP fault proxy for validating miner offline/recovery behavior against simulator miners.

Non-goals: this does not add bulk/select-all refresh, does not change command dispatch semantics, and does not implement post-command verification loops; those are deferred follow-up work.

Refs #431.

…rs-rpc # Conflicts: # server/internal/domain/telemetry/service.go

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7b4286cf2a

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-06-11T20:41:52Z

+	snapshots, _, _, err := s.buildSnapshotsFromUnifiedQuery(ctx, orgID, "", int32(len(deviceIDs)), &interfaces.MinerFilter{
+		DeviceIdentifiers: deviceIDs,
+	}, nil)


Filter refreshed snapshots by requested device IDs

When RefreshMiners asks for snapshots for a specific ID, this filter does not actually constrain the dynamic list query: buildSnapshotsFromUnifiedQuery goes through SQLDeviceStore.buildListQuerySQL, and appendFilterSQL never emits a device_identifier = ANY(...) predicate for MinerFilter.DeviceIdentifiers. As a result, refreshing any miner other than the first miner in the default ordering returns an unrelated snapshot (and useFleet.mergeMiners updates that unrelated row) while reporting no error for the requested device.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-11T20:41:53Z

+	processErr := s.processDevice(ctx, device)
+	flushErr := s.FlushStatusNow(ctx)


Flush metrics before reading refreshed snapshots

For successful metric polls, processDevice only enqueues the new metrics on s.metricsResults; this path flushes only status before RefreshMiners immediately calls GetLatestDeviceMetrics while building the response. In normal operation the metrics writer may not persist the sample until its next tick, so the returned refreshed snapshot can still contain stale hashrate/power/temperature values even though the RPC just fetched fresh telemetry.

Useful? React with 👍 / 👎.

github-actions · 2026-06-11T20:42:23Z

🔐 Codex Security Review

Note: This is an automated security-focused code review generated by Codex.
It should be used as a supplementary check alongside human review.
False positives are possible - use your judgment.

Scope summary

Reviewed pull request diff only (4e477bde2d91904326a03fd27ca37c43d474251c...72a2846b82e187ef3bb7d3ade043d49aaa1dd1d4, exact PR three-dot diff)

Model: gpt-5.5

💡 Click "edited" above to see previous reviews for this PR.

Review Summary

Overall Risk: MEDIUM

Findings

[MEDIUM] Refresh response can return stale telemetry metrics

Category: Reliability
Location: server/internal/domain/telemetry/service.go:301
Description: RefreshDevice runs processDevice, then only flushes status via FlushStatusNow. Fresh metrics are queued to metricsResults and written later by metricsWriterRoutine, but RefreshMiners immediately reads snapshots from persisted latest metrics.
Impact: A successful manual refresh can show fresh status with stale hashrate, power, temperature, and efficiency, which defeats the operator-facing “refresh” behavior.
Recommendation: Add a synchronous metrics flush path mirroring status flush, or have RefreshDevice return the collected metrics and merge them into the response snapshot without waiting for the batch writer.

[MEDIUM] Refresh fan-out bypasses per-device in-flight protection

Category: Concurrency
Location: server/internal/domain/fleetmanagement/service.go:292
Description: RefreshMiners launches work for every raw device_ids entry and does not deduplicate. Because RefreshDevice calls processDevice directly instead of claiming TelemetryService.inFlight, a request with the same device ID repeated 50 times can run up to 10 concurrent collections against one miner.
Impact: Read-authorized clients can overload a single miner/plugin process and race scheduled telemetry/status polling, producing duplicate RPCs and inconsistent last-write-wins status/metric updates.
Recommendation: Trim and deduplicate IDs before launching goroutines, reject duplicate IDs, and have manual refresh share the telemetry service’s per-device in-flight guard or scheduler queue.

[LOW] Raw backend/plugin errors are returned as response data

Category: gRPC
Location: server/internal/domain/fleetmanagement/service.go:323
Description: Per-device failures are copied with err.Error() into RefreshMinersResponse.errors, including telemetry/plugin errors and DB lookup errors. This bypasses any chance to classify, bound, or sanitize per-device failure messages.
Impact: A plugin or miner-controlled error string can expose internal details to UI/API clients or create oversized response/toast content.
Recommendation: Return bounded, user-safe messages or error codes in the RPC response and log raw errors server-side with device/org context.

Notes

Reviewed only .git/codex-review.diff as requested. No cryptostealing/pool-hijack, SQL injection, auth bypass, or protobuf wire-format break was found in the changed hunks.

_{Generated by Codex Security Review |

Triggered by: @flesher |

Review workflow run}

Copilot

Pull request overview

Adds a row-scoped “Refresh” flow that lets ProtoFleet request an immediate telemetry/status collection for explicit device IDs, flush status updates, and return updated miner snapshots with per-device errors—so the UI can reflect remediation quickly without waiting for the normal polling cycle.

Changes:

Introduces RefreshMiners Connect-RPC (+ proto + permissions) and server-side fan-out/timeout/concurrency handling.
Extends the telemetry domain with a synchronous status flush mechanism and a refresh entry point.
Wires the Fleet UI row action to call refresh, show in-flight indicators, and merge refreshed snapshots into the visible list; adds a small dev fault-proxy script for offline/recovery testing.

Reviewed changes

Copilot reviewed 24 out of 27 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
server/internal/handlers/middleware/rpc_permissions.go	Grants `RefreshMiners` RPC the miner-read permission gate.
server/internal/handlers/fleetmanagement/handler.go	Adds the `RefreshMiners` RPC handler wiring to the domain service.
server/internal/domain/telemetry/service.go	Adds status flush requests + `RefreshDevice`/`FlushStatusNow`; updates in-flight handling and status writer flushing.
server/internal/domain/fleetmanagement/telemetry.go	Extends `TelemetryCollector` interface with `RefreshDevice` for fleet mgmt.
server/internal/domain/fleetmanagement/service.go	Implements `RefreshMiners` server logic + helper to fetch snapshots by explicit IDs.
server/generated/grpc/fleetmanagement/v1/fleetmanagementv1connect/fleetmanagement.connect.go	Regenerated Connect client/server bindings for new RPC.
scripts/dev/miner_fault_proxy.py	Adds a tiny TCP fault-injection proxy for local refresh/offline testing.
proto/fleetmanagement/v1/fleetmanagement.proto	Adds `RefreshMiners` RPC and request/response messages.
docs/plans/2026-06-11-refresh-miners-rpc-and-row-bulk-actions-tdd.md	Adds TDD describing the refresh RPC and row action behavior.
client/src/protoFleet/features/fleetManagement/components/RowActionsMenu/RowActionsMenu.tsx	Adds per-action `disabled` support for row action items.
client/src/protoFleet/features/fleetManagement/components/MinerList/types.ts	Adds per-row `isRefreshing` flag to list item type.
client/src/protoFleet/features/fleetManagement/components/MinerList/MinerStatusCell.tsx	Plumbs refresh state through to status rendering.
client/src/protoFleet/features/fleetManagement/components/MinerList/MinerStatus.tsx	Adds a “Refreshing” UI state with spinner.
client/src/protoFleet/features/fleetManagement/components/MinerList/MinerStatus.test.tsx	Adds unit test coverage for refreshing UI state.
client/src/protoFleet/features/fleetManagement/components/MinerList/MinerName.tsx	Passes merge/refresh-state callbacks down to action menu.
client/src/protoFleet/features/fleetManagement/components/MinerList/MinerList.tsx	Tracks refreshing IDs and threads merge/refresh-state callbacks into column config.
client/src/protoFleet/features/fleetManagement/components/MinerList/minerColConfig.tsx	Shows skeleton telemetry cells while a row is refreshing.
client/src/protoFleet/features/fleetManagement/components/MinerActionsMenu/SingleMinerActionsMenu.tsx	Adds “Refresh” row action that calls the RPC, merges snapshots, and toggles refresh state.
client/src/protoFleet/features/fleetManagement/components/Fleet/Fleet.tsx	Exposes `mergeMiners` and ensures error refetch on global refetch.
client/src/protoFleet/features/fleetManagement/components/Fleet/Fleet.test.tsx	Updates mock shape for new `mergeMiners` hook return.
client/src/protoFleet/features/auth/components/AuthenticateMiners/AuthenticateMiners.test.tsx	Updates mocked `useFleet` return shape to include `mergeMiners`.
client/src/protoFleet/api/useRefreshMiners.ts	Adds a client hook for calling `RefreshMiners` and tracking in-flight IDs.
client/src/protoFleet/api/useRefreshMiners.test.ts	Adds unit tests for hook validation, return shape, and refreshing tracking.
client/src/protoFleet/api/useFleet.ts	Adds `mergeMiners` upsert helper for optimistic snapshot merges.
client/src/protoFleet/api/useAuthNeededMiners.test.ts	Updates mocked `useFleet` return shape to include `mergeMiners`.
client/src/protoFleet/api/generated/fleetmanagement/v1/fleetmanagement_pb.ts	Regenerated TS protobuf types including the new refresh messages/service method.

+// RefreshDevice runs the same collection path used by scheduled telemetry for
+// one device, then asks the status writer to flush pending status updates.
+func (s *TelemetryService) RefreshDevice(ctx context.Context, device models.Device) error {
+	processErr := s.processDevice(ctx, device)
+	flushErr := s.FlushStatusNow(ctx)
+	if processErr != nil {
+		return processErr
+	}
+	return flushErr
+}


+// RefreshDevice runs the same collection path used by scheduled telemetry for
+// one device, then asks the status writer to flush pending status updates.
+func (s *TelemetryService) RefreshDevice(ctx context.Context, device models.Device) error {
+	processErr := s.processDevice(ctx, device)
+	flushErr := s.FlushStatusNow(ctx)
+	if processErr != nil {
+		return processErr
+	}
+	return flushErr
+}
+
+func (s *TelemetryService) FlushStatusNow(ctx context.Context) error {
+	req := statusFlushRequest{done: make(chan error, 1)}
+
+	select {
+	case s.statusFlushRequests <- req:
+	case <-ctx.Done():
+		return ctx.Err()
+	}
+
+	select {
+	case err := <-req.done:
+		return err
+	case <-ctx.Done():
+		return ctx.Err()
+	}
+}


+	for _, id := range req.DeviceIds {
+		if strings.TrimSpace(id) == "" {
+			return nil, fleeterror.NewInvalidArgumentError("device_ids cannot contain empty device identifiers")
+		}
+	}


+	for result := range results {
+		if result.errMsg != "" {
+			resp.Errors[result.id] = result.errMsg
+		}
+		if result.snapshot != nil {
+			resp.Snapshots = append(resp.Snapshots, result.snapshot)
+		}
+	}


+func (s *Service) RefreshMiners(ctx context.Context, req *pb.RefreshMinersRequest) (*pb.RefreshMinersResponse, error) {
+	if len(req.DeviceIds) == 0 {
+		return nil, fleeterror.NewInvalidArgumentError("device_ids must contain at least one device identifier")
+	}
+	if len(req.DeviceIds) > refreshMinersMaxDevices {
+		return nil, fleeterror.NewInvalidArgumentErrorf("device_ids must contain at most %d device identifiers", refreshMinersMaxDevices)
+	}
+	for _, id := range req.DeviceIds {
+		if strings.TrimSpace(id) == "" {
+			return nil, fleeterror.NewInvalidArgumentError("device_ids cannot contain empty device identifiers")
+		}
+	}


+  const mergeMiners = useCallback((snapshots: MinerStateSnapshot[]) => {
+    if (snapshots.length === 0) {
+      return;
+    }
+
+    setMiners((prev) => {
+      let next: Record<string, MinerStateSnapshot> | undefined;
+
+      snapshots.forEach((snapshot) => {
+        const existingMiner = prev[snapshot.deviceIdentifier];
+        if (existingMiner && equals(MinerStateSnapshotSchema, existingMiner, snapshot)) {
+          return;
+        }
+
+        next ??= { ...prev };
+        next[snapshot.deviceIdentifier] = snapshot;
+      });
+
+      return next ?? prev;
+    });
+  }, []);


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2e215d417b

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-06-11T21:58:09Z

+func (s *TelemetryService) RefreshDevice(ctx context.Context, device models.Device) error {
+	processErr := s.processDevice(ctx, device)


Prevent duplicate refresh collections

When a row refresh is requested while the scheduled worker is already polling the same device, this path bypasses the inFlight guard used by workers, so processDevice can run concurrently for that miner. That issues duplicate plugin calls and can race the queued status/metrics writes, and the TDD for this change explicitly relies on the in-flight collection being reused rather than starting another one. Please claim/wait on s.inFlight in RefreshDevice before calling processDevice.

Useful? React with 👍 / 👎.

flesher added 2 commits June 11, 2026 12:09

docs(plans): TDD for RefreshMiners RPC + row/bulk actions

57635d0

Refs #431.

Add miner refresh RPC and row action

7b4286c

Copilot AI review requested due to automatic review settings June 11, 2026 20:37

flesher requested a review from a team as a code owner June 11, 2026 20:37

github-actions Bot assigned flesher Jun 11, 2026

github-actions Bot added documentation Improvements or additions to documentation automation javascript Pull requests that update javascript code client server shared labels Jun 11, 2026

Copilot started reviewing on behalf of flesher June 11, 2026 20:38 View session

Merge remote-tracking branch 'origin/main' into feat/431-refresh-mine…

2ad81be

…rs-rpc # Conflicts: # server/internal/domain/telemetry/service.go

chatgpt-codex-connector Bot reviewed Jun 11, 2026

View reviewed changes

Copilot AI reviewed Jun 11, 2026

View reviewed changes

Fix row refresh snapshot fallback

2e215d4

chatgpt-codex-connector Bot reviewed Jun 11, 2026

View reviewed changes

flesher added 2 commits June 11, 2026 17:23

Fix CI lint and generated formatting

d54d425

Fix telemetry channel-full test expectation

72a2846

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add row miner refresh RPC#437

Add row miner refresh RPC#437
flesher wants to merge 6 commits into
mainfrom
feat/431-refresh-miners-rpc

flesher commented Jun 11, 2026 •

edited

Loading

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Jun 11, 2026

Uh oh!

chatgpt-codex-connector Bot Jun 11, 2026

Uh oh!

github-actions Bot commented Jun 11, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Jun 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		processErr := s.processDevice(ctx, device)
		flushErr := s.FlushStatusNow(ctx)

		func (s *TelemetryService) RefreshDevice(ctx context.Context, device models.Device) error {
		processErr := s.processDevice(ctx, device)

Conversation

flesher commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔐 Codex Security Review

Review Summary

Findings

[MEDIUM] Refresh response can return stale telemetry metrics

[MEDIUM] Refresh fan-out bypasses per-device in-flight protection

[LOW] Raw backend/plugin errors are returned as response data

Notes

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

flesher commented Jun 11, 2026 •

edited

Loading

github-actions Bot commented Jun 11, 2026 •

edited

Loading