fix silent RX drops: enlarge ring to 4 KiB, add drop counters#5
Closed
cpunt wants to merge 1 commit into
Closed
Conversation
The RX software ring between the PL011 ISR and the foreground reader
was hard-coded to 128 bytes. At 115200 baud that is only ~11 ms of
headroom, so any foreground stall longer than a flash page program or
a GC pause caused bytes to silently disappear mid-frame. The ISR had
a `// optional rxDrops++` placeholder in the overflow branch but no
actual counter, so applications had no way to observe the loss.
This commit:
- widens the ring head/tail to uint16 and bumps bufferSize to 4096,
giving ~355 ms of headroom at 115200. Still a power of two so the
modulo compiles to an AND. Costs ~16 KiB of static RAM on a dual-
UART RP2350 config; fits easily in the 520 KiB SRAM budget.
- adds rxHwDrops and rxSwDrops counters on the UART struct and
increments them from handleInterrupt when UARTDR arrives with
per-byte error bits set (the OE bit indicates the hardware FIFO
overflowed before the ISR serviced it) and when Buffer.Put returns
false respectively. Both are volatile.Register32; the ISR is the
only writer so increments are race-free on a single-core MCU, and
aligned 32-bit loads/stores keep reads safe on dual-core.
- exposes the counters via (*UART).RXDrops() so applications can
surface them in telemetry. Both values should stay at zero under
normal operation; nonzero hw indicates ISR latency exceeded one
character time, nonzero sw indicates the application consumer is
not draining the ring fast enough.
RingBuffer.Size() and RingBuffer.Used() now return uint16 instead of
uint8. Every caller in this repo already converts through int(), so
this is a no-op internally. External callers that stored the return
as uint8 will need to widen the type; there is no tagged release so
no semver break.
Verified:
- tinygo build -target=pico2 ./cmd/uart_diag
- tinygo build -target=pico2 ./examples/simple/uartx-polling
- downstream: a 336 KiB firmware-update transfer over 115200 that
used to corrupt 15-20 frames and take 60-180 s now runs clean in
~55 s (stop-and-wait floor) with rxHwDrops == rxSwDrops == 0.
cpunt
added a commit
to jangala-dev/devicecode-go
that referenced
this pull request
Apr 14, 2026
The upstream tinygo-uartx software RX ring was hard-coded to 128
bytes — only ~11 ms of headroom at 115200 baud. Any reactor-side
stall longer than that (a flash page program, a TinyGo GC pause,
a periodic logger burst) caused bytes to be silently dropped at
the ISR's `Buffer.Put` and corrupted incoming `xfer_chunk` frames
mid-flight. This produced ~80-byte payload truncations and
duplicate-seq merged frames during firmware update transfers.
This commit:
- Adds a replace directive in go.mod pointing at a vendored copy
of tinygo-uartx that bumps the ring to 4 KiB with uint16 indices
(~355 ms of headroom at 115200) and adds rxHwDrops / rxSwDrops
counters on the UART struct. The replace stays in place until
the upstream PR (jangala-dev/tinygo-uartx#5) lands and we can
re-pin to a tagged release.
- Plumbs the new uartx.RXDrops accessor through rp2SerialPort and
surfaces it from serial_raw via a [serial-raw] log line that
also reports cumulative RX bytes and shmring-full events. The
line is throttled (1 Hz minimum, only on counter delta or 64
KiB byte quantum) so it stays silent in steady state.
- Logs the active A/B partition on the fabric session's
"waiting for connection start" message so post-reboot debugging
can confirm which slot the MCU is running.
- Extends the existing decode_failed / size_mismatch xfer_chunk
error logs with seq, off and data_len for clearer diagnostics
on any future receive-side parse failure.
In a clean steady-state run all three drop counters
(ring_full, rx_hw_drops, rx_sw_drops) stay at zero across a full
~336 KiB firmware update transfer.
Author
|
Closed as moved up in layers jangala-dev/devicecode-go#40 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The issue
The software RX ring between the PL011 ISR and the foreground reader is
hard-coded to 128 bytes in
uartx/ringbuffer.go:At 115200 baud a byte arrives every ~87 µs, so 128 bytes is only
~11 ms of headroom between the interrupt producer and the foreground
consumer. Any stall longer than that on the consumer side causes
Buffer.Put()insidehandleInterruptto start returning false, atwhich point bytes are silently discarded. The existing ISR code even
acknowledges the drop path:
…but the counter was never implemented, so applications had no way to
detect the loss. The symptom downstream is mid-frame data corruption
with no error path back to the caller.
How we hit it
We use this driver for a full-duplex fabric link between a CM5 host
and an RP2350 MCU at 115200 8N1, running a stop-and-wait firmware
update protocol that ships ~800-byte JSON chunks, writes each chunk
to flash on the MCU, acks, and waits for the next one.
The MCU side does a flash page program (5–10 ms) per accepted chunk,
plus occasional TinyGo GC cycles and periodic logger bursts. Any one
of those events pushed total foreground latency above the 11 ms
headroom and caused bytes to vanish from the middle of the next
incoming frame. Two surface symptoms:
datafield (typically 76–96 bytes missing); sender and receiverboth claimed the full chunk length.
\nterminator, our line assembler kept reading past the brokenframe and glued the sender's retry of the same seq onto the broken
copy. We would see
types=xfer_chunk,xfer_chunk seqs=N,Non thereceive side.
Both classes disappeared the moment the RX ring was resized.
The fix
1)
uartx/ringbuffer.go— widen and enlargebufferSizepromoted fromuint8 = 128touint16 = 4096.head/tailswitched fromvolatile.Register8tovolatile.Register16so sizes above 255 work without wraparoundambiguity.
Size()andUsed()now returnuint16.4096 is still a power of two so
(h+1) % bufferSizecompiles to abitmask. It gives ~355 ms of headroom at 115200 (scales linearly with
baud), which is enough to absorb any realistic foreground stall on
the MCUs this package currently targets. The extra RAM cost is
2 × (4096 − 128) = 7.75 KiBper UART instance (RX + TX ring); on adual-UART RP2350 config that is ~16 KiB total, negligible against the
520 KiB SRAM budget.
2)
uartx/rp2_uart.go— drop countersUART:rxHwDropsandrxSwDrops, each avolatile.Register32.handleInterruptincrementsrxHwDropsin the existing branchthat drops PL011 error bytes (the most informative of OE/BE/PE/FE
is UARTDR_OE, which means the hardware FIFO overflowed before the
ISR could service it), and increments
rxSwDropswhenBuffer.Putreturns false.sequence is race-free on a single core. On a dual-core MCU the
reader may see a slightly stale value but aligned 32-bit
loads/stores remain atomic.
3)
uartx/uartx.go— public accessorNew
(*UART).RXDrops() (hw, sw uint32)so applications can surfacethe counters in whatever telemetry / logging they already have. In a
healthy deployment at the supported bauds both values should stay at
zero; a nonzero
hwmeans the ISR latency is exceeding one charactertime, and a nonzero
swmeans the application consumer is notdraining the ring fast enough.
Compat notes
RingBuffer.Size()andRingBuffer.Used()return type changedfrom
uint8touint16. Every caller in this repo — includingthe
Buffered()andTxFree()wrappers — already goes throughint(...), so there is no in-repo break. External callers thatexplicitly stored the result as
uint8will need to widen thetype. The package has no tagged release yet, so no semver break.
~16 KiB. Trivial against the chip's 520 KiB SRAM.
Verification
tinygo build -target=pico2 ./cmd/uart_diag— builds cleanly withthe uint16 widening.
tinygo build -target=pico2 ./examples/simple/uartx-polling—builds cleanly; the example's
Buffered()call is unaffected.115200 that used to corrupt 15–20 frames per run and take 60–180 s
with retries now runs clean in ~55 s (stop-and-wait protocol floor)
with
rxHwDrops == rxSwDrops == 0across the entire transfer.Follow-ups (not in this PR)
bufferSizeconfigurable viaUARTConfigforcallers who care about RAM more than headroom. That would require
moving the ring from a fixed-size array to a slice and is a
larger change.
Size()/Used()/Buffered()to allreturn
intso the width is never a compat concern again.