
SSE parser byte-level optimization, chunked decoding, auto-reconnect #9

Merged
dustturtle merged 3 commits into main from copilot/optimize-sse-parser-ios
Mar 30, 2026

Copilot AI commented Mar 10, 2026

Implements the four iOS-specific optimizations discussed in the issue: reducing ARC/CPU overhead in the SSE parser, handling proxied chunked streams, seamless reconnection on network switches, and thread-safety documentation.

SSEParser: byte-level accumulation

  • lineBuffer changed from String to Data; line-ending scan uses raw bytes (0x0A/0x0D)
  • Field names compared as [UInt8] arrays — no String allocation per line
  • String(data:encoding:.utf8) deferred to event dispatch boundary (\n\n)
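
A minimal sketch of the idea behind these bullets, not the PR's actual SSEParser: field names are matched as raw bytes, and a String is created only once the blank line closes an event. The type `ByteLevelEventBuilder`, the `SSEField` constants, and the method `consume(line:)` are hypothetical names used for illustration, and only the `data` field is handled.

```swift
import Foundation

// Hypothetical constants: field names kept as bytes so no String is built per line.
enum SSEField {
    static let data: [UInt8] = Array("data".utf8)
}

// Hypothetical builder, for illustration only.
struct ByteLevelEventBuilder {
    private var dataBytes = Data()
    private var hasData = false

    /// Feeds one line (line ending already removed). Returns the event payload
    /// when an empty line marks the event boundary, otherwise nil.
    mutating func consume(line: [UInt8]) -> String? {
        if line.isEmpty {
            defer { dataBytes.removeAll(); hasData = false }
            // The only String allocation happens here, once per event.
            return hasData ? String(data: dataBytes, encoding: .utf8) : nil
        }
        // Split at the first colon (0x3A); a line without one is all field name.
        let colon = line.firstIndex(of: 0x3A) ?? line.endIndex
        if colon == line.startIndex { return nil }          // leading colon: comment line
        let name = line[..<colon]
        var value = colon < line.endIndex ? line[(colon + 1)...] : line[line.endIndex...]
        if value.first == 0x20 { value = value.dropFirst() } // strip one leading space
        if name.elementsEqual(SSEField.data) {
            if hasData { dataBytes.append(0x0A) }            // join data lines with \n
            dataBytes.append(contentsOf: value)
            hasData = true
        }
        return nil
    }
}
```

Feeding the lines `data: hello`, `data: world`, and an empty line would yield a single payload with the two values joined by a newline; no String exists before the empty line arrives.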

ChunkedDecoder (new)

  • Incremental state-machine that strips Transfer-Encoding: chunked framing before SSE parsing
  • Handles partial chunks split across TCP segments, chunk extensions, hex size variants
  • Integrated via enableChunkedDecoding() on NWAsyncSocket
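
To make the state-machine description concrete, here is a rough sketch of an incremental chunked decoder. It is not the PR's ChunkedDecoder: the type `MiniChunkedDecoder`, its states, and its error handling (malformed sizes simply end decoding, trailers are discarded) are assumptions chosen to keep the example short.

```swift
import Foundation

/// Hypothetical minimal decoder; names and states are illustrative only.
struct MiniChunkedDecoder {
    private enum State { case size, body(remaining: Int), bodyCRLF, done }
    private var state = State.size
    private var pending = [UInt8]()

    var isComplete: Bool {
        if case .done = state { return true } else { return false }
    }

    mutating func decode(_ input: Data) -> Data {
        if isComplete { return Data() }          // ignore bytes after the last chunk
        pending.append(contentsOf: input)
        var output = Data()
        var cursor = 0                           // read offset into `pending`
        loop: while true {
            switch state {
            case .size:
                // A full "<hex>[;ext]\r\n" line is needed before proceeding.
                guard let lf = pending[cursor...].firstIndex(of: 0x0A) else { break loop }
                var line = pending[cursor..<lf]
                if line.last == 0x0D { line = line.dropLast() }
                // Chunk extensions follow ';' (0x3B) and are ignored here.
                let hex = line.prefix(while: { $0 != 0x3B })
                let text = String(decoding: hex, as: UTF8.self)
                    .trimmingCharacters(in: .whitespaces)
                guard let size = Int(text, radix: 16), size >= 0 else {
                    state = .done                // malformed size: stop decoding
                    break loop
                }
                cursor = lf + 1
                state = size == 0 ? .done : .body(remaining: size)
            case .body(let remaining):
                let available = pending.count - cursor
                if available == 0 { break loop } // chunk split across segments
                let take = min(available, remaining)
                output.append(contentsOf: pending[cursor..<(cursor + take)])
                cursor += take
                state = take == remaining ? .bodyCRLF : .body(remaining: remaining - take)
            case .bodyCRLF:
                if pending.count - cursor < 2 { break loop }  // wait for both bytes
                cursor += 2                      // assumes a well-formed CRLF
                state = .size
            case .done:
                break loop
            }
        }
        // Drop consumed bytes; once complete, nothing needs to be retained.
        if isComplete { pending.removeAll() } else { pending.removeFirst(cursor) }
        return output
    }
}
```

Calling `decode` repeatedly with arbitrary slices of the wire bytes returns only payload bytes, which can then be handed to the SSE parser.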

SSE auto-reconnect with Last-Event-ID

  • enableSSEAutoReconnect(retryInterval:) — on error disconnect, schedules reconnect preserving lastEventId
  • Honors server-sent retry: field for backoff interval
  • Delegate callback socket(_:willAutoReconnectWithLastEventId:afterDelay:) lets the app re-send the HTTP request with Last-Event-ID header
  • Explicit disconnect() cancels auto-reconnect
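
As an illustration of what an app might do when the reconnect callback fires, the sketch below builds the HTTP request to re-send. The helper `makeSSERequest(host:path:lastEventId:)` is hypothetical and not part of the PR, and the parameter types of the delegate callback are not shown in this description, so the sketch covers request construction only.

```swift
import Foundation

/// Hypothetical helper: builds the GET request an app could re-send on reconnect.
func makeSSERequest(host: String, path: String, lastEventId: String?) -> Data {
    var lines = [
        "GET \(path) HTTP/1.1",
        "Host: \(host)",
        "Accept: text/event-stream",
        "Cache-Control: no-cache",
    ]
    // Only include the header when an ID was actually received.
    if let id = lastEventId, !id.isEmpty {
        lines.append("Last-Event-ID: \(id)")
    }
    // Header lines end with CRLF; an empty line terminates the header block.
    return Data((lines.joined(separator: "\r\n") + "\r\n\r\n").utf8)
}
```

Inside the delegate callback, the app would presumably pass the received last event ID to a helper like this and write the resulting bytes once the socket has reconnected.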

Thread safety docs

  • Documented socketQueue / delegateQueue separation in class-level and method-level doc comments (behavior was already correct)

let socket = NWAsyncSocket(delegate: self, delegateQueue: .main)
socket.enableSSEParsing()
socket.enableChunkedDecoding()        // behind Nginx/CDN
socket.enableSSEAutoReconnect()       // auto-retry with Last-Event-ID
try socket.connect(toHost: "api.example.com", onPort: 443)

87 tests pass (71 existing + 16 new ChunkedDecoder tests). All existing SSEParser/StreamBuffer/ReadRequest tests unchanged.



Copilot AI and others added 2 commits March 10, 2026 07:37
…edDecoder, auto-reconnect with Last-Event-ID, thread safety docs

1. SSEParser byte-level optimization: Rewrite to accumulate raw Data bytes,
   scan for line endings at byte level (0x0A/0x0D), and defer String
   conversion until a complete event boundary is reached. Reduces ARC
   overhead and CPU usage.

2. ChunkedDecoder: New incremental decoder for HTTP Transfer-Encoding:
   chunked byte streams. Strips hex-length + CRLF framing so SSE parser
   receives clean data. Handles partial chunks across TCP segments.

3. Auto-reconnect with Last-Event-ID: NWAsyncSocket now supports
   enableSSEAutoReconnect() which auto-reconnects on error disconnect,
   preserves lastEventId across reconnections, and notifies the delegate
   via willAutoReconnectWithLastEventId for seamless SSE resumption.

4. Thread safety documentation: Enhanced doc comments clarifying that
   socketQueue handles all I/O and parsing off the main thread, while
   delegateQueue delivers callbacks to the UI thread.

Co-authored-by: dustturtle <2305214+dustturtle@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Optimize SSE parser implementation for iOS framework" to "SSE parser byte-level optimization, chunked decoding, auto-reconnect" on Mar 10, 2026
@dustturtle dustturtle marked this pull request as ready for review March 30, 2026 06:32
Copilot AI review requested due to automatic review settings March 30, 2026 06:32
@dustturtle dustturtle merged commit 230c7bf into main Mar 30, 2026
3 checks passed

Copilot AI left a comment


Pull request overview

Adds iOS-focused streaming improvements to NWAsyncSocket: more efficient SSE parsing, support for HTTP Transfer-Encoding: chunked streams (common behind proxies/CDNs), and an SSE auto-reconnect mechanism that preserves Last-Event-ID, along with updated thread-safety documentation.

Changes:

  • Refactors SSEParser to accumulate and scan at the byte level, deferring String conversion to event dispatch.
  • Introduces ChunkedDecoder (and tests) and integrates it into the socket read path via enableChunkedDecoding().
  • Adds SSE auto-reconnect state + delegate callback socket(_:willAutoReconnectWithLastEventId:afterDelay:) and documents queueing/threading behavior.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| Sources/NWAsyncSocket/SSEParser.swift | Switches SSE parsing to byte-level buffering and field matching to reduce allocations. |
| Sources/NWAsyncSocket/ChunkedDecoder.swift | New incremental decoder to strip HTTP chunked framing before higher-level parsing. |
| Sources/NWAsyncSocket/NWAsyncSocket.swift | Wires chunked decoding into the read loop; adds auto-reconnect state machine and thread-safety docs. |
| Sources/NWAsyncSocket/NWAsyncSocketDelegate.swift | Adds an optional delegate callback for auto-reconnect with last event ID and delay. |
| Tests/NWAsyncSocketTests/ChunkedDecoderTests.swift | Adds unit tests covering chunk parsing across segment boundaries, extensions, and SSE payloads. |


Comment on lines +58 to +61
public func decode(_ data: Data) -> Data {
    buffer.append(data)
    var output = Data()


Copilot AI Mar 30, 2026


decode(_:) always appends incoming bytes to buffer, even after the decoder reaches .complete. Because the .complete state breaks out of the loop without consuming or clearing buffer, any subsequent data (e.g., trailers after the zero-size chunk, or stray extra bytes) is retained indefinitely, so memory can grow without bound. Consider short-circuiting when state == .complete (skip the append and return empty Data immediately) and/or clearing buffer on the transition to .complete, once the final chunk and trailers have been consumed.

Comment on lines +103 to +107
// Skip the trailing CRLF.
if buffer[buffer.startIndex] == 0x0D
    && buffer[buffer.index(after: buffer.startIndex)] == 0x0A {
    buffer = Data(buffer.suffix(from: buffer.startIndex + 2))
}

Copilot AI Mar 30, 2026


In .readingTrailer, if the next two bytes are not \r\n, the code still transitions back to .waitingForSize without consuming anything. This can cause the next size-line parse to include invalid trailer bytes and desynchronize the state machine. Consider strictly requiring \r\n (wait for more data if partial, otherwise treat as malformed input by resetting/marking complete/adding an error path) before moving back to .waitingForSize.

Suggested change
- // Skip the trailing CRLF.
- if buffer[buffer.startIndex] == 0x0D
-     && buffer[buffer.index(after: buffer.startIndex)] == 0x0A {
-     buffer = Data(buffer.suffix(from: buffer.startIndex + 2))
- }
+ let first = buffer[buffer.startIndex]
+ let second = buffer[buffer.index(after: buffer.startIndex)]
+ // Require a proper CRLF trailer; otherwise treat as malformed input.
+ guard first == 0x0D, second == 0x0A else {
+     state = .complete
+     break loop
+ }
+ // Skip the trailing CRLF.
+ buffer = Data(buffer.suffix(from: buffer.startIndex + 2))
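
The three outcomes described in this comment (wait for more data, consume the CRLF, or treat the input as malformed) could be separated as in the following sketch. The enum `CRLFCheck` and the function `checkCRLF(in:at:)` are hypothetical names, not code from the PR.

```swift
import Foundation

// Hypothetical result type for the CRLF check.
enum CRLFCheck: Equatable {
    case needMoreData      // fewer than two bytes so far, nothing invalid yet
    case consumed(Int)     // valid CRLF; the associated value is the byte count
    case malformed         // bytes are present but are not CR LF
}

/// Hypothetical helper: inspects the bytes at `offset` and reports what to do.
func checkCRLF(in buffer: Data, at offset: Int) -> CRLFCheck {
    let base = buffer.startIndex + offset   // Data slices may not start at index 0
    let available = buffer.endIndex - base
    if available <= 0 { return .needMoreData }
    if buffer[base] != 0x0D { return .malformed }
    if available == 1 { return .needMoreData }   // CR seen, LF not received yet
    return buffer[base + 1] == 0x0A ? .consumed(2) : .malformed
}
```

A decoder could switch on this result: stay in the current state on `.needMoreData`, advance on `.consumed`, and take its error path on `.malformed`.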

Comment on lines 571 to 586
  connectedHost = nil
  connectedPort = 0
  readQueue.removeAll()
  buffer.reset()
- sseParser?.reset()
+ // Note: sseParser is intentionally NOT reset here so that
+ // lastEventId survives across reconnections.

- delegateQueue.async { [weak self] in
-     guard let self = self else { return }
-     self.delegate?.socketDidDisconnect(self, withError: error)
+ if shouldAutoReconnect {
+     scheduleAutoReconnect()
+ } else {
+     sseParser?.reset()
+     delegateQueue.async { [weak self] in
+         guard let self = self else { return }
+         self.delegate?.socketDidDisconnect(self, withError: error)
+     }
  }

Copilot AI Mar 30, 2026


disconnectInternal resets buffer and (in the non-auto-reconnect path) resets the sseParser, but it never resets chunkedDecoder. If a disconnect happens mid-chunk, the decoder may carry buffered partial state into the next connection on the same NWAsyncSocket instance, corrupting subsequent reads/SSE parsing. Consider resetting (or recreating) chunkedDecoder during disconnect when chunked decoding is enabled.

private func disconnectInternal(error: Error?) {
    guard isConnected || connection != nil else { return }

    let shouldAutoReconnect = sseAutoReconnectEnabled && sseParser != nil && error != nil

Copilot AI Mar 30, 2026


Auto-reconnect is gated on error != nil (shouldAutoReconnect = ... && error != nil). However, readNextChunk() calls disconnectInternal(error: nil) when isComplete == true (EOF), which is also an unexpected disconnect scenario for SSE streams. If the goal is to recover from network switches / remote closes, consider triggering auto-reconnect on EOF/cancelled states as well (e.g., treat isComplete as an auto-reconnect-eligible disconnect, unless the user explicitly called disconnect()).

Suggested change
- let shouldAutoReconnect = sseAutoReconnectEnabled && sseParser != nil && error != nil
+ let shouldAutoReconnect = sseAutoReconnectEnabled && sseParser != nil
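
One way to express the distinction raised here (reconnect on errors and EOF, but not after an explicit disconnect()) is to pass the cause of the disconnect explicitly. This is only a sketch of the policy: `DisconnectCause` and `shouldAutoReconnect(enabled:sseActive:cause:)` are hypothetical names and do not appear in the PR.

```swift
// Hypothetical classification of why the connection ended.
enum DisconnectCause {
    case userRequested   // the app called disconnect()
    case endOfStream     // remote closed the stream (EOF)
    case error           // transport or network error
}

/// Hypothetical policy function; parameter names are illustrative.
func shouldAutoReconnect(enabled: Bool, sseActive: Bool, cause: DisconnectCause) -> Bool {
    guard enabled, sseActive else { return false }
    switch cause {
    case .userRequested:
        return false     // an explicit disconnect cancels auto-reconnect
    case .endOfStream, .error:
        return true      // both are unexpected for a long-lived SSE stream
    }
}
```

Compared with dropping the `error != nil` check outright, this keeps an explicit disconnect() from triggering a reconnect even when no error is present.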

Comment on lines +123 to +140
+ private func extractLineBytes(from offset: Int) -> (Data, Int)? {
+     let start = lineBuffer.startIndex + offset
+     var i = offset
+     while i < lineBuffer.count {
+         let byte = lineBuffer[lineBuffer.startIndex + i]
+         if byte == SSEParser.CR {
+             let lineData = Data(lineBuffer[start..<(lineBuffer.startIndex + i)])
              // \r\n counts as a single line ending
-             if next < scalars.endIndex && scalars[next] == "\n" {
-                 let afterCRLF = scalars.index(after: next)
-                 return (String(scalars[scalars.startIndex..<lineEnd]),
-                         String(scalars[afterCRLF...]))
+             if i + 1 < lineBuffer.count && lineBuffer[lineBuffer.startIndex + i + 1] == SSEParser.LF {
+                 return (lineData, i + 2)
              } else {
-                 return (String(scalars[scalars.startIndex..<lineEnd]),
-                         String(scalars[next...]))
+                 return (lineData, i + 1)
              }
-         } else if scalar == "\n" {
-             let lineEnd = idx
-             let next = scalars.index(after: idx)
-             return (String(scalars[scalars.startIndex..<lineEnd]),
-                     String(scalars[next...]))
+         } else if byte == SSEParser.LF {
+             let lineData = Data(lineBuffer[start..<(lineBuffer.startIndex + i)])
+             return (lineData, i + 1)
          }
-         idx = scalars.index(after: idx)
+         i += 1

Copilot AI Mar 30, 2026


The byte-level parser still copies data for every extracted line (let lineData = Data(lineBuffer[start..<...])) and also copies the remaining buffer when trimming consumed bytes (lineBuffer = Data(lineBuffer.suffix(...))). For high-throughput SSE, this can add significant allocation/copy overhead and partially undermines the intended optimization. Consider operating on Data.SubSequence/indices (or a ring buffer) to avoid per-line copies, and only materialize Data/String at dispatch time.
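
A sketch of the index-based approach suggested here: the scanner returns ranges into the existing buffer rather than a new Data per line, so bytes are copied only when an event is materialized. The function `lineRanges(in:from:)` is a hypothetical name, and it differs from the PR's extractLineBytes in that a CR at the very end of the buffer is left unconsumed until the next byte arrives.

```swift
import Foundation

/// Hypothetical scanner: returns index ranges of complete lines instead of copies,
/// plus the number of bytes (relative to `startIndex`) that were fully consumed.
func lineRanges(in buffer: Data, from offset: Int = 0)
    -> (ranges: [Range<Data.Index>], consumed: Int) {
    var ranges: [Range<Data.Index>] = []
    var lineStart = buffer.startIndex + offset
    var i = lineStart
    while i < buffer.endIndex {
        let byte = buffer[i]
        if byte == 0x0A {                       // LF ends the line
            ranges.append(lineStart..<i)
            i += 1
            lineStart = i
        } else if byte == 0x0D {                // CR, possibly followed by LF
            // A trailing CR may be the first half of CRLF; wait for more data.
            guard i + 1 < buffer.endIndex else { break }
            ranges.append(lineStart..<i)
            i += buffer[i + 1] == 0x0A ? 2 : 1
            lineStart = i
        } else {
            i += 1
        }
    }
    return (ranges, lineStart - buffer.startIndex)
}
```

The caller can advance a read offset by `consumed` and compact the buffer only occasionally, rather than rebuilding it after every line.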


// MARK: - Byte-by-byte delivery

func testByteByteFeed() {

Copilot AI Mar 30, 2026


Test name testByteByteFeed looks like a typo and reads ambiguously. Renaming to something like testByteByByteFeed (or similar) would improve clarity when scanning test output and failure reports.

Suggested change
- func testByteByteFeed() {
+ func testByteByByteFeed() {

dustturtle added a commit that referenced this pull request Apr 9, 2026
SSE parser byte-level optimization, chunked decoding, auto-reconnect