Skip to content

fix: stuck on "Connecting..."#1007

Open
Dzuchun wants to merge 13 commits intoesp-rs:mainfrom
Dzuchun:write-sync-fix
Open

fix: stuck on "Connecting..."#1007
Dzuchun wants to merge 13 commits intoesp-rs:mainfrom
Dzuchun:write-sync-fix

Conversation

@Dzuchun
Copy link
Copy Markdown

@Dzuchun Dzuchun commented Feb 18, 2026

I've encountered an issue with the same symptoms as #989, except in my case espflash would sometimes start working again after re-plugging the cable.

I reproduced the issue on the main branch, and traced to core reason -- when sending the Sync command, slip-codec's decoder appeared to be stuck reading bytes indefinitely here. Specifically -- byte 0x0a would be read indefinitely, never terminating the loop.

So I've added a byte count bounded variant of Connection::read (Connection::read_bounded), and used it to read the response (right now, the code expects responses of at most 44 bytes, so that was selected as an upper bound).

I also removed the claim that response is read with the timeout -- I am sure I've spent more than 100ms waiting for the Sync command to complete. My guess is that reading 0x0a from the port overrides the timeout, and thus the command itself never times out. This change is questionable, let me know if I should revert it, or...?

The tool works fine for me now -- ELF gets flashed every time.

As you could probably tell, I have little knowledge of what's actually going on. So I would appreciate extra attention to make sure this change doesn't actually break anything.

@Dzuchun Dzuchun changed the title fix: stuck on "Connectting..." fix: stuck on "Connecting..." Feb 18, 2026
@Dzuchun Dzuchun force-pushed the write-sync-fix branch 2 times, most recently from 7fcaad3 to f8f860c Compare February 18, 2026 05:40
@Dzuchun
Copy link
Copy Markdown
Author

Dzuchun commented Feb 18, 2026

My initial assumption that responses are up to 44 bytes long clearly was incorrect. However, I assume that response cannot be longer than the entire flash memory, so I bounded it by 5Mi now.

@Dzuchun Dzuchun marked this pull request as draft February 18, 2026 06:20
@Dzuchun
Copy link
Copy Markdown
Author

Dzuchun commented Feb 18, 2026

Using 8MiB as a static upper bound still causes the tool to hang -- apparently, reading 8MiB one-by-one takes a while.

So I've added max response length guessing based of the CommandType:

  • If Sync, expect at most 44 bytes
  • By default, assume 8Mib

Apparently, this is enough to both pass the CI and solve the issue. Please let me know what you think.

@Dzuchun Dzuchun marked this pull request as ready for review February 18, 2026 06:55
Copy link
Copy Markdown
Member

@SergioGasquez SergioGasquez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good to me, but they introduce major breaking changes. There are two options:

  • Restoring read_response, marking it as deprecated and adding two new functions
  • Waiting for the espflash@v5 development to get this merged

Comment thread CHANGELOG.md Outdated
Copy link
Copy Markdown
Member

@SergioGasquez SergioGasquez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry it took me so long for the second round!

Comment thread espflash/src/command.rs Outdated
Comment thread espflash/src/command.rs Outdated
const FLASH_DEFLATE_END_TIMEOUT: Duration = Duration::from_secs(10);
const FLASH_MD5_TIMEOUT_PER_MB: Duration = Duration::from_secs(8);

const SYNC_MAX_LEN: u64 = 44;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should add a comment about why you chose 44

@SergioGasquez SergioGasquez added this to the v4.4 milestone Mar 27, 2026
@SergioGasquez
Copy link
Copy Markdown
Member

Hello @Dzuchun! Would you like to continue working on this PR or you would prefer me to add the two last changes and merge it?

@Dzuchun
Copy link
Copy Markdown
Author

Dzuchun commented Apr 11, 2026

Sorry for the long delay. A lot had happended, including a complete workstation change. So now I can't even check if the problem still occurs, or if the fix is still applicable.

I don't have a good source for why I've chosen 44 as a max response length for Sync -- I just looked at the code that read the response, and assumed it would never get a Vector value (see Connection::read_response_bounded, the match at L587-601). Value if not even read in Connection::sync.
Again, I have no sources for that, so please ensure it actually makes sense.

Pulled the main branch and made the requested changes, to the best of my abillity.

Dzuchun added 2 commits April 11, 2026 17:09
they are only used by the `connection` module, and I assume CI gets angry because of that
@Dzuchun
Copy link
Copy Markdown
Author

Dzuchun commented Apr 11, 2026

I am sorry, but I have no idea as of what would be the real cause here, and why it occurs on esp32 with UART specifically.
I bumped max length of the response for non-sync messages to 1Gi, but it brought no change. I doubt that esp32 can send messages this long, so I assume problem is with the response to Sync specifically (i.e. the 44 byte max length thing).

Maybe some vector response gets stuck in the buffer, and Sync request normally reads it, discards it, and tries again (up to 10 times)? I don't know, this might not even be a correct change to make.

https://github.com/esp-rs/espflash/actions/runs/24286820167/job/70917563010?pr=1007

UPD: slip has the convept of escapes, but from what I could see in the code, this can at most double the received data length. Doubling all the max lengths did not help.

@SergioGasquez
Copy link
Copy Markdown
Member

I am sorry, but I have no idea as of what would be the real cause here, and why it occurs on esp32 with UART specifically.
I bumped max length of the response for non-sync messages to 1Gi, but it brought no change. I doubt that esp32 can send messages this long, so I assume problem is with the response to Sync specifically (i.e. the 44 byte max length thing).

Reruning the HIL test made it green, not sure what happened on that run, but tried it several times now and it passes the test.

@SergioGasquez SergioGasquez linked an issue Apr 14, 2026 that may be closed by this pull request
@SergioGasquez
Copy link
Copy Markdown
Member

@Dzuchun could you rebase the PR so its mergeable?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Flashing gets stuck at "Connecting..."

2 participants