fix: add retry with || true for post-snapshot invoke in test-04 #736
Slambot01 wants to merge 5 commits into
Signed-off-by: Ritesh Pandit <riteshpandit1708@gmail.com>
Sorry, I missed that it didn't resolve the underlying issue.
I initially approved, but then I realized it does not solve the underlying issue. The test still fails, probably because the error is in a different place: not in the test or in waiting until the container is ready, but somewhere in the network boot/restore process. There is probably a race condition when the chaincode container starts. Maybe as a workaround we should try restarting it in the test to verify whether that's the issue, and then fix it properly by eliminating the root cause?
Makes sense. I’ll look into the chaincode container restart angle. I’ll check if restarting the CCaaS container after snapshot restore fixes the failure, and if it does, I’ll trace it back to the actual issue in the restore flow.
…RPC connections Signed-off-by: Ritesh Pandit <riteshpandit1708@gmail.com>
The CCaaS restart approach worked locally, but CI is still failing. All 10 retries end up hitting |
Fixes #734
The first `expectInvokeRest` after snapshot restore intermittently fails with `DEADLINE_EXCEEDED`. `waitForContainer` confirms the CCaaS gRPC server is listening, but the peer's gRPC client reconnection happens asynchronously, so `sleep 2` (from #648) is insufficient under variable CI load.

The previous retry attempt in #648 used `&& break` without `|| true`, which couldn't work because `expect-invoke-rest.sh` calls `exit 1` on failure and the test runs under `set -e`, aborting before the loop can retry.

Fix
Added `expectInvokeRestWithRetry` using the same `|| true` pattern proven in `test-01-v3-simple.sh` (`expectQueryWithRetry`, lines 48–61). Applied only to the first invoke after snapshot restore; the second invoke runs normally since the connection is established by then.
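
For illustration, here is a minimal sketch of a `|| true` retry loop that survives `set -e`. It is not the code from this PR: `retryCommand`, `flakyInvoke`, and `COUNTER_FILE` are hypothetical names, and running the command in a subshell is an assumption made so that an `exit 1` inside the command ends only the subshell rather than the whole test script.

```shell
#!/usr/bin/env bash
set -e

# Hypothetical stand-in for the real invoke helper: it calls `exit 1`
# on its first two calls, mimicking a peer that has not reconnected yet.
COUNTER_FILE="$(mktemp)"
echo 0 > "$COUNTER_FILE"

flakyInvoke() {
  local n
  n=$(( $(cat "$COUNTER_FILE") + 1 ))
  echo "$n" > "$COUNTER_FILE"
  if [ "$n" -lt 3 ]; then
    echo "attempt $n: DEADLINE_EXCEEDED" >&2
    exit 1
  fi
  echo "attempt $n: ok"
}

# Hypothetical wrapper: retryCommand <max_attempts> <delay_seconds> <command...>
retryCommand() {
  local max_attempts="$1" delay="$2" attempt ok
  shift 2
  for attempt in $(seq 1 "$max_attempts"); do
    ok=no
    # The subshell contains any `exit 1` raised by the command;
    # `|| true` keeps `set -e` from aborting the script on failure.
    ( "$@" ) && ok=yes || true
    if [ "$ok" = yes ]; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Usage: up to 5 attempts with no delay; succeeds on the third call.
retryCommand 5 0 flakyInvoke
```

In a real test the delay would be non-zero, giving the peer's gRPC client time to reconnect between attempts.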