| title | (routing-release-0.262.0) Healthy App Route Pruning | ||
|---|---|---|---|
| expires_at | 2028-05-08 | ||
| tags |
|
Cloud Foundry environments may experience many 503 errors with x_cf_routererror:"no_endpoints" even though all the apps appear to be up and functional.
The route exists in the routing table but has no healthy endpoints available.
This is caused by changes introduced in routing-release 0.262.0 to enable Gorouter to retry more types of idempotent requests to failed backends.
The following commands can be run against the Gorouter log file to check for possible occurrences. Look for cases where data.error has a value of "context canceled" followed by a prune-failed-endpoint error.
# Here is an example from a Gorouter log bundle collected from BOSH. We find a vcap id of a failed request that meets the criteria using the above command.
find . -name "gorouter.stdout.log*" | while read line; do grep backend-endpoint-failed $line | jq -r '. | select(.data.error | contains("context canceled")) | .data.vcap_request_id'; done | head -1
27116dd3-f047-4a35-7873-e9ef7e1d3f71
# Next we find the log line that has the application ID
find . -name "gorouter.stdout.log*" | xargs egrep -Hn 27116dd3-f047-4a35-7873-e9ef7e1d3f71
./router.d60e75ac-5459-49f8-b029-543579d74ed0.2023-05-05-18-05-52/gorouter/gorouter.stdout.log:192:{"log_level":3,"timestamp":"2023-05-04T19:38:42.838473790Z","message":"backend-endpoint-failed","source":"vcap.gorouter","data":{"route-endpoint":{"ApplicationId":"d45e4b57-3420-40b3-b13d-9ef0562d58c5",REDACTED,"RouteServiceUrl":""},"error":"incomplete request (context canceled)","attempt":1,"vcap_request_id":"27116dd3-f047-4a35-7873-e9ef7e1d3f71","retriable":true,"num-endpoints":1,"got-connection":false,"wrote-headers":false,"conn-reused":false,"dns-lookup-time":0,"dial-time":0,"tls-handshake-time":0}}
# and verify the endpoint was pruned as a result of this fault
egrep -A5 -Hn 27116dd3-f047-4a35-7873-e9ef7e1d3f71 ./router.d60e75ac-5459-49f8-b029-543579d74ed0.2023-05-05-18-05-52/gorouter/gorouter.stdout.log | egrep "prune-failed-endpoint|d45e4b57-3420-40b3-b13d-9ef0562d58c5" | egrep prune-failed-endpoint
./router.d60e75ac-5459-49f8-b029-543579d74ed0.2023-05-05-18-05-52/gorouter/gorouter.stdout.log-193-{"log_level":3,"timestamp":"2023-05-04T19:38:42.838565797Z","message":"prune-failed-endpoint","source":"vcap.gorouter.registry","data":{"route-endpoint":{"ApplicationId":"d45e4b57-3420-40b3-b13d-9ef0562d58c5",REDACTED,"process_instance_id":"2ea1596c-a745-4fdc-53a4-d885","process_type":"web","source_id":"d45e4b57-3420-40b3-b13d-9ef0562d58c5",REDACTED,"RouteServiceUrl":""}}}
To resolve this issue, upgrade routing-release to v0.266.0 or above.