cfsmp3 commented Jan 19, 2026

Summary

  • Added retry logic with exponential backoff (3 attempts, 5s → 10s → 20s delay) to both Linux and Windows CI VM scripts
  • Both postStatus and sendLogFile functions now check curl exit codes and HTTP response codes
  • Detailed logging added for retry attempts and failures
  • Created separate sendLogFile function in Windows script (previously inline)

Root Cause

Test 7935 completed all 237 tests but never reported completion to the server, so it remained stuck in the "testing" state at 100% progress. Investigation revealed that the CI VM scripts had no error handling for status POST requests. If a curl request failed because of a network issue or a server timeout, the script continued without retrying and then shut down, leaving the test stuck forever.

Changes

Linux (install/ci-vm/ci-linux/ci/runCI):

  • postStatus(): Now retries up to 3 times with exponential backoff
  • sendLogFile(): Now retries up to 3 times with exponential backoff
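
The retry pattern described above could look roughly like the following. This is a minimal sketch, not the code from the PR: the `reportURL` and `logFile` defaults, the form field names, and the 30-second timeout are all hypothetical. It also waits only between attempts, so with three attempts only the 5 s and 10 s delays are reached; how the actual script applies the 20 s step is not shown in this description.

```shell
# Sketch of a status POST with retry and exponential backoff.
# reportURL, logFile, and the form field names are hypothetical placeholders.
reportURL="${reportURL:-http://127.0.0.1:9/progress-reporter}"
logFile="${logFile:-/tmp/ci-retry.log}"

postStatus() {
    local status message max_attempts delay attempt http_code curl_exit
    status="$1"
    message="$2"
    max_attempts=3
    delay=5
    attempt=1

    while [ "$attempt" -le "$max_attempts" ]; do
        # -s: quiet, -o /dev/null: discard the body, -w: print only the HTTP code.
        http_code=$(curl -s -o /dev/null -w '%{http_code}' \
            --max-time 30 \
            --data-urlencode "type=progress" \
            --data-urlencode "status=${status}" \
            --data-urlencode "message=${message}" \
            "$reportURL")
        curl_exit=$?

        # Success requires both a zero curl exit code and a 2xx response.
        if [ "$curl_exit" -eq 0 ]; then
            case "$http_code" in
                2[0-9][0-9])
                    echo "postStatus: '${status}' sent on attempt ${attempt}" >> "$logFile"
                    return 0
                    ;;
            esac
        fi

        echo "postStatus: attempt ${attempt}/${max_attempts} failed (curl exit ${curl_exit}, HTTP ${http_code})" >> "$logFile"

        # Back off before the next attempt, doubling the delay each time.
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "$delay"
            delay=$((delay * 2))
        fi
        attempt=$((attempt + 1))
    done

    echo "postStatus: giving up on '${status}' after ${max_attempts} attempts" >> "$logFile"
    return 1
}
```

A caller can act on the return value, for example `postStatus "completed" "All tests finished" || echo "status not delivered"` (the status string here is illustrative). The same loop would apply to `sendLogFile()` with a file upload in place of the form fields.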

Windows (install/ci-vm/ci-windows/ci/runCI.bat):

  • :postStatus: Now retries up to 3 times with exponential backoff
  • :sendLogFile: New function with retry logic (previously inline curl call)
  • :haltAndCatchFire: Now uses sendLogFile function

Test plan

  • Deploy to staging environment
  • Run a test to verify status posting works normally
  • Simulate network failure (e.g., block outbound traffic briefly) to verify retry logic works
  • Check logs after test completion to verify success messages are logged

🤖 Generated with Claude Code

Previously, the CI VM scripts (both Linux and Windows) had no error
handling for status POST requests. If the curl request failed due to
network issues or server errors, the script would continue without
retrying, potentially leaving tests stuck in "testing" state forever.

This fix adds:
- Retry logic with exponential backoff (3 attempts, 5s -> 10s -> 20s delay)
- HTTP status code checking (success = 2xx response)
- Curl exit code checking
- Detailed logging of retry attempts and failures
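
Both checks are needed because they catch different failures: unless `-f` is passed, curl exits 0 even when the server answers with a 4xx or 5xx status, and when no response arrives at all `%{http_code}` prints `000` alongside a non-zero exit code. A small sketch of the combined check follows; the helper name `requestSucceeded` is hypothetical and not taken from the scripts.

```shell
# Hypothetical helper: decide whether a curl call counts as delivered.
# $1 = curl exit code, $2 = HTTP status captured with -w '%{http_code}'.
requestSucceeded() {
    # A non-zero exit code means curl itself reported a failure
    # (for example 7 = could not connect, 28 = operation timed out).
    [ "$1" -eq 0 ] || return 1
    # curl exits 0 for 4xx/5xx responses unless -f is used, so the
    # status code is checked separately. Only 2xx counts as success.
    case "$2" in
        2[0-9][0-9]) return 0 ;;
        *)           return 1 ;;
    esac
}

requestSucceeded 0 200 && echo "exit 0, HTTP 200: delivered"
requestSucceeded 0 503 || echo "exit 0, HTTP 503: retry"
requestSucceeded 7 000 || echo "exit 7, HTTP 000: retry"
```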

This addresses the issue where test 7935 completed all 237 tests but
the completion status was never reported to the server, leaving the
test stuck at 100% progress but still in the "testing" state.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
canihavesomecoffee force-pushed the fix/ci-script-retry-logic branch from 7fcc78c to c2d5db0 on January 19, 2026 14:39
canihavesomecoffee merged commit 7fb3b73 into master Jan 19, 2026
6 checks passed