Add fetch-cache script for collecting cached dep images from GitHub CI #899
base: main
Conversation
Pull request overview
This PR adds functionality to download pre-built Docker dependency images from GitHub Actions artifacts, enabling faster local builds by reusing CI-built images instead of compiling dependencies from scratch.
Changes:
- Added `fetch-cache` target to Makefile that downloads cached dependency images from GitHub Actions artifacts
- Added `build-with-cache` target that builds quickstart using pre-fetched cached images instead of compiling dependencies
- Added architecture detection logic to support both amd64 and arm64 platforms
- Created `.scripts/fetch-cache` Python script that interfaces with the GitHub CLI to download artifacts and load them into Docker
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 13 comments.
| File | Description |
|---|---|
| Makefile | Added architecture detection, cache configuration variables, and two new targets (fetch-cache and build-with-cache) to support fetching and using cached dependency images |
| .scripts/fetch-cache | New Python script that downloads dependency images from GitHub Actions artifacts, handles pagination, and loads images into Docker with appropriate tags |
| print(f"Command failed: {' '.join(cmd)}", file=sys.stderr) | ||
| print(f"stdout: {result.stdout}", file=sys.stderr) | ||
| print(f"stderr: {result.stderr}", file=sys.stderr) | ||
| raise subprocess.CalledProcessError(result.returncode, cmd) |
Copilot AI · Feb 2, 2026
The run_cmd_quiet function captures output and only shows it on failure, which is good. However, when it raises CalledProcessError on line 74, it doesn't include the stdout/stderr in the exception, only prints them. This means any code catching this exception won't have access to the error output. Consider including the output in the exception message or as part of the exception's attributes for better error handling by callers.
Suggested change:
```diff
-raise subprocess.CalledProcessError(result.returncode, cmd)
+raise subprocess.CalledProcessError(
+    result.returncode,
+    cmd,
+    output=result.stdout,
+    stderr=result.stderr,
+)
```
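With the output attached, a caller can recover the failure details from the exception itself. An illustrative sketch, not code from this PR (`run_cmd_quiet` and `expected_tag` are as shown in the diff context):

```python
import subprocess
import sys

try:
    run_cmd_quiet(["docker", "images", "-q", expected_tag])
except subprocess.CalledProcessError as e:
    # With output= and stderr= set, the error details travel with the exception.
    print(f"Lookup failed (exit {e.returncode}): {e.stderr}", file=sys.stderr)
```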
```python
    page += 1
except (subprocess.CalledProcessError, json.JSONDecodeError) as e:
    print(f" Warning: Failed to list artifacts (page {page}): {e}", file=sys.stderr)
```
Copilot AI · Feb 2, 2026
In the artifact pagination loop, if an exception occurs (lines 126-128), the loop breaks and returns whatever artifacts were collected so far. This is appropriate for handling transient errors, but it means partial results could be silently returned if there are more artifacts available. Consider logging a warning that indicates how many pages were successfully processed before the error, so users understand if they're getting incomplete results.
| print(f" Warning: Failed to list artifacts (page {page}): {e}", file=sys.stderr) | |
| print( | |
| f" Warning: Failed to list artifacts for run {run_id} on page {page}: {e}. " | |
| f"Returning {len(all_artifacts)} artifact(s) collected from previous page(s).", | |
| file=sys.stderr, | |
| ) |
.scripts/fetch-cache (outdated)
| run_cmd(["docker", "load", "-i", tar_path]) | ||
|
|
||
| # Verify the image was loaded | ||
| result = run_cmd_quiet(["docker", "images", "-q", expected_tag], check=False) | ||
| if result.stdout.strip(): | ||
| print(f" Verified: {expected_tag}", file=sys.stderr) | ||
| return True | ||
| else: | ||
| print(f" Warning: Image {expected_tag} not found after load", file=sys.stderr) |
Copilot AI · Feb 2, 2026
The docker_load function verifies the image was loaded by checking if 'docker images -q' returns output for the expected tag. However, if the tar file contains an image with a different tag than expected (due to a mismatch between the CI build and local expectations), the load will succeed but the verification will fail with a warning. Consider also verifying what tag was actually loaded from the tar file (using 'docker load' output or 'docker inspect') to provide more helpful debugging information when tags don't match.
| run_cmd(["docker", "load", "-i", tar_path]) | |
| # Verify the image was loaded | |
| result = run_cmd_quiet(["docker", "images", "-q", expected_tag], check=False) | |
| if result.stdout.strip(): | |
| print(f" Verified: {expected_tag}", file=sys.stderr) | |
| return True | |
| else: | |
| print(f" Warning: Image {expected_tag} not found after load", file=sys.stderr) | |
| # Capture the output from docker load so we can report what was actually loaded | |
| load_output = run_cmd(["docker", "load", "-i", tar_path]) | |
| # Parse docker load output for loaded image references (e.g., "Loaded image: repo/name:tag") | |
| loaded_refs = [] | |
| for line in load_output.splitlines(): | |
| match = re.search(r"Loaded image(?:\sID)?:\s*(.+)", line) | |
| if match: | |
| loaded_refs.append(match.group(1).strip()) | |
| # Verify the image was loaded under the expected tag | |
| result = run_cmd_quiet(["docker", "images", "-q", expected_tag], check=False) | |
| if result.stdout.strip(): | |
| print(f" Verified: {expected_tag}", file=sys.stderr) | |
| return True | |
| else: | |
| if loaded_refs: | |
| print( | |
| f" Warning: Image {expected_tag} not found after load. " | |
| f"'docker load' reported: {', '.join(loaded_refs)}", | |
| file=sys.stderr, | |
| ) | |
| else: | |
| print(f" Warning: Image {expected_tag} not found after load", file=sys.stderr) |
.scripts/fetch-cache (outdated)
```
This script downloads cached Docker images for quickstart dependencies from
the GitHub Actions artifacts. If artifacts are not available (expired or cache hit),
it falls back to downloading from the GitHub Actions cache.

After downloading, it loads the images into Docker with the correct tags
expected by the Dockerfile.

Primary source: Artifacts from the latest completed CI workflow on main branch
Fallback source: GitHub Actions cache (requires gh-actions-cache extension)
```
Copilot AI · Feb 2, 2026
The docstring incorrectly states that artifacts are the fallback when they are actually the primary source. The docstring should be corrected to state:
- Primary source: Artifacts from the latest completed CI workflow on main branch
- Fallback source: GitHub Actions cache (requires gh-actions-cache extension)
.scripts/fetch-cache (outdated)
```
Requirements:
- gh CLI authenticated with access to stellar/quickstart
- docker CLI available
- jq available
- (optional) gh-actions-cache extension for cache fallback
```
Copilot AI · Feb 2, 2026
The requirement mentions "gh-actions-cache extension" but the code never actually uses it. The download_from_cache function at line 171 always returns False with a note that caches cannot be downloaded outside GitHub Actions. Either remove the mention of this extension from the requirements, or clarify that it's not actually useful for this purpose since the GitHub Actions cache API doesn't support downloading caches outside of Actions workflows.
.scripts/fetch-cache (outdated)
| print(f" - The CI only builds nightly images on schedule, not 'latest' tag images", file=sys.stderr) | ||
| print(f" - Images are in GitHub Actions cache (only accessible within Actions runners)", file=sys.stderr) | ||
| print(f"\nTo get pre-built images, you can:", file=sys.stderr) | ||
| print(f" 1. Trigger a CI build by pushing to main branch", file=sys.stderr) |
Copilot AI · Feb 2, 2026
The error message suggests "The CI only builds nightly images on schedule, not 'latest' tag images", but this may not be accurate based on the images.json configuration. images.json shows that the 'latest' tag has events ["pull_request", "push"], meaning it should be built on those events. The error message should be updated to reflect the actual CI build triggers or be made more generic to avoid misleading users.
| print(f" - The CI only builds nightly images on schedule, not 'latest' tag images", file=sys.stderr) | |
| print(f" - Images are in GitHub Actions cache (only accessible within Actions runners)", file=sys.stderr) | |
| print(f"\nTo get pre-built images, you can:", file=sys.stderr) | |
| print(f" 1. Trigger a CI build by pushing to main branch", file=sys.stderr) | |
| print(f" - The CI has not produced artifacts for this tag or branch yet", file=sys.stderr) | |
| print(f" - Images are in GitHub Actions cache (only accessible within Actions runners)", file=sys.stderr) | |
| print(f"\nTo get pre-built images, you can:", file=sys.stderr) | |
| print(f" 1. Trigger a CI build (for example, by pushing a commit or opening a pull request)", file=sys.stderr) |
```
Requirements:
- gh CLI authenticated with access to stellar/quickstart
- docker CLI available
- jq available
- (optional) gh-actions-cache extension for cache fallback
"""
```
Copilot AI · Feb 2, 2026
The script requires 'gh', 'docker', and 'jq' CLI tools as documented in the requirements section, but it doesn't validate that these tools are available before attempting to use them. If any of these tools are missing, the user will get cryptic error messages from subprocess calls rather than helpful guidance. Consider adding upfront checks for required tools and providing clear error messages if they're missing, similar to checking if the image.json file exists (lines 259-261).
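One way to do this is a startup check using `shutil.which`. A minimal sketch (the function name and messages are illustrative, not from the PR):

```python
import shutil
import sys

def check_required_tools():
    """Fail fast with a clear message if a required CLI tool is missing."""
    missing = [tool for tool in ("gh", "docker", "jq") if shutil.which(tool) is None]
    if missing:
        print(f"Error: missing required tool(s): {', '.join(missing)}", file=sys.stderr)
        print("Install them and make sure they are on your PATH.", file=sys.stderr)
        sys.exit(1)
```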
```python
def run_cmd(cmd, capture=True, check=True, verbose=True):
    """Run a command and return output."""
    if verbose:
        print(f" Running: {' '.join(cmd)}", file=sys.stderr)
    result = subprocess.run(
        cmd,
        capture_output=capture,
        text=True,
        check=check
    )
    return result.stdout.strip() if capture else None
```
Copilot AI · Feb 2, 2026
The run_cmd function has a 'verbose' parameter that defaults to True, printing every command to stderr. This could be excessive for operations that run many times (like API calls for pagination). The verbose output is helpful for debugging but may clutter the output during normal operation. Consider making verbosity configurable via a command-line flag or at least reducing verbosity for repetitive operations like pagination.
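One possible shape for this, assuming the script has (or gains) an argparse entry point; the flag name is illustrative:

```python
import argparse

def parse_args():
    parser = argparse.ArgumentParser(description="Fetch cached dependency images")
    parser.add_argument(
        "-v", "--verbose",
        action="store_true",
        help="print each command before running it",
    )
    return parser.parse_args()

# With the run_cmd shown above, quiet becomes the default:
#   args = parse_args()
#   run_cmd(cmd, verbose=args.verbose)
```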
Makefile (outdated)
```make
# Primary source: GitHub Actions cache (only accessible in GitHub Actions)
# Fallback source: Artifacts from latest completed CI workflow on main branch
```
Copilot AI · Feb 2, 2026
The comment incorrectly states that GitHub Actions cache is the "Primary source" when in fact the code in the fetch-cache script treats artifacts as the primary source and cache as a fallback. The comment should be updated to match the actual implementation:
- Primary source: Artifacts from latest completed CI workflow on main branch
- Fallback source: GitHub Actions cache (only accessible in GitHub Actions)
Suggested change:
```diff
-# Primary source: GitHub Actions cache (only accessible in GitHub Actions)
-# Fallback source: Artifacts from latest completed CI workflow on main branch
+# Primary source: Artifacts from latest completed CI workflow on main branch
+# Fallback source: GitHub Actions cache (only accessible in GitHub Actions)
```
```python
artifact_dir = tmpdir / f"artifact-{dep_name}"
artifact_dir.mkdir(exist_ok=True)
```
Copilot AI · Feb 2, 2026
When downloading artifacts, the script creates subdirectories in the temp directory using user-controlled input (dep_name). While tempfile.TemporaryDirectory() provides a secure temp directory, the dep_name comes from the image.json file which could theoretically contain path traversal characters. Although this is unlikely in practice since the image.json is part of the repository, consider sanitizing dep_name before using it in path construction to prevent potential path traversal issues (e.g., if dep_name contained '../').
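A minimal guard could validate the name before building the path. A sketch (the allowed character set is an assumption about dependency naming conventions, not from the PR):

```python
import re

def safe_dep_name(dep_name: str) -> str:
    """Reject names that could escape the temp directory via path traversal."""
    # Slashes and backslashes are excluded by the character class;
    # "." and ".." are rejected explicitly.
    if not re.fullmatch(r"[A-Za-z0-9._-]+", dep_name) or dep_name in {".", ".."}:
        raise ValueError(f"unsafe dependency name: {dep_name!r}")
    return dep_name

# At the call site shown above:
#   artifact_dir = tmpdir / f"artifact-{safe_dep_name(dep_name)}"
```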
What
Add a script and Makefile targets to download pre-built Docker dependency images from GitHub Actions artifacts. The script attempts to fetch cached images from recent CI runs, loads them into Docker, and enables faster local builds via `make build-with-cache`, which skips dependency compilation.
Why
Building quickstart from source requires compiling all dependencies, which is slow. Reusing pre-built images from CI significantly reduces local build times.
Todo