Add fetch-cache script for collecting cached dep images from GitHub CI #899
base: main
Conversation
Pull request overview
This PR adds functionality to download pre-built Docker dependency images from GitHub Actions artifacts, enabling faster local builds by reusing CI-built images instead of compiling dependencies from scratch.
Changes:
- Added `fetch-cache` target to Makefile that downloads cached dependency images from GitHub Actions artifacts
- Added `build-with-cache` target that builds quickstart using pre-fetched cached images instead of compiling dependencies
- Added architecture detection logic to support both amd64 and arm64 platforms
- Created `.scripts/fetch-cache` Python script that interfaces with the GitHub CLI to download artifacts and load them into Docker
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 13 comments.
| File | Description |
|---|---|
| Makefile | Added architecture detection, cache configuration variables, and two new targets (fetch-cache and build-with-cache) to support fetching and using cached dependency images |
| .scripts/fetch-cache | New Python script that downloads dependency images from GitHub Actions artifacts, handles pagination, and loads images into Docker with appropriate tags |
| print(f"Command failed: {' '.join(cmd)}", file=sys.stderr) | ||
| print(f"stdout: {result.stdout}", file=sys.stderr) | ||
| print(f"stderr: {result.stderr}", file=sys.stderr) | ||
| raise subprocess.CalledProcessError(result.returncode, cmd) |
Copilot AI · Feb 2, 2026
The run_cmd_quiet function captures output and only shows it on failure, which is good. However, when it raises CalledProcessError on line 74, it doesn't include the stdout/stderr in the exception, only prints them. This means any code catching this exception won't have access to the error output. Consider including the output in the exception message or as part of the exception's attributes for better error handling by callers.
Suggested change:
```diff
-raise subprocess.CalledProcessError(result.returncode, cmd)
+raise subprocess.CalledProcessError(
+    result.returncode,
+    cmd,
+    output=result.stdout,
+    stderr=result.stderr,
+)
```
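With the output attached, a caller can recover the failure details from the exception itself. An illustrative sketch, not code from this PR (`run_cmd_quiet` and `expected_tag` are as shown in the diff context):

```python
import subprocess
import sys

try:
    run_cmd_quiet(["docker", "images", "-q", expected_tag])
except subprocess.CalledProcessError as e:
    # With output= and stderr= set, the error details travel with the exception.
    print(f"Lookup failed (exit {e.returncode}): {e.stderr}", file=sys.stderr)
```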
```python
    page += 1
except (subprocess.CalledProcessError, json.JSONDecodeError) as e:
    print(f" Warning: Failed to list artifacts (page {page}): {e}", file=sys.stderr)
```
Copilot AI · Feb 2, 2026
In the artifact pagination loop, if an exception occurs (lines 126-128), the loop breaks and returns whatever artifacts were collected so far. This is appropriate for handling transient errors, but it means partial results could be silently returned if there are more artifacts available. Consider logging a warning that indicates how many pages were successfully processed before the error, so users understand if they're getting incomplete results.
| print(f" Warning: Failed to list artifacts (page {page}): {e}", file=sys.stderr) | |
| print( | |
| f" Warning: Failed to list artifacts for run {run_id} on page {page}: {e}. " | |
| f"Returning {len(all_artifacts)} artifact(s) collected from previous page(s).", | |
| file=sys.stderr, | |
| ) |
.scripts/fetch-cache (outdated)
| run_cmd(["docker", "load", "-i", tar_path]) | ||
|
|
||
| # Verify the image was loaded | ||
| result = run_cmd_quiet(["docker", "images", "-q", expected_tag], check=False) | ||
| if result.stdout.strip(): | ||
| print(f" Verified: {expected_tag}", file=sys.stderr) | ||
| return True | ||
| else: | ||
| print(f" Warning: Image {expected_tag} not found after load", file=sys.stderr) |
Copilot AI · Feb 2, 2026
The docker_load function verifies the image was loaded by checking if 'docker images -q' returns output for the expected tag. However, if the tar file contains an image with a different tag than expected (due to a mismatch between the CI build and local expectations), the load will succeed but the verification will fail with a warning. Consider also verifying what tag was actually loaded from the tar file (using 'docker load' output or 'docker inspect') to provide more helpful debugging information when tags don't match.
| run_cmd(["docker", "load", "-i", tar_path]) | |
| # Verify the image was loaded | |
| result = run_cmd_quiet(["docker", "images", "-q", expected_tag], check=False) | |
| if result.stdout.strip(): | |
| print(f" Verified: {expected_tag}", file=sys.stderr) | |
| return True | |
| else: | |
| print(f" Warning: Image {expected_tag} not found after load", file=sys.stderr) | |
| # Capture the output from docker load so we can report what was actually loaded | |
| load_output = run_cmd(["docker", "load", "-i", tar_path]) | |
| # Parse docker load output for loaded image references (e.g., "Loaded image: repo/name:tag") | |
| loaded_refs = [] | |
| for line in load_output.splitlines(): | |
| match = re.search(r"Loaded image(?:\sID)?:\s*(.+)", line) | |
| if match: | |
| loaded_refs.append(match.group(1).strip()) | |
| # Verify the image was loaded under the expected tag | |
| result = run_cmd_quiet(["docker", "images", "-q", expected_tag], check=False) | |
| if result.stdout.strip(): | |
| print(f" Verified: {expected_tag}", file=sys.stderr) | |
| return True | |
| else: | |
| if loaded_refs: | |
| print( | |
| f" Warning: Image {expected_tag} not found after load. " | |
| f"'docker load' reported: {', '.join(loaded_refs)}", | |
| file=sys.stderr, | |
| ) | |
| else: | |
| print(f" Warning: Image {expected_tag} not found after load", file=sys.stderr) |
.scripts/fetch-cache (outdated)
```
This script downloads cached Docker images for quickstart dependencies from
the GitHub Actions artifacts. If artifacts are not available (expired or cache hit),
it falls back to downloading from the GitHub Actions cache.

After downloading, it loads the images into Docker with the correct tags
expected by the Dockerfile.

Primary source: Artifacts from the latest completed CI workflow on main branch
Fallback source: GitHub Actions cache (requires gh-actions-cache extension)
```
Copilot AI · Feb 2, 2026
The docstring incorrectly states that artifacts are the fallback when they are actually the primary source. The docstring should be corrected to state:
- Primary source: Artifacts from the latest completed CI workflow on main branch
- Fallback source: GitHub Actions cache (requires gh-actions-cache extension)
.scripts/fetch-cache (outdated)
```
Requirements:
- gh CLI authenticated with access to stellar/quickstart
- docker CLI available
- jq available
- (optional) gh-actions-cache extension for cache fallback
```
Copilot AI · Feb 2, 2026
The requirement mentions "gh-actions-cache extension" but the code never actually uses it. The download_from_cache function at line 171 always returns False with a note that caches cannot be downloaded outside GitHub Actions. Either remove the mention of this extension from the requirements, or clarify that it's not actually useful for this purpose since the GitHub Actions cache API doesn't support downloading caches outside of Actions workflows.
.scripts/fetch-cache (outdated)
| print(f" - The CI only builds nightly images on schedule, not 'latest' tag images", file=sys.stderr) | ||
| print(f" - Images are in GitHub Actions cache (only accessible within Actions runners)", file=sys.stderr) | ||
| print(f"\nTo get pre-built images, you can:", file=sys.stderr) | ||
| print(f" 1. Trigger a CI build by pushing to main branch", file=sys.stderr) |
Copilot AI · Feb 2, 2026
The error message suggests "The CI only builds nightly images on schedule, not 'latest' tag images", but this may not be accurate based on the images.json configuration. images.json shows that the 'latest' tag has events ["pull_request", "push"], meaning it should be built on those events. The error message should be updated to reflect the actual CI build triggers or be made more generic to avoid misleading users.
| print(f" - The CI only builds nightly images on schedule, not 'latest' tag images", file=sys.stderr) | |
| print(f" - Images are in GitHub Actions cache (only accessible within Actions runners)", file=sys.stderr) | |
| print(f"\nTo get pre-built images, you can:", file=sys.stderr) | |
| print(f" 1. Trigger a CI build by pushing to main branch", file=sys.stderr) | |
| print(f" - The CI has not produced artifacts for this tag or branch yet", file=sys.stderr) | |
| print(f" - Images are in GitHub Actions cache (only accessible within Actions runners)", file=sys.stderr) | |
| print(f"\nTo get pre-built images, you can:", file=sys.stderr) | |
| print(f" 1. Trigger a CI build (for example, by pushing a commit or opening a pull request)", file=sys.stderr) |
```
Requirements:
- gh CLI authenticated with access to stellar/quickstart
- docker CLI available
- jq available
- (optional) gh-actions-cache extension for cache fallback
"""
```
Copilot AI · Feb 2, 2026
The script requires 'gh', 'docker', and 'jq' CLI tools as documented in the requirements section, but it doesn't validate that these tools are available before attempting to use them. If any of these tools are missing, the user will get cryptic error messages from subprocess calls rather than helpful guidance. Consider adding upfront checks for required tools and providing clear error messages if they're missing, similar to checking if the image.json file exists (lines 259-261).
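One way to do this is a startup check using `shutil.which`. A minimal sketch (the function name and messages are illustrative, not from the PR):

```python
import shutil
import sys

def check_required_tools():
    """Fail fast with a clear message if a required CLI tool is missing."""
    missing = [tool for tool in ("gh", "docker", "jq") if shutil.which(tool) is None]
    if missing:
        print(f"Error: missing required tool(s): {', '.join(missing)}", file=sys.stderr)
        print("Install them and make sure they are on your PATH.", file=sys.stderr)
        sys.exit(1)
```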
```python
def run_cmd(cmd, capture=True, check=True, verbose=True):
    """Run a command and return output."""
    if verbose:
        print(f" Running: {' '.join(cmd)}", file=sys.stderr)
    result = subprocess.run(
        cmd,
        capture_output=capture,
        text=True,
        check=check
    )
    return result.stdout.strip() if capture else None
```
Copilot AI · Feb 2, 2026
The run_cmd function has a 'verbose' parameter that defaults to True, printing every command to stderr. This could be excessive for operations that run many times (like API calls for pagination). The verbose output is helpful for debugging but may clutter the output during normal operation. Consider making verbosity configurable via a command-line flag or at least reducing verbosity for repetitive operations like pagination.
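One possible shape for this, assuming the script has (or gains) an argparse entry point; the flag name is illustrative:

```python
import argparse

def parse_args():
    parser = argparse.ArgumentParser(description="Fetch cached dependency images")
    parser.add_argument(
        "-v", "--verbose",
        action="store_true",
        help="print each command before running it",
    )
    return parser.parse_args()

# With the run_cmd shown above, quiet becomes the default:
#   args = parse_args()
#   run_cmd(cmd, verbose=args.verbose)
```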
Makefile (outdated)
```make
# Primary source: GitHub Actions cache (only accessible in GitHub Actions)
# Fallback source: Artifacts from latest completed CI workflow on main branch
```
Copilot AI · Feb 2, 2026
The comment incorrectly states that GitHub Actions cache is the "Primary source" when in fact the code in the fetch-cache script treats artifacts as the primary source and cache as a fallback. The comment should be updated to match the actual implementation:
- Primary source: Artifacts from latest completed CI workflow on main branch
- Fallback source: GitHub Actions cache (only accessible in GitHub Actions)
Suggested change:
```diff
-# Primary source: GitHub Actions cache (only accessible in GitHub Actions)
-# Fallback source: Artifacts from latest completed CI workflow on main branch
+# Primary source: Artifacts from latest completed CI workflow on main branch
+# Fallback source: GitHub Actions cache (only accessible in GitHub Actions)
```
```python
artifact_dir = tmpdir / f"artifact-{dep_name}"
artifact_dir.mkdir(exist_ok=True)
```
Copilot AI · Feb 2, 2026
When downloading artifacts, the script creates subdirectories in the temp directory using user-controlled input (dep_name). While tempfile.TemporaryDirectory() provides a secure temp directory, the dep_name comes from the image.json file which could theoretically contain path traversal characters. Although this is unlikely in practice since the image.json is part of the repository, consider sanitizing dep_name before using it in path construction to prevent potential path traversal issues (e.g., if dep_name contained '../').
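A minimal guard could validate the name before building the path. A sketch (the allowed character set is an assumption about dependency naming conventions, not from the PR):

```python
import re

def safe_dep_name(dep_name: str) -> str:
    """Reject names that could escape the temp directory via path traversal."""
    # Slashes and backslashes are excluded by the character class;
    # "." and ".." are rejected explicitly.
    if not re.fullmatch(r"[A-Za-z0-9._-]+", dep_name) or dep_name in {".", ".."}:
        raise ValueError(f"unsafe dependency name: {dep_name!r}")
    return dep_name

# At the call site shown above:
#   artifact_dir = tmpdir / f"artifact-{safe_dep_name(dep_name)}"
```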
What
Add a script and Makefile targets to download pre-built Docker dependency images from GitHub Actions artifacts. The script attempts to fetch cached images from recent CI runs, loads them into Docker, and enables faster local builds via `make build-with-cache`, which skips dependency compilation.
Why
Building quickstart from source requires compiling all dependencies, which is slow. Reusing pre-built images from CI significantly reduces local build times.
Todo