Using caching to speed up GitHub Actions workflows
As we continue to refine our GitHub Actions workflows, it’s essential to look for optimizations that can save time and resources. One feature at our disposal is caching. By caching dependencies and other frequently reused files, we can significantly reduce the time our workflows take to run.
Understanding caching in GitHub Actions
Caching works by storing a copy of specific files or directories, like your project’s dependencies, between workflow runs. When your workflow runs again, it can reuse the cached files instead of regenerating or downloading them, which can often be time-consuming.
Why Caching?
- Faster workflows: reusing previously downloaded or compiled files can drastically reduce build and setup times.
- Reduced network latency: by avoiding repeated downloads, you also reduce potential network-related delays.
- Improved reliability: minimizing external network requests decreases the chance of a failure due to network issues.
Implementing caching in your workflows
GitHub Actions provides a built-in cache
action that you can use to cache dependencies and other files. The key to effective caching is identifying what to cache and how to invalidate the cache when necessary (for example, when dependencies change).
- Basic caching example
Here’s how to cache Node.js dependencies using the cache
action in a GitHub Actions workflow:
In this example:
- path: we specify the cache location (
~/.npm
) where npm stores the downloaded modules. - key: a unique key that represents the specific cache instance. We use a combination of the runner OS and a hash of the
package-lock.json
file. The hash changes if the dependencies change, creating a new cache entry for the next run. - restore-keys: used to restore from partially matching cache keys if there’s no exact match on the key.
- Caching across different environments and languages
Caching is not limited to Node.js or npm. You can implement similar strategies for other environments and dependency managers (e.g., Python with pip, Ruby with Bundler, etc.) by adjusting the path
and key
to match those environments’ specifics.
Best practices for caching
- Be specific with cache keys: use precise keys to avoid restoring the wrong cache. Including the platform, language version, and a hash of dependency files in the cache key is a good practice.
- Cache dependencies, not build outputs: cache items that take a long time to download or generate but change infrequently. Avoid caching build outputs which can vary significantly between runs.
- Understand cache limits: GitHub Actions provides a limited amount of free caching storage per repository. Be mindful of what you cache to stay within these limits, or switch to other cache backend ↗ or GitHub Action providers that come with much higher cache sizes.
Conclusion
Caching is a powerful tool in your GitHub Actions workflows, enabling faster builds, reducing network dependency, and making your CI/CD process more efficient. This article focuses on dependency caching, but a future article will cover various strategies for caching docker layers, as well as covering how to upload artifacts. Stay tuned!