Using caching to speed up GitHub Actions workflows

Understanding caching in GitHub Actions

Caching works by storing a copy of specific files or directories, like your project’s dependencies, between workflow runs. When your workflow runs again, it can reuse the cached files instead of regenerating or downloading them, which can often be time-consuming.

Why Caching?

Faster workflows: reusing previously downloaded or compiled files can drastically reduce build and setup times.
Reduced network latency: by avoiding repeated downloads, you also reduce potential network-related delays.
Improved reliability: minimizing external network requests decreases the chance of a failure due to network issues.

Implementing caching in your workflows

GitHub Actions provides a built-in cache action that you can use to cache dependencies and other files. The key to effective caching is identifying what to cache and how to invalidate the cache when necessary (for example, when dependencies change).

Basic caching example

Here’s how to cache Node.js dependencies using the cache action in a GitHub Actions workflow:

name: Node.js CI

on:
  push:
    branches:
      - main

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Cache Node modules
        uses: actions/cache@v4
        with:
          path: ~/.npm
          key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
          restore-keys: |
            ${{ runner.os }}-node-

      - name: Install Dependencies
        run: npm install

      - name: Test
        run: npm test

In this example:

path: we specify the cache location (~/.npm) where npm stores the downloaded modules.
key: a unique key that represents the specific cache instance. We use a combination of the runner OS and a hash of the package-lock.json file. The hash changes if the dependencies change, creating a new cache entry for the next run.
restore-keys: used to restore from partially matching cache keys if there’s no exact match on the key.

Caching across different environments and languages

Caching is not limited to Node.js or npm. You can implement similar strategies for other environments and dependency managers (e.g., Python with pip, Ruby with Bundler, etc.) by adjusting the path and key to match those environments’ specifics.

Best practices for caching

Be specific with cache keys: use precise keys to avoid restoring the wrong cache. Including the platform, language version, and a hash of dependency files in the cache key is a good practice.
Cache dependencies, not build outputs: cache items that take a long time to download or generate but change infrequently. Avoid caching build outputs which can vary significantly between runs.
Understand cache limits: GitHub Actions provides a limited amount of free caching storage per repository. Be mindful of what you cache to stay within these limits, or switch to other cache backend ↗ or GitHub Action providers that come with much higher cache sizes.

Conclusion

Caching is a powerful tool in your GitHub Actions workflows, enabling faster builds, reducing network dependency, and making your CI/CD process more efficient. This article focuses on dependency caching, but a future article will cover various strategies for caching docker layers, as well as covering how to upload artifacts. Stay tuned!