
Warm pools

Learn how to use runner warm pools to get <10s queuing time for your GitHub Actions workflows on self-hosted runners.

Runner pools allow you to pre-provision runners that are ready to pick up jobs immediately, dramatically reducing queue times from ~25 seconds (cold-start) to under 6 seconds (hot instances). This is ideal for improving developer experience, reducing wait times for smaller jobs, and achieving predictable performance.

Available since v2.9.0.

Usage

Flex

Flex jobs opt into a warm pool with the pool label. The pool itself is defined in .github-private/.github/runs-on.yml:

jobs:
  build:
    runs-on: runs-on=${{ github.run_id }}/pool=small-x64

Fleet

Fleet warm capacity is configured on the runner fleet in Terraform. Workflows still target the same fleet label; the platform-owned schedule decides whether the job gets hot, stopped, or cold capacity:

fleets = {
  linux-small = {
    runner   = "small-x64"
    schedule = [{ name = "default", hot = 1, stopped = 2 }]
  }
}

jobs:
  build:
    runs-on: runs-on/fleet=linux-small/env=production

Expected Queue Times

For Linux runners (m7a-type instances):

| Instance Type | Queue Time | Use Case |
| --- | --- | --- |
| Cold-start | < 25s | Default behavior, most cost-efficient |
| Stopped | < 15s | Pre-warmed EBS, balanced cost/performance |
| Hot | < 6s | Always running, fastest response time |
[Figures: hot pool timing (under 6 seconds), stopped pool timing (under 15 seconds), overflow pool timing]

For Windows runners:

| Instance Type | Queue Time |
| --- | --- |
| Cold-start | < 3min |
| Stopped | < 40s |
| Hot | < 6s |
[Figures: Windows hot pool timing (under 6 seconds), Windows stopped pool timing (under 60 seconds)]

Pool Types

Hot instances stay running and ready to accept jobs immediately. They provide the fastest response times but incur EC2 compute costs. Hot instances are automatically terminated after 16h of idle time to ensure they stay fresh and up-to-date.

Stopped instances are pre-provisioned, warmed up, and then stopped to minimize costs. When a job arrives, they start quickly (EBS volume is already warmed, dependencies installed). You only pay for EBS storage while stopped, making them a good balance between cost and performance.

Configuration

Pools are always configured in the .github/runs-on.yml file in your organization’s .github-private repository. This single file serves as the source of truth for all pool configurations.

Basic Example

.github-private/.github/runs-on.yml
runners:
  small-x64:
    image: ubuntu24-full-x64
    ram: 1
    family: [t3]
    volume: gp3:30gb:125mbps:3000iops
pools:
  small-x64:
    env: production
    runner: small-x64
    timezone: "Europe/Paris"
    schedule:
      - name: default
        stopped: 2
        hot: 1

This configuration:

  • Defines a custom runner named small-x64 with 1GB RAM
  • Creates a pool named small-x64 that maintains 2 stopped instances and 1 hot instance
  • Uses Paris timezone for schedule calculations

Advanced Example with Scheduling

pools:
  small-x64:
    env: production
    runner: small-x64
    timezone: "America/New_York"
    schedule:
      - name: business-hours
        match:
          day: ["monday", "tuesday", "wednesday", "thursday", "friday"]
          time: ["08:00", "18:00"]
        stopped: 5
        hot: 2
      - name: nights
        match:
          day: ["monday", "tuesday", "wednesday", "thursday", "friday"]
          time: ["18:00", "08:00"]
        stopped: 2
        hot: 0
      - name: weekends
        match:
          day: ["saturday", "sunday"]
        stopped: 1
        hot: 0
      - name: default
        stopped: 2
        hot: 1

This creates different pool capacities based on your usage patterns:

  • Business hours (weekdays 8am-6pm): 5 stopped + 2 hot instances
  • Nights (weekdays 6pm-8am): 2 stopped instances, no hot instances
  • Weekends: 1 stopped instance only
  • Default: Fallback for any unmatched time periods
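To make the schedule semantics concrete, here is a minimal Python sketch of how a rule could be selected for a given moment. It is an illustration, not RunsOn's implementation. Three behaviors are assumptions: the first matching rule wins, time windows are half-open (start included, end excluded), and a window whose end is earlier than its start (such as 18:00 to 08:00) wraps past midnight. The function names are hypothetical.

```python
from datetime import datetime, time
from zoneinfo import ZoneInfo

DAYS = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]

# The schedule from the advanced example above, as plain Python data.
SCHEDULE = [
    {"name": "business-hours", "match": {"day": DAYS[:5], "time": ["08:00", "18:00"]}, "stopped": 5, "hot": 2},
    {"name": "nights", "match": {"day": DAYS[:5], "time": ["18:00", "08:00"]}, "stopped": 2, "hot": 0},
    {"name": "weekends", "match": {"day": DAYS[5:]}, "stopped": 1, "hot": 0},
    {"name": "default", "stopped": 2, "hot": 1},
]


def rule_matches(rule: dict, local_now: datetime) -> bool:
    """Return True if the rule applies at local_now (already in the pool's timezone)."""
    match = rule.get("match")
    if not match:
        return True  # a rule without `match` (e.g. "default") always applies
    days = match.get("day")
    if days and DAYS[local_now.weekday()] not in days:
        return False
    window = match.get("time")
    if window:
        start, end = (time.fromisoformat(t) for t in window)
        current = local_now.time()
        if start <= end:
            return start <= current < end
        # Assumption: a window such as 18:00-08:00 wraps past midnight.
        return current >= start or current < end
    return True


def target_capacity(schedule: list, local_now: datetime) -> tuple:
    """Return (rule name, hot, stopped) for the first matching rule (assumed first-match-wins)."""
    for rule in schedule:
        if rule_matches(rule, local_now):
            return rule["name"], rule.get("hot", 0), rule.get("stopped", 0)
    return None, 0, 0


def pool_local_time(timezone_name: str, now_utc: datetime) -> datetime:
    """Convert an aware UTC datetime to the pool's IANA timezone."""
    return now_utc.astimezone(ZoneInfo(timezone_name))
```

Under these assumptions, a Wednesday at 10:00 local time resolves to business-hours, and a Saturday at 02:00 resolves to weekends rather than nights, because the nights rule only lists weekdays.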

Configuration Options

| Field | Description |
| --- | --- |
| env | Stack environment this pool belongs to (e.g., production, dev) |
| runner | Reference to a runner definition in the runners section |
| timezone | IANA timezone for schedule calculations (e.g., America/New_York, Europe/Paris) |
| schedule | List of schedule rules with capacity targets |
| schedule[].name | Human-readable name for this schedule |
| schedule[].match.day | Array of days (monday-sunday) when this schedule applies |
| schedule[].match.time | Time range [start, end] in 24-hour format |
| schedule[].stopped | Number of stopped instances to maintain |
| schedule[].hot | Number of hot instances to maintain |

Using Pools in Workflows

To use a pool in your workflow, add the pool=POOL_NAME label to your runs-on definition:

jobs:
  test:
    runs-on: runs-on/pool=small-x64
    steps:
      - uses: actions/checkout@v6
      - run: npm test

For more deterministic runner-to-job assignment:

jobs:
  test:
    runs-on: runs-on=${{ github.run_id }}/pool=small-x64
    steps:
      - uses: actions/checkout@v6
      - run: npm test

When a job uses a pool label, all other RunsOn labels that describe the runner (such as cpu, ram, or family) are ignored. Only the runner specification defined in the pool configuration is used.

Automatic Overflow

If your pool is exhausted (all instances are in use), RunsOn automatically creates a cold-start instance to handle the job. This ensures jobs never fail due to lack of capacity:

Job arrives → Check pool capacity
  → Instance available? → Pick from pool (fast)
  → No capacity? → Cold-start instance (fallback)

Dependabot Integration

Pools enable using RunsOn for Dependabot jobs. If you define a pool named dependabot, it will automatically be used for any Dependabot jobs:

.github-private/.github/runs-on.yml
# You must define the runner that the pool will use
runners:
  small-x64:
    image: ubuntu24-full-x64
    ram: 2
    family: [t3]
    volume: gp3:30gb:125mbps:3000iops
pools:
  dependabot:
    env: production
    # must reference a runner defined in the `runners` section above
    runner: small-x64
    schedule:
      - name: default
        stopped: 2
        hot: 0

When RunsOn sees a job with the dependabot label, it automatically expands it internally to the equivalent of runs-on/pool=dependabot.

Note that you must enable Dependabot on self-hosted runners in your GitHub repository settings. Otherwise, Dependabot jobs will still be launched on GitHub's official runners.

Cost Considerations

Storage Costs for Stopped Instances

Stopped instances still incur EBS storage costs. To minimize expenses:

runners:
  efficient-runner:
    volume: gp3:30gb:125mbps:3000iops # Free tier eligible

Hot Instance Costs

Hot instances incur both EC2 compute and storage costs while running. They are automatically recycled after 16 hours of idle time (or earlier, when pool capacity changes, for example because of a schedule) to:

  • Keep costs under control
  • Ensure instances stay updated with latest AMI
  • Prevent long-running instances from accumulating issues

Cost Comparison

Assuming an m7a.medium instance (on-demand price: $0.07/hour) with 30GB gp3 storage ($0.08/GB-month), available 24/7 (720 hours per month):

| Type | EC2 Cost | Storage Cost | Total/Month (1 instance) |
| --- | --- | --- | --- |
| Hot | ~$50.40/month (24/7) | ~$2.40/month | ~$52.80/month |
| Stopped | $0 | ~$2.40/month (24/7) | ~$2.40/month |
| Cold-start | $0 | $0 | $0 (pay per use) |

Note that with schedules, you can keep those hot or stopped instances available for only half a day and not on weekends, so your real costs would be lower.
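The arithmetic behind the table can be reproduced with a short Python sketch. The prices and the 720-hour month come from the assumptions stated above. The `active_fraction` parameter is a hypothetical knob for illustration: it models an instance that only exists for part of the week because of a schedule, and it assumes storage is billed only while the instance exists.

```python
HOURS_PER_MONTH = 720  # 30 days x 24 hours, as in the table above


def monthly_cost(kind: str, active_fraction: float = 1.0,
                 hourly_price: float = 0.07, volume_gb: int = 30,
                 gb_month_price: float = 0.08) -> float:
    """Estimate the monthly cost in USD of one pool instance.

    kind is "hot", "stopped" or "cold-start". active_fraction is the share
    of the month during which the schedule keeps the instance around.
    """
    storage = volume_gb * gb_month_price
    compute = hourly_price * HOURS_PER_MONTH
    if kind == "hot":
        return round((compute + storage) * active_fraction, 2)
    if kind == "stopped":
        return round(storage * active_fraction, 2)
    if kind == "cold-start":
        return 0.0  # no standby cost; you pay only while jobs run
    raise ValueError(f"unknown pool type: {kind}")


# One hot instance kept only on weekdays from 08:00 to 18:00 is active
# 50 of the 168 hours in a week.
weekday_business_hours = 50 / 168
```

Calling `monthly_cost("hot")` and `monthly_cost("stopped")` gives the 24/7 totals from the table, while `monthly_cost("hot", weekday_business_hours)` estimates a hot instance that is scheduled for business hours only.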

Availability

Pools are available for both Linux and Windows runners. They are especially useful in the following cases:

  • you have short jobs that are queued frequently, and you want them to be executed as fast as possible.
  • you are using Windows runners, and don’t want to wait multiple minutes for them to start up.
  • you are preinstalling a lot of dependencies in your runners with the preinstall feature, and you want that process to be done during the warm-up phase so that you get much better pick-up times for your jobs.
  • you need just-in-time setup right before the job starts, such as refreshing a docker login token with the prerun attribute on a custom runner or image.

Limitations

  • On-demand instances only: Spot instance support for hot instances will be added once pools are considered stable
  • SSH access: There is no stack-level DefaultAdmins parameter in v3. If you need privileged instance access for pool-backed runners, prefer SSM and keep repository-level SSH access intentionally scoped.

FAQ

How do I know if my pool is working?

Check the EC2 console for instances with the runs-on-pool-name tag matching your pool name. You should see instances in various states (warming-up, ready, detached).

You also get monitoring widgets in the embedded CloudWatch dashboard. If the pool runner spec enables extras=otel, those jobs follow the same runner OTEL behavior documented on /docs/observability/opentelemetry/.
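If you prefer a script to the console, the same check can be sketched in Python. This is a minimal example, assuming boto3 is installed and AWS credentials are configured. The tag names are the ones mentioned in this document; the function name is hypothetical, and the instance-state filter is an assumption added to skip terminated instances.

```python
from collections import Counter

POOL_NAME_TAG = "runs-on-pool-name"
POOL_STATUS_TAG = "runs-on-pool-standby-status"


def pool_status_counts(ec2_client, pool_name: str) -> dict:
    """Count a pool's instances per standby status, using EC2 tags."""
    counts = Counter()
    paginator = ec2_client.get_paginator("describe_instances")
    pages = paginator.paginate(
        Filters=[
            {"Name": f"tag:{POOL_NAME_TAG}", "Values": [pool_name]},
            # Assumption: ignore instances that are already gone.
            {"Name": "instance-state-name",
             "Values": ["pending", "running", "stopping", "stopped"]},
        ]
    )
    for page in pages:
        for reservation in page["Reservations"]:
            for instance in reservation["Instances"]:
                tags = {t["Key"]: t["Value"] for t in instance.get("Tags", [])}
                counts[tags.get(POOL_STATUS_TAG, "unknown")] += 1
    return dict(counts)


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   print(pool_status_counts(boto3.client("ec2"), "small-x64"))
```

A healthy pool should mostly report ready instances, with warming-up entries appearing briefly after a configuration change.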

What happens if I delete a pool from config?

The pool manager will automatically terminate all instances in that pool during the next convergence cycle.

Can I use spot instances for hot pools?

Not yet. Hot pool instances currently use on-demand pricing. Spot support will be added once pools are stable.

What if my job needs custom labels (cpu, ram, etc.)?

Pool jobs ignore all labels except pool, env, and region. To customize runner specs, define them in the pool’s runner configuration in .github-private/.github/runs-on.yml.

How do I update my runner configuration?

Simply update the runner definition in .github-private/.github/runs-on.yml. The pool manager will automatically detect the change (via spec hash) and roll out new instances within a few convergence cycles (~1-2 minutes).

Can I have multiple pools with different specs?

Yes! Define multiple entries in the pools section, each referencing different runners:

runners:
  small: { ram: 1, ... }
  large: { ram: 16, ... }
pools:
  pool-small: { runner: small, ... }
  pool-large: { runner: large, ... }

How It Works

Pool Manager

A pool manager process runs a convergence loop every 30 seconds that:

  1. Fetches configuration from .github-private/.github/runs-on.yml
  2. Matches schedule to determine current target capacity (hot/stopped counts)
  3. Rebalances instances to match target capacity
  4. Updates states of instances through their lifecycle

Instance Lifecycle

Pool instances move through these states (tracked via runs-on-pool-standby-status EC2 tag):

| State | Description |
| --- | --- |
| warming-up | Instance is being created, EBS warming, running preinstall scripts |
| ready | Instance is available to be picked up for jobs |
| ready-to-stop | Stopped-type instance that has completed warmup, ready to be stopped |
| detached | Instance picked up for a job, no longer managed by pool |
| error | Instance encountered an error during setup |

Rebalance Algorithm

On each 30-second cycle, the pool manager:

  1. Categorizes instances by state (hot, stopped, outdated, error, etc.)
  2. Terminates error instances that failed setup
  3. Terminates dangling instances that should have started jobs but didn’t
  4. Terminates outdated instances with old spec hash (runner config or AMI changed)
    • Happens immediately to free AWS quota before creating new instances
  5. Stops ready-to-stop instances (stopped-type that finished warmup)
  6. Creates missing instances to reach target capacity (batched creation)
  7. Terminates excess instances beyond target capacity
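The steps above can be sketched as a pure planning function in Python. This is a simplified illustration rather than the pool manager's actual code. It assumes protected instances (detached or already running a job) were filtered out beforehand, that a `dangling` flag was computed elsewhere, and it decides on excess instances before stopping anything so that no instance is both stopped and terminated. All names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    id: str
    kind: str            # "hot" or "stopped"
    status: str          # warming-up, ready, ready-to-stop, error
    spec_hash: str
    dangling: bool = False  # hypothetical flag: should have started a job but didn't


@dataclass
class Plan:
    terminate: list = field(default_factory=list)
    stop: list = field(default_factory=list)
    create: dict = field(default_factory=dict)  # kind -> number of instances to create


def rebalance(instances: list, target: dict, current_hash: str) -> Plan:
    """Compute the actions of one convergence cycle for a single pool."""
    plan = Plan()
    healthy = {"hot": [], "stopped": []}
    for inst in instances:
        # Steps 2-4: error, dangling and outdated instances are terminated.
        if inst.status == "error" or inst.dangling or inst.spec_hash != current_hash:
            plan.terminate.append(inst.id)
        else:
            healthy[inst.kind].append(inst)
    for kind in ("hot", "stopped"):
        want = target.get(kind, 0)
        kept, excess = healthy[kind][:want], healthy[kind][want:]
        # Step 5: stop kept instances that have finished warming up.
        plan.stop.extend(i.id for i in kept if i.status == "ready-to-stop")
        # Step 6: create what is missing to reach target capacity.
        if len(kept) < want:
            plan.create[kind] = want - len(kept)
        # Step 7: terminate anything beyond target capacity.
        plan.terminate.extend(i.id for i in excess)
    return plan
```

Keeping the plan separate from the EC2 calls makes one cycle easy to test: the same inputs always produce the same lists of instances to terminate, stop, and create.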

Spec Hash and Rollouts

Each pool instance is tagged with a spec hash that includes:

  • Runner configuration (CPU, RAM, disk, image)
  • AMI ID
  • Pool configuration

When you update runner configuration or a new AMI is published, the pool manager automatically:

  1. Detects outdated instances (spec hash mismatch)
  2. Terminates them to free AWS quota
  3. Creates new instances with the updated specification

This ensures pools always run the latest configuration and AMI versions.
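As an illustration of the idea, a spec hash can be derived from a canonical serialization of its inputs. The sketch below is an assumption about the general technique, not the hash RunsOn actually computes: it uses SHA-256 over sorted-key JSON, truncated to 16 hex characters, and the function names and dictionary shapes are hypothetical.

```python
import hashlib
import json


def spec_hash(runner: dict, ami_id: str, pool: dict) -> str:
    """Hash the inputs that define what a pool instance should look like."""
    payload = json.dumps(
        {"runner": runner, "ami": ami_id, "pool": pool},
        sort_keys=True,            # key order must not change the hash
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def outdated_instances(instances: list, current_hash: str) -> list:
    """Return the ids of instances whose recorded hash no longer matches."""
    return [i["id"] for i in instances if i["spec_hash"] != current_hash]
```

Because the AMI ID is part of the hashed payload, a newly published AMI changes the hash even when the runner definition is untouched, which is what triggers a rollout.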

Batch Operations

To handle large pools efficiently:

  • Termination: Batched up to 50 instances per EC2 API call
  • Creation: Uses EC2 Fleet API to create multiple instances atomically
  • Starting stopped instances: Batched up to 50 instances per API call
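The batching itself is simple chunking. A minimal Python sketch, using the 50-instance limit stated above (the helper name is hypothetical):

```python
def batches(instance_ids: list, size: int = 50):
    """Yield consecutive slices of at most `size` instance ids."""
    for start in range(0, len(instance_ids), size):
        yield instance_ids[start:start + size]


# Example: 120 instances to terminate become three API calls.
ids = [f"i-{n:04d}" for n in range(120)]
batch_sizes = [len(batch) for batch in batches(ids)]
```

Each yielded slice would be passed to a single EC2 call, so 120 instances need three calls instead of 120.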

Safety Mechanisms

Instances are protected from termination when:

  • Job has started (runs-on-workflow-job-started tag is set)
  • Instance is detached from pool (status = detached)
  • Instance is currently executing a workflow job

The rebalance algorithm explicitly filters out these instances before any termination operations.
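The filter can be pictured as a predicate applied before any termination step. This Python sketch uses the tag names from this document; how the pool manager detects a currently executing job is not described here, so it is passed in as a hypothetical `running_job` flag.

```python
JOB_STARTED_TAG = "runs-on-workflow-job-started"
STATUS_TAG = "runs-on-pool-standby-status"


def is_protected(tags: dict, running_job: bool = False) -> bool:
    """Return True if the instance must never be terminated by the pool manager."""
    return (
        JOB_STARTED_TAG in tags                # job has started
        or tags.get(STATUS_TAG) == "detached"  # no longer managed by the pool
        or running_job                         # currently executing a workflow job
    )


def terminable(instances: list) -> list:
    """Keep only the instances that the rebalance steps are allowed to terminate."""
    return [
        i for i in instances
        if not is_protected(i["tags"], i.get("running_job", False))
    ]
```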

Hot and stopped pools (Fleet)

Fleet normally launches a fresh EC2 runner after GitHub assigns a job to a runner scale set. Hot and stopped pools let a runner fleet keep pre-warmed EC2 inventory so assigned jobs can start faster, at the cost of some idle EC2 or EBS spend.

When you actually need a pool

Most runner fleets do not need one. Start with cold launches and only add a pool if you can name the runner fleet and the latency complaint.

  • Linux runner fleets: usually skip. Fleet’s cold launch path on Linux x64 or ARM64 is fast enough that adding a pool rarely moves the needle. Save the idle spend.
  • Heavily used Linux runner fleets: a small hot pool can help. If a linux-small-style runner fleet is hit continuously through the workday and the first few seconds of cold launch are visible, hot = 1 or hot = 2 smooths the experience without much waste, because the hot instances get reused constantly.
  • Windows runner fleets: stopped pools earn their keep. Windows boot plus AMI hydration is the slowest part of a fresh runner. A stopped pool pre-pays that one-time disk warmup so the only cost at assignment is starting an already-warmed instance.
  • GPU runner fleets: stopped pools earn their keep too. GPU AMIs are large and driver init is slow on cold boot. A stopped pool pays the EBS storage cost in exchange for skipping the slow first-boot path.

If a runner fleet does not fit one of the patterns above, leave hot = 0 and stopped = 0. Cold overflow is free.

Pool types

| Type | State before demand | Startup profile | Cost while idle |
| --- | --- | --- | --- |
| Hot | Running EC2 instance | Fastest, instance is already up | Full on-demand EC2 cost |
| Stopped | Warmed once, then stopped | Skips first-boot AMI hydration; still has to start EC2 | EBS storage cost only |
| Cold overflow | No standby instance | Full launch from scratch | Nothing |

Fleet uses ready hot instances first, then ready stopped instances, then cold overflow. Warm pool instances are created as on-demand EC2 capacity.

Basic pool

Add a schedule entry to a runner fleet:

fleets = {
  linux-small = {
    runner       = "small-x64"
    runner_group = "ci-standard"
    timezone     = "UTC"
    schedule = [
      {
        name = "default"
        hot  = 1
      },
    ]
  }
}

That keeps one hot instance for a heavily-used Linux runner fleet. For Windows or GPU, prefer stopped:

fleets = {
  windows-large = {
    runner       = "large-windows-x64"
    runner_group = "ci-standard"
    timezone     = "UTC"
    schedule = [
      {
        name    = "default"
        stopped = 2
      },
    ]
  }
}

Scheduled pool

Schedules are evaluated in the fleet’s timezone. Put more specific schedules before the default entry.

fleets = {
  linux-large = {
    runner       = "large-x64"
    runner_group = "ci-standard"
    timezone     = "Europe/Paris"
    schedule = [
      {
        name = "weekday-peak"
        hot  = 2
        match = {
          day  = ["monday", "tuesday", "wednesday", "thursday", "friday"]
          time = ["08:00", "19:00"]
        }
      },
      {
        name = "default"
        hot  = 0
      },
    ]
  }
}

Use an explicit default so the runner fleet has a defined off-hours policy. Set both counts to 0 to fall back to cold launches outside matched windows.

How pickup works

For each assigned job, Fleet prefers:

  1. a ready hot instance
  2. a ready stopped instance
  3. a cold EC2 launch through CreateFleet

max_launch_batch_size caps cold launches per CreateFleet attempt. It does not cap warm pool pickup, total runner fleet demand, or GitHub matrix concurrency.
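The preference order and the role of max_launch_batch_size can be sketched in Python as follows. This is an illustration under assumptions, not Fleet's implementation: the function name is hypothetical, and splitting the remaining cold launches into successive CreateFleet attempts of at most max_launch_batch_size is an assumed behavior.

```python
from collections import deque


def plan_pickup(job_ids: list, hot_ready: list, stopped_ready: list,
                max_launch_batch_size: int):
    """Assign each job to hot, then stopped, then cold capacity."""
    hot, stopped = deque(hot_ready), deque(stopped_ready)
    assignments, cold_jobs = {}, []
    for job in job_ids:
        if hot:
            assignments[job] = ("hot", hot.popleft())
        elif stopped:
            assignments[job] = ("stopped", stopped.popleft())
        else:
            assignments[job] = ("cold", None)
            cold_jobs.append(job)
    # The cap applies only to cold launches, never to warm pool pickup.
    cold_batches = [
        cold_jobs[i:i + max_launch_batch_size]
        for i in range(0, len(cold_jobs), max_launch_batch_size)
    ]
    return assignments, cold_batches
```

With five jobs, one ready hot instance, two ready stopped instances, and a batch size of 1, the first three jobs are served from the pool and the last two each get their own cold launch attempt.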

Troubleshooting

If a pool is configured but jobs still launch cold, check:

  • the workflow label targets the same fleet key and module environment
  • the schedule matches the current time in the runner fleet’s timezone
  • CloudWatch logs for the Fleet worker show the expected fleet_name and schedule
  • EC2 quotas allow the standby instances to exist
  • the runner family and image can launch on on-demand capacity in the selected subnets