
Spot pricing

How RunsOn launches runners on EC2 Spot capacity, and when to opt out.

What are spot instances?

Spot instances are a purchasing option that allows you to take advantage of unused EC2 capacity in the AWS cloud.

This can cut costs significantly, by up to 90% compared with the on-demand price.

However, spot capacity is not guaranteed: instances may be unavailable at launch, and AWS can reclaim a running instance, after a two-minute interruption notice, when it needs the capacity back.

Usage

Flex

In Flex, disable Spot per job with spot=false:

jobs:
  ci:
    runs-on: runs-on=${{ github.run_id }}/runner=2cpu-linux-x64
  deploy:
    runs-on: runs-on=${{ github.run_id }}/runner=2cpu-linux-x64/spot=false

Test Flex interruption handling against a running Spot job with the CLI:

AWS_PROFILE=runs-on-admin roc interrupt "$JOB_URL" --wait

Fleet

In Fleet, spot behavior belongs in the Terraform-owned runner fleet definition. Publish a separate on-demand runner fleet for workflows that cannot tolerate interruption, and target it from workflows via its fleet label. See the Fleet installation guide for the supported Terraform inputs.

Default behavior

RunsOn always tries Spot first and falls back to on-demand if Spot capacity is unavailable. The penalty is a 2–3s launch delay on fallback.

Spot interruption is also rare for short jobs: AWS does not bill any Spot instance interrupted within its first hour, and the request is rejected up front if capacity is critical. That makes Spot a safe default for almost all CI work under one hour without critical side effects. See AWS billing for interrupted Spot instances for the exact rules.

Disable Spot when:

  • the workflow cannot be interrupted (deployment, migration, release signing, long-running jobs with poor checkpointing);
  • you pin a specific instance type that often hits capacity ceilings and want to skip the Spot retry delay.

Spot allocation strategies

Supported allocation strategies on RunsOn include:

  • spot=price-capacity-optimized or spot=pco: This strategy balances between price and capacity to optimize cost while minimizing the risk of interruption.
  • spot=lowest-price or spot=lp: This strategy focuses on obtaining the lowest possible price, which may increase the risk of interruption.
  • spot=capacity-optimized or spot=co: This strategy prioritizes the allocation of instances from the pools with the most available capacity, reducing the likelihood of interruption.

For more details on each strategy, refer to the official AWS documentation on Spot Instance allocation strategies. For guidance on which strategy to pick when trading cost against stability, see Cost control.
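If workflow files are generated by a script, the short and long strategy names listed above can be normalized before building the runs-on label. The sketch below is only an illustration: the helper name `runs_on_label` and its parameters are hypothetical, and the label layout is copied from the Flex examples on this page.

```python
# Short aliases and their long forms, as listed in this section.
SPOT_ALIASES = {
    "pco": "price-capacity-optimized",
    "lp": "lowest-price",
    "co": "capacity-optimized",
}

# Values this sketch accepts: the long strategy names plus "false".
ALLOWED_SPOT_VALUES = set(SPOT_ALIASES.values()) | {"false"}


def runs_on_label(runner: str, spot: str,
                  run_id_expr: str = "${{ github.run_id }}") -> str:
    """Build a Flex runs-on label string (hypothetical helper)."""
    value = SPOT_ALIASES.get(spot, spot)  # expand an alias if one was given
    if value not in ALLOWED_SPOT_VALUES:
        raise ValueError(f"unsupported spot value: {spot!r}")
    return f"runs-on={run_id_expr}/runner={runner}/spot={value}"


print(runs_on_label("2cpu-linux-x64", "pco"))
```

Expanding aliases in one place keeps generated workflows consistent, and rejecting unknown values catches typos before a job is queued.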

Automatic spot disablement

On Flex, RunsOn will automatically force spot to false in the following conditions:

  • anytime the job is retried (i.e. run attempt > 1);
  • whenever the spot instance quota is exhausted — Flex falls back to on-demand for the next 5 minutes and emits an email alert so you can request an EC2 spot quota increase;
  • if the spot circuit breaker (available since v2.6.7) is active, in which case Flex falls back to on-demand for the recovery duration configured on the circuit breaker.
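The three conditions above can be read as a single decision made at launch time. The following is a minimal sketch of that decision, not RunsOn's actual code: `JobContext`, `effective_spot`, and the order in which the checks run are assumptions made for illustration, and only the 5-minute quota window comes from the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

QUOTA_SNOOZE_MINUTES = 5  # on-demand window after a Spot quota error, per the text


@dataclass
class JobContext:
    """Hypothetical inputs to the decision; field names are illustrative."""
    run_attempt: int                            # 1 for the first run of a job
    minutes_since_quota_error: Optional[float]  # None if the quota was never hit
    breaker_active: bool                        # True while the circuit breaker is tripped


def effective_spot(requested_spot: bool, ctx: JobContext) -> Tuple[bool, str]:
    """Return (use_spot, reason) for a Flex job under the rules listed above."""
    if not requested_spot:
        return False, "spot=false requested by the job"
    if ctx.run_attempt > 1:
        return False, "job is a retry"
    if (ctx.minutes_since_quota_error is not None
            and ctx.minutes_since_quota_error < QUOTA_SNOOZE_MINUTES):
        return False, "spot quota recently exhausted"
    if ctx.breaker_active:
        return False, "spot circuit breaker is active"
    return True, "spot allowed"


print(effective_spot(True, JobContext(run_attempt=2,
                                      minutes_since_quota_error=None,
                                      breaker_active=False)))
```

Returning the reason alongside the boolean makes it easy to log why a job that requested Spot ended up on on-demand capacity.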

The spot circuit breaker trips to on-demand once spot interruptions exceed a threshold within a time window. Its value uses the format COUNT/WINDOW_MINUTES/RECOVERY_MINUTES:

  • COUNT: number of interruptions before the breaker trips;
  • WINDOW_MINUTES: time window, in minutes, over which interruptions are counted;
  • RECOVERY_MINUTES: time, in minutes, before spot is tried again.

Raising COUNT makes the breaker less sensitive, tolerating more interruptions before switching to on-demand.
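To make the three fields concrete, here is a toy model of such a breaker. It is a sketch under stated assumptions rather than RunsOn's implementation: the class and method names are hypothetical, time is passed in as plain minutes, the breaker is assumed to trip once COUNT interruptions fall inside the window, and the example value `2/10/30` is arbitrary.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional


@dataclass
class SpotCircuitBreaker:
    """Toy model of a COUNT/WINDOW_MINUTES/RECOVERY_MINUTES breaker (hypothetical)."""
    count: int
    window_minutes: float
    recovery_minutes: float
    _recent: Deque[float] = field(default_factory=deque)  # interruption timestamps
    _tripped_at: Optional[float] = None

    @classmethod
    def parse(cls, value: str) -> "SpotCircuitBreaker":
        parts = value.split("/")
        if len(parts) != 3:
            raise ValueError(
                f"expected COUNT/WINDOW_MINUTES/RECOVERY_MINUTES, got {value!r}")
        count, window, recovery = (int(p) for p in parts)
        if min(count, window, recovery) <= 0:
            raise ValueError("all three fields must be positive integers")
        return cls(count, window, recovery)

    def record_interruption(self, now: float) -> None:
        """Register an interruption at time `now` (minutes) and trip if needed."""
        self._recent.append(now)
        # Forget interruptions that have fallen out of the counting window.
        while self._recent and self._recent[0] <= now - self.window_minutes:
            self._recent.popleft()
        # Assumption: reaching COUNT interruptions inside the window trips the breaker.
        if self._tripped_at is None and len(self._recent) >= self.count:
            self._tripped_at = now

    def use_spot(self, now: float) -> bool:
        """False while tripped; True again once the recovery period has elapsed."""
        if self._tripped_at is None:
            return True
        if now - self._tripped_at < self.recovery_minutes:
            return False
        # Recovery elapsed: reset the breaker and try Spot again.
        self._tripped_at = None
        self._recent.clear()
        return True


breaker = SpotCircuitBreaker.parse("2/10/30")  # arbitrary example value
breaker.record_interruption(now=0)
breaker.record_interruption(now=5)   # second interruption within 10 minutes trips it
print(breaker.use_spot(now=6))       # False: inside the 30-minute recovery period
print(breaker.use_spot(now=35))      # True: recovery elapsed, Spot is tried again
```

In this model, raising COUNT to 3 would leave the breaker closed after the same two interruptions, which matches the reduced sensitivity described above.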

Fleet currently does only the basic launch-time Spot-then-on-demand fallback. Interruption tracking, quota snoozing, and the spot circuit breaker are Flex-only — when a Fleet runner needs guaranteed on-demand capacity, publish a separate on-demand runner fleet.

Flex vs Fleet

Same Spot-first default; the difference is where the override lives.

| Aspect | Flex | Fleet |
| --- | --- | --- |
| Default behavior | Spot first, on-demand fallback | Spot first, on-demand fallback |
| Disable per job | spot=false workflow label | — (workflow can’t override the fleet) |
| Disable per fleet | n/a | Configured in the Terraform runner fleet |
| Allocation strategy | spot=pco / lp / co label | Configured in the Terraform runner fleet |
| Auto-disable on retry | ✓ | — (no run-attempt-based on-demand rerun) |
| Auto-disable on quota | ✓ | — (no MaxSpotInstanceCountExceeded snooze) |
| Spot circuit breaker | ✓ | — (no spot interruption tracker yet) |
| Test with roc interrupt | ✓ | ✓ (triggers interruption, but no recovery to observe) |