
Cost control

Reduce RunsOn spend with spot tuning and right-sizing, and keep it in check with budgets, daily cost reports, cost-allocation tags, and automatic termination safeguards.

RunsOn takes cost control seriously, since you will be tempted to use beefy runners to expedite your workflows. This page covers both sides of the problem: reducing spend through spot tuning and right-sizing, and keeping it in check with a daily AWS budget, daily cost reports, cost-allocation tags, and automatic termination safeguards.

Optimizing for Fewer Spot Interruptions

Spot instances can be interrupted when AWS needs the capacity back, which can disrupt your workflows. Here are strategies to minimize spot interruptions:

Diversify Instance Types

The most effective way to reduce spot interruptions is to diversify your instance type selection:

.github/runs-on.yml
runners:
  my-custom-runner:
    # Include multiple families instead of restricting to one.
    family: ["m7", "c7", "r7"]
    # Use ram and cpu options to restrict the instance type among the wide family range.
    ram: [8, 16] # Specify a range instead of a single value
    cpu: [2, 8] # Allows more flexibility

Or using job label syntax:

runs-on: runs-on=${{ github.run_id }}/family=m7+c7+r7/ram=8+16/cpu=2+8
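If you generate workflow files programmatically, the label syntax above is easy to compose from option values. The following is a minimal sketch, not part of RunsOn: the helper name `build_runs_on_label` is hypothetical, and it only assumes what the example shows, namely that options are separated by `/` and multiple values are joined with `+`.

```python
def build_runs_on_label(run_id_expr, **options):
    """Compose a RunsOn-style job label from keyword options.

    Hypothetical helper: lists are joined with '+', options with '/',
    mirroring the label syntax shown in this page.
    """
    parts = [f"runs-on={run_id_expr}"]
    for key, value in options.items():
        if isinstance(value, bool):
            # YAML-style lowercase booleans, e.g. spot=false
            value = "true" if value else "false"
        elif isinstance(value, (list, tuple)):
            value = "+".join(str(v) for v in value)
        parts.append(f"{key}={value}")
    return "/".join(parts)


label = build_runs_on_label(
    "${{ github.run_id }}",
    family=["m7", "c7", "r7"],
    ram=[8, 16],
    cpu=[2, 8],
)
print(label)
```

Keyword order is preserved, so the printed label lists the options in the order they were passed.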

Use Capacity-Optimized Allocation Strategy

For better stability, switch the allocation strategy to capacity-optimized, which draws from the pools with the most available capacity:

.github/runs-on.yml
runners:
  my-custom-runner:
    spot: capacity-optimized # or spot: co

Or per job:

runs-on: runs-on=${{ github.run_id }}/spot=capacity-optimized

To learn more about the allocation strategies, see Spot Allocation Strategies.

Configure the Spot Circuit Breaker

For interruption-heavy workloads, enable and tune the spot circuit breaker so Flex automatically falls back to on-demand once interruptions spike. See Automatic spot disablement for how it works and how to tune its sensitivity.

Consider Regional Availability

Spot availability varies by region. If you consistently face interruptions:

  1. Try deploying RunsOn in a different AWS region with better spot capacity
  2. Monitor spot interruption rates across regions
  3. Use multi-region deployments for critical workloads (one RunsOn stack per region, and use the region job label to select the region)

Balance Cost vs. Stability

For workloads where interruptions are particularly disruptive:

  • Consider using a wider range of instance types.
  • Accept slightly higher costs for better stability with the capacity-optimized allocation strategy.
  • For critical jobs, use on-demand instances instead of spot (spot=false).

By implementing these strategies, you can significantly reduce spot interruptions while still maintaining cost efficiency.

Reducing cost

How do I maximize savings when using spot?

If your workflows do not require high-performance runners, use the spot=lowest-price allocation strategy in your configuration. This prioritizes cost over performance by selecting the cheapest available spot instance that meets your requirements.

# In your runs-on.yml file
runners:
  my-custom-runner:
    spot: lowest-price

Or per job:

jobs:
  build:
    runs-on: runs-on=${{ github.run_id }}/spot=lowest-price

What instance families should I include for better cost efficiency?

Include a wide range of instance families, especially the more common ones like m7 variants:

family: ["m7a", "m7i", "r7i", "c7i", "c7a", "r7a", "t3", "t3a"]

Or using the job label syntax:

runs-on: runs-on=${{ github.run_id }}/family=m7a+m7i+r7i+c7i+c7a+r7a+t3+t3a

How should I specify RAM and CPU requirements?

Use ranges instead of listing every possible value:

# Instead of listing every value
ram: [4, 512] # This specifies a range from 4GB to 512GB
cpu: [2, 128] # This specifies a range from 2 to 128 cores

Or using the job label syntax:

runs-on: runs-on=${{ github.run_id }}/family=m7a+m7i+m7i-flex+r7i+c7i+c7a+r7a+t3+t3a/ram=4+512/cpu=2+128

Can I set default spot configurations for all jobs?

Yes, you can specify spot configuration in your runs-on.yml file rather than in each individual job:

.github/runs-on.yml
runners:
  my-custom-runner:
    spot: lowest-price
    family: ["m7a", "m7i", "r7i", "c7i", "c7a", "r7a", "t3", "t3a"]
    ram: [4, 512]
    cpu: [2, 128]

This can be further simplified using wildcard syntax:

.github/runs-on.yml
runners:
  my-custom-runner:
    spot: lowest-price
    family: ["m7*", "c7*", "r7*", "t3*"]
    ram: [4, 512]
    cpu: [2, 128]

AWS will automatically select the cheapest instance at the time of launch that meets your requirements.
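To get a feel for how wildcards and ranges widen the candidate pool, here is a simplified model in Python. It is only an illustration and not how RunsOn or AWS actually performs the selection: the `CATALOG` list is a tiny hand-picked sample, the `candidates` function is hypothetical, and wildcard matching is approximated with `fnmatch`.

```python
from fnmatch import fnmatch

# Small illustrative sample: (instance type, vCPUs, RAM in GiB).
CATALOG = [
    ("t3.medium", 2, 4),
    ("m7i.large", 2, 8),
    ("c7i.xlarge", 4, 8),
    ("r7i.large", 2, 16),
    ("m7a.2xlarge", 8, 32),
    ("m6i.large", 2, 8),
]


def candidates(families, ram, cpu, catalog=CATALOG):
    """Return instance types whose family matches a pattern and whose
    RAM and vCPU count fall inside the inclusive [min, max] ranges."""
    ram_min, ram_max = ram
    cpu_min, cpu_max = cpu
    selected = []
    for name, vcpus, ram_gib in catalog:
        family = name.split(".")[0]
        if not any(fnmatch(family, pattern) for pattern in families):
            continue
        if ram_min <= ram_gib <= ram_max and cpu_min <= vcpus <= cpu_max:
            selected.append(name)
    return selected


wide = candidates(["m7*", "c7*", "r7*", "t3*"], ram=[4, 512], cpu=[2, 128])
narrow = candidates(["m7*", "c7*", "r7*", "t3*"], ram=[8, 16], cpu=[2, 8])
print(len(wide), "candidates with wide ranges,", len(narrow), "with narrow ranges")
```

The wider the ranges, the more spot pools qualify, which gives the allocation strategy more room to find a cheap or stable instance.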

What results can I expect from these optimizations?

Users have reported significant cost reductions, in some cases cutting daily charges by 60%, after switching to the lowest-price strategy and including more instance families such as the m7 series.

Daily AWS budget

Starting with v3, RunsOn automatically provisions an AWS Budget for your stack. It is scoped to the RunsOn cost-allocation tag, so it only tracks spend from this stack’s resources — nothing else in your account counts against it.

  • Default limit: $10 USD per day, configurable via the AppBudgetDailyUsd CloudFormation parameter.
  • Notification: when actual spend exceeds 100% of the limit, an alert is published to the RunsOn SNS topic — so it reaches the same email (and Slack channel) as your other alerts.
  • Disable: set AppBudgetDailyUsd to 0 to skip creating the budget entirely.

This replaces the old daily-minutes alarm. For more on how alerts are delivered, see the Alerts page.
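If you manage the stack with scripts, changing the limit amounts to a stack update that overrides AppBudgetDailyUsd and leaves every other parameter untouched. The sketch below only builds the parameter list. The function name is hypothetical, and the assumption is that you would pass the result as `Parameters` to a CloudFormation stack update (for example boto3's `update_stack` with `UsePreviousTemplate=True`), which is not executed here.

```python
def budget_parameter_overrides(existing_keys, daily_usd):
    """Build a CloudFormation Parameters list that changes only
    AppBudgetDailyUsd and reuses the previous value of everything else.

    Hypothetical helper; a value of 0 disables the budget, as described above.
    """
    if daily_usd < 0:
        raise ValueError("daily_usd must be zero or positive")
    params = [
        {"ParameterKey": "AppBudgetDailyUsd", "ParameterValue": str(daily_usd)}
    ]
    for key in existing_keys:
        if key != "AppBudgetDailyUsd":
            params.append({"ParameterKey": key, "UsePreviousValue": True})
    return params


# Parameter names taken from this page; a real stack has more of them.
overrides = budget_parameter_overrides(
    ["EmailAddress", "AppBudgetDailyUsd", "CostReportsEnabled"], daily_usd=25
)
for entry in overrides:
    print(entry)
```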

Cost reports in your inbox

RunsOn sends a daily cost report for your stack’s resources, broken down by day over the last 15 days. It goes to the EmailAddress configured at installation time, every day at 00:05 UTC.


Reports are enabled by default and can be turned off with the CostReportsEnabled parameter.

Cost allocation and resource tags

Both the daily budget and the email reports filter on a single cost-allocation tag so they only ever count RunsOn spend. By default the tag key is stack and its value is your CloudFormation stack name — configurable via CostAllocationTag.

RunsOn attempts to activate this tag automatically in your account each day. Activation makes the tag available for filtering in budgets and Cost Explorer.

Beyond the stack-level tag, RunsOn tags every ephemeral resource (EC2 instances, disks, etc.) with the cost-allocation tag plus any custom tags you define. EC2 instances also carry workflow-related metadata tags like runs-on-workflow-job-name, runs-on-workflow-name, and runs-on-repo-full-name. Activate these as cost-allocation tags too, and you can break costs down by job, workflow, or repository in Cost Explorer. Newly activated tags can take up to 24 hours to appear in your cost reports.
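Activation of the workflow tags can also be scripted. As a rough sketch, the request below matches the shape expected by the Cost Explorer `UpdateCostAllocationTagsStatus` API (exposed in boto3 as `update_cost_allocation_tags_status`). The helper name is hypothetical and the call itself is left out, since it needs credentials for the account that owns billing.

```python
def activation_request(tag_keys):
    """Build the request body that marks each tag key as an active
    cost-allocation tag. Hypothetical helper; no AWS call is made."""
    return {
        "CostAllocationTagsStatus": [
            {"TagKey": key, "Status": "Active"} for key in tag_keys
        ]
    }


request = activation_request(
    [
        "runs-on-workflow-job-name",
        "runs-on-workflow-name",
        "runs-on-repo-full-name",
    ]
)
print(request)
# With boto3 installed and credentials configured, this could be sent with:
#   boto3.client("ce").update_cost_allocation_tags_status(**request)
```

A tag key generally has to appear in billing data before it can be activated, so run at least one job first.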

Full list of default tags applied to every instance

RunsOn applies the following default tags to EC2 instances. Some are specific to a product: the per-job workflow tags are set by the on-instance agent in Flex, while a few are only set in Fleet.

Tag Key                           Description
runs-on-workflow-job-started      Whether the workflow job has started
runs-on-workflow-job-name         Name of the GitHub Actions workflow job
runs-on-workflow-job-interrupted  Whether the job was interrupted
runs-on-workflow-job-conclusion   Final status of the workflow job
runs-on-workflow-run-id           Unique identifier for the workflow run
runs-on-workflow-name             Name of the GitHub Actions workflow
runs-on-head-branch               Branch that triggered the workflow run
runs-on-repo-full-name            Full repository name (owner/repo)
runs-on-org                       GitHub organization name
runs-on-labels                    Runner labels assigned to the job
runs-on-env                       Environment name
runs-on-is-private                Whether the runner was launched in a private subnet
runs-on-image-id                  Unique identifier for the image spec used
runs-on-ami-name                  Name of the AMI used
runs-on-runner-id                 Unique identifier for the runner spec used
runs-on-extras                    Additional configuration extras
runs-on-networking-stack          Networking stack configuration
runs-on-bucket-cache              Name of the RunsOn S3 cache bucket
runs-on-version                   RunsOn version used
runs-on-role-id                   IAM role identifier
runs-on-integrations-active       Active integrations status
runs-on-is-ghes                   Whether running on GitHub Enterprise Server
runs-on-stack-name                CloudFormation stack name

The runs-on-is-private tag (Flex) is useful if you want to break down usage between public and private runners, for example to understand which jobs are driving NAT gateway traffic.

Custom tags

Beyond the default tags, you can attach your own tags to RunsOn resources for finer-grained cost allocation and resource identification.

How to define custom tags and their precedence

Custom tags can be set in different places:

  1. Custom tags defined in the RunnerCustomTags stack parameters.

  2. Custom tags defined in a custom property of your GitHub repository settings. The custom property must be named runs-on-custom-tags, and its value is a comma-separated list of key=value pairs, e.g. key1=value1,key2=value2.

  3. Custom tags defined for a runner specification in the runs-on.yml file.

If the same tag key is defined in more than one place, the value from the higher-precedence source wins. The precedence (highest priority first) is:

  1. Custom property (runs-on-custom-tags)
  2. runs-on.yml runner tags
  3. Stack-level RunnerCustomTags
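The precedence order can be illustrated with a small merge routine. This is a sketch of the rule as stated above, not RunsOn's actual implementation. The function names are hypothetical, and skipping malformed pairs is an assumption made for the example.

```python
def parse_custom_property(value):
    """Parse a 'key1=value1,key2=value2' string into a dict.
    Pairs without '=' are skipped (an assumption for this sketch)."""
    tags = {}
    for pair in value.split(","):
        key, sep, val = pair.strip().partition("=")
        if sep and key.strip():
            tags[key.strip()] = val.strip()
    return tags


def resolve_tags(stack_tags, runner_tags, property_tags):
    """Merge tags from lowest to highest precedence so that later
    updates override earlier ones."""
    merged = dict(stack_tags)      # lowest: stack-level RunnerCustomTags
    merged.update(runner_tags)     # middle: runs-on.yml runner tags
    merged.update(property_tags)   # highest: runs-on-custom-tags property
    return merged


resolved = resolve_tags(
    stack_tags={"team": "platform", "env": "prod"},
    runner_tags={"team": "ci"},
    property_tags=parse_custom_property("team=payments,cost-center=42"),
)
print(resolved)
```

Here `team` is defined in all three places and ends up with the value from the repository custom property, while `env` survives from the stack level because nothing overrides it.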

Cost reports in AWS Cost Explorer

For a richer, interactive breakdown you can view RunsOn costs directly in AWS Cost Explorer, filtering on your cost-allocation tag (default key stack, value: the stack name).


The default runs-on-* tags are standard AWS resource tags on the EC2 instances launched for your jobs. If you activate them as cost-allocation tags in AWS Billing, you can drill into where your RunsOn spend comes from. The most useful tags to activate are usually:

  • runs-on-repo-full-name
  • runs-on-workflow-name
  • runs-on-workflow-job-name
  • runs-on-stack-name
  • runs-on-is-private

To enable them:

  1. Open the AWS Billing and Cost Management console and make sure Cost Explorer is enabled.
  2. Go to Cost allocation tags.
  3. Search for the runs-on-* tags that you want to report on, then choose Activate.
  4. If the AWS account where RunsOn is installed is a member account in AWS Organizations, do this from the organization’s management account (the top-level billing or payer account), not from the member account itself.
  5. Wait up to 24 hours for the tags to become active in billing data and show up in Cost Explorer.
  6. In Cost Explorer, group or filter by those tags to see costs per repository, workflow, or job.

Any custom tags you add through RunsOn can be activated the same way. For the AWS side of this flow, see the official guide on user-defined cost allocation tags.
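The same breakdown is available through the Cost Explorer API. As a hedged sketch, the function below builds a request in the shape accepted by `GetCostAndUsage` (boto3: `get_cost_and_usage`), filtered on the default stack tag and grouped by one of the runs-on-* tags. The function name and the stack name `runs-on` are placeholders, and the 15-day window simply mirrors the email report.

```python
from datetime import date, timedelta


def cost_by_tag_request(stack_name, group_tag, days=15, today=None,
                        allocation_tag="stack"):
    """Build a GetCostAndUsage request: daily unblended cost for one
    RunsOn stack, grouped by a cost-allocation tag. No AWS call is made."""
    end = today or date.today()          # End date is exclusive in Cost Explorer
    start = end - timedelta(days=days)
    return {
        "TimePeriod": {"Start": start.isoformat(), "End": end.isoformat()},
        "Granularity": "DAILY",
        "Metrics": ["UnblendedCost"],
        "Filter": {"Tags": {"Key": allocation_tag, "Values": [stack_name]}},
        "GroupBy": [{"Type": "TAG", "Key": group_tag}],
    }


request = cost_by_tag_request(
    "runs-on", "runs-on-repo-full-name", today=date(2025, 1, 16)
)
print(request["TimePeriod"])
# With boto3 installed and credentials configured, this could be sent with:
#   boto3.client("ce").get_cost_and_usage(**request)
```

Swap `group_tag` for runs-on-workflow-name or runs-on-workflow-job-name to change the level of the breakdown. Only tags that have been activated return data.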

AWS Config

If you have AWS Config enabled in your AWS account, with the default settings it records a configuration item for every resource created in your account, including every EC2 instance created by RunsOn. Each EC2 instance triggers at least 3 recorded events, which can quickly add up:

  • AWS EC2 Fleet
  • AWS EC2 Network Interface
  • AWS EC2 Volume

To avoid this, you should modify your AWS Config settings to skip recording for those events, in the AWS account where RunsOn is deployed.


You can also skip recording AWS EC2 Instance events if you have really high usage.
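If you configure AWS Config through the API rather than the console, the exclusion can be expressed as a recording group. The sketch below builds that structure in the shape used by `PutConfigurationRecorder`. The mapping of the three event names above to the resource types `AWS::EC2::EC2Fleet`, `AWS::EC2::NetworkInterface`, and `AWS::EC2::Volume` is an assumption on my part, and the function name is hypothetical.

```python
# Assumed AWS Config resource types for the events listed above.
RUNS_ON_NOISY_TYPES = [
    "AWS::EC2::EC2Fleet",
    "AWS::EC2::NetworkInterface",
    "AWS::EC2::Volume",
]


def recording_group(exclude_instances=False):
    """Build a recordingGroup that records everything except the
    resource types RunsOn creates in bulk. No AWS call is made."""
    types = list(RUNS_ON_NOISY_TYPES)
    if exclude_instances:
        # Optional, for very high usage.
        types.append("AWS::EC2::Instance")
    return {
        "allSupported": False,
        "includeGlobalResourceTypes": False,
        "exclusionByResourceTypes": {"resourceTypes": types},
        "recordingStrategy": {"useOnly": "EXCLUSION_BY_RESOURCE_TYPES"},
    }


group = recording_group()
print(group["exclusionByResourceTypes"]["resourceTypes"])
```

The resulting dictionary would go under the `recordingGroup` key of the recorder definition, alongside the recorder's existing name and role ARN.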

Automatic termination safeguards

To avoid dangling resources, every instance is bootstrapped with two watchdogs, in case GitHub doesn’t send the job-completion webhooks (this happens):

  • the instance terminates itself after 10 minutes if no workflow job has been scheduled on it;
  • the instance terminates itself after 12 hours, regardless of whether a job is still running. This ceiling is configurable via RunnerMaxRuntime (in minutes).

On the server side, a cleanup process runs continuously and terminates any instance that hasn’t been tagged runs-on-workflow-job-started=true after 25 minutes. That tag is set by the agent once a job starts processing, so this is a last-resort safety net for instances where the agent never started — e.g. a custom AMI with cloud-init disabled, or a network issue at boot.
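The interplay between the three safeguards can be summarized with a simplified model. This is only a reading of the rules described above, not RunsOn's code: the function, its parameters, and the returned labels are all hypothetical, and `watchdogs_active=False` stands in for an instance where the agent never started.

```python
IDLE_TIMEOUT_MIN = 10               # no job scheduled on the instance
SERVER_CLEANUP_MIN = 25             # runs-on-workflow-job-started tag never set
DEFAULT_MAX_RUNTIME_MIN = 12 * 60   # default RunnerMaxRuntime ceiling


def termination_reason(age_min, job_scheduled, started_tag,
                       watchdogs_active=True,
                       max_runtime_min=DEFAULT_MAX_RUNTIME_MIN):
    """Return which safeguard would terminate the instance at the given
    age in minutes, or None if it is allowed to keep running."""
    if watchdogs_active:
        if age_min >= max_runtime_min:
            return "max-runtime watchdog"
        if not job_scheduled and age_min >= IDLE_TIMEOUT_MIN:
            return "idle watchdog"
    if not started_tag and age_min >= SERVER_CLEANUP_MIN:
        return "server-side cleanup"
    return None


print(termination_reason(12, job_scheduled=False, started_tag=False))
print(termination_reason(30, job_scheduled=False, started_tag=False,
                         watchdogs_active=False))
print(termination_reason(600, job_scheduled=True, started_tag=True))
```

In this model the server-side cleanup only matters when the on-instance watchdogs did not fire, which matches its role as a last-resort safety net.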

See Job retries and housekeeping for the full set of server-side housekeeping behaviours.