Saved around 75% of our costs, and tests now run 5× faster on gigantic spot instances.
Tim Dumol
Founding Engineer & Chief of Infrastructure, Expedock
//self-hosted github actions runners · your aws account
You went through ARC. You tuned the autoscaler, fought the controller, and gave up on Windows. RunsOn is the version that just runs: ephemeral EC2 in your own account, one CloudFormation or Terraform stack. No Kubernetes, no controller to babysit, and any EC2 instance you choose for your job.
# the label is a query — each line one constraint, resolved at launch
jobs:
  build:
    runs-on:
      - runs-on=${{ github.run_id }}
      - cpu=2
      - family=c7i+m7i
      - image=ubuntu24-full-x64
      - volume=80gb:gp3:125mbs
    steps: [...]
running in production, every single day
RunsOn launches and tears down real EC2 across hundreds of AWS accounts — roughly 1.5% of all GitHub Actions runs worldwide go through it.
jobs run every day — ephemeral runners launched and destroyed, one box per job.
vCPUs put to work every day — from 1 to 896 per runner, across x64 · arm64 · gpu.
the hard parts, handled
If you've self-hosted before, you know the runner is the easy 20%. The other 80% — the parts below — is why you're reading this page. RunsOn ships them as defaults, not as a weekend project.
With ARC, every instance shape is another scale set — a new deployment, new autoscaler, new thing to forget about. Need a fat build host for one job? Provision it ahead of time and hope it's warm.
Here the instance is the job. The runs-on label is a query — family, cpu, ram, image, volume — resolved against live spot capacity at launch. No pools to size.
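To make the "label is a query" idea concrete, here is a minimal Python sketch that reads a label the way the snippets on this page write it: `key=value` constraints, given either as a YAML list or as one slash-separated string. It is only an illustration of the grammar as it appears here, not RunsOn's actual parser. The function name `parse_runs_on` is hypothetical, and treating `+` as a plain value separator and a bare token (such as `nested-virt`) as a boolean flag are assumptions.

```python
from typing import Dict, Iterable, List, Union

LabelValue = Union[bool, List[str]]


def parse_runs_on(label: Union[str, Iterable[str]]) -> Dict[str, LabelValue]:
    """Split a runs-on label into constraints (illustrative sketch only)."""
    # Accept either the slash-separated string or the YAML list form.
    parts = label.split("/") if isinstance(label, str) else list(label)
    constraints: Dict[str, LabelValue] = {}
    for part in parts:
        part = part.strip()
        if not part:
            continue
        key, sep, value = part.partition("=")
        # Assumption: a bare token without "=" is a boolean flag,
        # and "+" separates several values for the same key.
        constraints[key] = value.split("+") if sep else True
    return constraints


if __name__ == "__main__":
    query = parse_runs_on("runs-on=123/family=c8i+m8i+r8i/nested-virt")
    for key, value in query.items():
        print(key, value)
```

Both spellings reduce to the same set of constraints, which is why moving a job to a different instance shape is a one-line change in the workflow rather than a new pool.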
# 64 vCPU Graviton + local NVMe (c7gd ships it, auto-mounted)
runs-on: runs-on=${{ github.run_id }}/family=c7gd/cpu=64/image=ubuntu24-full-arm64

# next job wants a GPU box — same grammar, no new infra
runs-on: runs-on=${{ github.run_id }}/family=g6/image=ubuntu24-gpu-x64

GitHub's cache caps at 10 GB and lives across the internet. Self-hosting it means standing up — and securing — your own cache service, then keeping it alive.
Flip extras=s3-cache and the magic cache backs actions/cache with an S3 bucket in your account — same region, same VPC, no size cap. Nothing else in the workflow changes.
# turn on the S3-backed magic cache for this job
runs-on: runs-on=${{ github.run_id }}/cpu=8/extras=s3-cache

# then keep using actions/cache exactly as before —
# transparently proxied to S3, no 10GB cap.
- uses: actions/cache@v4

On ephemeral runners the layer cache evaporates with the box. You wire up registry caching, then watch every build pull every layer back over the network anyway.
Flip extras=ecr-cache and RunsOn stands up an ephemeral ECR registry in your account — point buildx at type=registry and layers stay in-region across jobs. Or snapshot /var/lib/docker wholesale with runs-on/snapshot@v1 and restore the daemon block-for-block on the next run.
# in-account ephemeral registry — shared layer cache across jobs
runs-on: runs-on=${{ github.run_id }}/runner=2cpu-linux-x64/extras=ecr-cache

- uses: docker/build-push-action@v6
  with:
    cache-from: type=registry,ref=${{ env.RUNS_ON_ECR_CACHE }}:cache
    cache-to: type=registry,ref=${{ env.RUNS_ON_ECR_CACHE }}:cache,mode=max

# …or block-level snapshot the whole docker dir between runs
- uses: runs-on/snapshot@v1
  with: { path: /var/lib/docker }

Most hosted runners forbid nested virtualization outright. KVM and Hyper-V need nested-capable instance types — on AWS that's the m8i, c8i, and r8i families — exactly the thing a managed pool won't hand you.
Nested virt is a label, on both sides of the house: KVM for Linux, Hyper-V for Windows. Android emulators, VM-based e2e suites, Windows containers — they run.
# linux KVM — android emulator, firecracker, e2e VMs
runs-on: runs-on=${{ github.run_id }}/family=c8i+m8i+r8i/nested-virt

# windows Hyper-V — a nested-capable family + windows image
runs-on: runs-on=${{ github.run_id }}/family=m8i/image=windows25-full-x64/nested-virt

You want per-job cost and timing without standing up a metrics pipeline to get it — and without exporting any of it to a third party.
Every job reports the instance it landed on, its duration, the live spot price, and the total it cost. Metrics to CloudWatch, traces over OTEL, logs in your own account. Nothing leaves the VPC.
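As a rough sanity check on how those numbers relate, the sketch below multiplies billed time by the hourly spot price. This is an assumption about the arithmetic, not a description of RunsOn's accounting, which may include other items such as storage; the function name `job_cost` is hypothetical. It does reproduce the figures in the sample job summary on this page (4m 12s at $0.412 / hr).

```python
def job_cost(duration_seconds: float, spot_price_per_hour: float) -> float:
    """Estimated compute cost: run time in hours times the hourly price."""
    return duration_seconds / 3600 * spot_price_per_hour


if __name__ == "__main__":
    # 4m 12s at $0.412 / hr, the figures from the sample job summary.
    duration = 4 * 60 + 12
    print(f"${job_cost(duration, 0.412):.3f}")  # prints $0.029
```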
# opt into runner-side OpenTelemetry (traces + host metrics)
runs-on: runs-on=${{ github.run_id }}/cpu=8/extras=otel

# and every job summary gets this, automatically:
instance  c7g.16xlarge (spot, eu-west-1c)
duration  4m 12s
spot      $0.412 / hr
job cost  $0.029

retry=when-interrupted  private=true  debug=true  extras=tmpfs+efs  pool=small-x64  ami=ami-0…

sixty seconds
The Kubernetes cluster, the controller, the autoscaler — then the same job on ephemeral EC2 in your own account. Watch the moving parts fall away.
the reasons teams stay
Teams switch for cost. They stay because builds get faster, setup is uneventful, and everything keeps running inside their own AWS account.
After benchmarking a lot of tools, it's the best. We run costs divided by 4 — thousands of jobs per day.
Corentin Smith
CTO, Dashdoc
Reduced GitHub Actions costs by 70%, and CI runtime improved by up to 80%. A clear win with virtually no downside.
Théophile Dunoyer de Segonzac
Lead DevOps Engineer, Lingoda
Less than 10 min to test, install and use. Cache download speed is blazing fast.
Christopher Brookes
SRE, Choose
install
The easy path is one CloudFormation template. Manage infra as code? Deploy the same stack with the official Terraform module instead. Either way, nothing of yours leaves your account.
One CloudFormation template — or the official Terraform module for advanced, IaC-managed setups. VPC, S3 cache bucket, IAM, and the scheduler come up together.
Install the GitHub App on your org or repo. RunsOn registers ephemeral runners on demand — nothing stays running idle.
Swap a single line in your workflow. Every existing action, cache step, and secret keeps working untouched.
# the entire migration, in one diff
- runs-on: ubuntu-latest
+ runs-on: runs-on=${{ github.run_id }}/runner=2cpu-linux-x64

aws cloudformation create-stack --stack-name runs-on \
  --template-url https://runs-on.s3.eu-west-1.amazonaws.com/cloudformation/template-v3.1.0.yaml \
  --capabilities CAPABILITY_IAM CAPABILITY_AUTO_EXPAND \
  --parameters \
    ParameterKey=GithubOrganization,ParameterValue=your-org \
    ParameterKey=LicenseKey,ParameterValue=your-license-key \
    ParameterKey=EmailAddress,ParameterValue=you@example.com

The one-click path — or launch the same template from the AWS console. Fill in your org, license key, and notification email.
# main.tf — RunsOn Flex, managed as code
module "runs_on" {
  source  = "runs-on/runs-on/aws//flex"
  version = "v3.1.0"

  github_organization = "your-org"
  license_key         = "your-license-key"
  email               = "you@example.com"

  # further networking and sizing settings — see the docs
}

$ terraform init && terraform apply