Rate limits and scale
How RunsOn paces AWS and GitHub API calls to protect shared quotas at high volume, what you can tune, and the AWS quotas you must raise yourself.
At tens of thousands of jobs per day, two sets of limits come into play: your AWS account’s EC2 API throttling and service quotas, and GitHub’s API rate limits. RunsOn paces itself to stay within both, so a busy stack does not exhaust quotas shared with the rest of your account. This page covers what RunsOn does automatically, the few knobs you can turn, and the quotas only you can raise.
What RunsOn does automatically
You do not configure any of this — it is built in.
- Paces every AWS API call. EC2 launch, terminate, start/stop, and CreateTags calls each go through a token-bucket limiter before hitting AWS. Defaults are intentionally low (for example, ~2 RunInstances/CreateFleet calls per second) because RunsOn cannot know how many other stacks or tools share the account’s EC2 API quota. The launch rate scales up automatically with the AppSize preset.
- Meters GitHub API usage against the secondary rate limit. Calls are budgeted as points, 1 for reads (GET) and 5 for writes (POST/PATCH/PUT/DELETE), and held just under GitHub’s 15-points/second ceiling. Each GitHub App installation gets its own budget, and critical-path calls (like generating a runner’s registration token) take priority over background cleanup.
- Retries transient failures. GitHub 5xx responses are retried with backoff, and GitHub secondary-rate-limit responses honor the Retry-After header within the runner’s boot window. EC2 throttling (RequestLimitExceeded) is retried by the AWS SDK on top of the pacing above.
- Falls back when capacity runs out. If a spot launch hits your account’s spot quota (MaxSpotInstanceCountExceeded), RunsOn raises an alert, briefly snoozes spot, and launches on-demand instead so jobs still run. See Spot pricing.
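To make the pacing above concrete, here is a minimal token-bucket sketch in Python. It is an illustration of the general technique, not RunsOn's implementation: the class name, the `github_points` helper, and the burst capacities are assumptions, and only the rates (~2 launch calls per second, 15 points per second, 1 point per read, 5 per write) come from the text.

```python
import time


class TokenBucket:
    """Token bucket: tokens refill at `rate` per second, capped at `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock            # injectable so the limiter can be tested
        self.tokens = capacity        # start full (assumption)
        self.updated = clock()

    def _refill(self):
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now

    def wait_time(self, cost=1.0):
        """Seconds until `cost` tokens are available (0.0 if available now)."""
        self._refill()
        if self.tokens >= cost:
            return 0.0
        return (cost - self.tokens) / self.rate

    def try_acquire(self, cost=1.0):
        """Spend `cost` tokens if available; return whether the call may go."""
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


WRITE_METHODS = {"POST", "PATCH", "PUT", "DELETE"}


def github_points(method):
    """Point cost described above: 5 for writes, 1 for reads."""
    return 5 if method.upper() in WRITE_METHODS else 1


# Hypothetical limiters using the rates quoted in the text.
ec2_launch = TokenBucket(rate=2, capacity=2)    # ~2 RunInstances/CreateFleet per second
github_api = TokenBucket(rate=15, capacity=15)  # 15 points/second ceiling

if github_api.try_acquire(github_points("DELETE")):
    pass  # the write call would be sent here
```

A caller that cannot acquire tokens would sleep for `wait_time(cost)` and try again. Keeping one bucket per GitHub App installation, as the text describes, would mean one `TokenBucket` instance per installation.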
What you can tune
AppSize — the main throughput lever
AppSize (CloudFormation) / app_size (Terraform) sets the control-plane worker concurrency, the ECS task CPU/memory, and the EC2 launch rate-limit assumptions together. It is the single knob for overall throughput. Current presets:
| Preset | Webhook workers | Provisioning workers | Registration workers |
|---|---|---|---|
| small | 4 | 4 | 2 |
| medium | 8 | 8 | 4 |
| high | 20 | 20 | 10 |
| xhigh | 40 | 40 | 20 |
AppGithubApiStrategy
Set AppGithubApiStrategy to conservative to spend fewer GitHub API tokens — RunsOn stops eagerly de-registering finished runners (GitHub reaps them within 24h anyway). Recommended once you launch tens of thousands of jobs per day.
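A rough back-of-the-envelope calculation shows why this matters at volume. The sketch below assumes one write call per finished runner for de-registration and an even spread of jobs across the day; both are simplifying assumptions, and the function name is hypothetical. The 15 points/second ceiling and the 5-point write cost are the figures quoted earlier on this page.

```python
SECONDS_PER_DAY = 24 * 60 * 60
POINTS_PER_SECOND = 15  # secondary-rate-limit ceiling quoted on this page
WRITE_COST = 5          # points per POST/PATCH/PUT/DELETE


def deregistration_share(jobs_per_day, calls_per_job=1):
    """Fraction of the daily point ceiling spent on eager de-registration.

    Assumes `calls_per_job` write calls per finished runner (an assumption,
    not a documented figure) and jobs spread evenly over 24 hours.
    """
    daily_ceiling = POINTS_PER_SECOND * SECONDS_PER_DAY  # 1,296,000 points
    return jobs_per_day * calls_per_job * WRITE_COST / daily_ceiling


print(f"{deregistration_share(50_000):.1%}")  # prints "19.3%"
```

Real traffic is bursty, so the share consumed during peak minutes would be higher than this daily average suggests.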
Advanced concurrency overrides
The per-stage worker counts can be overridden independently of AppSize via the RUNS_ON_APP_WEBHOOK_CONCURRENCY, RUNS_ON_APP_PROVISIONING_CONCURRENCY, and RUNS_ON_APP_REGISTRATION_CONCURRENCY environment variables (passed through extra_env_vars on the Terraform module). Prefer AppSize unless you have a specific bottleneck to target.
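The precedence described above (an environment variable, when set, replaces the preset value for that stage only) can be sketched as follows. This is an illustration, not RunsOn's code: the `effective_concurrency` function and the dictionary layout are hypothetical, while the preset numbers and variable names are taken from this page.

```python
import os

# Worker counts per AppSize preset, copied from the presets table.
PRESETS = {
    "small": {"webhook": 4, "provisioning": 4, "registration": 2},
    "medium": {"webhook": 8, "provisioning": 8, "registration": 4},
    "high": {"webhook": 20, "provisioning": 20, "registration": 10},
    "xhigh": {"webhook": 40, "provisioning": 40, "registration": 20},
}

# Override variable for each stage, as named in the text.
ENV_VARS = {
    "webhook": "RUNS_ON_APP_WEBHOOK_CONCURRENCY",
    "provisioning": "RUNS_ON_APP_PROVISIONING_CONCURRENCY",
    "registration": "RUNS_ON_APP_REGISTRATION_CONCURRENCY",
}


def effective_concurrency(app_size, env=None):
    """Start from the preset, then apply any per-stage override that is set."""
    env = os.environ if env is None else env
    result = dict(PRESETS[app_size])  # copy so the preset table is not mutated
    for stage, var in ENV_VARS.items():
        raw = env.get(var)
        if raw:  # unset or empty means "keep the preset value"
            result[stage] = int(raw)
    return result


# Example: medium preset with only the provisioning stage raised.
print(effective_concurrency("medium", {"RUNS_ON_APP_PROVISIONING_CONCURRENCY": "16"}))
```

Overriding a single stage this way leaves the other two at their preset values, which fits the advice to reach for an override only when one stage is the known bottleneck.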
What you must do in your AWS account
RunsOn cannot raise your account’s quotas. Raising them is up to you, and quotas that are too low are the usual cause of “launches are slow” or “jobs fall back to on-demand”.
Symptoms and where to look
| Symptom | Likely cause | Where to look |
|---|---|---|
| RequestLimitExceeded in logs, bursts launch slowly | EC2 API request-rate quota too low for the AppSize | Troubleshooting → RequestLimitExceeded |
| Jobs run on-demand when you expected spot | Spot quota hit (MaxSpotInstanceCountExceeded) | SNS alerts, Spot pricing |
| Queue depth climbing under load | Worker concurrency saturated | Raise AppSize; CloudWatch dashboard |
| GitHub API errors at very high volume | Secondary rate limit | Set AppGithubApiStrategy=conservative |