Rate limits and scale

At tens of thousands of jobs per day, two sets of limits come into play: your AWS account’s EC2 API throttling and service quotas, and GitHub’s API rate limits. RunsOn paces itself to stay within both, so a busy stack does not exhaust quotas shared with the rest of your account. This page covers what RunsOn does automatically, the few knobs you can turn, and the quotas only you can raise.

What RunsOn does automatically

You do not configure any of this — it is built in.

Paces every AWS API call. EC2 launch, terminate, start/stop, and CreateTags calls each go through a token-bucket limiter before hitting AWS. Defaults are intentionally low (for example, ~2 RunInstances/CreateFleet calls per second) because RunsOn cannot know how many other stacks or tools share the account’s EC2 API quota. The launch rate scales up automatically with the AppSize preset.
Meters GitHub API usage against the secondary rate limit. Calls are budgeted as points — 1 for reads (GET), 5 for writes (POST/PATCH/PUT/DELETE) — and held just under GitHub’s 15-points/second ceiling. Each GitHub App installation gets its own budget, and critical-path calls (like generating a runner’s registration token) take priority over background cleanup.
Retries transient failures. GitHub 5xx responses are retried with backoff, and GitHub secondary-rate-limit responses honor the Retry-After header within the runner’s boot window. EC2 throttling (RequestLimitExceeded) is retried by the AWS SDK on top of the pacing above.
Falls back when capacity runs out. If a spot launch hits your account’s spot quota (MaxSpotInstanceCountExceeded), RunsOn raises an alert, briefly snoozes spot, and launches on-demand instead so jobs still run. See Spot pricing.
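To make the pacing above concrete, here is a minimal token-bucket sketch with weighted costs. It is an illustration under stated assumptions, not RunsOn's implementation: the `TokenBucket` class, the `github_points` helper, and the bucket capacities are hypothetical names and values chosen for the example. Only the rates (about 2 launch calls per second, 15 GitHub points per second) and the 1/5 point costs come from the description above.

```python
import time


class TokenBucket:
    """Minimal token bucket: refills at `rate` tokens/second, up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the limiter can be tested
        self.tokens = capacity      # start full
        self.updated = clock()

    def _refill(self):
        # Credit tokens for the time elapsed since the last update.
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now

    def try_acquire(self, cost=1):
        """Take `cost` tokens if available; return False instead of blocking."""
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

    def wait_time(self, cost=1):
        """Seconds until `cost` tokens are available (0.0 if available now)."""
        self._refill()
        return max(0.0, (cost - self.tokens) / self.rate)


def github_points(method):
    """Point cost per request: 5 for writes, 1 for everything else."""
    writes = {"POST", "PATCH", "PUT", "DELETE"}
    return 5 if method.upper() in writes else 1


# Hypothetical buckets mirroring the numbers quoted above.
ec2_launches = TokenBucket(rate=2, capacity=2)      # ~2 launch calls/second
github_budget = TokenBucket(rate=15, capacity=15)   # 15 points/second ceiling
```

A caller would check `github_budget.try_acquire(github_points("POST"))` before a write and sleep for `wait_time(...)` when it returns `False`. A separate bucket per GitHub App installation would give each installation its own budget.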

What you can tune

AppSize: the main throughput lever

AppSize (CloudFormation) / app_size (Terraform) sets the control-plane worker concurrency, the ECS task CPU/memory, and the EC2 launch rate-limit assumptions together. It is the single knob for overall throughput. Current presets:

Preset	Webhook workers	Provisioning workers	Registration workers
`small`	4	4	2
`medium`	8	8	4
`high`	20	20	10
`xhigh`	40	40	20

As a rough EC2 API quota target, set both RunInstances and CreateFleet request-rate quotas to about half the Provisioning workers count:

AppSize	Suggested `RunInstances` quota	Suggested `CreateFleet` quota
`small`	2	2
`medium`	4	4
`high`	10	10
`xhigh`	20	20

The AWS account default for each of those request-rate quotas is usually 2, which is enough for small but too low for larger presets.
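The half-the-workers rule can be written out as a small calculation. This is a sketch only: the function names are hypothetical, the worker counts are copied from the preset table above, and the default of 2 is the usual account default mentioned above (check your own account's actual value).

```python
# Provisioning worker counts per AppSize preset, copied from the table above.
PROVISIONING_WORKERS = {"small": 4, "medium": 8, "high": 20, "xhigh": 40}


def suggested_request_rate(app_size):
    """Suggested RunInstances/CreateFleet quota: half the provisioning workers."""
    return PROVISIONING_WORKERS[app_size] // 2


def quota_shortfall(app_size, current_quota=2):
    """How far the current request-rate quota is below the suggestion (0 if enough)."""
    return max(0, suggested_request_rate(app_size) - current_quota)


for preset in PROVISIONING_WORKERS:
    print(preset, suggested_request_rate(preset), quota_shortfall(preset))
```

With the usual default of 2, only `small` shows no shortfall. Every larger preset needs a quota increase for both `RunInstances` and `CreateFleet`.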

AppGithubApiStrategy

Set AppGithubApiStrategy to conservative to spend fewer GitHub API rate-limit points: RunsOn stops eagerly de-registering finished runners (GitHub reaps them within 24h anyway). Recommended once you launch tens of thousands of jobs per day.

Advanced concurrency overrides

The per-stage worker counts can be overridden independently of AppSize via the RUNS_ON_APP_WEBHOOK_CONCURRENCY, RUNS_ON_APP_PROVISIONING_CONCURRENCY, and RUNS_ON_APP_REGISTRATION_CONCURRENCY environment variables (passed through extra_env_vars on the Terraform module). Prefer AppSize unless you have a specific bottleneck to target.

What you must do in your AWS account

RunsOn cannot raise your account’s quotas — these are on you, and they are the usual cause of “launches are slow” or “jobs fall back to on-demand”:

EC2 on-demand vCPU quota. Standard/compute families have a per-region running-vCPU limit. Raise it to your expected peak concurrency × vCPUs per runner.
EC2 spot quota. A low 'All Standard Spot Instance Requests' quota forces RunsOn to fall back to on-demand and costs you more. Raise it to match your spot usage.
EC2 API request rate. RunInstances/CreateFleet request-rate throttling is what AppSize assumes headroom for. Request a higher rate before moving past small.
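The on-demand vCPU sizing rule (peak concurrency × vCPUs per runner) is simple arithmetic. The sketch below extends it to a mix of runner sizes. The function name and the fleet numbers are hypothetical and only illustrate the calculation.

```python
def required_vcpu_quota(peak_runners_by_vcpus):
    """Sum peak concurrency x vCPUs per runner over each runner size.

    `peak_runners_by_vcpus` maps vCPUs per runner -> peak concurrent runners.
    """
    return sum(vcpus * count for vcpus, count in peak_runners_by_vcpus.items())


# Hypothetical peak: 200 concurrent 2-vCPU runners and 50 concurrent 8-vCPU runners.
fleet = {2: 200, 8: 50}
required = required_vcpu_quota(fleet)  # 200*2 + 50*8
print(f"Request a running-vCPU quota of at least {required}")
```

Size against your peak rather than your average, since the quota is what caps concurrent runners during bursts.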

Symptoms and where to look

Symptom	Likely cause	Where to look
`RequestLimitExceeded` in logs, bursts launch slowly	EC2 API request-rate quota too low for the `AppSize`	Troubleshooting → RequestLimitExceeded
Jobs run on-demand when you expected spot	Spot quota hit (`MaxSpotInstanceCountExceeded`)	SNS alerts, Spot pricing
Queue depth climbing under load	Worker concurrency saturated	Raise `AppSize`; CloudWatch dashboard
GitHub API errors at very high volume	Secondary rate limit	Set `AppGithubApiStrategy=conservative`