Flex vs Fleet - RunsOn

RunsOn is the umbrella product for GitHub Actions runners in your AWS account. It has two modes: RunsOn Flex and RunsOn Fleet.

They are complementary, not exclusive. Most teams start with Flex because it gives workflow authors precise, per-job control over the runner. Platform teams add Fleet on top when the same runner capabilities should be published as a small set of standardized organization or enterprise runner fleets that workflow authors cannot edit.

Quick decision

If you want…	Choose
Each workflow to pick its own CPU, RAM, architecture, image, disk, cache, or networking	Flex
A platform team to publish a small set of approved runner fleets, governed in Terraform	Fleet
Pull-based scale-set polling instead of GitHub webhooks into your control plane	Fleet
The narrowest GitHub permissions — only a self-hosted runners scope, no repo admin or webhook	Fleet
GitHub `strategy.max-parallel` to throttle large matrix jobs	Fleet
Both: shared runner fleets for common jobs, dynamic labels for one-offs	Both

Use both

The most common pattern in larger orgs:

Fleet publishes 3-6 named runner fleets that cover the bulk of CI traffic (`linux-small`, `linux-large`, `linux-gpu`, `windows-large`, etc.), targeted from workflows as `runs-on: runs-on/fleet=<fleet-name>/env=<env>`. Runner fleet shape, image, and capacity are reviewed in Terraform.
Flex stays available for workflows that need a one-off configuration: an unusual instance family, custom AMI, nested virtualization, or a disk size the runner fleets don’t cover.

Both share one license and one AWS footprint. Repositories can target runner fleets by default and reach for Flex labels only when they actually need to.
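A single workflow that follows this pattern might look like the sketch below. It is illustrative, not an official example: the fleet name `linux-large` comes from the list above, while the `production` environment value, the job names, the `make` targets, and the Flex label keys (`family`, `cpu`, `disk`) are assumptions to check against the Flex label reference.

```yaml
# .github/workflows/ci.yml (illustrative sketch)
name: ci
on: [push]

jobs:
  # Common case: target a Terraform-governed runner fleet by name.
  test:
    runs-on: runs-on/fleet=linux-large/env=production  # env value is hypothetical
    steps:
      - uses: actions/checkout@v4
      - run: make test  # placeholder command

  # One-off: describe the runner in the label itself (Flex).
  # The label keys and values below are assumed, not taken from this page.
  storage-benchmark:
    runs-on: runs-on=${{ github.run_id }}/family=i4i/cpu=8/disk=large
    steps:
      - uses: actions/checkout@v4
      - run: make bench  # placeholder command
```

The first job carries no runner details at all, so changing its instance type or image is a Terraform review rather than a workflow edit. The second job keeps its unusual requirements next to the code that needs them.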

Same capabilities, different ownership

Many runner features exist in both modes. What changes is where the setting lives: in Flex, the workflow’s label selects each feature per job; in Fleet, the platform team enables it on a runner fleet in Terraform, and workflows can only target the fleet by name.

Examples that work this way:

`extras = ["s3-cache"]` for the magic S3 actions cache
`extras = ["tmpfs"]` for tmpfs-backed `/home/runner`, `/tmp`, and `/var/lib/docker`
`nested-virt = true` for nested virtualization (KVM, Hyper-V workloads)
Custom AMIs through the images catalog and each runner’s image value
Private networking through stack or runner-fleet networking configuration

The shape of the choice is the same; the question is who owns it.

For the full mapping at the Terraform level, see the Fleet concepts map.

RunsOn Flex

A workflow label describes the runner a job needs. RunsOn launches a matching ephemeral EC2 instance in your AWS account, and the instance terminates after the job.

Flex shines when:

the right answer changes per job (architecture, image, disk, GPU, networking, cache extras)
workflow authors should make the call, not the platform team
you need access to the full menu of EC2 families and runner sizes
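As a rough sketch of "the right answer changes per job", the workflow below lets each matrix entry describe its own runner. The label keys (`family`, `cpu`, `image`), the instance families, and the image names are assumptions for illustration; the Flex label reference is the authority on the exact syntax.

```yaml
# .github/workflows/build.yml (illustrative sketch)
name: build
on: [push]

jobs:
  build:
    strategy:
      matrix:
        include:
          - arch: x64
            family: m7i    # assumed x86_64 instance family
          - arch: arm64
            family: m7g    # assumed Graviton instance family
    # Each matrix job gets its own runner shape; all label keys are assumed.
    runs-on: runs-on=${{ github.run_id }}/family=${{ matrix.family }}/cpu=4/image=ubuntu24-full-${{ matrix.arch }}
    steps:
      - uses: actions/checkout@v4
      - run: uname -m      # prints the architecture of the launched runner
```

The workflow author owns every choice here, which is the point of Flex: no platform-team change is needed to add a third architecture or a larger size.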

How Flex is wired

Flex is a software stack you install in your own AWS account. GitHub sends workflow events to the stack through API Gateway and Lambda ingress, an ECS worker turns those events into queued runner launches, and each job gets a fresh ephemeral EC2 instance that registers with GitHub, runs one job, and terminates. Supporting services include SQS queues for control-plane work, DynamoDB and Secrets Manager for state and GitHub credentials, an S3 cache bucket, and CloudWatch Logs and SNS for logs, alerts, and cost reports. In private mode, Flex adds a NAT Gateway with a static egress IP and an S3 gateway VPC endpoint.

Figure: RunsOn Flex architecture. Everything runs in your AWS account.

Start with the Flex installation guide.

RunsOn Fleet

A platform team defines runner fleets in Terraform. Repositories target those runner fleets with stable organization or enterprise runner labels. The runner fleet owns the runner shape, image, networking, and capacity, and workflow code only chooses which runner fleet to use.

Fleet shines when:

one CI platform team supports many repositories or organizations
runner choices belong in code review, not in `.github/workflows/*.yml`
you need scale-set features like `strategy.max-parallel`, or you want no public webhook endpoint in front of the control plane
you have strict security requirements: Fleet’s GitHub credential is scoped only to managing self-hosted runners (an org-level self-hosted runners write permission in organization mode, or an enterprise self-hosted runners scope in enterprise mode), with no repository administration, code access, or inbound webhook
one legal entity owns multiple GitHub organizations under a single license
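The `strategy.max-parallel` point above can be sketched as a sharded test job that targets a runner fleet. `strategy.max-parallel` is standard GitHub Actions syntax and `linux-small` is one of the fleet names used earlier on this page; the `production` environment value, the shard count, and the test script path are hypothetical.

```yaml
# .github/workflows/matrix-tests.yml (illustrative sketch)
name: matrix-tests
on: [pull_request]

jobs:
  test:
    strategy:
      max-parallel: 3                      # at most 3 of the 8 shards run at once
      matrix:
        shard: [1, 2, 3, 4, 5, 6, 7, 8]
    # Workflow code only names the fleet; shape, image, and capacity live in Terraform.
    runs-on: runs-on/fleet=linux-small/env=production  # env value is hypothetical
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/test.sh --shard ${{ matrix.shard }}  # hypothetical script
```

Throttling in the workflow caps how many runners one pull request can claim at a time, while the fleet's overall capacity stays under the platform team's control.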

How Fleet is wired

The Fleet stack also runs inside your AWS account and VPC, but the control plane is fleetd on ECS Fargate, and it is pull-based: unlike Flex, there is no inbound webhook, API Gateway, or SQS queue. fleetd resolves the active GitHub boundary, ensures one GitHub runner scale set per runner fleet, opens a long-lived message session to each scale set, and reacts to the assigned-job demand GitHub reports. It keeps its capacity ledger in DynamoDB, stores cache data in S3, and reads configuration and GitHub credentials from Secrets Manager. As demand arrives, the capacity loop launches ephemeral Linux or Windows EC2 runners (drawing from any warm or stopped standby pool first) that register into the scale set, run the assigned job, and terminate.

Start with the Fleet installation guide.

Fleet today (early access)

Fleet is production-ready for the workloads it covers, but it is still in early access — pin your Terraform module version before rolling out broadly. Today it covers Linux, Windows, and GPU runner fleets with GitHub runner scale sets, Terraform-owned configuration, and optional hot or stopped standby capacity. The areas below are expected to land in Fleet over time, but no dates are committed — until then, keep those workloads on Flex.

Area	Status	Use today
OpenTelemetry runner integration	No Fleet `extras=otel` runner-side OTEL path or Fleet-specific OTLP export yet.	Use Flex for workloads that need runner-side OTEL export.
EFS cache and shared mounts	No Fleet runner-fleet feature for EFS cache or shared mounts yet.	Use Flex EFS caching for large shared working sets.
Ephemeral ECR registry	No Fleet `ecr-cache` / ephemeral registry flow yet.	Use Flex when jobs depend on the managed ECR cache registry.
Block-level snapshots	No Fleet runner-fleet feature for the snapshot cache workflow yet.	Use Flex for `runs-on/snapshot`-based Docker or dependency caches.
macOS runner fleets	Same AWS/macOS licensing constraints as Flex.	Use another provider until AWS/macOS licensing makes this practical.
Per-job runner labels	Intentional: Fleet does not support workflow-selected CPU, RAM, image, disk, networking, or extras.	Use Flex when each job must describe its own runner.

Licensing

Flex and Fleet share one company-scoped license, covering the legal entity that purchased it and the GitHub organizations and AWS accounts that entity owns or controls. Usage is measured by monthly runner volume across both modes. Exceeding a tier does not stop CI; it surfaces in tier notices and renewal conversations.