Migrate from RunsOn v2 to v3

Breaking changes, replacements, and rollout guidance for upgrading an existing RunsOn v2 installation to v3.

RunsOn v3 is a real breaking release. Treat it like a migration, not a routine stack update.

If you are already on v2, the right way to approach v3 is:

  1. Read the current v3 docs before changing production.
  2. Inventory any v2-only parameters, outputs, labels, and pool config you still rely on.
  3. Stand up a test or parallel v3 environment.
  4. Move representative workflows first.
  5. Keep the curated v2 reference pages open only for historical semantics while you migrate.

Highest-risk changes

These are the changes most likely to break an existing v2 install if you update casually:

  • CloudFormation is now GitHub.com-only. GithubEnterpriseUrl is gone from the built-in template. GHES installs must move to Terraform / OpenTofu and set github_enterprise_url there.
  • CloudFormation no longer supports reusing an external VPC. The built-in template always uses the embedded networking path in v3. If you need an existing VPC, use Terraform / OpenTofu.
  • AppSize replaces old app tuning knobs. AppCPU, AppMemory, and AppEc2QueueSize are gone. Pick a single app-size preset instead.
  • AppBudgetDailyUsd replaces the old daily minutes alarm. Budgeting is now based on AWS daily cost in USD, filtered by your RunsOn cost-allocation tag.
  • EC2InstanceCustomPolicy was renamed to RunnerCustomPolicy.
  • disk=... labels no longer map to the legacy disk presets. Replace them with explicit volume=... values.
  • CloudFormation WAF customization is narrower. EnableWAF=true now attaches the RunsOn-managed Web ACL. Use Terraform / OpenTofu if you need a user-managed public ingress Web ACL.

Migration matrix

Area | v2 pattern | v3 replacement
CloudFormation sizing | AppCPU, AppMemory, AppEc2QueueSize | AppSize
CloudFormation budget/alarm | AppAlarmDailyMinutes | AppBudgetDailyUsd
CloudFormation runner IAM policy | EC2InstanceCustomPolicy | RunnerCustomPolicy
CloudFormation GHES | GithubEnterpriseUrl | Migrate to Terraform/OpenTofu with github_enterprise_url
CloudFormation networking | NetworkingStack=external plus ExternalVpc* | Use Terraform/OpenTofu for existing VPCs
CloudFormation VPC tuning | VpcCidrBlock, VpcEndpoints, NatGateway*, flow-log params | Built-in networking is now fixed; no per-stack tuning
CloudFormation custom public ingress WAF | PublicIngressWebAclArn | Use EnableWAF=true for the managed ACL, or Terraform/OpenTofu with public_ingress_web_acl_arn
Stack-level SSH admins | DefaultAdmins | Use SSM for privileged instance access
Legacy disk compatibility | disk=default, disk=large, stack default disk knobs | Explicit volume=...
Permission boundaries in CloudFormation | DefaultPermissionBoundaryArn | Manage outside CloudFormation or use Terraform permission_boundary_arn

What changed in CloudFormation

Removed parameters

These CloudFormation parameters were removed in v3 and must be deleted from any saved stack-update workflow:

  • GithubEnterpriseUrl
  • ECInstanceDetailedMonitoring
  • VpcCidrSubnetBits
  • VpcFlowLogFormat
  • VpcFlowLogS3BucketArn
  • VpcFlowLogRetentionInDays
  • RunnerLargeDiskSize
  • RunnerLargeVolumeThroughput
  • RunnerDefaultDiskSize
  • RunnerDefaultVolumeThroughput
  • DefaultAdmins
  • AppEc2QueueSize
  • AppCPU
  • AppMemory
  • NetworkingStack
  • ExternalVpcId
  • ExternalVpcPublicSubnetIds
  • ExternalVpcPrivateSubnetIds
  • ExternalVpcSecurityGroupId
  • VpcCidrBlock
  • DefaultPermissionBoundaryArn
  • AppDebug
  • EnableDashboard
  • Ec2LogRetentionInDays
  • SqsQueueOldestMessageThresholdSeconds
  • AppAlarmDailyMinutes
  • VpcEndpoints
  • NatGatewayAvailability
  • NatGatewayElasticIPCount
  • AlertTopicSubscriptionHttpsEndpoint
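
One way to inventory a saved stack-update workflow is to compare its parameter keys against the list above. The sketch below assumes the parameters are stored in the ParameterKey/ParameterValue list shape that the AWS CLI accepts for CloudFormation parameter files. The function name and the sample values are hypothetical and only there for illustration.

```python
# Parameter names copied from the removed-parameter list above.
REMOVED_V3_PARAMETERS = {
    "GithubEnterpriseUrl", "ECInstanceDetailedMonitoring", "VpcCidrSubnetBits",
    "VpcFlowLogFormat", "VpcFlowLogS3BucketArn", "VpcFlowLogRetentionInDays",
    "RunnerLargeDiskSize", "RunnerLargeVolumeThroughput",
    "RunnerDefaultDiskSize", "RunnerDefaultVolumeThroughput", "DefaultAdmins",
    "AppEc2QueueSize", "AppCPU", "AppMemory", "NetworkingStack",
    "ExternalVpcId", "ExternalVpcPublicSubnetIds",
    "ExternalVpcPrivateSubnetIds", "ExternalVpcSecurityGroupId",
    "VpcCidrBlock", "DefaultPermissionBoundaryArn", "AppDebug",
    "EnableDashboard", "Ec2LogRetentionInDays",
    "SqsQueueOldestMessageThresholdSeconds", "AppAlarmDailyMinutes",
    "VpcEndpoints", "NatGatewayAvailability", "NatGatewayElasticIPCount",
    "AlertTopicSubscriptionHttpsEndpoint",
}


def find_removed_parameters(parameters):
    """Return the removed parameter names found in a saved parameter list."""
    used = {entry["ParameterKey"] for entry in parameters}
    return sorted(used & REMOVED_V3_PARAMETERS)


# Hypothetical saved parameters; the values are placeholders.
saved_parameters = [
    {"ParameterKey": "AppCPU", "ParameterValue": "1024"},
    {"ParameterKey": "NetworkingStack", "ParameterValue": "external"},
    {"ParameterKey": "EmailAddress", "ParameterValue": "ops@example.com"},
]

for name in find_removed_parameters(saved_parameters):
    print(f"remove before updating to v3: {name}")
```

Running a check like this before the first v3 update surfaces stale keys early instead of during the stack update itself.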

Renames and replacements

  • Replace EC2InstanceCustomPolicy with RunnerCustomPolicy.
  • Replace old app CPU/memory/queue tuning with AppSize.
  • Replace AppAlarmDailyMinutes with AppBudgetDailyUsd.
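
The renames and replacements above can be applied to a saved parameter mapping with a small helper. This is a minimal sketch, not an official migration tool: plan_parameter_changes is a hypothetical name, the sample values are placeholders, and the helper deliberately does not guess an AppSize preset or a USD budget, because those are choices you have to make against the v3 docs.

```python
# Direct one-to-one rename from the list above.
RENAMED = {"EC2InstanceCustomPolicy": "RunnerCustomPolicy"}

# v2 parameters whose v3 replacement needs a new, hand-picked value.
NEEDS_NEW_VALUE = {
    "AppCPU": "AppSize",
    "AppMemory": "AppSize",
    "AppEc2QueueSize": "AppSize",
    "AppAlarmDailyMinutes": "AppBudgetDailyUsd",
}


def plan_parameter_changes(params):
    """Return (migrated_params, todo) for a v2 parameter mapping.

    Renamed keys keep their value under the new name. Keys that need a
    new value are dropped and reported in todo so a human can decide.
    """
    migrated = {}
    todo = []
    for key, value in params.items():
        if key in RENAMED:
            migrated[RENAMED[key]] = value
        elif key in NEEDS_NEW_VALUE:
            todo.append(f"{key}: choose a value for {NEEDS_NEW_VALUE[key]}")
        else:
            migrated[key] = value
    return migrated, todo


# Hypothetical v2 parameters; the values are placeholders.
v2_params = {
    "EC2InstanceCustomPolicy": "example-policy-arn",
    "AppCPU": "1024",
    "EmailAddress": "ops@example.com",
}
migrated, todo = plan_parameter_changes(v2_params)
print(migrated)
print(todo)
```

The helper only covers the three replacements named here. Other removed parameters still need to be dropped separately.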

Fixed built-in behavior in v3

If you still need any of the removed tuning surface or infrastructure flexibility below, switch that install to the Terraform / OpenTofu module instead of trying to preserve the old CloudFormation shape.

The built-in CloudFormation path is simpler and less tunable now:

  • Embedded networking is always used.
  • The embedded VPC CIDR is fixed to 10.1.0.0/16.
  • The built-in topology is fixed to two AZs.
  • Only the free S3 gateway VPC endpoint is created by the built-in template.
  • EC2 and ECR interface VPC endpoints are no longer created by CloudFormation. Use Terraform/OpenTofu or external infrastructure if you need those PrivateLink endpoints.
  • Built-in VPC flow-log tuning is gone.
  • Built-in CloudWatch dashboard creation is always on.
  • EC2 instance log retention is fixed at 7 days.
  • Built-in SQS queue-age alarms are gone.
  • The required EmailAddress path is always used directly.
  • Additional HTTPS SNS subscription wiring is no longer created from template input.
  • API Gateway access logs remain, but v3 no longer enables API Gateway execution logging.
  • EnableAdminRoutes controls whether the public admin/setup routes are exposed.

Outputs that disappeared

If you automated around these outputs in v2, update that automation before migrating:

  • RunsOnVpcCidrBlock
  • RunsOnPublicRouteTableId
  • RunsOnPrivateRouteTable1Id
  • RunsOnPrivateRouteTable2Id
  • RunsOnBootstrapTag
  • RunsOnPrivate
  • RunsOnService
  • RunsOnProvisioningTable

EphemeralRegistryUri was renamed to RunsOnEphemeralRegistryUri.
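
If your automation reads stack outputs by name, a quick check against the removed and renamed outputs above can flag what needs updating. The sketch below assumes you can collect the output names your scripts reference into a list. The function name and the sample list are hypothetical.

```python
# Output names copied from the list above.
REMOVED_OUTPUTS = {
    "RunsOnVpcCidrBlock", "RunsOnPublicRouteTableId",
    "RunsOnPrivateRouteTable1Id", "RunsOnPrivateRouteTable2Id",
    "RunsOnBootstrapTag", "RunsOnPrivate", "RunsOnService",
    "RunsOnProvisioningTable",
}
RENAMED_OUTPUTS = {"EphemeralRegistryUri": "RunsOnEphemeralRegistryUri"}


def check_output_references(names):
    """Map each affected output name to a short note on what changed."""
    problems = {}
    for name in names:
        if name in REMOVED_OUTPUTS:
            problems[name] = "removed in v3"
        elif name in RENAMED_OUTPUTS:
            problems[name] = f"renamed to {RENAMED_OUTPUTS[name]}"
    return problems


# Hypothetical list of outputs referenced by existing automation.
referenced = ["RunsOnVpcCidrBlock", "EphemeralRegistryUri", "SomeOtherOutput"]
for name, note in check_output_references(referenced).items():
    print(f"{name}: {note}")
```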

What changed in Terraform / OpenTofu

Terraform moved to the same simpler model:

  • New module consumption should use the explicit Flex submodule source: runs-on/runs-on/aws//flex.
  • app_size replaces app_cpu, app_memory, and ec2_queue_size.
  • app_budget_daily_usd replaces the daily minutes alarm path.
  • detailed_monitoring_enabled, default_admins, legacy disk defaults, app_debug, enable_dashboard, queue-age alarms, optional HTTPS alert subscription, and s3_encryption_key_id are gone.
  • GHES, existing VPCs, and IAM permission-boundary cases remain supported there, which is why Terraform is now the escape hatch for installs that outgrow the built-in CloudFormation path.
  • public_ingress_web_acl_arn is the Terraform path for a user-managed public ingress Web ACL.
  • Root outputs are grouped by subsystem. Expose values such as module.runs_on_flex.ingress.url and module.runs_on_flex.stack.getting_started from your own root module if you want them printed after terraform apply.
  • Cache storage now uses SSE-KMS with the AWS-managed S3 key. For EBS encryption with customer-managed KMS keys, make sure the key policy trusts the generated RunsOn service role.

Runtime behavior changes

Most v2 workflow labels keep working in v3, but a few operational details changed:

  • Completed runner instances are finalized faster, and housekeeping turns runs-on-terminate=true into EC2 termination more quickly.
  • Fresh queued workflow jobs are launchable immediately once RunsOn observes them. Delayed launch timestamps are now reserved for retry and recovery paths.
  • Counted launch retries now use staged backoff: 45s, 2m, 5m, 10m, and 20m, with the sixth counted failure becoming terminal.
  • Manual GitHub reruns no longer force on-demand capacity by themselves. They follow the normal spot/on-demand policy unless RunsOn itself triggered the rerun after a spot interruption.
  • Inline OTEL job summaries graph disk and network counters as per-interval rates, making short spikes easier to see.
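
To reason about how long a job can stay in the retry path, the staged backoff above can be modeled as a lookup table. This is an illustrative sketch, not RunsOn's implementation. It assumes the first delay (45s) applies after the first counted failure, so the five delays cover failures one through five and the sixth is terminal. The function name is hypothetical.

```python
from datetime import timedelta

# Staged delays from the list above: 45s, 2m, 5m, 10m, 20m.
RETRY_DELAYS = [
    timedelta(seconds=45),
    timedelta(minutes=2),
    timedelta(minutes=5),
    timedelta(minutes=10),
    timedelta(minutes=20),
]


def next_retry_delay(counted_failures):
    """Return the delay before the next launch attempt.

    Returns None once the failure count exceeds the number of stages,
    which models the sixth counted failure being terminal.
    """
    if counted_failures < 1:
        raise ValueError("counted_failures must be at least 1")
    if counted_failures > len(RETRY_DELAYS):
        return None
    return RETRY_DELAYS[counted_failures - 1]


# Total time spent waiting across all five retry stages.
total_wait = sum(RETRY_DELAYS, timedelta())
print(total_wait)
```

Under that assumption, the delays alone add up to 37 minutes 45 seconds before a launch becomes terminal, not counting the time each attempt takes.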

Repo config and label changes

Replace disk=... with volume=...

Do not carry forward legacy disk compatibility assumptions into v3.

  • disk=default no longer means “use the stack default disk size”.
  • disk=large no longer means “use the legacy large disk preset”.
  • .github/runs-on.yml may still contain disk, but RunsOn now warns that it is deprecated and ignored.

Use explicit volume values instead:

  • disk=large -> volume=80gb
  • custom size/perf -> volume=80gb:gp3:1000mbps:4000iops
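
Finding and rewriting leftover disk=... labels across workflow files is easy to script. The sketch below only rewrites disk=large, since that is the one mapping given above. Any other disk=... value, including disk=default, is left untouched and reported so you can pick an explicit volume yourself. The function name and the sample workflow lines are illustrative assumptions.

```python
import re

# Matches a disk=<value> label that is not part of a longer word.
DISK_LABEL = re.compile(r"\bdisk=([A-Za-z0-9_-]+)")

# Only the mapping stated in this guide; extend it with your own choices.
KNOWN_REPLACEMENTS = {"large": "volume=80gb"}


def rewrite_disk_labels(text):
    """Rewrite known disk=... labels and report the ones left unchanged."""
    unresolved = []

    def replace(match):
        value = match.group(1)
        if value in KNOWN_REPLACEMENTS:
            return KNOWN_REPLACEMENTS[value]
        unresolved.append(match.group(0))
        return match.group(0)

    return DISK_LABEL.sub(replace, text), unresolved


# Illustrative workflow lines; adapt the pattern to your own label layout.
sample = (
    "runs-on: runs-on=${{ github.run_id }}/runner=2cpu-linux-x64/disk=large\n"
    "runs-on: runs-on=${{ github.run_id }}/runner=2cpu-linux-x64/disk=default\n"
)
rewritten, unresolved = rewrite_disk_labels(sample)
print(rewritten)
print(unresolved)
```

Review the unresolved entries by hand before cutover, because v3 no longer falls back to a stack-level default for them.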

Stack-level SSH admins are gone

If you relied on DefaultAdmins, move your privileged access workflow to SSM. Repository-level admins in .github/runs-on.yml still exist, but they are no longer layered on top of a stack-wide default-admin list.

Suggested rollout

For most teams, a blue-green migration is safer than an in-place update:

  1. Deploy a fresh v3 stack.
  2. Register the new GitHub App and give it access to the repositories you want to test.
  3. Suspend the old GitHub App.
  4. Pause the old App Runner service.
  5. Test representative workflows against the new stack.

If everything works as expected, keep the old stack around briefly as a rollback point, then delete it. If you need to roll back, suspend the new GitHub App, delete the new stack, unsuspend the old GitHub App, and resume the old App Runner service.

Treat RunsOn stacks as cattle, not pets. The clean migration path is to create a new stack, test it, and delete the old one once you trust the replacement.

Use the installation guide, stack configuration, job labels, and repo config pages as the v3 source of truth during that process.

What to verify after cutover

  • Jobs launch on the expected environment and runner family.
  • Any previous disk=... usage has been replaced with volume=....
  • Any CloudFormation automation no longer expects removed parameters or outputs.
  • Custom AMI, registry, EFS, and OTEL behavior still matches your expectations.
  • Cost monitoring now uses AppBudgetDailyUsd, and your chosen cost-allocation tag is activated in AWS Billing.

Need the old docs?

The curated RunsOn v2 reference pages stay available for teams that need the historical semantics of v2 parameters, outputs, and labels while they migrate.