RunsOn

v2.12.0

Spotlight

  • OTEL integration on runners: you can now set extras=otel, and a local OTEL collector will be started on each runner to ship logs, traces, and host metrics to your OTEL backend (if any). You can now see traces for a job from the moment it is received until it is terminated, including spans for each job step. This feature is in beta, and feedback is very welcome.
Spans for a job
  • Improvements to the GitHub rate limiters, the reconciler, and many other places. Thanks @cfsnate from CFS for the help and production tests at scale!

  • New prerun attribute for custom runners / images: tasks to run just before the job starts. This is only useful for warm pools, where preinstall runs during the warm-up phase. Since hot/stopped instances can be warmed up hours before running a job, tokens generated by steps like docker login could expire before the job launches. prerun fixes that.
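
The timing problem that prerun addresses can be illustrated with a small sketch. Everything here is hypothetical: the 12-hour token lifetime, the 14-hour gap between warm-up and job start, and the helper name token_is_valid are assumptions chosen for illustration, not RunsOn behaviour.

```python
from datetime import datetime, timedelta


def token_is_valid(issued_at: datetime, ttl: timedelta, now: datetime) -> bool:
    """Return True if a token issued at `issued_at` is still usable at `now`."""
    return now < issued_at + ttl


# Hypothetical timeline: the pool instance is warmed up long before the job runs.
warmup_time = datetime(2025, 1, 1, 0, 0)
job_start = warmup_time + timedelta(hours=14)  # assumed idle time in the pool
token_ttl = timedelta(hours=12)                # assumed registry token lifetime

# Login performed during preinstall (warm-up phase): token is stale at job start.
preinstall_login_ok = token_is_valid(warmup_time, token_ttl, job_start)

# Login performed during prerun (just before the job): token is fresh.
prerun_login_ok = token_is_valid(job_start, token_ttl, job_start)
```

With these assumed values, the login done at warm-up time has expired by the time the job starts, while the login done just before the job is still valid.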

Instance type selection

  • When you specify multiple families, we now intentionally try to increase the chance that AWS evaluates those alternatives independently, instead of treating all families as one mixed set and favoring the cheapest pool too heavily.

  • When you specify exact instance types as the family (e.g. family=m8azn.large+m8a.large) and run on-demand, the fleet request now enables prioritized mode, so that you are more likely to get the first instance type listed in your family selection.
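
As a rough illustration of what prioritized mode means, the sketch below turns a family value into EC2 Fleet style overrides. It is not RunsOn's actual implementation: the function names are hypothetical, and the only AWS facts relied on are that EC2 Fleet accepts an on-demand allocation strategy named "prioritized" and that a lower Priority number means a higher priority.

```python
from typing import Dict, List


def prioritized_overrides(family: str) -> List[Dict[str, object]]:
    """Split 'a+b+c' into overrides where the first type gets the top priority."""
    instance_types = [t.strip() for t in family.split("+") if t.strip()]
    # EC2 Fleet treats a lower Priority number as a higher priority,
    # so the list position is used directly as the priority value.
    return [
        {"InstanceType": instance_type, "Priority": float(rank)}
        for rank, instance_type in enumerate(instance_types)
    ]


def on_demand_options() -> Dict[str, str]:
    """On-demand options asking the fleet to honour the override priorities."""
    return {"AllocationStrategy": "prioritized"}


overrides = prioritized_overrides("m8azn.large+m8a.large")
```

Here m8azn.large receives priority 0.0 and m8a.large receives 1.0, so the fleet tries the first listed type before falling back to the second.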

GitHub App configuration (beta)

  • Allow passing pre-existing GitHub App credentials instead of creating a new app. Only available through the Terraform module.

Fixes

  • More robust OTEL parameters parsing for endpoints.
  • Set up mount points earlier in the process, so that pool instances have them properly set up in case the preinstall script needs them. Fixes #431.
  • Add stack parameter OtelExporterTemporality. Fixes #433.
  • Verify that labels don't contain whitespace. Fixes #448.
  • Fix the order of the Docker daemon restart and the preinstall script. Fixes #442.
  • Update SQS deduplication ID generation. Fixes #451.
  • Support the protobuf protocol for Magic Cache. Useful for compatibility with some GitHub Actions from the ecosystem.
  • Fix the maximum throughput and IOPS allowed for EBS volumes, which depend on the volume type.
  • Add runs-on-is-private tag to EC2 resources, so that it can be used to track how many jobs run from private subnets, and use it to better understand the workflows that trigger NAT gateway costs.
  • Introduce new stack parameter MaintenanceMode. If enabled, the AppRunner service no longer processes any queues or reconciliations.
  • Fix issues with instances from pools that could stay dangling if the job has started but somehow crashed (due to OOM) and was not auto-terminated.
  • Ensure snapshot cleanup takes into account the snapshot version set by user (it used to keep only one snapshot across all versions).
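
The version-aware snapshot cleanup can be pictured with the following sketch. It is an assumption-laden illustration, not the actual RunsOn code: the Snapshot fields, the snapshots_to_delete helper, and the default of keeping one snapshot per version are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: str
    version: str      # snapshot version set by the user (hypothetical field)
    created_at: int   # creation time as Unix seconds (hypothetical field)


def snapshots_to_delete(snapshots: List[Snapshot], keep_per_version: int = 1) -> List[str]:
    """Return IDs to delete, keeping the newest snapshots within each version."""
    by_version: Dict[str, List[Snapshot]] = {}
    for snap in snapshots:
        by_version.setdefault(snap.version, []).append(snap)

    doomed: List[str] = []
    for group in by_version.values():
        newest_first = sorted(group, key=lambda s: s.created_at, reverse=True)
        # Everything past the first `keep_per_version` entries is obsolete.
        doomed.extend(s.snapshot_id for s in newest_first[keep_per_version:])
    return sorted(doomed)


example = [
    Snapshot("snap-a", "v1", 100),
    Snapshot("snap-b", "v1", 200),
    Snapshot("snap-c", "v2", 300),
]
to_delete = snapshots_to_delete(example)
```

Grouping by version before pruning means the only v2 snapshot survives even though newer or older snapshots exist under v1; only the older v1 snapshot is selected for deletion.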

Potentially breaking changes

  • CloudWatch agent is now shut down when the agent launches. OTEL collector replaces it. If you rely on the CloudWatch agent for other things, you will need to manually start it in your workflows.
  • The time at which preinstall and mount points are set up has been modified. This may alter behaviour, especially for stopped instances from pools.
  • Removed outdated v1 version of cache toolkit support. This is no longer used by GitHub anyway, but maybe you were pinning to older versions of actions/cache and didn't realize.
  • User-data updates for Linux and Windows. Linux now launches a systemd service instead of executing the bootstrap sequence directly from the cloud-init script.
  • Updated default runners to include more recent generations (e.g. m8/c8/r8) in addition to m7/c7/r7.