v2.10.0
Summary
Re-architects job handling with a DynamoDB-backed workflow job store, new jobs/GitHub queues and reconciler, enhanced GitHub/spec/pool/label logic, and restores Prometheus/OTEL /metrics endpoint.
Spotlight
- Fix numerous bugs for pools (still in beta).
- New reconciliation loop should ensure that no queued job is left pending.
- FIFO semantics for SQS queues now enforced for job scheduling, even under high load.
- Repository config (runs-on.yml) is now validated, and warnings are emitted when an error is found. The validator is also available through the CLI.
- You should see lower GitHub API token usage, with some low-priority features (e.g. unregistering runners) automatically disabled in favor of high-impact ones (e.g. registering runners). New metrics are also available to track GitHub API operations.
- Re-enable a /metrics endpoint (Prometheus format) if you prefer polling for metrics instead of sending them to an OTEL collector.
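
For readers who pick the polling route, here is a minimal sketch of scraping a Prometheus-format endpoint protected by basic auth. The URL, the username, and the metric names are assumptions made for illustration; none of them is documented in these notes. The parser is deliberately simplified and only handles plain sample lines.

```python
import base64
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of an HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def parse_prometheus_text(payload: str) -> dict[str, float]:
    """Parse Prometheus text-format samples into {series: value}.

    Comment lines (# HELP / # TYPE) and blank lines are skipped, and an
    optional trailing timestamp is ignored.
    """
    samples: dict[str, float] = {}
    for raw_line in payload.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Series with labels: everything up to the last "}" is the key.
            end = line.rfind("}")
            series, rest = line[: end + 1], line[end + 1:].split()
        else:
            series, *rest = line.split()
        samples[series] = float(rest[0])
    return samples


def scrape(url: str, username: str, password: str) -> dict[str, float]:
    """Fetch and parse a metrics endpoint (URL and credentials are hypothetical)."""
    request = urllib.request.Request(
        url, headers={"Authorization": basic_auth_header(username, password)}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_prometheus_text(response.read().decode("utf-8"))
```

In practice a Prometheus server would do the scraping itself; a script like this is mainly useful for a quick check that the endpoint responds and the credentials work.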
Details
- Infrastructure (CloudFormation):
  - Queues: Add RunsOnQueueJobs and RunsOnQueueGithub FIFO queues (with DLQs) and wire them into env vars, IAM, and outputs.
  - DynamoDB: Add RunsOnWorkflowJobsTable. Should stay well below free tier usage for 90% of users.
  - App Runner/IAM: Add RUNS_ON_QUEUE_JOBS, RUNS_ON_QUEUE_GITHUB, RUNS_ON_WORKFLOW_JOBS_TABLE, RUNS_ON_SERVER_PASSWORD, RUNS_ON_LOGGER_LEVEL (dev/v2.10.0) env vars; grant s3:DeleteObject on S3Bucket/runs-on/db/*.
  - Parameters: Add Environment to main group; add ServerPassword; add LoggerLevel (dev/v2.10.0); refine OtelExporterEndpoint and RunnerConfigAutoExtendsFrom descriptions.
  - Outputs: Export new queues and RunsOnWorkflowJobsTable.
- Server/Architecture:
  - Replace legacy queue processors with new jobs and github queues (processJobsQueue, processGitHubQueue); add reconciliation loop processWorkflowJobsReconciliation.
  - Introduce DynamoDB-backed WorkflowJobsStore (WorkflowJobRecord), instance attachment, next-check indexing, and recent repo discovery.
  - Refactor webhook handling to context-aware flow; persist job payloads; schedule via job IDs.
- GitHub Integration:
  - Add stale-while-revalidate caches, repo metadata/collaborators retrieval, global/local/composite config loaders, and repo listing per installation.
- Runner/Specs:
  - SpecResolver now validates config, resolves from labels/repo config, and normalizes private/ssh.
- Pools:
  - Validate pool names; include name in SHA; improved schedule evaluator; parse/detach timestamps; explicit excess instance termination.
- Labels:
  - Robust parsing (trim/control chars), support runs-on=run id, and auto-assign dependabot pool.
- Metrics/Endpoints:
  - Add Prometheus exporter and GitHub operation counter; improve OTEL config; /metrics endpoint with basic auth.
- Housekeeping/Cleanup:
  - Batch terminate instances/fleets; honor runs-on-terminate tag.
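
To make the job store, next-check indexing, and reconciliation ideas above more concrete, here is a small in-memory sketch. It is not the actual implementation: the class names, field names, and the 60-second recheck interval are all hypothetical, and the real store is backed by DynamoDB rather than a dictionary. The point it illustrates is that queued jobs carry a "next check" time, and a periodic pass re-enqueues the IDs of any that are overdue so none stays pending unnoticed.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class JobRecord:
    """Hypothetical stand-in for a persisted workflow job record."""
    job_id: int
    status: str           # "queued", "in_progress", or "completed"
    next_check_at: float  # epoch seconds when the job should be re-examined


class InMemoryJobStore:
    """Toy replacement for a DynamoDB-backed store, keyed by job ID."""

    def __init__(self) -> None:
        self._jobs: dict[int, JobRecord] = {}

    def put(self, record: JobRecord) -> None:
        self._jobs[record.job_id] = record

    def due_for_check(self, now: float) -> list[JobRecord]:
        """Return queued jobs whose next check time has passed, oldest first."""
        due = [
            job for job in self._jobs.values()
            if job.status == "queued" and job.next_check_at <= now
        ]
        return sorted(due, key=lambda job: job.next_check_at)


def reconcile(store: InMemoryJobStore, queue: deque, now: float,
              recheck_after: float = 60.0) -> list[int]:
    """Re-enqueue IDs of overdue queued jobs and push their next check forward."""
    requeued = []
    for job in store.due_for_check(now):
        queue.append(job.job_id)  # schedule by job ID, not by full payload
        job.next_check_at = now + recheck_after
        requeued.append(job.job_id)
    return requeued
```

Pushing the next check time forward after each pass keeps the same job from being re-enqueued on every iteration, while still guaranteeing it is looked at again if it remains queued.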
Fixes #400, fixes #402, fixes #395, fixes #386.