changelog

9 posts with the tag “changelog”

Changelog v2.1.0 - new Server and Agent, shared SQS queue, and more

RunsOn v2.1.0 has just been released 🎉.

Main changes

NodeJS => Go

I switched the server to Go, for better concurrency control. NodeJS allowed me to put something out quickly and test the waters. But now that more and more people are using it, large clients (> 10k jobs a day) were running into hard-to-troubleshoot concurrency issues due to the way NodeJS works. Go has a much better concurrency model, and I think it’s a better fit for the project anyway.

Before : Screenshot 2024-04-04 at 13 13 56

After : Screenshot 2024-04-04 at 13 13 36

If you are coming from a previous v2 version, the upgrade can be done in-place.

Agent and Server no longer public with the base license

Agent and Server source code now lives in separate private repositories, which are added as submodules of runs-on/runs-on. Only the CloudFormation template and base AMIs are public.

A Sponsorship license will give you access to everything, so that you or your security team can review all the code, and choose to build from source if needed. Other licenses only get the compiled agent and server binaries.

The reason for this change is two-fold:

  • make it more difficult for the competition to see how the sausage is made, especially now that RunsOn beats the majority of the competition in terms of concurrency, speed, hardware availability, and pricing.

  • nudge larger clients into buying the more expensive license: until now there was no real incentive to buy a more expensive license. I could put some more advanced features into the more expensive tier, but my current view is to provide the best self-hosted runner solution out there, irrespective of the company size. I also didn’t want to use volume-based pricing, since I like to keep billing simple and predictable for users.

Hopefully this will strike a good balance between keeping RunsOn affordable to everyone, and still being sustainable. Please let me know if you have any feedback about this, nothing is set in stone yet.

Features

  • use an SQS FIFO queue to handle pending job workflows. If your AppRunner service needs to scale up horizontally, this queue will now be shared across all instances, instead of each having its own in-memory queue. This also helps to avoid losing jobs if an AppRunner instance goes down. The nice thing is that it also comes with integrated CloudWatch monitoring, so that you can see the number of pending jobs and the maximum delay.

  • allow disabling cost reports: a new CostReportsEnabled parameter in the CloudFormation stack lets you disable the generation and sending of cost reports, if you prefer to look at them in Cost Explorer or by other means anyway.

  • allow specifying the disk size for default and large runner templates: 2 new CloudFormation parameters let you set the disk size of the default and large runner templates. In your job definition, simply indicate an hdd size and RunsOn will use the default template if hdd <= default size, or the large template if hdd > default size.
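
The template selection rule in the last item can be sketched as follows. This is a minimal illustration, not the actual RunsOn code: the function name and the 40 GB default size are hypothetical, and only the comparison itself comes from the description above.

```python
def pick_template(hdd_gb: int, default_size_gb: int) -> str:
    """Return the runner template a job would get for the requested disk size.

    Hypothetical helper: only the rule (hdd <= default size picks the default
    template, hdd > default size picks the large one) comes from the changelog.
    """
    if hdd_gb <= default_size_gb:
        return "default"
    return "large"


# Assumed default size of 40 GB, for illustration only.
print(pick_template(40, 40))   # requested size fits the default template
print(pick_template(120, 40))  # requested size needs the large template
```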

Fixes

  • remove the AppWorkflowQueueSize parameter from the CF stack. It’s no longer needed, as we align on the EC2 rate-limit for now.

  • bring back default runner and image: you can specify runs-on: runs-on, and it will work again. Same if you don’t specify an image, it will use the ubuntu22-full-x64 by default.

Breaking changes

  • older runner definitions (i.e. runner=2cpu-linux) are no longer supported. You must now use either runner=2cpu-linux-x64 or runner=2cpu-linux-arm64.

Deprecations

  • the base and docker variants of the images as they stand are no longer useful, as the boot time of the full images is now considerably faster. They will most likely be removed in a future version, or rebuilt as a much lighter version of the full images.

Misc

  • setup flow design has changed a bit.

Changelog v2.0.13 - multi-az, multi-region, and much more

RunsOn v2.0.13 has just been released 🎉.

Warning: this is a major release bump, with a new VPC being created. You are advised to upgrade either during a quiet time (no runner running, otherwise the old VPC cannot be destroyed), or simply create a new stack with that template, follow the configuration process, and then Pause the previous AppRunner service until you validate that everything is going fine. Doing it this way will allow you to easily roll back to the previous version by just removing the new stack and clicking Resume on the previous AppRunner service.

Main changes

  • Replaces RunInstances call with CreateFleet, to reduce the number of API calls and increase the chances of finding a spot instance.
  • Multi-AZ support (3 AZs by default for the stack). The stack no longer asks for an AZ choice.
  • capacity-optimized-prioritized allocation, so that it selects the instance type from the pool with the least risk of being interrupted
  • Modify launch sequence so that instance retrieves boot details from the S3 bucket (no more user-data)
  • Make RunsOn region aware (with region label), allowing deployments of RunsOn in multiple regions
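
To make the CreateFleet and capacity-optimized-prioritized items more concrete, here is a rough sketch of what such a request could look like. The field names follow the EC2 CreateFleet request shape as I understand it. The function name, the launch template ID, the instance types, and the subnet IDs are all hypothetical, and the request RunsOn actually sends is not shown in this changelog.

```python
def build_fleet_request(launch_template_id, instance_types, subnet_ids):
    """Build a CreateFleet-style request dict (illustrative only, no AWS call)."""
    overrides = []
    # Earlier instance types get a lower Priority number, i.e. they are preferred.
    for priority, instance_type in enumerate(instance_types):
        # One override per (instance type, subnet) pair spreads the request across AZs.
        for subnet_id in subnet_ids:
            overrides.append({
                "InstanceType": instance_type,
                "SubnetId": subnet_id,
                "Priority": float(priority),
            })
    return {
        "Type": "instant",
        "SpotOptions": {"AllocationStrategy": "capacity-optimized-prioritized"},
        "LaunchTemplateConfigs": [{
            "LaunchTemplateSpecification": {
                "LaunchTemplateId": launch_template_id,
                "Version": "$Latest",
            },
            "Overrides": overrides,
        }],
        "TargetCapacitySpecification": {
            "TotalTargetCapacity": 1,
            "DefaultTargetCapacityType": "spot",
        },
    }


# Hypothetical identifiers, for illustration only.
request = build_fleet_request(
    "lt-0123456789abcdef0",
    ["m7a.large", "m7i.large"],
    ["subnet-aaa", "subnet-bbb", "subnet-ccc"],
)
print(len(request["LaunchTemplateConfigs"][0]["Overrides"]))
```

A single request like this covers every instance type and AZ combination, which is where the reduction in API calls compared to one RunInstances call per attempt would come from.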

General improvements

  • Default runner types are now separated into -x64 and -arm64 variants (simplifies configuration, no need to explicitly specify image), e.g. runs-on: runs-on,runner=2cpu-linux-arm64
  • Implement new rate limiters for EC2 RunInstances and TerminateInstances operations, as well as for workflow queuing. All are configurable.
  • New ubuntu22 full images, with some more cleanup of legacy software to reduce image sizes, and use of an agent to launch the runner earlier, instead of waiting for the execution of the cloud-final service. Current timings (from workflow job created to workflow job running) with full image: x64=39s, arm64=34s
  • Add timings for when the workflow job was created on GitHub, when the workflow job webhook got received, when the workflow started to be scheduled, and when the instance was seen as pending by AWS

Fixes

  • Fix default alarm. Make threshold configurable.
  • Stack no longer requires extended IAM permissions.

Misc

  • Truncate CloudWatch dimension values to 250 chars.
  • Change runner name format (runs-on--<INSTANCE_ID>--<RANDOM>), so that it contains the instance id.
  • No more success email when service is up, since you could receive those whenever the service is scaled up by AppRunner.
  • No more cost email when service is up. Wait 24h before the first one.
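
Because the runner name now embeds the instance ID, it can be recovered from the name alone. A minimal sketch, assuming the random part is a short hex string (the changelog only specifies the runs-on--<INSTANCE_ID>--<RANDOM> layout, so the suffix format and both helper names are hypothetical):

```python
import re
import secrets

# Matches runs-on--<INSTANCE_ID>--<RANDOM>; EC2 instance IDs look like i-<hex>.
RUNNER_NAME_RE = re.compile(r"^runs-on--(?P<instance_id>i-[0-9a-f]+)--(?P<suffix>.+)$")


def make_runner_name(instance_id: str) -> str:
    """Build a runner name; the 8-character hex suffix is an assumption."""
    return f"runs-on--{instance_id}--{secrets.token_hex(4)}"


def instance_id_from_name(name: str):
    """Return the instance ID embedded in a runner name, or None if it does not match."""
    match = RUNNER_NAME_RE.match(name)
    return match.group("instance_id") if match else None


name = make_runner_name("i-0abc123def4567890")
print(name)
print(instance_id_from_name(name))
```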

Breaking changes

  • Stack requires a VPC and subnet change, so perform the upgrade in a quiet time.
  • Runners no longer default to the 2cpu-linux x64 runner. You always need to specify a runner label as a base.
  • Specifying an image or runner label that does not exist will now raise an error, instead of silently falling back to the default image or runner specification.
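
The stricter behaviour described in the last two items can be pictured with a small label parser. This is only a sketch under assumptions: the sets of known runners and images are a hypothetical subset, the image= label key is assumed, and the real validation logic lives in the RunsOn server.

```python
# Hypothetical subsets of valid names, for illustration only.
KNOWN_RUNNERS = {"2cpu-linux-x64", "2cpu-linux-arm64"}
KNOWN_IMAGES = {"ubuntu22-full-x64", "ubuntu22-full-arm64"}


def parse_labels(runs_on: str) -> dict:
    """Parse a 'runs-on,key=value,...' string and fail loudly on unknown values."""
    labels = [part.strip() for part in runs_on.split(",")]
    if labels[0] != "runs-on":
        raise ValueError("first label must be 'runs-on'")
    spec = {}
    for label in labels[1:]:
        key, sep, value = label.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {label!r}")
        spec[key] = value
    # A runner label is always required: no implicit default runner.
    if "runner" not in spec:
        raise ValueError("a runner label is required")
    # Unknown names raise an error instead of silently falling back to a default.
    if spec["runner"] not in KNOWN_RUNNERS:
        raise ValueError(f"unknown runner {spec['runner']!r}")
    if "image" in spec and spec["image"] not in KNOWN_IMAGES:
        raise ValueError(f"unknown image {spec['image']!r}")
    return spec


print(parse_labels("runs-on,runner=2cpu-linux-arm64"))
```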

Changelog v1.7.3 - now in eu-central-1 and us-west-2

RunsOn v1.7.3 has just been released 🎉.

What’s Changed

  • Official support for Frankfurt (eu-central-1) and Oregon (us-west-2) regions.
  • Disable AWS SDK retries for RunInstances API calls, to avoid rate limit issues.
  • Add m7i as an additional family type for default runners. Since m7a/c7a instances are in short supply, this should help make the onboarding for new users easier.

Changelog v1.7.2 - Streamline install procedure

RunsOn v1.7.2 has just been released 🎉.

What’s Changed

  • instant reload after first setup
  • fix templates
  • fix request limit exceeded errors for RunInstances API and DescribeInstanceTypes API
  • check license key
  • remove nodemon from prod
  • no longer log health check requests
  • unify logging formats, tag all lines with workflow job details for easier troubleshooting
  • allow runner config to define image, spot, ssh settings
  • specify IMDSv2 (closes #24).
  • add run-id, job-name and job-id to instance tags
  • publish consumed minutes across many dimensions:
    • Repository
    • WorkflowName
    • WorkflowJobConclusion
    • WorkflowJobName
    • InstanceType
    • InstanceLifecycle
    • ImageId
    • RunnerId
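
One way to publish a value across several dimensions is to emit one datum per dimension, so that each dimension can be aggregated on its own in CloudWatch. The sketch below only builds the payload and makes no AWS call. The dimension names come from the list above, while the metric name, the unit, and the helper name are hypothetical, and RunsOn may well structure its metrics differently.

```python
# Dimension names taken from the changelog list above.
DIMENSIONS = [
    "Repository", "WorkflowName", "WorkflowJobConclusion", "WorkflowJobName",
    "InstanceType", "InstanceLifecycle", "ImageId", "RunnerId",
]


def build_metric_data(minutes: float, job: dict) -> list:
    """Build one CloudWatch-style datum per dimension present in `job`."""
    data = []
    for name in DIMENSIONS:
        if name not in job:
            continue  # skip dimensions we have no value for
        data.append({
            "MetricName": "ConsumedMinutes",  # hypothetical metric name
            "Dimensions": [{"Name": name, "Value": str(job[name])}],
            "Value": minutes,
            "Unit": "Count",
        })
    return data


# Hypothetical job details, for illustration only.
for datum in build_metric_data(3.0, {"Repository": "acme/api", "InstanceType": "m7a.large"}):
    print(datum["Dimensions"][0], datum["Value"])
```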

Changelog v1.6.2 - Restore launch queue size to sane limit

RunsOn v1.6.2 has just been released 🎉.

This is mostly a maintenance release, but important for the users who are launching a lot of runners in a short time.

By default EC2 has pretty aggressive rate limits set on the RunInstances API (2/s, with some burst allowed). If you go over that limit, your runner will fail to start and RunsOn will send you an email alert telling you about it (RequestLimitExceeded).

Until now the queue size was set to 8/s, but since most users are using new accounts to install RunsOn, it can cause issues with the low default of max 2/s.

So from now on RunsOn will default to 2/s as well, and if your account has an increased quota for the RunInstances API, you can specify a higher number by using the new CloudFormation template parameter AppEc2QueueSize:

EC2 queue size setting
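
For readers unfamiliar with this kind of limit, a rate of "2/s, with some burst allowed" behaves like a token bucket. The sketch below is a generic token bucket, not the RunsOn implementation: the class name and the burst size are hypothetical, and the clock is injectable only to make the example deterministic.

```python
import time


class TokenBucket:
    """Generic token bucket: `rate` tokens are added per second, up to `burst`."""

    def __init__(self, rate: float, burst: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = burst
        self.tokens = float(burst)  # start full, so an initial burst is allowed
        self.clock = clock
        self.last = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False when the caller should wait."""
        now = self.clock()
        # Refill according to elapsed time, never exceeding the burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# 2 launches per second with an assumed burst of 2.
bucket = TokenBucket(rate=2, burst=2)
print([bucket.try_acquire() for _ in range(3)])
```

Calls that return False would stay queued and be retried later, instead of hitting the API and triggering RequestLimitExceeded.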

Have a great day!

Changelog v1.6.1 - ARM64 full image, support for S3 cache for workflows

RunsOn v1.6.1 has just been released 🎉.

Availability of ARM64 full image

New image name available: ubuntu22-full-arm64.

This image is mostly compatible with the GitHub Action ecosystem. It also has a lot of development and CI tooling (docker, kubernetes, nodejs, various languages, etc.) preinstalled. As soon as GitHub releases an official image, we will align the RunsOn image towards theirs.

Boot time for the ARM64 full image is only 20s, vs 40s for the x64 full image.

Support for unlimited cache to S3

RunsOn will now create an S3 bucket dedicated to cache artefacts. This bucket will be automatically accessible to the runners thanks to an IAM EC2 Instance Profile, so that no credentials need to be set up.

Then, simply replace actions/cache@v4 with runs-on/cache@v4 and your workflows will now store their caches on the local S3 bucket, which allows for:

  • much faster download/restore speed (300MB/s+ vs 50-100MB/s on GitHub)
  • UNLIMITED cache storage size (GitHub only gives you 10GB).

Pretty excited about that one! You can read more about it here.

s3 action cache

New info in logs

Added AMI ID, as well as availability zone in log outputs

Log outputs

Changelog v1.5.0

RunsOn v1.5.0 has just been released 🎉.

Faster boot for large images

ubuntu22-full-x64:

  • Before: between 140 to 160s from launch to runner ready
  • After: between 40 to 60s from launch to runner ready

Proper disk resizing

Volume is now properly resized, if given hdd size is greater than AMI size.

New timings section in logs

Very useful to compare boot times between runner types / images.

New license types

  • standard
  • sponsorship

Changelog v1.4.2

RunsOn v1.4.2 has just been released 🎉.

Two notable changes:

  • Can now customize the users that will have SSH access using the new admins attribute in the RunsOn configuration file.
  • Fixed a bug in the instance selection matching, leading to incorrect priority order when having a long list of family types defined.

Changelog v1.4.0

RunsOn v1.4.0 has just been released 🎉.

  • Bring full ubuntu22 images in line with official image, using the runner user for running workflows (as GitHub does).
  • Allow selecting a specific availability zone when installing RunsOn, to avoid hitting limitations on which resource types are available in certain AZs.
  • Add S3 gateway endpoint for VPC, in case you’re uploading/fetching artefacts from S3 buckets in your workflows.