RunsOn is now handling 200k jobs per day
New achievement unlocked: RunsOn is now handling 200k jobs per day across all users (at least those with telemetry enabled)! 🎉
I was invited on the French podcast Nom d’un Pipeline ↗ to talk about RunsOn and how to make large savings with your own runners. Here is the episode link ↗.
Note: this is a recording of a similar talk given at a DevOps meetup in Rennes, France, on May 16. You’ll find a generated transcript summary below, but you probably want to watch the video instead.
Hello everyone and thanks for coming to this presentation on GitHub Actions and how to make it faster and 10x cheaper. But first a brief primer on GitHub Actions and especially the good parts.
GitHub Actions is a way to run workflows automatically whenever you push code or open a pull request or do anything on your repository.
It has very high adoption, a flexible workflow syntax, and a large choice of architectures: you can run workflows targeting Linux x64, macOS, or Windows, so it’s quite versatile and really useful.
Here are some major issues with GitHub Actions:
Performance and Cost: The default runners on GitHub are pretty weak, sporting just two cores that are both slow and expensive, costing over $300 a month if used non-stop. On the other hand, alternatives like BuildJet, WarpBuild, and Ubicloud offer quicker and cheaper services.
Caching and Compatibility Issues: GitHub’s caching tops out at 100MB/s, which can bog down workflows involving large files. Also, there’s no full support for ARM64 runners yet (they’re still in beta), which slows down builds that need multiple architectures.
Resource Optimization and Time Waste: GitHub’s weaker machines mean you often have to spend a lot of time fine-tuning your test suites to get decent run times. This eats up a lot of engineering hours that could be saved by switching to more robust runners from other providers or by setting up your own.
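As a rough check on the "over $300 a month" figure in the first point, here is a back-of-the-envelope calculation. The per-minute rate is an assumption (GitHub's list price for the standard 2-core Linux runner around the time of the talk), so treat the result as an order of magnitude rather than a quote.

```python
# Back-of-the-envelope monthly cost of one runner kept busy non-stop.
# ASSUMPTION: 0.008 USD per minute, the list price of GitHub's standard
# 2-core Linux runner around the time of this talk; check current pricing.
PRICE_PER_MINUTE_USD = 0.008


def monthly_cost(price_per_minute: float, days: int = 30) -> float:
    """Cost in USD of running one job continuously for `days` days."""
    minutes = days * 24 * 60
    return price_per_minute * minutes


if __name__ == "__main__":
    print(f"{monthly_cost(PRICE_PER_MINUTE_USD):.2f} USD per month")
```

Swapping in the per-minute price of another provider or of an EC2 instance gives a quick like-for-like comparison.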
Self-hosted runners offer a practical solution for those looking to speed up their builds and reduce costs. By setting up your own machines and configuring them with GitHub’s runner agent, you can achieve faster build times at a lower price.
When using non-official runners, you can choose among 3 levels:
This approach, which I’ll call ‘artisanal on-premise’, involves using a few of your own servers and registering them with GitHub. It’s cost-effective and manageable for a small number of machines but has limitations such as limited concurrency, maintenance requirements, security risks, and lack of environment consistency with GitHub’s official runners.
For a more robust setup, consider the ‘productized on-premise’ approach. This involves similar self-hosting principles but requires additional software like the Actions Runner Controller or the Philips Terraform project to help manage the runners. This setup offers better hardware flexibility and scalability, as it can dynamically adjust the number of virtual machines based on demand. However, it requires more expertise to maintain and still lacks full image compatibility with GitHub’s official runners, necessitating custom Docker images or AMIs.
The final option is to use third-party providers for more affordable machines. These providers handle maintenance, so you just pay for the service. Most support official images, and they typically offer a 50% cost reduction. However, using these services means you’ll need to share your repository content and secrets, which could be exposed if there’s a security breach. The hardware options are limited; you can choose the number of CPUs but not specific details like the processor type, disk space, or GPU. Additionally, if you need more than 64 CPUs concurrently, extra fees may apply. Often, these services are hosted in locations with suboptimal network speeds.
Here’s a quick overview of the market options for GitHub Actions alternatives:
While searching for a cost-effective and efficient self-hosted solution, I found the fully on-premise options challenging to set up, slow to start, and with lengthy queuing times. Additionally, AWS CodeBuild, despite its advantages, is costly and comes with its own set of limitations.
I’ve been developing RunsOn, a new piece of software aimed at creating a more affordable and efficient on-premise GitHub Actions runner. Here’s a quick rundown:
Overall, the goal is to make RunsOn a robust, user-friendly solution that enhances the efficiency of running automated workflows.
Key points on scalability for RunsOn:
So basically, I wanted to do just this: change one line, and my workflow should still work. This is probably one of the hardest parts because you have to make compatible OS images, in my case for EC2, and nobody did this, or nobody published it at least.
So in my case, thankfully, GitHub publishes the Packer templates for the Runner images on Azure, so I just ported them for AWS, and this is now available for anyone to use. You can find the links here.
The final feature is low maintenance. As you can see, the architecture diagram has changed a bit since the last slide, but basically RunsOn uses cheap, managed services everywhere. A single CloudFormation stack provisions an SQS queue, an SNS alert topic, CloudWatch logs and metrics, and some S3 buckets. The RunsOn server runs on AWS App Runner, which is a really cheap way to run containers on AWS; I recommend you check it out. On each VM, a small RunsOn agent launches to configure the VM and then register it with GitHub. With a reasonable number of jobs, the whole stack costs only about one or two dollars a month, which is pretty impressive.
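To make the shape of such a stack concrete, here is a minimal sketch that builds a CloudFormation template skeleton containing the managed services listed above. This is not the RunsOn template: the logical names (JobQueue, AlertTopic, LogGroup, CacheBucket) and the log retention value are hypothetical, and the App Runner service is left out because it needs a source configuration that is not described here.

```python
import json


def build_template() -> dict:
    """Return an illustrative CloudFormation skeleton (NOT the RunsOn template)."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative skeleton of a managed-services stack",
        "Resources": {
            # Hypothetical logical names; only the resource types are real.
            "JobQueue": {"Type": "AWS::SQS::Queue"},
            "AlertTopic": {"Type": "AWS::SNS::Topic"},
            "LogGroup": {
                "Type": "AWS::Logs::LogGroup",
                # Assumed retention; pick whatever fits your needs.
                "Properties": {"RetentionInDays": 30},
            },
            "CacheBucket": {"Type": "AWS::S3::Bucket"},
        },
    }


if __name__ == "__main__":
    # Print the template as JSON, ready to be saved to a file.
    print(json.dumps(build_template(), indent=2))
```

Everything in this skeleton is billed on usage, which is consistent with the low monthly cost mentioned above when job volume is modest.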
Here’s a quick overview of the additional features and real-world results of RunsOn:
Future enhancements for RunsOn include:
Gateway endpoints for Amazon S3 are a must-have whenever your EC2 instances exchange traffic with S3, because they keep that traffic within the AWS network, which improves security, bandwidth, throughput, and costs. They can easily be created and added to your VPC route tables.
But how do you verify that traffic is indeed going through the S3 gateway, and not crossing the public internet?
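For reference, creating such an endpoint programmatically might look like the sketch below. It assumes boto3 is installed and AWS credentials are configured; the helper names and the example IDs are hypothetical, while create_vpc_endpoint and its parameters are part of the EC2 API.

```python
from __future__ import annotations


def s3_gateway_endpoint_params(
    vpc_id: str, region: str, route_table_ids: list[str]
) -> dict:
    """Build keyword arguments for EC2 CreateVpcEndpoint (gateway type, S3)."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        # S3 service names follow the pattern com.amazonaws.<region>.s3
        "ServiceName": f"com.amazonaws.{region}.s3",
        # Route tables that should receive the route to the endpoint
        "RouteTableIds": list(route_table_ids),
    }


def create_s3_gateway_endpoint(
    vpc_id: str, region: str, route_table_ids: list[str]
) -> str:
    """Create the endpoint and return its ID. Requires boto3 and credentials."""
    import boto3  # third-party; imported lazily so the helper above stays dependency-free

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.create_vpc_endpoint(
        **s3_gateway_endpoint_params(vpc_id, region, route_table_ids)
    )
    return response["VpcEndpoint"]["VpcEndpointId"]


if __name__ == "__main__":
    # Hypothetical IDs, shown only to illustrate the parameter shape.
    print(s3_gateway_endpoint_params("vpc-0abc", "us-east-1", ["rtb-0def"]))
```

Passing the route table IDs at creation time is what adds the S3 route to each table, so no separate routing step is needed.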
Using traceroute, you can probe the routes and see whether you are directly hitting the S3 servers (i.e. no intermediate gateway). In this example, the instance is running from a VPC located in us-east-1:
Both outputs produce the expected result, i.e. no intermediary gateway. This is what would happen if you were accessing a bucket located in the us-east-1 region.
Let’s see what happens if we try to access an S3 endpoint located in another region:
As you can see, the route is completely different, and as expected does not go straight to the S3 endpoint.
TL;DR: make sure your route tables are correct, and only point to S3 buckets located in the same region.
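As a complement to traceroute, the route table side of this check can be sketched in code. The function below inspects a route table in the dictionary shape returned by the EC2 DescribeRouteTables API and reports whether it contains an active route from a prefix list to a gateway VPC endpoint. The function names are hypothetical, and the check does not confirm that the prefix list is the S3 one (DynamoDB gateway endpoints look the same), so it is a first-pass sanity check only.

```python
def has_gateway_endpoint_route(route_table: dict) -> bool:
    """True if an active route sends a prefix list to a gateway VPC endpoint.

    `route_table` is one item of DescribeRouteTables()["RouteTables"].
    Note: this matches any gateway endpoint (S3 or DynamoDB); confirm the
    prefix list belongs to S3 separately if that distinction matters.
    """
    for route in route_table.get("Routes", []):
        if (
            route.get("DestinationPrefixListId", "").startswith("pl-")
            and route.get("GatewayId", "").startswith("vpce-")
            and route.get("State") == "active"
        ):
            return True
    return False


def same_region(bucket_region: str, instance_region: str) -> bool:
    """Gateway endpoints only cover S3 traffic within the VPC's own region."""
    return bucket_region.strip().lower() == instance_region.strip().lower()
```

With boto3, the input would come from `ec2.describe_route_tables()["RouteTables"]`; that call is left out so the sketch stays runnable without credentials.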
As part of the RunsOn service, we automatically maintain and publish replicas of the official GitHub runner images ↗ as AWS-formatted images (AMIs) in this repository: https://github.com/runs-on/runner-images-for-aws ↗.
New images are automatically released every 2 weeks, and are slightly trimmed to remove outdated software, or (mostly useless) caches.
ubuntu22-full-x64
ubuntu22-full-arm64
ubuntu24-full-x64
ubuntu24-full-arm64
us-east-1
us-east-2
us-west-2
eu-west-1
eu-west-2
eu-west-3
eu-central-1
ap-south-1
ap-northeast-1
ap-southeast-1
ap-southeast-2
For the x86_64 image, search for:
AMI name: runs-on-v2.2-<IMAGE_ID>-*
Owner: 135269210855
For instance, for the ubuntu22-full-x64 image, search for:
AMI name: runs-on-v2.2-ubuntu22-full-x64-*
Owner: 135269210855
You can find more details on https://github.com/runs-on/runner-images-for-aws ↗.
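The AMI search described above can also be scripted. The sketch below builds the name pattern from an image ID and picks the most recent match; it assumes boto3 and AWS credentials for the actual lookup, and the helper names are hypothetical. The owner ID and name prefix are the ones given above.

```python
from __future__ import annotations

OWNER_ID = "135269210855"  # AMI owner account given above


def ami_name_pattern(image_id: str, version: str = "v2.2") -> str:
    """Name filter for a given image, e.g. ubuntu22-full-x64."""
    return f"runs-on-{version}-{image_id}-*"


def latest_image(images: list[dict]) -> dict | None:
    """Pick the newest image; CreationDate is an ISO 8601 string, so it sorts lexically."""
    return max(images, key=lambda img: img["CreationDate"], default=None)


def find_latest_ami(image_id: str, region: str) -> dict | None:
    """Look up the newest matching AMI. Requires boto3 and credentials."""
    import boto3  # third-party; imported lazily so the helpers above stay dependency-free

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.describe_images(
        Owners=[OWNER_ID],
        Filters=[{"Name": "name", "Values": [ami_name_pattern(image_id)]}],
    )
    return latest_image(response["Images"])


if __name__ == "__main__":
    print(ami_name_pattern("ubuntu22-full-x64"))
```

Since new images are released regularly, selecting by creation date rather than hard-coding an AMI ID keeps the lookup valid across releases.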