
How building a Terraform module made me fall in love with CloudFormation

Building the RunsOn Terraform module made me rethink CloudFormation and where it is still the right deployment tool.


I used to think CloudFormation was useless.

Not in a nuanced “it has its trade-offs” way. I mean I genuinely avoided it whenever I could.

And I don’t think I was alone. According to the 2025 Answers for AWS survey, CloudFormation is the most used IaC tool in the AWS ecosystem at 88% usage and 100% awareness. But only 62% of people who use it actually want to keep using it. Meanwhile Terraform and OpenTofu sit above 84% retention.

Survey snapshot (draft survey framing)

CloudFormation usage: 88%
CloudFormation awareness: 100%
CloudFormation retention: 62%
Terraform/OpenTofu retention: 84%+

Everyone knows CloudFormation (CF) and it’s used everywhere. But a large share of the people using it would rather not.

I was firmly in that camp until I built a Terraform module for RunsOn, which was previously only deployable via CloudFormation. That project showed me what CloudFormation was actually good at.

And it turns out, it’s good at a lot.

Why CloudFormation was (and still is) perfect for RunsOn

RunsOn gives you self-hosted GitHub Actions runners on your own AWS account. You deploy it, it spins up ephemeral EC2 instances for your CI jobs, and you get faster builds at a fraction of what GitHub charges for their runners.

The key thing about RunsOn is the experience: you go from zero to running your first job on self-hosted infrastructure in a few minutes. No DevOps expertise required. Just deploy the stack and you’re live.

This experience was only made possible via CloudFormation. You click a link, fill in a few parameters in the AWS Console, hit create, and everything gets created as one atomic unit.

It just works, and users love it despite CloudFormation’s bad reputation.

So I thought to myself:

How hard could it be to replicate the CF experience in Terraform?

When I sat down to build the module, the sole design principle was to give TF users the same experience CF users already had. Zero to running with minimal configuration and stateful resources protected out of the box. No surprises.

Sounds straightforward, right?

Well, not quite.

Not because Terraform is bad; it’s brilliant at what it’s designed for. But because what it’s designed for and what RunsOn needs from its deployment tool turned out to be two very different things.

And that gap is what changed how I think about CloudFormation.

Let’s take resource lifecycle management for example

The first time I was really impressed by CF was when I tried to replicate one of its simplest concepts in Terraform.

RunsOn creates resources that hold user data: S3 buckets, EFS file systems, ECR repositories. These need careful lifecycle management because if someone accidentally deletes their stack, you don’t want that data to disappear. But if they’re deploying for the first time and something fails, they do want a clean slate, with no orphaned resources cluttering up their account.

CloudFormation handles all of this with one concept: DeletionPolicy. You set it on each resource and the behaviour is exactly what you’d expect.

DeletionPolicy: RetainExceptOnCreate

One line.

If the initial creation fails, clean up. If you delete the stack later, keep the data. And this works the same way for every resource type: S3 buckets, EFS, ECR, whatever.
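The rule described above can be written down as a tiny decision function. This is only a toy model of the behaviour as this post describes it, not how CloudFormation implements it; the names DeletionPolicy, is_retained and created_in_rolled_back_operation are made up for illustration, and the Snapshot policy is left out.

```python
from enum import Enum


class DeletionPolicy(Enum):
    DELETE = "Delete"
    RETAIN = "Retain"
    RETAIN_EXCEPT_ON_CREATE = "RetainExceptOnCreate"


def is_retained(policy: DeletionPolicy, created_in_rolled_back_operation: bool) -> bool:
    """Return True if the resource stays in the account after leaving the stack.

    created_in_rolled_back_operation is True when the resource is being removed
    because the stack operation that first created it failed and is rolling back.
    (Hypothetical parameter name, used here only to model the rule.)
    """
    if policy is DeletionPolicy.DELETE:
        return False
    if policy is DeletionPolicy.RETAIN:
        return True
    # RetainExceptOnCreate: act like Retain, except after a failed first creation.
    return not created_in_rolled_back_operation
```

The point of the model is that the same single rule applies whatever the resource type is; nothing in it mentions buckets, file systems or repositories.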

Terraform doesn’t have an equivalent concept.

For S3 buckets, the provider has a force_destroy attribute that can be set dynamically. For ECR repositories, there’s a separate force_delete attribute. These work fine on their own, but they’re different mechanisms for different resource types.

And then there’s EFS, where the protection comes from Terraform’s prevent_destroy lifecycle argument, which is static: it only accepts a literal value.

You can’t make it conditional.

You can’t say “protect this resource unless we’re running tests” or “protect this unless it’s the first deployment.” The workaround is to create two copies of the resource: one with protection on, one without, and use conditional logic to pick the right one.

So what CF handles with one concept across all resource types, Terraform needed three different mechanisms: force_destroy for buckets, force_delete for ECR, and duplicate resources with conditional counts for everything that relies on prevent_destroy.

And here’s what made me laugh:

One of Terraform’s main advantages is testability; you can use tools like Terratest to spin up infrastructure, validate it, and tear it down. But to make that teardown actually work, I had to create three separate test flags (force_destroy_buckets, force_delete_ecr, prevent_destroy_optional_resources) just so the test harness could clean up after itself.

CloudFormation doesn’t need any of that. You delete the stack and the deletion policies handle the rest.

This again comes down to Terraform’s design choice. It favours explicit, composable, per-resource control over unified implicit behaviours. That’s usually a strength, but for this specific problem, CloudFormation’s approach is just… better.

That was the first thing I noticed.

Things like having 10+ resources per S3 bucket and scattered conditional logic (as opposed to a unified conditional block in CF) were noticeable too, but those are more in the territory of accepted Terraform trade-offs. Plus with AI, writing more lines of code is not really a pain anymore.

But there’s a bigger thing I noticed:

What if the application wants to talk to the infrastructure it’s housed in?

I realised this is where the CF vs TF debate has a massive blind spot.

Most comparisons between these tools focus on the authoring experience. Which is easier to maintain? Which handles state better? And so on. But they almost never talk about what happens after you run apply or create-stack.

For a product like RunsOn, that’s where the real difference lives.

You see, RunsOn’s entire purpose is to manipulate its own infrastructure. The application lives in an App Runner service that gets deployed as part of the stack. When a GitHub webhook comes in, the app talks to SQS for job queuing, DynamoDB for job config management, then spawns ephemeral EC2 instances to run the CI jobs. It uses S3 for caching, CloudWatch for logging, EFS for ephemeral storage, ECR for Docker build caching and so on.

The application doesn’t just sit on the infrastructure. It is infrastructure orchestration.

Which means the application needs to know what infrastructure it has available to it. It needs to find its own resources at runtime.

With CloudFormation, this is dead simple.

A CF stack is a first-class entity in AWS. It has an ARN and it’s queryable. You call one API and get back every resource in the deployment with a deterministic logical ID. The stack is essentially a manifest that says “here’s everything this deployment contains and here’s how to find each piece.”

The original RunsOn CLI used this to discover its own infrastructure in about 10 lines of code. Look up the stack, find the resources by logical ID, done.
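To give a feel for how little code that is, here is a minimal sketch of stack-based discovery in Python. It is not the RunsOn CLI’s actual code. It assumes a boto3-style CloudFormation client is passed in, and it uses the ListStackResources paginator; the function name discover_stack_resources is hypothetical.

```python
from typing import Any, Dict


def discover_stack_resources(cfn: Any, stack_name: str) -> Dict[str, str]:
    """Map every logical ID in a stack to its physical resource ID.

    cfn is assumed to be a boto3 CloudFormation client, for example
    boto3.client("cloudformation").
    """
    resources: Dict[str, str] = {}
    paginator = cfn.get_paginator("list_stack_resources")
    for page in paginator.paginate(StackName=stack_name):
        for summary in page["StackResourceSummaries"]:
            # PhysicalResourceId can be missing while a resource is still being created.
            physical_id = summary.get("PhysicalResourceId")
            if physical_id:
                resources[summary["LogicalResourceId"]] = physical_id
    return resources
```

With a real client, a lookup would be something like discover_stack_resources(cfn, "my-stack")["CacheBucket"], where both the stack name and the logical ID are placeholders. Because logical IDs come straight from the template, the application can hard-code them and let the stack supply the physical names.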

Terraform doesn’t leave anything like this behind. Once apply finishes, AWS sees a bunch of independent resources with no inherent grouping. There’s no “stack” to query. From AWS’s perspective, the resources are unrelated.

When we added Terraform as a deployment option, the discovery layer that was 10 lines of CloudFormation API calls became roughly 200 lines of tag-based querying, fallback logic, and ARN pattern matching. Now every new resource we add needs changes in two places for TF (the module and the CLI) versus one for CF.
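For comparison, here is a rough sketch of just the first step of tag-based discovery, again in Python and again not the real RunsOn code. It assumes a boto3-style Resource Groups Tagging API client and a tag that the Terraform module would put on everything it creates; the tag key and value, and the name discover_by_tag, are hypothetical.

```python
from typing import Any, Dict, List


def discover_by_tag(tagging: Any, tag_key: str, tag_value: str) -> Dict[str, List[str]]:
    """Group the ARNs of all resources carrying a tag by AWS service.

    tagging is assumed to be a boto3 client for "resourcegroupstaggingapi".
    """
    by_service: Dict[str, List[str]] = {}
    paginator = tagging.get_paginator("get_resources")
    pages = paginator.paginate(TagFilters=[{"Key": tag_key, "Values": [tag_value]}])
    for page in pages:
        for mapping in page["ResourceTagMappingList"]:
            arn = mapping["ResourceARN"]
            # ARN layout: arn:partition:service:region:account-id:resource
            service = arn.split(":", 5)[2]
            by_service.setdefault(service, []).append(arn)
    return by_service
```

Even this only gets you a pile of ARNs per service. Working out which bucket is the cache bucket, or which queue is the job queue, needs extra tags or naming rules on top, and that is where the additional lines go.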

But the line count isn’t really the point. The point is what it reveals about the relationship between IaC and the application it deploys.

CloudFormation treats the stack as a living entity that persists after deployment. The application can query it, reference it, lean on it. There’s a continuous relationship between the deployment tool and the deployed application.

Terraform treats deployment as a finished event. The tool’s job is done after apply. What happens at runtime is the application’s problem.

For a typical web app, that’s fine. The app doesn’t need to know how it was deployed.

But for something like RunsOn, that post-deployment relationship matters.

A lot.

So why does RunsOn need a Terraform module?

Because Terraform is genuinely better for a different set of users.

The teams that want the RunsOn Terraform module aren’t the ones who want a one-click deployment. They’re platform engineers who already have a VPC, already have state management configured, already have CI pipelines that run terraform plan on every PR. They want to plug RunsOn into that existing workflow, not manage a separate CloudFormation stack alongside it.

And for them, the Terraform module is the right answer.

They get composability: bring your own VPC and configure things exactly how you want. They get testability. They get OpenTofu as an open-source option. They get GitOps integration.

These are real strengths. For teams with existing Terraform pipelines, the module is genuinely the better deployment path.

Terraform isn’t bad; it’s just not the universal answer the community sometimes treats it as.

Different tools for different users

After going through this whole experience and looking back at those survey numbers, here’s where I land:

I think those numbers reflect a community that’s evaluating CF against the wrong criteria. They’re judging it as a general-purpose IaC tool. And yeah, against Terraform in that context, it falls short.

But that’s like judging a screwdriver for being a bad hammer. CloudFormation isn’t trying to be Terraform; it’s solving a different problem.

For vendor-deployed, self-contained, single-stack AWS products like RunsOn, CloudFormation is genuinely brilliant: the Console GUI, the managed state, the atomic stack operations, and the first-class stack entity that your application can query at runtime.

And the roadmap for RunsOn reflects this belief:

CloudFormation will always be the default deployment path, and we’re making it simpler and more streamlined for users who just want to start using RunsOn without any special configuration.

The Terraform module, on the other hand, will get all the bells and whistles for users who want fine-tuned control over every aspect of their deployment.

Best of both worlds? Maybe.

More like having the right tools for the right jobs.

The takeaway

I built a Terraform module to give RunsOn users more deployment options. What I didn’t expect was that the process would make me fundamentally reconsider a tool I’d written off years ago.

CloudFormation isn’t the outdated, clunky thing I thought it was. It’s a deployment mechanism that’s purpose-built for a specific category of problems, and for that category, nothing else comes close.

The IaC ecosystem needs both.

And the next time someone dunks on CloudFormation in a thread, I hope they’ve actually tried to build the thing they’re dunking on in Terraform first.

Because I did. And it changed my mind.