We’ve been asked: OpenTofu/Terraform just released feature X — should I still use Terragrunt?

In a word, YES!

In 2016, Terragrunt was started to bridge the gap between Terraform’s limitations and the needs of platform engineers. That was nine years ago, a lifetime in the tech world, and a lot has changed. Today, it stands as the premier orchestration tool for keeping Infrastructure as Code (IaC) safe and productive, and it does that in five key areas.

In this post we’ll first explain why features get added to Terragrunt, along with a bit of the history, then dig into the five primary ways in which Terragrunt adds value in 2025 for platform teams and IaC engineers.

Why features are added to Terragrunt

Features are added to Terragrunt when there’s a need that isn’t being met in the IaC ecosystem. Gruntwork has worked with organizations large and small to maximize the safety and efficiency of their infrastructure management. Terragrunt was introduced because Gruntwork customers needed better tooling to handle provisioning infrastructure at scale, especially when collaborating between multiple platform engineers.

Early on, those features largely plugged the gaps in Terraform functionality. Nowadays, most features are added to expand on Terragrunt’s role as an IaC orchestrator, as OpenTofu and Terraform have become feature-complete enough that their responsibilities have diverged from those of Terragrunt. Some of that progress in OpenTofu/Terraform is actually the adoption of early Terragrunt features.

The earliest example of a Terragrunt feature that OpenTofu/Terraform adopted is DynamoDB state locking (the second Terragrunt commit, in fact). Terraform gained support for state locking in 2017 with the release of Terraform v0.9.0. With that release, Terragrunt had one fewer gap in Terraform to address, and was able to put more of its focus on the problems of orchestration and scaling IaC.

If you’re curious (like I was), Terragrunt had released version v0.11.0 around that time, where the commands spin-up and spin-down were deprecated in favor of (the now also deprecated) apply-all and destroy-all commands used to manage a “stack” of infrastructure. This might give you a sense of the direction Terragrunt was moving.

You might also find it interesting to read the migration instructions released about a month later, which deprecated that initial Terragrunt feature and recommended that users rely on the new underlying Terraform functionality natively.

With the introduction of OpenTofu, we are actually working to make these kinds of gap-closing features in Terragrunt even less necessary. The best example of this is the Provider Cache Server feature. OpenTofu/Terraform currently do not have any mechanism for ensuring that multiple concurrent invocations can safely access the same provider cache. To address this for Terragrunt users, we introduced a workaround feature that made Terragrunt the mediator for provider cache access to make concurrent OpenTofu/Terraform invocations safer. We don’t think most users should have to use Terragrunt to get this benefit, however. We are working with the OpenTofu core team to introduce capabilities into OpenTofu to make concurrent invocations of OpenTofu safer for everyone, regardless of whether or not they want to use Terragrunt.

There has always been a symbiotic relationship between Terragrunt and Terraform (and now OpenTofu). Terragrunt exists to support and extend the underlying IaC ecosystem, to innovate on what’s possible with IaC and to help teams build on reliable IaC fundamentals at scale. We get excited when OpenTofu/Terraform introduce new features that plug gaps in what their users demand because that means we get to focus on the orchestration and safety features that make Terragrunt special.

Units

The best example of how Terragrunt works in a way that is totally different from OpenTofu/Terraform is the ability to segment state across multiple smaller units of infrastructure in a file-system (if you’re not familiar with this terminology, a unit is a directory with a terragrunt.hcl file in it).

Terragrunt users have the ability to get really granular with the blast radius of their IaC updates, and to control that blast radius simply by navigating their file-system.

The average Terragrunt project has many terragrunt.hcl files, each representing its own piece of state, and users are able to intuitively decide what infrastructure they’re updating simply by navigating to a particular directory in their file-system.
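As a rough illustration of how each unit ends up with its own piece of state, here is a minimal sketch of a shared root configuration and one unit that includes it. The file locations, bucket name, region, and module version are placeholders chosen for this example, not taken from a real project.

```hcl
# root.hcl (hypothetical file at the top of the repository)
# Every unit that includes this file gets its own state key, derived from
# the unit's directory path relative to this file.
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    bucket  = "example-tofu-state" # placeholder bucket name
    key     = "${path_relative_to_include()}/tofu.tfstate"
    region  = "us-east-1" # placeholder region
    encrypt = true
  }
}
```

```hcl
# live/prod/vpc/terragrunt.hcl (one unit; the path mirrors the example in this post)
include "root" {
  path = find_in_parent_folders("root.hcl")
}

terraform {
  # Illustrative module source; the version is a placeholder.
  source = "tfr:///terraform-aws-modules/vpc/aws?version=5.16.0"
}

inputs = {
  name = "prod-vpc"    # placeholder value
  cidr = "10.0.0.0/16" # placeholder value
}
```

Under these assumptions, the VPC unit’s state would be stored under the key live/prod/vpc/tofu.tfstate, separate from the state of every other unit.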

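[Diagram: the blast radius of a change to the VPC unit with Terragrunt, compared with OpenTofu/Terraform alone]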

In this diagram you can see that a change to a VPC unit is limited in potential damage when using Terragrunt, but that isn’t the case when using OpenTofu/Terraform alone. Terragrunt users achieve this blast radius reduction simply by running terragrunt commands in the live/prod/vpc directory.

When using OpenTofu alone, any update to infrastructure (updating ecs-service-a1, for example) puts the entire root module at risk, because any resource in it might be changed. By default, OpenTofu/Terraform will try to update anything that needs updating when you run a tofu command in the live/prod directory. OpenTofu/Terraform users can use resource targeting to get more granular with their updates (and OpenTofu users specifically have better tooling via the -exclude flag, new as of writing), but, as the OpenTofu/Terraform docs note, targeting is not a recommended or reliable way of making isolated infrastructure updates.

There are, of course, trade-offs to this design, but (in our opinion) it’s a design that works a lot better in Terragrunt than it feasibly can in OpenTofu/Terraform alone. We actively develop against this design in infrastructure management, and have built tooling into Terragrunt to support it. As a result, Terragrunt users have a much easier time working with individual units, and across them.

Terragrunt users have tooling to work with multiple units in concert, to pass data between units, and to encode dependencies and order of operations between them. We’re also pioneering new tooling to make this process even more convenient at scale for platform teams (see The Road to 1.0: Terragrunt Stacks for more).
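To make the idea of passing data between units concrete, here is a minimal sketch of a unit that reads outputs from a neighboring VPC unit through a dependency block. The directory names, the local module path, the input names, and the mock values are all hypothetical and only serve the illustration.

```hcl
# live/prod/ecs-cluster-a/terragrunt.hcl (hypothetical unit)
include "root" {
  path = find_in_parent_folders("root.hcl")
}

terraform {
  # Hypothetical in-house module that accepts the inputs below.
  source = "../../../modules/ecs-cluster"
}

# Declares that this unit depends on the VPC unit and exposes its outputs.
dependency "vpc" {
  config_path = "../vpc"

  # Placeholder outputs so that validate and plan can run before the VPC exists.
  mock_outputs = {
    vpc_id          = "vpc-00000000"
    private_subnets = ["subnet-00000000"]
  }
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  vpc_id     = dependency.vpc.outputs.vpc_id
  subnet_ids = dependency.vpc.outputs.private_subnets
}
```

Besides wiring the VPC’s outputs into this unit’s inputs, the dependency block also tells Terragrunt that the VPC unit has to be applied before this one.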

Stacks

What most users consider the killer feature of Terragrunt is the run-all command. It provides the ability to perform multiple concurrent Terragrunt runs in a stack of units while using a Directed Acyclic Graph (DAG) to preserve proper ordering of updates.

Let’s take a simple example of a requirement to provision some ECS services (container orchestration service from AWS). In this example, a user has no infrastructure provisioned in AWS, and would like to get some services up and running.

If you’re not familiar with ECS, know that an ECS service is a way to run containers in a cluster, and that a service must be provisioned in the context of a cluster. Without a cluster, you cannot provision an ECS service. Similarly, all of those ECS resources exist within the context of a VPC.

Let’s take a look at a diagram of some infrastructure that needs to be updated.

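[Diagram: provisioning order for the example stack, with the VPC first, then the ECS clusters, then the ECS services]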

In this diagram, we can see that, as required by AWS, the VPC is created first, followed by the ECS clusters, followed by ECS services. These units have their own independent state, and they aren’t explicitly told to provision in that order, so how does Terragrunt know to provision them that way?

The answer is that Terragrunt uses the same mechanism that OpenTofu/Terraform use within an individual root module to coordinate these updates. It tracks dependencies between units, and follows the DAG to ensure that applies (creations and updates) flow in the direction of dependency to dependent and destroys flow in the opposite direction (dependent to dependency).

This ensures that every unit has its dependencies provisioned before it, and every unit is cleaned up before any of its dependencies.
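When a unit needs the ordering but none of the outputs, the edge in the DAG can be declared on its own. The following is a minimal sketch of that, assuming a hypothetical service unit that sits next to the VPC and cluster units; the paths and the module source are placeholders.

```hcl
# live/prod/ecs-service-a1/terragrunt.hcl (hypothetical unit)
include "root" {
  path = find_in_parent_folders("root.hcl")
}

terraform {
  # Hypothetical in-house module.
  source = "../../../modules/ecs-service"
}

# Ordering-only edges: no outputs are read from these units, but they are
# applied before this unit and destroyed after it.
dependencies {
  paths = ["../vpc", "../ecs-cluster-a"]
}
```

With edges like these in place, a run across the whole stack from the live/prod directory would apply the VPC, then the cluster, then the service, and would destroy them in the reverse order.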

This killer feature allows Terragrunt to scale IaC far more reliably than when using OpenTofu/Terraform alone. At a certain scale, engineers find it too dangerous to have the same CLI invocation used to update an ECS service potentially destroy their VPC, and when the two resources are managed in the same OpenTofu/Terraform state, that risk is ever-present. By isolating state into different units, and providing tooling for working across units, Terragrunt provides the best of both worlds: Safely isolated infrastructure units that are well integrated into cohesive stacks.

Side effects

Another subtle example of how Terragrunt tackles challenges differently than OpenTofu/Terraform is that it accepts a far messier view of the world than they do. Namely, Terragrunt allows the developer to intentionally cause or react to side effects and unhappy-path conditions before or after an OpenTofu/Terraform run.

OpenTofu/Terraform are really good at getting a set of configurations defined in .tf files accurately provisioned as real infrastructure. One of the reasons they are so good at this is that they limit what you can do in your configuration files and what those configuration files can do when you run them.

They are designed to work like functional programming languages, taking state and configuration, then driving resources to a new desired state when those configurations change, with the only allowed side effects (the stuff outside configuration and state changes) being the resources they are supposed to manage (for the most part).

To be clear, this is really good design. If you want to be able to reliably reproduce some infrastructure that you define with a given pattern, using something that has as few side effects as possible is a good thing. It means that the system behaves like a pure function and as a result has predictable, testable properties — things that are very valuable when dealing with expensive and sensitive infrastructure.

Unfortunately, in the real world, there are often scenarios where it can be really handy to introduce side effects into infrastructure management in a controlled way. Networks can be flaky, one-off scripts can get the job done, and sometimes it’s better to integrate your (potentially buggy) code with your coworkers’ early than to avoid integration for fear of taking down all your infrastructure.

Terragrunt has a lot of tooling designed to handle these real-world scenarios, and it does them in simple ways that the average engineer can take advantage of quickly.

Error handling

Take the following configuration block that the average Terragrunt user might see in their terragrunt.hcl files:

errors {
  retry "transient_errors" {
    retryable_errors   = [".*Error: transient network issue.*"]
    max_attempts       = 3
    sleep_interval_sec = 5
  }

  ignore "known_safe_errors" {
    ignorable_errors = [
      ".*Error: safe warning.*",
      "!.*Error: do not ignore.*"
    ]
    message = "Ignoring safe warning errors"
  }
}

That is the Terragrunt-native way of handling errors in units. I bet you can guess what those configurations do: the retry block lets the OpenTofu/Terraform code that Terragrunt is orchestrating safely retry updates that fail with transient errors, and the ignore block lets it carry on past errors that are safe to ignore.

This is the kind of practical tooling that is built into Terragrunt to handle messy, real-world edge-cases. Terragrunt allows you to neatly handle side-effects so that you don’t really need to think about that complexity when writing your OpenTofu/Terraform code. You don’t need to file a pull request to fix the provider that doesn’t handle network outages well, or add a bash script that will retry your applies. You codify all of your infrastructure management practices in your configuration, and Terragrunt makes sure it happens reliably and consistently.

Hooks

Another killer feature users adopt Terragrunt for is the ability to add hooks that drive behavior surrounding IaC updates. Using hooks allows users to keep their OpenTofu/Terraform code generic and easier to maintain, while accounting for operational procedures that are more convenient to define outside of IaC.

If, for example, you wanted to make sure that you always took a backup of your database before you made any update to it (as many experienced engineers like to do), you can codify that practice like this:

terraform {
  source = "tfr:///terraform-aws-modules/rds/aws?version=6.10.0"

  before_hook "backup" {
    commands = ["apply"]
    execute  = ["my-backup-script.sh"]
  }
}

Similarly, if you wanted to ensure that updates to a service were always accompanied by a smoke test to confirm the service is still healthy, you can codify that just as simply:

terraform {
  source = "tfr:///terraform-aws-modules/ecs/aws?version=5.12.0"

  after_hook "smoke_test" {
    commands = ["apply"]
    execute  = ["my-smoke-test-script.sh"]
  }
}

By taking advantage of these tools, you can conveniently codify what you and your colleagues do outside IaC updates, and ensure that those steps are always reproduced reliably, regardless of whether they are run locally, on a colleague’s machine, or in CI/CD. You don’t have to explain the domain-specific knowledge relevant to the procedure or ask someone if they remembered to do it. It’s just part of how the infrastructure is defined.

Terragrunt has much more in the way of this kind of tooling for use-cases like data fetching, error handling, state management, and more. The goal is to make it so that you, as the operator, do as little Gruntwork as possible when managing your infrastructure, regardless of whether you’re only using IaC to drive updates to it.

The Future of Terragrunt

We’re passionate about the future of Terragrunt. There’s a lot in the works to expand its potential as an IaC orchestrator, and to make the lives of DevOps engineers better!

You can learn more about the next major milestones for Terragrunt by reading The Road to Terragrunt 1.0 blog post. We’re excited about the advancements we discuss there and how they’ll improve the safety and productivity of infrastructure updates.

Make sure to give this blog post a clap or two if you’ve enjoyed it, and give us a star on GitHub.

Special thanks to Eben Eliason, Josh Padnick, Tin Nguyen and Zach Goldberg for their feedback on this blog post. A special thank you to Eben for the beautiful graphics used in this blog post. Make sure to give him a shout out in the comments!