Pure Terraform modules

Arthur Busser
Site Reliability Engineer

As our infrastructure scales, our Terraform codebase grows. To remain productive, we need to keep our code flexible and extendable. A design pattern we find useful is the pure Terraform module. Let's dive into what it is and how it helps us write maintainable Terraform code.

What are pure modules?

We borrow the term "pure" from functional programming, where a pure function is one whose result depends only on its arguments and that has no side effects: it does not mutate any state outside of its scope. Pure functions are easier to reason about, test, and reuse.

Just like pure functions, pure Terraform modules do not interact with any external system. They do not create any resources, read any data sources, or use any providers. Their role is to provide information about our infrastructure to the caller. This seems counterintuitive at first. After all, Terraform is all about creating and managing resources. However, we find that pure modules are a useful way to structure our Terraform codebase.

Let's look at an example from our codebase: the cluster_info module. It provides information about our Kubernetes clusters. We call it like this:

module "info" {
  source = "../cluster_info"

  cluster_id = "production-eu1"
}

The module.info object now contains information about our cluster that we can use to create resources, configure providers, or express business logic. Once called, the module is essentially an object:

module.info = {
  name    = "production"
  purpose = "product"
  project = "pigment-production-123456"
  region  = "europe-west3"
  cidr    = "10.12.0.0/16"
  # and more...
}
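
As an illustration, a caller might consume these outputs as follows. This is a minimal sketch rather than an excerpt from the codebase: the provider configuration and the `common_labels` local are assumptions made for the example, while the `name`, `purpose`, `project`, and `region` outputs are the ones listed above.

```hcl
# Look up static facts about the cluster.
module "info" {
  source = "../cluster_info"

  cluster_id = "production-eu1"
}

# Configure a provider from the module's outputs. This works because the
# values are static and therefore already known at plan time.
provider "google" {
  project = module.info.project
  region  = module.info.region
}

# Express business logic on top of the outputs.
# `common_labels` is a hypothetical local, used here only for illustration.
locals {
  common_labels = {
    cluster = module.info.name
    purpose = module.info.purpose
  }
}
```

Nothing in this sketch passes the project or region around as variables: any configuration that needs them can call the module itself.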

The cluster_info module's implementation is straightforward. It contains information about our clusters, organized by cluster ID. When called, it looks up the information for the given cluster ID and returns it:

variable "cluster_id" {
  description = "ID of the cluster to provide info about"
  type        = string
}

locals {
  # The GCP region where each cluster is located.
  region_by_cluster = {
    argocd = "europe-west3" # Frankfurt, DE

    staging-eu1    = "europe-west3" # Frankfurt, DE
    staging-eu1-dr = "europe-west9" # Paris, FR

    staging-us1    = "us-west1"    # Oregon, US
    staging-us1-dr = "us-central1" # Iowa, US

    staging-vault-eu1 = "europe-west9" # Paris, FR
    staging-vault-eu2 = "europe-west1" # St. Ghislain, BE

    production-eu1    = "europe-west3" # Frankfurt, DE
    production-eu1-dr = "europe-west9" # Paris, FR

    production-us1    = "us-west1"    # Oregon, US
    production-us1-dr = "us-central1" # Iowa, US
  }
  region = local.region_by_cluster[var.cluster_id]
}

output "region" {
  description = "GCP region where the cluster is located"
  value       = local.region
}

Beyond simple lookups, a pure module can also contain logic to generate the information it provides. For example, the naming convention for our clusters changed as their number grew. The cluster_info module's implementation reflects that:

locals {
  # Our naming conventions changed when Pigment became multi-regional.
  # Newer clusters use their ID as their name.
  # Older clusters' names are tracked here.
  default_name = var.cluster_id
  name_override_by_cluster = {
    staging-eu1    = "staging"
    production-eu1 = "production"
  }
  # Fall back to the cluster ID when no override exists.
  name = lookup(local.name_override_by_cluster, var.cluster_id, local.default_name)
}

output "name" {
  description = "Name of the cluster"
  value       = local.name
}

We find that grouping information about our infrastructure in this way fits well with how we use Terraform.

Key benefits of pure modules

Pure modules have no effect on the provisioned infrastructure. Their sole impact is on how we write and structure our Terraform code. Because they manage no resources and call no providers, they are pretty much free to call. Any part of our codebase can call pure modules without worrying about side effects. There is no need to pass information around as variables or duplicate it, since we can look up the information we need wherever we need it.

Using pure modules reduces the need for reading remote states. Using terraform_remote_state data sources is a common pattern in Terraform, but it comes at a cost. It creates a strong coupling between different parts of our codebase. Pure modules allow us to avoid this coupling when possible. Any static information about our infrastructure can be stored in a pure module. We only need to read remote states when we need dynamic information, like the ID of a resource that was created in a separate part of our infrastructure. This makes our codebase much more modular and reduces the need for complex data flows.
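
To make the contrast concrete, here is a sketch of the two approaches side by side. The backend settings and the `subnet_id` output are hypothetical and only illustrate the shape of a remote state lookup; they are not taken from the codebase.

```hcl
# Dynamic information: the ID of a resource created by another configuration.
# Reading it couples this code to that configuration's state and outputs.
data "terraform_remote_state" "network" {
  backend = "gcs"

  config = {
    bucket = "example-terraform-state" # hypothetical bucket name
    prefix = "network"                 # hypothetical state prefix
  }
}

# Static information: known ahead of time, so a pure module is enough.
module "info" {
  source = "../cluster_info"

  cluster_id = "production-eu1"
}

locals {
  # Only exists once the other configuration has been applied.
  subnet_id = data.terraform_remote_state.network.outputs.subnet_id # hypothetical output

  # Available immediately, with no backend access and no ordering constraints.
  region = module.info.region
}
```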

Pure modules are also easy to extend as our infrastructure grows. Adding new information to them does not affect the rest of the codebase. For instance, adding a new cluster to our infrastructure is as simple as adding a new entry to the relevant maps in our pure modules.
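
For example, registering a new cluster's region could look like the sketch below. The `production-ap1` cluster is hypothetical and exists only for illustration.

```hcl
locals {
  # The GCP region where each cluster is located.
  region_by_cluster = {
    # ...existing entries unchanged...

    # Hypothetical new cluster, added for illustration only.
    production-ap1 = "asia-northeast1" # Tokyo, JP
  }
}
```

With the naming logic shown earlier, such a cluster would need no entry in the name overrides, since newer clusters use their ID as their name.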

Pure modules can also implement simple logic to handle edge cases. We found that this allows us to compartmentalize the complexity of our infrastructure and keep our codebase clean. For instance, modules can look up a cluster's name without worrying about how our naming conventions have changed over time.
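
Because this logic has no side effects, it is also cheap to check. The sketch below assumes Terraform 1.6 or later, which ships the native `terraform test` command, and assumes the file lives in a `tests/` directory inside the `cluster_info` module. The file name and the test cases are illustrative, not something the codebase is known to contain.

```hcl
# tests/names.tftest.hcl (hypothetical file)

# An older cluster keeps the name it had before the convention changed.
run "older_cluster_keeps_legacy_name" {
  command = plan

  variables {
    cluster_id = "production-eu1"
  }

  assert {
    condition     = output.name == "production"
    error_message = "production-eu1 should keep its legacy name."
  }
}

# A newer cluster uses its ID as its name.
run "newer_cluster_uses_id_as_name" {
  command = plan

  variables {
    cluster_id = "production-us1"
  }

  assert {
    condition     = output.name == "production-us1"
    error_message = "Newer clusters should use their ID as their name."
  }
}
```

Since the module declares no resources or providers, a `plan` is enough to evaluate every output, and the tests need no credentials.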

Additionally, pure modules centralize core information about our infrastructure. Rather than looking it up in multiple places, we can find it all in one place. This makes our infrastructure easier to reason about and lets our code act as its inventory. A good example is network IP ranges. A single file in our codebase, cluster_network_info.tf, lists all the IP ranges for our clusters. This makes it easy to understand our network layout and to avoid IP conflicts:

locals {
  # The IP range used by k8s pods, services, and nodes for a given cluster.
  k8s_cidr_by_cluster = {
    argocd = "10.201.0.0/16"

    staging-eu1    = "10.101.0.0/16"
    staging-eu1-dr = "10.111.0.0/16"

    staging-us1    = "10.102.0.0/16"
    staging-us1-dr = "10.112.0.0/16"

    staging-vault-eu1 = "10.103.0.0/16"
    staging-vault-eu2 = "10.104.0.0/16"

    production-eu1    = "10.1.0.0/16"
    production-eu1-dr = "10.11.0.0/16"

    production-us1    = "10.2.0.0/16"
    production-us1-dr = "10.12.0.0/16"
  }
  k8s_cidr = local.k8s_cidr_by_cluster[var.cluster_id]
}

Finally, pure modules are fully compatible with vanilla Terraform. There is no need for any special tooling or plugins, no need to generate code or orchestrate Terraform commands, and no need to ramp up on more technology. Pure modules are just Terraform modules that provide information about our infrastructure.

Takeaways

Terraform HCL, as a declarative language, can benefit from patterns borrowed from functional programming. Using tried-and-true software design patterns is essential to keeping a Terraform codebase modular and easy to understand, which we need to scale our infrastructure without scaling our technical debt.

Pure modules are a useful way to structure Terraform code. They provide a way to keep complexity contained and to centralize information about our resources.

In our experience, pure modules have helped us keep our Terraform codebase clean, maintainable, and scalable. We hope that you find them useful in your Terraform projects as well.