Terraform Module UX

In this post you will learn the power of Terraform modules at scale across an organisation, the importance of user experience for Terraform modules, and heuristics for improving their UX. By the end you will see modules in a new light and be able to support your colleagues in building higher-quality modules in the future.

UX is more than skin deep

When you say the words User Experience to someone, their mind flits straight to visuals, but user experience, like love, is more than just skin deep. HashiCorp are one company that understand this better than most, as you can see Mitchell Hashimoto explain ever so eloquently:

User experience defines your demographic, and your success within the demographic

In my experience, when people write Terraform modules they become laser-focused on the now at the expense of the future.

The power of Terraform modules

People often talk of the power of Terraform modules through the lens of reuse: if I write this module, I can stop people having to write the code again. I can abstract away the complexity. This is a first-order advantage, but it is only the thin end of the wedge of what modules can help us achieve.

Second-order advantages

Developers are lightning

My favourite analogy for developers in this context is that they are lightning: they seek and follow the path of least resistance. To stretch the metaphor too far, Terraform modules are a lightning rod. In the quest for speed, if we make modules the path of least resistance, we can leverage that to achieve other objectives.

Secure by default

The most common objective to achieve through modules is a secure-by-default approach: we can abstract away the security-based complexities, baking in correct configurations. Ensuring that the principle of least privilege is applied at scale is generally incredibly difficult, but modules give you a fantastic platform for doing so.

Other common things to bake into modules from a security perspective are correct KMS configurations, blocking publicly accessible resources and setting up conformant VPCs that have correct routing by default.

With a healthy, heavily leveraged module ecosystem you enable options that are otherwise impossible.

Premature optimisation is the root of all evil

A common adage in programming circles is that “Premature optimisation is the root of all evil”, and abstractions are a form of optimisation. To look at this in more detail we need to understand two things: the difference between deep modules and wide modules, and the nature of public interfaces over time.

Deep vs Wide modules

In the book A Philosophy of Software Design, John Ousterhout contrasts deep modules with shallow ones; in this post I refer to the shallow kind as wide.

Deep abstractions hide complexity below the surface, wide abstractions expose more complexity at the surface

Figure: Deep vs Wide Module

There is a minimum width that a module must have to serve its purpose; the closer to that I can get, the better the abstraction. The less the consumer needs to know to use my module, the more likely they are to use it, and the more positive impact I can have at scale.
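
As a rough sketch of the difference from the consumer’s side, compare two calls to a bucket module. The module sources and every variable name here are hypothetical, invented purely for illustration:

```hcl
# Deep module (hypothetical): the consumer states intent and nothing else.
# Encryption, access controls and versioning are decided inside the module.
module "reports_bucket" {
  source = "./modules/s3-bucket" # hypothetical local path

  name = "reports"
}

# Wide module (hypothetical): the same outcome, but the consumer must know
# and correctly set every security-relevant detail themselves.
module "reports_bucket_wide" {
  source = "./modules/s3-bucket-wide" # hypothetical local path

  name                    = "reports"
  acl                     = "private"
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
  sse_algorithm           = "aws:kms"
  kms_key_arn             = "arn:aws:kms:eu-west-1:111122223333:key/example" # placeholder
  versioning_enabled      = true
}
```

Both calls could produce the same bucket, but only the first can be written without reading the module’s source.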

Public interfaces over time

Public interfaces, as a general rule, grow over time. Think back to the last time you deprecated an API endpoint, or removed a public method from a class relied upon by another team. It was hard, wasn’t it?

Contrast this with adding a new API endpoint. That was relatively trivial, wasn’t it?

Removing items from a public interface is at least 100 times harder than adding them

So when’s the easiest time to set a minimally wide interface?

When we first write the module

The impact of premature optimisation

Now we can see that, in the quest for deep modules, the initial interface we write has an outsized impact on our success. Once people are using the modules we write, it’s very hard to make backwards-incompatible changes, and in making them we erode the trust and the value that teams get from the modules. Interestingly enough, this is the same logic the AWS CloudFormation team often use for not supporting resources on the day of release: once they set out the resource definition, they’re stuck with it. It’s a one-way decision, and needs to be taken with a lot of care.

The public interface of modules

In other system domains the public interface is very obvious: do we have a publicly accessible API? What methods are exposed when someone imports our library? For a Terraform module, the public interface is the variables. Which leads us to our heuristics of module design:
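
To make that concrete, here is a minimal sketch of what the whole public interface of a small bucket module might look like. The file name follows common convention, and the variable names and validation rule are assumptions for illustration, not taken from any real module:

```hcl
# variables.tf (hypothetical module): everything declared here is a promise
# to consumers, so each variable should earn its place.

variable "name" {
  type        = string
  description = "Short, human-readable name for the bucket, e.g. \"reports\"."

  validation {
    # Reject bad input with a clear message before anything is created.
    condition     = can(regex("^[a-z0-9-]{3,40}$", var.name))
    error_message = "The name must be 3-40 characters of lowercase letters, digits or hyphens."
  }
}

variable "tags" {
  type        = map(string)
  description = "Extra tags to add to the bucket."
  default     = {}
}
```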

Heuristics of module design

Less variables is better

If you were presented with a module that had 400 variables, would you know where to start? How long would it take you to understand how to achieve your goal? Would you ever feel completely confident that you had everything configured correctly?

Looking at some numbers, it can get quite scary.

If you have 5 binary variables, i.e. variables that each have only two valid options

There are 2^5 = 32 possible combinations of those variables

If you have 5 variables that each have 5 valid options

There are 5^5 = 3,125 possible combinations!

Build only for today’s requirements

Now that we understand the exponential impact of adding variables, it is critical that we develop only for today’s use cases. Often I see people implementing options that no one has yet asked for. The two key reasons for avoiding this are:

  1. Additional variables increase the cognitive load on the consumer.
  2. We will know more tomorrow than we do today, the central tenet of agile.

Agile as the counter to arrogance

For me, agile is all about becoming comfortable with uncertainty, and choosing humility over arrogance. Now that we understand that removing variables is much harder than adding them, adding unneeded options is just inserting pain into the future.

Challenge new default variables

Default variables can be a wonderful thing: they allow you to open up new options in a module whilst easily preserving the existing functionality. However, they are worth examining to see if they are truly required or are a future requirement creeping into today’s module. Is there a way you can hide that complexity from your consumer?
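
For example, a new option can be introduced with a default that reproduces today’s behaviour, so existing callers need no changes. The variable below is hypothetical, and the sketch assumes the module currently enables versioning unconditionally:

```hcl
# Hypothetical new variable. Defaulting to true keeps the current behaviour
# for every existing consumer; only callers with a real need set it to false.
variable "versioning_enabled" {
  type        = bool
  description = "Whether object versioning is enabled on the bucket."
  default     = true
}
```

Before merging something like this, it is worth asking whether any consumer has actually asked to turn versioning off. If not, the variable can wait.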

Embed sensible defaults

Let’s take an S3 bucket module. As a consumer, what I want is the bucket, so I can deliver my feature in a compliant, secure way. What I don’t want is to have to understand the minutiae of what makes the bucket secure and compliant. Modules are the vector by which we can embed systemic changes across our estate.

It is not realistic to expect all engineers to be able to write perfect KMS policies. If we can bake such things into the module, that is how secure by default can be achieved.
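
A minimal sketch of what baking those defaults in might look like inside such a module, assuming the AWS provider at version 4 or later (where these settings are separate resources). The resource labels, the single variable and the settings chosen are illustrative assumptions, not a complete compliance baseline:

```hcl
variable "name" {
  type        = string
  description = "Name of the bucket."
}

# Module-managed key, so consumers never write KMS configuration themselves.
resource "aws_kms_key" "this" {
  description         = "Encryption key for the ${var.name} bucket"
  enable_key_rotation = true
}

resource "aws_s3_bucket" "this" {
  bucket = var.name
}

# Block every form of public access; there is no variable to opt out.
resource "aws_s3_bucket_public_access_block" "this" {
  bucket = aws_s3_bucket.this.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Encrypt all objects with the module-managed KMS key by default.
resource "aws_s3_bucket_server_side_encryption_configuration" "this" {
  bucket = aws_s3_bucket.this.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.this.arn
    }
  }
}
```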

Documentation is king

Tools such as terraform-docs can automatically generate reference documentation for your modules. Great documentation is a key driver of user experience; nothing turns engineers off quicker than subpar documentation. It can be hard for the person making the change to write documentation at the right level, so this is where pairing and PR reviews are worth their weight in gold.

Version modules as you would anything else

Especially since the pandemic hit, the shift to remote work has stretched communication pathways. This has led to a focus on how to communicate effectively asynchronously, and module versioning along with documentation is key to this.

The most common versioning strategy in the industry currently is semver, so we’ll use that as our example.

In semver you have three numbers: major.minor.patch

Major indicates a breaking change, and sets an expectation that a significant amount of effort may be needed to migrate.

Minor indicates that new functionality has been added in a backwards-compatible way; migrating should not have any impact on existing usage.

Patch indicates a backwards-compatible fix with no change to the interface, typically a bug fix; as with minor, migrating should be simple.

By embedding this versioning into your modules you can explicitly communicate and set expectations with your consumers so they understand the scope and impact of moving versions.

As a side benefit, this also allows you to observe where people are still using old versions of modules, building a critical feedback loop to understand why they have stalled there.
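
On the consumer side, the versioning described above might look like the following sketch. It assumes the module is published to a module registry, since the `version` argument only applies to registry sources; the registry address, version numbers and `name` variable are placeholders:

```hcl
module "reports_bucket" {
  # Hypothetical private registry address: <hostname>/<namespace>/<name>/<provider>
  source = "app.terraform.io/example-org/s3-bucket/aws"

  # Accept any 2.x release from 2.1 upwards (new features and fixes),
  # but never move to 3.0 without a deliberate change.
  version = "~> 2.1"

  name = "reports"
}
```

A constraint like this lets minor and patch releases flow to consumers on their next `terraform init -upgrade`, while a major release always requires an explicit edit.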

Thoroughly test your modules

There are many options for testing module code; a powerful combination is Terratest and terraform-compliance.

Just as you shouldn’t accept a pull request for code that doesn’t have tests, you shouldn’t accept module pull requests without appropriate tests. Terratest allows you to quickly instantiate modules and run post-apply validations. Going back to the variable permutation explosion from “Less variables is better”, you should have a test that confirms that the changes being made enable the wanted outcome.

terraform-compliance allows you to define your compliance requirements in plain English and have them operate as an executable specification against your infrastructure. When developing modules it is critical to confirm that the module can be used to produce compliant infrastructure, and that this continues to be the case as new controls are developed and added to the suite.

Conclusion

As we have now seen, user experience is not something you can neglect when authoring Terraform modules. Without it you cannot hope to extract even a tenth of the possible value. A mature internal module ecosystem is a foundation that enables systemic change that is otherwise impossible.

We’ve learnt that with the great power of abstraction comes great responsibility.

We’ve explored some heuristics and techniques for improving the UX of all modules:

  • Less variables is better
  • Build only for today’s requirements
  • Challenge new default variables
  • Embed sensible defaults
  • Documentation is king
  • Version modules as you would anything else
  • Thoroughly test your modules

Now, using these heuristics, you can have better conversations, build better modules and enable systemic change.