DIY Stacksets for Terraform
What you’ll learn by the end
By the end of this blog you will understand how to build stackset-esque functionality for your Terraform code, and how this makes deploying common infrastructure across a vast AWS estate significantly simpler.
Why we want Terraform stacksets
Having been a heavy CloudFormation user for many years, and having seen the ecosystem of tooling that has built up around it, there was one piece I really wanted for Terraform: stacksets.
For those unfamiliar, CloudFormation stacksets allow you to provision infrastructure across all accounts in an organisation, or under a particular organisational unit. This allows you to build out account baselines, making sure certain resources are ubiquitously enabled.
Commonly these take the form of foundational IAM roles, AWS Config rules and logging buckets. CloudFormation stacksets allow you to define a single template and have it deployed into every account from one central location.
Terraform unfortunately does not have this inherent capability, but how hard is it to build our own?
Steps
In this example we’re going to deploy the world’s simplest IAM role to all accounts in our organization.
Prerequisites
- Install Terraform 0.14, instructions here
- You will need an assumable role in all accounts; by default that's the `OrganizationAccountAccessRole`
- Confirm that the role also exists in your payer account (the one with access to `organizations`), as it doesn't by default
- Assume a role in your payer account that has read access to `organizations` and write access to `s3`
- Create an S3 bucket for storing our state: `aws s3api create-bucket --bucket terraform-state-$(aws sts get-caller-identity | jq -r '.Account')`
- Ensure you have `pipenv` installed, instructions here
- Create a new folder and open it in your IDE of choice
Terraform Workspaces
As we’re using S3 as our remote state storage, we have access to Terraform workspaces. They were initially pitched as a way of logically separating production and non-production infrastructure, but in this case we’re going to create one per account to keep each account’s state distinct.
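Each non-default workspace gets its own state object in the bucket. As a rough sketch of where the state ends up (assuming the S3 backend's default `workspace_key_prefix` of `env:`; the function itself is illustrative, not part of the tooling we build below):

```python
def state_key(key: str, workspace: str, workspace_key_prefix: str = "env:") -> str:
    """Approximate the S3 object key the S3 backend uses for a workspace's state."""
    if workspace == "default":
        # The default workspace uses the configured key directly.
        return key
    # Other workspaces are nested under the workspace prefix.
    return f"{workspace_key_prefix}/{workspace}/{key}"

print(state_key("terraform", "123456789012"))  # env:/123456789012/terraform
```

So with one workspace per account id, every account's state sits at its own key and never collides with another account's.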
Setting up Terraform
First we need to set up our Terraform so we can deploy the role to all accounts.
Create a file called `main.tf` and paste the below content in.
terraform {
  backend "s3" {
    bucket = "terraform-state-<payer-account-id>"
    key    = "terraform"
    region = "eu-west-2"
  }

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.27"
    }
  }
}

provider "aws" {
  region = "eu-west-2"
}

provider "aws" {
  alias  = "target"
  region = "eu-west-2"

  assume_role {
    role_arn = "arn:aws:iam::${var.target_account_id}:role/OrganizationAccountAccessRole"
  }
}

resource "aws_iam_role" "cross_account_role" {
  provider = aws.target
  name     = "stackset_cross_account_role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Sid    = ""
        Principal = {
          AWS = "<payer-account-id>"
        }
      },
    ]
  })
}

variable "target_account_id" {
  type = string
}
Make sure to replace the two `<payer-account-id>` placeholders with your payer account id, matching the bucket name you created earlier.
Now run `terraform init` and everything should come back clean.
Initialising the python environment
Run `pipenv --three` to create a Python virtualenv to run our code in.
Run `pipenv install boto3` so we have access to the excellent boto3 library for invoking AWS.
Create a file called `stacksets.py` and copy the below content into it:
import subprocess

import boto3


def init():
    subprocess.run("terraform init", check=True, shell=True)


def get_accounts():
    organizations = boto3.client("organizations")
    paginator = organizations.get_paginator("list_accounts")
    return [
        account["Id"]
        for page in paginator.paginate()
        for account in page["Accounts"]
    ]


def workspace_exists(account):
    completed_process = subprocess.run(f"terraform workspace list | grep {account}", shell=True)
    return completed_process.returncode == 0


def create_workspace(account):
    subprocess.run(f"terraform workspace new {account}", check=True, shell=True)


def switch_to_workspace(account):
    subprocess.run(f"terraform workspace select {account}", check=True, shell=True)


def plan(account):
    subprocess.run(f"terraform plan -var target_account_id={account}", check=True, shell=True)


def apply(account):
    subprocess.run(f"terraform apply -var target_account_id={account} -auto-approve", check=True, shell=True)


def run():
    init()
    for account in get_accounts():
        if not workspace_exists(account):
            create_workspace(account)
        switch_to_workspace(account)
        plan(account)
        apply(account)


if __name__ == "__main__":
    run()
Stepping through the python
Let’s quickly break down what the Python code is doing:
- We initialise terraform to ensure our environment is set up to work
- We query AWS for a list of all accounts in the organization
- We check whether a Terraform workspace exists for that account
- If not, we create one
- We select the appropriate workspace for the account
- We plan the changes against the account
- We apply the changes to the account
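One note on the workspace check: it shells out to `grep` once per account. If you'd rather do the matching in Python, a hypothetical alternative (the function name and sample output below are illustrative, not from the script above) could parse the output of `terraform workspace list` instead:

```python
def parse_workspaces(output: str) -> set[str]:
    # `terraform workspace list` prefixes the selected workspace with "* ",
    # so strip that marker and surrounding whitespace from each line.
    return {line.lstrip("* ").strip() for line in output.splitlines() if line.strip()}

# Hypothetical output from `terraform workspace list`:
sample = "  default\n* 123456789012\n  210987654321\n"
print("123456789012" in parse_workspaces(sample))  # True
```

This also avoids a subtle pitfall of the `grep` approach: a plain substring match could hit a different workspace whose name merely contains the account id.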
Apply our changes
If we now run `pipenv run python stacksets.py`, the code will iterate through every account and deploy our role.
Confirm we can assume the role
Let’s test our new role by picking an account id and invoking `aws sts assume-role --role-arn arn:aws:iam::<your-account-id>:role/stackset_cross_account_role --role-session-name terraform-stacksets`
You should be sent back a set of temporary credentials, and if so we have succeeded!
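To spot-check several accounts, a small helper can build that same command for each account id (the function is hypothetical, not part of the original script; only the role name and session name come from the steps above):

```python
def assume_role_command(account_id: str, session_name: str = "terraform-stacksets") -> str:
    # Build the aws sts assume-role invocation for one target account.
    role_arn = f"arn:aws:iam::{account_id}:role/stackset_cross_account_role"
    return (
        f"aws sts assume-role --role-arn {role_arn} "
        f"--role-session-name {session_name}"
    )

print(assume_role_command("123456789012"))
```

You could feed the output of `get_accounts()` from the script through this to verify the role in every account, not just one.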
Where are we now
So now we have a skeleton that lets us rapidly deploy resources across all our AWS accounts while staying purely in Terraform.
Hopefully, one of these days Terraform Cloud might provide us with a fully managed version of this so we don’t have to maintain and extend this code ourselves. Until then, I think this will do nicely.