Scalable DNS with EventBridge

The Problem

DNS, the source of all network problems, can be frustrating to implement at scale. Helpfully, AWS have released a guide on doing multi-account DNS. As with most things multi-account, automating the process is more involved than we would like. Nevertheless we persevere, and in this case show how AWS EventBridge can be the glue that sticks infrastructure together at scale.

The Outcome

By the end of this you should have:

  • One central DNS VPC account
  • Two child accounts that can resolve each other’s Private Hosted Zones
  • A CloudFormation template and pattern to enroll new accounts into the DNS web

Prerequisites

The Core DNS Account

Setting up the DNS VPC

As you can see in the high level architecture diagram in the AWS documentation:

[Diagram: High Level Architecture]

We need a centralised VPC to act as our DNS hub. To set that up:

  1. Assume a role in your designated central DNS account
  2. Update the dns-parameters.json file with your Organization ARN.
  3. Run: aws cloudformation create-stack --stack-name DNSVPC --template-body file://dns-vpc.yaml --parameters file://dns-parameters.json

Now let’s quickly look at what we deployed:

  1. We have the world’s simplest VPC, nothing interesting there
  2. We have an outbound endpoint, which we need for the resolver rule
  3. We have an inbound endpoint, whose IP addresses we set manually because CloudFormation doesn’t return them as attributes
  4. We have the parent private hosted zone that we’re going to subdomain off for the child accounts
  5. We have a resolver rule, the magic sauce, which forwards all DNS queries for our hosted zone’s domain to this VPC via the outbound endpoint
  6. We have a share via AWS RAM that shares said resolver rule with your organization
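
The template creates the resolver rule declaratively, but it can help to see what the rule boils down to. Below is a minimal Python sketch of the equivalent Route 53 Resolver API request. The domain cloud.private is inferred from the test record used later on, and the endpoint ID, IP addresses and rule name are placeholders, not values from the templates.

```python
import uuid


def forward_rule_request(domain, outbound_endpoint_id, inbound_ips,
                         name="central-dns-forward"):
    """Build keyword arguments for route53resolver.create_resolver_rule.

    The rule forwards queries for `domain` through the outbound endpoint
    to the IP addresses of the inbound endpoint in the DNS VPC.
    """
    return {
        "CreatorRequestId": str(uuid.uuid4()),  # idempotency token
        "Name": name,                           # hypothetical rule name
        "RuleType": "FORWARD",
        "DomainName": domain,
        "ResolverEndpointId": outbound_endpoint_id,
        "TargetIps": [{"Ip": ip, "Port": 53} for ip in inbound_ips],
    }


# Placeholder values: substitute the IDs and IPs from your own DNS VPC.
request = forward_rule_request(
    "cloud.private",
    "rslvr-out-EXAMPLE",
    ["10.0.0.10", "10.0.0.11"],
)
```

With credentials for the DNS account, the request could then be sent with boto3.client("route53resolver").create_resolver_rule(**request). Having to list the inbound IPs explicitly here is why the template pins them manually.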

Adding EventBridge

The next step is configuring EventBridge on the DNS account so we can accept events from the child accounts:

  1. Update the master-parameters.json file with your Organization Id
  2. Run aws cloudformation create-stack --stack-name EventBus --template-body file://event-bus-master.yaml --parameters file://master-parameters.json --capabilities CAPABILITY_IAM

Let’s quickly look at what we have deployed now:

  1. We have an EventBridge policy that allows all accounts in our Organization to push events into the account
  2. We have a Lambda function to associate new Private Hosted Zones with the DNS VPC
  3. We have a rule that triggers the Lambda function based on the event source
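
To make the rule and function less abstract, here is a rough Python sketch of what such a pairing might look like. It is not the code from event-bus-master.yaml: the event source, detail type, detail field name and the DNS_VPC_ID environment variable are all hypothetical. The Route 53 client is passed in so the core logic can be exercised without AWS access.

```python
import os

# Hypothetical event pattern for the EventBridge rule: match only our
# custom events. The source and detail-type names are assumptions.
EVENT_PATTERN = {
    "source": ["custom.dns"],
    "detail-type": ["PrivateHostedZoneCreated"],
}


def associate_hosted_zone(event, route53, vpc_id, vpc_region):
    """Associate the hosted zone named in the event with the DNS VPC."""
    zone_id = event["detail"]["HostedZoneId"]  # hypothetical field name
    route53.associate_vpc_with_hosted_zone(
        HostedZoneId=zone_id,
        VPC={"VPCRegion": vpc_region, "VPCId": vpc_id},
    )
    return zone_id


def lambda_handler(event, context):
    # boto3 ships with the Lambda Python runtime; it is imported here so
    # the helper above can be tested without it installed.
    import boto3

    return associate_hosted_zone(
        event,
        boto3.client("route53"),
        os.environ["DNS_VPC_ID"],   # hypothetical variable set by the template
        os.environ["AWS_REGION"],   # set by the Lambda runtime
    )
```

Because the hosted zone lives in another account, the association only succeeds once the zone’s owner has authorized it (Route 53’s create_vpc_association_authorization), which is the child account’s side of the handshake.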

The First Child Account

Before we start provisioning resources in the child accounts we need to get a few details from the master account.

  1. Run aws cloudformation describe-stacks --stack-name DNSVPC
  2. Grab the outputs for the ResolverRuleId and DNSVPCId and copy them into client-parameters-1.json and client-parameters-2.json
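
Copying IDs by hand gets old quickly, so here is a small Python sketch that pulls the two outputs from a describe-stacks response and merges them into a parameter list. It assumes the client parameter files use the usual CLI list format and that the parameter keys match the output names. The sample IDs and the Subdomain parameter are made up.

```python
import json


def stack_outputs(describe_stacks_response):
    """Map OutputKey -> OutputValue for the first stack in the response."""
    stack = describe_stacks_response["Stacks"][0]
    return {o["OutputKey"]: o["OutputValue"] for o in stack.get("Outputs", [])}


def merge_parameters(parameters, outputs, keys=("ResolverRuleId", "DNSVPCId")):
    """Return `parameters` with the chosen stack outputs added or replaced."""
    wanted = {key: outputs[key] for key in keys}
    merged = [p for p in parameters if p["ParameterKey"] not in wanted]
    merged += [{"ParameterKey": k, "ParameterValue": v} for k, v in wanted.items()]
    return merged


# Made-up response, shaped like `aws cloudformation describe-stacks` output.
sample_response = {
    "Stacks": [{
        "StackName": "DNSVPC",
        "Outputs": [
            {"OutputKey": "ResolverRuleId", "OutputValue": "rslvr-rr-EXAMPLE"},
            {"OutputKey": "DNSVPCId", "OutputValue": "vpc-0123456789abcdef0"},
        ],
    }]
}
existing = [{"ParameterKey": "Subdomain", "ParameterValue": "beta"}]  # hypothetical
merged = merge_parameters(existing, stack_outputs(sample_response))
print(json.dumps(merged, indent=2))
```

In practice you would load the real response from the CLI’s JSON output and write the merged list back to client-parameters-1.json and client-parameters-2.json.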

Now we’re ready to deploy into the child accounts:

  1. Assume a role in the child account
  2. Run aws cloudformation create-stack --stack-name EventBus --template-body file://event-bus-client.yaml --parameters file://client-parameters-1.json --capabilities CAPABILITY_IAM

And we have deployed:

  1. Another simple VPC with just enough configuration
  2. A private hosted zone subdomain
  3. A custom resource to associate the private hosted zone with the DNS VPC
  4. A custom resource to fire a custom event to the account default event bus
  5. An EventBridge rule to fire said event over to the DNS master account
  6. A record set for testing the inter-account DNS
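
The event plumbing in items 4 and 5 can be sketched in a few lines of Python. This illustrates the shape of the data rather than the code in event-bus-client.yaml: the source, detail type and detail field are hypothetical names, and the account ID and region are placeholders.

```python
import json


def zone_created_entry(hosted_zone_id, source="custom.dns",
                       detail_type="PrivateHostedZoneCreated"):
    """Build one entry for events.put_events announcing a new hosted zone."""
    return {
        "Source": source,            # hypothetical; must not start with "aws."
        "DetailType": detail_type,   # hypothetical detail type
        "Detail": json.dumps({"HostedZoneId": hosted_zone_id}),
        "EventBusName": "default",   # the child account's default bus
    }


def central_bus_arn(region, account_id, bus_name="default"):
    """ARN of the event bus in the DNS account, used as the rule target."""
    return f"arn:aws:events:{region}:{account_id}:event-bus/{bus_name}"


entry = zone_created_entry("Z0123456789EXAMPLE")
target = central_bus_arn("eu-west-1", "111122223333")
```

The custom resource would send the entry with boto3.client("events").put_events(Entries=[entry]), and the forwarding rule would list target as its target ARN so that matching events land on the DNS account’s bus.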

The Second Child Account

Now we can set up the other child:

  1. Assume a role in the child account
  2. Run aws cloudformation create-stack --stack-name EventBus --template-body file://event-bus-client.yaml --parameters file://client-parameters-2.json --capabilities CAPABILITY_IAM

And we have deployed the same resources as in the first child, but under a different subdomain and CIDR.

Testing What We’ve Built

The simplest test is to create an EC2 machine in either of the child accounts.

In the second account

  1. Set up an EC2 machine with a public IP and a known key pair.

  2. SSH onto the machine

  3. Run nslookup test.beta.cloud.private

    You should see:

    [ec2-user@ip-10-0-2-124 ~]$ nslookup test.beta.cloud.private
    Server:		10.0.2.2
    Address:	10.0.2.2#53
    
    Non-authoritative answer:
    Name:	test.beta.cloud.private
    Address: 10.0.1.10
    

!!SUCCESS!!
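
If you would rather script the check than read nslookup output, a minimal Python sketch along these lines could run on the same machine. The record name and address come from the output above. The resolve parameter is only there so the function can be tried out without real DNS.

```python
import socket


def check_record(name, expected_ip, resolve=socket.gethostbyname):
    """Return True if `name` resolves to `expected_ip`, False otherwise."""
    try:
        actual = resolve(name)
    except socket.gaierror:
        # The name did not resolve at all, e.g. the zone is not associated yet.
        return False
    return actual == expected_ip


if __name__ == "__main__":
    ok = check_record("test.beta.cloud.private", "10.0.1.10")
    print("resolved as expected" if ok else "resolution failed or mismatched")
```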

Reviewing What We Did

  1. We deployed a central DNS VPC
  2. We deployed Route 53 infrastructure to share across the organization
  3. We set up EventBridge to automatically enroll new Private Hosted Zones
  4. We configured private hosted zones in two child accounts
  5. We tested that we can now resolve hostnames between accounts

Next Steps

Now we are in a position where we can continue to enroll more accounts, VPCs and private hosted zones. However, the templates are already feeling somewhat unwieldy.

Refactoring

  • Making a private resource to act as an EventBridge event emitter
  • Breaking the Custom Resource Definitions out into separate, potentially nested, templates
  • Extracting out the EventBridge policy in the DNS account into a separate template
  • Potentially adopting the Serverless Framework to reduce the amount of code to maintain