June 15, 2017 Marie H.

Terraform Remote State in S3 and Writing Reusable Modules


Photo by Christina Morillo on Pexels

Local Terraform state is fine when you're the only person touching infrastructure. The moment a second person runs terraform apply, you have a problem. I've seen state files committed to git, overwritten, and straight-up lost. Remote state in S3 with DynamoDB locking solves the collaboration problem, and once you've got that sorted, modules let you stop copy-pasting the same EC2 + security group config across every project. Let's set both up.

Why Local State Is a Disaster on a Team

When state lives in terraform.tfstate on your laptop:

  • Someone else runs terraform apply without your latest state and clobbers your changes
  • Someone commits terraform.tfstate to git, including any secrets that ended up in it
  • You lose the file and now Terraform thinks all your infrastructure doesn't exist

Remote state stores the file in S3 (versioned, so recoverable), and DynamoDB provides a lock so two people can't run terraform apply simultaneously.

Setting Up the S3 Backend

First, create the S3 bucket and DynamoDB table. I do this by hand the first time, since Terraform can't cleanly manage its own backend: the state describing these resources would need somewhere to live before the backend exists. Bucket names are globally unique, so substitute your own:

aws s3api create-bucket \
  --bucket my-terraform-state \
  --region us-east-1

aws s3api put-bucket-versioning \
  --bucket my-terraform-state \
  --versioning-configuration Status=Enabled

aws dynamodb create-table \
  --table-name terraform-locks \
  --attribute-definitions AttributeName=LockID,AttributeType=S \
  --key-schema AttributeName=LockID,KeyType=HASH \
  --provisioned-throughput ReadCapacityUnits=1,WriteCapacityUnits=1
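If you'd rather keep even the bootstrap resources in code, another option is to define them in a throwaway Terraform config, apply it once with local state, and only then configure the backend. A rough sketch (resource names are my own; attributes match the era's AWS provider):

```hcl
# Applied once with local state, before the S3 backend is configured.
resource "aws_s3_bucket" "tf_state" {
  bucket = "my-terraform-state"

  versioning {
    enabled = true
  }
}

resource "aws_dynamodb_table" "tf_locks" {
  name           = "terraform-locks"
  hash_key       = "LockID"
  read_capacity  = 1
  write_capacity = 1

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

Either way works; the CLI route just avoids having a second state file lying around for two resources that will essentially never change.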

Then in your Terraform config (this is 0.9/0.10 syntax):

# main.tf
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "myapp/prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}

Run terraform init and it migrates any existing local state to S3 automatically. From this point on, terraform apply acquires a lock in DynamoDB, does its thing, writes state to S3, and releases the lock. If someone else is running apply at the same time, they get an error instead of a corrupted state file.
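A bonus of remote state: other Terraform configurations can read this one's outputs via the terraform_remote_state data source. A sketch in the same 0.9/0.10 syntax (the network/terraform.tfstate key and vpc_id output are hypothetical examples, standing in for whatever the other config actually exports):

```hcl
# Read outputs from another configuration's state in the same bucket.
data "terraform_remote_state" "network" {
  backend = "s3"

  config {
    bucket = "my-terraform-state"
    key    = "network/terraform.tfstate"
    region = "us-east-1"
  }
}

# In 0.9/0.10, that state's outputs are available directly as attributes,
# e.g. "${data.terraform_remote_state.network.vpc_id}"
```

This is how you split infrastructure into separate state files (network vs. app, say) without hardcoding IDs between them.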

Writing a Reusable Module

I was copy-pasting the same EC2 instance + security group configuration into every project until I got tired of fixing the same typo in four places. Modules fix this.

Here's the directory structure I use:

my-infra/
├── main.tf
├── variables.tf
└── modules/
    └── ec2/
        ├── main.tf
        ├── variables.tf
        └── outputs.tf

modules/ec2/variables.tf — inputs the module accepts:

variable "instance_type" {
  description = "EC2 instance type"
  default     = "t2.micro"
}

variable "ami_id" {
  description = "AMI to launch"
}

variable "name" {
  description = "Name tag for the instance and security group"
}

variable "allowed_cidr" {
  description = "CIDR block allowed SSH access (default is wide open; override it in real use)"
  default     = "0.0.0.0/0"
}

modules/ec2/main.tf — the actual resources:

resource "aws_security_group" "this" {
  name        = "${var.name}-sg"
  description = "Security group for ${var.name}"

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["${var.allowed_cidr}"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_instance" "this" {
  ami                    = "${var.ami_id}"
  instance_type          = "${var.instance_type}"
  vpc_security_group_ids = ["${aws_security_group.this.id}"]

  tags {
    Name = "${var.name}"
  }
}

modules/ec2/outputs.tf — values the caller can reference:

output "instance_id" {
  value = "${aws_instance.this.id}"
}

output "public_ip" {
  value = "${aws_instance.this.public_ip}"
}

output "security_group_id" {
  value = "${aws_security_group.this.id}"
}

Calling the Module from Root Config

Back in the root main.tf, using the module is clean:

terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "myapp/prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}

provider "aws" {
  region = "us-east-1"
}

module "web_server" {
  source        = "./modules/ec2"
  name          = "web-prod"
  ami_id        = "ami-0c55b159cbfafe1f0"
  instance_type = "t2.small"
  allowed_cidr  = "10.0.0.0/8"
}

module "bastion" {
  source   = "./modules/ec2"
  name     = "bastion-prod"
  ami_id   = "ami-0c55b159cbfafe1f0"
}

output "web_server_ip" {
  value = "${module.web_server.public_ip}"
}

Run terraform get to pull in local modules, then terraform plan as usual. Each module instance manages its own resources independently, and the outputs bubble up so you can reference them or pass them to other modules.
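For instance, since the module exports security_group_id, the root config can wire the two instances together without knowing anything about the module's internals. A hedged sketch (the rule itself is illustrative, not something from the configs above):

```hcl
# Allow SSH into the web server only from the bastion, using module outputs.
resource "aws_security_group_rule" "bastion_to_web" {
  type                     = "ingress"
  from_port                = 22
  to_port                  = 22
  protocol                 = "tcp"
  security_group_id        = "${module.web_server.security_group_id}"
  source_security_group_id = "${module.bastion.security_group_id}"
}
```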

Wrapping Up

Remote state with S3 + DynamoDB locking should be the first thing you configure on any team Terraform project — before you've written a single resource. And if you find yourself pasting the same resource blocks more than twice, that's your cue to pull it into a module. Both changes are low effort and will save you real pain down the road.