Things I wish I knew before working with Terraform

I thank my current employer, Endava, for providing me with the time and tools needed to create this blog post.

Introduction

For an introduction to the Terraform basics, the excellent Terraform documentation is the place to start:

Terraform

This post covers lessons learned, the things I wish I had known before working with Terraform, which may save you some time before you start using it on a project.

Always use Terraform

Terraform is an Infrastructure as Code (IaC) tool: it lets you define infrastructure on the different supported platforms. Managing the same infrastructure with Terraform and also changing it with other tools (web consoles, CLI tools, or SDKs) creates inconsistencies and undermines both the stability of the infrastructure and your confidence in it.

Terraform will try to restore the state defined in its configuration, so manual changes may be reverted on the next apply. Manual changes are also missing from your VCS, so if a redeployment is needed those changes will be lost.

Exceptions can be necessary for specific needs, such as security restrictions (key pairs) or debugging a specific issue (security group rules). Even then, these changes should be limited to controlled components that you keep track of.

Modules for everything

How do you avoid having to copy and paste the code for the same app deployed in multiple environments, such as stage/services/frontend-app and prod/services/frontend-app?

Modules in Terraform let you reuse predefined structures of resources. This decreases the snowflake effect and provides a great way to reuse infrastructure that is already coded.

Modules take variables as inputs, live in a separate place (a different folder, or even a different repository), use resources from a provider, and can define multiple resources:

# my_module.tf:

resource "aws_launch_configuration" "launch_configuration" {
  name = "${var.environment}-launch-configuration-instance"
  image_id = "ami-04681a1dbd79675a5"
  instance_type = "t3.micro"
}

resource "aws_autoscaling_group" "autoscaling_group" {
  launch_configuration = "${aws_launch_configuration.launch_configuration.id}"
  availability_zones = ["us-east-1a"]
  min_size = "${var.min_size}"
  max_size = "${var.max_size}"
}
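
The module above references var.environment, var.min_size and var.max_size, so it also needs matching variable declarations. A minimal sketch, assuming they live in a variables.tf file next to my_module.tf (the file name and the descriptions are illustrative, not part of the original module):

# modules/my_module/variables.tf (hypothetical file name)

variable "environment" {
  description = "Prefix used in resource names, for example dev or prod"
}

variable "min_size" {
  description = "Minimum number of instances in the Auto Scaling Group"
}

variable "max_size" {
  description = "Maximum number of instances in the Auto Scaling Group"
}

None of the variables has a default here, so every caller of the module has to set all three explicitly.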

Modules are called with a module block in a Terraform configuration file, and their variables are set according to the requirements. In this example we call the module twice, with different values for different environments.

The first call uses my_module to create an Auto Scaling Group with a minimum of 1 and a maximum of 2 instances, together with a Launch Configuration whose name gets a specific prefix, in this case dev:

# my_dev.tf

module "development_frontend" {
  source = "./modules/my_module"
  min_size = 1
  max_size = 2
  environment = "dev"
}

Then we can reuse the module for our production environment: we call the same module my_module and create the ASG with a minimum of 2 and a maximum of 4 instances, along with a Launch Configuration that carries the prod prefix.

# my_prod.tf

module "production_frontend" {
  source = "./modules/my_module"
  min_size = 2
  max_size = 4
  environment = "prod"
}

It is possible, and recommended, to define and use different versions of a specific module. Keeping the modules in a version control system makes this straightforward.

If we store our modules in a VCS such as git, we can use tag or branch names to call a specific version with the ?ref= option:

module "my-db-module" {
  source               = "git::ssh://git@mygitserver.com/my-modules.git//modules/my_module?ref=feature-branch-001"
  allocated_storage    = "200"
  instance_class       = "db.t2.micro"
  engine               = "postgres"
}

Terraform States

The Terraform state file is important because it stores the current state of our infrastructure as Terraform knows it. It is a JSON file generated automatically when you run terraform apply. With the default local backend it is written as terraform.tfstate in the working directory; when a remote backend is configured, the state lives in that backend and the hidden .terraform folder (.terraform/terraform.tfstate) only keeps the backend configuration. Editing the state file directly is not recommended.

This is an example of a terraform.tfstate file:

{
    "version": 3,
    "terraform_version": "0.11.8",
    "serial": 4,
    "lineage": "35a9fcf6-c658-3697-9d74-480408535ce6",
    "modules": [
        {
            "path": [
                "root"
            ],
            "outputs": {},
            "resources": {
                "aws_s3_bucket.b": {
                    "type": "aws_s3_bucket",
                    "depends_on": [],
                    "primary": {
                        "id": "my-terraform-myaws",
                        "attributes": {
                            "acceleration_status": "",
                            "acl": "private",
                            "arn": "arn:aws:s3:::my-terraform-myaws",
                            "bucket": "my-terraform-myaws",
                            "bucket_domain_name": "my-terraform-myaws.s3.amazonaws.com",
                            "bucket_regional_domain_name": "my-terraform-myaws.s3.amazonaws.com",
                            "cors_rule.#": "0",
                            "force_destroy": "false",
                            "hosted_zone_id": "Z3AQBSTGFYJSTF",
                            "id": "my-terraform-myaws",
                            "lifecycle_rule.#": "0",
                            "logging.#": "0",
                            "region": "us-east-1",
                            "replication_configuration.#": "0",
                            "request_payer": "BucketOwner",
                            "server_side_encryption_configuration.#": "0",
                            "tags.%": "2",
                            "tags.Environment": "Dev",
                            "tags.Name": "Myterraformbucket",
                            "versioning.#": "1",
                            "versioning.0.enabled": "false",
                            "versioning.0.mfa_delete": "false",
                            "website.#": "0"
                        },
                        "meta": {},
                        "tainted": false
                    },
                    "deposed": [],
                    "provider": "provider.aws"
                },
                "aws_s3_bucket_object.b_object": {
                    "type": "aws_s3_bucket_object",
                    "depends_on": [
                        "aws_s3_bucket.b"
                    ],
                    "primary": {
                        "id": "terraform/terraform.tfstate",
                        "attributes": {
                            "acl": "private",
                            "bucket": "my-terraform-myaws",
                            "cache_control": "",
                            "content_disposition": "",
                            "content_encoding": "",
                            "content_language": "",
                            "content_type": "binary/octet-stream",
                            "etag": "d41d8cd98f00b204e9800998ecf8427e",
                            "id": "terraform/terraform.tfstate",
                            "key": "terraform/terraform.tfstate",
                            "server_side_encryption": "",
                            "source": "/dev/null",
                            "storage_class": "STANDARD",
                            "tags.%": "0",
                            "version_id": "",
                            "website_redirect": ""
                        },
                        "meta": {},
                        "tainted": false
                    },
                    "deposed": [],
                    "provider": "provider.aws"
                },
                "aws_vpc.main": {
                    "type": "aws_vpc",
                    "depends_on": [],
                    "primary": {
                        "id": "vpc-04ca185a8098e7fe4",
                        "attributes": {
                            "arn": "arn:aws:ec2:us-east-1:415014508385:vpc/vpc-04ca185a8098e7fe4",
                            "assign_generated_ipv6_cidr_block": "false",
                            "cidr_block": "10.0.0.0/16",
                            "default_network_acl_id": "acl-032ee9a52d3e2c936",
                            "default_route_table_id": "rtb-05b20ced6846f0f56",
                            "default_security_group_id": "sg-06295eabdd0ff3888",
                            "dhcp_options_id": "dopt-cf96eab4",
                            "enable_classiclink": "false",
                            "enable_classiclink_dns_support": "false",
                            "enable_dns_hostnames": "false",
                            "enable_dns_support": "true",
                            "id": "vpc-04ca185a8098e7fe4",
                            "instance_tenancy": "default",
                            "main_route_table_id": "rtb-05b20ced6846f0f56",
                            "tags.%": "0"
                        },
                        "meta": {
                            "schema_version": "1"
                        },
                        "tainted": false
                    },
                    "deposed": [],
                    "provider": "provider.aws"
                }
            },
            "depends_on": []
        }
    ]
}

As we work on our infrastructure, other teammates will need to modify it and apply their changes, which also changes the Terraform state file. That is why it’s recommended to store this file in shared storage. Terraform supports multiple backends for this, such as etcd, azurerm, S3 or Consul.

This is an example of how to define the location of the Terraform state file using the S3 backend. A DynamoDB table is also used to lock access to the file, which is needed in case someone else is changing the infrastructure at the same time. The lock restricts write access to one user at a time.

terraform {
  required_version = ">= 0.11.7"
  backend "s3" {
    encrypt        = true
    bucket         = "bucket-with-terraform-state"
    dynamodb_table = "terraform-state-lock"
    region         = "us-east-1"
    key            = "locking_states/terraform.tfstate"
  }
}
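
The backend block assumes that the bucket and the lock table already exist. A minimal sketch of how they could be created from a separate configuration follows. The resource names are illustrative, and the inline versioning block assumes an AWS provider version from the same era as this post. The S3 backend expects the lock table to have a string hash key named LockID.

# global/terraform_state/main.tf (hypothetical location)

resource "aws_s3_bucket" "terraform_state" {
  bucket = "bucket-with-terraform-state"

  # Keep older versions of the state file so a bad write can be recovered.
  versioning {
    enabled = true
  }
}

resource "aws_dynamodb_table" "terraform_state_lock" {
  name           = "terraform-state-lock"
  read_capacity  = 1
  write_capacity = 1

  # The S3 backend stores its lock entries under this key.
  hash_key = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}

This configuration has to be applied with local state first, since the backend it creates is not available yet.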

As the Terraform state file can contain sensitive information (credentials), it’s recommended to encrypt the storage using the options provided by the backends.

Also, as your infrastructure grows and you need to define multiple environments, you should split your Terraform state by environment and by component inside each environment. This way you can work on different environments at the same time, and several teammates can work on different components of the same infrastructure without blocking each other (one user modifying databases and another modifying load balancers). This can be achieved with the key option in the backend definition:

# my_infra/prod/database/main.tf:
    ...
    key            = "prod/database/terraform.tfstate"
    ...
# my_infra/dev/database/main.tf:
    ...
    key            = "dev/database/terraform.tfstate"
    ...
# my_infra/dev/loadbalancer/main.tf:
    ...
    key            = "dev/loadbalancer/terraform.tfstate"
    ...

Split all the stuff

The Terraform states section mentioned splitting the Terraform state by environment and by component. You achieve this by building each component of your infrastructure in isolation from the others. How fine should the division be? That depends on the size and complexity of the project and the size of your team.

For example, some options of a component can be defined inside the resource itself as inline blocks, but sometimes it’s better to define these structures as separate resources. In this example an AWS Route Table has its route defined inline:

resource "aws_route_table" "route_table" {
  vpc_id = "${aws_vpc.vpc.id}"
  route {
    cidr_block = "10.0.1.0/24"
    gateway_id = "${aws_internet_gateway.example.id}"
  }
}

Alternatively, you can create the exact same route as a separate AWS Route resource:

resource "aws_route_table" "route_table" {
  vpc_id = "${aws_vpc.vpc.id}"
}
resource "aws_route" "route_1" {
  route_table_id = "${aws_route_table.route_table.id}"
  destination_cidr_block = "10.0.1.0/24"
  gateway_id = "${aws_internet_gateway.example.id}"
}

This gives you more flexibility in the definition of your infrastructure, but it increases complexity. Keep in mind that it’s easier to group things once they are defined than to split them after you have already deployed your infrastructure.

As the complexity can increase, if you want to deploy all your infrastructure in one command you can use Bash scripts or tools like Ansible or Terragrunt.

Outputs everywhere

Outputs show the information you want once the Terraform templates are deployed. They also work inside modules, to export information from them.

output "instance_id" {
  value = "${aws_instance.instance_ws.id}"
}

When used with modules, two outputs must be defined: one in the module and a matching one in the configuration that calls it. Outputs have to be defined explicitly. Output information is stored in the Terraform state file and can be queried from other Terraform templates.
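
As a sketch of that two-step export, assume the module from the earlier section is called under the name development_frontend and that we want the name of its Auto Scaling Group. The output names and file names below are illustrative:

# modules/my_module/outputs.tf (hypothetical file name)

# Step 1: the module exposes the attribute to its caller.
output "autoscaling_group_name" {
  value = "${aws_autoscaling_group.autoscaling_group.name}"
}

# my_dev.tf

# Step 2: the calling configuration re-exports the module output,
# so it is printed after terraform apply and stored in the state.
output "dev_autoscaling_group_name" {
  value = "${module.development_frontend.autoscaling_group_name}"
}

Without the second block the value stays internal to the module and is not shown by terraform output.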

It’s recommended to define outputs for resources even if you are not using them at the moment. Check the attributes exported by the resource and choose wisely which information could be useful elsewhere in your infrastructure. This way you reduce the need to go back and edit your module just because one small output is needed by a new resource that you are defining.

Also, to keep your files organized, you can save the outputs in a dedicated file called outputs.tf.

Define all the little things

When creating components, Terraform uses the provider’s default options for anything you do not define. It’s important to know which default components are in use and to define them in Terraform, as you may need them in the future and providers can change default options without notice.

Examples of this are Route Tables, which are sometimes overlooked at the beginning of a project, or Elastic Container Repositories, which are easy to define but not always top of mind.

resource "aws_ecr_repository" "repository" {
  name = "name_of_repo"
}

Terraform interpolations

The interpolation syntax is powerful and allows you to reference variables, attributes of resources, call functions, etc.

String variables -> ${var.foo}
Map variables -> ${var.amis["us-east-1"]}
List variables -> ${var.subnets[idx]}
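
A minimal sketch that puts the three forms together. The variable names, the subnet IDs and the resource name are assumptions made for illustration, and the AMI ID is the one used earlier in this post:

# interpolation_example.tf (hypothetical file name)

variable "region" {
  default = "us-east-1"
}

# Map variable: one AMI per region.
variable "amis" {
  default = {
    "us-east-1" = "ami-04681a1dbd79675a5"
  }
}

# List variable: placeholder subnet IDs.
variable "subnets" {
  default = ["subnet-aaaa1111", "subnet-bbbb2222"]
}

resource "aws_instance" "web" {
  count         = 2
  instance_type = "t3.micro"

  # Map lookup, using a string variable as the key.
  ami = "${var.amis[var.region]}"

  # List lookup by index, one subnet per instance.
  subnet_id = "${element(var.subnets, count.index)}"
}

The element function wraps around when the index is larger than the list, which a plain index does not do.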

When you need to grab data from a module output or from a data source, you can use the module. or data. prefix to reference the desired attributes.

# Getting information from a module
output "my_module_bar_value_from_module" {
  value = "${module.my_module.bar}"
}

You can also use arithmetic or logical operations in interpolations. In this snippet, the VPN resource is only created if var.something evaluates to true (1, true):

resource "aws_instance" "vpn" {
  count = "${var.something ? 1 : 0}"
}

You can check more information about the supported Interpolations in the Terraform documentation

Environments management

DVCS Branch for each environment

It’s possible to manage the different environments of a Terraform deployment using branches. This approach has advantages and issues:

Advantages:

  1. Usage of git flow

Issues:

  1. In each branch, the Terraform state storage settings must always be modified
  2. It is not clear at deployment time (terraform apply) which environment is really being affected (you have to check with git status or git branch)
  3. Variables are not easy to define, as before each merge the values must be set for the target environment and reverted after the merge

Workspaces from Terraform

Terraform introduced this feature in version 0.9 under the name Environments, and from 0.10 onwards it is called Workspaces.

It is possible to create new workspaces, switch between them or delete them using the terraform workspace command.

$ terraform workspace -h
Usage: terraform workspace

  Create, change and delete Terraform workspaces.

Subcommands:
    delete    Delete a workspace
    list      List Workspaces
    new       Create a new workspace
    select    Select a workspace
    show      Show the name of the current workspace

Advantages:

  1. Defined by HashiCorp, so it’s likely that more and better features will be developed
  2. Reduces the amount of duplicated code

Issues:

  1. Still an early implementation
  2. Not yet supported by all the backends
  3. Not clear on the deployment (terraform apply) which workspace will be used (terraform workspace show)
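
Inside the configuration, the name of the selected workspace is available as terraform.workspace. A minimal sketch of using it to keep resources of different workspaces apart (the bucket and resource names are made up for illustration):

# workspace_example.tf (hypothetical file name)

resource "aws_s3_bucket" "artifacts" {
  # terraform.workspace resolves to the name of the selected workspace,
  # for example "default", "dev" or "prod".
  bucket = "my-project-artifacts-${terraform.workspace}"

  tags = {
    Environment = "${terraform.workspace}"
  }
}

With this, the same code produces a differently named bucket in each workspace, which is how workspaces reduce duplication.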

Folders structure

One simple and useful option is to define components inside folders by environment.

project-01
├── dev
│   ├── clusters
│   │   └── ecs_cluster
│   │       ├── service01
│   │       │   ├── main.tf
│   │       │   ├── outputs.tf
│   │       │   ├── variables.tf
│   │       ├── service02
│   │       ├── service03
│   │       ├── service04
│   ├── database
│   │   ├── main.tf
│   │   ├── outputs.tf
│   │   └── variables.tf
│   ├── elasticsearch
│   │   ├── main.tf
│   │   ├── outputs.tf
│   │   └── variables.tf
│   └── vpc
│       ├── main.tf
│       ├── outputs.tf
│       └── variables.tf
├── global
│   │   └── web_login
│   │       ├── main.tf
│   │       ├── outputs.tf
│   │       └── variables.tf
│   └── terraform_state
│       ├── output.tf
│       ├── variables.tf
│       └── main.tf
├── prod
│   ├── clusters
│   ├── database
│   ├── elasticsearch
│   └── vpc
│       ├── main.tf
│       ├── outputs.tf
│       └── variables.tf
├── qa
│   ├── clusters
│   ├── database
│   ├── elasticsearch
│   └── vpc
│       ├── main.tf
│       ├── outputs.tf
│       └── variables.tf
└── README.md

Advantages:

  1. Clear definition of the environment being deployed (it’s in the folder path)
  2. Most used and fail-proof on public deployments
  3. Terraform States can be defined for each environment folder with no issues
  4. Specify the name of the outputs for each environment

Issues:

  1. Duplicated code
  2. An overwhelming amount of folders if the project is too big
  3. Copying of code and replacing of core values is always needed

Which one should you choose? It depends. The last one, the folder structure by environment, is simple and easy to use, but another approach may suit your project better. Either way, take a look at the folder structure shown above: it’s a recommended way to split the different components of your infrastructure.

The terraform command has multiple options, but this is a recommended workflow at deployment time:

  1. Download the modules and force the update

    The command terraform init initializes the workspace, downloading the providers and modules and setting up the Terraform state backend. But if a module is already downloaded, Terraform won’t recognize that a new version of it is available. With terraform get it is possible to download the modules, and it’s recommended to use the -update option to force an update.

    terraform get -update
    
  2. Once you have the latest modules, it is necessary to initialize the Terraform workspace, which downloads the providers and modules (already done by the first command) and initializes the Terraform state backend

    terraform init
    

    It’s possible to also use the -upgrade option to force the update of providers, plugins and modules.

  3. Before the deployment, Terraform is able to define a plan of the deployment (what will be created, modified or destroyed). This plan is useful to use on a Pipeline to check the changes before the real deployment.

    It’s important to note that the plan is not always 1:1 with the deployment. If a component already exists outside of Terraform, or if the credentials lack some permissions, the plan will pass but the deployment could fail.

    terraform plan
    

    To simplify things, it’s possible to run all in one bash line: terraform get -update && terraform init && terraform plan

  4. In the final step, the deployment, Terraform will create a plan and ask you to respond with yes or no before deploying the desired architecture.

    terraform apply
    

    Terraform can’t roll back what it has deployed. So, if an error appears during the deployment, the issue has to be solved and the apply repeated. It is also possible to destroy the infrastructure (terraform destroy), but that destroys the resources rather than rolling back the changes.

    It’s possible to apply a specific change or destroy a specific resource with the -target option.

Calling the data from outside

Use of data sources allows a Terraform configuration to build on information defined outside of Terraform, or defined by another separate Terraform configuration.

  • Data from a remote state. This is useful to read the state of other Terraform deployments:

    data "terraform_remote_state" "vpc_state" {
      backend = "s3"
      config {
        bucket = "ci-cd-terraform-state"
        key = "vpc/terraform.tfstate"
        region = "us-east-1"
      }
    }
    
  • Data from AWS or external systems:

    data "aws_ami" "linux" {
      most_recent = true
    
      filter {
        name   = "name"
        values = ["amzn2-ami-hvm-2.0.20180810-x86_64-gp2*"]
      }
    
      filter {
        name   = "virtualization-type"
        values = ["hvm"]
      }
    
      owners = ["137112412989"]
    }
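
Once declared, both data sources can be referenced from resources with the data. prefix. A minimal sketch, assuming the VPC configuration behind vpc_state exports an output named subnet_id (that output name and the resource name are hypothetical):

# app_instance.tf (hypothetical file name)

resource "aws_instance" "app" {
  # ID of the AMI matched by the aws_ami data source.
  ami           = "${data.aws_ami.linux.id}"
  instance_type = "t3.micro"

  # Output read from the remote state of the VPC deployment.
  # This is the Terraform 0.11 form; from 0.12 onwards the value is
  # reached through data.terraform_remote_state.vpc_state.outputs.subnet_id.
  subnet_id = "${data.terraform_remote_state.vpc_state.subnet_id}"
}

Only values that the other configuration declares as outputs are visible this way, which is one more reason to define outputs generously.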
    

Little issues and how to handle them

Timeouts

Network timeouts can occur because of latency in the connection to the cloud provider, and some operations simply take a long time to complete.

Both cases can be handled with the timeouts block inside Terraform resources.

timeouts {
  create = "60m"
  delete = "2h"
}

Also, if no timeout is defined and a deployment ends up incomplete, running terraform apply again can solve the issue.
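
The timeouts block only works on resources that support it, and it is placed inside the resource. A minimal sketch on an RDS instance, a resource type whose creation is often slow. The database name, user and variable name are illustrative, and the argument names assume an AWS provider version from the same era as this post:

# database_timeouts.tf (hypothetical file name)

variable "db_password" {
  description = "Master password for the database, passed in at apply time"
}

resource "aws_db_instance" "database" {
  allocated_storage   = 20
  engine              = "postgres"
  instance_class      = "db.t2.micro"
  name                = "appdb"
  username            = "appuser"
  password            = "${var.db_password}"
  skip_final_snapshot = true

  # Allow slow create and delete operations to finish
  # instead of failing with the provider's default limits.
  timeouts {
    create = "60m"
    delete = "2h"
  }
}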

Dependencies

Terraform can handle dependencies between resources, but sometimes a resource has to be created only after another one has been deployed. One example is the relationship between Target Groups and an Application Load Balancer. The depends_on option can be used to declare the dependency explicitly:

resource "aws_alb_target_group" "alb_target_group" {
  depends_on = ["aws_alb.alb"]
}

Sometimes it’s also possible to “solve” this by redeploying the infrastructure. As Terraform saves the state of your infrastructure, redeploying is not a problem.

Remaining lock states

Problems on the machine deploying the infrastructure can leave a stale lock behind, creating a “deadlock”. This can be solved by removing the stale lock, either with the terraform force-unlock command or by editing the state backend and deleting the leftover lock entry.

This can be dangerous and should be double-checked before making changes.

New and interesting stuff

Terratest

To test your code, you can use the great tool called Terratest. This tool uses Go’s unit testing framework.

Terraform version 0.12

The next Terraform version was announced a few months ago and will improve some interpolation and bring some changes to the HCL language.
