Once a month, we send out a newsletter to all Gruntwork customers that describes all the updates we’ve made in the last month, news in the DevOps industry, and important security updates. Note that many of the links below go to private repos in the Gruntwork Infrastructure as Code Library and Reference Architecture that are only accessible to customers.
Hello Grunts,
In the last month, we made a number of improvements to our EKS modules, including adding support for Managed Node Groups and Fargate (serverless worker nodes), launched Gruntwork CIS Compliance for the Reference Architecture, added a new module for configuring AWS Security Hub, added new modules for creating a Terraform-backed AWS Landing Zone, updated the Reference Architecture to enable VPC logs and build Packer templates in custom VPCs, gave a talk on automated testing best practices for infrastructure code, launched a dedicated website for Terragrunt, wrote a blog post on how to manage multiple versions of Terragrunt and Terraform as a team, and shared our 2019 Year in Review.
As always, if you have any questions or need help, email us at support@gruntwork.io!
Motivation: Last year in November, AWS announced support for Managed Node Groups, a feature that provides a managed ASG that is optimized for use with EKS. Then, in December, AWS announced support for Fargate in EKS, a serverless option for Kubernetes workers where you do not need to deploy and manage any EC2 instances (servers). With these two features, users now have three options for running their workloads on EKS. To many new users of EKS, it isn't immediately obvious what the trade offs are between the three options, and there is no easy way to figure that out other than to ramp up on all three.
Solution: We wrote a blog post, A Comprehensive Guide to EKS Worker Nodes, that introduces each of the options for worker nodes, how they work, and what trade offs you should consider when picking which one to use. We also updated our EKS module, terraform-aws-eks, to support Managed Node Groups and Fargate. You can now use either the eks-cluster-managed-workers module to provision AWS-managed worker nodes or the eks-cluster-workers module to provision self-managed worker nodes. Additionally, the control plane module (eks-cluster-control-plane) now exposes an input variable, fargate_only, which can be used to configure the cluster with default Fargate Profiles so that all Pods in the kube-system and default Namespaces (the two Namespaces created by default on all EKS clusters) will deploy to Fargate.
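Here is a minimal sketch of what enabling fargate_only on the control plane module might look like. Only the module name, the fargate_only variable, and the v0.13.0 release come from this announcement; the source path, cluster name, and networking inputs are illustrative assumptions, so check the module's variables.tf for the real interface.

```hcl
module "eks_cluster" {
  # The module path and ref below follow typical Gruntwork conventions but are
  # assumptions; confirm the exact path and pin the release you actually use.
  source = "git::git@github.com:gruntwork-io/terraform-aws-eks.git//modules/eks-cluster-control-plane?ref=v0.13.0"

  # Placeholder values; wire these up to your own VPC and subnets.
  cluster_name = "example-eks-cluster"
  vpc_id       = var.vpc_id
  subnet_ids   = var.private_subnet_ids

  # Schedule all Pods in the kube-system and default Namespaces onto Fargate,
  # so the cluster needs no EC2 worker nodes at all.
  fargate_only = true
}
```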
Other EKS updates:

- eks-vpc-tags now supports multiple EKS clusters.
- The eks-alb-ingress-controller module now supports executing arbitrary code on destroy of the module.
- eks-k8s-role-mapping and eks-cluster-control-plane now support Python 3.8. Finally, the fluentd helm charts used in eks-cloudwatch-container-logs have been updated to the latest version.
- The new eks-cluster-managed-workers module can be used to provision Managed Node Groups.
- You can now enable root volume encryption on the instances in the eks-cluster-workers module using the cluster_instance_root_volume_encryption input variable. Additionally, you can now define the --txt-owner-id argument using the txt_owner_id input variable for external-dns, which can be used to uniquely identify which instance of external-dns created the DNS records on the Hosted Zone. This release also fixes a bug where the YAML config outputted by the eks-k8s-role-mapping module was in a non-deterministic order. While this was behaviorally correct, it induced an inconvenience where the ConfigMap was reported as out of date by Terraform.

What to do about it: Check out the blog post to learn more about the three worker node options, and upgrade your EKS modules to the latest version (v0.13.0) to try them out!
Motivation: Last month, we announced Gruntwork Compliance for the CIS AWS Foundations Benchmark. The AWS Foundations Benchmark is an objective, consensus-driven guideline for establishing secure infrastructure on AWS from the Center for Internet Security. With Gruntwork Compliance, you gain access to a set of modules from the Infrastructure as Code Library that were certified by Gruntwork and CIS to be compliant with the Benchmark. However, you still had to combine the modules together into a coherent architecture to set up CIS compliant accounts.
Solution: This month, we are excited to announce a CIS Compliant Reference Architecture! The Reference Architecture is an opinionated end-to-end tech stack built on top of the Infrastructure as Code Library that we deploy into your AWS accounts in about a day. We have now extended the Reference Architecture to be CIS compliant for customers who have Gruntwork Compliance. This means that not only do you get a full end-to-end, production-grade architecture in a day, but it will also be certified to comply with the CIS AWS Foundations Benchmark.
If you are a Gruntwork Compliance customer, you can check out the example Reference Architecture code repositories.
AWS Security Hub Module: This month, we also extended our compliance modules with a module that enables AWS Security Hub in all regions of an account. AWS Security Hub will automatically and continuously monitor your AWS account for CIS compliance status. This provides a way to ensure your account stays compliant over time as you apply changes to your infrastructure.
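As a rough illustration of how you might wire this up, here is a hypothetical call to the new aws-securityhub module. The module name comes from this release, but the source path, ref, and the absence of required inputs are assumptions; refer to the module's README for the real interface.

```hcl
module "security_hub" {
  # Hypothetical source path and ref, shown for illustration only; look up the
  # actual module path and latest release in the module-security repo.
  source = "git::git@github.com:gruntwork-io/module-security.git//modules/aws-securityhub?ref=v0.23.0"

  # The module enables AWS Security Hub across all regions of the authenticated
  # account, so there may be little or nothing to configure here.
}
```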
Other Compliance Updates
- Fixed terraform deprecation warnings in the aws-config module caused by referring to providers as strings. Also fixed a bug where the cloudtrail module could fail as it attempted to create the access logging bucket even when s3_bucket_already_exists is set to true.
- The iam-password-policy module no longer embeds the provider configuration, similar to the other modules in this repository. This allows users to better customize the provider setup. Note that this is a backwards incompatible release; refer to the release notes for information on how to migrate to this version.
- Updated custom-iam-entity to the latest version to pull in a fix for newer versions of terraform.
- The custom-iam-entity module now supports creating policies to grant full access to arbitrary services that may not have AWS managed policies.
- Updated the cloudtrail module to directly call the cloudwatch-logs-metric-filters module to configure the metric filters alongside the management of CloudTrail. Additionally, both the cloudtrail module and the cloudwatch-logs-metric-filters module have been modified to no longer configure the aws provider within the module. Instead, you should configure the providers in your wrapper modules, or provide the configuration with terragrunt. This release also introduces a new module, aws-securityhub, which can be used to configure AWS Security Hub in all regions of the authenticated account. Refer to the README for more information.
- Additional updates to the cloudtrail module.

What to do about it: If you are a Gruntwork Compliance customer, check out the new AWS Security Hub module. If you are interested in a CIS compliant Reference Architecture, contact us!
Motivation: Creating all the AWS accounts in an AWS Organization can be tricky, as you have to configure each account with a security baseline that includes IAM Roles, IAM Users, IAM Groups, IAM Password Policy, AWS CloudTrail, AWS Config, AWS Config Rules, Amazon GuardDuty, a VPC, and more. AWS offers two products to make it easier to set up AWS accounts, AWS Landing Zone and AWS Control Tower, but both have significant limitations: AWS Landing Zone is only available via an AWS ProServe consulting engagement, which can be quite expensive and time consuming, and the quality and flexibility of the (mostly CloudFormation) code is questionable; AWS Control Tower allows you to create new AWS accounts by clicking around the AWS console, but it's slow, clunky, only supported in new AWS accounts that don't already have AWS Organizations enabled, and it only supports a limited set of pre-defined account baseline blueprints, so you have little ability to customize the accounts or manage them as code.
Solution: We have released a set of new Terraform modules to help you create AWS accounts with all the features of AWS Landing Zone, but 100% customizable and managed as code. The new modules include:
- aws-organizations: Create a new AWS Organization and provision new AWS accounts under your organization.
- aws-organizations-config-rules: Configure a best-practices set of Managed Config Rules for your AWS Organization root and all child accounts.
- guard-duty-multi-region: Configure AWS GuardDuty, a service for detecting threats and continuously monitoring your AWS accounts and workloads for malicious activity and unauthorized behavior.

You can combine these modules to create your own Terraform-backed Landing Zone that matches your exact requirements and is fully managed as code, as sketched below. We have examples, docs, and reusable account baseline modules coming soon!
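To make that concrete, here is a hypothetical sketch of calling the aws-organizations module to provision child accounts. Only the module name comes from this release; the source path, ref, and the child_accounts input variable are illustrative assumptions, so check the module documentation for the actual interface.

```hcl
module "organization" {
  # Hypothetical source path, ref, and inputs, shown for illustration only.
  source = "git::git@github.com:gruntwork-io/module-security.git//modules/aws-organizations?ref=v0.23.0"

  # Example child accounts to provision under the AWS Organization; the
  # child_accounts variable name is an assumption.
  child_accounts = {
    dev = {
      email = "aws-dev@example.com"
    }
    prod = {
      email = "aws-prod@example.com"
    }
  }
}
```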
What to do about it: Check out the new modules in module-security to learn more.
Motivation: The Reference Architecture is an opinionated, end-to-end tech stack built on top of the Infrastructure as Code Library that can be deployed in about a day. However, we noted a few feature gaps in the current Reference Architecture implementation.
Solution: We have updated the Reference Architecture to address these gaps:

- The Packer templates now build in a VPC tagged with gruntwork.io/allow-packer, and the Mgmt VPC has been updated to include this tag. This ensures that even if you no longer have a default VPC, the packer templates still work. You can view this commit in infrastructure-modules to see the changes.
- The vpc-mgmt and vpc-app modules have been updated to configure VPC flow logs. You can view this commit for the infrastructure-modules update, and this commit for the equivalent changes in infrastructure-live.
- The download-rds-ca-certs.sh script, which we use for downloading the RDS TLS CA certificates for use in end-to-end encryption, has been updated to download the 2019 certificates. You can view this commit in infrastructure-modules to see the changes.

Other Ref Arch updates:

- destroy-all: see this commit in infrastructure-live for the changes.
- The ecs-service modules have been consolidated into one. You can see this commit in infrastructure-modules to see the changes.

What to do about it: Follow the examples in the commits above to update your own Reference Architecture.
Motivation: The amount of documentation for Terragrunt has grown over the last few years, and the README.md for the repo had gotten so big as to be nearly unusable.
Solution: We’ve launched a dedicated website for Terragrunt at https://terragrunt.gruntwork.io/! It breaks the documentation down across multiple pages, has more examples, and an easier-to-use navigation.
What to do about it: We hope the new website makes it easier to get started with Terragrunt. Let us know what you think!
Motivation: Many customers were asking for guidance on the best way to write automated tests for their infrastructure code.
Solution: We did a talk at QCon SF and AWS re:Invent called Automated Testing for Terraform, Docker, Packer, Kubernetes, and More. Topics covered include: unit tests, integration tests, end-to-end tests, dependency injection, test parallelism, retries and error handling, static analysis, and more.
What to do about it: Check out the slides and video on the InfoQ website.
Motivation: We wanted to take some time to pause, reflect on what we did, and think a little about the future of Gruntwork.
Solution: We wrote a blog post called A Year in Review, 2019, where we shared three lessons that helped us nearly triple our recurring revenue from $1M to $2.7M (with $0 in funding).
What to do about it: Read the blog post and share your feedback, especially on what you’d love to see us do in 2020!
Motivation: Terragrunt and Terraform are relatively young projects in the DevOps ecosystem. As such, both projects introduce backwards incompatible changes more often than we like. Oftentimes we find the need to switch between multiple terraform and terragrunt versions. We also want to enforce versions across the team to avoid inadvertent state file upgrades that are backwards incompatible.
Solution: We wrote a blog post describing a workflow around tfenv and tgenv that makes it easier to work with multi-version Infrastructure as Code projects!
What to do about it: Check out our blog post and let us know what you think!
Other open source updates:

- RunTerragrunt is now a public function that can be called programmatically when using terragrunt as a library.
- Fixed a bug where terragrunt console does not have readline properties and thus the output is garbled. Also fixed a bug where get_aws_account_id will cause terragrunt to crash if you do not have any credentials in the chain. Instead, terragrunt now fails gracefully with a proper error message.
- validate is no longer a command that accepts -var and -var-file arguments in terraform 0.12.X. Therefore, it is no longer in the list of commands returned by get_terraform_commands_that_need_vars.
- Terragrunt now sets AWS_SECURITY_TOKEN after assuming the role. This is useful for overriding the credentials set by tools like aws-vault when role chaining, as these tools also set the AWS_SECURITY_TOKEN, which confuses the AWS SDK.
- The new enabled-aws-regions module returns all enabled regions for an account. This is useful for designing modules that need to enable a specific resource or module on all regions of the account.
- run-pex-as-resource now supports configuring a destroy provisioner that runs the pex on destroy of the resource.
- run-pex-as-resource now outputs pex_done, which can be used as a dependency for linking resources that depend on the pex script being run.
- Added support for the Lock and LockTimeout parameters on terraform.Options.
- Added docker.Stop and docker.StopE, which can be used to stop a container on docker using docker stop.
- Updates to the ssh module.
- Fixed a bug where ServiceAccount annotations were not properly aligned with the metadata block.
Other Infrastructure as Code Library updates:

- You can now control which availability zones are used by the vpc-app and vpc-mgmt modules using the new input variables availability_zone_blacklisted_names, availability_zone_blacklisted_ids, and availability_zone_state.
- You can now add custom tags to the VPC using the vpc_custom_tags input variable.
- vpc-app and vpc-mgmt will create a single VPC endpoint for all tiers, instead of one for each tier. NOTE: Since the VPC endpoints need to be recreated with this change, existing VPCs will experience a brief outage when trying to reach these endpoints (S3 and DynamoDB) while the endpoints are being recreated when you upgrade to this release. You can expect up to 10 seconds of endpoint access downtime for terraform to do the recreation.
- Updates to acm-tls-certificate.
- The aurora module now configures cluster instances with (a) create_before_destroy = true, to ensure new instances are created before old ones are removed, and (b) ignore_changes = [engine_version], to ensure updates to engine_version will flow from the aws_rds_cluster.
- The relevant IAM policy now includes the DescribeDBClusterSnapshots action.
- Updates to the cloud-build-triggers module.
- The aws-config module now supports conditional logic to turn off all resources in the module. When you set the create_resources input variable to false, no resources will be created by the module. This is useful to conditionally turn off the module call in your code. Additionally, this fixes a bug where the AWS provider was being configured within the aws-config module, which made the module less flexible since you couldn't override the provider configuration.
- The new aws-organizations module allows you to create and manage your AWS Organization and child AWS accounts as code.
- The new aws-organizations-config-rules module allows you to configure a best-practices set of AWS Organization level managed config rules.
- The cloudtrail module will no longer attempt to create the server access logging S3 bucket if s3_bucket_already_exists is set to true, even if enable_s3_server_access_logging is true.
- The custom-iam-entity module now supports creating policies to grant full access to arbitrary services that may not have AWS managed policies.
- Removed the aws_region variable from the cloudtrail module. This variable was not used in the module, so you can safely omit it from the module parameters.
- Added a tags input variable on the lambda and lambda-edge modules.
- Added a create_resources boolean flag to the sns module, which works similarly to setting count to 1 or 0; this is necessary because terraform does not yet support this feature for modules.
- Fixed a bug where the s3-cloudfront module was not able to send CloudFront access logs to the S3 bucket. This has now been fixed by updating the policy on that S3 bucket.
- Added a new input variable, wait_for_deployment, to tell Terraform whether it should wait for CloudFront to finish deploying the distribution. If true, the module will wait for the distribution status to change from InProgress to Deployed. Setting this to false will skip the process.
- terraform-update-variable: fixes a bug where errors with running terraform fmt caused the tfvars file to be cleared out; also fixes a bug where string matching for the variable name was too relaxed, causing it to ignore prefixes. E.g., tag would match both tag and canary_tag.
- Updated the ec2-backup module to run on NodeJS 12 instead of 8, as version 8 is going EOL in February 2020.
- Updates to git-add-commit-push.
- Fixed a bug where install-exhibitor fails because maven version 3.6.1 is no longer available.
- The cloudwatch-logs-metric-filters module no longer configures an aws provider, and thus no longer needs the aws_region input variable. This also means that you will need to configure your provider outside of the module, which in turn allows you to customize the provider to your needs.
- The logs/cloudwatch-log-aggregation-iam-policy module can now be conditionally excluded based on the input variable create_resources. When create_resources is false, the module will not create any resources and becomes a no-op.
- Added support for a disable_api_termination input variable.
- Added support for a spot_price input variable.
- Added the ca_cert_identifier argument for aws_db_instance. This argument configures which CA certificate bundle is used by RDS. The expiration of the previous CA bundle is March 5, 2020, at which point TLS connections that haven't been updated will break. Refer to the AWS documentation on this. The argument defaults to rds-ca-2019. Once you run terraform apply with this update, it will update the instance, but the change will not take effect until the next DB modification window. You can use apply_immediately=true to restart the instance. Until the instance is restarted, the Terraform plan will result in a perpetual diff.
- The server-group module now outputs the instance profile name via the output variable iam_instance_profile_name.
What happened: Packer 1.5.0 released beta support for configuring Packer templates using HCL. You can read more about it in the official docs.
Why it matters: Prior to 1.5.0, Packer only supported templates that were defined using JSON. JSON has the advantage of being ubiquitous, but has a number of drawbacks, including lack of support for comments, multi-line strings, and code reuse, which made it difficult to write complex templates. Now, with Packer 1.5.0, you have the option to define your Packer templates using HCL (the same language used in Terraform). This makes it easier to:
- Organize and reuse your template configuration via source and build blocks.

What to do about it: Check out the new official docs on using HCL2 with Packer, including the migration guides to convert your Packer JSON templates to HCL, to give it a try and let us know what you think!
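To give you a flavor of the new syntax, here is a minimal sketch of an HCL2 Packer template with source and build blocks. The region, AMI filter, and provisioner are illustrative assumptions rather than anything from the Packer announcement.

```hcl
# example.pkr.hcl: build with `packer build example.pkr.hcl`
source "amazon-ebs" "ubuntu" {
  # Placeholder values; adjust for your own account and region.
  region        = "us-east-1"
  instance_type = "t3.micro"
  ami_name      = "example-ubuntu-ami"
  ssh_username  = "ubuntu"

  source_ami_filter {
    filters = {
      name                = "ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-*"
      virtualization-type = "hvm"
    }
    owners      = ["099720109477"] # Canonical
    most_recent = true
  }
}

build {
  sources = ["source.amazon-ebs.ubuntu"]

  # Unlike JSON templates, HCL2 supports comments like this one and
  # multi-line strings.
  provisioner "shell" {
    inline = ["echo 'Hello from an HCL2 Packer template'"]
  }
}
```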
What happened: AWS announced support for serverless Kubernetes in the form of Fargate for EKS.
Why it matters: Up until this point, you always had to run servers (EC2 instances) in your AWS account to act as worker nodes for your EKS cluster. This meant that you still had to manage the lifecycle of the instances, including worrying about SSH access and applying security patches. Now with Fargate, you can get a truly serverless experience with EKS. Fargate allows you to schedule Pods on ephemeral VMs that AWS fully manages for you. When you use Fargate, you no longer have to worry about all the concerns around scaling, logging, security, etc. that come with managing servers.

What to do about it: Check out our blog post on EKS worker nodes to learn more about all the trade offs with using Fargate, and try it out using the latest version of our terraform-aws-eks module.
What happened: AWS announced support for automatic resolution of EKS private Kubernetes endpoints, even over a peered VPC network.
Why it matters: Locking down your Kubernetes API endpoint is an important step in improving the security posture of your EKS cluster. EKS offers the ability to lock down the Kubernetes API endpoint so that it is only available within the VPC (private endpoint). Up until now, you could not access this endpoint over a peered VPC network unless you had enabled Route 53 DNS forwarding rules across the peered network, which was quite expensive to maintain for such trivial functionality. Now, with this update, you no longer need to configure, maintain, and pay for the forwarding rules, as AWS automatically sets them up for you.
What to do about it: This feature is available to all EKS clusters. You can safely remove the DNS forwarding rules and still access your EKS endpoints over a peered network.
What happened: AWS has added support for managed auto scaling for ECS.
Why it matters: Auto scaling ECS clusters used to be tricky, even if you used an Auto Scaling Group under the hood, as you had to carefully detect scale down events, identify the EC2 instances that were going to be terminated, and move any Docker containers on those instances to other instances in the cluster. Now, instead of manually managing all of this, managed auto scaling will do it all for you, automatically.
What to do about it: You can try out managed auto scaling using the aws_ecs_capacity_provider resource. However, please note that Terraform currently has some limitations that may make this resource tricky to use: e.g., Terraform doesn't currently support deleting ECS cluster capacity providers, so terraform destroy might not clean everything up correctly; also, managed auto scaling will automatically add tags to your Auto Scaling Groups that Terraform won't know how to manage, so you might get spurious diffs on plan.
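Here is a minimal sketch of wiring a capacity provider to an ECS cluster with managed scaling. The resource names, the referenced Auto Scaling Group, and the scaling thresholds are illustrative assumptions.

```hcl
# Assumes an Auto Scaling Group for the ECS worker nodes is defined elsewhere
# as aws_autoscaling_group.ecs_workers.
resource "aws_ecs_capacity_provider" "example" {
  name = "example-capacity-provider"

  auto_scaling_group_provider {
    auto_scaling_group_arn = aws_autoscaling_group.ecs_workers.arn

    # Let ECS scale the ASG to keep it roughly 75% utilized.
    managed_scaling {
      status                    = "ENABLED"
      target_capacity           = 75
      minimum_scaling_step_size = 1
      maximum_scaling_step_size = 10
    }
  }
}

resource "aws_ecs_cluster" "example" {
  name               = "example-cluster"
  capacity_providers = [aws_ecs_capacity_provider.example.name]
}
```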
What happened: AWS has added a provisioned concurrency feature for AWS Lambda, which allows you to automatically keep some number of Lambda functions warm.
Why it matters: One of the biggest drawbacks with AWS Lambda are cold starts, where each concurrent invocation of a Lambda function could take hundreds of milliseconds or even multiple seconds to boot up the first time. This made Lambda very tricky to use when latency mattered: e.g., for a web application. AWS now supports provisioned concurrency for Lambda functions, which allows you to “pre-warm” a configurable number of Lambda function containers to avoid cold starts. You can also use auto scaling policies to dynamically change the level of provisioned concurrency based on load.
What to do about it: If you’re using Lambda in latency-sensitive situations, give provisioned concurrency a try! You can configure it with Terraform by using the aws_lambda_provisioned_concurrency_config resource.
Below is a list of critical security updates that may impact your services. We notify Gruntwork customers of these vulnerabilities as soon as we know of them via the Gruntwork Security Alerts mailing list. It is up to you to scan this list and decide which of these apply and what to do about them, but most of these are severe vulnerabilities, and we recommend patching them ASAP.
A vulnerability was discovered in the npm and yarn CLIs used to install JavaScript packages. npm versions prior to 6.13.3 and yarn versions prior to 1.21.1 are vulnerable to a path escaping attack that allows any arbitrary package to overwrite binaries installed in /usr/local/bin. This means that if you installed a compromised package, you could end up with a malicious version of utilities that you regularly use. For example, an exploit of this vulnerability can overwrite your version of node or npm with a malicious version that can do arbitrary things like open a backdoor for remote access. We recommend that you immediately update your versions of npm and yarn to the latest versions to mitigate this vulnerability. You can learn more about this vulnerability at https://blog.npmjs.org/post/189618601100/binary-planting-with-the-npm-cli. We alerted our security mailing list about this vulnerability on December 12th, 2019.