Once a month, we send out a newsletter to all Gruntwork customers that describes all the updates we’ve made in the last month, news in the DevOps industry, and important security updates. Note that many of the links below go to private repos in the Gruntwork Infrastructure as Code Library and Reference Architecture that are only accessible to customers.
Hello Grunts,
We haven’t written a newsletter since last December, so we wish you a belated happy new year! In the last couple months, we updated all of our modules to work with Terraform 0.11, released a new health-checker module, and made lots of important bug fixes. Make sure to check out the Security Updates section for information about Spectre and Meltdown, as they are two of the most severe security vulnerabilities in recent history.
As always, if you have any questions or need help, email us at support@gruntwork.io!
Motivation: Terraform 0.11 came out, but upgrading to it wasn't possible because it included a backwards-incompatible change that broke many modules.
Solution: We’ve gone through all the modules in our Infrastructure as Code Library and updated them to work with Terraform 0.11.
What to do about it: It should now be safe to upgrade to Terraform 0.11. You'll need to:

1. Update the required_version setting (if you're using it) in all the main.tf files in your infrastructure-modules repo.

2. Update to the latest versions of our modules: gruntwork-io/module-aws-monitoring, v0.9.0; gruntwork-io/module-data-storage, v0.5.0; gruntwork-io/module-ecs, v0.6.1; gruntwork-io/module-security, v0.7.0; gruntwork-io/module-server, v0.3.0; gruntwork-io/module-vpc, v0.4.0; gruntwork-io/package-lambda, v0.2.0; gruntwork-io/package-messaging, v0.1.0.

3. Run terragrunt plan on each of your modules and keep your eyes open for warnings like this: Warning: must use splat syntax to access xxx.yyy attribute "zzz", because it has "count" set. This is the backwards incompatibility in Terraform 0.11 rearing its ugly head. We fixed it in all of our modules, but you'll need to fix it in your own code in the infrastructure-modules repo. The fix is to find the offending xxx.yyy resource that is using a count parameter and to update any references to it from xxx.yyy.zzz to element(concat(xxx.yyy.*.zzz, list("")), 0). For example, if you had an aws_instance called foo with a count parameter and you wanted to access the public_ip attribute, instead of looking it up by doing aws_instance.foo.public_ip, you'd have to do element(concat(aws_instance.foo.*.public_ip, list("")), 0) (see the Terraform 0.11 upgrade guide for more info). Yes, it's ugly. See the sketch after this list for a concrete before-and-after example.

4. If you have any CI jobs that run terraform apply or terragrunt apply, add the -auto-approve flag to them. By default, the apply command in Terraform 0.11 is interactive, so if you don't add this flag, your CI build will hang!

Check out this commit (you must have access to the Acme sample Reference Architecture) for an example of the changes you'll have to make.
Motivation: While setting up the Confluent tools, we needed the ability to have a single health check report the uptime of multiple separate services.
Solution: We wrote a new open source tool, health-checker, that exposes an HTTP listener that responds with a 200 OK if a TCP connection can be successfully opened to one or more configurable ports.
What to do about it: If you need such a health-checker, consider downloading one of the binaries!
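As a rough sketch, an invocation might look like the following; the --listener and --port flags are taken from the project's README at the time of writing, so double-check them against the release you download:

```bash
# Expose an HTTP health check endpoint on port 5500 that returns 200 OK
# only if TCP connections can be opened to ports 8080 and 8081.
health-checker --listener "0.0.0.0:5500" --port 8080 --port 8081
```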
Motivation: The syslog module in module-aws-monitoring configures logrotate on all servers to automatically rotate and clean up old syslog files so they don't take up too much disk space. Unfortunately, the configuration had an issue where a process could maintain a file handle to the old log file and continue writing to it, even after rotation, allowing that file to grow indefinitely and eat up lots of disk space.
Solution: We've fixed the logrotate config in module-aws-monitoring, v0.8.0 [Update: actually, please use module-aws-monitoring, v0.8.1 due to a minor bug fix] using the copytruncate and maxsize settings.
What to do about it: Update your Packer templates to use v0.8.1 of the syslog module and redeploy your servers to pick up the fix!
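For reference, here is a minimal sketch of a logrotate config that uses those two settings; the path and values are illustrative, not the module's exact configuration:

```
/var/log/syslog {
    daily
    rotate 7
    # Rotate early if the file grows past 100 MB, even before the next
    # scheduled daily rotation.
    maxsize 100M
    # Truncate the original file in place instead of moving it, so a
    # process holding an open file handle keeps writing to the (now
    # truncated) file rather than to an ever-growing orphan.
    copytruncate
    compress
    missingok
    notifempty
}
```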
Motivation: The install-oracle-jdk module was no longer working, so Packer builds that were trying to install JDK 8 (e.g., for ZooKeeper or Kafka) were failing.
Solution: It seems that Oracle deletes old versions of the JDK when it releases new ones, so the URLs used to install the JDK fail. Moreover, new versions of the JDK require you to specify the new checksum, which there's obviously no way of getting ahead of time. For now, we've released v0.3.0 of package-zookeeper with a patched install-oracle-jdk module, but this is likely to break again in the future, so we may have to find a different way to manage JDK installs.
What to do about it: Update your Packer templates to use v0.3.0 of package-zookeeper.
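If your Packer template installs the module via the Gruntwork Installer, the update is a one-line version bump along these lines (a sketch; adjust to match how your template actually invokes the installer):

```bash
# Install the patched install-oracle-jdk module from the v0.3.0 release
# of package-zookeeper
gruntwork-install \
  --module-name "install-oracle-jdk" \
  --repo "https://github.com/gruntwork-io/package-zookeeper" \
  --tag "v0.3.0"
```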
What happened: Whereas before you could leave the user and admin lists empty in our CloudTrail module, it seems that AWS has changed its validation logic, and all KMS keys, including the ones used to encrypt CloudTrail logs, must now have at least one user and admin associated with them.
Why it matters: If kms_key_administrator_iam_arn or kms_key_user_iam_arns are empty (as they were, by default, in all accounts except the security account of the multi-account Reference Architecture), the next time you run apply in a cloudtrail folder in infrastructure-live, you'll get a validation error.
What to do about it: Specify the ARNs of (trusted!) users in your security account for the kms_key_administrator_iam_arn and kms_key_user_iam_arns settings.
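For example, in the terraform.tfvars for each cloudtrail module, that might look like the following; the ARN is a placeholder, and you should double-check the exact variable names and types against your version of the module:

```hcl
# Placeholder ARN: point these at real, trusted IAM users in your
# security account.
kms_key_administrator_iam_arn = ["arn:aws:iam::111122223333:user/security-admin"]
kms_key_user_iam_arns         = ["arn:aws:iam::111122223333:user/security-admin"]
```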
Other updates this month:

- Terragrunt's check for a backend { ... } block now also checks .tf.json files.
- Terragrunt now works with local backends.
- The apply-all command now automatically sets the -auto-approve parameter so apply happens non-interactively with Terraform 0.11.
- The server-group module now supports the option of adding DNS records to each ENI.
- The server-group module now allows users to specify their own list of names to be used when creating DNS records, as well as to associate an Elastic IP address with each ENI.
- The ecs-cluster-alarms and ecs-service-alarms modules now expose new input variables you can use to configure what the alarms should do if no data is being emitted (default is missing).
- The aurora module now exposes a db_cluster_parameter_group_name parameter you can use to set a custom parameter group name.
- The roll-out-ecs-cluster-update.py script will now display better error messages if it can't find your ECS cluster for some reason (e.g., you specified the wrong region).
- The cloudfront module now enables gzip compression by default.
- More settings in the openvpn-server module are now configurable.

What happened: AWS announced an Auto Naming API for Service Name Management and Discovery with Route53.
Why it matters: Previously, the only way for Microservice A to connect to Microservice B in an “AWS-native” way was to go through an ALB, which added an extra network hop and made it difficult to run a single microservice with both public and private access endpoints.
With this release, AWS allows a microservice to register itself with Route53 on boot, and Route53 will confirm those addresses with either Route53 globally distributed health checks (for publicly accessible services) or API queries to your ALB or NLB to determine the health check status of your microservice instance.
That means that it’s no longer necessary to run a service like Consul just to gather an up-to-date list of microservice endpoints.
What to do about it: Feel free to play around with this on your own. We plan to add support to our ECS Docker Cluster package (and, in the future, Fargate and EKS Docker Cluster packages) so that you can start using this out of the box!
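If you want to experiment, the new API is exposed under the servicediscovery namespace in the AWS CLI. A rough sketch, where all names and IDs are placeholders:

```bash
# Create a public DNS namespace for service discovery
aws servicediscovery create-public-dns-namespace \
  --name "services.example.com" \
  --create-request-id "example-request-1"

# Create a service in that namespace; instances registered under it get
# DNS records created automatically (replace ns-example with the
# namespace ID returned by the previous command)
aws servicediscovery create-service \
  --name "my-service" \
  --dns-config "NamespaceId=ns-example,DnsRecords=[{Type=A,TTL=60}]"
```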
What happened: After the Spectre and Meltdown vulnerabilities were announced, AWS has periodically been rolling out fixes.
Why it matters: Since these are hardware vulnerabilities in the CPU itself, the fixes often result in performance degradation. The impact is not entirely clear, but if in the last month you saw a sudden drop in CPU performance, despite absolutely no change on your behalf, it's possible that this is a fix AWS has rolled out to protect you. See this post for an example.
What to do about it: There’s nothing you can do about the performance impact. However, you should also ensure you update the kernel on all of your servers—and install updates on all developer computers—to ensure these vulnerabilities don’t affect you.
What happened: AWS has launched a new, unified way to manage Auto Scaling for EC2 instances, ECS, Aurora, DynamoDB, and other resources.
Why it matters: This gives you a more intuitive and centralized way to manage auto scaling for everything in your AWS account.
What to do about it: It does not look like Terraform supports this yet, so for now, you have to manually use the AWS Auto Scaling console.
What happened: The latest version of the Docker app for Mac allows you to run Kubernetes locally!
Why it matters: If you’re planning on using Kubernetes in the cloud (e.g., when Amazon’s managed Kubernetes, EKS, hits general availability), being able to run it locally will make local development and testing easier.
What to do about it: Follow these docs to run a Kubernetes cluster locally.
What happened: AWS has announced that it now officially supports writing Lambda functions in Go.
Why it matters: You no longer need hacks to run Go code using Lambda.
What to do about it: Deploy your lambda functions as always (e.g., using package-lambda), but specify go1.x as your runtime.
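With a vanilla aws_lambda_function resource, for example (a generic sketch rather than the package-lambda API; names and the IAM role are placeholders), only the runtime and handler change:

```hcl
resource "aws_lambda_function" "example" {
  function_name = "example-go-function"          # placeholder name
  filename      = "lambda.zip"                   # zip containing your compiled Go binary
  handler       = "main"                         # for go1.x, the name of the binary in the zip
  runtime       = "go1.x"                        # the new Go runtime
  role          = "${aws_iam_role.example.arn}"  # assumes an IAM role defined elsewhere
}
```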
What happened: Ansible has released a Terraform plugin.
Why it matters: If you’re using both Ansible and Terraform, this allows you to manage everything from Ansible.
What to do about it: Check out the Terraform plugin docs.
What happened: AWS now allows you to publish MySQL and MariaDB logs from RDS to CloudWatch Logs.
Why it matters: You can now see the general log, slow query log, audit log, and error log for your database directly in CloudWatch Logs, which will make it easier to debug and troubleshoot.
What to do about it: This feature is not yet supported in Terraform (see issue #3056), so if you’d like to use it, you’ll have to enable it manually in the AWS console.
Below is a list of critical security updates that may impact your services. We notify Gruntwork customers of these vulnerabilities as soon as we know of them via the Gruntwork Security Alerts mailing list. It is up to you to scan this list and decide which of these apply and what to do about them, but most of these are severe vulnerabilities, and we recommend patching them ASAP.
Spectre and Meltdown: To patch your servers, run yum update kernel or apt-get upgrade in your Packer or Docker builds and roll out the new images to all your servers. For personal computers, install the latest Windows and OS X updates. More patches will likely be released in the near future, so keep your eyes open and be ready to update. We sent an email about this vulnerability to the security alerts mailing list on January 4, 2018.
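As a sketch, the kernel update can be baked into an AMI with a Packer shell provisioner along these lines (this assumes a yum-based image such as Amazon Linux; use apt-get on Debian/Ubuntu):

```json
{
  "provisioners": [
    {
      "type": "shell",
      "inline": ["sudo yum update -y kernel"]
    }
  ]
}
```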