Avoiding AWS Access Keys

The AWS Well-Architected Framework is a set of recommendations by AWS, summarized in an 80-page PDF document. After focusing on cost optimization in my first article, this article looks at one specific aspect of the security pillar.

Passwords are bad

Yes, passwords are bad. I don’t need to repeat that, right? Anyway, a few words on it: managing passwords is a challenge, especially when you have to manage and hand them out as a central team. First, you end up spending a lot of time resetting passwords, and potentially even managing the secrets in some “secure” store. Second, you carry a security risk by keeping passwords active after employees have left the company, or simply the headache of protecting a central credentials store.

Instead of using AWS IAM users, use AWS IAM roles. Roles are a central piece of the AWS infrastructure, and every AWS service supports them. Notably, an EC2 instance can have an attached IAM instance profile. Once you attach an instance profile, all calls to AWS services from that EC2 machine are made with the specified IAM role.
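As a minimal sketch (the role name, profile name and instance ID below are placeholders), wiring up an instance profile with the AWS CLI looks roughly like this:

    # Create an instance profile, add an existing role to it, and attach it to a running instance.
    aws iam create-instance-profile --instance-profile-name my-app-profile
    aws iam add-role-to-instance-profile \
      --instance-profile-name my-app-profile \
      --role-name my-app-role
    aws ec2 associate-iam-instance-profile \
      --instance-id i-0123456789abcdef0 \
      --iam-instance-profile Name=my-app-profile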

Custom applications

I often see teams discussing how to securely store AWS secret keys in their development environment or in the tools they configure. The discussions are usually about how to pass the keys along to the build server and the production server. The answer is almost always: you don’t. Just ensure the EC2 machine uses an IAM instance profile (limited to the required permissions).

But wait, what about local development? I can’t assign an IAM instance profile to my machine. Again, don’t do anything in code. Instead, rely on the well-documented credential configuration outside of your application (see the example documentation for Node.js). The short version: configure your user’s AWS credentials (~/.aws/credentials) and auto-rotate them on a schedule (mirri.js is a good tool for that).
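For reference, the credentials file is a plain INI file; the key values below are obviously placeholders:

    [default]
    aws_access_key_id = AKIAEXAMPLEACCESSKEY
    aws_secret_access_key = exampleSecretAccessKeyValue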

If you use federated logins for your AWS account, an alternative is to leverage AWS STS and automatically generate temporary credentials every time you need them. This eliminates key rotation completely.
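As a sketch (the role ARN and session name are placeholders), assuming a role via the AWS CLI returns short-lived credentials that expire on their own:

    # Request temporary credentials; the returned access key, secret key and
    # session token expire automatically.
    aws sts assume-role \
      --role-arn arn:aws:iam::123456789012:role/developer \
      --role-session-name local-dev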

External services

There is also the case where you need to grant access to external services, for example an external build server like Travis CI, or a log collector like SumoLogic. Some might offer an option to configure an IAM role with an enterprise subscription, but often the only way is to actually use access keys. So you’re stuck with rotating them regularly, and the key is to automate that key rotation. Felix is a tool that supports some external services and definitely gives a baseline for how such automation can be written.

AWS Well-Architected Framework applied – Cost Optimization

The AWS Well-Architected Framework is a set of recommendations by AWS, summarized in an 80-page PDF document. Booooring. True. So I’m taking a different approach: a hands-on, developer-focused way of thinking about it.

Context

The 80-pager talks about 5 pillars: (1) Operational Excellence, (2) Security, (3) Reliability, (4) Performance Efficiency and (5) Cost Optimization. Those are very high-level aspects, but taking those guidelines and following them results in a cost-efficient, scalable and reliable cloud environment.

I just joined a new team, one that wasn’t using any deployment automation and that manages many systems built before the cloud became a real thing. So there’s clearly some trickiness to managing this infrastructure, but the initial impact of cleaning up the current state is high.

In my first two weeks I’ve been focusing on cost optimization (the 5th pillar) and security (the 2nd pillar). Today, I’m only going to talk about cost optimization.

(5) Cost Optimization

Let’s start with “why?”. Why should a “normal software engineer” bother? In the end, you might just be an employee of a multi-billion-dollar company. It’s simple: it helps increase the core financial metrics of the company (be it UFCF, profitability or margins), and therefore your bonus (which likely depends on one of those metrics). The less you and your team spend to deliver the same value, the higher the contribution. While it might be a minor contribution to the overall pot of a multi-billion-dollar company, those targets often get pushed down. In my specific case they were pushed down to our team: our team’s AWS spending was around 60,000 USD per month, resulting in over 700,000 USD per year. Think of it this way: how many employees could we add to our team instead of spending that much? (Or, if you want a more fun comparison, how many first-class flights around the world could you take?)

In short, I believe it’s every well-paid employee’s responsibility to be cost-conscious.

As specific actions, I’ve started to automate some parts of the AWS infrastructure by managing IAM roles, IAM users and a handful of other resources through AWS CloudFormation templates. At the same time, I also started to auto-curate some parts around cost and security. Specifically, there are now daily scripts that auto-terminate EC2 machines 30 days after they have been stopped and that delete detached volumes. That 1-2 day activity resulted in a cost avoidance of 3,000 USD per month (5% of the spend), with measures in place to avoid those kinds of costs forever.
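As a minimal sketch of the volume part (the instance part additionally needs the stop timestamp, which I leave out here), a daily cleanup of detached EBS volumes with the AWS CLI could look roughly like this:

    # Find all EBS volumes that are not attached to any instance ("available")
    # and delete them. Run this on a daily schedule, e.g. from a cron job.
    for volume in $(aws ec2 describe-volumes \
        --filters Name=status,Values=available \
        --query 'Volumes[].VolumeId' --output text); do
      aws ec2 delete-volume --volume-id "$volume"
    done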

So what can everyone do, and how can everyone contribute? Let’s start with specifics that help change your mindset, along with an easy introduction to the available tools:

  • Have a look at the AWS Trusted Advisor’s Cost Optimization section. It’s fairly basic, but as a specific example, it gave me the insight that our team had 10 TB of detached EBS volumes.
  • Also check the AWS Cost Explorer in the AWS Billing Dashboard.
  • Way better is Cloudability, a 3rd-party tool that analyzes AWS costs and offers optimizations. The easiest start is probably to set up a weekly report from Cloudability. This helps raise awareness, nothing else, just slowly becoming cost-conscious. Then there are the simple and advanced reports, insights and optimizations, which are well documented on the Cloudability pages.
  • One of my next targets will likely be rightsizing. For example, our team still has 25,000 IOPS (guaranteed I/O operations on disk) provisioned in a test environment, resulting in 1,750 USD per month (in addition to disk space). This might be the right choice for the testing needs, but if not, let’s simply not buy guaranteed I/O.
  • And then there are reserved instances. Cloudability offers very good insight and a risk assessment of which and how many instance hours a team should buy. For larger companies, the reserved instances can also be bought on the global account, thereby distributing the risk among all accounts in the organization.
  • Once you’ve done the basics, look at what’s available out there, from auto-scaling to auto-spotting to leveraging more AWS managed services.

I doubt you’ll end up flying first-class around the world. But at least you’ll avoid someone asking you to row across the ocean, or even to pay back part of the unnecessary spend you caused.

Deploying a Single Page App to AWS S3

There are a million ways to deploy a Single Page App (SPA) to S3. However, this is a very lightweight way, based on aws-architect.js. I don’t want to repeat the great readme of that repository, but nevertheless, here is a short introduction.

Bootstrap the deployment

To bootstrap the deployment, run the following code:
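(A sketch assuming the standard npm workflow; check the aws-architect.js readme for the exact, current commands.)

    # Add aws-architect.js to the project as a development dependency; its
    # deployment tasks are then wired into the build script (see the readme).
    npm install aws-architect --save-dev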

After that, adjust package.json and make.js so they only contain the S3-relevant options, since aws-architect.js would also allow you to create AWS Lambda based web services.

Add AWS CloudFront distribution

The most declarative way, with a straightforward deployment option, is to leverage AWS CloudFormation to configure the additional resources, notably an SSL certificate attached to your domain as well as the AWS CloudFront distribution. An easy way is to add that to the deploy command of make.js, but since these items rarely change, you could just as well apply them manually.

First, you’ll need an AWS Certificate Manager (ACM) certificate that can be used for AWS CloudFront later. The CloudFormation template could look like this, allowing for multiple domains, for example to deploy a test or pull-request version as well as the production version to the same AWS S3 bucket:
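(A sketch with placeholder domain names; note that a certificate used by CloudFront has to be created in us-east-1.)

    Resources:
      SiteCertificate:
        Type: AWS::CertificateManager::Certificate
        Properties:
          # Production domain plus an additional name for test / pull-request
          # deployments. Both domain names are placeholders.
          DomainName: app.example.com
          SubjectAlternativeNames:
            - app-test.example.com
          ValidationMethod: DNS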

Now we can take the ACM certificate and create an AWS CloudFront distribution with it, which includes two URLs: one for production and one for test or pull requests. In this example, it’s assumed that pull requests are deployed to S3 using a path that starts with PR. The configuration looks like this:
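(Again a sketch with placeholder names; the S3 website endpoint, aliases and certificate reference need to match your own setup.)

    Resources:
      SiteDistribution:
        Type: AWS::CloudFront::Distribution
        Properties:
          DistributionConfig:
            Enabled: true
            # One alias for production, one for test / pull-request deployments
            # (both placeholders).
            Aliases:
              - app.example.com
              - app-test.example.com
            DefaultRootObject: index.html
            Origins:
              - Id: s3-website
                # S3 static website endpoint of the deployment bucket (placeholder).
                DomainName: my-spa-bucket.s3-website-us-east-1.amazonaws.com
                CustomOriginConfig:
                  OriginProtocolPolicy: http-only
            DefaultCacheBehavior:
              TargetOriginId: s3-website
              ViewerProtocolPolicy: redirect-to-https
              ForwardedValues:
                QueryString: false
            ViewerCertificate:
              # Reference to the certificate above; use the certificate ARN directly
              # if it lives in a separate template.
              AcmCertificateArn: !Ref SiteCertificate
              SslSupportMethod: sni-only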

That’s all. It’s slightly painful to get the ACM certificate and CloudFront distribution settings right for your needs, but afterwards, just execute make.js as part of your build and aws-architect.js does the rest for you.

Delete your local master

I’ve been involved in many pull requests where the pull request got discarded and re-created from scratch, with “merge conflicts” given as the reason. Or people taking half a day to resolve “unexpected” or “weird” behavior in their git repository. While the reasons vary, one pattern emerged: most of the time people were committing to their local master, potentially pushing it and ending up with an out-of-sync master on their fork, and further along merging from a branch called “master” that doesn’t represent what they think master should represent.

One of the simplest solutions is to just not have a local master.

Furthermore, among the many git features that can be leveraged, I usually get along with the following commands to get the essential branching right:
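(A sketch of the workflow; the repository URL and branch names are placeholders.)

    # Clone and immediately work on a feature branch created from the remote master,
    # without ever checking out a local master.
    git clone git@github.com:example/repo.git
    cd repo
    git checkout -b my-feature origin/master

    # Keep the feature branch up to date directly from the remote.
    git fetch origin
    git rebase origin/master

    # Publish the branch and, if a local master exists, delete it.
    git push -u origin my-feature
    git branch -d master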

There are some repositories where I keep a local master, for example when working with forked repositories on GitHub. But even then, I only use the master to merge the latest changes from upstream, and otherwise follow the pattern described above.

Using Nginx on Docker to redirect HTTP to HTTPS

I had a website running on HTTPS behind a load balancer and didn’t want to bother setting up HTTP as well. Instead, I configured the load balancer to point to a very simple Nginx web server that does nothing but redirect HTTP to HTTPS. So from the application side I only had to take care of HTTPS and could ignore any additional configuration. As a nice side effect, the Nginx redirection is generic, so I only need to run a single instance for all my applications.

Since I don’t need anything other than Nginx in the Docker image, I used Alpine Linux as a base and added Nginx, or more precisely the preconfigured stable-alpine Nginx docker image from https://hub.docker.com/_/nginx/. The Dockerfile looks like the following:
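(A sketch; the whole Dockerfile is just the base image plus the custom configuration.)

    # Use the small Alpine-based Nginx image and replace the default configuration
    # with the redirect-only configuration below.
    FROM nginx:stable-alpine
    COPY nginx.conf /etc/nginx/nginx.conf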

And here is the related nginx.conf file, which gets copied in when the docker image is created:
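(A minimal sketch: a single server block that listens on port 80 and sends a permanent redirect to the same host and path over HTTPS.)

    events {}

    http {
      server {
        listen 80;

        # Redirect every request to the HTTPS equivalent of the requested URL.
        return 301 https://$host$request_uri;
      }
    }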

Assuming the Dockerfile and nginx.conf are in the same directory, a simple docker build command creates the docker image, which can then be loaded into your docker host. Writing a small script to include this step in your build automation should be fairly trivial, depending on your needs.
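(For example, with a placeholder image name:)

    docker build -t http-to-https-redirect .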

LambdaWrap blog post at LifeInVistaprint

Recently, I’ve worked on LambdaWrap as part of deploying an AWS Lambda based microservice. LambdaWrap is an open source Ruby gem that makes it easy to publish Lambda functions and their associated API Gateway configuration during automated deployment when using rakefiles.

I got the opportunity to write a longer blog post than this note at LifeInVistaprint. The blog post “LambdaWrap, a Ruby GEM for AWS Lambda” is available at http://lifeinvistaprint.com/techblog/lambdawrap-ruby-gem-aws-lambda/.

xUnit2NUnit web service

There are various easy ways to convert one XML file into another XML file, usually through an XSL transformation. Most languages support this with little code. For example, in C# it could be something like this (simplified):
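(A sketch; the stylesheet and file names are placeholders.)

    // Load an XSLT stylesheet and transform the xUnit result file into NUnit format.
    using System.Xml.Xsl;

    var transform = new XslCompiledTransform();
    transform.Load("xunit-to-nunit.xslt");
    transform.Transform("xunit-results.xml", "nunit-results.xml");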

However, I decided to create a simple web service to achieve a few things. For one, to demonstrate that such repetitive code can be wrapped into a service and accessed from everywhere, for example as part of build scripts. Other reasons include that I wanted to play with ASP.NET 5, experiment with more complex options than just request-response despite the simple use case, and host the service on an Azure Web App.

The current version of the service can be found on GitHub at https://github.com/thoean/xUnit2NUnit. It builds with AppVeyor, and as part of the build, I start the web service in the background and run a very basic service-level test against it (I’ve written a dedicated blog post on service-level testing recently).

Logging an exception and structured message data with ASP.NET 5

Once you start using ASP.NET 5 (aka ASP.NET vNext), it’s surprisingly difficult to add a message that contains structured data to an exception. The LoggerExtensions support a flat message together with an exception (which is probably good enough for a lot of cases), but they don’t support structured data for more sophisticated analysis.

Workaround

Let’s quickly look at a simple workaround, which is a two-liner instead of a one-liner:
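(A sketch assuming an ILogger instance; the field names are made up, and the exact overloads varied between the beta releases.)

    // Log the structured message data and the exception as two separate entries.
    logger.LogError("Converting report {ReportId} failed", reportId);
    logger.LogError("Exception: " + exception);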

Solution

The logging extensions only support structured log messages when FormattedLogValues are provided, but they don’t offer an interface that does the formatting behind the scenes. This is what I actually want to do:
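(Again a sketch with made-up field names: a single call that takes the exception, the structured message and its arguments, which is roughly the shape later versions of the logging extensions ended up providing.)

    // One call that carries both the exception and the structured message data.
    logger.LogError(exception, "Converting report {ReportId} failed", reportId);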

I have created a pull request against the aspnet/Logging repository, but as of late October 2015, it hasn’t been accepted. For the time being, you can simply copy/paste my changes into your project and start using the simplified logging capabilities.

Background

Why am I so obsessed with structured data in messages? Probably simply because I got used to it. I started using Elasticsearch as a logging sink over a year ago, and I mostly analyze fields instead of full-text messages. The detailed story is long, and probably worth a few posts, but it’s mostly about analyzing numeric data like performance data, or in the case of errors, looking into categories of errors (instead of hard-to-read error numbers).

Service level tests with the .NET Execution Environment (DNX)

While writing a fairly simple RESTful service with ASP.NET 5, I started to leverage the ease of self-hosting with the .NET Execution Environment (DNX) by executing service-level tests as part of the automated build. I use PowerShell for the build script: I first publish the service to a temporary directory, start it in a background process, and then perform the actual tests.
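(A sketch; the project layout, output paths and the exact dnu/dnx invocations are assumptions, since the commands changed between the beta releases.)

    # Publish the service to a temporary directory (paths are placeholders).
    dnu publish src\MyService --out artifacts\service --no-source

    # Start the self-hosted service in the background and give it a moment to come up.
    $service = Start-Process -FilePath "artifacts\service\web.cmd" -PassThru
    Start-Sleep -Seconds 5

    # Run the service-level tests against the running instance, then shut it down.
    Push-Location test\MyService.ServiceTests
    dnx test
    Pop-Location
    Stop-Process -Id $service.Id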

With these few lines of code I am able to validate the service’s health after every commit, notably by running this set of tests as part of a pull-request build before merging to the main line.

One example where I used such a build script is the xUnit2NUnit web service. The source code can be found on GitHub, and there is a description in a separate blog post.

Automated build for ASP.NET vNext hangs

I had an automated build with ASP.NET vNext working correctly for a few weeks until it suddenly stopped working. It was hanging, with the last output message being:

removing Process DNX_HOME

My PowerShell script started with something like the following:
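(A reconstructed sketch; the download URL and runtime version are assumptions based on the usual dnvm bootstrap of that era.)

    # Download and import dnvm, list the installed runtimes, then make sure the
    # required runtime is present.
    &{$Branch='dev'; iex ((New-Object net.webclient).DownloadString('https://raw.githubusercontent.com/aspnet/Home/dev/dnvminstall.ps1'))}
    dnvm list
    dnvm install 1.0.0-rc1-final
    dnvm use 1.0.0-rc1-final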

An internet search pointed me to an issue on ASP.NET and a few other sources, but double installation or spaces in usernames weren’t my problem.

Running this set of commands works very nicely when an environment is installed: it downloads the dnvm scripts, lists the currently installed environments, and eventually ensures the required environment gets installed if it isn’t already available.

My build server had gone through some auto-cleanup process and therefore had no environments installed. If that’s the case, dnvm list starts in interactive mode, asking the user “It looks like you don’t have any runtimes installed. Do you want us to install a runtime to get you started?”. That prompt didn’t show up in my automated build output (using Jenkins), so it was hard to troubleshoot, especially since I didn’t expect any interactive mode in the dnvm commands.

Once found, the solution was really simple: just remove dnvm list, or if that is interesting information as part of the build script, move it after dnvm install.