Introduction

Introducing Houston, by Turbine Labs

Iterating rapidly while maintaining a stable experience is difficult, especially in an industry where the state of the art is in constant flux. Turbine Labs bridges the gap. With Houston, our application routing and release system, you can confidently test your code in production, release it incrementally, or rebuild your infrastructure, with no visible impact to customers.

Update your roadmap, not your site map

Product success depends on constant innovation, but it shouldn’t come at the expense of customer experience. As you add new features, test and release new software, or replace legacy infrastructure, your customers’ experience should continue uninterrupted. Houston lets you map customers’ traffic to your infrastructure, providing a flexible, dynamic interface between your changes and their experience.

Houston keeps changes narrowly focused and reversible, reducing the risk and cost of service outages. You can safely develop features on production infrastructure that stay invisible until they’re polished. You can release new software incrementally to customers, comparing old and new versions as you go. Simply turn off the release if anything looks wrong. You can use the same approach to migrate to new infrastructure. Your customers will continue with business as usual.

Understand structure and behavior, past and present

Modern architectural trends like containers, orchestration, and microservices give you unprecedented expressiveness, but are complex to reason about and difficult to instrument; it’s hard to connect the dots from what you’ve built to what your customer sees. Houston bridges the gap, combining a customer-centric approach to monitoring and observation with insight into changes to your infrastructure.

Houston provides a concise, consistent set of metrics that let you understand your customers’ experience at any level of granularity, from the entire domain to a single endpoint. You can slice those same metrics by service or software version to understand how your changes affect that experience. Houston keeps a record of these changes, making it easy to correlate them with incidents, compare customer experience across multiple software versions, and measure the quality and pace of your software releases.

You can keep your existing infrastructure

Houston integrates easily with existing systems. Try it out in a container on a laptop, then deploy it through your normal process. Houston’s extensible service discovery agent works with AWS, Kubernetes, DC/OS, Consul, and others. Everything is managed from our hosted application, with a robust public API for scripting and integration with your existing management tools.




Use Cases

Incremental release

Blue/green deployment (also known as red/black or A/B deployment) is a technique described by Martin Fowler. Instead of upgrading software in place, you deploy new instances running the new software. Once they're running, you switch over customer traffic. If something goes wrong, you switch back to your still-running old service. Houston's built-in release workflow is similar, but lets you shift traffic incrementally to your new software. Instead of an atomic cutover, Houston lets you shift a small percentage of traffic to the new version, compare the behavior of the old and new systems, and proceed when you're confident the user experience won't be affected. Canceling the release is simple and fast: it just routes traffic away from the new version.
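
The incremental workflow described above can be sketched as a simple loop. This is a minimal illustration, not Houston's implementation: the function name, the step percentages, and the error-rate tolerance are all assumptions made for the example.

```python
from typing import Callable, Sequence


def run_incremental_release(
    steps: Sequence[int],
    new_error_rate: Callable[[int], float],
    old_error_rate: float,
    tolerance: float = 0.01,
) -> int:
    """Return the percentage of traffic left on the new version.

    steps: hypothetical traffic percentages to try, in order.
    new_error_rate: returns the error rate observed on the new version
        while it receives the given percentage of traffic.
    """
    current = 0
    for percent in steps:
        current = percent  # shift this share of traffic to the new version
        if new_error_rate(current) > old_error_rate + tolerance:
            # Cancel the release: route all traffic back to the old version.
            return 0
    return current


healthy = run_incremental_release(
    [5, 25, 50, 100], lambda pct: 0.002, old_error_rate=0.001
)
bad = run_incremental_release(
    [5, 25, 50, 100],
    lambda pct: 0.25 if pct >= 25 else 0.0,
    old_error_rate=0.001,
)
print(healthy, bad)
```

Here the healthy release proceeds all the way to 100, while the bad one is cancelled at the 25% step and ends at 0, with the old version still serving all traffic.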

[Figure: Blue/Green]

Testing in production

There are a lot of ways to test software. Unit tests, integration tests, staging environments, and manual test suites are all good at catching different classes of defects. But bugs slip through to production even with these methods. Houston's flexible approach to routing lets you set up routes based on headers or cookies to send traffic to new, non-public versions of software. Engineers can deploy and evaluate their code, on their schedule, without affecting customers. Defects in failed production releases can be safely root-caused: simply shift customer traffic to a known-good version, and allow engineers to inspect the bad version at their own pace.

[Figure: Test in Prod]

Monolith decomposition

Many applications begin life as a single, monolithic service. As both the application and the team grow, there is often a desire to split the monolith into smaller services. Houston's flexible routing lets you execute these splits with minimal client disruption. Split out the route you plan to migrate, without affecting production traffic. Then use the same tools and methods you use for blue/green deploy to safely and incrementally shift traffic for that route from the monolith to your new service.
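
To make the idea of splitting out a route concrete, here is a minimal sketch of path-based routing. It assumes longest-prefix matching on path segments, which may differ from how tbnproxy actually resolves routes; the /billing path and the backend names are hypothetical.

```python
from typing import Mapping


def match_route(path: str, routes: Mapping[str, str]) -> str:
    """Return the backend for the longest route prefix matching path."""
    candidates = [
        prefix
        for prefix in routes
        # Match whole path segments only, so /billing does not match /billingfoo.
        if path == prefix or path.startswith(prefix.rstrip("/") + "/")
    ]
    if not candidates:
        raise LookupError(f"no route for {path}")
    return routes[max(candidates, key=len)]


# Before the split, everything goes to the monolith.
routes_before = {"/": "monolith"}
# After the split, one route (hypothetical) is mapped to the new service.
routes_after = {"/": "monolith", "/billing": "billing-service"}

print(match_route("/billing/invoices", routes_before))
print(match_route("/billing/invoices", routes_after))
print(match_route("/home", routes_after))
```

Only requests under the split-out route change destination; every other path continues to reach the monolith.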

[Figure: Decomp]




Houston Architecture

Houston, by Turbine Labs, consists of several components. Our proxy is installed in your environment, along with a service discovery collector configured for your infrastructure. Our hosted API, web application, and analytics backend provide a control plane to observe and manage your application.

In your environment

tbncollect

tbncollect is an agent that scans your environment for running service clusters and instance labels. It has integrations with Kubernetes, DC/OS, Consul, ECS, and EC2, with more integrations on their way. It can also poll a YAML or JSON file, for static or custom integrations. Changes to your environment are mirrored to the Turbine Labs API.

tbnproxy

The NGINX-based proxy is responsible for receiving customer requests and dispatching them to appropriate service instances. An admin server runs alongside the proxy, and is responsible for managing its configuration and forwarding request/response metrics to the Turbine Labs API. When it detects changes in your environment, it updates and reloads the NGINX configuration, with no downtime.

Hosted by Turbine Labs

API

Our public API provides a central, hosted management service for environment configuration and metrics. It maintains a catalog of the zones, domains, routes, service clusters and instances, and proxies in your environment. It also provides a detailed log of changes to these objects, along with a query interface for slicing request/response metrics by dimension.

Management web app

The UI, built on top of our public API, provides a simple, intuitive interface for managing and observing the state of your environment. You can release new software, migrate to new architectures, and triage incidents, all in a single, consistent interface. You can understand the current behavior of your site, and know what has changed, at any level of granularity.

Supported deployment platforms

While the Turbine Labs software will run on a wide variety of architectures, we've built specific integrations with Kubernetes, DC/OS, Consul, EC2, and ECS. We plan to add more integrations in the future, and our YAML/JSON file polling mechanism provides extensibility if you wish to create your own.




Quickstart

Time to complete: 10 minutes

This guide walks you through setting up and using an all-in-one example app, along with a few exercises that illustrate what Houston and the Turbine Labs API can accomplish.

Signing up for an account

To get started with Houston, you'll need a Turbine Labs account. Sign up for one before continuing.

Install the tbnctl command line interface (CLI)

tbnctl is a CLI for interacting with the Turbine Labs public API, and is used throughout this guide to set up tbnproxy. Install tbnctl with these commands (this requires that Go is installed and that $GOPATH/bin is in your $PATH):

$ go get -u github.com/turbinelabs/tbnctl
$ go install github.com/turbinelabs/tbnctl
$ tbnctl login

Use your Houston username and password to log in.

Username [somebody@example.com]:
Password:

See the tbnctl Guide for more information.

Get an API Access Token

Create an API Access Token using tbnctl:

$ tbnctl access-tokens add "demo key"
{
  "access_token_key": "<redacted>",
  "description": "demo key",
  "signed_token": "<redacted>",
  "user_key": "<redacted>",
  "created_at": "2017-08-25T22:11:30.907200482Z",
  "checksum": "d60ed8a6-1a40-49a5-5bb1-5bad322d9723"
}

You'll need the value of signed_token later on, so keep it somewhere secure.
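
If you capture the command's output, you can pull out signed_token programmatically. The following is a minimal sketch, assuming the output is exactly the JSON document shown above; the token values in the sample are placeholders, not real credentials.

```python
import json


def extract_signed_token(tbnctl_output: str) -> str:
    """Return the signed_token field from the JSON printed by tbnctl."""
    token = json.loads(tbnctl_output)
    return token["signed_token"]


# Sample output with placeholder values in place of the redacted fields.
sample_output = """
{
  "access_token_key": "example-access-token-key",
  "description": "demo key",
  "signed_token": "example-signed-token",
  "user_key": "example-user-key",
  "created_at": "2017-08-25T22:11:30.907200482Z",
  "checksum": "d60ed8a6-1a40-49a5-5bb1-5bad322d9723"
}
"""

signed_token = extract_signed_token(sample_output)
print(signed_token)
```

Treat the extracted value like a password: avoid printing it in shared logs or committing it to source control.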

What's in the All-In-One image?

tbnproxy and tbncollect

These two applications will run in a real-world deployment, connected to Turbine Labs' API.

All-in-one server

A simple HTTP server application that returns hex color value strings. There are three "versions" of the server, each returning a different color value: blue, green, and yellow.

All-in-one client

This app demonstrates Houston through a simple visualization of routing and responses. It's only needed for this demo, and can be discarded once you're done experimenting.

Starting the all-in-one example

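The three environment variables you'll need to set in order to run the demo are TBNPROXY_API_KEY (the signed_token value from your API access token), TBNPROXY_API_ZONE_NAME, and TBNPROXY_PROXY_NAME.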

To run the Docker container with tbnproxy, tbncollect, and the all-in-one server and client, use the following command:

$ docker run -p 80:80 \
  -e "TBNPROXY_API_KEY=<signed_token>" \
  -e "TBNPROXY_API_ZONE_NAME=all-in-one-demo" \
  -e "TBNPROXY_PROXY_NAME=all-in-one-demo-proxy" \
  turbinelabs/all-in-one:0.13.0

This command starts the container and publishes port 80 of the container on port 80 of your host.

Note: In some cases the local Docker time may have drifted significantly from your host's time. If this is the case, you'll see the following message in the docker run output:

FATAL: your docker system clock differs from actual (google) time by more
than a minute. This will cause stats and charts to behave strangely.

If you see this error, restart Docker and re-run the all-in-one container.
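
The check behind this message amounts to comparing two timestamps against a one-minute limit. The sketch below illustrates that comparison only; how the container obtains its reference time is not covered here, so the reference value is passed in as a parameter, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# The error message above mentions a limit of one minute.
MAX_DRIFT = timedelta(minutes=1)


def clock_drift_too_large(
    local: datetime, reference: datetime, limit: timedelta = MAX_DRIFT
) -> bool:
    """Return True if the two clocks differ by more than the limit."""
    return abs(local - reference) > limit


reference = datetime(2017, 8, 25, 22, 11, 30, tzinfo=timezone.utc)
print(clock_drift_too_large(reference + timedelta(seconds=10), reference))
print(clock_drift_too_large(reference - timedelta(seconds=90), reference))
```

A clock that is ten seconds off passes the check, while one that is ninety seconds behind fails it.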

Demo exercises

What's going on here?

With the all-in-one container running, you should be able to navigate to localhost to view the all-in-one client. (On older versions of Docker for Mac, and on Windows versions before 10, use the address returned by docker-machine ip, typically 192.168.99.100, instead of localhost.)

The all-in-one client/server provide a UI and a set of services that help visualize changes in the mapping of user requests to backend services. This lets you visualize the impact of Houston on a real deployment without having to involve real customer traffic or load generators.

The application is composed of three sets of blocks, each simulating a user making a request. These are simple users, and they all repeat the same request forever. The services they call return a color. When a user receives a response, it paints the box that color, then waits a random amount of time before making another request. While it’s waiting, the colors in the box fade. Users are organized into rows based on URL.

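The colors indicate which version of the server answered each request: blue is the current production version, green is the new version you'll release in the exercises below, and yellow is a dev version.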

You should see pulsating blue boxes for each service, to indicate the initial state of your production services.

Deployed state

Let’s dig deeper into how tbnproxy routes traffic. Traffic is received by a proxy that handles traffic for a given domain. The proxy maps requests to service instances via routes and rules. Routes let you split your domain into manageable segments, for example /bar and /baz. Rules let you map requests to a constrained set of service instances in clusters, for example “by default, send traffic to servers labeled stage=prod”. Clusters contain sets of service instances, each of which can be labeled with key/value pairs to provide more information to the routing engine.

Your environment should look like the following:

There is a single domain, all-in-one-demo:80, that contains two routes: /api handles requests to our demo service instances, and / handles requests for everything else (in this case the demo app). There are two clusters:

The rules currently map traffic to instances labeled with stage=prod,version=blue, which is why only blue is showing. If we were to map instead to stage=prod without the version label constraint, both blue and green instances would match, and tbnproxy would load balance across them. In this case you'd see an even split of blue and green.
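
The effect of label constraints can be sketched in a few lines. This is an illustration of the matching logic only, not tbnproxy's code; the stage=dev label on the yellow instance is an assumption based on its description as a dev version.

```python
from typing import Dict, List

Labels = Dict[str, str]


def matching_instances(instances: List[Labels], constraint: Labels) -> List[Labels]:
    """Return the instances whose labels satisfy every key/value in constraint."""
    return [
        labels
        for labels in instances
        if all(labels.get(key) == value for key, value in constraint.items())
    ]


instances = [
    {"stage": "prod", "version": "blue"},
    {"stage": "prod", "version": "green"},
    {"stage": "dev", "version": "yellow"},  # assumed label for the dev version
]

only_blue = matching_instances(instances, {"stage": "prod", "version": "blue"})
all_prod = matching_instances(instances, {"stage": "prod"})
print([i["version"] for i in only_blue])
print([i["version"] for i in all_prod])
```

With both constraints, only the blue instance matches; dropping the version constraint matches blue and green, which is when traffic would be balanced across the two.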

Incremental release

Now we're ready to do an incremental release from blue to green. Right now the default rules for /api send all traffic to blue. Let’s introduce a small percentage of green traffic to customers.

Navigate to app.turbinelabs.io, then click "Release Groups" below the top-line charts. The row "server" should be marked "RELEASE READY". Click anywhere in the row to expand it, then click "Start Release".

Let's send 25% of traffic to our new green version by moving the slider and clicking "Start Release". The release group should now be marked "RELEASING".

The all-in-one client should now show a mix of blue and green. You can increment the green percentage as you like. When you get to 100%, the release is complete.
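
To see why a 25% setting produces a mix rather than a clean switch, consider this small simulation of a weighted split. It is a sketch under the assumption that each request is assigned independently at random in proportion to the weights; tbnproxy's actual balancing strategy may differ.

```python
import random
from collections import Counter
from typing import Mapping


def pick_version(weights: Mapping[str, int], rng: random.Random) -> str:
    """Choose a version at random, in proportion to its weight."""
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]


rng = random.Random(42)  # fixed seed so the simulation is repeatable
weights = {"blue": 75, "green": 25}
total = 10_000
counts = Counter(pick_version(weights, rng) for _ in range(total))
green_share = counts["green"] / total
print(f"green share: {green_share:.1%}")
```

Over many requests the green share settles close to 25%, while any individual user may see either color on a given request.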

Congratulations! You've safely and incrementally released a new version of your production software. Both blue and green versions are still running; if a problem were found with green, a rollback to blue would be just as easy.

Browser overrides

Let’s test our yellow dev version before we release it to customers. tbnproxy allows you to route to service instances based on headers set in the request. Navigate to app.turbinelabs.io, log in, and select the zone you’re working with (all-in-one-demo by default). Click "Settings" -> "Edit Routes", and select all-in-one-demo:80/api from the top left dropdown. You should see the editing screen for that route.

Click “Add Rule” from the top right, and enter the following values:

IF Header: X-Tbn-Version & version Send 1 to all-in-one-server.

This tells the proxy to look for a header called X-Tbn-Version. If the proxy finds that header, it uses the value to find servers in the all-in-one-server cluster that have a matching version tag. For example, setting X-Tbn-Version: blue on a request would match blue production servers, and X-Tbn-Version: yellow would match yellow dev servers.
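
The behavior of this rule can be approximated as follows. It is a minimal sketch, not the proxy's implementation: the instance names are hypothetical, and the fallback to a default version stands in for whatever the route's other rules would select when the header is absent.

```python
from typing import Dict, List, Mapping

Instance = Dict[str, str]


def select_instances(
    headers: Mapping[str, str], instances: List[Instance], default_version: str
) -> List[Instance]:
    """Pick instances by the X-Tbn-Version header, else by the default version."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    version = normalized.get("x-tbn-version", default_version)
    return [inst for inst in instances if inst["version"] == version]


servers = [
    {"name": "server-blue", "version": "blue"},      # hypothetical names
    {"name": "server-green", "version": "green"},
    {"name": "server-yellow", "version": "yellow"},
]

with_header = select_instances({"X-Tbn-Version": "yellow"}, servers, "blue")
without_header = select_instances({}, servers, "blue")
print([s["name"] for s in with_header])
print([s["name"] for s in without_header])
```

A request carrying the header reaches the yellow instance, while a request without it is unaffected and continues to reach the default version.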

The all-in-one client converts an X-Tbn-Version query parameter into a header in calls to the backend; if you navigate to localhost?X-Tbn-Version=yellow you should see all yellow boxes. Meanwhile, going to localhost without that parameter still shows blue or green based on the release state of previous steps in this guide.

This technique is extremely powerful. You were able to test new software on the live site, in production, before releasing it to customers and without affecting them. In a real-world scenario, your testers can perform validation, you can load test, and you can demo to stakeholders without running through a complicated multi-environment scenario, even during another release.

Testing latency and error rates

In order to demo what errors and latency issues may look like in a production environment, we implemented a few parameters that can be set to illustrate these scenarios. By default, each of the demo servers returns a successful (status code 200) response with its color (as a hex string) as the response body.

URL parameters passed to the client web page can be used to control the mean latency and error rate of each of the different server colors.

An example

The following URL sets an error rate and a response delay for both the blue and green servers.

http://<your external IP>/?x-blue-delay=25&x-blue-error=.001&x-green-delay=10&x-green-error=.25

This simulates a bad green release, and the need to roll back to a known-good blue release.

Parameter effect

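These parameters can be modified in the above example as follows: for each server color, x-<color>-delay sets the mean latency and x-<color>-error sets the error rate.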

The latency and error rates are passed to the demo servers as HTTP headers with the same name and value as the URL parameters described. You can use these parameters to help you visualize the effects of a bad release, or an issue with the code in a new version of your application, either of which would be cause to step down the release and return traffic to a known-good version.
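
The mapping from URL parameters to headers can be sketched like this. It is an illustration of the idea rather than the client's actual code; localhost stands in for your external IP, and the filtering by name suffix is an assumption.

```python
from typing import Dict
from urllib.parse import parse_qsl, urlsplit


def fault_headers(url: str) -> Dict[str, str]:
    """Copy the delay and error parameters from a URL into a header dict."""
    query = urlsplit(url).query
    return {
        name: value
        for name, value in parse_qsl(query)
        # Keep only parameters shaped like x-<color>-delay or x-<color>-error.
        if name.startswith("x-") and name.endswith(("-delay", "-error"))
    }


url = (
    "http://localhost/"
    "?x-blue-delay=25&x-blue-error=.001&x-green-delay=10&x-green-error=.25"
)
headers = fault_headers(url)
for name, value in headers.items():
    print(f"{name}: {value}")
```

Each parameter becomes a header with the same name and value, so the example URL yields four headers, two for blue and two for green.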

Driving synthetic traffic

If you'd like to drive steady traffic to your all-in-one server without keeping a browser window open, you can add ALL_IN_ONE_DRIVER=1 to the environment variables in your docker run invocation. You can also add error rates and latencies for the various server colors using environment variables:

$ docker run -p 80:80 \
  -e "TBNPROXY_API_KEY=<signed_token>" \
  -e "TBNPROXY_API_ZONE_NAME=all-in-one-demo" \
  -e "TBNPROXY_PROXY_NAME=all-in-one-demo-proxy" \
  -e "ALL_IN_ONE_DRIVER=1" \
  -e "ALL_IN_ONE_DRIVER_LATENCIES=blue:50ms,green:20ms" \
  -e "ALL_IN_ONE_DRIVER_ERROR_RATES=blue:0.01,green:0.005" \
  turbinelabs/all-in-one:0.13.0
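
The latency and error-rate variables use a comma-separated list of color:value pairs. If you generate or validate these values in your own scripts, a parser along these lines may help; it is a sketch based only on the format shown above, not on the driver's actual parsing code.

```python
from typing import Dict


def parse_color_settings(raw: str) -> Dict[str, str]:
    """Split a string like "blue:50ms,green:20ms" into a color-to-value dict."""
    settings: Dict[str, str] = {}
    for pair in raw.split(","):
        if not pair.strip():
            continue  # tolerate empty input and trailing commas
        color, _, value = pair.partition(":")
        settings[color.strip()] = value.strip()
    return settings


latencies = parse_color_settings("blue:50ms,green:20ms")
error_rates = {
    color: float(value)
    for color, value in parse_color_settings("blue:0.01,green:0.005").items()
}
print(latencies)
print(error_rates)
```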

Next steps

Now that you've seen the all-in-one demo in action, you can move on to deploying Houston in your own environment. After reading the configuration guide below, proceed to the integration for your platform: Kubernetes, DC/OS, Consul, EC2, or ECS.




Application Usage Guide

This guide explains the features, knobs, and buttons for the Houston web app. With Houston, you can test and release new software incrementally to customers, compare the customer experience across software versions, and measure the quality and pace of iteration.

Dashboard

The dashboard shows a top-line chart of the currently selected object (initially the Zone), a list of changes to that object, and sparkline charts for other related objects.

Return to this page at any time by clicking the Turbine Labs logo on the top left of the screen. Clicking on the pencil next to a Release Group will take you to the Release Group Editor (see Editing Release Groups below).

View Layout

Each dashboard view includes a top-line set of charts showing the aggregate data from the currently selected Zone, Domain, Service, Release Group, or Route. Below, charts are displayed for relevant related objects. These charts all share a common x-axis. Each sparkline can be expanded to a larger inline chart view, or can be made the new top-line view.

The default view is of a Zone, from which you can see sparklines for the underlying Domains, Services, and Release Groups. From the Zone view, you can explore the routing and release objects recursively, or choose one from the dropdown to the right of your selected Zone. The table below summarizes the sparkline row types available for each top-line view:

View            Sparklines
Zone            Domains / Release Groups / Services
Domain          Routes / Release Groups / Services
Release Group   Routes / Services
Route           Services
Service         Instances / Release Groups / Routes

Charts

Latency

Displays the 50th and 99th percentile latencies, in milliseconds, of the currently selected Zone, Domain, Service, Route, or Release Group.
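
As a reminder of what these percentiles mean, here is a small worked example using the nearest-rank method. The method and the sample latencies are assumptions for illustration; the chart's exact aggregation is not specified here.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], p: float) -> float:
    """Return the p-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("percentile of an empty sample set is undefined")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least p% of samples at or below it.
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds, with one slow outlier.
latencies_ms = [12, 15, 11, 240, 14, 13, 16, 12, 18, 17]
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
print(p50, p99)
```

For this sample the 50th percentile is 14 ms and the 99th is 240 ms: the median describes the typical request, while the 99th percentile exposes the slow tail that an average would hide.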

Requests

Displays requests, successes, errors, and failures for the currently selected Zone, Domain, Service, Route, or Release Group.

Success Rate

Displays the percentage of requests that were successful for the currently selected Zone, Domain, Service, Route, or Release Group.
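
The calculation behind this chart is a simple ratio. The sketch below assumes the rate is successes divided by total requests, with no value reported when there were no requests; the counts used are hypothetical.

```python
from typing import Optional


def success_rate(requests: int, successes: int) -> Optional[float]:
    """Return the percentage of successful requests, or None without traffic."""
    if requests == 0:
        return None  # no traffic in the period, so no meaningful rate
    return 100.0 * successes / requests


rate = success_rate(requests=200, successes=199)
idle = success_rate(requests=0, successes=0)
print(rate, idle)
```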

Zones

The Zone showing in the top bar is the currently selected Zone. Clicking on it will show any other available Zones.

Time Filter

This filters the time period for the charts. Choose from the past hour, the past day, the past week, or a custom time period.

Changelog

All recent changes within the current view appear here. For example, in a Zone view, all changes to Routes, Release Groups, and Services would be present.

More Menu

Add Route

This option displays a screen allowing you to choose the Domain, path, and Release Group for your new Route. Once the new Route is created, you can add additional rules to it.

Delete Route

Remove an existing Route and its rules. A dialog will appear, showing which Services are currently being routed to. Caution: this is irreversible once you confirm by clicking Delete.

Add Domain

Add a new Domain, comprising a hostname and port. You can also map this Domain to one or more Proxies.

Add Proxy

Add a new Proxy, which represents the configuration of one or more tbnproxy instances. You can choose which Domains to make the Proxy available to from a list of Domains on your Zone.

User Menu

Log out

Click to log out of the app and return to the login screen.

Editing Release Groups

With your Release Group selected, click the pencil to invoke edit mode. You will see the following:

Undo

Click to undo any unsaved changes to your Release Group.

More

This allows you to do the following:

Save changes to <Release Group name>

This button saves your current edits and changes. It will be greyed out if no changes have been made since the last save.




Similar Systems

Houston is an application routing and release system. It combines capabilities usually found in web proxies, logging solutions, and monitoring tools. What sets it apart is the synthesis of these capabilities into a product squarely focused on providing a flexible yet stable experience to customers.

Traffic Control

Houston can replace your existing web proxy, or work in conjunction with it. Compared to popular proxies (NGINX, HAProxy, ELB, ALB), Houston supports finer-grained request routing, and has a more detailed view of your deployed services. Houston's proxy is dynamically configured directly from the Turbine Labs API, so changes to configuration are simple and responsive.

Logging

Houston collects, stores, and visualizes all configuration and state changes related to your software release. In most release systems the configuration and management of logging collectors is tedious, error prone, and low fidelity. Houston provides you a great solution out of the box, and can also integrate with your existing logging systems, enriching the data you already collect and store.

Monitoring

Houston collects, stores, and visualizes a concise yet comprehensive set of metrics that give you an easily digestible view of your site health. Most monitoring systems require invasive instrumentation and provide an overwhelming flood of metrics. Houston collects a focused set of metrics that let you understand what your customers are seeing, without the code surgery. Houston provides you a great high level picture of service health, and can forward metrics to your monitoring system for a more detailed analysis of incidents that arise.

CI/CD

Houston understands that deploying code to hardware shouldn't be the same step as releasing it to customers. Existing CI/CD tools are great for packaging and deploying software, but don't integrate the logging, monitoring and traffic control you need for a robust release to your customers. Houston integrates the information you need to feel confident in your software release and increase your pace of iteration.