Docker Compose: Creating a Dev Environment like Production

Integration Testing

Integration testing is an important part of any production system. Whether it is automated, performed by a QA team, or done by the developer him- or herself, it is essential that all the product bits are verified before revealing the result to the customer. Generally speaking, integration testing is simply running a set of tests against a production-like environment. That means you remove any testing mocks you may have in place and observe the actual interactions between the various services which make up your product.

Development Issues with Complex Systems

The larger your systems scale, the more difficult it is to test them all on the same box. This simple fact comes from increasing system complexity; as your user-base grows, you utilize more resources, and your software architecture begins to span multiple machines. For instance, if you're building a lot of microservices, then each of these needs to be stood up and configured properly to work on your local box. Now each developer needs to duplicate this work for his or her own local setup.

Eventually, maintaining your local "full-integration" development environment becomes unwieldy and possibly even more difficult to manage than production (due to dependency issues, etc.). What this ultimately means is that developers abandon this environment entirely. They write their code and unit tests and then just ship it off to the build pipeline. After waiting a period of time, their code should show up in a "pre"-production environment for a full-scale integration test. Only at this point do they find out whether there are any glaring bugs or other oversights in their code.

The problem here is that this process is inefficient. Not only is it slow, but when you share an environment for testing, multiple changes are commonly deployed at once. As a result, it can become unclear which change is causing issues. Now let me make myself clear: you should have an environment that integrates all the latest changes before they go to production. This environment will protect you from a plethora of additional production issues (e.g. logically incompatible changesets). However, it is not the best way for developers to test their code. Developers should have tested their code thoroughly before pushing it to the environment that is going to certify it and send it to production.

Docker Compose

Well, this sure seems like a predicament. I just mentioned that for large services it often becomes incredibly difficult to maintain a local version of your complete system. While this is true, we can significantly reduce the burden with Docker Compose. Docker Compose is a tool for defining and running multi-container Docker applications. In short, you can define an arbitrary system composed of many containers in a single file. This tool provides a solid foundation for reproducing a small-scale version of production that can be run entirely locally.

Using Docker Compose should be trivial if you're already deploying your services using Docker containers. If not, you should first create Docker images for all of your services. While this is labor intensive, you and your team members can reuse these images in the future.

Our Example

Now that we understand the problem and our tools for solving it, we will work through an example. Below is a diagram describing our scenario.

Network model. WordPress application server connects to Internet and intranet while MySQL DB only connects to intranet.

In summary, we have a basic WordPress setup with a few minor tweaks. Rather than hosting both MySQL and WordPress on the same box, we have separated the concerns. Our WordPress application server is accessible on the open internet and our internal network. Our MySQL server, on the other hand, lives on a separate box that is only accessible on our intranet, which prevents external requests from reaching the database directly. This example illustrates how one may naturally expand their services. Similarly, you could generalize this concept to arbitrarily complex networks.

Assuming this is the network we want to model with Docker Compose, let's take a look at the configuration file below.

Without delving too deeply into the configuration format, the most notable information in this configuration file is the virtual networking. We have two different networks, external-net and backend, which correspond to Internet and intranet in our diagram, respectively. These networks provide the separation of concerns we designed above. However, more important than the implementation details is the concept this represents. Namely, we can specify the images, settings, and networking configuration for our Docker containers and reuse this file everywhere. Once this file has been written, it can be shared with the entire team, making local integration testing accessible again. With a little maintenance as new services are added, this file can become a more faithful representation of your production environment for developers.
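
The two-network layout described above can be sketched as follows. This is only a minimal illustration, not the configuration used in production: the image tags, the published port, and all credentials are placeholder assumptions, and only the network names external-net and backend come from the discussion above. Compose files are normally written in YAML; since JSON is a subset of YAML, the sketch builds the same structure as a Python dictionary and serializes it to JSON so the networking rules can be checked in code.

```python
import json

# Hypothetical sketch of the WordPress/MySQL setup; values are placeholders.
compose = {
    "services": {
        "wordpress": {
            "image": "wordpress:latest",       # assumption: official image
            "ports": ["8080:80"],              # assumption: published on localhost:8080
            "environment": {
                "WORDPRESS_DB_HOST": "db:3306",
                "WORDPRESS_DB_USER": "wordpress",
                "WORDPRESS_DB_PASSWORD": "example",
                "WORDPRESS_DB_NAME": "wordpress",
            },
            "depends_on": ["db"],
            # The application server sits on both networks.
            "networks": ["external-net", "backend"],
        },
        "db": {
            "image": "mysql:8.0",              # assumption: official image
            "environment": {
                "MYSQL_ROOT_PASSWORD": "example",
                "MYSQL_DATABASE": "wordpress",
                "MYSQL_USER": "wordpress",
                "MYSQL_PASSWORD": "example",
            },
            # The database is attached to the intranet only.
            "networks": ["backend"],
        },
    },
    "networks": {
        "external-net": {},
        # "internal" asks Compose for a network without external connectivity.
        "backend": {"internal": True},
    },
}


def services_on(network):
    """Return the sorted names of services attached to the given network."""
    return sorted(
        name
        for name, service in compose["services"].items()
        if network in service["networks"]
    )


compose_json = json.dumps(compose, indent=2)

if __name__ == "__main__":
    print(compose_json)
```

Saved as, say, docker-compose.json, the output should be usable with `docker compose -f docker-compose.json up`, though most teams would keep the equivalent YAML file in version control instead.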

Conclusion

We have briefly discussed a major impediment to local integration testing today: the growing complexity of our products with microservice architectures. However, since this architecture has many benefits, we need to revisit the way we enable developers to perform more comprehensive testing before pushing their changes into the production build pipeline. We have demonstrated a simple use case of Docker Compose for this task. By creating a single, shareable representation of the production setup, we can keep developers moving forward and reduce the overall number of bugs merged into mainline code.

The Build Pipeline: From Unit Testing to Production

The Build Pipeline

The build pipeline describes the process by which new code makes its way out to a production environment. One may even consider a developer building code on his or her local machine and manually deploying it to a server a primitive build pipeline. While this approach may work well for small or non-critical operations, it is insufficient for most professional work. Whether you're working in the hottest new startup or for a larger company, defining an effective build pipeline and streamlining your deployment process is of utmost importance. While this article will omit implementation details, I will explain each step thoroughly: why it's important and how it improves the lives of developers and the overall stability of products.

Continuous Integration, Continuous Delivery

Before I delve deeper into build pipelines, I want to briefly familiarize the audience with continuous integration, continuous delivery (CICD). This concept has been around for several years, but I have heard grumbles about it from colleagues. In summary, the idea is that every commit to the mainline (usually the master branch in git) is built and continuously tested (continuous integration) and, when all of those tests pass, the code is then deployed immediately to production (continuous delivery; fully automatic deployment to production is often called continuous deployment).

Many people claim that such a system sounds good in theory, but always fails in practice. Well, I happen to have it on good authority (i.e. personal experience) that this sentiment is categorically false. Yahoo/Flurry/Oath have been using CICD for some time now and the method works very well. In fact, it saves a lot of headaches and avoids many mistakes and potential outages that occur with manual deploys or even gated deploys (the distinction between the two is a discussion for another time, however).

While I am a proponent of CICD and will center our build pipeline discussion around this idea, I must admit that it does front-load a lot of the work. That is to say, CICD requires a larger upfront investment cost than traditional means of operations and code deployment. While the infrastructure can theoretically be built over a period of time, it is best to have all of the infrastructure in place before releasing your product.

In this way, you will be able to allocate sufficient resources to building a robust system. If the product is released before the CICD infrastructure has been properly laid out, it's very easy to get sidetracked into focusing only on improving the product rather than the process of releasing changes. This ultimately ends up wasting a significant amount of developer resources. Please note, when I say infrastructure I really mean your deploy scripts or something similar. I expect most companies will not be rolling their own CICD solution and will instead use something like Jenkins or Screwdriver.

tl;dr. CICD is great but you need to give it the upfront investment it deserves when you're building a new system. Ensure that the infrastructure is in place (even if not all the testing is finished depending on how fast and loose you're playing) before officially launching your product. See cert

Philosophy of the Build Pipeline

Let's move on and discuss the ideas behind our build pipeline in more depth. In summary, an effective build pipeline should have at least 3 phases:

  1. Unit testing phase. Oftentimes this is the first step in your build pipeline. Unit testing runs before you've packaged your code for shipping. In the unit testing phase, all unit tests should be run for the codebase that is actively being built. Similarly, if you have "local" integration-style tests (with mocks and so forth), you can run them in this phase.
  2. Smoke testing phase. If you have the resources, you should have a non-production environment which looks nearly identical to your production environment (though probably at much smaller scale). It's even possible to run this environment on a single box if the services won't conflict with each other. Similarly, you would not necessarily use production data in this environment. Most importantly, this environment runs real services. At this point you should run a set of smoke tests which will effectively test basic integration of your services.
  3. Integration testing phase. The final essential component of a build pipeline is the integration testing phase. This phase should deploy your services to a production or production-like environment and verify a full suite of integrations on your production system. With a proper test suite, performing this step enables the developers to find the vast majority of issues before they become customer-facing.

While we have discussed 3 primary components of a build pipeline, this often represents the bare minimum. Build pipelines can be arbitrarily complex and can even include triggering up- or downstream dependencies. No matter how complex your build pipeline dependency graph becomes, these 3 phases should be present in some capacity.
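
To make the ordering concrete, here is a minimal sketch of a pipeline runner that executes the phases in sequence and stops at the first failure. The phase functions are hypothetical stand-ins that simply return True; in a real setup each would invoke your test runner or deployment scripts, and a tool like Jenkins or Screwdriver would handle the orchestration.

```python
from typing import Callable, List, Tuple

# A phase is a name plus a callable that returns True when the phase passes.
Phase = Tuple[str, Callable[[], bool]]


def run_pipeline(phases: List[Phase]) -> List[str]:
    """Run phases in order and stop at the first failure.

    Returns the names of the phases that passed.
    """
    passed = []
    for name, check in phases:
        if not check():
            break  # later phases never run once one fails
        passed.append(name)
    return passed


# Hypothetical stand-ins for the three phases described above.
def unit_tests() -> bool:
    return True


def smoke_tests() -> bool:
    return True


def integration_tests() -> bool:
    return True


PHASES: List[Phase] = [
    ("unit", unit_tests),
    ("smoke", smoke_tests),
    ("integration", integration_tests),
]

if __name__ == "__main__":
    print(run_pipeline(PHASES))
```

The important property is the early exit: code that fails its unit tests never reaches an environment running real services.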

A More Sophisticated Build Pipeline

With the 3 components listed above, we will now go through an example of a more sophisticated build pipeline. While not overly complex, this is a realistic pipeline that one could use to deploy their own code. Again, implementation details are omitted, but the core concepts remain.

Example of build pipeline

A brief explanation of the diagram above follows:

  • Code repository. This is where your raw source code lives. It is likely a version control system (VCS) such as git, svn, or otherwise.
  • Artifact repository. The artifact repository is where your compiled code packages live. For instance, this could be a local Artifactory or npm repository.
  • Unit testing. The unit testing phase is described above. It first pulls in code from your repository, then runs the unit tests and verifies that they pass. Upon successful completion, it will upload a compiled artifact to the artifact repository and trigger the smoke testing job.
  • Smoke testing. Smoke testing is also described as above. It should deploy the latest artifact from the artifact repository and run a series of smoke tests. Upon successful completion, it can optionally tag an artifact as the last smoke verified artifact (to better ensure you never accidentally deploy untested code) and then trigger the pre-prod environment.
  • Pre-Prod Testing. The pre-production environment is an "extra" production box. Namely, one that is either taken out of rotation or a dedicated host (or set of hosts) that are connected to production services but are never actually visible to the outside world. This environment tests your current production setup against the code you wish to deploy (but before you actually deploy it). It should pull the latest available service artifact (unless you tagged an artifact as latest smoke verified) and run a series of typical production-style integration tests. Upon successful completion, it should tag its artifact as the latest verified artifact and trigger the int testing job.
  • Int Testing. Finally, integration testing is the last step in this build pipeline. Assuming you have a cluster of hosts running your services (read: this is good practice for redundancy), it will take a subset of those hosts out of rotation (OOR); this ensures that the service stays fully available to customers while the deployment is ongoing. For the OOR hosts, it will deploy the latest verified service artifact and wait for the service to come up. When the service is ready, it runs the set of integration tests on those boxes. After those boxes have been verified successfully, it will return the OOR hosts to the production rotation and then take out a different subset. This process repeats for however many distinct subsets exist. That is, if you deploy to a single box at a time and have a 5 box cluster, then this step will repeat 5 times, or once per box.
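
The artifact tagging mentioned in the smoke testing and pre-prod steps can be sketched as a tiny in-memory repository. Everything here is hypothetical (the class, the tag names "smoke-verified" and "verified", and the version strings); a real artifact repository would expose its own API for tags or properties.

```python
from typing import Dict, List, Optional


class ArtifactRepo:
    """Toy artifact repository that tracks uploads and movable tags."""

    def __init__(self) -> None:
        self.versions: List[str] = []   # upload order, oldest first
        self.tags: Dict[str, str] = {}  # tag name -> version

    def upload(self, version: str) -> None:
        self.versions.append(version)

    def tag(self, name: str, version: str) -> None:
        if version not in self.versions:
            raise ValueError(f"unknown artifact: {version}")
        self.tags[name] = version  # moving a tag overwrites the old target

    def resolve(self, preferred_tag: str) -> Optional[str]:
        """Return the tagged version, else fall back to the latest upload."""
        if preferred_tag in self.tags:
            return self.tags[preferred_tag]
        return self.versions[-1] if self.versions else None


repo = ArtifactRepo()
repo.upload("1.0.0")
repo.tag("smoke-verified", "1.0.0")  # smoke tests passed for 1.0.0
repo.upload("1.0.1")                 # newer build, not smoke tested yet

# Pre-prod picks the smoke-verified artifact rather than the newest upload.
preprod_candidate = repo.resolve("smoke-verified")
```

Resolving by tag rather than by "newest" is what keeps an untested upload from slipping into a later stage.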

By the end of this build pipeline, your newly built and tested code is fully deployed to production if all steps pass. If at any point the tests fail, the build stops at that point and does not proceed further in the build pipeline. It is important to recognize that a failure during the final integration testing phase could, in fact, leave the current subset of boxes OOR. As a result, the number of boxes deployed at once should be an acceptable number of failed hosts for your application.
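
The rolling deployment and its failure behavior can be sketched as below. This is a simplified model under stated assumptions: the host names are made up, and `deploy_and_test` is a hypothetical callback standing in for "deploy the verified artifact, wait for the service, run the integration tests".

```python
from typing import Callable, List, Tuple


def batches(hosts: List[str], batch_size: int) -> List[List[str]]:
    """Split hosts into consecutive subsets of at most batch_size."""
    return [hosts[i:i + batch_size] for i in range(0, len(hosts), batch_size)]


def rolling_deploy(
    hosts: List[str],
    batch_size: int,
    deploy_and_test: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Deploy one subset at a time.

    Returns (hosts in rotation, hosts left out of rotation).
    """
    in_rotation = list(hosts)
    for batch in batches(hosts, batch_size):
        for host in batch:
            in_rotation.remove(host)  # take the subset out of rotation
        results = [deploy_and_test(host) for host in batch]
        if not all(results):
            # Stop here: the failed subset stays OOR and later subsets
            # keep serving the previous version.
            return in_rotation, batch
        in_rotation.extend(batch)     # verified, back into rotation
    return in_rotation, []


HOSTS = ["web1", "web2", "web3", "web4", "web5"]  # hypothetical 5 box cluster
```

With one box per subset, `batches(HOSTS, 1)` yields 5 subsets, matching the 5 repetitions described above. A failure never takes more than `batch_size` hosts out of service, which is why that number should be chosen as a tolerable loss of capacity.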

Conclusion

While more complicated build pipelines than we've discussed exist, build pipelines do not need to be complex to be useful. However, there is a minimum set of functionality they must test to be effective. Even the simplest of build pipelines can improve developer productivity and reduce operations mistakes. By simply automating the testing process alone, we've avoided human error (e.g. forgetting to run a test, skipping a test intentionally, or not following the deployment steps properly for a service) and ensured that our tests are always properly run. Not only does this avoid error, but it also frees up engineering resources to perform other useful work.

Above all, if you do not currently have a build pipeline, you should consider designing one and implementing it. Not only will it improve the lives of your engineers, it will provide confidence to all of your business units. A proper build pipeline allows everyone in your organization to feel confident about code quality for user-facing products.