Author
Gustavo Armagno
This post is intended for automation lovers who want to automate the setup of the development environment in microservice-based projects. One option is to use Vagrant.
If you are setting up a project based on microservices (or any project organized into a few distributed, self-contained components) you will probably end up hosting them in separate Git repositories. This is to allow for division of work, keep an organized workflow, prevent bottlenecks, etc.
Some of these components may share the same development environment configuration and may even be designed to run on the same machine.
A quick digression
We have different approaches to automating the development environment configuration for our dev team.
One option is to use a custom bootstrap script to set up our machine. This option has some drawbacks: it constrains the variety and types of workstations our developers can use, and it makes it difficult to select and install the specific versions of the dependencies a project requires (e.g. when the project requires PostgreSQL 9.2, but we have 9.4 installed on our machine).
A second option is to use Vagrant. Vagrant allows us to create and configure guest virtual machines (VMs) easily on our developer workstations, using a single description file. This file is called Vagrantfile and is normally located in the project’s base directory. Using this file, we can tell Vagrant to choose among a variety of VM providers (e.g. VirtualBox or AWS), to define a list of provisioning tasks to be run once the VM has been created, to configure the network, and so on. Then, Vagrant will quickly spin up a VM containing our development environment and keep it in sync with our source code in the host OS. Changes to the source code in the host OS are automatically reflected in the VM, and vice versa. One of the benefits of this mechanism is that we can edit the source code with our preferred IDE installed in the host OS (e.g. OS X) and run/debug/test the application in the VM (e.g. an Ubuntu Server 12.04, including a particular set of dependencies). Team members using Vagrant can reproduce exactly the same configuration on their machines.
A third option is Docker. Docker is a more efficient alternative to using VMs. In a nutshell, Docker’s service layer runs over the host OS and creates isolated containers that include the app under development and all of its dependencies. VMs, on the other hand, don’t share the same OS instance. For example, two apps placed in two separate Docker containers will run on the same host OS. Conversely, two apps placed in two separate VMs will run on two different OS instances — and will therefore consume more resources.
Problem
To define the scope of the problem without losing generality, our goal is to automate the development environment’s installation and configuration for a project that has the following constraints:
- It is organized in many Git repositories.
- Each repository contains the source code of an individual component (e.g. a service, an app, the project’s core).
- Components run independently from each other.
- Components can communicate with each other (e.g. through RESTful endpoints).
- Components (may) share the same development environment.
- Components (may) run on the same (virtual) machine.
- The environment setup should be easily portable.
- The solution should cover at least Linux and Mac workstations.
- The solution shouldn’t constrain any development, communication or management tool that developers are used to.
Solution
Our solution involves using Vagrant and Git subtrees to reference and check out the external components from a single repo.
We are not running Docker natively on the host machine this time because we want our solution to be as generic as possible: Docker would narrow our solution down to Linux machines only, and we want to be nice to OS X devs. If we still want to use Docker, Vagrant comes with a Docker provisioner out of the box that fits in with our solution (see Vagrant’s Docker provisioner).
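As a rough sketch of what that could look like, the Vagrantfile fragment below uses Vagrant's built-in Docker provisioner to install Docker in the guest VM and start one container. The image tag postgres:9.2 and the container name db are assumptions chosen for illustration; they are not part of the project described in this post.

```ruby
# Sketch only: the image tag and container name are illustrative assumptions.
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"

  # The "docker" provisioner installs Docker inside the guest VM if needed.
  config.vm.provision "docker" do |d|
    d.pull_images "postgres:9.2"   # fetch the image into the guest
    d.run "db",                    # container name (hypothetical)
      image: "postgres:9.2",
      args: "-p 5432:5432"         # extra arguments passed to `docker run`
  end
end
```

With this arrangement Docker runs inside the Linux guest, so the host can stay on OS X or any other system Vagrant supports.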
Let’s call this single repo dev-env. dev-env will contain the Vagrantfile with all the development environment configuration, a Readme file, including any manual setup required after provisioning the VM, and the subtrees.
Git subtrees are great because they allow us to work on our external repositories right from our super project (dev-env, in our example) with relative ease. You only have to learn a new merge strategy (i.e. subtree) and be a little disciplined when pulling and pushing.
To simplify the explanation, let’s suppose we have three different repos, service-auth, service-notifier and core, each one containing the components we want to configure.
As a first step, we will create the repo and initialize it, using Git:
$ mkdir dev-env
$ cd dev-env
$ git init
Then, we are going to add a Vagrantfile and a Readme file to dev-env’s base directory. (The Readme is not strictly required, but it is recommended that we devote some time to writing one.)
Assuming that our components will listen on different TCP ports, our Vagrantfile would end up looking like this:
VAGRANTFILE_API_VERSION = "2"

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  config.vm.box = "ubuntu/trusty64"

  # Forward each component's TCP port from the guest to the host
  config.vm.network :forwarded_port, host: 8080, guest: 8080 # core
  config.vm.network :forwarded_port, host: 8081, guest: 8081 # service-auth
  config.vm.network :forwarded_port, host: 8181, guest: 8181 # service-notifier
end
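Since a Vagrantfile is plain Ruby, the port list can also be kept in one place and looped over, and a shell provisioner can install the dependencies the components share. The following is a minimal sketch of that idea, not the configuration we actually use: the COMPONENT_PORTS constant and the packages installed are assumptions made for illustration.

```ruby
# Sketch only: COMPONENT_PORTS and the package list are hypothetical.
COMPONENT_PORTS = {
  "core"             => 8080,
  "service-auth"     => 8081,
  "service-notifier" => 8181
}

Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"

  # One forwarded port per component, same number on host and guest.
  COMPONENT_PORTS.each do |name, port|
    config.vm.network :forwarded_port, host: port, guest: port, id: name
  end

  # Runs as root inside the guest on the first `vagrant up`
  # (and again whenever `vagrant provision` is invoked).
  config.vm.provision "shell", inline: <<-SHELL
    apt-get update
    apt-get install -y git postgresql
  SHELL
end
```

Adding a component then means adding one entry to the hash rather than another hand-written network line.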
At this point, we are ready to set up the subtrees.
The first step is to add one Git remote per external component. This is a trick that will allow us to refer to the subtrees in a shorter form:
$ git remote add core git@github.com:moove-it/core.git
$ git remote add service-auth git@github.com:moove-it/service-auth.git
$ git remote add service-notifier git@github.com:moove-it/service-notifier.git
In the dev-env base directory, we will add the subtrees in their respective prefix folders, referring to the remotes we have just created:
$ git subtree add --prefix core core master --squash
$ git subtree add --prefix service-auth service-auth master --squash
$ git subtree add --prefix service-notifier service-notifier master --squash
(We should only have to do this once, and then again each time we want to include a new external component in our setup, which won’t happen very often. The general command to add a new subtree is git subtree add --prefix [prefix_folder] [remote_name] [branch] --squash.)
Note that we are using the --squash option. This is to avoid storing the entire history of the components’ repos in dev-env.
After adding the subtrees, our commit log will look something like this:
9298f0a Merge commit 'b376ce343d7ebfa47811e1fb98ced870ce346a4c' as 'service-notifier'
b376ce3 Squashed 'service-notifier/' content from commit 3d51460
a5a0407 Merge commit '56c06a59c1fc79d1a60a6712f467517c22fbdc05' as 'service-auth'
56c06a5 Squashed 'service-auth/' content from commit 1a7d7a3
1ce0b38 Merge commit '049d2492ae3fe15ec65d3663745e758638746414' as 'core'
049d249 Squashed 'core/' content from commit f51b5ba
Through the git subtree command, we can keep our individual components up to date and push changes for review.
We will then use git subtree pull to update the components. For example, to update core’s master branch:
$ git fetch core master
$ git subtree pull --prefix core core master --squash
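With several components, repeating that command by hand gets tedious. The small Ruby helper below is one possible way to run the same pull for every subtree; it is a sketch under the assumptions of this post (remote names equal to prefix folders, everything tracking master), and the names COMPONENTS, subtree_pull_command and update_all are hypothetical.

```ruby
# Hypothetical helper: names and defaults are assumptions for illustration.
COMPONENTS = %w[core service-auth service-notifier].freeze

# Build the argument list for one `git subtree pull` invocation.
# The remote name is assumed to match the prefix folder.
def subtree_pull_command(component, branch = "master")
  ["git", "subtree", "pull", "--prefix", component, component, branch, "--squash"]
end

# Pull every component in turn, stopping at the first failure.
# `runner` executes a command and returns true on success; it can be
# replaced with a stub to preview the commands without touching the repo.
def update_all(components = COMPONENTS, runner: ->(cmd) { system(*cmd) })
  components.each do |component|
    cmd = subtree_pull_command(component)
    puts "$ #{cmd.join(' ')}"
    abort("subtree pull failed for #{component}") unless runner.call(cmd)
  end
end
```

Calling update_all from the dev-env base directory runs the three pulls in sequence; passing a runner such as ->(cmd) { true } only prints the commands.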
If we are working on a feature branch, we will then use git subtree push to push our feature branch:
$ git subtree push --prefix=core core feature
Finally, we can push dev-env to our remote repo.
Discussion
One of the benefits of this solution is that the team maintaining the original component’s repo doesn’t necessarily have to be aware of the existence of dev-env and all the infrastructure we have created to automate the development environment setup.
Whether this solution (a common development environment for autonomous components living in separate repositories) makes sense depends on your organization’s policies and the project’s complexity.
Scenarios that may benefit from this approach include: projects that are difficult to set up, requiring specific dependency versions, overwriting certain libraries or copying certain files to specific folders; or, projects that require distributing the development environment on separate machines. Some developers may benefit from using the same Vagrant setup across a lot of projects (for example, WordPress sites), having a single custom base box, so they can update them from a single place. You will be able to tell if this solution suits your needs.
This blog post follows a discussion I started on Stack Overflow. There, you will find alternative solutions that may be a better fit for your project.
Conclusion
Vagrant is an excellent solution for automating your project’s development environment setup. When the project’s architecture is organized in several components, living in separate repos, and you need to run them in the same guest machine, while still keeping the ability to edit/debug/test/interact with them from your host machine, you will want to avoid adding your Vagrant setup to each project repo.
Instead, you need to find a way to end up with a single Vagrant folder pointing to the sub-projects’ folders. In addition, you will want a solution that keeps the folders in the guest machine synchronized with the ones in the host machine.
The approach outlined in this article consists of creating a Git repo containing a single Vagrantfile and Git subtrees, each one linked to an external project. Once the VM has been provisioned and loaded, the external projects will be available in Vagrant’s default synced folder in the VM (/vagrant), where they can run in the proper environment.
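To make that last point concrete, here is a small sketch of how the host layout maps onto the guest. It spells out the synced folder that Vagrant sets up by default (the project directory shared at /vagrant); the guest paths in the comments follow from the three example components used in this post.

```ruby
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"

  # This line only makes Vagrant's default explicit: the directory that
  # contains the Vagrantfile (dev-env) is shared with the guest at /vagrant.
  config.vm.synced_folder ".", "/vagrant"

  # Resulting layout inside the VM:
  #   dev-env/core             -> /vagrant/core
  #   dev-env/service-auth     -> /vagrant/service-auth
  #   dev-env/service-notifier -> /vagrant/service-notifier
end
```

Because the subtrees live inside dev-env, no per-component synced folder is needed; editing a file under core/ on the host changes /vagrant/core in the guest.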