Setting up Docker Swarm in AWS with EFS-backed storage and Traefik with Consul

Docker is awesome, but Docker is also hard.  The documentation can be sparse or cryptic, and every guide I find seems to apply only to one particular use case or a test environment, always with a footnote stating that more needs to be done.  Use cases can be varied and complicated, but I had a simple task: I needed Docker containers running in the cloud delivering websites, preferably all in the same swarm to keep costs down and management simple.

This is an attempt to document an easy, dynamic setup for Docker Swarm in AWS that allows a user to have multiple WEBSITES (not just microservices) living in the swarm, each with their own microservices if needed, all with SSL certificates auto-generated and living behind a load balancer.

In the past few months I have probably spent hundreds of hours trying to figure out the Docker ecosystem and technologies.  I was part of an initiative at my company to move to containerized applications with Docker.  I wanted to move to shipping environments instead of code, and if they are scalable, all the better.

Setting up and using a swarm requires some re-thinking about how your application will work and how to leverage some technology you may have taken for granted – like volumes and networks.

This will be a quick guide to setting up a swarm in AWS with all the necessary bells and whistles for fast deployment.

Requirements:

  • Scalable architecture
  • One Docker Swarm
  • HTTP to HTTPS redirection and SSL generation for many actual websites (could be up to a few hundred)
  • Load balancing
  • Reverse proxying for multiple TLDs and subdomains, so I can point whatever domains I want at the swarm and it works, as long as there is a service listening for that domain.
  • Persistent storage for SSL certs in a high-availability system – so if I lose my proxy, I can rebuild it in seconds and re-attach to that storage and not have to regenerate certificates
  • Internal service discovery and service-to-service communication – so I can have microservices backing a main website.  The swarm does this automatically.

Step 1:

There are a bazillion ways to create a swarm in AWS, but the most straightforward is to use the CloudFormation template Docker has put together.  They have a page with links that will take you to the CloudFormation console in AWS and pre-fill the template for a swarm.

Make sure you have a key pair for the AWS region you are working in so you can SSH to your instances/nodes.

Select the link that works for your needs, probably CE Stable.  You will be taken to the AWS CloudFormation stack creation page with the template pre-selected.  Continuing to the next page will show you options for this template.

You get a few options to determine how many managers and workers will initially be created in the swarm.  I just chose 3 managers and 3 workers.  Also, and this is important, you must enable EFS storage at this point so you can create volumes that are shareable among several running containers – basically, a shared file system.  You can go back later and do this, but it will essentially destroy and rebuild your entire swarm.

This template will create an entire Virtual Private Cloud with all the necessary instances, a load balancer that magically knows about published ports in the swarm (AWESOME), security groups, and the like.

NOTE: There is some confusing terminology here, since a ‘Stack’ in AWS is a set of instances, load balancers, subnets, security groups, etc. that make up a VPC (Virtual Private Cloud), while a stack in Docker is a collection of services running together in a scalable manner.

WARNING: In order to get EFS-backed storage so you can share volumes between containers/services/stacks, you MUST manually enable EFS storage on the setup page for this CloudFormation stack.

WARNING: You CANNOT use Docker Cloud to create your swarm if you want EFS backed storage, since it does not enable it by default.

Step 2:

The VPC and Swarm are now running.  Now it is time to set up a connection to your swarm so you can run commands on it.  You have a few different ways to do this.

Method 1: SSH

Using SSH to connect to one of your manager nodes, you can run docker commands all day long.  Unfortunately, this means you need to copy any files you want Docker to use onto that manager node, which can get messy.  For example, deploying a stack means copying the .yml file to that node and running docker stack deploy from there.
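
For example, the Docker for AWS image creates a docker user on each node, so connecting and checking the swarm looks something like this:

ssh -i key.pem docker@<manager_public_ip>
docker node ls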

Method 2: SSH Tunneling

This method involves creating an SSH tunnel to your manager node and then using the exported DOCKER_HOST variable to send your local Docker commands through the tunnel.

Normally, Docker commands are sent to the local Docker socket on your machine.  However, you can set the DOCKER_HOST environment variable so that docker sends commands to a different host, either a VM or a remote machine (as we want to do here).  Instead of exporting the variable, which overrides it for your current terminal session, you can specify the target host with docker’s -H argument, e.g.:

docker -H <machine>:<port> <command>

The benefit is that you run commands from your machine and send them to the swarm, and can use local stack files to configure and deploy stacks on the remote swarm.  You don’t need to copy the files to the remote, since the command is parsed locally and then sent to the swarm!

For example, you can set up a tunnel:
ssh -i key.pem -NL localhost:2376:/var/run/docker.sock docker@<ip_address_of_manager> &

and then run
export DOCKER_HOST=localhost:2376

Then, for this session on the command line, all docker commands will be sent to this tunnel.  Optionally, you can omit this step and simply use:

docker -H localhost:2376 <commands>

Which accomplishes the same thing.

The downside to this method is the need to maintain your tunnels, which can break when left idle and behave unpredictably when you switch networks or VPNs.  Therefore, I am not a fan of this method.

Method 3: Exposing the Docker Port

You can also open the Docker port in the security group for your managers (you will see an appropriately labeled security group for your swarm managers in AWS) and then set the manager IP and port in your DOCKER_HOST environment variable.  Without setting up client certificates and the like (beyond the scope here), you will have to use a non-secure connection.  The non-secure port is 2375 – open that up to your IP pool or VPN (NOT THE WORLD), and then run

export DOCKER_HOST=<manager_node_ip>:2375

Now your docker commands will be sent to that manager.  I do not recommend this setup: it is a security risk since the connection is unencrypted, and if you don’t have a VPN and are on the move, you would have to keep updating the allowed IPs.

Method 4: My Favorite – Docker Cloud

Method 4 involves leveraging Docker Cloud to proxy your commands.  You get the benefits of the SSH tunnel, namely running commands from your local machine with access to local stack files, without needing to maintain SSH tunnels yourself.  It is also secure, since Docker Cloud talks to the Docker daemon in the swarm over the secure connection on port 2376.

Log in to Docker Cloud and switch to swarm mode.  Navigate to the swarms page and hit ‘Bring your own swarm’ – a popup will appear with instructions to run a docker command on one of your manager nodes.  Copy that command, SSH in to one of your manager nodes, paste and run.  This command will create a client proxy container that will essentially receive remote docker commands and pipe them into the swarm.  You will be asked to log in to Docker Cloud and set a name for the swarm.

Once that is complete and the swarm is connected to Docker Cloud, your swarm will appear in Docker Cloud with a blue dot, indicating it is running.  Click your swarm to get a script to run on your local machine; it installs another proxy container bound to one of your local ports.  Copy, paste, and run it.  Once this is running, you have a proxy tunnel to your swarm that is managed securely by Docker Cloud.  It will display instructions on what local port the tunnel is connected to and how to set up your session to use it.  You can export DOCKER_HOST as in Method 2, or just add the -H argument to your docker commands; either way you now have a stable and manageable connection.  If you forget what port you need to use, just run

docker ps

on your local machine, and you will see the client proxy container and its local port!
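
For example, if docker ps showed the client proxy bound to local port 32768 (your port will almost certainly differ), you would run:

export DOCKER_HOST=localhost:32768
docker node ls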

Step 3:

You now have a VPC, an active swarm, and a reliable way to communicate with that swarm.  Next, we need to set up Traefik as a reverse proxy.  Traefik works by mapping a frontend (a domain name) to a backend (a service in the swarm).  It does this by reading labels you set on your Docker stacks, which tell Traefik what domain they want to be associated with and what port they want to talk on.

Essentially, we want ports 80 and 443 open to the world, which would work fine if we had just one service on those ports (the nature of Docker and swarm is that only one service can publish a given port at a time) – but we want multiple websites.  The reverse proxy handles this by listening to all requests and routing each one to a backend service based on the requested domain/path – a lot like how Apache virtual hosts work.  Traefik is awesome because it also handles automatic HTTPS redirection and automatically generates SSL certs for each domain.  We store the certificates in a key-value store – Consul, per the Traefik docs – so we have constant, controlled, and persistent access to these certs, even when Traefik is running in high-availability mode across three replicas on three nodes.
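
As a quick illustration of the label-driven routing (the stack-file version we actually deploy appears later in this guide), a service created on the command line might look something like this; the domain is a placeholder, and the webgateway overlay network is created in the next step:

docker service create \
  --name whoami \
  --network webgateway \
  --label traefik.port=80 \
  --label traefik.frontend.rule=Host:whoami.domain.com \
  --label traefik.docker.network=webgateway \
  emilevauge/whoami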

First, we need to create a network that both Traefik and all our services will use to talk to each other.  If we didn’t specify a network, every time we ran docker stack deploy it would create a private network just for that stack, which would not be able to talk to or know about other stacks, services, or containers.

docker network create -d overlay <network name>
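
For example, to create the overlay network that the stack files later in this guide expect (they call it webgateway):

docker network create -d overlay webgateway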

Now let’s create an EFS-backed shared volume for Consul to use.  Consul acts as a controller for setting and getting data from a shared source, and its key-value store is where our reverse proxy, Traefik, will keep configuration and SSL certificates.  We map an EFS-backed shared volume into Consul so that if Consul dies, we still have all our key-value data and can simply bring up a new Consul service and reconnect with little downtime, and no need to rebuild configuration or regenerate SSL certs.  EFS is our backing mechanism because it lets us create volumes that are shareable across multiple services and scaled containers, which normal volumes cannot do.

docker volume create -d "cloudstor:aws" --opt backing=shared {volume_name}
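
As a concrete example (assuming we call the volume consul-data), plus a quick check that it exists:

docker volume create -d "cloudstor:aws" --opt backing=shared consul-data
docker volume ls

If you want a stack to reuse a volume created ahead of time like this, the stack file would need to declare that volume as external.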

Now we are ready to actually boot up Traefik and Consul.

Step 4:

We want Traefik to run in high-availability mode, which basically means it will live on all points of contact to the swarm – the manager nodes.  The load balancer generated in Step 1 listens for published ports in the swarm, so a request will be routed from the load balancer to a manager node, to Traefik, and then to a service on the backend.  To do this, we deploy Traefik as a stack: a set of replicated services across all manager nodes.

We actually created a DevOps repo that contained this documentation and our stack file for Traefik and Consul, and then deployed the stack from a local machine using the above proxy method to send that command to the swarm.

This stack file was taken from the Traefik documentation directly, and you can read more about it here.  This is just a quick start guide:

version: "3.4"
services:
  traefik_init:
    image: traefik:1.5
    command:
      - "storeconfig"
      - "--api"
      - "--entrypoints=Name:http Address::80 Redirect.EntryPoint:https"
      - "--entrypoints=Name:https Address::443 TLS"
      - "--defaultentrypoints=http,https"
      - "--acme"
      - "--acme.storage=traefik/acme/account"
      - "--acme.entryPoint=https"
      - "--acme.httpChallenge.entryPoint=http"
      - "--acme.OnHostRule=true"
      - "--acme.onDemand=false"
      - "--acme.email=email@email.com"
      - "--docker"
      - "--docker.swarmmode"
      - "--docker.domain=domain.com"
      - "--docker.watch"
      - "--consul"
      - "--consul.endpoint=consul:8500"
      - "--consul.prefix=traefik"
    networks:
      - traefik
    deploy:
      restart_policy:
        condition: on-failure
    depends_on:
      - consul
  traefik:
    image: traefik:1.5
    depends_on:
      - traefik_init
      - consul
    command:
      - "--consul"
      - "--consul.endpoint=consul:8500"
      - "--consul.prefix=traefik"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    networks:
      - webgateway
      - traefik
    ports:
      - 80:80
      - 443:443
      - 8080:8080
    deploy:
      mode: global
      placement:
        constraints:
          - node.role == manager
      update_config:
        parallelism: 1
        delay: 10s
      restart_policy:
        condition: on-failure
  consul:
    image: consul
    command: agent -server -bootstrap-expect=1
    volumes:
      - consul-data:/consul/data
    environment:
      - CONSUL_LOCAL_CONFIG={"datacenter":"us_west2","server":true}
      - CONSUL_BIND_INTERFACE=eth0
      - CONSUL_CLIENT_INTERFACE=eth0
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.role == manager
      restart_policy:
        condition: on-failure
    networks:
      - traefik

networks:
  webgateway:
    driver: overlay
    external: true
  traefik:
    driver: overlay

volumes:
  consul-data:

Modify the values to reflect what you need – Let’s Encrypt wants an email address, and Traefik wants to know about a default domain.  The domain is somewhat arbitrary, since you can point multiple domains and subdomains here and they will all be handled.  Make sure your networks are named correctly (the traefik network is fine the way it is, and is just a way to talk to Consul), and make sure you set the Consul datacenter setting correctly.

Once ready, run:

docker stack deploy -c <name of stack file>.yml traefik

And everything should start smoothly.  Traefik now monitors the Docker socket and will publish any services that start up with the appropriate labels on them (read more about labels and Docker on the Traefik site).  Essentially, it will auto-discover publishable services and, on the first request to them, generate an SSL cert.  You also don’t need to touch the load balancer for any of this, since it listens to the swarm and automatically adds listeners for any published ports!
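
To confirm the stack came up cleanly, you can list its services and follow the Traefik logs (docker service logs needs a reasonably recent engine, which the Docker for AWS template provides):

docker stack services traefik
docker service logs -f traefik_traefik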

Traefik has a simple dashboard you can view on port 8080.  If you have a domain pointed to your load balancer, you can hit it on port 8080, or you can hit your load balancer directly on 8080.  I recommend adding a security policy to prevent unwanted viewing.

You are now up and running.  As an example, I pointed a domain and all its subdomains to the swarm load balancer.  I then deployed a stack with labels telling Traefik it wanted to be discovered on a certain subdomain of that main domain.  Once it started, Traefik picked up on it and showed it as active on the dashboard.  I then went to that subdomain and, after a few seconds and a couple of requests to generate the SSL cert, had a fully secure connection.  I can scale that service in the stack up and down and it round-robins perfectly.

Here’s an example whoami yml.  You would set your zone file to point domain.com to the load balancer and *.domain.com as a CNAME to your main domain, then use this:

version: "3"
services:
    whoami:
        networks:
            - webgateway
        image: emilevauge/whoami
        deploy:
            labels:
                traefik.docker.network: webgateway
                traefik.frontend.rule: "Host:whoami.domain.com"
                traefik.port: "80"
networks:
    webgateway:
        external: true

Run this using the same setup as the Traefik command above (sending your commands to your remote docker swarm):

docker stack deploy -c <whoami file>.yml whoami

And you will see it appear on the Traefik dashboard.  Hit the URL – it may take a few requests, or a few seconds, for the SSL cert to work correctly, and then it is all set!

You can now scale the service by running:

docker service scale whoami_whoami=3

And it will scale up!  Hit the domain again and you can see the node ID changing, but the cert still working.
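
You can also check which nodes the replicas landed on:

docker service ps whoami_whoami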

Now you are all set to start deploying multiple sites and services to your swarm.

jQuery’s abort() method on AJAX requests still calls the fail() callback function

I am working on a piece of cloud software that allows users to generate Text Codes (those nifty letters and numbers you can send to a short phone number to sign up for events, coupons, etc.).  One of the requested features was an inline validation checker for the actual text code (since the codes would all be for the same phone number, each has to be unique).

No problem.  Let’s create the input and bind some fancy listeners on key events!

<input id="textcode" type="text" class="form-control col-md-7 col-xs-12 @if($errors->has('textcode')) parsley-error @endif"
name="textcode" value="{{ old('textcode') }}" required>

Javascript:

function watchTextCodeInput()
{
    var $input = $('input[name=textcode]');
    var $currentRequest = false;
    // on keyup, validate the text code as the user types
    $input.on('keyup', function () {

        this.value = this.value.toLocaleUpperCase();
        var textcode = $input.val();

        $input.removeClass('parsley-validated');
        $input.removeClass('parsley-success');
        $input.removeClass('parsley-error');

        $input.parent().parent().find('.parsley-errors-list').remove();

        // if $currentRequest is now defined as an ajax object, abort it
        if($currentRequest !== false){
            $currentRequest.abort();
            $currentRequest = false;
        }

        if(textcode.length > 3){

            $currentRequest = $.getJSON('/admin/textcodes/check_code/' + textcode, function(res){

                $input.addClass('parsley-validated parsley-success');

           }).fail(function(res){

                res = res.responseJSON;
                var message = res.valid ? 'Text Code Unavailable' : 'Text Code is invalid format.';
                // Perform other business here...

           });
       }
    });
}

Everything looks fine on the surface, but when testing the code I ran into an immediate issue: for some reason, I was getting an error in the console about reading the property ‘valid’ of ‘undefined’.

A little digging through the stack, and I discovered that it was being thrown in the .fail() callback – even though I had aborted the call!

It turns out that jQuery still calls the .fail() callback when a request made with $.ajax (or $.getJSON, $.post, $.get, etc.) is aborted.  This post describes the same issue, and is what helped me track it down.

I agree with the sentiment that a call to .abort() should not be considered a failure, but rather a halt.  A failure should only apply if the request was allowed to continue and actually failed due to server, network, or other errors.

The solution appears to be to check the textStatus value passed as the second argument to the .fail() callback function.

function watchTextCodeInput()
{
    var $input = $('input[name=textcode]');
    var $currentRequest = false;
    // on keyup, validate the text code as the user types
    $input.on('keyup', function () {

        this.value = this.value.toLocaleUpperCase();
        var textcode = $input.val();

        $input.removeClass('parsley-validated');
        $input.removeClass('parsley-success');
        $input.removeClass('parsley-error');

        $input.parent().parent().find('.parsley-errors-list').remove();

        // if $currentRequest is now defined as an ajax object, abort it
        if($currentRequest !== false){
            $currentRequest.abort();
            $currentRequest = false;
        }

        if(textcode.length > 3){

            $currentRequest = $.getJSON('/admin/textcodes/check_code/' + textcode, function(res){

                $input.addClass('parsley-validated parsley-success');

           }).fail(function(res, textStatus){
               if(textStatus !== 'abort'){
                    res = res.responseJSON;
                    var message = res.valid ? 'Text Code Unavailable' : 'Text Code is invalid format.';
                    // Perform other business here...
               }
           });
       }
    });
}

After that, the request performed as expected.

MySQL case sensitivity and table names in Linux

I recently ran into an interesting issue where a project I worked on in CodeIgniter was using a mix of upper and lower case names to query the same table (this project is older and had been worked on by several people over the course of a few years):

SELECT * FROM CMS_email...
SELECT name FROM cms_email...

The issue is that certain file systems are case sensitive, and since MySQL tables are stored as files and folders, the underlying file system’s case sensitivity affects whether this code works.  Interestingly, when I downloaded the tables via Sequel Pro and imported them to my local VM server, the table names were all converted to lowercase as well.  Either way, it would be a conflict.

Typically, Windows file systems are not case sensitive, so the above code works out of the box.  On most Unix-based systems, however, the file system is case sensitive, which means we get MySQL errors when we attempt to run this code on an Ubuntu box, for example.  Since I run an Ubuntu VM with Vagrant, I needed to either:

  1. Modify every query to use lowercase table names and rename the production tables to match (or the reverse: update my local queries and table names to match production).  Doable, but somewhat risky and complicated.
  2. Find another way to make this work with MySQL settings (read: lazy, quick)

Option 2 is doable with the MySQL configuration variable lower_case_table_names.  This variable tells MySQL to store table names in lowercase and compare them case-insensitively, no matter how the queries are written.  This makes it easier to work with my all-lowercase tables that were automatically renamed on import, and I don’t have to parse through tens of thousands of lines of code to update queries.

Important note: before using this, we need to make sure our table names are all lowercase, since the setting only changes how incoming names are handled – tables that already have uppercase names on disk will still not be found.  This works for me because my imported tables were automatically converted to lowercase names.
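
If any table name still contains uppercase letters, rename it before enabling the setting; a hypothetical example (the database name here is a placeholder):

mysql -u root -p -e "RENAME TABLE mydb.CMS_email TO mydb.cms_email;"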

All we need to do is find our configuration file (my.cnf) and add an entry to turn this system variable on.  On Linux, this is usually:

/etc/mysql/my.cnf

However, this all depends on how you installed MySQL.  MySQL configuration files are a topic of their own; this article shows some advanced ways to locate them.

Once we have located the file, we can open it with our text editor and find the section titled [mysqld].  There will be a block of Basic Settings, and at the end of this block we will add our entry:

#
# * Basic Settings
#
user            = mysql
pid-file        = /var/run/mysqld/mysqld.pid
socket          = /var/run/mysqld/mysqld.sock
port            = 3306
basedir         = /usr
datadir         = /var/lib/mysql
tmpdir          = /tmp
lc-messages-dir = /usr/share/mysql
skip-external-locking
lower_case_table_names=1

Once we are done, we can save the file then restart our MySQL server, in my case using the command:

sudo service mysql restart
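
To verify the setting took effect, you can check the variable (it should report 1):

mysql -u root -p -e "SHOW VARIABLES LIKE 'lower_case_table_names';"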

And now you can query the table

cms_email

with the query

SELECT * FROM CMS_Email

File Permissions for Apache (Ubuntu, Linux)

I find myself Googling this all the time – setting permissions and users/groups for the /var/www folder of a LAMP install.  I also break down some of the terminal commands a little.  In my experience, many web developers touch the command line every so often but never get really comfortable understanding what they are actually doing.  So here we go, for my benefit and yours:

Find your main user name (the one you will SSH and SFTP with).  For AWS (my most common case) with an Ubuntu EC2 install it is ‘ubuntu’, and for Vagrant boxes it is ‘vagrant’.

We need to add this user to the www-data group so they can share permissions.  Apache runs in the www-data group, and Apache’s ‘run as’ user will be the one creating and executing files within the /var/www folder (read: uploads, online edits, etc).  We also use sudo with all this to avoid any permissions errors before setup is complete.

The command usermod allows us to change a user’s settings.  The flag -a means ‘append’ and must be used in conjunction with -G (the list of groups).  We pass -G the group(s), followed by the user we are modifying:

sudo usermod -a -G [group-name] [user-name]

For AWS with Ubuntu:

sudo usermod -a -G www-data ubuntu
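
You can confirm the membership afterwards with:

groups ubuntu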

Next, we need to change group ownership of the /var/www folder (and everything inside it) to www-data, so we can all share permissions within the group.  The command chgrp performs this task, with the -R flag to apply the group recursively, followed by the folder we are applying the group to:

sudo chgrp -R www-data /var/www

Finally, set permissions on folders and files for everything in the /var/www folder. We will use 644 for files and 755 for directories (this is standard). If you need special permissions, run these commands first, then apply special permissions to whichever files and directories need it after the fact.

We use the command chmod to perform this action (see link – we will use numeric permissions, as I prefer this method).  However, chmod has a caveat: it has a -R recursive flag, but we want to apply different permissions depending on whether we are working with a folder or a file.  chmod cannot differentiate between files and folders, so instead we use the find command in conjunction with its -exec action.

Reading from left to right, the logic is to find everything in /var/www of a particular type (the filter flag for find is -type, followed by d for directory or f for file), then execute an arbitrary (inline) command on it.  We will execute chmod, setting our permissions accordingly.

sudo find /var/www -type d -exec chmod 755 "{}" \;
sudo find /var/www -type f -exec chmod 644 "{}" \;

That last bit with the quotes and curly braces tells -exec to operate on the current search result, which changes as find loops and runs the chmod command on every match.  So say it finds index.php – exec then runs

chmod 644 index.php

Since the entire command is prefaced with sudo, it will actually run:

sudo chmod 644 index.php

There you have it, permissions are ready to go.
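
One last note: if Apache itself needs to write to a particular directory (uploads, cache, etc.), you can add group write permission to just that path afterwards (the path here is only an example):

sudo chmod -R g+w /var/www/uploads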

Setting up a Vagrant Environment for PHP/MySQL development

Why Vagrant?  What’s the point?

In my industry, we are often led to believe that the “newest” or “most cutting edge” development tools are a must-do, integral part of developing good websites.  Unfortunately, most of the time we end up spending more time learning new tools or workflows than actually getting any work done, and it tends to dissuade us (or me) from branching out and learning new techniques.  I may be tempted by newer, faster ways of doing everyday tasks (SASS compilation, JavaScript minification, etc), but only when I have time and patience.  My line of work is mostly one-time-client-based: fast-moving, short-term projects often stacked in three or four week intervals – so once I start a project, there’s little to no time to try out a new workflow or technology before the deadline.

Having said that, I do stumble across the occasional “must-have” tool (depending on the project), like CodeKit, that truly does improve productivity with minimal impact on my current workflow.

Enter Vagrant – a self-contained, configurable virtual machine for loading and testing websites or applications.  Having run MAMP without problems for years, I figured Vagrant would be cool but not useful enough to incorporate.  But it is, and here’s why:

  1. It keeps the same server environment wherever the project goes
  2. It’s dynamic, and easy to configure to match whatever production server environment I need
  3. It’s fast to boot up, fast to shut down, and fast to erase (one command for each)
  4. It maintains its settings and packages outside the virtual machine, so I don’t waste space with several different Ubuntu environments set up for specific purposes.
  5. FREE

You may be wondering why you would want to run a virtual machine for local server testing if you have *AMP(P) software already running.  The thing is, each installation of Apache, MySQL, and PHP can differ in subtle but important ways.  PHP versions may be different, Apache might not have the same configurations on different machines, and so on.  And in many places, every developer and designer has a local server stack running on their machine to test and preview websites.  This means that code can work on one machine, but not another.  Code may work on all MAMP installs in the office, but fail on an Amazon server (I have had this happen before).  Git takes care of maintaining the same code base across multiple machines, but doesn’t guarantee person A will see the same results as person B.

We could try to include a Virtual Machine image within each repo, but imagine the overhead in updating those files (averaging several gigabytes).

What if, instead, we could just push a flat configuration file for a server, have some software read it, build a virtual machine environment matched to a project, and boot it up?  That’s where Vagrant comes in.

Now, I am by no means a server guru.  I have difficulty setting up virtual hosts, remembering the correct commands to restart Apache, and have to keep a cheat sheet for everything I do over SSH.  And yet, Vagrant was a breeze to set up.  So let’s get into it.  All it takes is some basic knowledge of the command line or terminal app, and you can pick that up as you go.

Installing Vagrant

Head over to Vagrant’s Website and download a copy.  Once you follow the installation procedures, nothing happens.  Yet.

You also need to install virtualization software.  I run VirtualBox for everything here.  All you have to do is have it installed for Vagrant to work with it.  This software creates, from system image files, a virtual computer running in its own shell, isolated from your host system.  Vagrant’s virtual machine runs in the background, so you don’t actually “see” the virtual machine running.

After that, we are set to start booting up a Vagrant server.

Let’s create a project first.  I’ll create a project folder called Vagrant for the hell of it.  I’ll put it in my Development directory where all my other projects are.  Vagrant automatically syncs this project directory into the virtual machine (at /vagrant) – in essence, a shared folder.  We will see more of that after setup.

Once the project directory is created, I’ll navigate to it via command line / terminal:

Terminal:

cd ~/Development/Vagrant

Windows CMD:

cd C:/users/Calvin/Development/Vagrant

Once I’m there, I will run the vagrant init function.  This command takes one argument specifying what type of server (OS and version) we want to initialize.  Vagrant refers to this as a Box.  A typical choice is hashicorp/precise32, which is an Ubuntu (Linux) server (Box).

vagrant init hashicorp/precise32

It will grab the latest version of this Box and store it outside of the project folder – so you can have one base image from which to build different virtual server environments.

Now we run the boot up function:

vagrant up

And bam!  The server is running.  It may not look like it, but it’s there! To check, run

vagrant status

And it should tell you that the server is running.  You can now SSH into it* using

vagrant ssh

*Note to Windows Users – you will need to install an ssh client to use this command.  Git has one built in, but there are many others out there.

From here, you are using the Linux terminal to perform operations.  To confirm that the synced folder (mounted at /vagrant inside the VM) really is the same as your project folder, create a file in it with a simple touch command:

cd /vagrant
touch index.html

Now, outside of the terminal window, navigate to that folder in Finder/Windows. You should see your index.html file sitting there.

But this Virtual Machine is just an operating system – it’s not running Apache, or MySQL, or PHP yet.  We need to install and configure them. However, we don’t want to set it up for just this instance – we can use Vagrant’s configuration methods to make sure that whenever someone clones our source code, they can grab the configuration file for the virtual server and have the exact same environment we are running locally.

Configuring your Vagrant Virtual Machine

Vagrant configuration files are called Vagrantfile, and they are located in the project’s root directory.  Editing them is simple using a text editor.  Loading them into the Vagrant machine is called provisioning, and provisioning only happens automatically on the first vagrant up of a project.  To force a machine to re-provision itself, use the command

vagrant reload --provision

or, if the machine is already running and you just want to re-run the provisioners without a restart,

vagrant provision

To make a quick, LAMP-stack server, I first created a bash script called bootstrap.sh in the root project directory that checked a few things and installed a LAMP stack.  I then called that script to be run as part of the provisioning configuration.  This comes from a great article Getting Started With Vagrant on This Programming Thing.

bootstrap.sh:

#!/usr/bin/env bash

# Pre-seed the MySQL root password so the install runs non-interactively
sudo debconf-set-selections <<< 'mysql-server-5.5 mysql-server/root_password password rootpass'
sudo debconf-set-selections <<< 'mysql-server-5.5 mysql-server/root_password_again password rootpass'

# Install the LAMP stack
sudo apt-get update
sudo apt-get -y install mysql-server-5.5 php5-mysql apache2 php5

# Create the database and user only on the first run (flagged by /var/log/databasesetup)
if [ ! -f /var/log/databasesetup ];
then
    echo "CREATE USER 'database_user'@'localhost' IDENTIFIED BY 'database_password'" | mysql -uroot -prootpass
    echo "CREATE DATABASE vagrant_database" | mysql -uroot -prootpass
    echo "GRANT ALL ON vagrant_database.* TO 'database_user'@'localhost'" | mysql -uroot -prootpass
    echo "flush privileges" | mysql -uroot -prootpass

    touch /var/log/databasesetup

    # Load seed data if the project ships an initial.sql dump
    if [ -f /vagrant/data/initial.sql ];
    then
        mysql -uroot -prootpass vagrant_database < /vagrant/data/initial.sql
    fi
fi

# Point Apache's web root at the synced project folder and enable mod_rewrite
if [ ! -h /var/www ];
then
    rm -rf /var/www
    sudo ln -s /vagrant/public /var/www

    a2enmod rewrite

    sed -i '/AllowOverride None/c AllowOverride All' /etc/apache2/sites-available/default

    service apache2 restart
fi

You can copy this code directly into the bootstrap.sh file you use, just make sure of a few things:

  1. The database_user must be changed to a username of your choice
  2. The database_password must be changed to a secure password
  3. The vagrant_database can be renamed to whatever you want

To make this script run at startup, I changed my Vagrantfile to look like this:


Vagrant.configure(2) do |config|
 config.vm.box = "hashicorp/precise32"
 config.vm.provision :shell, path: "bootstrap.sh"
 config.vm.network "forwarded_port", guest: 80, host: 80
end

The last entry before end deals with port forwarding – basically, the port you want to preview the website from outside of the virtual machine.  Many people opt for port 8080 for development purposes, but I prefer the standard port 80.  It’s up to you.

Now that our provisioning and boot file are set up, head back to the project directory and run

vagrant reload --provision

just to ensure it really re-runs the provisioning.  You should see Vagrant output quite a bit of download and installation activity, and then the machine will boot up.  Now you are set with a full LAMP stack in a Vagrant Virtual Machine!

From here, just download/clone the latest version of WordPress, put the files where the web root points (with the bootstrap above, that is the public folder of your project), and run the installer.
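
One quick way to do that from inside the VM, assuming the bootstrap script above (which serves the project’s public folder) and that wget is available in the box:

cd /vagrant
mkdir -p public && cd public
wget https://wordpress.org/latest.tar.gz
tar -xzf latest.tar.gz --strip-components=1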

For more information:

Vagrant Documentation

Github Repo of some great provisioning bash scripts

New WordPress Site

Well, here we are.

In case anyone actually lands at this page, this is a new blog site/sandbox platform for me to experiment and share my developments, be it with web development, philosophy, or maybe a little project or two with my Arduino Uno.

Thanks for stopping by, and stay tuned…