TurboGears Deployment with supervisord and workingenv.py

As my web development involves more of the microapps stuff, and the applications I’m building become clusters of larger and larger numbers of small services, the deployment challenges become more important to solve. This post is an explanation of my current best practices at the time of writing.

Most of what I build these days is built on the TurboGears Python web framework and runs on Gentoo Linux. I decompose applications into as many small services as possible. As a result, our production server at work currently has dozens of TG applications running on it. We serve Perl, Java, and PHP apps too, and they each have their own deployment challenges, but for now I’m going to focus on the Python stuff.

I started with TurboGears when it was at version 0.8 or so and have basically kept up with the current version (it’s at 1.0b as I’m writing this). The TG APIs haven’t quite stood still across those versions, so the apps written for 0.8 won’t run directly with 0.9 or 1.0. They still work fine though and I’d rather not spend my entire life continuously porting all my applications to the current version of the TG API. So one of the most important aspects of my deployment strategy is a way to have multiple, mutually incompatible versions of TG installed side by side on the same machine without interfering with each other.

It took me a while, but after being bitten too many times by upgrades to one library breaking other applications on the same machine, I eventually came to agree with Ian Bicking’s argument against site-packages (and I’ve been bitten by this in almost every language/environment I’ve ever worked in; it’s not just a Python issue).

Luckily, Ian actually did something about the problem and wrote workingenv.py, which is a handy script that sets up an isolated Python environment. Workingenv.py combined with setuptools has been a maintenance dream come true. I now set up a working-env for every single application I write and easy-install TG and any other libraries that are needed for that application into it. I now have complete control over what versions of what libraries are in use by each application (with workingenv.py’s “--requirements” flag) and they can all peacefully coexist on the same machine. The cost is a bit of disk space and an extra shell command here and there to set up and activate the environments, but it saves so many upgrade-related headaches that it’s an easy win.

I used to deploy all my TG apps with Apache and mod_python using the mpcp bridge. Actually, many are still deployed that way, but I’m moving away from it. Embedding Python in an Apache process works really well for some things, but it also made isolating environments with workingenv.py significantly more complicated (I do have a way of doing it, but it’s a messy hack that I’d rather not include in a post on “best practices”). Mod_python deployment also has memory usage issues that get worse the more apps you pile into it, and having them all tied to the same Apache process means that to restart or reconfigure one, the Apache daemon has to be restarted, which restarts all the other apps tied to it. As some of them become more critical to our operations, the couple of seconds of downtime each time Apache gets restarted becomes more of a problem.

So I’ve now switched to the approach that the TG community seems to have basically agreed is best, which is running TG apps standalone and proxying to them with Apache or lighttpd (we use Apache at work and I use lighttpd for this site). The main drawback to this approach is that there are a lot more long-running server processes that have to be kept running. That means more start/stop scripts, monitoring, and logs for the sysadmin side of me to deal with. It also means that if one of the apps manages to crash, someone has to notice and restart it.

Supervisord is the best solution to this problem that I’ve found. Supervisord is a daemon that starts your services as child processes and watches them, restarting them automatically if they die. It handles things like dependencies and monitoring, and it is smart enough to back off on restarts if the process goes crazy. Titus Brown’s introductory article is a good place to start to learn more about it.
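To make the restart-and-back-off idea concrete, here is a toy sketch in bash. It is not how supervisord is implemented. The `supervise` function, its arguments, and the doubling delay are all made up for illustration, and treating a clean exit as “done” is a simplifying assumption.

```bash
#!/bin/bash
# Toy supervision loop (hypothetical, for illustration only).
# usage: supervise <max restarts> <initial delay in seconds> <command...>

supervise() {
    local max_restarts=$1 delay=$2
    shift 2
    local attempt=0
    while true; do
        # Run the command in the foreground; a clean exit ends supervision.
        "$@" && return 0
        attempt=$((attempt + 1))
        if [ "$attempt" -gt "$max_restarts" ]; then
            echo "giving up after $max_restarts restarts" >&2
            return 1
        fi
        echo "process died, restart $attempt in ${delay}s" >&2
        sleep "$delay"
        # Back off: double the wait before the next restart.
        delay=$((delay * 2))
    done
}
```

Something like `supervise 5 1 ./start-foo.py prod.cfg` would restart the server up to five times, waiting longer after each crash. A real supervisor also has to deal with logging, signals, and many processes at once, which is exactly the part I don’t want to write myself.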

Here’s a more concrete example of how all of these pieces fit together.

The first step in setting up a new application is to create a working-env for it. I like to include a requirements.txt that lists the exact versions of every library that I want to use for the application. The example ones that are linked to in Ian Bicking’s post are a good starting point.

$ python ~/bin/workingenv.py working-env --requirements requirements.txt

Then, to do development on the application, you activate the environment and start the server:

$ source working-env/bin/activate
$ ./start-foo.py

For production, we’ll need a single script that supervisord can execute that starts the server in production mode. It looks something like:

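:::bash
#!/bin/bash
#  this file is: start.sh
#  usage: start.sh <project dir> <config name>
cd "$1" || exit 1
source working-env/bin/activate
#  run the server in the background so its pid can be recorded
./start-foo.py "$2.cfg" &
echo $! > /tmp/foo.pid
#  block until the server exits
wait $!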

and the corresponding config section in supervisord.conf is:

:::apacheconf
<program foo>
 command pidproxy /tmp/foo.pid /path/to/foo/start.sh /path/to/foo/ prod
 auto-start true
 auto-restart true
 logfile /var/log/foo-supervisor.log
 user pusher
</program>

Those require a little bit of explanation. start.sh takes a path to the project’s directory and the name of a config file as its arguments ($1 and $2) to be flexible for different deployment situations. Since start.sh spawns the Python process separately, if we just did:

:::bash
#!/bin/bash
cd $1
source working-env/bin/activate
./start-foo.py

Supervisord wouldn’t quite work right. It would send its signals to the bash process instead of the Python process, so stopping and restarting the service wouldn’t behave correctly. So instead, start.sh starts the Python process in the background, writes its pid out to a file, and then waits for the Python process. The pidproxy program (which comes with supervisord for just such a purpose) is then used to run start.sh and send signals directly to the Python process.
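A variation worth knowing about, offered as a sketch rather than as what we run: if a wrapper script ends with `exec`, bash replaces itself with the command instead of forking a child, so the process keeps the same pid. The demo below only shows that pid behaviour. The `demo_exec_pid` function is hypothetical, and `sh -c` stands in for start-foo.py.

```bash
#!/bin/bash
# Demonstrates that `exec` keeps the pid: the wrapper shell is replaced
# by the command rather than becoming its parent.

demo_exec_pid() {
    local wrapper
    wrapper=$(mktemp)
    # Hypothetical wrapper script; the inner sh stands in for the server.
    cat > "$wrapper" <<'EOF'
echo "wrapper pid: $$"
exec sh -c 'echo "server pid: $$"'
EOF
    bash "$wrapper"
    rm -f "$wrapper"
}

demo_exec_pid
```

Both lines print the same pid. In principle that means the last line of start.sh could become `exec ./start-foo.py "$2.cfg"`, and signals sent to the script would reach the Python process without a pid file. I’m assuming, not asserting, that this plays well with your version of supervisord, so test it before relying on it.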

It took some work to figure all this out, but once you’ve done it for one or two apps, it becomes pretty quick and easy to set up any application for deployment like this. Probably my next step will be to create a paster template for tg-admin so these scripts get automatically put into quickstarted projects instead of having to copy them in manually.

I should also mention that there are a few more components of our deployment strategy that I didn’t really talk about. The first is version control. If you’re doing development and not using version control of some sort, you are insane and deserve whatever horrible fate befalls you. Second, we have a single application that handles deployment to production. You press one button and it checks out your app to a staging environment, runs the unit tests, then rsyncs it to the production server and runs any additional steps that are needed there (in the above case, it would do a ‘supervisorctl restart foo’ on the production server to get the new code running). It also logs every step, tags releases in svn, and allows for easily rolling back production to a previous release in case something goes wrong. Our pusher/deployment system is very particular to our environment, so I won’t explain it in too much detail here, but I highly recommend spending the time to set up something similar for your situation. At the very least, you’ll want a shell script that does the deployment process in a single step.
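As a starting point for that single-step script, here is a minimal sketch of the steps described above. It is not our actual pusher. The repository URL, host name, staging path, and the use of nosetests as the test runner are all hypothetical placeholders, and by default it only prints the commands it would run.

```bash
#!/bin/bash
# One-step deploy sketch. Every name below is a placeholder.
# With DRY_RUN=1 (the default) commands are printed, not executed.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

deploy() {
    local app=$1
    local repo="svn://svn.example.com/$app/trunk"   # hypothetical
    local staging="/tmp/staging/$app"               # hypothetical
    local prod_host="prod.example.com"              # hypothetical
    local prod_dir="/path/to/$app/"

    # Chain with && so a failed checkout or test run stops the deploy
    # before anything reaches production.
    run svn export --force "$repo" "$staging" &&
    run sh -c "cd '$staging' && nosetests" &&
    run rsync -az "$staging/" "$prod_host:$prod_dir" &&
    run ssh "$prod_host" supervisorctl restart "$app"
}
```

Running `deploy foo` prints the four steps; `DRY_RUN=0 deploy foo` would actually execute them. Logging, tagging releases, and rollback are left out to keep the sketch short.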