By anders pearson 15 Apr 2023
Note: This is not an original idea from me. I definitely learned this term and concept from some book or blog post many years ago. I've been unable to find the original though, so I've written this up so I can reference it.
Automation is a major part of programming. Fundamentally, there’s nothing that a computer does that we couldn’t do manually given enough time. It’s just that computers can operate much faster, more precisely, and are tireless and without distraction.
So when we’re programming, we’re always just taking some manual process and automating it. Programmers spend an awful lot of time thinking about how we automate things, but it’s also important to think about what we automate.
An important antipattern related to the “what to automate” question, one that I try to keep in mind, is what I’ve heard called “Automating the Pain”.
Here’s a scenario as an example. Imagine that a company has an old manual process that involves Department A doing some work to generate a report from their own records or databases, printing out copies of that report, then that report gets sent to Department B, who need to take the data from that report, read it, and enter some or all of it into their own systems. Back in the day, this all may have happened via paper records in filing cabinets, forms filled in by hand, photocopies, and sent by post (or at least sneakernet). Now there’s probably less of that, but if you’ve spent any time in a large organization, you’ve almost certainly seen some version of this, probably involving Word documents or Excel spreadsheets being emailed around.
Programmers encountering a situation like that instinctively realize that much of that could be automated and are often eager to do so when given the chance. A first step might be just to switch to a paperless process where documents are emailed rather than printed out, mailed, entered back in, etc. That tends to make things faster and a bit more reliable (the poor data entry person can at least copy and paste instead of having to re-type) and will often be an improvement over the previous situation. They might then make improvements to the document formats to use standard fields that are easier to copy over from one system to another. Eventually, they might switch to web-based versions of both systems and automate the parsing.
The antipattern here is that often these processes either shouldn’t exist at all or shouldn’t involve any human interaction, and the automation work is just making a painful process a bit faster and less painful when instead we should be examining the entire system and eliminating or massively changing the process. Eg, why do Department A and Department B need to have separate copies of the same data, just in different formats and databases? In itself, that’s a recipe for inconsistency and errors. Maybe a better solution would be to set up a shared database that’s the single source of truth for the data that would’ve been sent back and forth and grant both departments the appropriate level of access to read and write to that database. Or maybe Department A is only getting the data first for historical reasons that are no longer relevant and instead the data should just go directly to Department B and bypass A altogether.
When I’m developing, I try to constantly ask myself that question: am I just automating the pain when I really should be automating away the pain?
I’m especially conscious of the antipattern when I find myself working on something that has some similarity to the above scenario, eg, if I’m serializing some data, transferring it somewhere, then deserializing it. But it also comes up often around legacy code. Eg, a workaround is introduced to deal with a bug in some library or system, then other code has to deal with the workaround. Eventually, the original bug is fixed or the system/library with the bug is no longer used. But the workaround remains and other code has to deal with it. Or some complicated part of the code was introduced to handle a particular integration or feature that was later removed. It’s easy for a developer who came onto the team later to spend a lot of time and effort improving that complicated code because they don’t realize that it’s actually no longer relevant and ought to just be removed completely. The common factors tend to be things like silos or handoffs between teams or developers, the interaction of multiple systems, and accretion over time with respect to changing requirements.
As a final thought, I do want to point out that while it’s an antipattern, there are also times when it is a perfectly valid option or even the best approach that can be realistically undertaken. Sometimes there are organization politics, legal restrictions, or market forces that prevent you from implementing the “right” solution. Or doing it “right” is a much harder, larger task and a little bit of automating the pain has a much better return on the investment. Other times, some basic automation can be a useful step towards the right solution, helping you get to a point where the right solution is more obvious. In any of those cases, I just think it’s worth acknowledging (and probably documenting) the tradeoffs you are making.
By anders pearson 30 Jan 2023
I thought I’d share my “simple” recipe for building reliable applications. Of course “simple” doesn’t mean “easy”, but I think this is the core philosophy behind how I approach developing reliable software.
Here is the process:
- A developer writes some code to make one improvement to the application.
- That developer deploys that code to production.
- That developer observes that new code in production to verify that it did what they expected and didn’t cause other problems.
- That developer takes what they’ve learned in step 3 and goes back to step 1.
That’s it.
Anything that deviates from that feedback loop or slows it down will ultimately make a system less reliable and more prone to incidents and outages.
One obvious caveat on this process is that the developer involved has to actually care about the reliability of the system. I’m kind of taking that as a given though. I guess it’s possible for a developer to not care about reliability, but every developer I’ve ever met has at least not actually wanted to cause an outage.
Outages and incidents usually come from more subtle deviations from that process.
One very common mistake that we, as a profession, often make is not having the same developer who wrote the code do the deployment and the observation. That often also includes a related mistake where more than one change is developed or deployed at once.
A commit or PR should have a single purpose. The more different things you try to do at once, the harder the code/PR will be to understand, the more likely it is that errors will be introduced, the harder it will be to thoroughly observe the behavior of all the different changes once it’s in production to verify that it’s actually correct, and the harder it will be to connect something learned in that process back to the actual cause and use that to inform the next improvement. If you’ve spent any time debugging code, this should be intuitive. Writing a thousand lines of code implementing a dozen different features before running anything pretty much guarantees a painful debugging session and missed bugs.
Implementing a single feature or even just a single branch or logical part of a feature and testing it before moving on to the next makes it much faster to locate the source of a problem and to have some confidence that each line of code does what you think it does. As experienced developers, I think most of us recognize that having a single clear purpose is important not just for the unit of work (PR) but for the unit of code as well (function, class, module, service, etc). The more different responsibilities a single piece of code has, the more likely it is to have bugs and the harder it is to understand and to work on.
Google’s CL author’s guide agrees on that point.
The CL makes a minimal change that addresses just one thing. This is usually just one part of a feature, rather than a whole feature at once. In general it’s better to err on the side of writing CLs that are too small vs. CLs that are too large.
The other part of that is probably a more common mistake, but is related. That’s when it’s not the same developer doing all of the steps. If Developer A writes some code, then Developer B (or worse, Ops Person B), deploys it, B is usually not in nearly as good a position as A to properly check that it’s working as expected. It’s not always the case, but usually A, having spent the time writing the code, has a better idea of what it’s supposed to do, what it should look like if it’s working correctly, and what edge cases are most likely to cause problems.
These two are commonly connected. Multiple developers each working on separate improvements get their code bundled together and all deployed in a batch. That pretty much always means that there are multiple changes deployed at the same time, which again makes it harder to reason about problems and interactions, creates more surface area that needs to be checked, and when a problem is found, makes it harder to trace to one change and makes it more complicated to revert just that change.
I’ve occasionally mentioned that I’m not really a fan of “staging” environments as they are often used. There are advantages to having a staging site, but the downside is that they often become a chokepoint where multiple changes get bundled together and then are later deployed to production together, invoking the two problems above. I’ve seen many production incidents that started when a bunch of different changes had been deployed to staging and verified there to varying degrees. Those then all got merged together and deployed to production. The developer merging and the developer deploying (probably different people) didn’t have a full understanding of all the different changes or how to verify them after the deploy. Unfortunately, this is a very common problem with staging environments. There are legitimate uses for a staging environment, but I think that they are often overused and their downsides need to be considered, especially if they are forming this kind of chokepoint.
You may have noticed that “testing” isn’t one of the steps in the process before deploying to production. There are a couple reasons for that.
First, I consider automated tests to be part of both the “write the code” and “deploy the code” steps. The process of writing code should almost always involve running tests locally. Really, a good test driven development workflow is just a mini version of the whole process above, except without a deploy step. You implement a single small piece of functionality at a time, verify its behavior the best you can, and then repeat. Step 2 of the process, “That developer deploys that code to production”, doesn’t mean that the developer manually copies code out to the production servers; it means that they initiate an automated deployment pipeline, either by clicking a “deploy” button somewhere or merging a PR to trigger the deploy. The deployment pipeline should be running tests, possibly at multiple levels (unit tests, integration tests, post-deployment smoke tests) and fail as soon as any of them fail.
A more controversial reason I didn’t explicitly include a testing step is that while I love tests, I actually don’t think they’re as directly important to the reliability of a site as the rest of the process. Good automated tests allow a developer who cares about the reliability of the site to make changes to the code and verify those changes more rapidly. They allow the deploy step to run more safely (so less time spent debugging or fixing broken deploys). My experience is that if a hypothetical Team A writes no tests at all but otherwise cares a lot about site reliability and follows the process above, the result will be a more reliable site than Team B, who write a lot of tests but implement large changes, don’t deploy those changes individually, deploy infrequently, and don’t focus on observing the changes in production. Team A might start off worse, but they’ll learn a lot more, have a deeper understanding of how their code actually runs in production, and be able to build something that’s more reliable in the long run.
Steps 3 and 4, where the developer who implemented a change closely observes that code in production and learns from it, are perhaps the key to the whole approach. This is why I tend to put so much emphasis on metrics, monitoring, logging, and observability tools. You usually can’t just see inside a running process in production, so you have to have tools in place to collect and process useful data. This is also why, while I can put a lot of those tools in place, at some point, the developers writing the code need to pick them up and use them. They are the ones who will be in the best position to know what data will help them verify that a feature is working like they expect or to understand what’s happening when it behaves differently in production than they expected. The developers will have assumptions about how often a given function is called, how long a database call should take to execute, which calls might fail, what sorts of values will be passed as parameters to a function, etc. Production traffic and end-user behavior often prove our assumptions wrong in startling ways. Uncovering those wrong assumptions quickly and correcting them is key to making a reliable site. One of the best things a developer can do is to cultivate a deep curiosity about what their code is really doing in production.
It’s also important to keep in mind that success is greatly affected by the speed of the whole process. Each time you go through it, you should learn something. The more times you go through it, the more you learn and the more you can improve.
If the process, or steps in the process, are slow or difficult, that limits how many times you can go through the cycle. A large codebase that takes a long time to navigate and slow tests make the development step slower. Slow deployment pipelines obviously make the deploy step take longer, but that step is also slowed by not having zero downtime deployments, forcing you to only be able to deploy during certain, infrequent windows (which also, again, makes it much more likely that you’ll end up deploying multiple changes at the same time). Not having good observability tooling makes it slower to verify the change in production. In addition to allowing fewer iterations, any slowness in the process also reduces reliability because the more time that passes between writing code and observing it in production, the more difficult it will be for the developer to really remember the full context of the change and properly verify it.
We often have additional steps in our workflow that serve purposes other than reliability and must be there. But we need to minimize their impact on the overall process. Eg, if you have compliance requirements (eg, SOC 2, PCI, ISO, etc.) you probably have to have code reviews for security. Even without compliance requirements, code reviews are good (though I would argue that their importance has always been less about catching bugs or improving reliability and more about ensuring that other developers are aware of or understand the changes and maintaining common standards for the codebase). But it’s very important that turnaround time on code reviews is kept short to avoid slowing down the entire process (of course, it’s equally important that we keep PRs small and single-purpose so the reviewers can do meaningful reviews quickly). It’s also important to lean on automation as much as possible to keep that part of the process fast and efficient.
Finally, it’s also worth mentioning that the importance of this process isn’t limited to just application code. When I’m working on infrastructure, my goal is to go through this whole cycle with Terraform config, ansible roles, etc.
This post has contained a lot of my opinions. I expect that not everyone will agree with me on many of these. I will note though that what I advocate for is pretty close to the recommendations that you will see in, eg, Charity Majors’ twitter feed, Dave Farley’s Modern Software Engineering or the DORA Report, which studies software development at a large scale. In particular, DORA are interested in what makes a “high functioning organization” and the four key metrics that they’ve found to be the most reliable predictors are 1) lead time for a change (how long it takes to get from an idea to that change being running in production and available to customers; shorter is better) 2) deploy frequency (how many times per day/month/etc. you deploy or release to end users; more often is better) 3) time to restore (if there’s an outage, how long does it typically take to fix; shorter is better) and 4) change fail percentage (what percentage of your changes/releases resulted in an incident; lower is better). The first two of those are obviously directly related to the approach I describe and I think an argument could be made that it helps with the latter two as well.
See also: my post about Ratchets.
By anders pearson 03 Dec 2022
I have a couple old related blog posts here: Continuously Deploying Django with Docker (2015) and Continuously Deploying Django with GitHub Actions (2019). They describe my approach (at the time) to deploying Django apps on a simple cluster of servers, achieving continuous deployment with zero downtime and some basic failover and scalability capabilities.
At the end of the first one, I say:
Nevertheless, what I’ve put together is basically a low rent, probably buggy version of a PaaS. I knew this going in. I did it anyway because I wanted to get a handle on all of this myself. (I’m also weird and enjoy this kind of thing). Now that I feel like I really understand the challenges here, when I get time, I’ll probably convert it all to run on Kubernetes or Mesos or something similar.
And, several years later, I feel like I should probably mention that yes, I did actually just convert it all to run on Kubernetes. That was actually quite a while ago, but I’ve been lazy about blogging.
Since I got a lot of positive feedback on the previous posts, I figure I ought to write up my Kubernetes setup as well in the hopes that people find that useful.
Let’s talk about Kubernetes really quickly first. I should stress that I am, by no means, a Kubernetes expert. I’ve gone through the exercise of setting up and running a cluster manually. As a learning exercise, I highly recommend doing that. If you’re actually going to run a Kubernetes cluster with production workloads I only recommend going that route if you plan to invest a significant amount of time in becoming a Kubernetes expert. Kubernetes has a reputation for complexity and it’s deserved when it comes to building, operating, and maintaining a cluster. Just deploying an application to a Kubernetes cluster that someone else is responsible for operating is actually quite simple though. These days, there are a number of options for managed Kubernetes clusters that you can just pay for. I have experience with GKE and DigitalOcean Managed Kubernetes and can recommend either of those. I’m sure the equivalent offerings from AWS, etc. are also fine; I just don’t have experience with them.
The setup I’ll describe for my personal apps here uses DigitalOcean Kubernetes. I’ll point out a few things that are specific to that, but most of it will be pretty generic.
For the most part, the underlying Django applications are still structured the same and packaged individually as Docker images pretty much exactly as described in the previous blog posts. The actual Dockerfiles, python dependencies, etc. have all been updated over the years to more modern approaches, but the end result is still an image that is more or less a black box that takes its configuration from environment variables (à la 12 factor apps). Kubernetes just runs docker containers, so I really didn’t have to change the apps to run them there.
Again, I’ll use my RSS feed reader app, “antisocial”, as the example and go through its configuration. That’s because it has both a web component as well as Celery workers and Celery Beat, which also need to run. That’s the most complicated one I have. Other apps that don’t need Celery or Celery Beat are basically the same, but even simpler than what I’ll show here.
My DigitalOcean Kubernetes cluster is pretty small, just three nodes with enough RAM to handle all the apps I want to run on them. Again, I aim to have at least two instances of the web component running behind a load balancer for basic failover and to allow for zero downtime deployments. With three nodes in my cluster, I want it to be able to keep serving traffic even if it loses one node. If this were for production workloads and not just my personal apps that don’t really need high availability, I’d set up a larger cluster with more redundancy. For my personal apps though, I need to keep the costs reasonable.
The apps all use a shared managed PostgreSQL instance (again, run by DigitalOcean) and serve static files via AWS S3 and CloudFront. So the Kubernetes cluster is just handling the actual Django web apps and Celery workers as well as a RabbitMQ instance that connects them.
Kubernetes has a number of abstractions that we’re going to have to look at. I won’t go into great detail on each, since there are better resources out there for learning Kubernetes. Instead, we’ll just look at how they are set up for this Django app and how they fit together.
Another quick note: Kubernetes has a bunch of features like namespaces and RBAC that let you secure everything and prevent applications running on the same cluster from accessing each other (accidentally or maliciously). Since this is all just my personal side project stuff, I’ve skipped as much of that as I can. If you are going to use Kubernetes for production, you really need to go learn that stuff and understand it first.
Pretty much everything in Kubernetes is just defined in a YAML file. You then (after setting up some auth stuff that I won’t cover) interact with your cluster by running kubectl apply -f some_file.yaml. Kubernetes reads the definition of whatever you are creating or updating in that file and updates the cluster to match it. Yes, you quickly get a bit overwhelmed with the amount of YAML involved and how verbose it can be, but you get used to it and the consistency of the interface is pretty nice.
The first thing we need for the app is a Service, which is just an abstraction for an application that will be running and accessible on the internal network. So, we have service.yaml like:
---
apiVersion: v1
kind: Service
metadata:
  name: antisocial
  labels:
    app: antisocial
spec:
  type: NodePort
  selector:
    app: antisocial
  ports:
    - port: 8000
      targetPort: 8000
      name: gunicorn
All that’s really doing when we run kubectl apply -f service.yaml is setting up some routing information to let the cluster know that there’s some “antisocial” application that will be exposing “gunicorn” on port 8000. We’re not actually running anything yet.
The next thing we need is configuration for our applications. In my old setup, I could deploy config files and put environment variables into Systemd/upstart configs as needed. Since you don’t typically have access to the servers in a Kubernetes cluster, you need a different approach. Kubernetes provides a ConfigMap abstraction, which is just a bundle of key/value pairs. We make one for the app in configmap.yaml:
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: antisocial-config
data:
  DB_HOST: "....db.ondigitalocean.com"
  DB_USER: "antisocial"
  DB_PORT: "25060"
  AWS_S3_CUSTOM_DOMAIN: "....cloudfront.net"
  AWS_STORAGE_BUCKET_NAME: "thraxil-antisocial-static-prod"
  ALLOWED_HOSTS: ".thraxil.org"
  HONEYCOMB_DATASET: "antisocial"
Any other settings that we need could be added there. Again, a simple kubectl apply -f configmap.yaml and it’s created in the cluster.
A bad approach would be to put sensitive data like passwords or the django secret key into the ConfigMap. It would work, but isn’t recommended. Instead, Kubernetes also has Secrets, which are very similar to ConfigMaps but, as the name implies, intended for secret and sensitive values. There are a bunch of different ways to set up and manage secrets. The approach I took was to have essentially an .env file called secrets.txt with just key/value pairs for the secrets like:
SECRET_KEY=....
DB_PASSWORD=....
AWS_ACCESS_KEY=...
AWS_SECRET_KEY=...
And so on. Then I do kubectl create secret generic antisocial-secret --from-env-file ./secrets.txt to load them into the cluster.
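If you want to double-check that both pieces of configuration made it into the cluster before wiring them up, kubectl can show you (describe deliberately prints the secret’s keys and sizes, but not its values):

kubectl get configmap antisocial-config -o yaml
kubectl describe secret antisocial-secret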
So, we have a Service, and configuration split into the secret and non-secret parts. Now we are ready to actually connect those up and run our application.
Kubernetes does that, along with some more information about the steps involved in spinning up your services, in a Deployment. It’s a bit of a weird abstraction at first, but it quickly became one of my favorite aspects of Kubernetes once I understood it. The Deployment defines the complete desired state of the application along with enough information for the Kubernetes cluster to figure out how to achieve that desired state no matter what state it starts out in.
Let’s start by just looking at the Deployment for the web app part (ie, gunicorn) without any of the Celery stuff:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: antisocial-app
  labels:
    app: antisocial
spec:
  replicas: 2
  selector:
    matchLabels:
      app: antisocial
  template:
    metadata:
      labels:
        app: antisocial
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - antisocial
                topologyKey: kubernetes.io/hostname
      containers:
        - image: <IMAGE>
          name: antisocial
          envFrom:
            - secretRef:
                name: antisocial-secret
            - configMapRef:
                name: antisocial-config
          ports:
            - containerPort: 8000
              name: gunicorn
The last part of that is actually a good place to start. That specifies the container that’s going to run, gives it the name antisocial, sets up an environment from the ConfigMap and Secrets that we defined, and tells Kubernetes that those containers will be exposing port 8000 with the name gunicorn, which lets it associate those container/ports with the abstract Service that was defined way back at the beginning. The image: <IMAGE> line we’ll come back to later.
The replicas: 2 line tells it to run two instances of this container. Then, the whole podAntiAffinity: block basically tells Kubernetes to do its absolute best to run those two instances on different physical nodes (the underlying servers). Having two instances running doesn’t help us much for failover if they’re running on the same node and that node goes down. “Anti-affinity” is Kubernetes’ way of letting you avoid that, while also letting Kubernetes otherwise have complete control over which containers run on which nodes without you having to micro-manage it.
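A quick way to confirm that the scheduler actually honored that preference is to list the pods along with the nodes they landed on:

kubectl get pods -l app=antisocial -o wide

The NODE column should show the two replicas on different hosts (unless the cluster was too constrained to satisfy it, since this is a “preferred” rather than “required” rule).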
The other wonderful thing about that config is that by default Kubernetes does a rolling deploy with zero downtime. When a new image gets deployed, it spawns instances of the new version and waits until they’re running before moving traffic over to them and only then shutting down the old ones. If you define health check endpoints in your containers, it will wait until the new ones are actually able to handle traffic. If you want to get really fancy, you can replace the health checks with more complicated checks that, eg, look at external metrics like error rates or latency and you can configure it to roll out the new version in small steps, only continuing if those metrics look good (ie, canary deploys).
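As a sketch of what that health-check wiring can look like (assuming the app exposed a health endpoint at something like /healthz, which is an assumption here rather than this app’s actual URL), the container spec in the Deployment just gains a readinessProbe block:

readinessProbe:
  httpGet:
    path: /healthz
    port: 8000
  initialDelaySeconds: 5
  periodSeconds: 10

Kubernetes then only routes Service traffic to a pod once that probe starts succeeding, which is what makes the rolling deploy genuinely zero downtime.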
The Celery Beat Deployment is very similar:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: antisocial-beat
  labels:
    app: antisocial-beat
spec:
  replicas: 1
  selector:
    matchLabels:
      app: antisocial-beat
  template:
    metadata:
      labels:
        app: antisocial-beat
    spec:
      containers:
        - image: <IMAGE>
          name: antisocial
          command: [ "/run.sh", "beat" ]
          envFrom:
            - secretRef:
                name: antisocial-secret
            - configMapRef:
                name: antisocial-config
The differences, aside from it being labeled “antisocial-beat” instead of just “antisocial”, are that it only has one replica (you don’t want more than one Celery Beat instance running at once), no ports are exposed, and it adds the command: [ "/run.sh", "beat" ] parameter, which tells the docker container to run the beat service instead of gunicorn.
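I haven’t shown run.sh itself here. Purely as a sketch of the idea (the real script differs; the module name and gunicorn arguments below are assumptions), it’s a small dispatcher that picks what to exec based on its first argument:

#!/bin/sh
# run.sh (sketch): decide which process this container should run.
set -e
case "$1" in
  beat)          exec celery -A antisocial beat -l info ;;
  worker)        exec celery -A antisocial worker -l info ;;
  migrate)       exec python manage.py migrate --noinput ;;
  collectstatic) exec python manage.py collectstatic --noinput ;;
  compress)      exec python manage.py compress ;;
  *)             exec gunicorn antisocial.wsgi:application -b 0.0.0.0:8000 ;;
esac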
It’s slightly more complicated with the Celery Workers:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: antisocial-worker
  labels:
    app: antisocial-worker
spec:
  replicas: 1
  selector:
    matchLabels:
      app: antisocial-worker
  template:
    metadata:
      labels:
        app: antisocial-worker
    spec:
      initContainers:
        - image: <IMAGE>
          name: migrate
          command: [ "/run.sh", "migrate" ]
          envFrom:
            - secretRef:
                name: antisocial-secret
            - configMapRef:
                name: antisocial-config
        - image: <IMAGE>
          name: collectstatic
          command: [ "/run.sh", "collectstatic" ]
          envFrom:
            - secretRef:
                name: antisocial-secret
            - configMapRef:
                name: antisocial-config
        - image: <IMAGE>
          name: compress
          command: [ "/run.sh", "compress" ]
          envFrom:
            - secretRef:
                name: antisocial-secret
            - configMapRef:
                name: antisocial-config
      containers:
        - image: <IMAGE>
          name: antisocial
          command: [ "/run.sh", "worker" ]
          envFrom:
            - secretRef:
                name: antisocial-secret
            - configMapRef:
                name: antisocial-config
It’s basically the same approach as Celery Beat, but adds the whole initContainers block. That defines containers that should be executed once at initialization. In this case, it runs “migrate”, “collectstatic”, and “compress” commands in sequence before starting the Celery worker, which then stays running.
YAML files can be concatenated together into a single file, so all three of those parts go into deployment.yaml. Running kubectl apply -f deployment.yaml will then actually bring everything together and give us a setup with the Celery worker and beat process running, and two gunicorn processes running on different nodes in the cluster. If a node in the cluster goes down, Kubernetes knows what was running on it and will do its best to replicate those containers to other nodes and update the internal network routing to send traffic to them. If containers crash, it will restart them to ensure that the desired number of replicas are always available. If the cluster is expanded, Kubernetes will spread the load out across the nodes as best it can.
Deploying to the cluster is ultimately done by running kubectl apply -f deployment.yaml after updating the image: <IMAGE> lines in the config to point to a new version of the docker image.
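As an aside, kubectl can also patch the image in place without editing the file at all:

kubectl set image deployment/antisocial-app antisocial=thraxil/antisocial:NEWTAG

(NEWTAG here just stands in for whatever tag was pushed.) I prefer rewriting the file and re-applying it, though, so that deployment.yaml stays the single source of truth, and that’s what the workflow below does.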
A simple GitHub actions workflow can do that:
on:
  push:
    branches: master
name: deploy
jobs:
  buildDockerImage:
    name: Build docker image
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@master
      - name: Build docker image
        run: docker build -t thraxil/antisocial:${{ github.sha }} .
      - name: docker login
        run: docker login -u $DOCKER_USERNAME -p $DOCKER_PASSWORD
        env:
          DOCKER_USERNAME: ${{ secrets.DOCKER_USERNAME }}
          DOCKER_PASSWORD: ${{ secrets.DOCKER_PASSWORD }}
      - name: docker push
        run: docker push thraxil/antisocial:${{ github.sha }}
      - name: Update deployment file
        run: TAG=$(echo $GITHUB_SHA) && sed -i 's|<IMAGE>|thraxil/antisocial:'${TAG}'|' $GITHUB_WORKSPACE/deploy/deployment.yaml
      - name: Install doctl
        uses: digitalocean/action-doctl@v2
        with:
          token: ${{ secrets.DIGITALOCEAN_ACCESS_TOKEN }}
      - name: Save DigitalOcean kubeconfig with short-lived credentials
        run: doctl kubernetes cluster kubeconfig save --expiry-seconds 600 k8s-1-20-2-do-0-nyc1-....
      - name: Deploy to DigitalOcean Kubernetes
        run: kubectl apply -f $GITHUB_WORKSPACE/deploy/deployment.yaml
      - name: Verify deployment
        run: kubectl rollout status deployment/antisocial-app
The beginning of that is the same as before: whenever we merge to master, it builds the docker image (which runs the unit tests), tags it with a git SHA1, and pushes that up to the Docker Hub. A small new step is that it then also does a quick sed to replace all occurrences of <IMAGE> in the deployment.yaml file with the newly created and pushed docker image + tag.
Then it’s a little bit DigitalOcean specific where it authenticates to my cluster.
The actual deploy happens with kubectl apply -f ... deployment.yaml. The next step, kubectl rollout status deployment/antisocial-app, isn’t strictly necessary but is a nice feature. That just waits until the cluster reports back that the deployment has succeeded. If the deployment fails for some reason, the GitHub Actions workflow will then be marked as a failure and it’s much more noticeable to me.
There’s a final piece that I do need to mention. What’s been covered above gets the application running, but at some point, you need to actually expose it to the rest of the internet on a public interface. Kubernetes refers to this as an Ingress. There are a ton of different ways to do Ingress. It’s common to use Nginx, Traefik, Caddy, HAProxy, or other common reverse proxies or load balancers for ingress, especially if you are running your own cluster. For managed clusters, it’s also common for providers to make their own managed load balancer services available as Kubernetes Ingress. Eg, AWS ELB, GCP HTTPS Load Balancers, etc. DigitalOcean does the same with their load balancer. If you’re using a managed Kubernetes cluster, you probably also want to use the managed load balancer. I’m not going to cover my ingress setup here because it’s specific to the DigitalOcean setup and I will instead just recommend that you follow the instructions for your chosen provider. One nice aspect of pretty much all of them is that they make SSL certificates pretty seamless to manage. If the provider’s ingress doesn’t handle certificates itself, you can use cert-manager with your Ingress. I’ve found it much easier to deal with for Letsencrypt certificates than the standard certbot approach.
Finally, if you’ve read this far, I have a secret for you: I already don’t use this setup. Yeah, by the time I got around to writing up my Kubernetes setup, I’d already moved on to a different approach that I like even better for my personal side projects. I’ll try to be faster about writing about that setup, but no guarantees. In the meantime, I do still recommend a similar approach to Kubernetes, especially if you want that failover and scalability. I’ve barely scratched the surface here of what Kubernetes is capable of as a platform that you can build on. I know it’s an intimidating amount of abstract concepts and YAML when you first encounter it. But compared to a similarly (honestly, much less) capable setup like the previous one I had with VMs running docker and systemd, custom configuration management tools, a consul/etcd cluster, registrator, consul-template, and a bunch of shell scripts for deployment, it really is simpler to deal with. It’s pretty amazing to go into the cluster management console, delete a node, and watch Kubernetes automatically move things around to the remaining servers without dropping any traffic.
By anders pearson 26 Nov 2022
One of my favorite all-purpose engineering tools (maybe it’s more accurate to call it a “technique”, but I’m going to stick with “tool” here) is the ratchet.
In my career, I’ve had the benefit of some very long tenures at different organizations. I and teams I’ve worked on have launched new greenfield projects, I’ve maintained some codebases for more than a decade, I’ve done big bang rewrites and piecemeal migrations. I’ve worked with experienced and talented developers and complete newbies. I’ve also inherited a lot of code and systems. Some of those have been well designed, well tested, and well documented. Others have been… not so much those things.
I’m not dogmatically against rewrites. Sometimes that’s the appropriate solution. Often though, it’s not practical or feasible to rewrite a large codebase or existing system even if it’s in terrible shape. It needs to be improved in place. The thing with systems that are already in bad shape is that making changes is risky. The larger the change, the riskier. It’s often clear that the current state of things is bad, but you don’t know exactly what “good” would look like or how to get there from where you are.
This is where the ratchet comes in.
A ratchet has two parts:
- any small change that improves the codebase or the system in some way.
- some safeguard that locks that change in place.
Fix a bug? Add a regression test to make sure the bug stays fixed. No automated tests at all? Add a “dummy” test suite that runs zero tests. Obviously that won’t catch any bugs by itself, but it should be low risk to introduce (you’re just adding test harnesses) and when you do start adding tests, you’ll have the scaffolding there to fit them into. Set up a commit hook or Github action to run the dummy test suite. Again, it shouldn’t introduce any risk but will get everyone accustomed to seeing tests pass as part of the development cycle. I’ve seen dummy test suites like this catch syntax errors or broken imports in code just by virtue of ensuring that the code is at least parsed and compiled before getting pushed out to production (we’ve all seen developers make “just a tiny change” and push without even running it locally; if we’re honest, most of us have done that ourselves).
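As a concrete sketch of that last part (the hook below is hypothetical, not from any particular project), even a bare pre-commit hook running an empty unittest suite gives you the scaffolding and forces the code through at least an import:

#!/bin/sh
# .git/hooks/pre-commit (sketch): run the (currently empty) test suite.
# Once tests exist, or even just import the application modules, this
# starts catching syntax errors and broken imports too.
python -m unittest discover -s tests -v || exit 1

This assumes a tests/ package exists, even if it contains no test cases yet; the same command works as a CI step.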
Is the code all over the place in terms of conventions? Add a simple linter tool (eg, flake8 or eslint). Most of them will let you enable/disable different rules. If you need to, start out by disabling every single rule so that it’s not actually checking anything, but add it to the commit hooks or CI setup. Then you can enable one rule at a time later on as you gain confidence that they aren’t breaking anything. Each rule that gets enabled prevents the codebase from ever having that problem again. Eventually, you might make enough progress that you’re comfortable switching to an automatic formatter like black or go fmt or similar.
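For example (a sketch, not a real configuration), with flake8 the starting point can be a CI command that ignores every rule family, with prefixes removed from the list one at a time as the code gets cleaned up:

# Step 1: the linter runs in CI but flags nothing.
flake8 --ignore=E,W,F,C .
# Later: stop ignoring one family (or one specific code) at a time,
# e.g. drop F so undefined names and unused imports become errors.
flake8 --ignore=E,W,C .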
Is a deploy process manual, slow, and error prone? Write a Runbook entry documenting it as well as you currently understand it. Then start automating parts of the runbook. Add a simple end to end “smoketest” at the end of the deploy to verify that the deploy was successful. Before you know it, you’ll have a completely automated deployment process.
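Even the smoketest can start out nearly trivial. A hypothetical version (the URL is made up) that just fails the deploy when the site stops answering is already a ratchet:

#!/bin/sh
# smoketest.sh (sketch): run at the end of a deploy; a non-2xx/3xx
# response or a timeout makes curl exit non-zero and fails the deploy.
curl --fail --silent --show-error --max-time 10 \
    https://www.example.com/health/ > /dev/null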
None of these are revolutionary ideas. I just find it useful to think in terms of this “ratchet” mechanism when I’m improving a codebase, a system, or even a team’s process. Make lots of small steps and make it easier for the system to naturally move towards a “better” state than a worse one. At some point the system dynamics take over and become self-reinforcing.
By anders pearson 30 Dec 2019
For the last few years, at the end of the year, I’ve been posting my roundup of new music for the year. I’m a little bored with that for now and thought that this time, I’d list the books I read this year and maybe some brief thoughts on them.
According to my Amazon order history, the first book I bought in 2019 was Hacking: The Art of Exploitation, 2nd Edition. I’d generally avoid anything with the word “hacking” in the title, but somehow this was recommended to me and I’m glad it was. It does a surprisingly good job of explaining basic exploitation techniques like buffer overflows, format string vulnerabilities, and shell code. The kind of stuff I used to read about in text files on usenet but never really got too far into. It reminded me that I really haven’t thought much about assembly and machine code since university. Back in those days, my exposure was mostly writing MIPS and Z-80 assembly and I never really bothered with x86. That led me to pick up Assembly Language Step-by-Step which isn’t terribly modern, but does a good job covering x86 assembly. Wanting to shore up the connections from assembly all the way to higher level languages, I also read Low-Level Programming: C, Assembly, and Program Execution on Intel® 64 Architecture, which was excellent. Its coverage of assembly isn’t as thorough, but it gets much more into those weird bits of compiler and linker magic that I’ve long avoided having to deal with. Later in the year, I ended up getting The Apollo Guidance Computer: Architecture and Operation and I absolutely love it. I’m an old space nerd (aren’t we all) and this is a fascinating look at the computer guidance system on the Apollo, including the hardware, the software, and some math and physics tricks they used to pull everything together.
I stumbled on Introduction to the Theory of Complex Systems at a bookstore and couldn’t resist it. It goes into network theory, evolutionary algorithms, statistical mechanics, and ties together ideas from mathematics, physics, biology, and social sciences. If you know me, you know that I’d be all over that. Afterwards, I wanted more and I’m pretty sure that The Road to Reality: A Complete Guide to the Laws of the Universe came up as a recommendation somehow. If you didn’t know, I was a physics major for most of my undergrad before switching to electrical and then computer engineering. This is the book that I wish I’d had when I was struggling through my physics classes. It doesn’t go deep into any of it, but Penrose explains so many of the concepts with a clarity that was definitely missing from my classes and textbooks. I’m not sure it would be that understandable without at least some basic college physics and math background, but if you ever took a general relativity or quantum mechanics class and felt like you didn’t really get it, this is definitely worth checking out.
That kind of got me thinking about those concepts in those physics classes that I’d learned to use but never really felt like I understood. As a physics major, you don’t necessarily have to take a lot of math classes, and there are a lot of things that get introduced in your physics classes as tools but aren’t really explained that well. Back when I was in school, we didn’t have Wikipedia or Youtube or Amazon (and I certainly didn’t have disposable income to spend on expensive math books that weren’t required for my classes) so unless you took a class on a topic or knew someone with that expertise, it was hard to fill in those gaps. By the end of my physics classes, I remember feeling like I was solving problems by pattern matching and mechanically applying memorized solutions and could get the right results but I no longer really knew what was going on. A few of the important ones that I definitely missed out on were tensors and topology. We used tensors plenty in physics classes, but I remember mostly just thinking of them as multi-dimensional matrices and a set of mechanical rules that you used to manipulate subscripts and superscripts in a way that let you do higher dimensional calculations without having to write out a million terms. Every once in a while a professor would do something weird with them and we’d just have to take it on faith that it was valid, but it always left me feeling like there was a lot more that I didn’t understand.
This realization that I could now fill in those gaps led me down another rabbit hole of math books that occupied quite a bit of my year. I started with some youtube videos and various online resources that helped a lot. Then I picked up Tensor Calculus because some of those Dover books are inexpensive hidden gems. This one wasn’t. It might be fine if you are more of a mathematician, but I’m still fundamentally coming at things from a physics/engineering perspective and not that interested in proofs and derivations. The one that was much better for me was An Introduction to Tensors and Group Theory for Physicists. The preface basically described my exact situation of “lingering unease” around the concept and does a great job of filling in the blanks. It also made connections to group theory that I hadn’t really thought about. Group Theory comes up more often in computer science and is something I feel more comfortable with but it did remind me that I never actually took an Abstract Algebra class. I was also vaguely aware that despite seeming like it should be unrelated, Topology is pretty heavily based on Abstract Algebra and that was another topic that I wanted to fill in. So I got A Book of Abstract Algebra. That one I think does fall into the “hidden gems” category of Dover books. Highly recommended. I followed it up with Introduction to Topology which was… ok. It was clear enough and I could follow it, but it was pretty dry and I don’t feel like I gained any real insights from it. I didn’t come away from it thinking that Topology was amazing and having a different perspective on things. I just kind of feel like now I know a bunch of definitions and theorems. I also got Counterexamples in Topology which was highly recommended and I can see being a valuable reference, but it’s really not one that you just sit down and read cover to cover. Somewhere in the middle of my Dover math books spree, I also read Geometry, Relativity and the Fourth Dimension which was pretty shallow but a very quick and fun read.
The last math book I picked up that’s worth mentioning is Mathematical Methods for Physics and Engineering: A Comprehensive Guide. This is another book that I wish had existed when I was an undergrad. It’s big and extremely thorough. It basically covers all of the math you would need for an entire physics or engineering undergrad education. I really can’t think of a single mathematical concept, tool, or technique that I encountered in my undergrad physics and engineering that isn’t covered in it. For any given topic, there’s probably a better introduction or a more thorough treatment in some other book, but I’ve never seen any other book with the same breadth. It’s become my goto math reference and lives on a shelf by my desk now.
As a “palate cleanser” during my math refresher, I read The Poetics of Space which has been on my recommendations list for years and come up in numerous conversations. I don’t feel like I got as much out of it as I could have because I’m just not that familiar with the french poetry that he bases his discussions off of. I’m sure it’s one of those books that really needs to be read in its original French, but even in translation, it’s beautifully written and evocative.
I’ve been vegan for a while but only recently thought to actually read anything on the topic. Why We Love Dogs, Eat Pigs, and Wear Cows: An Introduction to Carnism is probably the best introduction to the ethical and moral philosophy behind veganism and animal rights. The Sexual Politics of Meat: A Feminist-Vegetarian Critical Theory makes a strong case for veganism as an important part of intersectional feminism. It was written in the 90’s though and both veganism (eg, at the time “vegan” wasn’t a commonly used term, so it only talks about “vegetarianism”) and feminism have changed quite a bit since then, so it needs to be read with that in mind. One chapter jumps right into fairly graphic descriptions of sexual violence without any warning, so that’s something to be aware of. On a completely different axis, How Not To Die: Discover the foods scientifically proven to prevent and reverse disease, while not explicitly “vegan”, is a great scientific look at nutrition from the doctor behind nutritionfacts.org and supports a vegan diet as at least a good starting point for avoiding many of the diseases and causes of mortality that plague the modern world.
I’ve been living in London for a few years now and in 2020, Phoenix and I become eligible for Indefinite Leave to Remain. Our main obstacle is passing a test that involves knowledge of British history, culture, and government. There’s an official guide, Life in the United Kingdom, that covers everything that could be on the test along with practice tests, and I’m pretty sure I could pass the test by cramming those for a bit. But I’ve got plenty of time and if I’m going to be living here long term, I figure I might as well know a bit more. Also, if you hadn’t noticed, the last few years have been pretty eventful in British politics and have exposed much of the world to the idiosyncratic and confusing way that very important decisions are made over here. In an effort to understand how these things work and the historical context behind them, I have been going through some “very short introduction” books to build up some background knowledge. So far I’ve gone through The British Empire, Nineteenth Century Britain, Twentieth Century Britain, The British Constitution, and British Politics. I’ll probably read a bunch more before I’m done.
Those books all fit into various themes or categories. In between them were also a bunch that were pretty random.
Finally, the one piece of fiction I managed to read this year (to be fair, I read a lot of fiction in 2018, so I was taking a bit of a break on purpose) was Into the Dark (Dark Devices Book 1), which happens to have been written by my coworker. He wrote it for NaNoWriMo and self-published, which I strongly support. It’s not very long and leaves the story set up for future continuation, but it’s a nice fantasy read and reminds me of a slightly sci-fi’d up take on old D&D Underdark books.
By anders pearson 12 May 2019
[Edit 2019-09-21: Updated with the new YAML syntax]
A couple years ago, I wrote a post covering how I set up a continuous deployment pipeline for my personal Django apps using Docker and Jenkins.
A lot has changed since then so I thought it was about time I updated. Especially since this weekend I switched from Jenkins to GitHub Actions.
So, a quick disclaimer before I go on: GitHub Actions is currently in public beta. They have been clear that it is not yet considered stable for production and comes with no warranty. It could change and break my setup at any time. They also haven’t released any information about what it will cost when it is fully out. These are my personal apps so I’m fine with all of that. I’ll try to remember to update this post when the beta is over, but in the meantime, use it at your own risk.
My old post covered the basic setup and, honestly, the general approach hasn’t changed much; I’ve just been swapping out the pieces. It’s worth reading the old post for more details, but I’ll recap the basic approach here so the rest of this post can just cover the new GitHub Actions stuff.
- I’m still packaging Django apps up in Docker images. My Dockerfile has changed, but these days there are a million articles on how to run Django in Docker, so I won’t cover those changes.
- I have three application servers where the Django apps run. Each app gets deployed to at least two of them so there is redundancy. This means that a server can fail or go down for maintenance without the site going down. For some of them, there are also Celery Worker and Celery Beat services running to handle offline tasks. There’s an nginx proxy in front of that setup proxying back to the individual Django apps. Consul, consul-template, and registrator are used to dynamically adjust the proxying setup so everything is handled smoothly when a server goes down or when an application instance is added or removed. This part still works basically the same so see the old post if you want more details on how that works.
- My servers are all managed with Salt, including the production settings and secrets. The Docker containers get those settings in environment variables. Those variables are put in place by Salt and handed to the containers via a couple shell scripts. Again, that part is all unchanged and you can get more details in the old post. What’s relevant here is that there is a docker-runner script on each server that will run an application’s container with all of the production settings injected and let me basically do manage.py commands in those containers. So docker-runner myapp migrate is equivalent to starting up the thraxil/myapp:$TAG container (where $TAG is specified in a designated file and is generally a git hash) with production settings and running manage.py migrate in the container. (A rough sketch of the idea follows this list.)
- Static files are hosted on Amazon S3 and CloudFront. Again, this is pretty typical for Django deployments and is better covered elsewhere. All you need to know for my setup is that manage.py collectstatic and manage.py compress need to be called during the deployment to push the latest static files up.
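For reference, the docker-runner idea boils down to something like the following sketch (the real script lives in my Salt config and is covered in the old post; the paths and env-file location here are assumptions):

#!/bin/sh
# docker-runner (sketch): run a one-off manage.py command for an app
# inside its production image, with production settings injected.
APP=$1; shift
TAG=$(cat /var/www/$APP/TAG)   # the deploy step writes the image tag here
exec docker run --rm --env-file /etc/django/$APP.env \
    thraxil/$APP:$TAG manage.py "$@"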
The old approach to deployment was that I ran a Jenkins server alongside my app servers. When there was a push to the master branch on GitHub, it would trigger a build there, which would do the following:
- Build a new docker image for the app. I like to include running unit tests as part of the Dockerfile. That ensures that the tests pass in the exact environment with all the same dependencies as the code will have in production.
- Tag that image and push it to the Docker Hub.
- Do a docker pull of that exact tag on all of the app servers.
- Write the tag out to the right place on those servers so docker-runner will use that version of the image.
- On one of the servers, run docker-runner myapp migrate, docker-runner myapp collectstatic, and docker-runner myapp compress to handle database migrations and static files.
- One by one, restart the processes on the app servers.
Jenkins used to run those steps with a fairly small shell script described in the old post. At some point, I converted that to a “proper” Jenkins pipeline specified in a Jenkinsfile. I never wrote about that, and I’m replacing it now, but it’s up here if you are curious. It did some things better than the shell script version and made for a much nicer overall experience in the Jenkins web interface, but mostly it made me glad that I don’t often have to code in Groovy.
Jenkins worked OK for me but it’s always been a pretty awkward part of the stack and not a lot of fun to run and keep updated.
So when Github Actions came out and I got access to the public beta, I decided to see if I could replace my Jenkins setup.
What I came up with seems to work pretty well.
Actions for a project are stored as code in a .github directory in your project. There’s a nice web UI for editing actions, but it’s worth looking at the code. My old blog post covered the deployment of my antisocial feed reader app, so for consistency, let’s look at how the GitHub Actions setup looks for it.
I have two workflows. The first just sets up and runs Jessie Frazelle’s branch-cleanup-action GitHub action, which keeps things tidy by deleting merged branches. She has a great blog post on how it works that helped me start to get my head around Actions.
The main deploy workflow, in .github/workflows/deploy.yml, starts with:
on:
  push:
    branches: master
name: deploy
jobs:
  buildDockerImage:
    name: Build docker image
    runs-on: ubuntu-latest
    steps:
The first step (actually technically two steps) that runs is:
- uses: actions/checkout@master
- name: Build docker image
  uses: actions/docker/cli@master
  with:
    args: build -t thraxil/antisocial:${{ github.sha }} .
That builds the docker image, tagged with the git SHA, which GitHub Actions conveniently exposes in the github.sha variable. As I mentioned before, with my Dockerfiles, the unit tests run during the build, so this step also serves as a good check on PRs.
The next two are fairly self-explanatory:
- name: docker login
  uses: actions/docker/login@master
  env:
    DOCKER_PASSWORD: ${{ secrets.DOCKER_PASSWORD }}
    DOCKER_USERNAME: ${{ secrets.DOCKER_USERNAME }}
- name: docker push
  uses: actions/docker/cli@master
  with:
    args: push thraxil/antisocial:${{ github.sha }}
They log us into the Docker Hub and push the docker image there. The only new bit is that the login step uses the secrets field to grant itself access to some secret settings that are stored in the GitHub project settings.
The bulk of the deploy work (that I had to do, at least) is in the next stanza:
- name: deploy
  uses: thraxil/django-deploy-action@master
  env:
    APP: antisocial
    KNOWN_HOSTS: ${{ secrets.KNOWN_HOSTS }}
    PRIVATE_KEY: ${{ secrets.PRIVATE_KEY }}
    SSH_USER: anders
    WEB_HOSTS: ${{ secrets.WEB_HOSTS }}
    CELERY_HOSTS: ${{ secrets.CELERY_HOSTS }}
    BEAT_HOSTS: ${{ secrets.BEAT_HOSTS }}
That uses a custom action that I placed in its own repo so I could easily re-use it across my projects.
The great thing about GitHub Actions is that they are just Docker containers that get some specific environment variables and shared directories set up and run how you need them. GitHub Actions will find the Dockerfile in there, build it (if it hasn’t already been cached), and run it in the appropriate environment.
That means that it’s easy to package up pretty much any common deployment tool you can think of as a GitHub Action (if someone else hasn’t already done it) or just put your own together if you know how to make a Docker image and write a little shell script.
If I were doing this from scratch, or if it were a more complicated deployment process, I’d probably grab or build an Ansible action and do the rest of the deployment that way. In my case, though, it’s only a couple of steps and I pretty much already had a shell script written (see the old post). So I just made an action that runs that script in a container, with some tweaks to make it work in the new environment.
The Dockerfile is pretty minimal. It just builds off `debian:stable-slim` (which GitHub highly recommends, to keep images small and to share a common base that maximizes caching), installs `openssh-client` (the only package we need that isn’t already there), and drops in a small script.
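In outline, it’s not much more than this (a sketch rather than the exact file; the script name is just a placeholder):

```
# sketch of the action's Dockerfile; the real one lives in
# thraxil/django-deploy-action
FROM debian:stable-slim
RUN apt-get update \
    && apt-get install -y --no-install-recommends openssh-client \
    && rm -rf /var/lib/apt/lists/*
COPY deploy.sh /deploy.sh
RUN chmod +x /deploy.sh
ENTRYPOINT ["/deploy.sh"]
```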
That script should look familiar from the old blog post. It just does a few additional things to deal with the GitHub environment variables and to set up a valid SSH config. All of the variables that it needs are either set via `env` in the workflow config above or stored in the project settings as secrets (SSH private key, etc.).
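Roughly, it looks like this (a sketch, not the real script: the host-list format and the restart command in particular are stand-ins here):

```
#!/bin/sh
# sketch of the deploy script: turn the env vars from the workflow into
# usable SSH credentials, then run the same steps the old shell script ran
set -e

mkdir -p ~/.ssh
echo "$PRIVATE_KEY" > ~/.ssh/id_rsa
chmod 600 ~/.ssh/id_rsa
echo "$KNOWN_HOSTS" > ~/.ssh/known_hosts

# run migrations and static file handling once, on the first web host
# (assumes a comma-separated host list)
FIRST=$(echo "$WEB_HOSTS" | cut -d, -f1)
ssh "$SSH_USER@$FIRST" "docker-runner $APP migrate && docker-runner $APP collectstatic && docker-runner $APP compress"

# then restart the app processes one host at a time
# (the restart command here is a stand-in for whatever the old post used)
for h in $(echo "$WEB_HOSTS,$CELERY_HOSTS,$BEAT_HOSTS" | tr , ' '); do
    ssh "$SSH_USER@$h" "sudo systemctl restart $APP"
done
```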
Finally, since I like to use Sentry to track exceptions, and Sentry does a better job if you tell it when you deploy new code, I use a community-published action to publish a new Sentry release for the project:
```
- name: sentry release
  uses: juankaram/sentry-release@master
  env:
    ENVIRONMENT: production
    SENTRY_AUTH_TOKEN: ${{ secrets.SENTRY_AUTH_TOKEN }}
    SENTRY_ORG: thraxil
    SENTRY_PROJECT: antisocial
```
That’s basically it. I’m mostly pleased with the setup. It took me a few hours to figure out all the pieces and work through some stupid bugs (my own) to get it working how I wanted, but now it’s pretty solid.
Like I said earlier, I really like that it works by just stringing together Docker containers. That means there’s never any question about whether GitHub Actions will support some tool: if you can stick it in a Docker image, you can use it. Configuration is straightforward once you’ve spent some time with it (and the web UI is surprisingly usable and powerful).
I feel like I’m only scratching the surface of what it’s capable of (I’m not even running anything in parallel). The more interesting uses will be less of this “traditional” kind of deployment pipeline and will take better advantage of the Actions’ direct access to the rest of GitHub’s APIs. Right now it feels like the community is still figuring out what those possibilities are and I’m excited to see what patterns emerge.
By anders pearson 24 Feb 2019
The #shelfie hashtag has been coming up lately. I’m bad at social media, but I thought I’d post mine here.
I moved from the US to Europe a few years ago and had to massively cut down my book collection. In the last few years, I’ve started accumulating books again, but I’ve been a bit more purposeful this time around. So most of the books on my shelves are ones that I decided were important enough to bring across the ocean with me or that I’ve wanted to have for reference since the move. Fiction and “lighter” non-fiction that I’m only going to read through once, I try to buy in electronic formats so it doesn’t take up precious space in my flat.
My books are vaguely organized though not totally consistently. The top shelf here is the really random stuff, including the few bits of fiction I have in physical form, Dutch language books (that’s “Charlie and the Chocolate Factory” in Dutch), and of course, one of my small paintings in the front.
The middle shelf is sort of C++, Linux, Networking, and low level systems stuff.
Bottom is mostly Cloud, DevOps, and SRE related.
These shelves are right below a window so I had a hard time getting a clearer photo without it washing out a bit.
Top shelf here is focused more on CS and distributed systems, with some classics like SICP, CLRS, the Dragon Book, CTM, etc.
Bottom shelf is the fun one with my weird art books. The foil-wrapped one is the NASA Graphics Standards Manual. The old-looking one next to it is an 1800s edition of Elihu Vedder’s illustrated version of the Rubaiyat of Omar Khayyam.
Top here is math and random programming books that didn’t fit on the other shelves.
The bottom shelf has a few electronics books and random large ones that won’t fit elsewhere. The filing box is mostly full of printed out CS papers.
By anders pearson 14 Jan 2019
It took me a little longer than I’d planned, but here is my yearly music roundup for 2018, just as I’ve done for 2017, 2016, and 2015.
As usual, this isn’t exhaustive and only includes bandcamp links. I have no commentary other than that I enjoyed these. My tastes run towards weird, dark, loud, and atmospheric. Enjoy.
By anders pearson 02 Jan 2018
I posted my yearly music roundup yesterday, which I’ve done for
the last three years. Today I thought I’d just take a moment
to explain how I go about creating those posts. Eg, there are 165
albums in the last post and I link both the artist and album pages on
each. Do you really think I manually typed out the code for 330 links?
Hell no! I’m a programmer, I automate stuff like that.
First of all, I find music via a ton of different sources. I follow
people on Twitter, I subscribe to various blogs’ RSS feeds, and I hang
out in a bunch of music related forums online. So I’m constantly
having new music show up. I usually end up opening them in a new tab
until I get a chance to actually listen to them. Once I’ve listened to
an album and decided to save it to my list, my automation process
begins.
I’m a longtime emacs user, so I have a capture template set up for emacs org-mode. When I want to save a music link, I copy the URL in the browser, hit one keyboard shortcut in emacs (I always have an emacs instance running), paste the link there, and type the name of the artist. That appends it to a list in a text file. The whole process takes a few seconds. Not a big deal.
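The template itself is nothing fancy; something along these lines would do it (this is a reconstruction rather than my exact config, and the key, headline, and file path are placeholders):

```
;; a minimal org-capture template for saving a music link:
;; %x pulls in the URL from the clipboard, %? is where the cursor
;; lands so I can type the artist name
(setq org-capture-templates
      '(("m" "music link" entry
         (file+headline "~/org/music.org" "2017 albums")
         "* %? - %x")))
```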
At the end of the year, I have this text file full of links. The first
few lines of last year’s file look something like this:
```
** Woe - https://woeunholy.bandcamp.com/album/hope-attrition
** Nidingr - https://nidingrsom.bandcamp.com/album/the-high-heat-licks-against-heaven
** Mesarthim - https://mesarthim.bandcamp.com/album/type-iii-e-p
** Hawkbill - https://hawkbill.bandcamp.com/track/fever
** Black Anvil - https://blackanvil.bandcamp.com/album/as-was
** Without - https://withoutdoom.bandcamp.com/
** Wiegedood - https://wiegedood.bandcamp.com/releases
```
For the first two years, I took a fairly crude approach and just recorded an ad-hoc emacs macro that would transform, eg, the first line into some markdown like:
```
* [Woe](https://woeunholy.bandcamp.com/album/hope-attrition)
```
Which the blog engine eventually renders as:
```
<li><a href="https://woeunholy.bandcamp.com/album/hope-attrition">Woe</a></li>
```
A little text manipulation like that is a really basic thing to do in emacs. Once the macro is recorded, I can just hit one key over and over to repeat it for every line.
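If you’d rather have it as a command than a recorded macro, the same transformation is a few lines of elisp (a sketch; the function name is made up):

```
;; turn lines like "** Artist - URL" into "* [Artist](URL)"
;; (equivalent to the recorded macro, written as a throwaway command)
(defun music-org-to-markdown ()
  (interactive)
  (goto-char (point-min))
  (while (re-search-forward "^\\*\\* \\(.+?\\) - \\(http.+\\)$" nil t)
    (replace-match "* [\\1](\\2)")))
```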
Having done this for three years now though, I’ve noticed a few
problems, and wanted to do a little more as well.
First, you’ll notice that the newest post links both the artist and the album, despite the fact that I only captured the album link originally.
Second, if you look closely, you’ll notice that not all of the bandcamp links are quite the same format. Most of them are `<artist>.bandcamp.com/album/<album-name>`, but there are a few anomalies like `https://hawkbill.bandcamp.com/track/fever` or `https://withoutdoom.bandcamp.com/` or `https://wiegedood.bandcamp.com/releases`. The first of those was a link to a specific track on an album; the latter two both link to the “artist page”, but if an artist on bandcamp only has one album, that page displays the data for that album. Unfortunately, that’s a bad link to use: if the artist adds another album later, it changes. Some of the links in my old posts were like those and now just point at the generic artist page.
So, going from just the original link that I’d saved off, whatever type it happened to be, I wanted to be able to get the artist name, the album name, and a proper, long-term link for each.
I’ve written some emacs lisp over the years and I have no doubt that
if I really wanted to, I could do it all in emacs. But writing a
web-scraper in emacs is a little masochistic, even for me.
The path of least resistance for me probably would’ve been to do it in Python. Python has a lot of handy libraries for that kind of thing and it wouldn’t have taken very long.
I’ve been on a Go kick lately though, and I ran across
colly, which looked like a pretty
solid scraping framework for Go, so I decided to implement it with
that.
First, using colly, I wrote a very basic
scraper for bandcamp
to give me a nice layer of abstraction. Then I threw together a real
simple program using it to go through my list of links, scrape the
data for each, and generate the markdown syntax:
The code is in [this gist](https://gist.github.com/thraxil/8d858eddd187d8a7a9edb91057d7ee72).
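If you just want the general shape of it, this sketch captures the idea (it isn’t the exact code from the gist: it uses colly directly instead of my scraper package, the selector and `og:title` format are guesses at bandcamp’s markup, and the output format is simplified):

```
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"

	"github.com/gocolly/colly"
)

func main() {
	// the org file full of "** Artist - URL" lines
	f, err := os.Open("links.org")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		line := strings.TrimSpace(strings.TrimPrefix(scanner.Text(), "**"))
		parts := strings.SplitN(line, " - ", 2)
		if len(parts) != 2 {
			continue
		}
		link := strings.TrimSpace(parts[1])

		c := colly.NewCollector()
		c.OnHTML(`meta[property="og:title"]`, func(e *colly.HTMLElement) {
			// assumes og:title looks like "Album Title, by Artist Name"
			pieces := strings.SplitN(e.Attr("content"), ", by ", 2)
			if len(pieces) != 2 {
				return
			}
			album, artist := pieces[0], pieces[1]
			fmt.Printf("* [%s](%s) - %s\n", artist, link, album)
		})
		c.Visit(link)
	}
}
```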
I just run that like:
```
go run yearly.go | sort --ignore-case > output.txt
```
And a minute or two later, I end up with a nice sorted list that I can
paste into my blog software and I’m done.
By anders pearson 01 Jan 2018
For the third year in a row, here is my roundup of music released in 2017 that I enjoyed.
One of the reasons that I’ve been making these lists is to counteract a sentiment that I encounter a lot, especially with people my age or older. I often hear people say something to the effect of “The music nowadays just isn’t as good as [insert time period when they were in their teens and twenties]”. Sometimes this also comes with arguments about how the internet/filesharing/etc. have killed creativity because artists can’t make money anymore so all that’s left is the corporate friendly mainstream stuff. I’m not going to get into the argument about filesharing and whether musicians are better or worse off than in the past (hot take: musicians have always been screwed over by the music industry, the details of exactly how are the only thing that technology is changing). But I think the general feeling that music now isn’t like the “good old days” is bullshit and the result of mental laziness and stagnation. We naturally fall into habits of just listening to the music that we know we like instead of going out looking for new stuff and exploring with an open mind. My tastes run towards weird dark heavy metal, so that’s what you’ll see here, but I guarantee that for whatever other genres you are into, if you put the effort into looking just a little off the beaten path, you could find just as much great new music coming out every year. I certainly love many of the albums and bands of my youth, but I also feel like the sixteen year old me would be really into any one of these as well.
OK, I know I’ve said that I don’t do “top 10” lists or anything like that, but if you’ve made it all the way to the bottom of this post, I do want to highlight a few that were particularly notable: Bell Witch, Boris, Chelsea Wolfe, Goatwhore, Godflesh, King Woman, Lingua Ignota, Myrkur, Pallbearer, Portal, The Bug vs Earth, Woe, and Wolves in the Throne Room.
Plus special mention to Tyrannosorceress for having my favorite band name of the year.