By anders pearson 15 Apr 2023
Automation is a major part of programming. Fundamentally, there’s nothing that a computer does that we couldn’t do manually given enough time. It’s just that computers can operate much faster, more precisely, and are tireless and without distraction.
So when we’re programming, we’re always just taking some manual process and automating it. Programmers spend an awful lot of time thinking about how we automate things, but it’s also important to think about what we automate.
An important antipattern related to the “what to automate” that I try to keep in mind is what I’ve heard called “Automating the Pain”.
Here’s a scenario as an example. Imagine that a company has an old manual process that involves Department A doing some work to generate a report from their own records or databases, printing out copies of that report, then that report gets sent to Department B, who need to take the data from that report, read it, and enter some or all of it into their own systems. Back in the day, this all may have happened via paper records in filing cabinets, forms filled in by hand, photocopies, and sent by post (or at least sneakernet). Now there’s probably less of that, but if you’ve spent any time in a large organization, you’ve almost certainly seen some version of this, probably involving Word documents or Excel spreadsheets being emailed around.
Programmers encountering a situation like that instinctively realize that much of that could be automated and are often eager to do so when given the chance. A first step might be just to switch to a paperless process where documents are emailed rather than printed out, mailed, entered back in, etc. That tends to make things faster and a bit more reliable (the poor data entry person can at least copy and paste instead of having to re-type) and will often be an improvement over the previous situation. They might then make improvements to the document formats to use standard fields that are easier to copy over from one system to another. Eventually, they might switch to web-based versions of both systems and automate the parsing.
The antipattern here is is that often these processes either shouldn’t exist at all or shouldn’t involve any human interaction and the automation work is just making a painful process a bit faster and less painful when instead we should be examining the entire system and eliminating or massively changing the process. Eg, why do Department A and Department B need to have separate copies of the same data, just in different formats and databases? In itself, that’s a recipe for inconsistency and errors. Maybe a better solution would be to set up a shared database that’s the single source of truth for the data that would’ve been sent back and forth and granting both departments the appropriate level of access to read and write to that database. Or maybe Department A is only getting the data first for historical reasons that are no longer relevant and instead the data should just go directly to Department B and bypass A altogether.
When I’m developing, I try to constantly ask myself that question: am I just automating the pain when I really should be automating away the pain?
I’m especially conscious of the antipattern when I find myself working on something that has some similarity to the above scenario, eg, if I’m serializing some data, transferring it somewhere, then deserializing it. But it also comes up often around legacy code. Eg, a workaround is introduced to deal with a bug in some library or system, then other code has to deal with the workaround. Eventually, the original bug is fixed or the system/library with the bug is no longer used. But the workaround remains and other code has to deal with it. Or some complicated part of the code was introduced to handle a particular integration or feature that was later removed. It’s easy for a developer who came onto the team later to spend a lot of time and effort improving that complicated code because they don’t realize that it’s actually no longer relevant and ought to just be removed completely. The common factors tend to be things like silos or handoffs between teams or developers, the interaction of multiple systems, and accretion over time with respect to changing requirements.
As a final thought, I do want to point out that while it’s an antipattern, there are also times when it is a perfectly valid option or even the best approach that can be realistically undertaken. Sometimes there are organization politics, legal restrictions, or market forces that prevent you from implementing the “right” solution. Or doing it “right” is a much harder, larger task and a little bit of automating the pain has a much better return on the investment. Other times, some basic automation can be a useful step towards the right solution, helping you get to a point where the right solution is more obvious. In any of those cases, I just think it’s worth acknowledging (and probably documenting) the tradeoffs you are making.