Face It, You Can't Predict the Future

We programmers are a funny bunch. We think we can predict the future. No, really, we do. Every time we add a layer of abstraction or a level of indirection or a complicated inheritance hierarchy just in case, we're trying to predict the future. Those three words—just in case—should be closely associated with two other words in our minds—make work. In reality, that's what we're doing most of the time. We're making work for ourselves instead of solving the problem at hand and finishing the work that needs to be done.

I fall into this trap of trying to predict the future all the time. I have to stop myself when my mind goes off on tangents, dreaming up Rube Goldberg architectures to handle all kinds of imagined circumstances that may or may not ever happen. Sometimes it's easier to architect than to work on edge cases. Sometimes it's more interesting to rough out structures for potential future issues than to implement the nuts and bolts of features that need to get done now. Sometimes it's fun to over-engineer and over-think a problem.

Other times we need to get stuff done. During those times (honestly, it's most of the time, isn't it?) how do we stay on track instead of heading off into the over-engineered weeds?

For starters, if you're writing some extra structure into your code, just in case you need it, don't. You should be able to hear yourself saying it. "What if we need to support more than one type of gizmo that talks to the thingamajig? I'll make a thingamajig-gizmo bridge interface so that we can easily implement many types of gizmos, just in case." You'd better be sure that either the number of gizmo types is really more than one (and you will definitely use more than one as part of the requirements of the product), or the use of the gizmo will be strewn over so much of the code that it creates substantial risk should you ever need to replace it. If neither of these things is true, maybe it's better to put your head down and implement the gizmo instead, just in case you want to ship your software.
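To make the gizmo case concrete, here is a minimal Python sketch. The names `SerialGizmo`, `Thingamajig`, `send`, and `report` are hypothetical, invented only for illustration. With a single gizmo type in the requirements, the thingamajig can talk to the concrete class directly, and a bridge interface can be extracted later if a second gizmo type ever really shows up.

```python
class SerialGizmo:
    """The one gizmo type the product needs today (hypothetical example)."""

    def __init__(self):
        self.sent = []

    def send(self, message: str) -> None:
        # Stand-in for real I/O: record the bytes that would be transmitted.
        self.sent.append(message.encode("ascii"))


class Thingamajig:
    """Talks directly to the concrete gizmo, with no bridge interface in between."""

    def __init__(self, gizmo: SerialGizmo):
        self.gizmo = gizmo

    def report(self, reading: float) -> None:
        self.gizmo.send(f"reading={reading:.2f}")


gizmo = SerialGizmo()
Thingamajig(gizmo).report(3.14159)
print(gizmo.sent)
```

If a second gizmo type does become a requirement, there will be two real implementations to look at, and the shared interface can be shaped around what they actually have in common.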

It's worth carefully considering your options any time you want to abstract away part of your system, especially the core parts of your system. Will you really ever switch views, databases, or libraries? Ask yourself, are you making an all-singing, all-dancing platform, or are you making a targeted product that real people are going to use to solve real problems, hopefully soon?

Maybe you think you are making a platform, but remember, most of today's platforms didn't start out that way. They started out as products that people found useful and eventually grew into platforms. You can't address every single concern up front, and if you try, you'll never ship. You'll be spending all kinds of time working on abstract interfaces to loosely hook up all of your code to…um…save time. I'm not sure how that works, exactly. Only a few of those abstractions will be needed. Pick a view. Pick a database. Pick your libraries. Make some decisions and ship your product.

Speaking of platforms, platforms need to be able to scale. 'Scale' can mean a lot of different things. It can mean millions of users per month. It can mean tens of thousands of transactions per day. It can mean thousands of requests per second. For a team that's still working on their product, it mostly means the load the software should be able to support, just in case it gets big fast. Face it, not everyone is Twitter or Slack. Not even Twitter was Twitter when they started, and they had all kinds of scaling problems as they got huge. It didn't matter, and they dealt with the scaling problems as they came up.

If you're engineering your software to scale before you need to, don't. There are too many other things you should be doing to make your product better, so that you actually get to have scaling problems later. Don't waste time on scaling before it's a problem. If you waste enough time, scaling may never be a problem because your product never ships.

So far I've been talking about programming in the large, but let's turn to programming in the small. Programming in the large is all about architecture and scaling. Programming in the small is all about refactoring and optimizing. Refactoring is another one of those slippery words like scaling. It can mean whatever you want it to mean, so everyone loves it and loves doing it. I love refactoring, too. What does it mean? I'm not sure, but I love it.

I think it means change. I try not to change code unless the change markedly improves its readability or performance, but mostly readability. When writing code, the primary goal should be to make it as clear and concise as possible. A piece of code should express its intent as obviously as the language allows. I find that clear, concise code also performs better than verbose, flexible code most of the time. Making code shorter tends to make it faster, to a point. And 90% of the time shorter code has good enough performance while being clearer.

Another reason why performance isn't an issue for most code, and you can get away with focusing on clarity, is that most code doesn't need to run fast. The bulk of most code bases is there to set up the system or provide configuration options, or it is I/O bound, waiting for user input or slower processes.

One of my previous projects was writing code for an embedded system that did real-time signal processing at a sample rate of up to 1 MHz for four input channels. Functions could be attached to each of the channels for calculating things like running averages, frequencies, and peak-to-peak amplitudes. The more functions that could run simultaneously, the better. Certain sections of code in this system had to run as fast as they possibly could, so those sections were written in the specialized assembly language of the DSP we used. To make this code run even faster, a lot of things were set up and calculated ahead of time so that as few things as possible needed to be recalculated during run time. All of this setup and configuration code didn't need to be fast, so it was written as clearly as possible. That made it much easier to change as requirements changed, and I could implement new features within days of them being proposed because I could look at most of the code and immediately understand how it worked.
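The split between clear setup code and a lean hot path can be sketched in Python. This is only an analogy, not the original DSP assembly, and the `RunningAverage` class with its `window` parameter is a hypothetical example. The constructor does the one-time work, including precomputing a scale factor, so the per-sample update stays small.

```python
from collections import deque


class RunningAverage:
    """Running average over the last `window` samples (hypothetical example)."""

    def __init__(self, window: int):
        # Setup code: runs once, so it is written for clarity, not speed.
        self.window = window
        self.scale = 1.0 / window  # precomputed so update() avoids a division
        self.samples = deque([0.0] * window, maxlen=window)
        self.total = 0.0

    def update(self, sample: float) -> float:
        # Hot path: adjust the running total instead of re-summing the window.
        self.total += sample - self.samples[0]
        self.samples.append(sample)  # the oldest sample drops off automatically
        return self.total * self.scale


avg = RunningAverage(window=4)
outputs = [avg.update(s) for s in [4.0, 4.0, 4.0, 4.0, 8.0]]
print(outputs)
```

The window starts out filled with zeros, so the first few outputs ramp up until the window holds only real samples.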

If performance is a problem and if you have measured and found where the code is slow, then, and only then, should you optimize the code in question. Since the code is already as clear and concise as you can make it, optimizing will mean making the code more complicated in some way. Maybe you need to use a more complex algorithm. Maybe you need to use more obfuscated low-level functions or assembly language. Maybe you need to add caching or memoization that muddies up the code. That's why optimizations should only be done where they're necessary, and the goal should still be to make the optimized code as clear as possible.
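Here is a minimal sketch of that order of operations in Python: measure the clear version first, then add a cache only where the measurement points. The function `expensive_lookup` and its 10 ms sleep are hypothetical stand-ins for real slow work, and `measure` is a small helper written for this example.

```python
import functools
import time


def expensive_lookup(key: int) -> int:
    """Hypothetical slow operation; the sleep stands in for real work."""
    time.sleep(0.01)
    return key * key


def total(keys, lookup):
    """Sum the lookup results for every key."""
    return sum(lookup(k) for k in keys)


def measure(func, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


keys = [1, 2, 3] * 10  # many repeated keys

# Step 1: measure the clear, unoptimized version.
plain_result, plain_time = measure(total, keys, expensive_lookup)

# Step 2: the measurement shows repeated slow lookups, so cache that one function.
cached_lookup = functools.lru_cache(maxsize=None)(expensive_lookup)
cached_result, cached_time = measure(total, keys, cached_lookup)

print(f"plain:  {plain_time:.3f}s")
print(f"cached: {cached_time:.3f}s")
```

The cache is an extra moving part that the reader now has to keep in mind, which is the cost the paragraph above describes. Wrapping only the one function that was measured to be slow keeps that cost contained.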

Beyond premature optimizations, we can also have premature refactorings in the name of DRY (Don't Repeat Yourself). DRY is a fundamental principle of good software development from Andy Hunt and David Thomas in The Pragmatic Programmer. (Excellent book. You should read it, if you haven't already.) But the DRY principle can be abused if you try to compartmentalize every operation into its own function, just in case that operation may be used again in some other place.

The problem with making every operation a function is twofold. First, you probably can't predict how the operation will be subtly different in other cases, so the function interface will likely be wrong. Second, doing this fractures your code base to the point where it's hard to find or understand anything. It actually makes it more likely that you'll have code duplication because you'll have hundreds of tiny functions that all do similar things, and you don't even realize it because you can't remember what they all are. If the original code adheres to DRY, leave it alone until code duplication actually occurs. Once it does occur, then refactor.
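As a small illustration of refactoring only after duplication occurs, consider this Python sketch. All names here (`users`, `normalize_email`, `register_user`, `find_user`) are hypothetical. Assume `register_user` originally cleaned up the email address inline, and the helper was extracted only when `find_user` turned out to need exactly the same cleanup.

```python
users = {}


def normalize_email(email: str) -> str:
    """Shared helper, extracted once a second caller needed the same cleanup."""
    return email.strip().lower()


def register_user(email: str, name: str) -> None:
    # Originally this did email.strip().lower() inline.
    users[normalize_email(email)] = name


def find_user(email: str):
    # The second use of the same operation is what justified the helper.
    return users.get(normalize_email(email))


register_user("  Ada@Example.com ", "Ada")
print(find_user("ada@example.com"))
```

Because the helper was written with two real callers in hand, its interface fits both of them, instead of being a guess about what a future caller might want.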

Obviously, the overarching theme here is don't do something just in case it may be required in the future. If you do things just in case, you'll have spent huge amounts of time writing code that will mostly go unused. That time would be better spent thinking more deeply about the real problems and writing code to address those problems. Make sure the code is clear and concise and the architecture closely fits the problem you're currently trying to solve. Wait to scale until scaling becomes the problem you need to solve, and only optimize the code that needs to run fast. Face it, you can't predict the future. Instead, deal with the present.
