This week, we were discussing about performance, bugs prevention, issue detection and fix, and how it was important to write code that make problems fix easy. The kind of code you would write as a developer, that would allow someone in the Ops team to understand what’s happening on Production.
A mention was then made about this great post by the great Ahende about professional code. I’m pasting the entire post here but you can see it on Ahende blog.
Trystan made a very interesting comment on my post about unprofessional code:
I think it’s interesting that your definition of professional is not about SOLID code, infrastructure, or any other technical issues. Professional means that you, or the support staff, can easily see what the system is doing in production and why.
It is a pretty accurate statement, yes. More to the point, a professional system is one that can be supported in production easily. About the most unprofessional thing that you can say is: “I have no idea what is going on.”
Expanding on this, we have been paying a LOT of attention recently to production readiness. We can’t afford not to. Just building the software is often just not enough for us. In many cases, if there is a problem, we can’t just debug through the process. Either because reproducing the problem is too hard or because it happens at a client side with their own private data. Even more important than that, if we can give the ops team the tools to actually see what is going on within the system, we drastically reduce the number of support calls we have to take.
Not to mention that software that actively support and help the ops team gets into the actual data center a lot faster and easier than software that doesn’t. Sure, clean code is important, but production ready code is often not clean code. I read this a long time ago, and it stuck:
Back to that two page function. Yes, I know, it’s just a simple function to display a window, but it has grown little hairs and stuff on it and nobody knows why. Well, I’ll tell you why: those are bug fixes. One of them fixes that bug that Nancy had when she tried to install the thing on a computer that didn’t have Internet Explorer. Another one fixes that bug that occurs in low memory conditions. Another one fixes that bug that occurred when the file is on a floppy disk and the user yanks out the disk in the middle. That LoadLibrary call is ugly but it makes the code work on old versions of Windows 95.
Each of these bugs took weeks of real-world usage before they were found. The programmer might have spent a couple of days reproducing the bug in the lab and fixing it. If it’s like a lot of bugs, the fix might be one line of code, or it might even be a couple of characters, but a lot of work and time went into those two characters.
Some parts of the RavenDB code are ugly. HttpServer class, for example, goes on for over thousands lines of mostly error detection and recovery modes. But it works, and it allows us to inspect it on a running production server.
That is important, and that make the separation from good code and production worthy code.