Hippocratic Oath of Software

Some of you may be familiar with the Hippocratic Oath common in the medical field, often paraphrased as “Do no harm.” In a light-hearted casual conversation with a colleague the other day, I realized that we need a similar oath in the field of software development: “Don’t make it worse.”

As dedicated software developers, we hold ourselves to a high standard of developing good software. Sometimes we have the liberty and control to author new, creative, functional software that adheres to strong software principles: reliable, secure, maintainable, and well-tested. We take pride in upholding those standards and producing solid code. Other times, we’re more constrained, limited by time, resources, or even skills. One of my current projects involves maintaining and updating a large legacy system that contains hundreds of thousands of lines of code written over time by disparate teams, many of which no longer exist. The code was written to various standards (or no standards at all), some of it under duress and high pressure, producing a system that is known to be buggy, fragile, and hard to maintain. As much as we’d like to invest a lot of time and effort into cleaning it up and fixing it properly, we don’t always have that luxury. Often we’re forced to do the minimum needed to fix or implement something without chasing things too far down the rabbit hole. Regardless of whether we have “green field freedom” or “constrained legacy maintenance,” we must uphold some sense of quality. We must uphold the Hippocratic Oath of Software: “Don’t make it worse.”

So, what does that really mean? How can we build high-quality software when management is pressuring us to “just get it done”? I think the answer lies in setting our own standards for code quality, security, and testing. In most (all?) cases, we need tools to help us assess those attributes. For our project, we use SonarQube as a hub for measuring code quality, adherence to standards, basic security vulnerabilities, and unit test code coverage. SonarQube for Java uses tools such as PMD, Checkstyle, and FindBugs to identify code issues and rank them by severity (blocker, critical, major, minor, etc.). Those tools also identify basic security vulnerabilities such as cross-site scripting (XSS), SQL injection, and other simple code-level mistakes. We also use JaCoCo as a code-coverage tool to identify which code is and is not executed by our unit tests. With our legacy code, we already have baseline measures for all of these. Depending on the system or module, they range somewhere between “terrible” and “pretty good.” When I say “Don’t make it worse,” I mean the following things:

  • Code quality: Do not introduce NEW SonarQube issues in the code. Even if we have 1,000 critical issues identified, I’m not allowed to add the 1,001st issue. This rule could be adjusted/tempered to focus on specific severities (blocker, critical) if desired.
  • Security vulnerabilities: Similar to code quality, don’t introduce any new security vulnerabilities.
  • Unit testing: Regardless of whether unit tests exist for the code you are working on, always unit test the code you touch.
  • Coverage: Coverage cannot go down. This is usually directly coupled to creating/adding unit tests for whatever code you work on. Even if your code coverage is terribly low (e.g., 25%), you must make it go up (or at least stay the same). In some cases, we can even establish higher code coverage goals for “new code.” SonarQube and JaCoCo can measure “coverage of new code” by using the version control tool to see what has changed and counting coverage only for those changed lines. That allows us to set goals such as “80% for all new code” even though we’re stuck at a 25% average over the entire code base.
  • Maintainability: With legacy code, we often have to spend time reverse engineering and understanding poorly documented code that we didn’t write. This is often a good opportunity to insert some comments explaining what we learned. This helps the next poor sucker gain understanding more quickly. We can also ensure that we document our own code as we go along.
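
To make the “coverage of new code” idea from the list above concrete, here is a minimal sketch of the arithmetic. It is not how SonarQube or JaCoCo implement it internally; the function name and the three line-number sets are assumptions for illustration. The point is only that changed lines are scored separately from the rest of the file.

```python
def new_code_coverage(changed_lines, executable_lines, covered_lines):
    """Return coverage (0.0 to 1.0) over changed executable lines only.

    All three arguments are collections of 1-based line numbers
    (hypothetical inputs: a version control diff supplies the first,
    a coverage report supplies the other two). Returns None when no
    executable line was changed, e.g. a comment-only edit.
    """
    relevant = set(changed_lines) & set(executable_lines)
    if not relevant:
        return None
    return len(relevant & set(covered_lines)) / len(relevant)


# A 20-line module where the tests only exercise lines 10 to 12.
executable = set(range(1, 21))
covered = {10, 11, 12}
changed = {10, 11, 12, 13}

overall = len(covered) / len(executable)                # 0.15
on_new = new_code_coverage(changed, executable, covered)  # 0.75
```

Here the module as a whole sits at 15%, but the lines that were actually touched are 75% covered, so a goal such as “80% for all new code” can be judged on its own terms.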

The nice thing is that the tools can be set up to enforce hard limits for these rules. Modern versions of SonarQube, in particular, allow us to define both absolute thresholds (e.g., no more than 80 critical issues) and trends (e.g., coverage must be greater than or equal to its previous value). That lets us set our quality standard at “how it looks now” and forces us to always trend in the right direction.
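
As a rough illustration of the difference between a trend rule and an absolute threshold, here is a small sketch. It does not use SonarQube’s API; the `Metrics` class, its field names, and `gate_violations` are all hypothetical names I made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    """Hypothetical snapshot of the numbers a quality dashboard reports."""
    blockers: int
    criticals: int
    vulnerabilities: int
    coverage: float  # percent of lines covered by unit tests


def gate_violations(baseline, current, max_criticals=None):
    """List every way `current` is worse than `baseline`.

    Trend rules compare against the baseline; `max_criticals` is an
    optional absolute threshold (e.g. no more than 80 critical issues).
    """
    violations = []
    for name in ("blockers", "criticals", "vulnerabilities"):
        before, after = getattr(baseline, name), getattr(current, name)
        if after > before:
            violations.append(f"{name} rose from {before} to {after}")
    if current.coverage < baseline.coverage:
        violations.append(
            f"coverage fell from {baseline.coverage}% to {current.coverage}%"
        )
    if max_criticals is not None and current.criticals > max_criticals:
        violations.append(
            f"criticals {current.criticals} exceed limit {max_criticals}"
        )
    return violations
```

With a baseline of 1,000 critical issues, a build that adds the 1,001st fails the trend rule even though no absolute limit is set; a build that reduces the count to 990 passes the trend rule but would still fail an absolute limit of 80.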

I’ve followed the same policy when trying to slowly improve the overall quality of a system on other projects. For example, on a previous project, we established quality standards of “no blockers, no criticals, and 80% unit test coverage.” Unfortunately, when we first assessed the software, we immediately found some components that had tons of blocker/critical issues and coverage as low as 5% in some cases. It’s virtually impossible to convince a team to drop everything, fix ALL the issues, and write enough unit tests to get coverage from 5% to 80%. Instead, we set a policy of “make it a little better each week.” (It’s an important relative of “don’t make it worse.”) We set a hard “trend” policy that disallowed creating new issues or dropping coverage. Additionally, we started setting absolute thresholds to drive the metrics slightly in the right direction each week. For example, if coverage was 5%, we set the “soft” warning threshold to 10% for a week, then switched it to a hard limit. The team committed to reaching that new level each week; otherwise, we started failing their builds. This guaranteed that they “didn’t make it worse” and, in fact, started making it slightly better each week. A dirty side effect emerged as well. If their goal was 10% coverage but they accidentally overshot to 15%, that became the new floor. Coverage wasn’t allowed to ever drop below it again. It also meant we ratcheted the next target up 5 points from the new floor, to 20% instead of 15%. Dirty trick, huh?
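
The ratchet described above boils down to a few lines of arithmetic. This is a simplified sketch, not the tooling we actually used: the function name, the 5-point step, and the 80% cap as defaults are assumptions, and it skips the week-long “soft” warning phase.

```python
def ratchet(previous_floor, achieved, step=5.0, goal=80.0):
    """Return (new_floor, next_target) for the coming week, in percent.

    The floor only ever moves up: it is the best coverage seen so far.
    The next target sits one `step` above the floor, capped at `goal`.
    """
    new_floor = max(previous_floor, achieved)
    next_target = min(new_floor + step, goal)
    return new_floor, next_target


# Started at 5%, target was 10%, but the team overshot to 15%.
floor, target = ratchet(previous_floor=5.0, achieved=15.0)
# floor is now 15.0 and the next target is 20.0, not 15.0
```

A build that later measures below the floor would simply be failed; the floor itself stays where it is.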

One of the dilemmas is often coding style, particularly around whitespace or general formatting. For example, our team uses a standard of “spaces-not-tabs, 4 space indentation, with Egyptian-style curly braces.” There is a LOT of code that has not followed this standard (or any standard at all), and it is often tempting to simply reformat an entire module as soon as we touch it. Doing so tricks many tools into thinking the entire module has changed instead of a few lines. It brings up my soap box argument about being “version aware” when you make changes, but I’ll leave that for another time. My particular recommendation is “only change what you need to and do your best to maintain a reasonable blend of ‘desired’ and ‘current’ styles.” In this example, changing too much can adversely affect the “coverage of new code” metric I mentioned above, because reformatting whitespace makes the version control tools think that more code changed than actually did. Think “you break it, you buy it.” It forces us to either write unit tests to cover the whole module or break our “don’t make it worse” oath by missing our “coverage of new code” goal. The point is: only change things that you need to. Commenting (for maintainability) is a good exception because there is often no cost associated with adding comments as long as you don’t change lines of code with real statements.
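
To see how a reformat inflates the amount of “new code,” here is a small sketch using Python’s standard `difflib`. The helper name `changed_lines` is hypothetical, and real version control and coverage tools have their own diff rules; this only shows the effect of a tabs-to-spaces change on a naive line diff.

```python
import difflib


def changed_lines(old_text, new_text, ignore_whitespace=False):
    """Return 1-based line numbers in `new_text` that a line diff flags.

    With ignore_whitespace=True, leading/trailing whitespace is dropped
    and runs of whitespace are collapsed before comparing, so a pure
    re-indent no longer counts as a change.
    """
    def key(line):
        return " ".join(line.split()) if ignore_whitespace else line

    old = [key(line) for line in old_text.splitlines()]
    new = [key(line) for line in new_text.splitlines()]
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    flagged = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "insert"):
            flagged.extend(range(j1 + 1, j2 + 1))
    return flagged


old = "if (x) {\n\tfoo();\n\tbar();\n}\n"
new = "if (x) {\n    foo();\n    baz();\n}\n"  # re-indented, one real edit

naive = changed_lines(old, new)                            # [2, 3]
real = changed_lines(old, new, ignore_whitespace=True)     # [3]
```

Only line 3 is a real change, but the naive diff also flags the re-indented line 2, which would then need test coverage to meet a “new code” goal. Note that this normalization does not help with changes that move text between lines, such as relocating a curly brace.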

So, remember: when working on new or legacy systems, always follow the Hippocratic Oath of Software: “Don’t make it worse!” I leave it to you to hold yourself to an even higher standard of “always make it better.” In some ways, I relate this to the golden rules of common courtesy I’ve learned over time: always leave some place cleaner than the way you found it.

P.S. As I finished writing this, it occurred to me that I can’t possibly be smart or creative enough to be the first one to think of it. If you search the interwebs, you’ll find a variety of forms of the “Hippocratic Oath of Software” ranging from a string of jokes to liability-inducing commitments for software in safety-critical systems. I like to think of my version as somewhere in between with a light-hearted but professional approach to improving software quality.

Points to remember:

  1. Hippocratic Oath of Software: “Don’t make it worse”
  2. Use static analysis and coverage tools to set baseline metrics
  3. Never let static analysis or coverage metrics get worse
  4. Don’t introduce more software quality findings
  5. Don’t introduce new security vulnerabilities
  6. Don’t let code coverage metrics go down
