Friday, May 22, 2009

Some Remarks on Comments

I was recently having a conversation with someone about the perils of commenting code and had one of those "I can't believe that I'm having this conversation *yet* again" moments. It seems a little sad to me that there are still so many professional developers out there who are baffled by the idea that a bounty of comments is anything other than utterly wonderful. I don't know if this is from a lack of exposure to new ideas, laziness, aversion to change, diet, childhood trauma, or what, but I do wish for everyone's sake that people would at least acknowledge the very real risks that come with excessive commenting. I'm not militant about this. In fact, as with most things, I would say I'm a moderate and a realist. But no matter your personal opinion, I feel as though we collectively should be beyond the hyper-simplistic attitude of "comments good, no comments bad".

The conversation ended in a stalemate (though I hope I at least raised their awareness a bit), but it did get me thinking, and I realized that I should post the boilerplate commenting guidelines that I've distributed in one form or another to my teams in the past. This started out as an e-mail to a colleague many years ago and has been tweaked and recycled numerous times to suit the circumstances. It still reads as a bit of a rant, but it pretty much sums up my attitude about comments and where I think they fit in the grand scheme of things.

The 5 Risks of Code Comments

1. Contribute to codebase bloat: Guidelines for when to provide inline comments are, by their very nature, subjective. Developers tend to err on the side of "caution" (for fear of being told they're not being clear enough) and comment most or all of their code. They also tend to comment out sections of code and then add comments explaining why they commented the code out. This inevitably results in severe codebase bloat. A past client of mine had a mission-critical VB/COM/ASP legacy application of almost 1,000,000 lines of raw source, of which nearly 20% was comments.

2. Create unneeded maintenance overhead: When inline comments become the norm, developers end up being responsible for maintaining not only the codebase but also the comments that accompany it. Inevitably, this results in two contradictory outcomes: 1. Productivity loss for developers who earnestly scour the code for comments that need to be updated to match the changes they just made, and 2. Comments that conflict with the code because they haven't been kept up to date, either because developers didn't have the time or inclination to do it or because they didn't realize that a comment related to the code they were changing. This happens frequently when a comment at the top of a long method applies throughout: another developer may not even realize that the bit of code they're working on has a comment that applies to it (as an aside, this is yet another argument for the "concise method" philosophy).

3. Create and perpetuate confusion: Comments don't compile. No matter how concise and explicit you try to be with your comments, they are NOT definitive. Only the code compiles, so in the end the code is the only thing that is meaningful in any absolute sense. For the comments to be as concise as the code, they must (inherently) be both as structured and as specific as the code itself. Because they don't compile (and therefore cannot be logically validated), this is obviously not a realistic approach to inline comments. Consequently, inline comments will always be more abstract/generalized than the code itself; put differently, the code will always be the specifically accurate and definitive expression of application behavior. In the presence of simple, thoughtful, well-implemented code that uses agreed-upon standards and conventions, specific comments will only serve to create confusion about the code that they refer to. (And besides, do we really need the comment "// Checks to see if x is 0" for the line of code "if( x == 0 )"?) The only exception to this is the practice of identifying exceptional situations and implementations: cases where the developer intentionally and thoughtfully chooses to deviate from the conventions, standards and best practices for a specific reason (for example, refactoring to a less maintainable but higher-performing implementation to address a performance issue). In this case, an inline comment stating that the developer intentionally chose the particular implementation instead of the standard approach, and stating *why* this decision was made, creates clarity by identifying the section of code as exceptional in some way. This is exactly the opposite of the normal approach of commenting everything: by only commenting exceptions you are creating clarity about the stuff that *isn't* commented. You're saying "Hey everyone, this other stuff follows the rules and patterns that you already know."

4. Encourage sloppy implementation: The ability to comment on one's code encourages the mindset of "justifying" bad implementation. For some reason, developers tend to think it's OK to write bad or sloppy code as long as they acknowledge it. By forcing developers to express themselves solely through concise, well-crafted code, the temptation to explain away bad practice is removed and the resulting code is, consequently, of higher quality.

5. Give credibility to excessive complexity: Ultimately, inline comments create major complexity for minimal or no value. Our philosophy is that a strong shared understanding of the functional metaphor by the developers, combined with a simple, straightforward architecture and strong coding standards, more than compensates for a lack of inline comments intended to provide "context". To provide non-specific (i.e., semantic) context we have a much more powerful tool available to us in the form of XML comments at the member/method level (in fact, they even sort-of compile because the format can be validated for well-formedness). If these, combined with an understanding of the behavioral metaphor, expressive naming, and simple, concise code, don't provide enough information to understand the implementation, then the implementation itself has become bloated and should be refactored to a simpler form. This simplicity improves developers' ability not only to understand the code but also to maintain and enhance it.
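
To make risks 2 and 3 a bit more concrete, here is a minimal sketch. It is written in Python purely for illustration (so "#" stands in for "//"), and every name, price, and threshold in it is made up rather than taken from a real codebase. It shows a stale comment, a comment that merely restates the code, and the one kind of comment argued for above: a note on a deliberate deviation and why it was made.

```python
# Hypothetical example: all names, prices, and thresholds are invented.

def shipping_cost(order_total):
    # Orders over $50 ship free.
    # (Risk 2: the comment above is stale. The threshold was later raised
    # to 75 and nobody updated the comment. Only the code is definitive.)
    if order_total > 75:
        return 0.0
    return 5.99


def is_zero(x):
    # Checks to see if x is 0
    # (Risk 3: this restates the code and adds nothing.)
    return x == 0


def sum_of_squares(n):
    # The one comment worth keeping: it flags a deliberate deviation and
    # says why. Uses the closed form n(n+1)(2n+1)/6 instead of the obvious
    # loop because (in this made-up scenario) the loop was a measured
    # bottleneck.
    return n * (n + 1) * (2 * n + 1) // 6
```

In this sketch, shipping_cost(60) returns 5.99 even though the original comment promises free shipping; a reader who trusts the comment over the code will be wrong.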

In summary: Inline comments are more trouble than they're worth. In their place, we use the following practices:

Code Commenting Best Practices

1. Simple, concise code and an all-pervasive mentality of refactoring whenever we notice things getting too complicated to understand.
2. Expressive naming of variables, methods, etc.
3. Method/member-level XML comments that explain the overall semantics of the method/member.
4. Appropriate use of the standard MS "intelli-comments":

  • // HACK: Improper/shortcut implementation to be fixed ASAP. Lets a developer prioritize specific pieces of the implementation, but is NOT allowed to go into production!
  • // TODO: A placeholder. No current implementation for this piece of functionality.
  • // UNDONE: Implemented but functionally incomplete. Needs more code.
5. Deliberate and restrained use of the "exception to the rule" inline comment. This is the "I'm calling out this one specific bit of code because it's exceptional/doesn't adhere to our standards/is performance tuned in a non-obvious way/etc., and someone looking at this code in the future will need to know that I did it deliberately, and why." Use the "intelli-comment"-like syntax "// NOTE:" to indicate that you intend this comment to live with the code.
6. Comments on each source-control commit to associate a specific change event with the reason for that change.
7. If it's a comment and not covered by the above, ask the person who wrote it (yourself or someone else) why it's needed and whether that purpose isn't better served by a combination of the above practices. Then fix it. Comments can be refactored too!
8. Always end a comment with your initials. This serves two purposes: 1. It makes it easier for others to know who to go to if there's uncertainty about the comment (without having to go back to commit history in source control) and 2. Having to stand by our comments forces us all to be thoughtful about whether or not a comment is really needed in the first place.
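
As a rough illustration of how practices 2 through 5 and 8 might look together, here is a small sketch. Again it is Python purely for illustration: the docstring stands in for a member-level XML comment, "#" stands in for "//", and the function, the constant, and the initials "JD" are all hypothetical.

```python
# Hypothetical example: the function, constant, and initials are invented.
SECONDS_PER_DAY = 24 * 60 * 60


def days_until_expiry(expires_at_epoch, now_epoch):
    """Return the whole days left before a (hypothetical) license expires.

    Both arguments are Unix timestamps in seconds. The result is never
    negative: an already-expired license yields 0.
    """
    # TODO: Take a per-customer grace period into account. -JD
    remaining_seconds = max(0, expires_at_epoch - now_epoch)
    # NOTE: Floor division is deliberate. A partial day must never be
    # reported as a full day of remaining license. -JD
    return remaining_seconds // SECONDS_PER_DAY
```

The names carry most of the meaning, the docstring covers the overall semantics, and the only inline comments are a tagged placeholder and one signed note about a deliberate choice.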

I think this pretty well summarizes how I feel about comments, but I'm curious to hear what folks have to say.

References:
In response to the above, my pal Mark W. pointed me at this excellent Tim Ottinger article, "A Comment is an Apology", which is a different take on the subject with the same basic message. Thanks Mark!
