Now would be a good time for us to step out of Thread Mode and try writing in Pattern Mode some patterns on which we could all agree. I'll start this phase with a couple of patlets, which I won't sign. I'll leave the original thread below them in the hope that the thread will actually shrink as the patterns grow. -- Ward Cunningham
Program for People. Programs must be understood by both man and machine.
Therefore: Write with the human reader in mind while satisfying the needs of the machine to the degree necessary. This is most difficult where the needs of the two readers conflict.
(The context of this pattern is writing code. See Think Like the Machine below for a pattern for reading code.) See Reads Like Prose, Meaningful Name and Meaningful Comment.
Reveal Intention. Your reader will want to know what you were trying to do as well as what you actually did.
Therefore: Reveal your intentions in the way you name, format and document your code.
Mini-Languages. Where you have the same idea in two different forms, try to derive one from the other automatically. In general this means designing a formal language which expresses the idea in a way that the machine can understand.
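A minimal sketch of the idea, in Python, with a small table standing in for the formal language. The field names and both derived functions are hypothetical; the point is only that the parser and the help text come from one source and so cannot drift apart.

```python
# One declarative table is the single source of truth (hypothetical fields).
FIELDS = [
    ("name", str, "customer name"),
    ("age", int, "age in years"),
]

def parse_record(text):
    """Parse a comma-separated record using the converters listed in FIELDS."""
    values = text.split(",")
    return {name: convert(value.strip())
            for (name, convert, _), value in zip(FIELDS, values)}

def help_text():
    """Derive the user-facing field descriptions from the same table."""
    return "\n".join(f"{name}: {description}" for name, _, description in FIELDS)
```

Adding a field to the table changes both forms at once, which is the payoff the patlet describes.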
Clear Warning. Circumstances may require coding for the machine in a way that could easily mislead the human.
Therefore: Mark each such case with a comment giving the reader clear warning of the possible confusion.
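For instance, a sketch in Python (the function and its data are hypothetical). The call to list() looks redundant to a human reader, so the comment warns against removing it.

```python
def remove_expired(sessions, now):
    """Delete expired entries from sessions (a dict of name -> expiry time) in place."""
    # WARNING: list(...) looks redundant but is required. Deleting from a dict
    # while iterating over it directly raises RuntimeError.
    for name in list(sessions):
        if sessions[name] <= now:
            del sessions[name]
    return sessions
```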
Order Dependency. Sometimes one method cannot be invoked until after another has been invoked.
Therefore: Write a comment explaining the dependency, to prevent someone from accidentally violating the constraint. Practically, I have always been able to transform the situation into a single public method that invokes the other two privately in the correct order.
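The transformation might look like the following Python sketch. ReportWriter and its methods are hypothetical names; the two order-dependent steps become private, and callers see only one method.

```python
class ReportWriter:
    """Hypothetical example: the header must be written before any rows."""

    def __init__(self):
        self.lines = []
        self._is_open = False

    def write_report(self, rows):
        """The only public entry point; it fixes the call order for callers."""
        self._open()
        self._write_rows(rows)
        return self.lines

    def _open(self):
        # Must run before _write_rows: it emits the header the rows follow.
        self.lines = ["name,count"]
        self._is_open = True

    def _write_rows(self, rows):
        # The dependency is stated in code as well as in the comment above.
        assert self._is_open, "_open() must be called before _write_rows()"
        for name, count in rows:
            self.lines.append(f"{name},{count}")
```

A caller writes ReportWriter().write_report(rows) and has no way to get the order wrong.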
Think Like the Machine. Machines don't think like humans and aren't likely to do so soon. In fact, we humans find the logical nature of machines useful.
Therefore: Learn to read a program as if you were the machine. Don't expect to have the machine's normal behavior explained in comments. Prefer programs written as objects since it is somewhat easier to think like an object than other semantics.
(The context of this pattern is reading code. See Program for People above for a pattern for writing code.) (A programmer must be able to think like the machine when reading code; however, if one can prefer code written in objects, one can prefer code written for people using comments as well.)
To Do. When you are programming for insight, you will often have ideas that will be required at some point, but pursuit of which would distract you from learning.
Therefore: Mark each such case with a comment giving the reader a suggestion of what other logic will be required. (or Use Code To Bookmark)
Un-Done. When following an iterative process, you may stop with code intentionally left unfinished or untested. Circumstances may preclude your picking up where you left off.
Therefore: Mark each such case with a comment giving (or pointing to) clear instructions on what is left to be done.
Fence It Off. When you are implementing a design, you may be promised that there will never be more than 50 widgets, or 10 simultaneous users. Sometimes there are enormous performance payoffs to taking advantage of those constraints... until they change.
Therefore: When you are writing code that you know depends on data being within constraints, document those constraints. "This method handles only arrays of size 1-50. It must be rewritten for larger arrays."
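A sketch of such a fence in Python. The function is hypothetical and the limit of 50 is taken from the example above; here the payoff of relying on the constraint is simplicity rather than raw speed.

```python
MAX_WIDGETS = 50  # promised upper bound, assumed from the example above

def find_duplicate_widgets(widgets):
    """Return the widgets that appear more than once, in first-seen order.

    FENCE: intended only for lists of up to MAX_WIDGETS items. The pairwise
    scan is quadratic, which is fine at this size; it must be rewritten for
    larger inputs.
    """
    duplicates = []
    for i, widget in enumerate(widgets):
        if widget in widgets[i + 1:] and widget not in duplicates:
            duplicates.append(widget)
    return duplicates
```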
Blind Alley. There's often a solution that's "simple, obvious, and wrong." Usually, it's in version 1.0 of the code, and replaced by something less simple, less obvious, and less wrong in version 1.1. Someone who comes back to the code later is often tempted to unknowingly repeat the mistakes of the past.
Therefore: When you replace obvious but incorrect code with non-obvious but correct code, comment the old approach as a blind alley that shouldn't be traversed again.
Agree: Betsy Hanes Perry
(Sorry; probably redundant with Clear Warning above.) (Blind Alley)
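A small Python illustration of marking a blind alley, using the familiar leap-year rule rather than anything from this thread:

```python
def is_leap_year(year):
    """Return True if year is a leap year in the Gregorian calendar."""
    # BLIND ALLEY: "return year % 4 == 0" is the obvious version, and it is
    # wrong for century years such as 1900. Do not simplify back to it.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
```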
Format/Comment to Understand. When you are faced with understanding a large piece of unfamiliar code, you may have trouble deciding where to start.
Therefore: Read actively, by formatting, commenting, or decomposing methods.
Generated Documentation. Understanding a program by reading its code can be daunting.
Therefore: Include comments in predictable places like the beginning of methods. Generate paper or electronic documentation using those comments.
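As a rough sketch of the mechanics in Python, where the predictable place is the docstring at the start of each method. Account and generate_docs are hypothetical; inspect.getdoc and inspect.getmembers are standard-library calls.

```python
import inspect

class Account:
    """A hypothetical class used only to demonstrate the idea."""

    def deposit(self, amount):
        """Add amount to the balance. Call this when funds arrive."""

    def withdraw(self, amount):
        """Remove amount from the balance. Fails if funds are insufficient."""

def generate_docs(cls):
    """Build a plain-text reference from the docstrings of cls's public methods."""
    lines = [f"{cls.__name__}: {inspect.getdoc(cls)}"]
    for name, member in inspect.getmembers(cls, inspect.isfunction):
        if not name.startswith("_"):
            lines.append(f"  {name}: {inspect.getdoc(member)}")
    return "\n".join(lines)
```

Calling print(generate_docs(Account)) produces one line for the class and one per public method, in alphabetical order.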
Named Idioms. There are some common but odd-looking coding idioms. (One of my favorites in Perl is:
local $_ = $string;
which makes a copy of the magic $_ variable that's the default target for regular expression matching. Note that:
my $_ = $string; # WRONG!
which most reformed Perl 4 programmers would think of, isn't the same thing; it's a Blind Alley as described above.) Such idioms are hard to understand.
Therefore: Agree on names for such idioms. Use the idiom name in the comment. For example:
local $_ = $string; # "Local Default Match" idiom
If you're a distinguished author (or gang of four distinguished authors:-), publish a book that makes such idioms part of the industry vocabulary. (Patterns Of Software has an interesting opinion on this.)
Fix Broken Code Instead Of Commenting Bugs:
"There is always a good reason for doing the wrong thing." Don't explain (in a comment or elsewhere) why the wrong thing was done. Instead, fix it; do it right instead. (Exception: when you don't have time. See also Un-Done above.)
A collection of get-it-done practices has been dubbed Extreme Programming. In this context, Kent Beck and others have pointed out the value of intention-revealing code. We pick up the thread here with Alistair's questioning what might have been a bit of overstatement...
How come all the rigor on the Coding Conventions? That really surprised me. What causes you to put so much weight on the idea that everyone should write identically? Usually programmers tell me the indentation style etc makes so little difference.
See Coding Conventions for remarks on why we do this. -- Ron Jeffries
The Coding Conventions question seems related to a separate issue concerning uniformity of formatting/indentation. Method Commenting followed from other guidelines suggesting comments per se were (usually) unnecessary and to be deprecated. The following discussion here followed from that, rather than Alistair Cockburn's note above. -- Jim Perry
One should write relevant and helpful comments. As to whether comments are "needed", that's surely subjective. I am of the Programmers Are Writers sect, myself, but we're a small denomination. I can't help but protest what appears to be active discouragement of documentation, though. -- Jim Perry
Perhaps I've been reading too much Douglas Hofstadter lately, but I think an aspect of this that may be slipping by is the distinction between meaning and message. A comment, like code, is a message that is intended to convey a meaning to the reader. If both author and reader have a deep and shared understanding of the meaning space, the comment need not say as much. In fact, among a community of seasoned Smalltalkers, method comments often are superfluous, as Kent and Martin have observed.
On the other hand, we Smalltalkers have internalized an ability to either ignore or derive meaning from several superficial aspects that drive other coders crazy. These include:
Joining words together and capitalizing, instead of other conventions (like underscores)
Spelling out words instead of abbreviating them
Infix instead of postfix or prefix function notation.
When the number of possible meanings in a communication is very restricted, a very simple message is enough; some call these "signals". For instance, among long-time coworkers, a mere raising of an eyebrow, rolling of an eye, or comment in code that says "yeah, right..." is enough to convey a meaning.
For a long-time C programmer, a phrase like *dst++=*src++ is completely self-explanatory and requires no comment; the meaning is evident from the code. Not all communities share in this assessment, however.
An aspect that perhaps might be interesting to explore further is suggested to me by Ward's Hyper Perl pages. Perhaps comments could be captured in the environment as phrases, paragraphs, or stories that were linked to the code, as opposed to being linearized with it. I envision different communities perhaps wanting to link different webs of narrative to the same fragments of code. Some of these might be written by the developer. Others might be annotations by other interested parties. All might be turned on or off, followed or ignored by the reader, just as we do with the links here.
All writing should be targeted at a specific class of reader, and comments are no exception. Part of the problem is to decide how much you can assume about the reader's knowledge.
When I'm reading other people's code, I often know very little about their application. With a big project, it can take a while to learn it. So some comments should be explaining, not so much what a method is doing, but when and why it should be called.
Systems that are written by seasoned Smalltalkers with deep understanding of the meaning space sometimes get maintained by inexperienced, lowly paid novices. You may not be writing for yourself, but for the people who come after you when you have moved on to higher things.
-- Dave Harris
Dave Harris's last point nicely captures an oft-overlooked aspect of professionalism in any endeavor. The variant I was taught, and strive to follow, is: "It is your responsibility to Prepare The Way for those who will follow." -- Dave Smith
I wonder if Dave Harris and Dave Smith are making an implicit assumption about the communication path between the author and the reader? I wholeheartedly concur with all they say; it's the mechanism of providing it that I'm interested in looking at more deeply.
Suppose the author is Seasoned Smalltalker, and suppose one reader is Lowly Paid Novice and another is Seasoned Maintainer. In a conventional environment, the two readers (Lowly Paid Novice and Seasoned Maintainer) are forced to read the same material, because the medium cannot adjust its presentation to the reader.
In systems with Wiki Nature, however (or perhaps Mu Nature, an environment I'm currently pondering), the medium has the ability to recognize the cross-product of Seasoned Smalltalker and whichever reader is requesting the page. It's therefore perfectly feasible for Seasoned Maintainer to see completely different text from Lowly Paid Novice, even when accessing the same page.
Imagine a mechanism, roughly analogous to function overloading in C++ or Lisp, that tags "comments" about a fragment of code (like a method) from multiple authors, and supports rules for mapping those comments onto author/reader combinations. This mechanism can be invoked during the Retrieve Page (or maybe c2.com ) in Wiki, so that the actual page that appears represents the result of evaluating the mapping in the context of the specific reader referencing the page (I assume that each reader is Known User, a rash assumption in the current Wiki environment).
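To make the idea concrete, here is a minimal Python sketch of such a mapping. Everything in it is hypothetical (the annotation store, the audience labels, the fragment name); it is not how Wiki works, only one way the selection rule could look.

```python
# Hypothetical annotation store: each note records its author and the
# audience it was written for.
ANNOTATIONS = {
    "render_page": [
        {"author": "SeasonedSmalltalker", "audience": "maintainer",
         "text": "Terse note for maintainers."},
        {"author": "SeasonedSmalltalker", "audience": "novice",
         "text": "Longer explanation for newcomers."},
        {"author": "LowlyPaidNovice", "audience": "novice",
         "text": "A newcomer's own annotation."},
    ],
}

def comments_for(fragment, reader_audience):
    """Return only the notes on fragment aimed at this reader's audience."""
    notes = ANNOTATIONS.get(fragment, [])
    return [note["text"] for note in notes if note["audience"] == reader_audience]
```

A richer rule could also filter on the author field, which is where the author/reader cross-product would come in.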
So while I totally agree with Dave Smith that we should Prepare The Way for Lowly Paid Novice, I also think Kent Beck is following accurate intuition when he suggests that Seasoned Smalltalker(s) would prefer more terse or perhaps even missing comments. I get excited by the opportunity to work with a medium like wiki that lets us effectively accomplish both...we've never had that before.
The nature of the Smalltalk medium is indeed unique and wonderful. Prepare The Way applied to Smalltalk includes "follow Kent Beck's lead, and express your meaning through clear code, and not method comments." -- Dave Smith
Are you saying, Tom Stambaugh, that the Seasoned Smalltalker should actually write every comment twice, once aimed at the Lowly Paid Novice and once at Seasoned Maintainer? Well, maybe. -- Dave Harris
One potential problem with hiding Lowly Paid Novice comments from the Seasoned Smalltalker, is that the Seasoned Smalltalker may never get a chance to learn from the novice. If someone else is finding my code obscure or difficult to understand, or feels my comments are inadequate, I want to know about it.
Also, there's a risk that the Lowly Paid Novice will write a comment which is wrong or misses the point.
I suppose these problems can and should be caught by a code review, but I've worked in environments where they wouldn't be.
-- Dave Harris
Hmmm, I think I'm saying "All of the above". I think there are a multitude of possible approaches, of which you've identified several interesting ones:
Code Author writes all the comments (at all levels). This is your first scenario, where the Seasoned Smalltalker is the Code Author. As you've observed, this puts a burden on the Code Author, and many Code Authors just can't do it.
Each Code Reader annotates the code with his or her learnings (your second scenario). This strikes me as more consistent with Wiki Nature, and I like its organic nature.
Comment Authors write comments, perhaps in conjunction with Code Authors and Code Readers. I wonder if this approach might grow out of such a system in real use, over time.
I particularly liked your final observation. It's been my experience that those "blindingly obvious" insights are precisely the sort of meanings that newcomers DO need to be explicitly told about and that seasoned veterans literally can't imagine not "getting", and so feel no need to explain.
I like some of these suggestions. However, I don't think that the Lowly Paid Novice/Seasoned Programmer dichotomy catches the real issue as I see it. Comments carry information outside the scope of the language, at a different level of abstraction; they are not primarily for clarification of language idioms for those unfamiliar with the language, and they supplement rather than replace clarity of code.
I agree with the Prepare The Way principle, but in my experience a significant proportion of the time the person whose way is prepared by (my) adequate documentation of code is myself (some months or years later). The Code Author at the time of writing the code is privy to the complete design of the piece of code, the reasoning that led to that design over other considered alternatives, and so forth. If, between the code per se and the comments, the author doesn't put down that information, it is lost (up-to-date external documentation is even rarer than internal documentation/comments).
A later visitor, whether the original author or another, may need to reconstruct at some pain the original information, but typically can't do so to the degree that the original author could have at the time of composition.
-- Jim Perry
At the lowest level, all commenting is for purpose, rather than function. I make this distinction for two reasons. One, it was drummed into me when I was first learning assembly language that what the statement says is far less important than what it does, in the overall scheme of what the program is trying to accomplish. Two, even though I am an experienced and skilled programmer, I very often write code that is supposed to do one thing yet does another. If I have comments that tell me what it is that I thought the code was supposed to be doing, I have a better chance of actually making it do that.
I liken uncommented code, no matter how clearly written, to the kind of directions you get when someone tells you, "okay, fourth right, go 4 miles, third right, go 1/2 a mile, seventh left ...". I mean, they're precise, all right, but if you're trying to follow them in the rain or at night, you'll probably get lost if there are more than a couple turns.
Well Commented Code is more like "follow this road to the Shell station, and turn right onto route 302. You'll see the Wal-Mart on the left after 5 miles. Turn at the next stop light onto River Road ... ". I've got signposts (assertions) to show me my subgoals in this piece of code.
-- Joe Mc Mahon
One further thing that needs to be mentioned. It's the second of these two rules:
If your code needs a comment to explain it, rewrite it.
If you can't explain your code with a comment, rewrite it.
(I list both rules because I think they look pretty together.) I've found that documentation exposes flaws. It can be easier to redesign a feature than to document it clearly as it stands. Similarly I have sometimes rewritten code to make the comments easier to write. Truly, one should strive to write the comments first. -- Dave Harris
As some would say, write the user manual concurrently with the code. This exposes many instances when the programmer loses sight of End User functionality. It also throws light on "what the hey are we doing here" in more obscure sections. The above rules of comments are but a minor sidelight of this, since it is too easy to think one's own comment actually explains what one intended it to. -- Bo Leuf
I see an interesting spectrum here.
Sometimes a programmer hacks out code first (Spike Solution).
Other times a programmer writes a one-line test first (Replace Comment With Assertion; Never Write A Line Of Code Without A Failing Test), then writes the code.
Other times a programmer writes Code Unit Test First, then writes the code.
Other times someone (the programmer?) writes a rough draft of a small section of the user manual before writing the code implementing that feature (Write The User Manual As You Go).
Apparently no one has actually written the entire user manual before writing an application (Manual As Specification, Write The User Manual First).
-- David Cary
As my first boss said, "No comments don't lie."
The value of:
++i; /* add one to i */
is significantly less than:
++i;
There may be some value in very short comments, e.g., naming the Gang Of Four design pattern you're using. (Using the name in the method you're invoking would be better.)
Sometimes lying comments are extraordinarily useful. Every so often you see
--i; /* add one to i */
which is a blinking neon sign that the programmer wasn't thinking clearly. When you see that, you have a Very Big Clue about what the code was supposed to do, as opposed to what it does.
I don't advocate commenting every line, by the way; I'm just addressing the particular example. I haven't heard anybody here recommending commenting every line; rather, they say that there's a large practice space between Comment Every Line and No Comments At All.
Comments Tell Why. Code Tells How.
See also History Matters.
This example:
++i; /* add one to i */
is a non-comment. It adds nothing to the communication - it's a tautology. Now if it says
++i; /* record another match of this expression */
that is a comment that tells me what the point of the statement is; and is not more noise in the communication. This way I know that (say) after the current loop, i will contain the number of hits. Conversely, if the number of hits is wrong after someone has modified the code, I can see that I should find out where i got clobbered.
-- Joe Mc Mahon
A better alternative is:
++expressionMatchCount;
(or something slightly less verbose).
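In a fuller setting the same point might look like this Python sketch (the function and its arguments are hypothetical). The counter's name carries what the comment on i had to explain.

```python
import re

def count_expression_matches(pattern, lines):
    """Count the lines that contain a match for pattern."""
    expression_match_count = 0
    for line in lines:
        if re.search(pattern, line):
            expression_match_count += 1
    return expression_match_count
```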
I agree with Joe Mc Mahon; "/* add one to i */" is a classic example of a bad comment, hence of bad coding. Another common form is the "required boilerplate" where each subroutine is preceded by a large box constructed of asterisks or whatever, possibly containing the subroutine name and typically some unfilled slots for author, dates, calling conventions etc. and no actual useful descriptive information. See Massive Function Headers.
Such usage tends to arise in courses or similar environments where some rule about comment frequency is in place, but there is no appropriate feedback about the quality of the comments. Such an environment teaches that comments are an onerous and pointless chore.
However, the fact that a thing can be done poorly is a poor excuse not to do the thing yourself; to go further and advocate therefore that it not be done at all is worse.
-- Jim Perry
(Here's a list of related Anti-patterns or anti-idioms, called "How to Write Unmaintainable Code": mindprod.com -- Katy Mulvey)
Website above moved to mindprod.com but there is no longer any reference to these anti-idioms. (July 1998 -- Bo Leuf)
As of 2005-10, it has moved again to mindprod.com - but that's only the index. Apparently it has grown over the years, split into several sections and a blog.
I've been reading the Wiki source code (see Hyper Perl or c2.com ), which has no method comments. It's been an interesting experience.
(Actually, I've been reading a printout of it as "straight Perl" on the train. It's clearly the wrong way to read it. Would a Smalltalker ever try to read from a static piece of paper? It's what I have time and equipment to do.)
The Hyper Perl links, in addition to being a useful tool for navigating the source (with the right tools), also provide some information about the code itself. This is more true for those code fragments outside any Perl subroutines, notably variable definitions.
I find code without method comments makes me juggle multiple levels of abstraction simultaneously. If I'm trying to figure out what the Print Body Text subroutine does, I need to build that up out of elements such as foreach(split(/\n/,$_) ... okay, I know that means the value of $_ changes within this loop from the whole document to each line of the document, but I've got to spend a half second figuring that out without forgetting all the other stuff I had been trying to figure out.
This is particularly tough in Perl 4, which is infamously short on ways to build up user defined data structures. There are common idioms to build such structures (which end up being stored in strings, very easy to store and retrieve). I can follow them, but it's one or two more things to remember.
Funny thing. Most programmers have great memories; they can do that easily. I have a lousy memory. I can't even keep track of everything that's going on in a short program (a few hundred lines long, say) I'm currently writing. The good news is, I write very clean, modular code with a lot of information hiding. The bad news is, I think I need method comments more than most programmers. --Paul Chisholm
Hyper Perl Wiki is a great read, but I found it very difficult to answer questions about the scope of locking, who has the db open when, etc., without resorting to the straight Perl source. This is the same problem I have when doing database work in Smalltalk: it's often easier to reason about the "big picture" from flattened source than it is from hyper-linked pieces. -- Dave Smith
A thought about comments. If comments lie, and we are game to try to get away with fewer and fewer comments, perhaps we could facilitate this by changing our development environments so that all comments for a method are deleted whenever it is touched. It is kind of an evil way of forcing ourselves out of the comment habit. -- Michael Feathers
An idea that has been bouncing around in my head in a half-formed state suddenly just jelled: Identifiers Are Comments.
Good recognition. Indentation is also comments, and in general source code is comments, but identifiers are the most informative.
If we could compile and execute Natural Language, the comments would be the code.
I doubt it. When teaching a human how to do something, one talks about more than just a sequence of steps, and ways to detect and correct exceptional situations. One also comments on situations where this skill may be useful, the overall goal of doing it this way, the possibility of other methods of reaching the same goal, and sometimes a bit of theory as to why this sequence of steps could possibly achieve that goal.
I agree with most of the patterns listed at the top of the page, but I am a bit uncomfortable with relying on comments for correct program behavior, since Comments Dont Compile. So for patterns like Order Dependency, Un-Done, and Fence It Off, I prefer to add code that ensures that my intent gets carried out. -- Jay Dunning
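A sketch of what that preference could look like in Python. WidgetBatch and the limit of 50 are hypothetical, borrowed from the Fence It Off wording; each check replaces a comment that a caller could have ignored.

```python
MAX_WIDGETS = 50  # hypothetical limit, borrowed from the Fence It Off example

class WidgetBatch:
    """Hypothetical class; every name here is illustrative."""

    def __init__(self):
        self._loaded = False
        self._widgets = []

    def load(self, widgets):
        # Fence It Off, enforced: reject input outside the supported range.
        if not 1 <= len(widgets) <= MAX_WIDGETS:
            raise ValueError(f"supports 1-{MAX_WIDGETS} widgets, got {len(widgets)}")
        self._widgets = list(widgets)
        self._loaded = True

    def total_weight(self):
        # Order Dependency, enforced: load() must have run first.
        if not self._loaded:
            raise RuntimeError("call load() before total_weight()")
        return sum(self._widgets)

    def export(self):
        # Un-Done, enforced: fails loudly instead of silently doing nothing.
        raise NotImplementedError("export is not written yet")
```

The comments remain, but now they explain the checks rather than substitute for them.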
See original on c2.com