Study The Source With A Debugger

Study The Source With A Debugger is the act of acquiring knowledge about third-party libraries and frameworks by tracing the execution of the program in a symbolic debugger or other interactive run-time environment. It brings the code and design to life in ways that simply reading the source can never achieve. It is helpful to have a goal in mind while doing this, such as figuring out why your incremental evolution of the library/framework is not working.

Challenge

I believe the premise has the relationship backwards. One should use a debugger for debugging, i.e., isolating problems. Yes, use a debugger if an "incremental evolution to the library/framework is not working," but if one is using the debugger just for "acquiring knowledge about third-party libraries and frameworks," the reasonable reviewer can only conclude there is no benefit. The onus is really on this page to present a case for why Study The Source With A Debugger is a beneficial activity. -- anonymous

In the absence of further refinements, this "pattern" is likely to be viewed as an Anti Pattern. If nothing else, it is certainly a strong Code Smell that a source has to be studied with a debugger. In the absence of any justifying or compensating factors, such a source should be regarded as very poor quality. At the opposite end of the spectrum, see Tex The Program. -- Costin Cozianu

Because prima facie, if you have two sources doing the same thing, and you need only read one, while you have to study the other with a debugger, then obviously the first one is better. The burden then rests with the proponents of Study The Source With A Debugger to show that, for certain classes of problems or within certain engineering constraints, the only sources that can be written are ones for which Study The Source With A Debugger is the best option, i.e., that it is not possible to write good sources by classical standards. And may we also quote the venerable Tony Hoare on this subject:

"I conclude that there are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other way is to make it so complicated that there are no obvious deficiencies."

When you Study The Source With A Debugger, you're begging for the non-obvious deficiencies, and you are not likely to find them anyway.

Real code, like real DNA, contains a large volume of information, of which only a tiny percentage is relevant at any one time for any one task. In mature, well-factored, singly-inherited object-oriented code, things that are closer to the root of the inheritance hierarchy tend to be more relevant than things near the leaves (because heavily-reused fragments tend to migrate towards the root), but even there MOST things don't matter for any given task at hand. It is well-nigh impossible to determine, from a hard-copy or online view of the source, which parts of the source are relevant. It can be done, of course. But the debugger supplies run-time context that yields invaluable information about what does and doesn't matter. Here are some specific things that crop up:

Unused Methods: Sometimes methods that are never called hang around, either because they were once used or because somebody thought they would be needed "someday". The question of "can I delete this method?" is usually completely different from "do I need to understand this to add my feature?". A debugger answers the latter in a way that no source-only traversal can.

The above claim is trivially false. In all seriousness, for any non-trivial code base you cannot assume that you have covered all execution paths after one debugging session, nor even after 1000 debugging sessions. Maybe the "debugger" here is being conflated with a code browser? You can discover dead branches with a code browser (with the pre-condition of having a pretty good idea about the code base to begin with) in a way that no debugger can help you with.

Let me try to clarify the apparent confusion about what I meant. In a source-code debugger, you only traverse called methods. The developer therefore doesn't have to bother reading and understanding unused methods, because they are never called. One doesn't need to cover all execution paths; I'm not suggesting that these methods be removed or touched in any way. I'm instead suggesting that the source-code debugger allows the developer to focus on methods that are used and ignore the others.

I still don't get it: what do you call "unused"? If you think you can separate unused methods from used methods, I submit to you that you can't do that without covering all the execution paths. If by "unused" you mean not used as often as the more frequently used ones, how do you know that your few debugging sessions are statistically relevant? You're way better off with automated test suites and a good profiler. That combination will give you much better data in this regard than the debugger, and even that can be considered just the beginning of comprehending some aspects of the code base.

I call "unused" the methods that don't show up in the debugger when I exercise the code paths I'm going to be modifying. Please remember we're not talking about taking anything out, we're talking about how to rapidly identify and navigate to the regions of code that need attention.

Comments That Lie: No matter how hard we try, comments get out of sync with reality. Without a debugger, we have to do LOTS more work to discover that a method does something different from what its comment says. This is especially true in environments (such as third-party Java packages) where the source of specific methods isn't necessarily available.

If the source is not available, you can't debug it. Surround the call with a pre-condition / post-condition check to see if it does what it claims to do; the debugger is typically the slower tool for that.
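
For instance, such a check might look like this sketch (ThirdPartyText is a hypothetical stand-in for a closed-source class whose comment claims it trims and lower-cases its argument):

    /** Hypothetical stand-in for a closed-source third-party class. */
    class ThirdPartyText {
        /** Comment claims: trims whitespace and lower-cases the argument. */
        static String normalize(String s) {
            return s.trim().toLowerCase(); // stub; real behavior unknown
        }
    }

    public class NormalizeCheck {
        public static void main(String[] args) {
            String input = "  Hello World  ";

            String result = ThirdPartyText.normalize(input);

            // Post-condition: compare the result against the comment's claim.
            if (!result.equals(input.trim().toLowerCase())) {
                throw new AssertionError("normalize() lies; got '" + result + "'");
            }
            System.out.println("the comment's claim holds for '" + input + "'");
        }
    }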

If the source is not available, you can't debug it. Wrong. That is not true in Smalltalk, and perhaps other environments as well. In any case, one doesn't have to go inside the non-sourced method; the point is that a source-code debugger lets you see what it does, in comparison to what its comment says it does.

What do you mean by "lets you see what it does"? Does it mean that you stare at the state of objects before calling the method, and then after, and then try to figure out the correlation? I don't think that a debugger "lets you see what it does" except for the most trivial examples.

It doesn't usually take much staring. Pre-condition and post-condition testing helps.

Method Names That Lie: We work to choose Intention Revealing Names, especially for methods. In spite of our best efforts, we sometimes discover AFTER the method is named and wired into a gazillion callers that the intention we named it for was wrong. Sometimes it is simply TOO HARD, or takes too long, to fix all the callers. So we leave the name the same and change its behavior. It sucks. It's wrong. It's a code stench. We might comment it (but see above). Nevertheless, these things are MUCH easier to uncover in a debugger than in straight source.

It is quite simple to rename a method and leave a shim method with the old name. One can then go back and update callers as appropriate while still having a well-named method and a documented name translation. This is often a beneficial approach when one has published interfaces but wants to change names or calling signatures. Internally to a program, though, a method with high fan-in usually indicates the need for another level of abstraction. There is probably a lot of shared code among that large number of callers that should be consolidated.
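
A sketch of the shim approach (OrderBook and its method names are hypothetical):

    class OrderBook {
        /** The new, intention-revealing name; the behavior lives here. */
        public void cancelExpiredOrders() {
            // ... consolidated implementation ...
        }

        /**
         * @deprecated Renamed to cancelExpiredOrders(). Kept as a shim so
         * existing callers keep working; update them as you touch them.
         * This comment is the documented name translation.
         */
        @Deprecated
        public void cleanup() {
            cancelExpiredOrders();
        }
    }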

It is quite simple to rename a method and leave a shim method with the old name. One can then go back and update callers as appropriate while still having a well-named method and a documented name translation. You advocate this in the context of an argument against using a source-code debugger? At best, leave well enough alone, unless the offending method contains the problem you're trying to solve. [I am sorry, my comment was unclear. My only intention was to indicate that the claim that "it is simply TOO HARD or takes too long to fix" poor method names is an incorrect assertion.]

Internally to a program, though, a method with high fan-in usually indicates the need for another level of abstraction. We apparently have very different views of how OO code should be refactored. A method with a large number of callers, in my view, usually indicates a mature method that is frequently reused. High fan-in is an even stronger argument to leave it alone if possible. The point is not to "fix" such methods; the point is to quickly see what's going on and keep moving.

Abstraction-Level Shifts/Generated Code: GUI builders (Visual Age comes to mind) are notorious for generating large amounts of source code that works, but that contains virtually NO useful information at the source-code level. A method that, for example, collects a symbol from a click, looks it up in a dispatch table, and invokes #perform/eval()/exec() on the result is essentially meaningless without knowing, at run time, what the symbol, dispatch table, and receivers look like.
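
A Java sketch of that kind of generated dispatch (all names hypothetical): reading the source of onClick tells you almost nothing; every interesting fact lives in the run-time contents of the table.

    import java.util.HashMap;
    import java.util.Map;

    public class ClickDispatcher {
        private final Map<String, Runnable> dispatchTable = new HashMap<>();

        public void register(String symbol, Runnable action) {
            dispatchTable.put(symbol, action);
        }

        /** Invoked by the GUI with a symbol collected from a click. */
        public void onClick(String symbol) {
            // Which behavior runs? Only the run-time table knows.
            dispatchTable.get(symbol).run();
        }

        public static void main(String[] args) {
            ClickDispatcher d = new ClickDispatcher();
            d.register("save", () -> System.out.println("saving..."));
            d.onClick("save"); // invisible in the source alone
        }
    }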

By the way, I realize that I'm making a HUGE assumption, namely, that the debugger presents the source. All of my comments above assume a Smalltalk-like inspector (Venkman Debugger for JavaScript, Komodo for Perl, Visual Age Java, etc.) where I'm looking at source code in context, with variable state, at the same time. I contrast this with the various Unix and assembler debuggers that require the developer to emit line-noise hieroglyphics to extract information.

See Also: Use The Source Luke, Experiment Study Refine, or for an alternate viewpoint, Forget The Debugger


Yes, but if you must do this, it kinda says something about the quality of the product's documentation, and how (un)readable the source is, doesn't it?

No. It says something about the complexity of the requirements and the extent of refactoring. Sure, it's easy to read flat procedural functions (if not too much invention of home-built objects is going on), and you really don't have to, because they're easy to document. But an object-oriented framework is another matter. Understanding these without a debugger is like understanding a video tape without a VCR. :-)

What do you mean by "understanding?" I can use lots of things without having to understand very much about them. I would worry more about the code I write than the code I use.

I guess it is most useful when you encounter code that relies on a non-obvious side effect of certain methods to work. You have very little hope of seeing this at first glance unless you run the code first. Of course, this way of coding should not have been used in the first place ...

Why should one be concerned if the code "relies on a non-obvious side effect of certain methods to work"? How would one even know where to look to find such things? My admitted preference is to investigate code only on a need-to-know basis, i.e., when it is not working or the way it is working needs to be changed, and I fail to see the rationale for just plowing through code, either with a debugger or a text editor.


No to you, too. "There is no substitute for understanding." Unfortunately, the debugger doesn't provide understanding, it only gives clues. Careful thinking about what actually happened (for the examples we followed in the debugger) will eventually lead to understanding, but is much more difficult without the source. And, in any case, the debugger always shows you exactly what the code did, not what it was supposed to do. The source code (which is not always available, by the way) is often much more informative about the intent. -- Steve Powell


Seems to me that this would not work so well for code with lots of delegation going on (jumping all over the place). I just recently had this technique recommended to me, but I haven't tried it yet, so please correct me if I'm wrong.


For me, studying code in a debugger is like doing biology in vivo. All sorts of useful and valuable information can be learned from texts and in vitro experiments. In order to really understand what's going on, however, you have to be looking at the real thing, in context. Similarly, if you're trying to get an automobile engine running, there's only so much you can learn from plans, manuals, and TV shows. At some point, you HAVE to get under the hood while it's running (or not running).


There seem to be many different contexts.

debugging: The program doesn't work the way I expected. Where is the problem?

functional testing: I just wrote some fresh code. Does it really do what I intended?

curiosity: The program seems to work fine, but I want a better understanding of *how* it does that. Maybe I can learn some new techniques.

learning: I'm learning a new language. I want to *see* the effect of do...while(); and while(){...}; for myself (a small sketch follows this list), not just listen to what someone *claims* they do.

speed optimization: The profiler says that Foo() is our bottleneck. What *exactly* is Foo doing? (Is this calculation *really* optimized away at compile time?)

... any others I missed? ...
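
For the learning context above, a sketch of the kind of experiment one might run or single-step (plain Java): both loops test a condition that is false from the start, but do...while still executes its body once.

    public class LoopDemo {
        public static void main(String[] args) {
            int n = 0;

            while (n > 0) {
                System.out.println("while body");    // never printed
            }

            do {
                System.out.println("do-while body"); // printed exactly once
            } while (n > 0);
        }
    }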


Lots of people recommend

"Immediately after writing/changing a method, Use The Debugger to single step through it."

Do Unit Tests make this recommendation obsolete?

I would say that Test First Design, as opposed to merely having Programmer Tests (nee Unit Tests), eliminates the rationale for always single-stepping through a change. If the test fails initially and then passes, there is no need to look at it any further. Mission accomplished; go on to the next item to be addressed. If, however, the test initially and unexpectedly passes, or if the test still fails after implementing the necessary code, then it is time to use debugging tools.
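
A minimal sketch of that rhythm, assuming JUnit 4 and a hypothetical Discounter class under test:

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    /** Hypothetical class under test. */
    class Discounter {
        private final double rate;
        Discounter(double rate) { this.rate = rate; }
        double apply(double price) { return price * (1 - rate); }
    }

    public class DiscounterTest {
        @Test
        public void appliesTenPercentDiscount() {
            // Written first, this test fails (red); once apply() is
            // implemented, it passes (green) and needs no further attention.
            // Only an unexpected pass, or a failure that survives the
            // implementation, calls for debugging tools.
            assertEquals(90.0, new Discounter(0.10).apply(100.0), 0.001);
        }
    }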

That being said, I also have to point out that I have worked with a programmer who always single-stepped through his code after writing it. It personally drove me crazy; I wanted to see the result rather than wade through how the program got there, but he was effective using his style. I would not recommend always single-stepping through code as a standard practice, but I would not forbid it either.


When I came to this page, I expected to see something along the lines of "Given the choice between understanding a code base by either using a debugger in conjunction with code reading, or using code reading alone, you should prefer to use the debugger."

Instead, I see a lot of complaining that code shouldn't need a debugger to be understood, which is another point entirely!

Is there anyone who will seriously argue that when trying to understand code that you don't currently understand, you're actually worse off if you use a debugger? (And I assume you will apply as much intelligence to the use of the debugger as your code reading, so no fair implicitly invoking some mythical code wizard who can read kilobytes of crappy code with single-char variable names and lying comments with instant comprehension but is somehow too stupid to set a conditional breakpoint or jump out of a loop when needed.) Is there anyone who can seriously argue that seeing the code running and alive is somehow less useful than only seeing it dead?

I'd submit that there are lots of words here from people who spend more time studying code and writing pages like this than actually making things work. I'm also guessing that most of the brickbats come from those who have the least experience with source-level debuggers as mentioned above (Smalltalk, Venkman, Wing/Komodo, etc.). -- Tom Stambaugh

The lack of imagination and cultural background (other than Smalltalk, mind you) that affects the Smug Smalltalk Weenie community never ceases to amaze. So, here are some facts:

there are study-the-source-code exercises where the debugger is as good as useless;

there are languages and platforms where the debugger is perhaps an afterthought and is not used anyway, and hardly anybody raised in that culture misses it;

any non-trivial algorithm, when studied with the debugger, makes you miss the forest for the trees. An execution trace with some debugging printouts may be much more relevant (a small sketch follows).
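
To illustrate the execution-trace point with a hypothetical example: one indented printout of a recursive algorithm shows the whole shape of the computation at a glance, where single-stepping shows only one call at a time. (String.repeat assumes Java 11 or later.)

    public class TraceDemo {
        // Recursive Fibonacci with a debugging printout at each entry.
        static int fib(int n, int depth) {
            System.out.println("  ".repeat(depth) + "fib(" + n + ")");
            if (n < 2) return n;
            return fib(n - 1, depth + 1) + fib(n - 2, depth + 1);
        }

        public static void main(String[] args) {
            System.out.println("result = " + fib(4, 0));
        }
    }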

So the contention that "Study The Source With A Debugger brings the code and design to life in ways that simply reading the source can never achieve" is simply false as a matter of fact. It may survive if you restrict its generality to some Smalltalk corners of the world.

For the record (in case Tom Stambaugh is still curious or just guessing), I do make things work, and I do use debuggers much more often than I'd want to. Almost every time I use a debugger, I end up wishing that I hadn't (because it is the lazy way out of having to think harder, and it is a very wasteful activity in terms of time for any non-trivial piece of code). Not rarely, I end up with no immediate results from the debugging session, but after stressing my brain for 5-15 minutes I end up solving the problem. -- Costin Cozianu

See original on c2.com