For those too young to remember the goto statement, it was used much as Exceptions are today, but had a more concise, elegant syntax #gotocph --@bigballofmud Brian Foote
In 1978, Berkeley scientists did a study on the effects of Go To on rats. Twenty rats in a control group were fed a normal diet, while twenty other rats were forced to program in Apple BASIC. Their health was to be monitored for a month, but after two weeks all of the rats in the programming group had melted into a dense goo that tasted a little like quiche.
But some rats learned to execute GOTO END_OF_MAZE, and finished in record time, while the nested block rats took too long to climb into and out of nested boxes to get to the end.
Seriously, though, don't use Go To statements. Like any four letter words, they are not appropriate for polite company.
Unless of course, failure to use goto results in the code being more buggy and unreadable. Also, there are the hidden gotos (in Cee Language) such as continue, break, or a mid-function return which the extremists generally lump in. Personally, I have seen code which introduced a number of extra control booleans, doubled the number of lines of code with a factor of ten indentation blocks (the stuff that wraps 3 times on an 80 column screen) to be pure and was not correct due to the twisted logic. In the end, there are those who write bad unreadable code no matter what constructs they limit themselves to, and those who write good/obvious code no matter what constructs they use. I like to ask about gotos during interviews to at least see whether the candidate has thought about the issues, or simply regurgitates the dogma. - Chris Hyser
Here's a good rule of thumb: don't use backward Go To statements. Notice that continue, break, and mid-function return statements are all forward gotos (and switch is a computed forward goto). E. W. Dijkstra's argument does not apply to them. I have noticed subjectively that they aren't confusing, unless you jump into something else (like the middle of a loop - see Duff's Device). I don't think there's anything wrong with using goto to do, for example, a double break. Forward gotos are also useful for jumping to cleanup code after an error - they're often much more readable than doing your real work inside an if statement, and they're always more readable than doing your real work in a terraced landscape of nested if statements. (Syntax Matters)
Backward gotos could be OK when it makes sense that your error case would repeat the last iteration. I like to call it rare looping. A good example exists in Beej's networking guide (beej.us): see section 8, "Why does select() keep falling out on a signal?"
E. W. Dijkstra's original paper is here: www.acm.org. One might find it rather hard to read, except the opening paragraph:
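The "rare looping" idea can be sketched in C roughly as follows. This is an illustration only, not the code from Beej's guide: flaky_wait and wait_with_restart are hypothetical names, and flaky_wait merely imitates a system call such as select() that fails with EINTR when a signal arrives.

```c
#include <errno.h>

/* Hypothetical stand-in for a call like select(): fails with EINTR
 * for the first `*interruptions` calls, then succeeds and returns 1. */
static int flaky_wait(int *interruptions)
{
    if (*interruptions > 0) {
        (*interruptions)--;
        errno = EINTR;
        return -1;
    }
    return 1;
}

/* Retries the call only when it was interrupted; any other failure
 * is passed straight back to the caller. */
int wait_with_restart(int interruptions)
{
    int rc;

restart:
    rc = flaky_wait(&interruptions);
    if (rc == -1) {
        if (errno == EINTR)
            goto restart;   /* rare case: repeat the interrupted call */
        return -1;          /* a real error */
    }
    return rc;
}
```

The backward jump reads as "do that again" and is taken only on the unusual path, which is arguably why it does not feel like a loop in the ordinary sense.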
For a number of years I have been familiar with the observation that the quality of programmers is a decreasing function of the density of go to statements in the programs they produce.
Also see Peter Naur's paper "Go To Statements and Good ALGOL Style" (1963).
Also see Donald Knuth's "Structured Programming with go to Statements", ACM Computing Surveys (CSUR), Volume 6, Issue 4, December 1974: pplab.snu.ac.kr
In this article he makes some compelling arguments for why GO TO (and its relatives) are useful. For example he demonstrates how a jump into the middle of a loop can be elegant (example 6c). The article also presents some constructs which could be considered a primitive form of exception handling. (He also mentions the Come From statement, BTW.)
A summary of E. W. Dijkstra's original paper:
We write programs as lines of text in a file. But our programs execute as microseconds of clock cycles in a machine. We as programmers think in terms of the lines of text, but what really matters is what happens during the microseconds of clock cycles. Therefore, we should make the relationship between the two as easy to understand as possible.
Dijkstra used a set of so-called coordinates to show how far along the program execution is at any given point. For a straight piece of code, we can simply cite the file and line number to show how far along the program execution is. For a function call, we provide a call stack. For a loop, we give the number of iterations so far executed in the loop. Dijkstra reasoned that if you can state a set of such coordinates at each point during the execution of the program, then it's clear enough how the lines of text map to the microseconds of clock cycles.
He also reasoned that many uses of Go To break this understanding, because in Spaghetti Code, it becomes impossible to state, at some points during the execution of the program, how far along the program is.
It is important to understand that Dijkstra wrote his paper at a time when people used Programming Languages without Control Constructs, and thus their programs were littered with gotos -- Spaghetti Code. It does not follow that a 10,000 line C program is poisoned by one goto; leaping to that conclusion is an example of Thinking In Cliches.
There are two major approaches to programming: [is there a page that covers this? there should be]
Pragmatic:
Code Smells should be considered on a case by case basis
Purist:
all Code Smells should be avoided, no exceptions
...And therefore a Code Smell is a hint of possible bad practice to a pragmatist, but a sure sign of bad practice to a purist.
For purists, the rule is simple: don't use GOTOs. This is not hard in any language. Even in assembler, one can write macros that simulate structured control constructs.
For pragmatists, I think it's important to notice that even Dijkstra (as I recall) and Wirth (definitely) admit that there are certain circumstances when GOTOs are best practice: to exit deeply nested structures (of IFs/FORs/WHILEs) in case of unrecoverable error, because doing so with a goto results in far more readable code than does the same code rewritten to test for FATAL_ERROR everywhere. E.g.
for (i=0; i<10; i++) {
    for (j=0; j<10; j++) {
        for (k=0; k<10; k++) {
            if (NOT_FOUND == find(i, j, k))
                goto error;
            func1(i, j, k);
        }
        func2(i, j, k);
    }
    func3(i, j, k);
}
error:
    ;
Compared with
FATAL_ERROR = 0;
for (i=0; i<10 && !FATAL_ERROR; i++) {
    for (j=0; j<10 && !FATAL_ERROR; j++) {
        for (k=0; k<10 && !FATAL_ERROR; k++) {
            if (NOT_FOUND == find(i, j, k))
                FATAL_ERROR = 1;
            if (!FATAL_ERROR)
                func1(i, j, k);
        }
        if (!FATAL_ERROR)
            func2(i, j, k);
    }
    if (!FATAL_ERROR)
        func3(i, j, k);
}
...where you can hardly see the code amidst the references to FATAL_ERROR.
ABOUT THE CODE ABOVE: I would say all this code is wrong. The call to func2 cannot be outside the k loop, and func3 cannot be outside the j loop. Using the goto construct clearly shows that this code is designed wrongly. Better-designed code would automatically lead to code without goto statements. Yes (as below), this code really SMELLS, or even better, the design SMELLS. Deeply nested structures are also mostly bad code design (also below). Clearly you don't understand Dijkstra by making such claims. Not fixing SMELLY code leads to high costs in maintenance and updates of the code. So tolerating bad code (as below) is a bad management decision.
Let's say, just for the sake of argument, that func2 and func3 are not really functions on (i,j,k) but just some series of statements that happen after the nested for, which was clearly the point of the example. And as for "designed wrongly", how do you know that without any context? What if you're doing multiplication or other operations on matrices? Just try avoiding nested loops.
If I recall correctly, this was part of the reason (was it Kent Beck or Martin Fowler?) they chose that name for the term. They wanted to emphasize the pragmatic view of Code Smells through the connotation of the word "smell". If something smells, it definitely needs to be checked out, but it may not actually need fixing or might have to just be tolerated.
Deeply nested structures themselves are a Code Smell (cf. Arrow Anti Pattern), but yeah, sometimes they are indeed unavoidable.
For simple cases, it is generally preferable to use "return" rather than "goto". It can be considered to be a form of goto, and some dislike it more than GOTOs (it violates the single entry/single exit condition for functions), but others prefer it:
for (i=0; i<10; i++) {
    for (j=0; j<10; j++) {
        for (k=0; k<10; k++) {
            if (NOT_FOUND == find(i, j, k)) {
                complain(NOT_FOUND);
                return;
            }
            func1(i, j, k);
        }
        func2(i, j, k);
    }
    func3(i, j, k);
}
sub complain(complaint) {
    // pass along complaint and then do whatever was in error:
}
...but this is not directly possible if more work remains to be done following the loops. (Even then, refactoring may save the day.)
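One way the refactoring might go, as a sketch under assumptions: the nested loops move into their own function, so a plain return leaves all three loops while the caller still gets to do its remaining work. The names scan_all and scan_and_report are hypothetical, and find here is a toy stand-in that "fails" at a single chosen cell.

```c
#include <stdbool.h>

#define N 10

/* Toy stand-in for find(): reports failure at exactly one cell,
 * identified by its flattened index `bad`. */
static bool find(int i, int j, int k, int bad)
{
    return (i * N * N + j * N + k) != bad;
}

/* The extracted loops: a return replaces the goto. */
static bool scan_all(int bad, int *visited)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++) {
                if (!find(i, j, k, bad))
                    return false;   /* leaves all three loops */
                (*visited)++;
            }
    return true;
}

/* The caller still has work to do after the loops, whatever happened. */
int scan_and_report(int bad)
{
    int visited = 0;
    bool ok = scan_all(bad, &visited);

    /* "Further work": here it only encodes the outcome in the result. */
    return ok ? visited : -visited;
}
```

With no failing cell, scan_and_report(-1) counts all 1000 cells; with the failure at cell 5 it stops after five. The cost of the refactoring is the extra parameter that carries state across the new function boundary.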
Most people agree that, in languages that support exceptions, this would be an ideal place to throw one, and since setjmp/longjmp is merely a non-local goto, it, too, is justifiable for error recovery. Exceptions, in languages that support them, are a more controlled means of doing exactly the same thing: a non-local goto!
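To make the "non-local goto" concrete, here is a minimal C sketch using setjmp/longjmp for error recovery. The functions process_item, process_all and run are hypothetical, and treating a negative value as the error is just an assumption for the example.

```c
#include <setjmp.h>

static jmp_buf recovery_point;

/* Hypothetical worker: abandons the whole computation on a bad value. */
static void process_item(int value)
{
    if (value < 0)
        longjmp(recovery_point, 1);   /* jump back to where setjmp was called */
}

static void process_all(const int *items, int count)
{
    for (int i = 0; i < count; i++)
        process_item(items[i]);
}

/* Returns count if every item was processed, or -1 if processing
 * was abandoned part-way through. */
int run(const int *items, int count)
{
    if (setjmp(recovery_point) != 0)
        return -1;                    /* arrived here via longjmp */

    process_all(items, count);
    return count;
}
```

Control leaves process_item and process_all without either of them returning normally, which is the same shape as a thrown exception unwinding to its handler. Unlike exceptions in most languages, nothing is cleaned up on the way out.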
The same reasoning leads to the conclusion that GOTOs or their equivalents could be used anytime there is a need to immediately escape from deeply nested control structures and/or functions, e.g. upon finding the target of an extensive search, not just on fatal error.
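The search case might look like this in C. It is a sketch only: find_in_grid is a hypothetical function, and storing the grid row by row in a flat array is an assumption made to keep the example short.

```c
#include <stddef.h>

/* Searches a rows x cols grid (stored row by row) for target.
 * On success stores the position in *row and *col and returns 1;
 * otherwise returns 0 and leaves them untouched. */
int find_in_grid(const int *grid, size_t rows, size_t cols, int target,
                 size_t *row, size_t *col)
{
    size_t i, j;

    for (i = 0; i < rows; i++) {
        for (j = 0; j < cols; j++) {
            if (grid[i * cols + j] == target)
                goto found;         /* leave both loops at once */
        }
    }
    return 0;                       /* searched everything, no match */

found:
    *row = i;
    *col = j;
    return 1;
}
```

Here the goto is not an error path at all: it marks the successful end of the search, and the label names the state the program is in when it arrives.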
The question is merely, does the GOTO improve legibility or lessen it. In almost all circumstances it lessens it, but there are exceptions.
Interestingly, since exceptions are a type of GOTO, precisely the same reasoning applies to them: most of the places they could be used for non-error processing, they should not be.
I am under the impression that a function call was considered an acceptable alternative to a GOTO, even in the original debate about GOTO's. An exception system makes things even easier (although the Java exception system is fundamentally broken compared to several of those that preceded it).
Throwing an exception is nothing like calling a function, because control never returns to the caller (the thrower).
Yes, in this particular example, I personally would agree that using "complain(NOT_FOUND); return;" as in the code above is preferable to the goto in my previous example, and don't get me wrong, that's what I usually do in this precise circumstance. But:
Many people think that return out of the middle of a function is just as bad, stylistically, as the goto you replaced, because it violates the single-entry single-exit condition for functions.
If there is further work to be done in the current function, then using return won't work.
(Just wanted to throw in that quick comment before I returned to the stuff below later, which is getting messy.)
Returning from the middle of a function is language-dependent. Zillions of languages, and in particular Pascal, viewed as the king of the structured languages in the 1970s when Goto Considered Harmful was a new hot subject, did not allow return from the middle of a function. Which is a very good example of why those languages have now been replaced by Cee Language or its descendants. On the other hand, by the time I started coding in Pascal in 1982, return could be called anyplace you wanted. The difference between a "procedure" and a "function" was that we had to assign a value to the function name in order to return a value from a function. I'm pretty sure that's how the language works today, but I haven't coded it in twenty years.
Pascal (at least the version described by Wirth in Algorithms + Data Structures = Programs and Systematic Programming: An Introduction) could not return from the middle of a function, but it did have a goto statement, which Wirth used in the mini-compiler at the end of Algorithms + Data Structures = Programs in order to handle an error by jumping to near the end of a function. See page 301. He not only does a goto, but he goes to a label inside of a calling function! For example, the following only prints out "there":
program test;
label 10;
begin
  goto 10;
  writeln('hello');
  10: writeln('there')
end.
Given that, can you see that all this about exceptions and returns is just nitpicking? The point is that sometimes, in some languages, GOTOs are best practice.
No, I don't think a GOTO such as you're proposing has EVER been "best practice". I haven't even looked at a Pascal manual in twenty years, so I've long since forgotten what the Pascal syntax is or was for this situation. It surely did NOT include GOTO as "best practice", however (given Pascal's role as the first "structured" language, Wirth and all that).
Is it true all the time in all languages? Of course not. Some languages aren't even imperative to start with. Or don't even have a GOTO.
And even with your purported solution in C, surely you don't think that using a return will always be better than a GOTO-on-error? In a more complicated function, when return is the wrong thing to do, because one needs to get out of deep nesting but not leave the function yet, refactoring will always allow you to fix that? Always? I don't think so.
In more than twenty years of coding, it always has for me.
And even if it were true, the refactoring might have to wait for another day, and until it happens, the GOTO solution might be the best interim one.
Nitpicks: death of a thousand cuts.
Since 1982, when I was paid to write code for the first time (after 7 years doing hardware engineering), I've written code in C, Pascal, Object Pascal, Object Cee, Post Script, Smalltalk, Lisp, Cee Plus Plus, Java, and so on and so forth. I have never needed to use GOTO. Not once. I've needed inline ASM directives, platform functions, deep preprocessor magic with fiascos like CORBA, but not ONCE have I needed GOTO. I don't even know the syntax these days. The paragraph at the top cites "return" as a forward GOTO, which Dijkstra's argument did not apply to. Nitpick all you want. If you find it a "best practice", be my guest. If I'm supporting code like we were talking about here, I'll replace it with my function call. GOTO is just useless baggage that hasn't been necessary for two decades.
See In All My Years Ive Never. I've programmed from 1980 in several languages and I have needed to use it a handful of times (3-6) for precisely the above-mentioned use, as an exit condition of a nasty loop.
This study has been cited by almost every programming teacher as the reason that the use of Go To statements in code is punishable by expulsion. Except in COBOL, where it's not like removing them would be much help.
Really bad COBOL has
ALTER <paragraph1> TO PROCEED TO <paragraph2>
(It's self-modifying code.)
Now that Go To seems to have largely fallen through a trap door, it makes one wonder what has taken its place as the top Warning Signs Of Bad Programming?
For my money, the new number one position belongs to Side Effects In Functions. When a system contains side-effects in functions, the overall behaviour of the system is immeasurably harder to understand.
Of course, like Go To in the 60's and early 70's, it's widely used and the dangers are not widely appreciated (yet).
I think the problem is not Side Effects In Functions, but Side Effects In Queries. A strict interpretation of the word "function" includes a method that returns a boolean value stating whether the operation succeeded or failed - and that obviously must have a side effect. But there should always be separate and distinct methods for performing a query and updating data - a query shouldn't have side effects.
The #1 bad sign is Long Method Smell.
Hmmm, now that Go To has fallen through the trap-door, it would be nice to see Intercal's Come From more widely used. All the disadvantages and none of the readability!
Shouldn't this topic be consolidated with Go To ?
Many people think that return out of the middle of a function is just as bad, stylistically, as the goto you replaced, because it violates the single-entry single-exit condition for functions.
Huh? What condition is this? Where does that notion come from?
It came from the original very definition of "structured programming" itself. Technically, you are not doing structured programming if you violate that condition. Pragmatists don't care about labels, but purists who avoid GOTOs at all costs should be keenly aware of the contradiction in violating the single-entry single exit condition. Here's one link, of many, that arose when I searched on the term:
My concept of a function is that it takes input, tries to do some processing and reports - as soon as possible - either a result, or an indication that processing was impossible for some reason, in which case the input data should be (still in | returned to) its original state.
If there is further work to be done in the current function, then using return won't work.
Which is why you wrap the current bit of work up in a function. Forward GOTOs can be removed, generally speaking, by Extract Method and allowing returns from the middle of the new function. Of course, the same potential problems arise here (mainly, potentially having to pass in a large amount of context) as when refactoring away Trivial Do While Loop. But how did your code ever get that complicated?
It's not necessarily that complicated, all it takes is any amount of further work at the end of the function, e.g.:
func() {
    while (condition()) {
        work1();
        if (fatal_error())
            goto done;
        work2();
    }
done:
    x = sqrt(x) + x*x*x;
    work3();
    return work4();
}
That's not complicated, but you can't change the goto into a return without Extract Method or something...and yes, a purist will want to do exactly that refactoring, but a pragmatist may believe that it is more important to keep all of the related work together in one place where it is easier to understand and grasp all at once.
To a pragmatist, it's not a question of being forced to use a goto unwillingly, it's a question of whether it will increase readability in some circumstances such as this. A purist will insist that the pragmatist shouldn't have that freedom, or that the result will inherently be less readable. There isn't a "right" answer at this point, because there isn't agreement on the premises used to deduce an answer.
The example given is trivial to refactor:
func_loop() {
    while (condition()) {
        work1();
        if (fatal_error())
            return;
        work2();
    }
}
func() {
    func_loop();
    x = sqrt(x) + x*x*x;
    work3();
    return work4();
}
As long as the two functions are nearby in the source file, readability shouldn't suffer, and if you can give a Meaningful Name to the loop, it should even increase (as you don't need to keep the context for func() in mind when looking at the loop). But a simple variation on this becomes much more difficult to refactor:
func(num, str) {
    x = num / 2 + 34;
    y = substr(0, x, str);
    while (condition(x, y)) {
        y = work1(x, str);
        if (fatal_error())
            goto done;
        x = work2(y, num);
    }
    do_regular_termination(x, y);
done:
    x = sqrt(num) * y*x*x;
    str = work3(x, num);
    return work4(y, str);
}
The addition of local variables means that Extract Method would need a parameter list of 4 variables, and would need to return both x and y (meaning a struct in most languages). The problem persists if you try to do an Extract Method on 'done:' and tail-call that from the loop and at the end of the function: you'd need to pass in x, y, and num.
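For a sense of what that Extract Method costs, here is a rough C sketch. Everything in it is an assumption made for illustration: the struct loop_state, the function run_loop, the integer-only types, and the arithmetic standing in for work1(), work2(), fatal_error() and do_regular_termination(). Only the shape matters: the extracted loop must be handed its context and must hand back both x and y.

```c
#include <stdbool.h>

/* State the loop both reads and updates, bundled so that it can be
 * passed in and handed back as one unit. */
struct loop_state {
    int x;
    int y;
};

/* Extracted loop: returns false if it stopped on the "fatal error". */
static bool run_loop(struct loop_state *s, int num, int limit)
{
    while (s->x < limit) {              /* stands in for condition(x, y) */
        s->y = s->x * 2;                /* stands in for work1() */
        if (s->y > 100)                 /* stands in for fatal_error() */
            return false;
        s->x = s->y + num;              /* stands in for work2() */
    }
    return true;
}

int func(int num, int limit)
{
    struct loop_state s = { num / 2 + 34, 0 };

    if (run_loop(&s, num, limit))
        s.y += 1;                       /* stands in for do_regular_termination() */

    return s.x + s.y;                   /* shared tail, formerly the done: label */
}
```

The goto is gone, but the reader now has a struct, a second function, and a boolean result to keep in mind, which is the trade-off under discussion.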
This illustrates a legitimate use for Go To: control flow transfer out of a loop where there's a large amount of lexical context that the loop requires. It also illustrates why Go To is a Code Smell - most algorithms shouldn't need this much context, and it's worth trying to see if the algorithm can be simplified before reaching for the Go To. But if it can't, it's good to have it available.
This point about lexical context is exactly what I was trying to get at when I said:
Of course, the same potential problems arise here (mainly, potentially having to pass in a large amount of context) as when refactoring away Trivial Do While Loop. But how did your code ever get that complicated?
The stuff that gets passed into the loop is the "complication".
The code example in question is not "that complicated".
Yes, it is, because it's contrived. It's not complicated in, say, number of statements or variables, but those are the wrong measures of complexity. Conceptually, this code is complex because it has no purpose, neither an implied nor a stated one, other than twiddling some variables (which it does with a distinctly non-obvious code flow). Pointing out that a refactoring would need a large context transfer is pointing out that this code smells -- goto or no. And "do_regular_termination" is a red herring: why is calling a function with two local variables acceptable, but not three? What exactly is "lots of context"?
This example doesn't convince me that there is code that uses lots of lexical context in a loop and cannot be easily refactored. I see your point in theory, but dispute its application in practice. Note that the goto in this example still smells, but it smells independently. Break statements or exception handling both result in code that is more obvious (if not, of course, shorter, but that's penny wise, pound foolish). Whether these are "gotos in disguise" and whether that's also harmful is a separate discussion.
Dijkstra's article really addressed the issue that a goto in the middle of a loop with an autoindex causes difficulty in understanding program flow. In that limited sense the argument is strong. He didn't address at all the use of goto in general coding practice, other than a general comment in his introduction that quality of programming was inversely related to goto density. kd4ttc
I looove the labeled block in Java: it makes people's hair stand on end.
found_ok:
{
    for retries = 1 to 3 {
        loop through items {
            if we found one, break found_ok.
        }
        do a retry delay;
    }
    throw new "oh no! none were found after 3 retries!";
}
// we found an entry ok
This is pretty awkward. It looks like a backward goto but it really is a forward goto. Confusing. Maybe they should have had it typed out "{ ...; break found_ok; } : found_ok;" This would clearly indicate the start and destination of the jump.
Why not Extract Method it? Here's how in Perl Language you can use a sub to return a value, instead of a block and having to throw an exception:
sub found_ok {
    my @items = @_;
    foreach (@items) {
        return 1 if $_ eq "foo";
    }
    return; # gives undef
}
Then you can handle the result of the check a little more cleanly with "if (found_ok(@things))". Of course, this depends on if you think Internal Loop Exits Are Ok or not. You can also trivially modify the above to provide a Single Function Exit Point:
sub found_ok {
    my @items = @_;
    my $success;
    foreach (@items) {
        $success = 1 if $_ eq "foo";
    }
    return $success;
}
What about this C function:
void someFunc() {
    T *X1, *X2;
    ...
    X1 = (T*)malloc(...);
    ...
    if (FAILED(func1())) goto clean1;
    ...
    X2 = (T*)malloc(...);
    ...
    if (FAILED(func2())) goto clean2;
    ...
clean2:
    free(X2);
clean1:
    free(X1);
}
Am I supposed to make a function call per malloc? These gotos produce essentially the same code as would class instances with destructors in C++. Fragmenting such a function to avoid gotos can and will reduce readability, as well as force you to think up useless function names like someFuncBeforeAllocatingX2(). Sometimes this refactoring can be done nicely, sometimes not.
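A compilable variant of the cleanup-label pattern, as a sketch: copy_twice is a hypothetical function, invented only so that there are two allocations and a failure that can occur after both. Unlike the fragment above, it also checks the malloc results.

```c
#include <stdlib.h>
#include <string.h>

/* Writes src followed by src into out. Returns 0 on success and -1 on
 * failure; every exit path frees exactly what has been allocated. */
int copy_twice(const char *src, char *out, size_t out_size)
{
    int rc = -1;
    size_t len = strlen(src);
    char *first, *second;

    first = malloc(len + 1);
    if (first == NULL)
        goto done;
    memcpy(first, src, len + 1);

    second = malloc(2 * len + 1);
    if (second == NULL)
        goto free_first;
    memcpy(second, first, len);
    memcpy(second + len, first, len + 1);

    if (2 * len + 1 > out_size)
        goto free_second;           /* output buffer too small */
    memcpy(out, second, 2 * len + 1);
    rc = 0;

free_second:
    free(second);
free_first:
    free(first);
done:
    return rc;
}
```

The labels run in reverse order of allocation, so jumping to any of them releases only what exists at that point. The success path falls through the same labels, which keeps the cleanup in one place.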
Why not use if statements? Block scope and <8 space tab widths are your friends.
void someFunc() {
    T *X1 = (T*)malloc(...);
    ...
    if (SUCCEEDED(func1())) {
        ...
        T *X2 = (T*)malloc(...);
        ...
        if (SUCCEEDED(func2())) {
            ...
        }
        free(X2);
    }
    free(X1);
}
If clauses are a good solution, but it bothers me that this issue of destruction interferes with the flow of the program presented to the programmer. By that I mean, in C++ the same function would look like this (given a T_wrapper that does the needed [de]allocation during [de]construction and has operator T*):
void someFunc() {
    T_wrapper X1;
    if (FAILED(func1())) return;
    ...
    T_wrapper X2;
    if (FAILED(func2())) return;
    ...
}
And it seems to me that what I wrote looks closer to the form I want, compared with the code that puts ... inside if clauses. Besides, it can get more complicated than this example: if the last ... is something like two or three nested for statements that can terminate at any time, I'd like to goto clean2 there also.
The gotos implement destruction; I don't think of them as part of the flow control of the "actual" program. I'm merely emulating C++ here. It's similar to using function pointer tables to emulate virtual in C. It looks uglier of course, but this is strictly in the syntax level. Of course, hand-crafted emulation is prone to errors, and it would be best to do it using a C++ to C compiler, if only I could find one.
Why not pass a function pointer to a function that allocates T, calls the function pointer function and then deallocates T?
That's an extra virtual call (call [ecx]) per allocation, and you have to split someFunc() into parts (each time you allocate something, you need another function like someFuncAfter3Allocations()) which is exactly what I was trying to avoid. I don't see how that is more acceptable in any possible sense than using cleanup labels at the end to emulate destruction.
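For comparison, the function-pointer suggestion might look like the following C sketch. The names with_buffer and fill_and_sum and the void *ctx convention are assumptions made for illustration, not anything from the posts above.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates a buffer, lends it to `body`,
 * then frees it whatever `body` returned. */
int with_buffer(size_t size,
                int (*body)(char *buf, size_t size, void *ctx),
                void *ctx)
{
    char *buf = malloc(size);
    if (buf == NULL)
        return -1;

    int rc = body(buf, size, ctx);
    free(buf);
    return rc;
}

/* Example body: fills the buffer with 0, 1, 2, ... and adds the
 * values to the int that ctx points at. */
int fill_and_sum(char *buf, size_t size, void *ctx)
{
    int *sum = ctx;

    for (size_t i = 0; i < size; i++) {
        buf[i] = (char)i;
        *sum += buf[i];
    }
    return 0;
}
```

A call such as with_buffer(4, fill_and_sum, &sum) guarantees the free, but each additional resource needs another callback and another way to smuggle locals through ctx, which is the fragmentation objected to above.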
Idea: How about matching each goto with a Come From into a Goto Come From Pair so that gotos are only allowed from sources explicitly named at the target label?
See: Go To, Recv Considered Harmful, Else Considered Smelly, Return Considered Smelly, Objective Evidence Against Gotos
See original on c2.com