Switch Statement

A Switch Statement is a more structured way (than multiple if...else statements) to express conditional logic in C-like languages. Usually looks something like this:

switch(expression) { case a: Do''''Something(); break; case b: Do''''''Something''''''Else(); break; default: Handle''''''General''''''Case(); break; }

Other non-C-like languages have similar constructs, often with the keywords "case" or "select", often without the need for an explicit "break".

A switch can be invaluable in parsing, or as part of an interface between Object Oriented code and Procedural Code. A switch is easier to read than a long if-then-else stack, and almost always faster as well.

Discussion on speed and Optimization

"...almost always..."

Can you substantiate this? Does it apply to all languages with switch statements?

[It is optimized by the compiler as a whole into typically one of three different machine code approaches, which is made possible by the fact that the case expressions are constants. If the case expressions form a dense range, then the whole thing is turned into an indirect jump through a table, for instance. Basically it is optimized to be as fast as it possibly could be. A series of if-then-else statements almost always has to be simply implemented as it was coded, therefore e.g. a sequence of 50 if-then-elses will be as much as 50 times slower than a switch. This is all well-known in the compiler world; it is not something controversial.]

[Some languages do not require the case expressions to be constants, e.g. Lisp (cond), in which case it is just syntactic sugar on top of if-then-elses and not a true Switch Statement as invented in Pascal and C. COND is not a switch! A switch is spelled CASE in Lisp (see below).]

With optimizing compilers and so many different languages to consider, it is dangerous to generalize, but the basic speedup that a switch exploits is that expression is evaluated only once, then compared against each case until one matches. Also, since switch statements are so common, they tend not to be just syntactic sugar, but directly supported right down to the hardware.

The big advantage is that most computers, in assembly language, support computed indirect jumps. In other words, I can make an array of target addresses, and then in order to switch off on a value X, I just use X as an index into this array, fetch the address of a piece of code, and jump to it. This is faster than if... else if... else if... in the same way that an array is faster than a linked list.

Actually, a single conditional jump is often faster than a single computed jump, mostly because of pipelining issues, but of course there is a trade-off point at which the computed jump is faster than the corresponding conditional jump cascade.

The advantage of a Switch Statement is that you can have the compiler decide the trade-off point; i.e. if the Switch Statement has only a few clauses, the compiler will probably produce a conditional jump cascade, but for many clauses, it will switch (sic) to a jump table approach. (Or one of several other approaches, depending on the data.)

And Switch Statements almost always have many clauses, in the wild.

I did this for the ARM compiler 14 years ago. I'm sure it's commonplace now, but I was proud of it at the time :-) See this 6 year old posting groups.google.com

Fall Through

A feature (or some would say misfeature) of C's switch construct is that control flow can "fall through" from one case to another. For example:

switch (command) { case CMD_SHOW_HELP_AND_EXIT: do_show_help(); /* Fall thru */ case CMD_EXIT: do_exit(); break; case CMD_OTHER: do_other(); break; /* ... etc. ... */ }

Because the do_show_help(); line is not followed by a break, the do_exit() call will be executed after do_show_help() when CMD_SHOW_HELP_AND_EXIT is the command. It is traditional to use a comment such as /* Fall thru */ in the example to indicate to readers that a break was not left out unintentionally (a common mistake).

Proponents say that this technique leads to more efficient control dispatching when multiple cases share code. Critics say that using this technique leads to unstructured hard-to-maintain code, and is little better than using gotos.

Many optimizing C compilers will emit the same object code for the above code as for the below -- they merge the common trailing portions of case blocks.

switch (command) { case CMD_SHOW_HELP_AND_EXIT: do_show_help(); do_exit(); break; case CMD_EXIT: do_exit(); break; case CMD_OTHER: do_other(); break; /* ... etc. ... */ }

Cee Language switch (Duffs Device)

int q = (n+7)/8; switch (n%8) { case 0: do { foo(); // Great C hack, Tom, case 7: foo(); // but it's not valid here. case 6: foo(); case 5: foo(); case 4: foo(); case 3: foo(); case 2: foo(); case 1: foo(); } while (--q > 0); }

Forth Language (jump table from a chess program)

CONSTANT Pawn ( ... ) 6 CONSTANT King

CREATE genVector \ each word is ( sq -- sq ) ' NOOP , ' genPawn , ' genKnight , ' genBishop , ' genRook , ' genQueen , ' genKing , ' genError ,

genSquare ( sq -- sq )

DUP piece@ CELLS genVector + @ EXECUTE ; : genMoves ['] genSquare forEachSq ;

Forth Language (ANS Forth Core Extension wordset)

X ( n -- )

CASE test1 OF ( -- ) ... ENDOF testn OF ( -- ) ... ENDOF ( n -- ) ... ( default ) ENDCASE ... ;

which is semantically equivalent to cascading IF statements:

Y ( n -- )

test1 OVER = IF DROP ... ELSE testn OVER = IF DROP ... ELSE ( n -- ) ... THEN THEN ( big list of THENs to match the IFs) ;

Common Lisp switch

(case k ((1 2) 'clause1) (3 'clause2) (nil 'no-keys-so-never-seen) ((nil) 'nilslot) ((:four #\v) 'clause4) ((t) 'tslot) (otherwise 'others)))

Haskell Language switch

Using explicit matching:

take m ys = case (m,ys) of (0,_) -> [] (_,[]) -> [] (n,x:xs) -> x : take (n-1) xs

Using implicit matching:

take 0 _ = [] take _ [] = [] take n (x:xs) = x : take (n - 1) xs

Java Language switch

switch (k) { case 1: case 2: return "clause1"; case 3: return "clause2"; case 0: return "nilslot"; case '4': case 'v': return "clause4"; case 't': return "tslot"; default: return "others"; }

Perl Language switch (this is Perl 6: Perl 5 doesn't have explicit switches, although the language implementation documentation claims that the compiler will always optimize if-else into switch whenever possible; see e.g. "Programming Perl" by Larry Wall et al)

given $number { when &is_prime { warn "$_ is prime\n"; continue; } when /[13579]$/ { warn "$_ is odd"; } when /[02468]$/ { warn "$_ is even"; } }

In Perl 5, you can use something like this (adapted slightly from a common idiom of mine):

&{{ "foo" => sub { bar(); baz(); }, "quux" => sub { wibble(); spam(); } }->{ $selector }};

If you have a dense set of integer cases, you could use an array instead of a hash. No good idea about implementing fall-through, though :)

&{[ handle_0, handle_1, complain_about_other_digits ]->[$digit]};

(Usually I use this setup to just pick a scalar value based on the input. Here I adjust that slightly by using code references for the values, and then invoking the relevant code.)

[Please note that this idiom (in Perl 5) -- using a scalar as an offset into an array or hash of function addresses -- is the basic paradigm of Cee Language "object-oriented" method dispatch, as in Cee Plus Plus, Objective Cee, and various homebrews. Switch statements are generally viewed as a Code Smell in object-oriented environments because they pessimize message dispatch, which a good OO environment makes faster than the corresponding switch. Also see the discussion under "Smalltalk Language switch", below.]

There is a working example of this mechanism in the interpreters for Snusp Language.

Pl Sql switch

CASE grade WHEN 'A' THEN dbms_output.put_line('Excellent'); WHEN 'B' THEN dbms_output.put_line('Very Good'); WHEN 'C' THEN dbms_output.put_line('Good'); WHEN 'D' THEN dbms_output.put_line('Fair'); WHEN 'F' THEN dbms_output.put_line('Poor'); ELSE dbms_output.put_line('No such grade'); END CASE;

Ruby Language switch

case $age when 0 .. 2 "baby" when 3 .. 6 "little child" when 7 .. 12 "child" when 12 .. 18 # Note: 12 already matched by "child" "youth" else "adult" end

Scheme Language switch

(case (car '(c d)) ((a e i o u) 'vowel) ((w y) 'semivowel) (else 'consonant)) ===> consonant

VHDL switch (combinational)

library ieee; use ieee.std_logic_1164.all;

entity CombinationalSwitch is port ( A: in std_logic_vector (2 downto 0); B: in std_logic_vector (7 downto 0); F: out std_logic ); end CombinationalSwitch;

architecture RTL of CombinationalSwitch is begin with A select F <= B(0) when "000", B(1) when "001", B(2) when "010", B(3) when "011", B(4) when "100", B(5) when "101", B(6) when "110", B(7) when "111", 0 when others; end RTL;

VHDL switch (sequential)

library ieee; use ieee.std_logic_1164.all;

entity SequentialSwitch is port ( A: in std_logic_vector (2 downto 0); B: in std_logic_vector (7 downto 0); F: out std_logic ); end CombinationalSwitch;

architecture RTL of SequentialSwitch is begin process (A, B) begin case A is when "000" => F <= B(0); when "001" => F <= B(1); when "010" => F <= B(2); when "011" => F <= B(3); when "100" => F <= B(4); when "101" => F <= B(5); when "110" => F <= B(6); when "111" => F <= B(7); when others => F <= 'Z'; end case; end process; end RTL;

Notice that, since VHDL describes hardware, the Switch statement here is basically a multiplexer. In this case, this is an 8-to-1 multiplexer.

Visual Basic switch

Dim Number Number = 8 Select Case Number Case 1 To 5 Debug.Print "Between 1 and 5" Case 6, 7, 8 Debug.Print "Between 6 and 8" Case 9 To 10 Debug.Print "Greater than 8" Case Else Debug.Print "Not between 1 and 10" End Select

Hebrew (Ivrit)

בדוק משתנה אם משתנה = 1 אזי שלום אם משתנה = 2 אזי להתראות אחרת לך-יא-מניאאק סיים

Delphi Language switch note uses sets, can work with characters and enumerated types instead of integers

case I of 1..5 : Caption := 'Low'; 6..9 : Caption := 'High'; 0, 10..99 : Caption := 'Out of range'; else Caption := ''; end;

case Letter of Vowels : Caption := 'Vowel'; Consonants : Caption := 'Consonant'; end;

where Letter is a char and Vowels and Consonants are Sets.

Smalltalk Language switch

anObject aMessage

where aMessage is a method defined on the class of anObject.

(One can also define syntax for a Smalltalk Case Statement within the language.)

How does that work when you're switching on 1..5, 6..10, and "other"?

A Smalltalk program implementing such a dispatch (and they are exceedingly rare, because the object/message paradigm so richly moots most situations that provoke a Switch) typically looks up a method name in an Array or Dictionary which they then perform: -- the equivalent of the C expression (*action[selector])().

In the interest of advancing the discussion, I built a test harness and measured the results. I simplified your example to dispatch on a number between 1 and 16 -- this gets at the most important aspects of your question. I defined a new class, SwitchTest, with sixteen methods with selectors #m01, #m02, ... , #m16. For the sake of testing, these methods were no-ops. I then built and measured the performance of three dispatch approaches:

"Array Dispatch": I constructed a "dispatchArray" of size 16, where the contents of index N is the integer offset of method m in the methodsArray of SwitchTest. I then constructed a test method that counts an index from 1 to 16, looks up the corresponding offset in dispatchArray, then loads and evaluates the CompiledMethod at that offset.

(aMethodsArray at:
(dispatchArray at: anIndex)) executeWithReceiver: self andArguments: #()

"Perform": I constructed a dispatchArray of size 16, containing 16 symbols, which were #m01, m02, ..., #m16, such that the symbol at index N was #m. I then constructed a test method that counts an index from 1 to 16, looks up the symbol corresponding to the index in dispatchArray, then performs that symbol in the receiver.

self perform:
(dispatchArray at: anIndex)

"Message send": I open coded the sixteen message sends -- this corresponds to the most common idiom in Smalltalk.

self m01; m02; "..." m16.

The above is a load of bull. This is hardcoding the evaluation of m01, etc. when in reality the values for switch statements are variable. Add the fact that objects must be coaxed into selectors before they can be sent to objects.

Smalltalk needs a switch statement, because nested ifTrue/ifFalse's are ripe with design smell.

I then used Time>>millisecondsToRun: to measure the time required to execute 1000000 repetitions of each of the above three cases (on a 3Ghz Pentium with 1G RAM running IBM Smalltalk). Here are the times I measured:

Array Dispatch Test: 637.5

Perform Test: 284.4

Message Send Test: 189.1 My interpretation: using the "switch statement" approach (perform) introduced about a factor of two in dispatch overhead, in comparison to a "regular" message send. It might be worth comparing that to the other languages and environments here.

Even Cee Language imposes some overhead to setup the switch statement, and it might be interesting measure that overhead in comparison to a straight procedure call on each of the resulting branches. (But of course a C switch/case statement is faster than a series of if-elses.)

My point in putting up the example is to put some quantitative bounds (in this case a factor of two) on the switch statement dispatch overhead imposed by the language. If I was parsing something and cared about the performance, I would probably write and debug the parser in Smalltalk and then recode the lexer into C, calling the lexer through the external function API (Alternate Hard And Soft Layers).

The interesting thing is the relative overhead of a switch-like mechanism in Smalltalk, as compared to its natural behavior (not to C). My point is that 99% of what one does with switch/case in a procedural language is mooted by a good polymorphic language (like Smalltalk). Switch/case is needed in C because you don't have polymorphic message sends (unless you want to roll your own tables of function pointers).

The C-style switch isn't needed in Smalltalk (as a separate language construct) because its function is implicit in the method dispatching behavior. If, in the pathological Smalltalk case where you DO need switch, I showed that it's about twice as slow as a "straight" message send (in Smalltalk). The performance arguments don't make sense until you look at much larger grain sizes, where the industry has learned (particularly help from Alternate Hard And Soft Layers) that C's fine-grained performance advantages do not, in general, translate to corresponding application performance differences.

Cobol Language evaluate-when (COBOL 85)

evaluate policy-holder-age also policy-holder-sex also smoking-flag when 20 thru 40 also "F" also ANY perform young-female-policy when 65 thru 999 also "M" also "Y" perform old-smokey when other perform all-other-cases end-evaluate

It's amazing; you can do anything. (I'm not claiming it can do anything well, just that it has an amazing array of expressive options. ;-)

Let's standardize on one particular form. A simple substitution process can then generate the correct syntax for your chosen language.

Um... not true, since different languages switching capabilities are different. No one generic form can be translated to each language.

Long ago, when memory was very limited, there wasn't room for either a jump list or a series of tests in a particular program, so I calculated the required address as a linear function of the switch expression (using a shift instruction for the multiplication), and then made a jump to the calculated address. Those were the days. ;-)

See Switch Statements Smell, Case Statements Considered Harmful (summary: use OO/generic dispatch on the type (Internal Polymorphism) in preference to switch on the type ("Type Case", "External Polymorphism"); when looking at things other than the type, if Polymorphism is impossible or introduces too much extra machinery, then use switch in preference to if-else whenever possible. Contrary to some Purists, in some languages it is not possible to replace every switch statement with polymorphism, and in other languages it is not always desirable. Classic example: switch on char value in a lexical analyzer. But switch-on-type should indeed always be replaced.)