# Switch fall-through
confused why cases in a C switch statement fall through to the next case by default. For instance this code
```c
switch (4) {
case 4: {
printf("4");
}
default: {
printf("something else");
}
}
```
will actually print `"4somethingelse"`; it executed both branches! The idea is that by default, after a branch is done, execution will "fall through" to the next branch. This behavior can be prevented by a `break` (or in certain cases a `return`):
```c
switch (4) {
case 4: {
printf("4");
break;
}
default: {
printf("something else");
// break here does not do much;
// there is no case to fall through to
}
}
```
The default behavior is unintuitive: even in many C code bases, a fall-through is often explicitly marked with a comment to confirm to readers that it is intentional
```
case 4: {
printf("4");
/* fall through */
}
```
So, if it is confusing even to C programmers, why is this the default behavior?
===
We recently started work on [zlib-rs](todo), an implementation of the zlib (de)compression library in rust. The implementation is heavily based on the original [zlib]() and more modern [zlib-ng]() libraries.
Naturally we had to look at the source code of these projects, and especially in decompression they use this pattern of explicit fall-throughs. As I considered how to write this code in rust while maintaining performance, an answer for why fall-throughs are useful hit me.
An answer -- one that I think makes a lot of sense -- is a combination of state machines, unconditional jumps and the branch predictor.
Consider this trivial parser state machine that accepts suffixes of the string `"abc"`.
```rust
fn run(input: &[u8]) -> bool {
enum State { A, B, C }
let mut state = State::A;
for byte in input {
match state {
State::A => match byte {
b'a' => state = State::B,
_ => return false,
}
State::B => match byte {
b'b' => state = State::C,
_ => return false,
}
State::C => return *byte == b'c',
}
}
}
```
The problem is -- if we assume the input is generally valid -- the A state will always proceed to the B state. But the structure here does not make that obvious to the compiler: from its perspective A, B and C are equally likely.
Furthermore, the branch predictor also cannot accurately predict which of A, B or C will be jumped to.
> In practice this code is small and simple enough that a modern compiler. However this very unlikely (and empirically false) for larger projects like the zlib inflate state machine and wasm interpreters.
So the switch fall-through is a way to directly express a state transition as an unconditional jump. These are efficient because the CPU knows what instructions will be executed next.
===
Rust does not have fall-through though. I would not want them either, it is too easy to make mistakes. But I do want fast code. And so I somehow need to get rust to emit direct jumps. There is a trick: tail calls
```rust
fn state_a(input: &[u8]) -> bool {
match input {
[b'a', rest @ .. ] => return state_b(rest),
_ => return false,
}
}
fn state_b(input: &[u8]) -> bool {
match input {
[b'b', rest @ .. ] => return state_c(rest),
_ => return false,
}
}
fn state_c(input: &[u8]) -> bool {
match input {
[b'c', .. ] => return true,
_ => return false,
}
}
```
each state is now represented as a function, and a state transition is a function call. Normally function calls are not that efficient, but in this case all the function calls are in tail position: the function call is the final thing that happens before a return. In this case, compilers can get clever and turn a function call into a jump: exactly what we want!
> again, because this is a simple example, the compiler would just inline all the functions into one. That still means we get direct jumps though!
There is a caveat here: tail calls are not guaranteed to become jumps: we're currently at the mercy of the code generation backend, and they don't guarantee the optimization for all targets.
Rust has reserved the `become` keyword for guaranteed tail calls, so this code would guarantee that the function call is turned into a jump (or would fail to compile time if this that is not possible).
```rust
fn state_a(input: &[u8]) -> bool {
match input {
[b'a', rest @ .. ] => become state_b(rest),
_ => return false,
}
}
```
but there are some tricky issues around running destructors, and certain platforms just not supporting guaranteed tail calls.
https://github.com/rust-lang/rust/issues/112788
https://github.com/rust-lang/rfcs/issues/2691
https://github.com/rust-lang/rfcs/pull/3407