There are two important ideas to pick up here, I think. First, functional programming prefers a reduction-based model of computation: the whole program text represents the current state of the program, and a "step" of computation corresponds to simplifying ("reducing") the expression in a well-defined way. Second, `call/cc` adds a novel reduction rule that behaves a bit like function application, but looks a bit backwards. (`shift` and `reset` then generalize the basic idea, but use the same machinery.)
In detail: Consider a program `fib 2`, which simply computes the third Fibonacci number and then exits. We can define `fib` like so (using Haskell-y syntax, which I find a little lighter for this purpose):
fib n = fib' n 0 1
fib' 0 a b = b
fib' n a b = fib' (n - 1) b (a + b)
If we run this program, what do the individual steps look like?
print (fib 2)
= print (fib' 2 0 1) -- Substitute the definition of fib
= print (fib' (2 - 1) 1 (0 + 1)) -- Substitute the second clause of fib'
= print (fib' 1 1 1) -- Simplify - and + (technically, this is two steps)
= print (fib' (1 - 1) 1 (1 + 1))
= print (fib' 0 1 2)
= print 2 -- Substitute the first clause of fib'
= ()
This isn't just an analogy; this is exactly how an unoptimized interpreter could fully implement a simple functional language. There's no need for an actual call stack when evaluating this program: the program text itself describes "what to do next" (which is the point of a call stack).
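To make that concrete, here's a minimal sketch of such a substitution-based interpreter (toy code of my own, call-by-value, and it ignores variable capture, which is fine while bound names are distinct). The whole "machine state" is just an `Expr`, and `step` performs one reduction, exactly like one line of the trace above:

```haskell
data Expr
  = Var String
  | Lam String Expr
  | App Expr Expr
  | Lit Int
  | Add Expr Expr
  deriving (Show, Eq)

-- Substitute v for x in e, stopping where x is shadowed.
subst :: String -> Expr -> Expr -> Expr
subst x v e = case e of
  Var y   -> if y == x then v else e
  Lam y b -> if y == x then e else Lam y (subst x v b)
  App f a -> App (subst x v f) (subst x v a)
  Lit n   -> Lit n
  Add a b -> Add (subst x v a) (subst x v b)

isValue :: Expr -> Bool
isValue (Lit _)   = True
isValue (Lam _ _) = True
isValue _         = False

-- One reduction step: either apply a function to a value,
-- simplify an addition, or step inside a subexpression.
step :: Expr -> Maybe Expr
step (App (Lam x b) a)
  | isValue a              = Just (subst x a b)
step (App f a)
  | Just f' <- step f      = Just (App f' a)
  | Just a' <- step a      = Just (App f a')
step (Add (Lit m) (Lit n)) = Just (Lit (m + n))
step (Add a b)
  | Just a' <- step a      = Just (Add a' b)
  | Just b' <- step b      = Just (Add a b')
step _                     = Nothing

-- Keep stepping until no rule applies: no call stack in sight.
eval :: Expr -> Expr
eval e = maybe e eval (step e)

main :: IO ()
main = print (eval (App (Lam "x" (Add (Var "x") (Lit 1))) (Lit 41)))
-- prints: Lit 42
```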
Now, what does `call/cc` do? Well, remember what function application does: we take something like `(\x. FOO) A` and substitute `A` into `FOO` wherever `x` appears, no matter how deep into the program it is (unless it's shadowed, i.e. not the same `x`). We could write this as `apply (\x. FOO) A` if we wanted, although by syntax rules we know that the first value in a call is the function to call. (The capitalized names could be entire other expressions, whereas `x` is just a variable name.)
Similarly, if the whole program looks like `A (call/cc \x. FOO)`, we substitute `\y. A` into `FOO` wherever `x` appears (unless it's shadowed). It's a weird kind of function call, where the function is nested inside `call/cc` like normal, but the argument is everything outside `call/cc`. And since `A _` has a value-shaped hole where the `call/cc` was, we know that if `FOO` is going to use it, it needs to provide it a value -- hence why we substitute `\y. A`, not just `A`. (So we have to fill the hole with the new variable name `y`.)
print (call/cc (\x. x 42))
= (\x. x 42) (\y. print y) -- `\y. print y` is equivalent to `print`,
= (\y. print y) 42 -- so I'll elide this expansion next time.
= print 42
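You can run a close analogue of this trace for real with the `Cont` monad from GHC's bundled `transformers` package (a library encoding of the same idea, not the literal rewriting above). Here `show` plays the role of the surrounding context `print _`:

```haskell
import Control.Monad.Trans.Cont (runCont, callCC)

-- runCont's second argument is "everything outside the callCC":
-- the continuation handed to the body as `k`.
example :: String
example = runCont (callCC (\k -> k (42 :: Int))) show

main :: IO ()
main = putStrLn example   -- prints: 42
```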
This example seems kind of useless -- call/cc did nothing! But what if we had more values to return than just `42` -- what if we had a whole bunch of things, and we wanted to run the same processing for all of them?
print (call/cc (\x. x 42 >> x 101 >> x 0))
= (\x. x 42 >> x 101 >> x 0) print
= print 42 >> print 101 >> print 0
(`>>` is just a Haskell operator for sequencing, like `;` in C or Java.)
Now we've replicated the rest of our program based on each of the three values returned by the inner function. It's like we've bifurcated (trifurcated?) our program into multiple independent programs that continue from where the original program left off. (You might know that `fork` in Linux returns a different value depending on whether you're in the original process or in the spawned child process. This is very similar!)
But what if we don't want to do everything in our program multiple times? What if we just want to do some of it for each value, but join everything back together at the end for some final processing? Well, `shift` and `reset` let you do that. The `shift` operator slots in where `call/cc` was, but you now get to choose how much of the rest of the program to replicate by placing a `reset` somewhere else in the program. You can wrap the whole program in `reset` to get the same behavior as `call/cc`, or you can scope it to a smaller portion:
(reset (print (shift (\x. x 42 >> x 101 >> x 0)))) >> print "Done"
= ((\x. x 42 >> x 101 >> x 0) print) >> print "Done"
= (print 42 >> print 101 >> print 0) >> print "Done"
(We definitely didn't want to print `Done` three times, right?)
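This example runs nearly as-is in Haskell: GHC's bundled `transformers` package provides `shiftT` and `resetT` in `Control.Monad.Trans.Cont`. The only noise is `lift`ing the prints out of the `ContT` layer -- a sketch, assuming that package:

```haskell
import Control.Monad.Trans.Cont (evalContT, resetT, shiftT)
import Control.Monad.Trans.Class (lift)

main :: IO ()
main = do
  evalContT $ resetT $ do
    -- k is the delimited context: everything between shiftT and resetT,
    -- i.e. `print _`. Calling it three times runs that context three times.
    x <- shiftT (\k -> lift (k (42 :: Int) >> k 101 >> k 0))
    lift (print x)
  putStrLn "Done"
```

Running this prints `42`, `101`, `0`, then `Done` exactly once, matching the reduction above.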
None of the program's behavior needed call/cc or shift / reset. They're fundamentally rewriting rules that let us give such a program a different form. Sometimes the forms we can achieve with call/cc are more readable or provide better modularity. Sometimes they don't (so we shouldn't use them!).
Shift and reset let you write your own coroutine system like it was nothing, though, so it's certainly not without merit :)
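For instance, here's a tiny generator built on `shiftT`/`resetT` (a sketch; the names `yield` and `collect` are mine, and the machinery comes from the `transformers` package). Each `yield` suspends the producer and conses its value onto whatever the rest of the producer goes on to yield:

```haskell
import Control.Monad.Trans.Cont (ContT, evalContT, resetT, shiftT)
import Control.Monad.Trans.Class (lift)

-- Suspend, resume the rest of the producer via k, and prepend our value.
yield :: Monad m => a -> ContT [a] m ()
yield x = shiftT (\k -> lift ((x :) <$> k ()))

-- Run a producer to completion, collecting everything it yielded.
collect :: Monad m => ContT [a] m () -> m [a]
collect p = evalContT (resetT (p >> return []))

producer :: Monad m => ContT [Int] m ()
producer = do yield 1; yield 2; yield 3

main :: IO ()
main = collect producer >>= print   -- prints: [1,2,3]
```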
One thing that catches me all the time is that when writing examples in a functional language, the continuation of call/cc is on the left of the call to call/cc.
Rewriting your example as imperative statements might be more understandable to non-functional[1] programmers like me:
x = call/cc ...
print x ;; <- this is the continuation
Of course this does not work well with your nice reduction based explanation, but can be made to work if you go for a CPS transformation based explanation.
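A minimal sketch of that CPS view (the names here are illustrative, not from any standard library): after the transformation, the continuation `print x` becomes an ordinary function argument `k`, and the body "returns" by calling it.

```haskell
-- "x = call/cc body; print x", after CPS: the body takes its own
-- continuation explicitly, and call/cc amounts to handing it over.
body :: (Int -> r) -> r
body k = k 42      -- "returning 42" = "calling the rest of the program with 42"

main :: IO ()
main = body print  -- the continuation really is `print`; prints: 42
```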
For sure! Like I said, this is very similar to fork(), and you've basically called out why :)
pid = fork();
if (pid != 0) {
printf("parent!");
} else {
printf("child!");
}
// Prints both lines, though in either order:
// parent!
// child!
`fork` captures "the rest of the program" -- the part after the `;` on `fork()`, although confusingly including the `pid = _` -- and causes it to be run twice, each with a different return value (which really means "passed to the continuation").
> the continuation of call/cc is on the left of the call to call/cc.
Kiiind of. It's really around the call to call/cc -- the "101" and "5" in this example are to the right, while the "+" is to the left:
(+ (call/cc (\k. k 42)) 101 5)
= (\k. k 42) (\x. (+ x 101 5))
= (\x. (+ x 101 5)) 42
= (+ 42 101 5)
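(For the curious: this one runs as-is with the `Cont` monad from the `transformers` package, where `runCont`'s second argument is the surrounding context `(+ _ 101 5)`.)

```haskell
import Control.Monad.Trans.Cont (runCont, callCC)

result :: Int
result = runCont (callCC (\k -> k 42)) (\x -> x + 101 + 5)

main :: IO ()
main = print result   -- prints: 148
```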
In terms of a syntax tree, the continuation is above the call to call/cc. The left/right distinction only really shows up for super-simple examples. But, conversely, the argument for function application is always below the application node, so there's a very reliable duality here.
(Another, slightly twisted perspective is that a syntax tree is, after all, a graph; and following the parent of a node leads to a subgraph just as well as following a child. The fact that this subgraph is "upside-down" is just an extra detail ;), but you can use this subgraph during reduction just like you would a child subgraph. In this way, call/cc is literally just function application, but applying the parent subgraph instead of a normal child.)
((There's a third, incredibly boring member of this family... we can pass one child to another with `apply`, and pass the context to a child with `call/cc`. What lets us pass a child to the context? `id`, or the no-op operator!))
Certainly adding call/cc doesn't allow you to represent more computable functions by programs. But there are plenty of internal features of programs that we're interested in, that let us meaningfully distinguish amongst programs that otherwise have the exact same observable behavior.
I like coroutines as an example here, because I have very direct experience writing programs with similar behavior in systems that either do or do not have coroutines. By far, I found the coroutine-based programs far easier to understand and extend.
Matthias Felleisen wrote the seminal paper on comparing expressive power of Turing-complete languages ("On the Expressive Power of Programming Languages"). It's worth a read if it's not come across your radar before!