The Lambda Calculus
1. Introduction
The lambda calculus is a fascinating “computer language” that is very simple, yet within it you can create any program you could possibly run on any computer. It also seems to me to be the most obvious fundamental bridge between mathematics and computer science – and thus, as a Mathematics and Computer Science undergraduate, I find it very interesting.
We start off with a set of variables, V. These can be anything – things like x, y, z, a, b, c, etc. This set is infinite (though countable, which is very important for proving undecidability results about the lambda calculus). We then have terms, built from variables, function abstractions (declaring functions) and applications (using functions). These are of the following forms:
V := x, y, z, a, b, c, …
T := V | T T | \V.T
Here read \x.T as “the function that takes in x as an input and returns T as an output.” For example, \x.x is the identity function that returns its input, \xy.y takes in two values and returns the second value, etc. (\xy.y is an abbreviation for \x.(\y.y) and represents a technique called currying, where we represent a function of two arguments as a function that takes an argument and returns a function, which then takes the second argument and returns an answer.) Another example is the function \f.fx, which takes in a function f and returns the value of f applied to x. In the lambda calculus function application is denoted by simple concatenation. For example, (\x.x) t = t – this is the identity function applied to the variable t, which returns the variable t. Another example would be (\f.ft) (\xy.x) = (\xy.x) t = \y.t
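These examples can be transcribed almost directly into Python lambdas, with currying done by nesting single-argument functions (the Python names here are my own, and the string "t" stands in for the variable t):

```python
# Church-style currying with Python lambdas (names are illustrative).
identity = lambda x: x                   # \x.x
second   = lambda x: lambda y: y         # \xy.y: takes two values, returns the second
first    = lambda x: lambda y: x         # \xy.x
apply_to_t = lambda f: f("t")            # roughly \f.ft, with t fixed as "t"

print(identity("t"))                     # (\x.x) t = t
print(second("a")("b"))                  # (\xy.y) a b = b
print(apply_to_t(first)("anything"))     # (\f.ft)(\xy.x) = \y.t, so one more argument gives t
```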
Notice that function application is basically substitution – the result of (\x.T) applied to y is T with every occurrence of x replaced by y – we write T[x/y]. Thus the “execution” of the lambda calculus essentially comes down to one rule, known as “beta-conversion”:
(\x.T) y = T[x/y]
…only it’s not quite that simple. Consider (\f\y.fy) y. Naively this reduces to \y.yy – the free variable y we passed in has been “captured” by the inner lambda, changing the meaning of the term. Hence we need alpha-conversion: we rename the bound variables (variables that are “holes” representing the input to some function) to fresh variables that aren’t used elsewhere. For example, before we start reducing (\x.xy) (\x.xy) we would need to go through the expression and rename the second x to a new variable, e.g. z: (\x.xy) (\z.zy).
Thus we have our “algorithm” for running “programs” in the lambda calculus – rename all of the bound variables to new, fresh variables using alpha-conversion (and we can do this as there are infinitely many variables) and then reduce, using the beta-reduction rule given above. Alternatively, we can do alpha-conversion as we go along the path of beta-reduction, which is in fact sometimes necessary (see below).
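The algorithm above can be sketched as a tiny interpreter. This is a minimal, illustrative implementation – the AST representation and all the names are my own choices: a variable is a string, an abstraction is ("lam", var, body), and an application is ("app", f, a). Substitution alpha-converts on the fly whenever a capture is about to happen.

```python
# A tiny normal-order (leftmost) reducer for lambda terms.
import itertools

fresh = (f"v{i}" for i in itertools.count())  # supply of fresh variable names

def free_vars(t):
    if isinstance(t, str):
        return {t}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def subst(t, x, s):
    """T[x/s]: replace free occurrences of x in t by s, alpha-converting to avoid capture."""
    if isinstance(t, str):
        return s if t == x else t
    if t[0] == "app":
        return ("app", subst(t[1], x, s), subst(t[2], x, s))
    _, y, body = t
    if y == x:
        return t                      # x is bound here, so no free occurrences below
    if y in free_vars(s):             # naive substitution would capture y: rename it
        z = next(fresh)
        body, y = subst(body, y, z), z
    return ("lam", y, subst(body, x, s))

def step(t):
    """One leftmost (outermost) beta-reduction step, or None if t is in normal form."""
    if isinstance(t, str):
        return None
    if t[0] == "app":
        f, a = t[1], t[2]
        if isinstance(f, tuple) and f[0] == "lam":
            return subst(f[2], f[1], a)        # the beta rule itself
        r = step(f)
        if r is not None:
            return ("app", r, a)
        r = step(a)
        return None if r is None else ("app", f, r)
    r = step(t[2])
    return None if r is None else ("lam", t[1], r)

# (\x.x) t  reduces in one step to  t
print(step(("app", ("lam", "x", "x"), "t")))   # prints t
```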
2. Reduction Strategies
Consider (\xy.x) t ((\x.xx)(\x.xx)). We apply alpha-conversion to this to get (\xy.x) t ((\z.zz)(\v.vv)). Now we have a choice – there are two possible beta-reductions. If we reduce the first, we get (\y.t) ((\z.zz)(\v.vv)), which (again taking the leftmost reduction) reduces to t, as (\y.t) ignores its argument. However, if we decide to reduce the (\z.zz)(\v.vv) term first, we get (\v.vv) (\v.vv), which alpha-converts to (\z.zz) (\v.vv) – where we started. We thus get into an infinite loop. So the order in which we reduce terms is important – in the former case we get an answer, t; in the latter case we keep going forever. Which order should we take? In this case it appears we should reduce the leftmost (i.e. outermost) redex – but is this true in general? Taking the outermost redex is like evaluating (3+4)^2 as (3+4).(3+4), then (7).(3+4), then (7).(7), then 49, whilst doing the 3+4 first takes fewer steps, as we only need to do it once.

There is an even more important question though – if taking different reduction paths can lead to an answer, or not, is it possible that two reduction paths can lead to two different answers? Fortunately, we have the following two theorems:
The Church-Rosser Theorem: Suppose a reduces to b and a reduces to c, both of which cannot be reduced further. Then b and c are the same term (up to alpha-conversion).
The Leftmost Reduction Theorem: Suppose a reduces to b, where b cannot be reduced further. Then a will reduce to b if you follow a leftmost reduction strategy, always reducing the leftmost outermost redex.
From the Church-Rosser Theorem, we can show that the lambda calculus is consistent, i.e. it is not the case that all terms are equal. From the Leftmost Reduction Theorem, we can show that in the example above, where one subterm is non-terminating but a reduction path still reaches a solution, that subterm was never needed – it can be replaced with any other term and you still get the same answer.
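The difference between the two strategies in the example above can be mimicked in Python (which itself evaluates arguments eagerly, i.e. innermost-first) by passing the dangerous argument as an unevaluated thunk; the names here are my own:

```python
# (\y.t) applied to a non-terminating argument, mimicked with thunks.
def loop():
    return loop()              # plays the role of (\z.zz)(\v.vv): never finishes

const_t = lambda thunk: "t"    # plays the role of (\y.t): ignores its argument

# Leftmost (outermost) order: pass the argument unevaluated, so it is never run.
print(const_t(loop))           # prints t

# Innermost order would evaluate the argument first and never finish:
# const_t(loop())              # diverges (in Python, eventually a RecursionError)
```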
Implementations of beta-reduction often use “graph reduction” rather than straightforward beta-reduction – in reducing (3+4)^2, the term first reduces to %.% where % is a shared pointer to 3+4; the 3+4 is then reduced once and the result seen in both places % occurs. This way each shared subterm is evaluated at most once, while still guaranteeing an answer – apparently the optimal reduction strategy to use.
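A sketch of the sharing idea, assuming a hand-rolled memoized thunk as a stand-in for a graph node (the class and names are my own illustration):

```python
# A memoized thunk: evaluated at most once, its result shared thereafter.
class Thunk:
    def __init__(self, compute):
        self.compute = compute
        self.done = False
    def force(self):
        if not self.done:
            self.value = self.compute()   # evaluate once...
            self.done = True
        return self.value                 # ...then share the result

calls = 0
def three_plus_four():
    global calls
    calls += 1
    return 3 + 4

shared = Thunk(three_plus_four)            # % = 3+4
square = shared.force() * shared.force()   # % . %
print(square, calls)                       # 49 1 -- the addition ran only once
```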
3. Doing Useful Things
“Well, that’s all well and good, but I thought the lambda calculus could do anything any computer could possibly do?” I hear you ask. Well, here are some examples.
The lambda calculus just has functions, nothing else. For a reasonable programming language, we need conditionals, booleans, looping, numbers, lists, etc, which are clearly not available in the lambda calculus. Well, they kind of are.
Ok, your first demand was conditionals. Encode
true := \xy.x
false := \xy.y
if := \abc.abc
Then if true a b = true a b = a and if false a b = false a b = b (note there isn’t really any need for if here at all – we could just write the second term in each of these chains of equalities.) We can also have, for example,
and := \bc. if b c false
or := \bc. if b true c
not := \b. if b false true
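These boolean encodings transcribe directly into Python (curried lambdas again; iff rather than if because if is a Python keyword, and all names are mine):

```python
# Church booleans in Python.
true  = lambda x: lambda y: x                    # \xy.x
false = lambda x: lambda y: y                    # \xy.y
iff   = lambda b: lambda t: lambda e: b(t)(e)    # \abc.abc

and_ = lambda b: lambda c: iff(b)(c)(false)
or_  = lambda b: lambda c: iff(b)(true)(c)
not_ = lambda b: iff(b)(false)(true)

to_bool = lambda b: b(True)(False)   # decode a Church boolean for display
print(to_bool(and_(true)(false)))    # False
print(to_bool(or_(false)(true)))     # True
print(to_bool(not_(false)))          # True
```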
These are all reasonably intuitive. Secondly, we have numbers. Encode
0 := \fx.x
1 := \fx.fx
2 := \fx.f(fx)
3 := \fx.f(f(fx)), and so on (so n contains n applications of f.)
Then we can define
succ := \nfx.f(nfx)
zero? := \n.n(\x.false)true (zero? returns true if its argument is zero)
pred := \n. if (zero? n) 0 (n (\y.y i (succ y)) (\ab.0)), where i is the identity, i := \x.x.
eq := \mn. if (zero? m) (zero? n true false) (zero? n false (eq (pred m) (pred n)))
Some of these are intuitive from the above definitions (succ, zero?), some need a little more unwinding (such as eq and the terms below), and some are fairly incomprehensible (such as pred – to see that it works, one may take the zero special case out, prove by induction that it works on all values other than zero, and then add the special case back in.)
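The numeral encodings, succ, zero? and even the odd-looking pred transcribe directly into Python (curried lambdas; the names and the to_int decoder are mine). Python’s eager evaluation is fine here because neither branch of pred’s “if” recurses:

```python
# Church numerals in Python. A numeral n applies f to x exactly n times.
true  = lambda x: lambda y: x
false = lambda x: lambda y: y
ident = lambda x: x

zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
is_zero = lambda n: n(lambda _: false)(true)       # zero? from the text

# pred, transcribed from \n. if (zero? n) 0 (n (\y.y i (succ y)) (\ab.0)):
pred = lambda n: is_zero(n)(zero)(
    n(lambda y: y(ident)(succ(y)))(lambda a: lambda b: zero))

to_int = lambda n: n(lambda k: k + 1)(0)           # decode by counting applications
three = succ(succ(succ(zero)))
print(to_int(three))          # 3
print(to_int(pred(three)))    # 2
print(to_int(pred(zero)))     # 0
```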
We would like to say that
add := \ab. if (zero? a) b (succ (add (pred a) b)), which corresponds to a+b := ((a-1)+b)+1. Similarly,
mult := \ab. if (eq a 1) b (add b (mult (pred a) b))
sub := \ab. if (zero? b) a (pred (sub a (pred b))), which corresponds to a-b := (a-(b-1))-1, and
div := \ab. if (zero? (sub (succ a) b)) 0 (succ (div (sub a b) b)) (truncating division, using the fact that sub is cut off at zero).
However, we cannot do so, as the lambda calculus doesn’t allow recursion (remember, these definitions are just abbreviations, and we can’t have recursive abbreviations unless we’re unix programmers.) However, we can simulate recursion directly. The key observation is the existence of the fixed-point operator theta:
theta := (\xy.y(xxy)) (\xy.y(xxy))
Note that this is somewhat paradoxical, as we are applying a function to itself. This is what makes the lambda calculus so powerful (the fact that we can do this kind of thing), and also sometimes amusingly obscure :).
Then theta has the nice property that theta f = f (theta f). So, for example, if we have a formula f = … f …, we can replace it with f = theta (\s. … s …), and this allows us to simulate recursion in the lambda calculus. Thus, in the example above,
add := theta (\fab. if (zero? a) b (succ (f (pred a) b)))
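Theta as written diverges under Python’s eager evaluation, so this sketch uses the eta-expanded “Z” fixed-point combinator instead, which satisfies Z f = f (Z f) without looping (plain Python ints are used for readability, and the names are mine):

```python
# The Z fixed-point combinator: recursion without self-reference.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# add a b = if a = 0 then b else succ (add (a-1) b), with no named recursion:
add = Z(lambda rec: lambda a: lambda b: b if a == 0 else rec(a - 1)(b) + 1)
print(add(3)(4))   # 7
```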
And so on. Finally, we can also represent data structures within the lambda calculus. For example, list structures:
cons := \abf.fab
head := \c.c(\ab.a)
tail := \c.c(\ab.b)
nil := \f.true
nil? := \x.x(\ht.false)
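The list encodings follow the same pattern in Python (curried lambdas; names mine, with nil_ for nil? since ? is not legal in Python identifiers):

```python
# Church-encoded lists in Python.
true  = lambda x: lambda y: x
false = lambda x: lambda y: y

cons = lambda a: lambda b: lambda f: f(a)(b)   # \abf.fab
head = lambda c: c(lambda a: lambda b: a)      # \c.c(\ab.a)
tail = lambda c: c(lambda a: lambda b: b)      # \c.c(\ab.b)
nil  = lambda f: true                          # \f.true
nil_ = lambda x: x(lambda h: lambda t: false)  # nil?

xs = cons(1)(cons(2)(nil))
print(head(xs))                    # 1
print(head(tail(xs)))              # 2
print(nil_(nil)(True)(False))      # True
print(nil_(xs)(True)(False))       # False
```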
The recurring theme here is that we simulate other objects by functions that behave the way we would like when combined with other functions (e.g. above, head (cons p q) = p, etc.)
4. Computability and Decidability
This section is mathematics, and is heavier going than previous sections.
Roughly speaking, a recursive function is a partial function N^k → N built up from fundamental building blocks (the zero and successor functions and the projections) by composition, primitive recursion and minimalisation. Using the above encodings, it can be shown that any recursive function can be defined in the lambda calculus, that is:
If f : N^k → N is a recursive function then there exists a lambda term g such that <b>f(x1,…,xk)</b> = g <b>x1</b> … <b>xk</b>
Here the boldface represents the term representation of the number.
This is done just by construction using the definitions above.
Church’s Thesis states that the recursive functions are precisely the computable functions (i.e. the functions that can be computed by a Turing machine – a superset of those computable by any physical computer, as a Turing machine has a potentially infinite memory). And any lambda-definable function mapping N to N is recursive, as we can write down an algorithm to evaluate it – namely leftmost reduction.
This shows that we can do anything in the lambda calculus that we can on a Turing machine, i.e. we can do anything in the lambda calculus that can be done by any computer at all.
Now we look at decidability. The set of terms T is countable, so there is a bijection # : T → N. Furthermore, by constructing a suitable example, we can see that the bijection # is computable. Define “” : T → T by letting “t” be the numeral representing #t – so “” takes a term and maps it to the lambda calculus representation of that term’s number. The second recursion theorem states that
For all terms f, there exists a term u s.t. f “u” = u
and it can be proved using the fact that any computable function is lambda-definable. From this we can prove the Scott-Curry theorem, a generalisation of the fact that there is no algorithm that, given any two terms of the lambda calculus, can decide whether they are equal or not. (Note this result is about the general case: equality may be decidable in special cases, e.g. by the reduction described above, but no algorithm can take two arbitrary terms and always terminate with a definite answer. Note also that the equality referred to in this theorem is symmetric – up to now I have been using a directed = where things reduce from left to right, while here we take the symmetric, transitive closure of that relation, so a = b whenever a and b are connected by a chain of reductions in either direction.)
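The computable numbering # used above can be made concrete. The following sketch assumes the variables are exactly v0, v1, v2, … and codes terms via the Cantor pairing function, using the AST from section 1; the particular scheme is my own illustrative choice:

```python
# Coding terms as numbers: variables are "v0","v1",..., abstractions are
# ("lam", var, body), applications are ("app", f, a).
def pair(a, b):
    # Cantor pairing: a bijection N x N -> N
    return (a + b) * (a + b + 1) // 2 + b

def code(t):
    if isinstance(t, str):                                # variable vi -> 3i
        return 3 * int(t[1:])
    if t[0] == "app":                                     # application -> 3.pair(...)+1
        return 3 * pair(code(t[1]), code(t[2])) + 1
    return 3 * pair(int(t[1][1:]), code(t[2])) + 2        # abstraction -> 3.pair(...)+2

print(code("v0"))                  # 0
print(code(("lam", "v0", "v0")))   # the identity \v0.v0 gets number 2
```

Since the three cases land on the three residues mod 3 and Cantor pairing is a bijection, this numbering is itself a bijection T → N.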
5. “Compiling” the lambda calculus
Surprisingly, there is something even simpler than the lambda calculus that the lambda calculus can be translated into, and this simpler system can then be interpreted very easily. It is known as combinatory logic – we can compile the lambda calculus to combinatory logic, and run a CL interpreter that is ridiculously simple, as there are no abstractions (and thus no naming problems, so there is no need for alpha-conversion.) I shall perhaps come back and expand this section.
This introduction has described the untyped lambda calculus, where there are no types and all terms are treated alike. The extension is the typed lambda calculus, which is not as simple as the untyped lambda calculus.
In conclusion, lambda calculus is a very powerful system that is itself very simple and can be compiled into a very simple form.
I shall probably return and improve this at some point.
The Lambda Calculus: A Short Introduction
Started by mdchurchill, Nov 08 2005 02:01 PM