Practical Foundations of Mathematics

Paul Taylor

2.3 Sums, Products and Function-Types

We begin our reassessment of the constructions needed in mathematics by considering the function-type and sum. The system of types which is freely generated by the binary symbols x, Ž and + from Ć, 1 and a given collection of base types or sorts will be called simple type theory. Applied to sets, Ć, 1, x, + and Ž refer to the constructions of Section 2.1. Recall that these were accompanied by operations such as p_i, n_i and ev, which we shall now discuss. Later we shall show how certain other mathematical objects admit directly analogous structures.

Function (l) abstraction Sections 1.1-1.3 discussed how functions act, but they must also be considered as entities in themselves. Early in the history of the integral calculus problems arose in which the unknown was a function as a whole, rather than its value at particular or even all points: the Sun's light takes that path through the variable density of the atmosphere which minimises the time of travel; the motion of a stretched string depends on its initial displacement along its whole length.

REMARK 2.3.1 In order to consider a function per se, we must first identify which of the unknowns in an expression p are inputs. Lambda abstraction, l[(x)\vec], does this, thereby binding these variables (Definition 1.1.6). Sometimes the function already has a name, such as sin or sine, but the squaring function can only be written as ² or, now, as lx.x². Since l is clearer than the informal notation p(x) of Remark 1.1.2 as a way of distinguishing the inputs from other variables, we treat all variables, whether free or bound, as part of the expression p.

Given a function f and an argument a, the one may be applied to the other and evaluated to a result. This is usually written fa without brackets, in which the juxtaposition denotes a formal operation of application; as we shall need to study this operation in its own right, we shall sometimes write ev(f,a) instead. The result of the evaluation, which was written p(a) informally, is p[x: = a]. The passage (using substitution)

from (lx.p)a to p[x: = a]

is called b-reduction. See Definition 1.2.2 for related terminology.

We also identify lx.sinx with sin, and more generally

lx.fx with f, where x Ď FV( f).

This is, crudely speaking, a converse to the b-rule (where an abstraction is applied) because it says that when you abstract the application of a function you get the function back. It is called h-reduction. The interpretation of h as a computation step is not so clear as for b (arguably it should go from f to lx.fx), but it will be needed (as an equality) in Lemma 4.7.5ff to obtain an exact match between the intensional and extensional notions of function.

Besides the type-theoretic rules, we also intend that the constants may have their own laws, or d-rules, as they are known in the l- calculus.

For us, all terms are typed: if x and p have types X and Y, the abstraction lx.p has type XŽ Y ş Y^X. Notice that the two reduction rules preserve type, ie they obey subject reduction, Definition 1.2.12.

The type XŽ (YŽ Z) or (Z^Y)^X ş Z^(XxY) is that of a function of two arguments. This trick - of using l-abstraction to supply multiple arguments one by one to a function - is called Currying, though it had been observed by Moses Schönfinkel and was implicit in Frege's work.

CONVENTION 2.3.2 It is customary to omit the brackets as follows:

omitted eqnarray* environment

The notation is further abbreviated to [(X)\vec]Ž Y, f[(a)\vec] and l[(x)\vec].p respectively. As this deals with many-argument functions (in fact in a rather useful way), many authors omit pairing from the calculus. By contrast, the type (XŽ Y)Ž Z is that of what is sometimes called a function al, ie a function whose argument is itself a function, and so is more complex.

Normalisation A distinction is made between (lx.p)a and p[x: = a], and between application and evaluation, since the result of a b-reduction is frequently a longer expression than the first: it may contain more ls and so more opportunities for further reduction than the original term. So strategies for b-reduction are an important topic of study in themselves.

FACT 2.3.3 The Church-Rosser Theorem says that the pure l-calculus (without d-rules) is confluent (Definition 1.2.5). The simply typed pure l-calculus is also strongly normalising (Definition 1.2.8). []

The Church-Rosser Theorem relies too much on intricacies of syntax to be appropriate for this book: see, eg , [Bar81], [LS86] and [Bar92] for a detailed treatment. The result is valid in many different calculi - including the untyped l-calculus, in which the normalisation theorems fail - but unfortunately breaks down in some variations which seem semantically benign, such as the untyped calculus with surjective pairing. Without the type discipline (Notation 1.3.3) there need be no normal form to which a term reduces. For example the term (lx.xx)(lx.xx) reduces to itself, and there are much worse phenomena.

The depth of bracket nesting, with the above Convention for omitting them, considered as a notion of type complexity, can be used to prove weak normalisation : see Example 2.6.4. It follows from Fact 2.3.3 and Theorem 1.2.9 that normal forms exist and are unique. They are characterised in Exercise 2.23, and used in Theorem 7.6.15. Section 7.7 shows in another way that every term is provably equal to a normal form.

REMARK 2.3.4 When the l-calculus is used as a programming language [ Plo77], it is usual to forbid b-reduction under l, as lx.p is regarded as an as yet passive fragment of code, which is only activated when it is applied to some argument. Then there is a choice whether

(a): to reduce (lx.p)a straight away ( call by name), so avoiding the perhaps unnecessary risk of evaluating an undefined argument, or
(b): to wait until a has itself been normalised ( call by value), so that this is not done repeatedly if x occurs several times in p.

Contexts for the l-calculus The variable governed by l-abstraction, like that bound by the quantifiers $ and ", is generic, so we use boxes to delimit it. As in Section 1.5, the variable cannot be used outside the box, because to do so would restrict its generality and prejudice the question of whether its type is empty.

DEFINITION 2.3.5 The box rules for l-abstraction and application are

omitted vchbox environment omitted vchbox environment

cf the syntax for procedures with formal parameters in programming languages. Inside the box we may form and manipulate terms algebraically, treating x as a constant. Its operational meaning is given by the (Ž E )- and b-rules, where an actual parameter a is substituted throughout for the formal x ( cf Remark 1.5.10).

NOTATION 2.3.6 As for the predicate calculus (Definition 1.5.3), the abstract study of the l-calculus needs a sequent form in which

G\vdash t:X means that { omitted tabular environment

In the simply typed l-calculus, contexts consist only of typed variables, as there are no propositional hypotheses.

DEFINITION 2.3.7 The sequent forms of the rules for l-terms are

omitted prooftree environment omitted prooftree environment

The other five rules make (Ž Á ) into a two-way adjoint correspondence:

omitted prooftree environment

REMARK 2.3.8 We need structural rules for terms as well as formulae (Definition 1.4.8). In fact these ought to have been given for the ("Á )- and ($E )-boxes (Definition 1.5.1), in order to import terms.

= 0pt omitted array environment

In the informal notation, the effect of cut, weakening and contraction is t(x)Ž t(a), tŽ t and t(x,y)Ž t(x,x) respectively. The Y has been included to emphasise that these things can happen anywhere in the context, not just at the end of the list.

REMARK 2.3.9 There is a superficial similarity between the b- and cut rules. To see the b-rule in action in a symbolic idiom we need to use general expressions for the body p and argument a of the function, cf the various ways of expressing conjunction as a sequent rule in Remark 1.4.9; Example 7.2.7 gives a diagrammatic version with variables instead. Cut makes these substitutions, and this explains the likeness.

The substitution rule (Definition 1.1.10(d)) relates cut to (Ž Á ):

omitted prooftree environment omitted prooftree environment

The side-conditions z\not ş x Ď FV(c) are forced automatically by the need for the contexts to be well formed. It is these rules which allow any term definable outside the box to be imported, according to the scoping rules familiar in programming languages as block structure, cf Lemma 1.6.3. Remark 9.4.3 discusses the substitution rule further and relates it to the Beck-Chevalley condition in category theory.

The sum type The rules for sum largely mirror those for product. The pair a,b for the product is supplied by the data, and the program uses p_i to extract what it needs. For the sum, the program provides a pair [f,g] of options, from which the input makes a selection using n_i. There is an ultimate result type Q to mirror the context of parameters G. This symmetry is much more (arguably too) obvious in the categorical presentation (Sections 5.3-5.5); it is really spoilt by the asymmetry of the term calculus, in which the input but not the output may involve parameters. The defects require new variable-binding operations, and rules to handle substitution and its dual notion of continuation.

These rules are rather technical and will not be relevant until Section 5.3, so you should skip the rest of this section unless you are already very familiar with the typed l-calculus expressed in a contextual style.

REMARK 2.3.10 The sum type has inclusions n₀ and n₁,

omitted prooftree environment omitted prooftree environment

which are operation-symbols and so obey the obvious equality rules. The elimination rule for sums is case analysis. Its sequent form is

omitted prooftree environment

or, by using cut (Remark 2.3.8, cf conjunction in Remark 1.4.9),

omitted prooftree environment

There is no syntax for the term below the line that is as natural and concise as l, or that is universally agreed. In programming languages it is written in a form such as on the left below, where each branch offers a pattern n₀(x) or n₁(y) against which to match the data c.

omitted program environment omitted proofbox environment

Notice that the variables x and y are local to (bound by) their respective branches. The letter n in the introduction rules behaves as an operation-symbol, and is applied to a term a; in the elimination rule it is a variable binder like l. The b-rules give the meaning of this construct, deleting an elimination rule following an introduction. When the pattern n₀(x) meets the data n₀(a), the value a is substituted for x in f(x).

omitted eqnarray* environment

The h-rule says that if the two branches contain the same code p then the switch is redundant: [\Case 0 x.p(n₀(x)), \Case 1 y.p(n₁(y))](c) = p(c).

There is also an equality rule similar to (l = ) in Definition 2.3.7.

REMARK 2.3.11 Just as pairing handles many-argument operations uniformly, so l- abstraction avoids the need for other variable binders. Thus (+E) can be put in a form which doesn't itself alter the context:

omitted prooftree environment

The [ , ] notation is at least standard usage in category theory, where the rules can be summed up as the two-way adjoint correspondence

omitted prooftree environment

so Q^(X+Y) ş Q^XxQ^Y.

REMARK 2.3.12 Terms may be imported into the boxes, with the effect of substituting for free variables within the binding. The rule for this is exactly analogous to that in Definition 1.1.10(d),

[f,g](c)[w: = a] =

é
ë

f[w: = a], g[w: = a]

ů
ű

(c)

^
w

[f,g ] = [

^
w

g].

Semantically, this results in the distributive law ( cf Lemma 1.6.3),

(GxX)+(GxY) ş Gx(X+Y).

REMARK 2.3.13 Remark 1.6.5 showed that the ($E )- box is essentially open-ended below: any phrase of proof not involving the bound variable can be moved in or out. The (+E)-rule has the same property, but now that we are discussing significant terms rather than anonymous proofs, we must state another law, called a continuation rule or commuting conversion. This says that moving the continuation z into or out of the box has no effect,

zo[f,g] = [zof, zog],

so the h-rule can be expressed as [n₀,n₁] = id. Continuation is dual to substitution, and is explained in category theory by postcomposition.

As for the function-type, these reduction rules interact, and questions of confluence and normalisation have to be studied. For example we would like to know that every definable closed term of sum type is provably equal to either n₀(a) or n₁(b). This will be considered in Section 7.7.

REMARK 2.3.14 Since there are two binary introduction rules, there is no nullary one (and so no b-rule either), and as the binary elimination rule has two cases (premises), the nullary one has none.

omitted prooftree environment

The elimination rule provides a function ĆŽ Q for each type Q, and the corresponding equality rule says that this is unique. The (Ćh) and continuation rules mean that it is the identity on Ć itself, and preserved by any function. The substitution rule gives Proposition 2.1.9: any XŽ Ć is invertible, Q^Ć ş 1 and GxĆ ş Ć.

Contexts will be developed in Section 4.3 and Chapter VIII. Sections 4.5- 4.7 give a categorical account of products and function-types , applying the former to universal algebra. We discuss the binary sum further in Section 5.3ff, along with the if then else fi programming construct. Section 9.3 treats the infinitary (dependent type) analogue, Sx.Y[x].