14 The intermediate value theorem

14 The intermediate value theorem

We can now prove the intermediate value theorem within ASD, in the two forms (non-singular and singular) that we discussed in Section 2. In the non-singular case on a closed bounded interval, the approximate forms of the theorem in the previous section are enough to ensure that the solution-set S_f=Z_f is compact, overt and non-empty, and therefore has a maximum element. We begin by formulating the relevant definitions from Sections 1 and 2 in ASD.

Definition 14.1 A function …, x:ℝ ⊢ f x:ℝ (Definition 5.3)

doesn’t hover (cf. Definition 1.4) if, for d,u:ℝ,
d< u =⇒ ∃ x.(d< x< u) ∧ (f x≠ 0);
doesn’t touch from below without crossing if
(d< u) ∧ (f d< 0) ∧ (f u< 0) =⇒ (∀ x:[d,u].f x< 0) ∨ (∃ x:[d,u].f x> 0);
and doesn’t touch from above without crossing if
(d< u) ∧ (f d> 0) ∧ (f u> 0) =⇒ (∀ x:[d,u].f x> 0) ∨ (∃ x:[d,u].f x< 0).
A real number a:ℝ is a stable zero of f (cf. Definition 1.8) if
(d< a< u) =⇒ ∃ e t.(d< e< t< u) ∧ (f e< 0< f t ∨ f e> 0> f t);
and the possibility modal operator ◊:Σ^{Σ^ℝ} is, for φ:Σ^ℝ,
◊φ ≡ ∃ d< u. (∀ x:[d,u].φ x) ∧ (f d< 0< f u ∨ f d> 0> f u),

so f d< 0< f u=⇒◊⊤.
The solution co-classifier (or non-solution classifier) is ω x ≡ (f x≠ 0), which defines an open subspace W_f and a closed one Z_f; and,
restricting to an interval I≡[d,u], the necessity modal operator ▫:Σ^{Σ^I} is
▫φ ≡ ∀ x:I.(f x≠ 0)∨φ x,

which makes Z_f a compact subspace.

Proposition 14.2 The stable zeroes are exactly the members or accumulation points of ◊ in the sense of Definition 11.5.

Proof If a is a stable zero then, by Theorem 10.2,

φ a	⇒	∃ d u.(d< a< u) ∧ ∀ x:[d,u].φ x
	⇒	∃ d e t u.(d< e< t< u) ∧ (f e< 0< f t∨ f e> 0> f t) ∧ ∀ x:[e,t].φ x
	⇒	◊φ.

Conversely, suppose that φ a⇒◊φ and let d< a< u. With φ x≡(d< x< u),

⊤ ⇔ φ a ⇒ ◊φ ⇒ ∃ e t. (d< e< t< u)∧(f e< 0< f t∨ f e> 0> f t). ▫

Recall from Definition 11.1 that ◊ and ω define a closed overt subspace iff they satisfy ◊ω⇔⊥ and relative instantiation, φ x⇒ω x∨◊φ. The first of these comes for free:

Lemma 14.3 ◊ω⇔⊥.

Proof With φ x≡(f x> 0) and ψ x≡(f x< 0),

◊ω	≡	∃ d< u.(∀ x:[d,u].f x≠ 0) ∧(f d< 0< f u∨ f u< 0< f d)
	⇒	∃ d< u.(∀ x:[d,u].φ x∨ψ x) ∧(∃ y:[d,u].φ y) ∧(∃ z:[d,u].ψ z)
	⇒	∃ d< u.∃ w:[d,u].φ w∧ψ w
	⇒	∃ w:ℝ.0< f w< 0 ⇔ ⊥

by connectedness of [d,u] in the form of Proposition 13.7(b). ▫

Hence if a∈◊ then a∈{x ∣ ¬ω x}, or S_f⊂ Z_f in the notation of Section 2. In order to characterise the non-singular case, S_f=Z_f, we therefore show that relative instantiation is equivalent to the conditions in Definition 14.1.

Lemma 14.4 If f doesn’t hover or touch without crossing then ◊ and ω satisfy relative instantiation.

Proof If φ x then ∃ d u.(d< x< u) ∧ ∀ y:[d,u].φ y by Theorem 10.2, and, since f doesn’t hover in (d,x) or (x,u), there are d< e< x< t< u with f e≠ 0 and f t≠ 0, so without loss of generality f d≠ 0 and f u≠ 0.

This gives four cases, according as f d< 0 or f d> 0 and as f u< 0 or f u> 0. Two of them say

∃ d< u. (f d< 0< f u∨ f d> 0> fu)∧∀ y:[d,u].φ y,

which is ◊φ. Since f doesn’t touch from below without crossing, i.e.

(f d< 0∧ f u< 0) =⇒(∀ y:[d,u].f y< 0)∨(∃ y:[d,u].f y> 0),

in the third case we deduce (f x≠ 0) from the ∀ disjunct, whilst the other one provides a straddling interval [d,y] and so ◊φ. The fourth yields the same conclusion since f doesn’t touch from above without crossing. Hence φ x⇒ω x∨◊φ in all four cases. ▫

Lemma 14.5 If ◊ and ω satisfy relative instantiation then f doesn’t hover.

Proof Given d< u, consider φ x ≡ (d< x< u), for which

◊φ	≡	∃ e t.(d< e< t< u)∧(f e< 0< f t∨ f e> 0> f t)
	⇒	∃ y.(d< y< u)∧ (f y≠ 0).

Then (d< x< u) =⇒ (f x≠ 0) ∨ ◊φ =⇒ ∃ y.(d< y< u)∧ (f y≠ 0). ▫

Lemma 14.6 If ◊ and ω satisfy relative instantiation then f doesn’t touch from below without crossing.

Proof We are given e< t with f e< 0 and f t< 0. By Theorem 10.2, there are d< e< t< u with ∀ x:[d,e].f x< 0 and ∀ x:[t,u].f x< 0. Relative instantiation for φ x≡(d< x< u) gives

⊤	⇔	∀ x:[e,t].φ x
	⇒	∀ x:[e,t].(◊φ∨ f x< 0∨ f x> 0)
	⇔	◊φ ∨ ∀ x:[e,t].(f x< 0∨ f x> 0) Proposition 8.2(c)
	⇒	◊φ ∨ (∀ x:[e,t].f x< 0) ∨ (∀ x:[e,t].f x> 0)

by compact connectedness of [e,t], since f x< 0 and f x> 0 are disjoint. But the last disjunct contradicts the hypotheses that f e< 0 and f t< 0. Finally,

◊φ =⇒ ∃ x:[d,u].f x> 0 =⇒ ∃ x:[e,t].f x> 0

since ∀ x:[d,e].f x< 0 and ∀ x:[t,u].f x< 0. ▫

Theorem 14.7 The following are equivalent for f:ℝ→ℝ:

all zeroes are stable;
f doesn’t hover or touch without crossing;
◊ and ω satisfy relative instantiation, φ x =⇒ ω x∨ ◊φ, so they define a closed overt subspace; and
in any interval, ◊ and ▫ satisfy the mixed modal law ▫(φ∨ψ) ⇒ ▫φ∨◊ψ, so they define a compact overt subspace of that interval.

Proof [a⊣⊢c] by Lemma 14.3 and Proposition 11.6, [b⊣⊢c] by the previous three lemmas and [c⊣⊢d] by Proposition 12.1. ▫

Corollary 14.8 In this case, Bolzano’s formula for a zero in Theorem 1.1(a),

sup{y∈I ∣ f y≤ 0},

is, after all, not only valid in ASD but also computationally meaningful.

Proof The compact overt subspace is either empty or has a maximum (Theorem 12.9), but it cannot be empty by Propositions 13.4f. ▫

Now we turn to the singular case, of functions that may touch without crossing. We are still looking for stable zeroes, because finding tangential points requires other evidence that the function does take a value that is equal to zero (Section 1). However, the impact of the following results (which were developed while this paper was already in the reviewing process) is on the case where the stable and unstable zeroes are densely mixed, rather than simply for polynomials with double zeroes.

In the non-singular case the operator ◊ had very strong topological properties (relative instantiation and the mixed modal law) that related it to the closed or compact space of all zeroes, but this knowledge is no longer of any help now that the two spaces are different. Note that the Brouwer degree is not defined in this situation either.

Nevertheless, we argued that Theorems 2.5 and 2.8 capture the computational ideas behind solving equations. However, we leave the translation of the former into ASD as an exercise, and instead consider a new argument that has the generality of the classical intermediate value theorem.

Example 1.2 is typical of intermediate-value questions that arise in the real world, such as exactly when the boom of 2007 turned into the crash of 2009: the more economic indicators we investigate, the muddier it becomes. The common-sense answer is that it happened during a certain interval, on which we may place upper and lower bounds. However, since these are extremely imprecise, this concession to classical pure mathematicians will not appeal to numerical analysts.

We therefore propose

Definition 14.9 A solution of an equation f x=0 (where f:[d,u]→ℝ with f d< 0< f u) is not necessarily a point but an occupied compact interval, i.e. a Dedekind pseudo-cut [δ,υ] in the sense of Definition 9.9. Of course, there may be many such intervals, just as f may have many zeroes in the usual pointwise sense.

Definition 14.10 An open subspace φ:Σ^ℝ is contour-closed if it is a union of open intervals at whose (Euclidean) endpoints the function is non-zero:

φ x ⇔ ∃ d u.(d< x< u) ∧ (f d≠ 0) ∧ (f u≠ 0) ∧ ∀ y:[d,u].φ y.

We are thinking of geographical contours here: if φ includes part of a hovering interval then it contains the whole of it. In Example 1.2, the interval (0,3) is contour-closed but (0,1⅔) and (1⅓,3) are not. On the other hand, f doesn’t hover iff every open subspace is contour-closed.

Lemma 14.11 Any union or binary intersection of contour-closed open subspaces is contour-closed. If φ is contour-closed then so are its components, the convex open intervals in Theorem 13.15. If (unlike Example 1.11) f is not eventually constantly zero then ⊤ is contour-closed. ▫

Then the generalisation of Theorem 2.5 is

Theorem 14.12 Restricted to contour-closed open sets, the operator ◊ preserves joins and satisfies the (easy) mixed modal law ▫φ∧◊ψ⇒◊(φ∧ψ).

Proof By Definition 14.1, ◊(∃ i.θ_i ) says that there is a straddling interval [d,u] that is covered by the union ∃ i.θ_i . By Definition 14.10, each member θ_i of this union is itself a union of intervals (e,t) with f e≠ 0≠ f t, so

◊(∃ i.θ_i )	≡	∃ d u.(f d< 0< fu∨ fd > 0> f u)
		∧ ∀ x:[d,u].∃ i.∃ e t.(e< x< t)
		∧ (f e≠ 0∧ f t≠ 0) ∧ ∀ y:[e,t].θ_i y.

Now we apply the Heine–Borel theorem in its traditional form to the open cover indexed by (i,e,t). This uses general Scott continuity (Axiom 9.1), the hyper-space of lists from an overt discrete space (Remark 12.16) and a combinatorial argument like those in the next section. It yields

◊(∃ i.θ_i )	⇔	∃ d u:ℚ.(f d< 0< f u∨ f d > 0> f u)
		∧ ∃ℓ:List (I×ℚ×ℚ).
		(∀ x:[d,u].∃(i,e,t)∈ℓ.(e< x< t))
		∧ (∀(i,e,t)∈ℓ.(f e≠ 0∧ f t≠ 0) ∧ ∀ y:[e,t].θ_i y),

in which, by Theorem 10.2, we may assume that d, u and all of the e and t in the list ℓ are distinct. Arranging them in numerical order, some successive pair (p,q) must have f p< 0< f q or f p> 0> f q. Although p and q are not the e and t of any (i,e,t)∈ℓ because the intervals must overlap, nevertheless the interval [p,q] is part of some such [e,t], and therefore ∃ i.∀ y:[p,q].θ_i y. Hence ∃ i.◊θ_i and the modal law also follows using Proposition 12.1 and Lemma 14.3. ▫

Corollary 14.13 For any function f:I≡[d,u]→ℝ with f d< 0< f u, there is a non-deterministic program ◊⊤ that finds arbitrarily close approximations to interval-valued stable zeroes.

Proof In Theorem 1.7 we found a nested sequence of open intervals for which the function takes non-zero values with opposite signs at the endpoints. The non-hovering condition ensured that we could divide any such interval approximately in half at a new non-zero value (Theorem 2.8), although interval-Newton methods can make much better choices (Remark 2.9). Without this condition, we can apparently do no better than prod randomly (and in parallel) in search of non-zero values. Nevertheless, if we find them then the subintervals under consideration are contour-closed, so the Theorem applies to them.

The endpoints of the sequence that we choose generate a pseudo-cut, cf. Theorem 6.14. If the non-deterministic search for non-zero values is fair (in the sense that it will eventually consider any possibility that exists, and distinguish any value from zero if it is non-zero), the function is constantly zero on the interval that this pseudo-cut defines. ▫

In Remark 2.6 we mentioned the classical objection that the non-hovering condition is redundant, because both hovering and non-hovering functions have zeroes, respectively for trivial and constructive reasons. Instead of this disjunction of untestable cases, we can factorise the given map f:I→ℝ into a composite of maps that capture the purely classical and purely constructive ideas. (This also re-admits the constantly zero function.)

Lemma 14.14 The relation x#_f z (or just x# y) defined by

∃ y. (x< y< z ∨ x> y> z) ∧ f y≠ 0

co-classifies a closed equivalence relation: it is irreflexive, symmetric and co-transitive,

x# z=⇒ x≠ z, x# z=⇒ z# x and x# z=⇒ x# y∨ y# z,

and its equivalence classes are compact intervals (Definition 9.9).

Proof Observe that x# z ⇔ δ_z x∨υ_z x, where

δ_z x ≡ ∃ y.(x< y< z)∧ f y≠ 0 and υ_z x ≡ ∃ y.(x> y> z)∧ f y≠ 0

are rounded and disjoint but not necessarily bounded or located. For co-transitivity, suppose δ_z x. Then, by Theorem 10.2,

∃ u w. (x< u< w< z) ∧ ∀ v:[u,w].f v≠ 0, whilst u< y∨ y< w,

of which the first disjunct gives δ_y x and the second δ_z y. ▫

Theorem 14.15 Any function f:I≡[d,u]→ℝ factorises as f=g· q where

q:I→I/(#_f) is a proper surjection onto a compact Hausdorff space and has compact connected fibres; and
g:I/(#_f)→ℝ doesn’t hover, but if f d< 0< f u then g d< 0< g u.

If f doesn’t hover then #_f is ≠ and q=id_I, whilst if f is constantly zero then # is ⊥ and I/(#_f) is a singleton.

Proof The quotient of an overt discrete space by an open equivalence relation is constructed in Section C 10, but that paper only used the part of the calculus that is strictly lattice dual (without assuming Scott continuity or anything about ℕ or ℝ), so the result is also directly applicable to the quotient of a compact Hausdorff space by a closed equivalence relation. However, this uses the more general theory of Σ-split subspaces that was sketched in Remark 7.12 and does not belong to the class of simpler types that are generated by Axiom 4.1.

The condition f d< 0< f u ensures that δ_z and υ_z are bounded, so the fibres are compact connected (Theorem 15.10). The function f respects the closed equivalence relation #_f because

f x≠ f y =⇒ x#_f y,

and therefore factors through the quotient. The relation #_g is ≠ on I/(#_f), so g doesn’t hover and the constructive intermediate value theorem applies to it. ▫

This answers the opening discussion in Section 1:

Corollary 14.16 The difference between the general classical intermediate value theorem and its restricted constructive version lies exactly in the distinction between being occupied and inhabited (Definitions 8.6 and 11.5). ▫

Remark 14.17 In order to generalise this argument to Brouwer’s fixed point theorem or to finding zeroes of f:ℝⁿ→ℝⁿ, we must replace the intervals in Definition 14.10 by polyhedra on whose faces (or (n−1)-skeleton) f≠ 0, cf. Remark 2.7.

The argument with alternating signs that we used in Theorem 14.12 has a well known generalisation in combinatorial topology called Sperner’s Lemma [Spe28]. For example, Dirk van Dalen[vD09] used this to generalise the approximate intermediate value theorem (Proposition 13.4). In Günter Baigger’s counterexample [Bai85, Pot07] there is a grid of zeroes (or fixed points), so its only contour-closed open subset is the whole square. ▫