Notes on small inductively defined classes and the majorisation relation

Mathias Barra
Department of Mathematics, University of Oslo
November 2009

Dissertation presented for the degree of Philosophiae Doctor (PhD)

Preface.

This thesis does not resemble anything I had envisioned when I embarked on my PhD studies in the spring of 2006. In-between my MSc studies, which I completed in the early summer of 2004, and the time when I could finally return to the Niels Henrik Abel building at the University of Oslo (UiO) – courtesy of a grant from the Norwegian Research Council – I had been fortunate enough to be able to do some mathematics together with one of my present supervisors, Lars Kristiansen. Together we submitted a paper ([K&B05]) to the first CiE meeting, which convened at the Universiteit van Amsterdam in June 2005. The plan was to perhaps pursue the work presented there – which could also be seen as a continuation of the work I did for my MSc thesis – further in the direction of investigating non-determinism in rewriting systems and λ-calculus. This work resulted in a presentation (and an extended abstract [BKV07]) with Lars Kristiansen and Paul Voda for the CiE 2007 meeting in Siena. However, I soon realised that my interests were more inclined towards idc.-theory and the so-called small Grzegorczyk classes. I had ‘secretly’ started to jot down some thoughts and ideas in a small note which I called Inherently bounded functions, and in parallel I was writing down thoughts about, and reading up on the available literature on, the subject of very small inductively defined classes (idc.) of functions. These investigations were collected in a work-in-progress manuscript called Minimal IDC’s, and a substantial part of this thesis is composed of the two articles which that research resulted in: Bounded Minimalisation and Bounded Counting in Argument-bounded idc.’s ([B09a]) and Pure iteration and periodicity ([B08b]).
This research also permitted me to continue the work on inherently bounded functions, albeit within a quite different formalism than the one I initially envisioned. As it happened, Kristiansen and Voda had been looking at a similar approach to small idc.’s as had I, with the difference that their theory was more elegant yet less general. I immediately thought that their theory of detour degrees was perfect for adapting so as to accommodate my own more general approach. The reader can judge the first emerging results of this effort in chapter 3. The final chapter of this thesis reports on some of the research on the majorisation relation and certain well-orders which Philipp Gerhardy and I have been doing jointly over the last year and a half. Although seemingly a very different game than idc.-theory, it is my hope that readers will appreciate the connection between this part and the first two, and that this thesis will arouse their interest in all the subjects covered. There are so many people who deserve my gratitude, and I would like to thank in particular the following: Lars Kristiansen and Dag Normann, who have been my supervisors during my PhD studies. Both also supervised my MSc research, lectured courses I attended during my undergraduate years, encouraged me to pursue Logic, helped me find work when I had no funding, and spurred me on during the last four years. Philipp Gerhardy, for being both a friend and a colleague, and for the invaluable effort, insight and energy he has added to the joint work on the majorisation relation, parts of which are presented here. Philipp also read the final draft, offered many suggestions and spotted quite a few of my grammatical errors. Sindre Duedahl, for being a true friend and for his die-hard belief that I could be taught Functional Analysis, Measure theory and Differential structures (which were topics for my extra-logical credits).
All the people who have attended and organized the CiE meetings 2005–2009, with whom I have had the fantastic opportunity to discuss ideas, listen to inspiring lectures, not to mention enjoy their great company. In particular I would like to express my gratitude to ‘Barry Cooper’s Leeds crew’, who arranged a fantastic week of Mathematical Logic during the summer of 2006 in the MATLOGAPPS summer-school series. I would also like to thank here all those anonymous referees who have provided valuable comments and thoughts on the various articles I have (co-)submitted to these conferences. I am also indebted to the members of the adjudication committee, for undertaking to read and evaluate the following pages. Not only did they volunteer to undertake this chore, but they also identified typos and misprints, and proposed minor yet important and clarifying adjustments to the text. This means less embarrassment on my part, and, more importantly, ensures a better experience for future readers. Thank you. Invaluable support has also come from my family and friends during these last years – I could not have completed this thesis without you! – not only for practical reasons, but more importantly because you have all contributed towards keeping me emotionally ‘afloat’. My family-forest is a quite complex structure, and apart from my mother, father and their respective spouses I hope you will forgive me when I refrain from thanking you all by name or designation, lest I should have to add an entire appendix. I love and thank you all. Finally, there are three special people whom I must not merely thank, but to whom I owe everything: my wonderful son Angelo, my complement and completion Iselin, and her wonderful daughter Lilja – you are my universe.

Because of a great love, one is courageous. – Lao Tzu

A note on the formatting of the Dissertation

Embedded articles. This dissertation contains 3 embedded articles – imported into the LaTeX document as pdf-files.
They appear exactly as they were published/submitted. These pages feature double page-numbering: the original ‘extra-dissertational’ numbering appears in the top-left or top-right corners, while the ‘intra-dissertational’ numbering appears centered in the footer. The articles are [B09a] pp. 61–87, [B08b] pp. 93–102 and [B&G09] pp. 177–186. There is also a discrepancy in the heading- and theorem-structure formatting between the embedded articles and the main text. Theorems (lemmata, corollaries, etc.) and definitions are numbered continuously throughout the dissertation. The reader will note that theorem-items in the embedded articles appear with a boldface heading, while in the main text they come with a small caps heading. Keeping this in mind, hopefully cross-referencing will be manageable for the reader.

Cross-referencing. For obvious reasons, references within an embedded article never point outside of the article itself, and any notational convention made elsewhere will not apply to pages containing these; on the contrary, notational conventions are frequently breached there.

Convention 1 (some notational conventions) A reference of the form convention 1 (set in the small caps font) refers to an item within the main text, while a reference of the form [B08a]-Lemma 2 (set in the boldface font) refers to the corresponding result found in [B08a]. Note also that all internal cross-references are typeset with a small first letter, while the heading of the item referred to is typeset with a larger letter, as in: ‘the heading of theorem 3 is “Theorem 3”’.

In a proof (or elsewhere) the labelling of some relation-symbol – as in ‘F1 =^(T.3) F2’ – means that the equality is justified by the relevant result; in the example, theorem 3 is the intended reference. Other abbreviations are (L)emma, (P)roposition, (C)orollary and (O)bservation.
Any other labelling, usually a ‘†’ or a ‘‡’, indicates that a justification follows below, while a question mark indicates that the relation is an open problem.

Footnotes. The problem of choosing suitable footnote-marks for mathematics, and where to insert them, is always present, mostly because it should not be possible to confuse the footnote-mark with a superscript. This issue can be avoided by simply not using them. However, the asterisk ‘*’ is not used at all∗ as notation in this thesis, and the asterisk∗∗ is used for this purpose and this purpose only. Furthermore, a footnote mark is never super-scripted onto a mathematical symbol or formula, but rather onto the English word immediately preceding or succeeding it.

∗ Except in the embedded article [B09a].
∗∗ Or a couple of them.

Notational discrepancies. Sometimes, introducing a flexible and versatile notation – with a high initial cost in terms of space and time to familiarise oneself with it – is well worth the effort. Other times, the initial cost is too high. A PhD dissertation is an example of the first scenario, while e.g. a shorter extended-abstract-like conference contribution is an example of the latter. Consequently – though I have tried to keep it to a minimum – notation is not as uniform throughout the dissertation as one might wish. The final Bibliography is complete in the sense that it contains all references from the main text and all references from the embedded articles.

Contents

1 Introduction 1
  1.1 How Difficult Is IT? 1
  1.2 Definitions 3
    1.2.1 Standard symbols. 3
    1.2.2 Sets, relations and maps. 3
    1.2.3 Function, schema and idc. 4
    1.2.4 Predicates. 6
2 Minimal IDC.’s 11
  2.1 Introduction 11
    2.1.1 The MINIMAL in Minimal idc.’s. 11
    2.1.2 Why idc.’s? 12
    2.1.3 The case for projections, constants and composition. 13
    2.1.4 Intermission – on the term class. 19
    2.1.5 Logical functions and 1st-order logic. 20
  2.2 Basic Functions 25
    2.2.1 The maximum and minimum functions. 27
    2.2.2 The case-functions. 31
    2.2.3 The predecessor function. 37
    2.2.4 The truncated difference function. 39
    2.2.5 Combinations. 41
  2.3 Schemata 45
    2.3.1 Historical context and motivation. 45
    2.3.2 Bounded minimalisation and bounded counting. 59
    2.3.3 Primitive recursive schemata. 89
  2.4 Summary of Results 107
3 Relativised Detour Degrees 109
  3.1 Introduction 109
  3.2 F-degrees – Relativised Detour Degrees 113
    3.2.1 X-degrees. 113
    3.2.2 Foundations. 115
    3.2.3 Bootstrapping, remarks and subtleties. 115
    3.2.4 Closure properties of F(f) and Ff. 119
    3.2.5 The structure (DF, ⊑F, ∩F, ∪F). 122
    3.2.6 An enhanced lemma. 125
    3.2.7 Three open problems and some partial answers. 126
  3.3 Specific F-degrees. 133
    3.3.1 On id vs. id + 1 when op = it. 137
  3.4 Directions for further research. 145
4 Majorisation 151
  4.1 Definitions 151
  4.2 On the Order 155
    4.2.1 Monotonicity properties. 156
    4.2.2 Majorisation properties. 165
    4.2.3 Generalised towers of two’s. 169
  4.3 Skolem’s problem 173
    4.3.1 Notational bridge. 173
    4.3.2 Proof of Theorem 2. 174
  4.4 Concluding Remarks 191
Bibliography 193

Introduction

Dealing with complexity is an inefficient and unnecessary waste of time, attention and mental energy. There is never any justification for things being complex when they could be simple. – Edward de Bono

1.1 How Difficult Is IT?

At the core of this dissertation lies the question How difficult is it? As a novice logician, one finds that a great number of the problems one is asked to solve in exercises, and many of the theorems exhibited and proved during lectures, are in some way – implicitly or explicitly – concerned with answering an instance of this question. Logicians prove results about the sizes of formal proofs, be it derivation length or number of symbols in their syntactic representation. Logicians also prove results about the size of ordinals necessary to carry out inductive arguments of sufficient infinite transcendence to prove a given theorem; here the ordinal becomes a measure of the complexity – relative to the formal system of deduction within which the induction takes place – of the theorem in question. Logicians prove recursion-theoretic theorems about relative reducibility and degrees which yield insight into whether or not some function is more difficult to compute than some other.
And, of course, computer scientists do much of the same in their closely related field. Obviously, exactly what it refers to in the question above will decide whether or not the question makes sense at all, and if so, it will usually suggest a variety of different ways to suitably formalise the question into a mathematical object subjectable to mathematical analysis. In our context it is for the most part the process of recognizing elements from some A ⊆ N^k. By process we mean algorithm. But what is an algorithm? Within the field of computer science one can take algorithm to mean a Turing machine (TM). Trying to determine an upper bound on the number of computation-steps required by some (optimal) Turing machine∗ in order to decide membership of natural numbers n in some predicate A, as a function of the size of (any reasonable representation of) n, gives rise to time-complexity theory, and classes like polynomial-time and exponential-time. Counting the number of tape-cells written onto during the execution of an algorithm gives rise to space-complexity theory. Taking TMs as the basic notion of an algorithm and investigating time- and/or space-measures on ‘difficultiness’ – better known as complexity theory – has withstood the test of time, and research in this field is plentiful, enlightening and brings forth new and intriguing insights on a regular basis. Still, there are other approaches. One of these is based on the early work by Grzegorczyk, Péter, Roberts and Roberts and many others∗, and we will call this formalism idc.-theory here.

∗ Turing machines will not be defined formally, as they are used informally only in this thesis.
Informally, idc.-theory is the sub-discipline of recursion- or computability-theory in which the algorithms are certain inductively obtained definitions of functions, built from some basic functions combined with certain well-delimited rules for constructing new functions from previously defined ones. One then attempts to characterise, describe, compare and measure the complexity of the resulting functions. The literature abounds in papers on the interrelationship between idc.-theory and complexity theory∗∗, which suggests that the one approach cannot have more intrinsic interest to logicians or computer scientists than the other. We only mention Clote [Clo96] here; more references are found throughout this thesis. Also contained in this thesis is joint work with Philipp Gerhardy on the majorisation relation (chapter 4). Here the object of study is an order induced on the functions of certain families of functions. This part is only indirectly related to the rest of the research presented; however, the same question heading this introduction is very much present also in that work. The ‘it’ is there the order, and the ‘How difficult’ is now the size of the ordinal of this order. Because the nature of the topics treated in the various chapters is as varied as it is, we delay further introductory remarks to the individual chapters 2–4. First we define in section 1.2 the terminology and formal definitions needed in order to present, discuss and develop our results.

∗ Some of the historical background will be reviewed in chapters 2 and 3.
∗∗ The canonical example is perhaps Ritchie’s celebrated result that E⋆² = linspace, from [Rit63].

1.2 Definitions

1.2.1 Standard symbols.

We assume that the reader has a basic understanding of 1st-order logic, as presented in e.g. Shoenfield [Sho67] or Leary [Lea00].
Most symbols and concepts will be defined during the exposition, with the exception of standard logical, set-theoretical and arithmetical symbols (e.g. ∧, ∨, ∃, . . . , ∩, ∪, ∈, . . . , <, ≤, . . . ).

1.2.2 Sets, relations and maps.

A set is uniquely characterised by its elements. That is, we shall not need to say anything more on this issue, and the reader may simply have in mind ‘that which is studied in a course on ZFC set-theory’ (e.g. Kunen [Kun99]). The set of natural numbers {0, 1, 2, 3, . . .} is denoted N, and the set of integers {. . . , −1, 0, 1, . . .} is denoted Z. We also assume the reader is familiar with the concept of ordinal numbers and their arithmetic as presented in e.g. Sierpiński [Sie65] or Kunen [Kun99]. Ordinals will not be used in an essential way until chapter 4. The cardinality |A| of a set A is the number of elements in A, and ω is the least infinite ordinal. A set A is finite when |A| ∈ N, equivalently when |A| < ω. We will not need to worry about the Axiom of Choice, and in general all sets encountered are countable, that is, |A| ≤ ω. For two sets A and B, the Cartesian product A × B is the set of all ordered pairs (a, b) such that a ∈ A and b ∈ B. A^k denotes the natural generalisation of the Cartesian product of A with itself k times, and as usual we extend the notion of ordered pair to that of an ordered k-tuple, so that e.g. A³ =def A × A × A = {(a, b, c) | a, b, c ∈ A}. A k-ary relation on a set A is a subset R ⊆ A^k, and we write (a1, . . . , ak) ∈ R, ⟨a1, . . . , ak⟩ ∈ R, or R(a1, . . . , ak) interchangeably. A 2-ary relation is called binary, and we often write aRb for (a, b) ∈ R, while a ̸R b abbreviates ¬(aRb). There is one important exception to this convention: A ⊊ B does not mean ¬(A ⊆ B), but rather A ⊆ B ∧ A ≠ B. A binary relation is reflexive if ∀a∈A (aRa), anti-reflexive if ∀a∈A ¬(aRa), symmetric if ∀a,b∈A (aRb → bRa), anti-symmetric if ∀a,b∈A ((aRb ∧ bRa) → a = b), and transitive if ∀a,b,c∈A ((aRb ∧ bRc) → aRc).
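These relational properties are all finitely checkable on a finite carrier set. The following sketch is purely illustrative – the function names are our own, not notation used in this thesis – and verifies the properties by brute force over a finite relation:

```python
# Brute-force checks of the relation properties defined above,
# for a binary relation R (a set of pairs) on a finite carrier set A.

def is_reflexive(A, R):
    return all((a, a) in R for a in A)

def is_antireflexive(A, R):
    return all((a, a) not in R for a in A)

def is_symmetric(A, R):
    return all((b, a) in R for (a, b) in R)

def is_antisymmetric(A, R):
    return all(a == b for (a, b) in R if (b, a) in R)

def is_transitive(A, R):
    return all((a, c) in R for (a, b) in R for (b2, c) in R if b == b2)

A = {0, 1, 2, 3}
LEQ = {(a, b) for a in A for b in A if a <= b}  # the usual order on A

# The usual <= is reflexive, anti-symmetric and transitive (a partial order).
assert is_reflexive(A, LEQ) and is_antisymmetric(A, LEQ) and is_transitive(A, LEQ)
```

Representing a relation extensionally as a set of pairs makes each defining formula a direct one-line translation.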
A binary relation ⊑ on A is a partial order (p.o.) if ⊑ is reflexive, anti-symmetric and transitive. When (A, ⊑) is a p.o. we define a ⊥ b ⇔def a ̸⊑ b ∧ b ̸⊑ a. A total order is a p.o. where ∀a,b∈A (a ̸⊥ b), and a well-order is a total order where ∀B⊆A (B ≠ ∅ → ∃b∈B ∀c∈B (b ⊑ c)). An equivalence relation (e.r.) on A is a reflexive, symmetric and transitive binary relation. A partition of a set A is a family {Ai}i∈I such that A = ∪i∈I Ai and Ai ∩ Aj ≠ ∅ ⇔ i = j. Any e.r. ≃ on A induces a partition {[a]≃}a∈A of A into the equivalence classes [a]≃ =def {a′ ∈ A | a′ ≃ a}. Let A, B and C be sets. A map F ⊆ A × B is a binary relation such that

∀a∈A ∃b∈B (aFb ∧ ∀c∈B (aFc → c = b)). (M)

F : A → B means that F is a map from A to B, and we write F(a) = b for aFb. For F : A → B, A is the domain Dom(F), and B is the co-domain. For C ⊆ A, as usual F(C) =def {F(a) | a ∈ C} ⊆ B, and F(C) is called the image of C under F. F(A) is called the range of F, denoted Im(F). F↾C denotes the unique map F′ : C → B such that F↾C(a) = F(a) for all a ∈ C, and F′ is called the restriction of F to C. A partial map F : A → B is a map F : A ∪ {↑} → B ∪ {↑} satisfying F(↑) = ↑. Here ↑ denotes the undefined. The expression A ∋ a ↦F E(a) ∈ B specifies a map F whenever ‘E’ is some expression which satisfies (M) above. If A and B are clear from context, they may be omitted, as can the F if no name for the map is needed. We let B^A =def {F | F : A → B}. We identify the set 2^A with the power set of A, that is the set {B | B ⊆ A}. For B, C ∈ 2^A, the expression B ⊥ C is w.r.t. the p.o. (2^A, ⊆). Define B ⊂fin A ⇔def B ⊆ A ∧ |B| ∈ N. As usual A \ B =def {x | x ∈ A ∧ x ∉ B}, and the operation \ is called set difference. If F, G : A → B and R is a binary relation on B, then F R^(a.e.) G abbreviates ∃C⊂fin A ∀a∈A (a ∈ (A \ C) → F(a) R G(a)).

1.2.3 Function, schema and idc.
Unless otherwise specified, a function is a map f : N^k → N, for some k ∈ N. For a function f : N^k → N the arity ar(f) of f is then k. By convention ar(f) = k, so that f(~x) = f(x1, . . . , xk), unless otherwise specified. id is the identity-function on N. Functions f : N^k → N are occasionally referred to as number-theoretic functions, or arithmetic functions. In our treatment of functions f : N^k → N we shall be quite formal in our introduction of them, but we shall take (Z, +Z, −Z, ≤Z) – the integers with integer-addition, -subtraction and the standard ‘less than or equal to’-relation – as our naive and given fixed structure, not to be defined further. For example, we will formally introduce the maximum-function in terms of ≤Z restricted to the natural numbers (see definition 18 (p. 27)), and the predecessor-function on N formally as∗ max(x − 1, 0) (see definition 22 (p. 37)). This choice might seem odd, but (1) as we intend to give neither an axiomatic development of the fundamental concepts and structures discussed, nor a philosophical justification of them, and (2) since we will need some starting point for an otherwise quite formal treatment of the involved concepts, it turns out that this structure is perfect for our needs. Thus, though we consider the successor (S(x) = x + 1) as computationally or algorithmically simpler than addition x + y, we do not pass any judgment on their relative philosophical complexity. Indeed, as we define S via reference to +Z, in a sense, and implicitly, we could be accused of advocating the opposite view: that one cannot know what it means to add just one unless one knows how to add arbitrary numbers. The fact of the matter is that we are unopinionated on these matters – at least here.

∗ Which would then be max(x + (−1), 0) if you are really cantankerous.
That is: we feel at liberty to assume the existence of the integers and the additive group over these, and to define further concepts in terms of these structures. We do want to define the successor function formally, and, as emphasised above, the way this is done is a matter of convenience rather than philosophical conviction.

Definition 1 (Majorisation) Define the majorisation relation ‘⪯’ on N^N by

f ⪯ g ⇔def ∃N∈N ∀x≥N (f(x) ≤ g(x)).

We say that g majorises f when f ⪯ g, and as usual f ≺ g ⇔def f ⪯ g ∧ g ̸⪯ f. We say that f and g are comparable if f ≺ g or f = g or g ≺ f. Hence g majorises f when g is almost everywhere (a.e.) greater than or equal to f. The relation ⪯ is transitive and ‘almost’ anti-symmetric on N^N; that is, we cannot have both f ≺ g and g ≺ f, and f ⪯ g ∧ g ⪯ f ⇒ f =a.e. g, which means that ⪯ is anti-symmetric on N^N/=a.e..

Definition 2 (k-ary Majorisation) We also want to be able to compare functions of higher arity than one with respect to asymptotic growth. Given two functions f, g : N^k → N we define

f ⪯ g ⇔def ∃A⊂fin N^k (~x ∉ A → f(~x) ≤ g(~x)).

The meaning of f ≺ g is as expected. For unary h we define

f ⪯ h ⇔def f(~x) ⪯ h(max(~x)).

Definition 3 (argument-bounded) A function f : N^k → N is called argument-bounded (a.b.) if f ⪯ id. If∗ f ≤ id we say that f is strictly a.b. Note that f is a.b. iff ∃cf∈N ∀~x∈N^k (f(~x) ≤ max(~x, cf)), and that if cf = 0, then f is strictly a.b.; this is the definition used in the embedded articles. This is a core concept for this exposition, in particular in chapter 3, and most of the functions we shall acquaint ourselves with in this dissertation will be a.b. We will often consider argument-bounded versions of non-argument-bounded functions.

∗ For functions f, g the assertion f ≤ g is interpreted in the usual pointwise sense: ∀~x (f(~x) ≤ g(~x)).
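Majorisation quantifies over all sufficiently large arguments, so it cannot be verified mechanically in general; still, a finite probe beyond a claimed witness N illustrates definition 1. In the sketch below the function names and the probe range are our own choices, not notation from this thesis:

```python
# Definition 1 says f is majorised by g when some N satisfies
# f(x) <= g(x) for ALL x >= N.  A program can only probe a finite
# segment, so this check is a sanity test for a *claimed* witness N,
# sufficient for simple monotone examples but not a proof in general.

def majorises_with_witness(f, g, N, probe=10_000):
    """Check f(x) <= g(x) for all N <= x < N + probe (a finite probe)."""
    return all(f(x) <= g(x) for x in range(N, N + probe))

f = lambda x: 2 * x + 100
g = lambda x: x * x

# g majorises f: x*x >= 2x + 100 holds for all x >= 12, so N = 12 works...
assert majorises_with_witness(f, g, 12)
# ...but not everywhere: f(1) = 102 > 1 = g(1), so the 'a.e.' is essential.
assert f(1) > g(1)
```

The example also shows why ⪯ is only ‘almost’ anti-symmetric: the two functions may disagree on any finite initial segment without affecting the relation.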
Definition 4 (the bounded f) For a function f : N^k → N, the bounded f is the (k + 1)-ary function f̄, defined by

f̄(~x, y) =def min(f(~x), y).

Definition 5 (top index) For a function f, we say that f has top-index i if ∃cf∈N ∀~x∈N^k (f(~x) ≤ max(xi, cf)). If cf = 0 we say that i is a strict top-index. Clearly the presence of a top-index implies that f is a.b., and a strict top-index implies f strictly a.b. Furthermore, f̄ has strict top-index k + 1.

Definition 6 (functional, operator and schema) A functional or operator is a map σ : A1 × · · · × Ak → Ak+1 where each Ai is a set of the form {f | f : N^ki → N} for some ki ∈ N. A schema op is a set of functionals or operators. A schema op is argument-bounding (a.b.) if, for all σ ∈ op, we have that if ~f are a.b., then σ(~f) is a.b.

Definition 7 (idc.) Let X be some set of initial, primitive or basic functions, and let op be a set of operators. The inductively defined class of functions (idc.) [X; op] is defined as the least set of functions F containing X and closed under the schemata of op, that is, for each functional σ ∈ op, viewed as a partial map on the set of all arithmetical functions,

~g ∈ Dom(σ) ∩ F ⇒ σ(~g) ∈ F.

As we shall have much more to say about idc.’s below, this initial definition should be taken as preliminary and naive only.

1.2.4 Predicates.

A predicate is a subset R of N^k for some k. Predicates are interchangeably called relations or problems. Usually uppercase Latins R, S, T, . . . denote a predicate. The complement R̄ of R ⊆ N^k is defined by R̄ =def N^k \ R. There are two predicates associated with a function f which will play a major rôle in this exposition. They are f^−1(0) and Γf:

Definition 8 (characteristic function) Let f be a function and let R ⊆ N^k be a predicate. If∗ R = f^−1(0), then f is referred to as a characteristic function for R. We denote by χR any characteristic function for R.
Amongst all the characteristic functions for some predicate R there is, for each 1 ≤ c ∈ N, a canonical representative χ^c_R satisfying

χ^c_R(~x) =def 0 if ~x ∈ R, and c if ~x ∉ R.

The function χ¹_R is referred to as the characteristic function for R. Thus, the assertion ‘f = χR’ should be interpreted as f^−1(0) = R, and in a similar vein the expression ‘χR ∈ F’ should be interpreted as ∃f∈F (R = f^−1(0)). It is also clear that f(~x) = 0 ⇔ ~x ∈ R is equivalent to stating that f is a characteristic function for R.

∗ For a function f and A ⊆ N, as usual f^−1(A) =def {~x ∈ N^k | f(~x) ∈ A}, and f^−1(n) =def f^−1({n}). The set f^−1(A) is also called the preimage of A under f.

Definition 9 (graph of f) The graph Γf of f is defined by:

Γf =def {(~x, y) ∈ N^(k+1) | f(~x) = y}.

Moreover, when F is some set of functions we define ΓF =def {Γf | f ∈ F}.

Definition 10 (induced relational class) Let X be a set of functions. The induced relational class of X, denoted X⋆, is defined by:

X⋆ =def {f^−1(0) | f ∈ X}.

The set X⋆ is also called the relations in/of X, or the problems in/of X. Thus R ∈ F⋆ ⇔ ∃f∈F (~x ∈ R ⇔ f(~x) = 0), conforming with Grzegorczyk’s original definition of ‘the relations of a given class’ [Grz53, pp. 6–7]. We just remark here that Grzegorczyk defined the classes E⋆ⁿ as a closure of certain initial relations under certain operations on relations [Grz53, p. 36], and then proceeded to prove that his sets of predicates E⋆ⁿ corresponded to his ‘relations of Eⁿ’. In the literature, though, the usual practice has become to define F⋆ as we have done here. By examining the definitions, it is trivial that: Γf ∈ F⋆ ⇔ χΓf ∈ F. Hence there is no problem with overloading notation and simply writing Γf for any characteristic function for Γf, since the above means that Γf ∈ F⋆ ⇔ Γf ∈ F, where Γf represents the predicate to the left, and the function to the right.
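To make definitions 8–10 concrete: the sketch below (with hypothetical names of our own choosing, here for unary predicates) builds the canonical representatives χ^c_R for a sample predicate, and shows that a quite different function serves equally well as a characteristic function, since only the preimage of 0 matters:

```python
# A predicate R belongs to the induced relational class of a set of
# functions exactly when SOME member f satisfies f(x) = 0 <=> x in R.

def chi_c(R, c):
    """The canonical representative chi^c_R of definition 8:
    0 on R, the constant c off R (here R is given as a unary test)."""
    def chi(x):
        return 0 if R(x) else c
    return chi

is_even = lambda x: x % 2 == 0
chi1 = chi_c(is_even, 1)   # *the* characteristic function of the evens

assert chi1(4) == 0 and chi1(7) == 1

# Any f with f^{-1}(0) = R counts as *a* characteristic function for R;
# e.g. f(x) = x mod 2 vanishes exactly on the evens, so f = chi_R too.
f = lambda x: x % 2
assert all((f(x) == 0) == is_even(x) for x in range(100))
```

The last assertion is the point of the ‘f = χR’ convention: the two functions differ as functions, yet determine the same predicate via the preimage of 0.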
For most classes the various variants are equivalent: demanding that the characteristic function belongs to F, or demanding e.g. χR^−1(1) = R. However, for some of the first idc.’s we shall study here, this distinction is significant.

Definition 11 Let X be a set of functions. Define X• =def {R ⊆ N^k | χ¹_R ∈ X}. That is, X• consists of those predicates for which the characteristic function belongs to X. Since the characteristic function clearly is a characteristic function, we have that X• ⊆ X⋆ trivially.

Convention 2 It will also be convenient to adopt the convention that F ⊆⋆ G abbreviates F⋆ ⊆ G⋆, so that the expression ‘F ⊆⋆ G = H’ abbreviates ‘F⋆ ⊆ G⋆ and G = H’. That is, a ‘starred’ relation between sets of functions indicates that the relation applies to the induced relational classes.

Minimal IDC.’s and Relativised Detour Degrees

Minimal IDC.’s

Do the difficult things while they are easy and do the great things while they are small. A journey of a thousand miles must begin with a single step. – Lao Tzu

2.1 Introduction

This chapter contains investigations into the computational strength of (very) small inductively defined classes of functions F. By computational strength we here mean: which subsets of N^k have a characteristic function in F? Before embarking on our charting and discovery of some minimal idc.’s, we will briefly discuss the term class in a historical (recursion-theoretic) context, and specify what shall be understood by it during the ensuing exposition. By this we also hope to motivate why and how we have determined which classes to study and include in this exposition, and which we will omit.

2.1.1 The MINIMAL in Minimal idc.’s.

Sets are partially ordered by set-inclusion ‘⊆’. When ‘class’ is interpreted as synonymous with ‘set’, and the word ‘least’ is interpreted with respect to inclusion, the minimal class is of course [∅; ∅] = ∅.
However, the term class sometimes carries with it connotations of something more than just any set, e.g. a set A which may be specified through a common natural property of its elements∗. One rarely speaks of the empty class. Since our concern is with idc.’s, we also feel that the empty set does fall outside of our scope – we want some initial functions (viz. X ≠ ∅) in our minimal idc. Furthermore, we want something ‘to happen’ when we inductively close it: we want some operators (viz. op ≠ ∅) which can be applied to the members of X. And, preferably, we want something more than just X to be generated by this closure (viz. X ⊊ op(X)). Thus, we are left with the somewhat ontologically flavoured task of choosing natural initial functions and operations. A discussion of which criteria should be met in order to qualify as a natural initial function or a natural schema could develop into a rather lengthy affair, and we shall have to turn to the literature for guidance. Secondly, it is our hope that clarifying our view on what the essence of idc.-theory is, and by the same token making explicit the end which we hope these investigations will facilitate, will enable the reader to appreciate the choices being made.

∗ Other times – as in set theory – classes are collections too large to be sets, and in some contexts a class is just a set.

2.1.2 Why idc.’s?

The field of idc.-theory can largely be traced back to two texts from the early fifties. They are Rózsa Péter’s Recursive Functions [Pet67] from 1951, and Andrzej Grzegorczyk’s Some classes of recursive functions [Grz53] from 1953. Recursive Functions may be regarded as the first handbook on the subject, and maintains its status as one of the definitive introductions to the field. It certainly provides firm ground from which to ponder questions of ‘naturalness’.
The book’s subject is much broader than this dissertation’s and does not deal with idc.’s qua idc.’s, but in particular the first few chapters prove an invaluable source for identifying candidates for natural basic functions. Rose’s book Subrecursion – Functions and hierarchies [Ros84] from 1984 must also be considered an influential text in this field, and the reader could rightfully claim that the choices of natural functions and schemata are both influenced by and an expression of this author’s veneration of these three fountainheads. The sheer number of publications in this field more than indicate that the subject is interesting in its own right, and starting with the famous theorem of Ritchie that linspace = E?2 more and more evidence has been mounted to support the claim that in many ways sub-recursion theory and complexity-theory are two faces of the same coin. Complexity-theory analyzes algorithms. Arguably mathematicians and computer scientists have yet to define and agree upon exactly which mathematical objects it is that corresponds to algorithms. It may be the case that one, unifying view of what an algorithm is will never synthesize, and this may be due to the extensive informal use of the notion of an algorithm employed in metamathematical jargon: we would want an ‘algorithm’ to determine whether or not a given object is an algorithm or not. In fact, for those formalisations of the concept which already exist, even equivalence of algorithms is elusive, as e.g. both the extensional and the syntactical (within a given language for writing algorithms) measure, fail to capture our naive understanding of what it means that two algorithms are equivalent. This thesis does not in any way try to address these fundamental, technical and philosophical issues. However, most idc.’s, and all we shall encounter here, do have a natural algorithmic interpretation. 
The definition of a schema op usually contains an algorithm for how to compute say σ(~g )(x), given algorithms for computing the gi ’s. Hence, when f is in an idc. F, we might think of f as given by an algorithm for computing it via the primitive instructions specified in X , and the more complicated algorithms written by applying to these the functionals of op in various ways. Studying idc.’s is thus also implicitly a study of algorithms, and this observation alone justifies time spent on the subject in this authors opinion. This view of idc.’s also suggests that natural functions and functionals are those which can 13 2.1. INTRODUCTION immediately be seen to be computable in the sense that they can be calculated by some deterministic algorithm. This does not mean that we exclude the possibility of interesting research and insights derived from the study of idc.’s where some members are not computable, but rather that our main interest in idc.’s stem from their connection to algorithms and their complexity. The selection presented in the following sections can thus be summarised as: those of the standard functions and schemata most frequently encountered in the literature, which the author subjectively considers as evidently computable and which in addition are argument-bounded (functions) or argument-bounding (functionals). Sometimes bounded versions of non-argument-bounded functions are considered. Now – without further ado – let us bring on the first candidates. 2.1.3 The case for projections, constants and composition: The advocation for projections, constants and composition as natural initial functions and schemata will be led by the explicit definitions. This concept is best illustrated by an example: It is evident from the early literature that once an author has access to some functions, say g(x, y) and h(x, y, z), he regards it as matter-of-factly that he may also access say def f (x, y) = g(y, h(x, x, 17)) . 
(ED) The formula (ED) above is an explicit definition of f . More specifically, it involves the composing of functions and certain manipulations of the arguments often commonly referred to as explicit transformations. These explicit transformations [. . . ] are (i) interchange of variables, (ii) identification of variables, and (iii) substitution of a constant for a variable. – [Ros84, p. 125] Numerous other authors attribute the same meaning to explicit transformations (Smullyan [Smu61], Esbelin & More [Esb94], Clote [Clo96]), and it is rather uncontroversial to claim that in sub-recursion theory, Rose’s definition as cited above is the canonical interpretation of explicit transformation. Also the early pioneers (Péter, Grzegorczyk) uses them, though without explicit mention. Since this author also regards explicit definitions as natural – they enable us to no-more-no-less than to execute computations which we already know how to perform, in arbitrary order and on choice input – we do want to include them as our natural givens. The initially vague idea of explicit definitions has been gradually formalised in the literature, and we repeat this exercise in an idc.-setting here. That is, we want to characterize the explicit transformations through suitable basic functions and schemata. 14 CHAPTER 2. MINIMAL IDC.’S Definition 12 (projection- and constant-functions) I is the set of all def projections Ini (~x) = xi . The identity function I11 is also denoted simply id , and def N is the set of all constants c(x) = c for all c ∈ N. These functions are suitable for e.g. interchanging variables or a substitution for a constant, but we lack a method for the situation where e.g. g should carry out further computations on say h(x), indeed even the projections must be substituted into – naive composition of functions – in order to get the job done. The next step is to define∗ the schema of composition: Definition 13 (composition) The schema∗∗ comp is called composition. 
Formally comp is a family of functionals

comp =def { ∘k_ℓ | k, ℓ ∈ N } (†)

where the arity of the operator ∘k_ℓ is ℓ + 1. The functional ∘k_ℓ accepts as arguments one function h of arity ℓ, and ℓ functions ~g, each of arity k. The function ∘k_ℓ(h, ~g), for which we adopt the infix notation h ∘k_ℓ ~g, is defined by the following formulae:

ar(h ∘k_ℓ ~g) =def k

(h ∘k_ℓ ~g)(n1, . . . , nk) =def h(g1(n1, . . . , nk), . . . , gℓ(n1, . . . , nk)) for all (n1, . . . , nk) ∈ N^k.

We note that ∘ is an associative operation on functions∗∗∗, and when f : N → N

∗ We shall gradually become less formal when introducing new schemata and functions. This definition of composition serves as an example of the kind of painstaking formalism which we later suppress.

∗∗ Schema vs. Operator vs. Functional: The following entry was taken from http://dictionary.reference.com/browse/schema on March 5th 2009:

sche·ma [skee-muh] noun, plural sche·ma·ta [skee-muh-tuh or, sometimes, skee-mah-tuh, ski-], sche·mas. 1. a diagram, plan, or scheme. 2. an underlying organizational pattern or structure; conceptual framework. 3. (in Kantian epistemology) a concept, similar to a universal but limited to phenomenal knowledge, by which an object of knowledge or an idea of pure reason may be apprehended. Origin: 1790–1800; < Gk schêma form

This excerpt indicates that the term schema is the one which to the greatest extent reflects the intended concept we are trying to define. Strictly speaking an operator or a functional is (usually) a map A : X → B where X is a set of functions (or functionals). The actual schema is thus the template for defining the functionals – like the one given in this definition for the functionals ∘k_ℓ – while comp is the collection of functionals identified by (†) and the paragraph succeeding (†). In idc. theory, we thus identify the set of functionals comp with the method given for determining whether a given operator belongs in comp or not, viz.
the definition in which the reference to this footnote appears. As lingual subtleties are not the subject at hand, we conclude this discussion here, noting in passing that the schema comp is defined by a schema for defining functionals.

∗∗∗ I.e. (f ∘ ~g) ∘ ~h = f ∘ (g1 ∘ ~h, . . . , gk ∘ ~h).

the notation f^(m) denotes 'f composed with itself m times'. Formally:

f^(0)(x) =def x, and f^(m)(x) =def (f ∘ f^(m−1))(x) for m > 0.

Now, in an idc. setting, assume that g, h ∈ F =def [X ∪ I ∪ N ; op ∪ {comp}]. That is: (1) projections and constants are amongst the primitive functions, (2) all composition-operators are available in the idc. F, and (3) h and g are certified members of F. Then

c₂ =def 17 ∘²₁ (I²₁) ∈ F and ar(c₂) = 2
f′ =def h ∘²₃ (I²₁, I²₁, c₂) ∈ F and ar(f′) = 2
f′′ =def g ∘²₂ (I²₂, f′) ∈ F and ar(f′′) = 2

It is now straightforward to verify that all three steps above are valid with respect to the rules governing idc.'s, and that we can, formally and within F, perform the explicit definition of f in terms of g and h represented by the equation (ED) from page 13: f′′ = f. The reader will also verify upon inspection that all arity-constraints are maintained above. In a more involved argument, however, the k and ℓ in ∘k_ℓ will tend to cloud rather than clarify the argument, thus:

Convention 3 In an expression like h ∘ ~g, by convention ar(h) = |~g| = ℓ and ar(gi) = k by default. That is, explicit mention of k and ℓ is suppressed henceforth, and we simply write h ∘ ~g for h ∘k_ℓ ~g unless they are absolutely necessary.

The above discussion also contains the general idea behind the claim that: If the idc. F is closed under composition and contains projections and constants, then F is closed under explicit definitions. The converse implication is more subtle∗. However, if one would consider e.g. I³₂ as defined explicitly by f(x, y, z) = y, then it is easy to see that we also have: If the idc.
F is closed under explicit definitions, then F is closed under composition and contains the projections and constants. Hence:

Observation 1 Let F be an idc. Then F is closed under the functionals of comp and I ∪ N ⊆ F iff F is closed under explicit definitions. q.e.d.

Our first candidate for a natural minimal idc. is thus

T =def [I ∪ N ; comp].

∗ The subtlety arises with the word contains above. The assertion is false if this wording is taken to mean that the constants are included as basic functions, since it is possible that instead zero and successor are included, while the constants are derived functions. If contains is interpreted as in the closure, it is evidently true. That F is closed under composition is clearly true; this wording does not suggest that comp is a primitive schema of F.

In view of observation 1 and the preceding discussion, this idc. contains a bare minimum. Now, if we let N^ω designate the set of constant functions of all arities, then we also have:

Observation 2 T = [I ∪ N^ω ; comp] = I ∪ N^ω. q.e.d.

This observation indicates that in fact our first attempt at exhibiting a minimal idc. in a sense failed: the inductive closure of our initial functions is essentially the initial functions themselves, and we said before that we wanted something new, something more than just X, to be generated by the closure. Secondly, since sub-recursion-theory is mainly concerned with problems regarding the computational strength of F – which predicates belong to F? ? – the following observation further illustrates T's insipidness:

Observation 3 For R ⊆ N^k we have

R ∈ T? ⇔ R = N^k ∨ R = ∅ ∨ ∃i (R = N^{i−1} × {0} × N^{k−i}).

Proof: The constant function c_k (of arity k) vanishes on N^k for c = 0, or on ∅ for c > 0; a projection I^k_i vanishes iff x_i = 0, and so I^k_i is a characteristic function for N^{i−1} × {0} × N^{k−i}. q.e.d.
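Before moving on, the three-step derivation of f′′ in section 2.1.3 – obtaining f(x, y) = g(y, h(x, x, 17)) from g and h via projections, constants and composition – can be replayed mechanically. The following Python sketch is ours, not the thesis's formalism; the names proj, const and comp, and the sample choices of g and h, are illustrative assumptions:

```python
# A sketch (our encoding) of the initial functions I, N and the schema comp.
def proj(k, i):
    """The projection I^k_i: (x1, ..., xk) -> xi."""
    return lambda *xs: xs[i - 1]

def const(c):
    """The unary constant function x -> c."""
    return lambda x: c

def comp(h, gs):
    """The functional h ∘ (g1, ..., gl): feed the arguments through the gi's, then h."""
    return lambda *xs: h(*(g(*xs) for g in gs))

# Sample 'certified members' g, h of some idc. F (arbitrary choices for illustration):
g = lambda x, y: x + 2 * y
h = lambda x, y, z: x * y + z

I21, I22 = proj(2, 1), proj(2, 2)
c2 = comp(const(17), [I21])        # c2 = 17 ∘ (I^2_1): the binary constant 17
f1 = comp(h, [I21, I21, c2])       # f'(x, y)  = h(x, x, 17)
f2 = comp(g, [I22, f1])            # f''(x, y) = g(y, h(x, x, 17)) -- the (ED) function

assert f2(3, 4) == g(4, h(3, 3, 17)) == 56
```

Note that each step applies one composition-operator to already certified functions, exactly as the arity bookkeeping in the text prescribes.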
Remark 1 The intuition which is strengthened by the proof of observation 3 above is that the functions of I ∪ N are not computing anything. They are 'ignorant' of their arguments – that there are any characteristic functions in I ∪ N at all is 'by accident' – they have no 'brains'. This brain-metaphor is adopted from M. Kutylowski:

The next result shows that what differentiates [the second and first Grzegorczyk class] from [the zeroth Grzegorczyk class] is not their 'brain' but their 'body', that is the possibility of using large numbers inaccessible in [the zeroth Grzegorczyk class]. – Kutylowski [Kut87, p. 196]

In this author's view the metaphor above is both fitting and useful, and we will resort to it frequently in both this chapter and the next during less formal passages, replacing 'body' by 'muscle'. A typical all-muscle-no-brains function would be the exponential function. On the other end of the scale – as we shall see – an excellent representative of the no-muscle-all-brains crew is the truncated difference function ∸ (see section 2.2.4). Indulging this metaphor even further, I ∪ N is a right-out sorry-looking group of push-overs: no brains, no muscle, but easily exploited to do your book-keeping if you tell them explicitly what you need done.

In other words, T is not really a proper inductively defined class, in the sense that one 'expects' X ⊊ [X ; op]. Nor does it compute anything interesting at all – it contains characteristic functions for near-trivial∗ predicates only. We define T =def T?, and dub this set the near-trivial predicates. Furthermore, this result is not robust with respect to our definition of X?, since∗∗ T• = { ∅, N^k | k ∈ N } = { ∅, N, N², . . . }. So T• corresponds to the trivial predicates in the usual sense of the term. That this holds follows from the easy fact that the only 0–1-valued functions in T are the 0_k and 1_k for k ≥ 1.
In view of our discussion so far we feel justified in forwarding the following claims: Observation 4 (1) Since the functions in I ∪ N and the schema comp provide an idc. exactly with explicit definitions, they are natural primitives in any idc. (2) From a computability point-of-view T is spectacularly uninteresting. It therefore constitutes a natural minimum in the world of sub-recursive classes. Convention 4 (classes and idc.’s) Henceforth, we reserve the designation class of functions, or just class, exclusively for sets F of arithmetic functions, which are (C1) closed under the functionals of comp and (C2) contain I ∪ N . Thus, in order to make the class in idc. conform to this convention the term idc. is reserved for inductively defined classes [I ∪ N ∪ X ; op ∪ comp]. By this token we also omit specific mention of I, N and comp in the sequel. Hence [X ; ] abbreviates [I ∪ N ∪ X ; comp] , and likewise [X ; op] is short for [X ∪ I ∪ N ; op ∪ comp]. Since one can use them interchangeably, we also do not distinguish between, N and N ω . Because we are concerned with minimal idc.’s, in what follows X will consist of a very few functions fi and op will be the union of a few schemata {opj }j∈J , and rather than writing the more cumbersome [{f1 , . . . fn } ; op1 ∪ · · · ∪ opm ], we simply write [f1 , . . . , fn ; op1 , . . . , opm ]. To further confuse the reader we even abbreviate the abbreviation [X ; op] as X OP when convenient (with the hope that after the initial struggle to untangle the notational knots, this convention will make the text more transparent rather than obscuring it). Henceforth the symbol F is reserved exclusively for idc.’s, while X is used for an arbitrary set of functions. In most contexts an X represents the initial functions to some idc., but this is not always the case. Remark 2 (function vs. definition) Let F = [X ; op] . 
Up until now, we have avoided the distinction between the function f ∈ F and a definition of f from F, that is: we have not made a distinction between the set of functions F, and the idc. F. We do want to blur this distinction as much as possible, however a few comments and a small discussion are overdue at this point.

∗ Traditionally, the trivial predicates are ∅ and N (possibly also including N^k for k ∈ N).
∗∗ Recall that X• consists of those predicates R for which the characteristic function for R is in X (see Definition 11).

To clarify what is the issue at hand, consider the proposition 'f ∈ F', meaning that 'f belongs to F'. Implicitly we are thus claiming that there is some way of obtaining f from the initial functions of X, by a finite sequence of applications of the functionals of op. In other words we are claiming that there exists some finite-relative-to-[X ; op] definition of f. This intuition can easily be formalised in terms of e.g. term-rewriting∗ (see e.g. Beckmann & Weiermann [B&W96], Oitavem [Oit02], Barra [B04]), which gives precise algorithmic content to an idc., or it could be formalised through syntactic objects in a suitable pseudo-programming language. The canonical way of such a formalisation consists in considering some auxiliary set X̄ =def { f̄ | f ∈ X } of names for the functions, a set ōp =def { σ̄ | σ ∈ op } of names for the functionals, and then defining a set ∆[X ; op] of definitions inductively so that the map Φ : ∆[X ; op] → F is surjective and satisfies equations of the form

Φ(σ̄(f̄1, . . . , f̄ℓ))(~x) = σ(f1, . . . , fℓ)(~x)

for all σ̄ and ~f̄. Given such a formalisation, there will be many obvious syntactic measures of complexity on the set of definitions. E.g., for δ ∈ ∆[X ; op], define

µ(δ) =def 'the number of symbols in δ'.

Of course, there will be a many-to-one relationship between definitions and the functions they represent, and in particular we will not have that Φ(δ1) = Φ(δ2) implies µ(δ1) = µ(δ2).
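The many-to-one relationship between definitions and functions is easy to exhibit concretely. A minimal Python sketch, under our own encoding (definitions as nested tuples of names, Φ as an evaluator, and µ counting tree nodes rather than literal symbols):

```python
# Definitions as syntax trees: ('proj', k, i) and ('const', c) are names of initial
# functions; ('comp', h, (g1, ..., gl)) is the name of an application of composition.
def Phi(delta):
    """The (surjective) map Φ : ∆ -> F, sending a definition to the function it defines."""
    tag = delta[0]
    if tag == 'proj':
        _, k, i = delta
        return lambda *xs: xs[i - 1]
    if tag == 'const':
        return lambda *xs: delta[1]
    _, h, gs = delta                      # tag == 'comp'
    hf, gfs = Phi(h), [Phi(g) for g in gs]
    return lambda *xs: hf(*(gf(*xs) for gf in gfs))

def mu(delta):
    """A syntactic measure: here, the number of nodes in the definition tree."""
    if delta[0] == 'comp':
        return 1 + mu(delta[1]) + sum(mu(g) for g in delta[2])
    return 1

# Two distinct definitions of the identity on N: I^1_1, and I^1_1 ∘ (I^1_1).
d1 = ('proj', 1, 1)
d2 = ('comp', ('proj', 1, 1), (('proj', 1, 1),))

assert all(Phi(d1)(n) == Phi(d2)(n) for n in range(100))  # Φ(δ1) = Φ(δ2) on a sample...
assert mu(d1) == 1 and mu(d2) == 3                        # ...yet µ(δ1) ≠ µ(δ2)
```

Iterating the composition with a projection yields infinitely many definitions of the identity, one of each size above the minimum.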
In fact – because of the projections – it will, for any ‘natural measure µ’ and f ∈ F, be the case that def N (f ) = µ(Φ−1 (f )) = {n, n + 1, n + 2, . . .} for some n. On the other hand, for δ ∈ ∆ and a syntactic measure µ in ∆, even def though determining the number nf,∆,µ = min N (f ) is undecidable in general, it is certainly well-defined, and so most measures µ : ∆ → N (or ∆ → α for an ordinal α) induce a stratification of F into the hierarchy F 0 ⊆ F 1 ⊆ F 2 ⊆ · · ·, and a great number of sub-recursive conundrums are related to whether or not such hierarchies are strict, collapse, or satisfy other specific properties. The level of precision above is sufficient for our needs, and we will not worry to much about the particular formalisation chosen in what follows. However, in an expression like ‘f ≡ h ◦ g’ we mean something like: f could be given a formal definition in some ∆ as e.g. ◦(δh , δg ) where δh and δg would be formal definitions of h and g respectively. ∗ See e.g. [Terese] for an excellent introduction to, and overview of, this field. 19 2.1. INTRODUCTION 2.1.4 Intermission – on the term class. Of course, the term class has many uses in mathematics, and unlike set it takes on different meanings in different contexts. In Grzegorczyk [Grz53], class is used synonymously with set of arithmetic functions. In Gandy [Gan84] we find the following: By a class C of number-theoretic functions we shall always mean a collection which contains the successor function, the case function C (Cxyuv = x if u = v, (Cxyuv = y if u 6= v) and which is closed under explicit definition. – Gandy [Gan84] In Ritchie’s Classes of Predictably Computable Functions [. . . ] the class F of functions computable by [Turing machines viewed as] finite automata [. . . ] is taken as a basic class since these functions are computed as “as swiftly as possible;”as soon as the input has been scanned, the output is printed. – Ritchie [Rit63, p. 
1] He then notes that F is closed under explicit definitions (explicit transformations and composition) and contains the successor function. Clearly, if a machine description of a set of functions is desired, the omission of the successor is quite unnatural, and would certainly involve rather contrived constructions. It is possible to ask oneself if the combined academic weight of the various authors cited above have contributed to maintaining the self-evident inclusion of the successor function in most studied classes. But, if focus is kept on the relational class, and if one leaves the domain of computing by machines, there will be no reason to carry this intuition over. Even if the successor is to be considered as indigenous to complexity-theory by Ritchie’s reasoning, this does not imply prima facie nativity to sub-recursive settings. As noted, [I ∪ N ; comp] is ‘provably trivial’. There are (at least) three ways of making our idc. more exiting: adding to X or adding to op, or adding to both. This chapter of the thesis is devoted to investigate, develop, and chart these possibilities systematically. Such an endeavour necessarily involves choosing which functions and schemata not to include. This author’s ulterior motive of finding idc.’s suitable for the work presented in chapter 3 – idc.’s containing argument-bounded functions only – the selection of functions and schemata has certainly been guided by this criterion. Anticipating the chapter 3, and to further motivate our selection, consider the following quote: [. . . ] interesting things tend to happen when successor-like functions are removed from a standard computability-theoretic framework. 20 CHAPTER 2. MINIMAL IDC.’S – Kristiansen [Kri08] The selection which is presented in this dissertation is thus intended to comprise the most natural and commonly encountered argument-bounded arithmetic functions. How we have chosen which schemata to include is discussed in section 2.3. 
For non-argument-bounded idc.'s, the number of researchers with major contributions is so great that undertaking to reproduce a list of their names is unfeasible other than in a purely historical text. Rózsa Péter's Recursive Functions [Pet67] is by many considered the very first text-book on the subject, and she is frequently referred to as the 'mother of recursive function-theory' (Wikipedia attributes to her the coining of the term primitive recursive). A modest selection of important papers in the field should include M. Robinson [RRo47, RRo55], J. Robinson [JRo50, JRo55], Grzegorczyk [Grz53], Gladstone [Gla67, Gla71], Harrow [Har73, Har75, Har78], Kutylowski [Kut87], Esbelin [Esb94], Esbelin & More [E&M98], and more recently Severin [Sev08]. Of course Rogers [Rog67], Rose [Ros84], Odifreddi [Odi99, ch. 7–8] and Clote [Clo96] provide excellent and interesting overviews of the field from different vantage points in time.

In the next section, before embarking on our journey through the world of minimal idc.'s, we first present some general observations regarding idc.'s, and formally define a few auxiliary notions.

2.1.5 Logical functions and 1st-order logic.

Logical functions.

Definition 14 (Logical functions) A (logical) and-function is any function χ∧ : N² → N satisfying χ∧(x, y) = 0 ⇔ x = 0 = y. A (logical) or-function is any function χ∨ : N² → N satisfying χ∨(x, y) > 0 ⇔ x > 0 < y. A (logical) negation-function is any function χ¬ : N → N satisfying χ¬(x) = 0 ⇔ x > 0.

In particular, if R, S ⊆ N^k, then χ∧ ∘ (χR, χS) = χ_{R∩S}, χ∨ ∘ (χR, χS) = χ_{R∪S} and χ¬ ∘ χR = χ_{Rᶜ} (the characteristic function of the complement). Clearly now, e.g.

f(~x, ~y) =def χ∧(χR(~x), χS(~y))

is a straightforward explicit definition of a characteristic function for the predicate T, defined by (~x, ~y) ∈ T ⇔def ~x ∈ R ∧ ~y ∈ S, and similar equivalences hold w.r.t. '∨' and '¬'.
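Definition 14's conditions are easily machine-checked; as we shall see in section 2.2.1, max is an and-function and min an or-function. A small Python sketch (under the convention, used throughout, that χR(~x) = 0 signals ~x ∈ R; the sample predicates R and S are our own choices):

```python
# max is an and-function, min an or-function, and x -> (0 if x > 0 else 1) a
# negation-function, in the sense of Definition 14 (0 signals membership).
chi_and = max
chi_or = min
chi_neg = lambda x: 0 if x > 0 else 1

# Check the defining conditions on an initial segment of N and N^2.
for x in range(10):
    for y in range(10):
        assert (chi_and(x, y) == 0) == (x == 0 == y)
        assert (chi_or(x, y) > 0) == (x > 0 < y)
    assert (chi_neg(x) == 0) == (x > 0)

# Hence chi_and ∘ (χR, χS) is a characteristic function for R ∩ S, etc.:
chi_R = lambda n: n % 2         # R = the even numbers
chi_S = lambda n: n % 3         # S = the multiples of 3
for n in range(60):
    assert (chi_and(chi_R(n), chi_S(n)) == 0) == (n % 6 == 0)                 # R ∩ S
    assert (chi_or(chi_R(n), chi_S(n)) == 0) == (n % 2 == 0 or n % 3 == 0)    # R ∪ S
    assert (chi_neg(chi_R(n)) == 0) == (n % 2 == 1)                           # Rᶜ
```

Any functions meeting the three conditions would do; the point is only that closure of F under composition transfers the Boolean structure to F?.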
Let F be an idc., and consider the 1st-order language (without equality)

LF =def { Rf, n̄ | f ∈ F, n ∈ N },

and the LF-structure NF =def (N, F?) where Rf^{NF} = f^{-1}(0) and n̄^{NF} = n; viz. NF |= Rf(n̄, m̄) iff f(n, m) = 0. Now, any LF-formula φ, paired with a list z1, . . . , zℓ of formal LF-variables such that all free variables of φ occur in ~z, defines a unique predicate R^φ_{~z} ⊆ N^ℓ defined by∗:

~n ∈ R^φ_{~z} ⇔def NF |= φ[~z := ~n].

This rather tedious∗∗ discussion enables us to propose the following definition:

Definition 15 (logical closures) Let LF be as above. We say that F? is closed under (logical) and if R^φ_{~z} ∈ F? for any φ and ~z, where φ is a quantifier-free (q.f.) conjunction of atomic formulae. Similarly F? is closed under (logical) or when φ is a q.f. disjunction of atoms, and F? is closed under negation when φ is an arbitrary literal. We say that F? is Boolean or closed under propositional logic when the analogous assertion holds for any q.f. φ.

To better appreciate why this definition warrants the extensive preparations in the preceding paragraph, consider the difference between e.g.∗∗∗ φ1 ≡ Rf(x1, x2) ∨ Rg(x1, x2) and φ2 ≡ Rf(x1, x2) ∨ Rg(x3, x4). Now R^φ1_{x1,x2} is simply f^{-1}(0) ∪ g^{-1}(0), while e.g. R^φ1_{x1,x2,x3} = (f^{-1}(0) ∪ g^{-1}(0)) × N. Next, with ~x = x1, . . . , x4, we get R^φ2_{~x} = (f^{-1}(0) × N²) ∪ (N² × g^{-1}(0)), while adding a variable x5 yields R^φ2_{~x,x5} = (f^{-1}(0) × N³) ∪ (N² × g^{-1}(0) × N). If we consider ∧-formulae the picture is equally messy. The important point is that for our idc.'s – which we already know accommodate explicit definitions – we clearly have:

Proposition 5 (logical closure) (i) If χ∧ ∈ F, then F? is closed under logical 'and'; (ii) If χ∨ ∈ F, then F? is closed under logical 'or'; (iii) If χ¬ ∈ F, then F? is closed under logical negation. (iv) If χ∧, χ∨, χ¬ ∈ F, then F? is Boolean. q.e.d.
As expected, there is a close relationship between logical closures and settheoretic closures – it follows directly from closure under explicit definitions and proposition 5 that: Proposition 6 (set-theoretic closure) (i) F? is closed under logical ‘and’ iff F? is closed under (finite) intersections. (ii) F? is closed under logical ‘or’ iff F? is closed under (finite) unions. (iii) F? is closed under logical negation iff F? is closed under complements. ∗ φ[x := t] denotes usual substitution of the term t for the variable x. challenge the reader to come up with an as stringent yet shorter definition. ∗∗∗ The symbol ≡ is used in various contexts and is used to emphasise syntactical identity over extensional. Here we would have Rf (x1 , x2 ) ∨ Rg (x1 , x2 ) = Rg (x1 , x2 ) ∨ Rf (x1 , x2 ) but not Rf (x1 , x2 ) ∨ Rg (x1 , x2 ) ≡ Rg (x1 , x2 ) ∨ Rf (x1 , x2 ) ∗∗ I 22 CHAPTER 2. MINIMAL IDC.’S q.e.d. Definition 16 For a set of relations R, we let R∩ , R∪ and Rc denote the closure∗ of R under finite intersections, finite unions and complements (relative to Nk for R 3 R ⊆ Nk ) respectively. For an idc. F, we let F?∧ , F?∨ and F?¬ denote the respective logical closures∗∗ , and F?B denotes the Boolean closure, viz. closure under propositional logic. 1st -order logic. Consider a 1st -order language L (with or without equality), and let U be an def L-structure with domain U , and form LU = L ∪ {u | u ∈ U } where each u is U a fresh constant. Then the LU -structure UU , where uU = u, is an elementary extension of U viewed as L-structures, and the theory of UU is a conservative extension of the theory of U. Given an L-formula φ and a list of k variables ~x such that all the free variables FV(φ) of φ occur in ~x, we denote by R~φx the unique subset of U k such that def ~u ∈ R~φx ⇔ UU |= φ(~u) . Given a language L, there are several standard fragments Frag(L) one may consider (e.g. the quantifier-free L-formulae, denoted Lqf ). 
Hence, for fixed Frag(L) and U we may define a set of predicates∗∗∗ (Frag(L))? by: def (Frag(L))? = R~φx | φ ∈ Frag(L) ∧ FV(φ) ⊆ ~x . There is of course no loss of generality in assuming that ~x = x1 , . . . , xk for some sufficiently large k. Many of our results will concern some 1st -order language L with a canonical interpretation in some structure with domain N, and be of the type: The predicates computable in the idc. F are exactly the predicates definable by a formula φ ∈ Frag(L), which we write simply as F? = (Frag(L))? . An example of ‘Proof by induction on the buildup of f ’. The next proposition is rather obvious. We include it for completeness, and because its proof serves as an example of a method of proof frequently used without further mention during the rest of the text: proof by induction on the buildup of f ∈ F. This is a standard form for proving results of the type All f in F = [I ∪ N ∪ X ; comp ∪ op] satisfy property P. def ∗ In the standard set-theoretic sense. the sense of definition 15. ∗∗∗ Keeping faithful to our convention that a ‘starred symbol’ represents a set of nonconstructively defined predicates. By non-constructively we here mean that in contrast to defining R as the predicates obtainable by closing some set of initial predicates under, say finite unions, when giving R as L? or F? , we see that ∗∗ In def def R ∈ R ⇔ ∃f ∈F (f −1 (0) = R) or R ∈ R ⇔ ∃φ∈L (Rφ = R) , so that the unbounded existential quantifier in the definition makes the set R highly nonconstructive. 23 2.1. INTRODUCTION A proof by induction on the buildup is complete once both the induction start – All f in I ∪ N ∪ X satisfies property P – and the induction step – If f~ satisfies property P and σ ∈ comp ∪ op, then σ(f~) satisfies property P – have been established. Of course, though we do not usually write down I ∪ N nor comp, it is absolutely crucial to not forget them in such a proof. 
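The shape of such a proof can itself be rendered as structural recursion: the induction start and the induction step become the base and recursive cases of a function over definitions. As an illustration, and anticipating Proposition 7, the following Python sketch (a toy encoding of ours, over the a.b. initial functions max and min, projections and constants, closed under composition) computes by induction on the buildup a witness c with f(~x) ≤ max(~x, c), and spot-checks the bound:

```python
# Toy definitions: ('proj', i), ('const', c), or (name, d1, ..., dl) for composition
# with an a.b. initial function named in FUNS. The encoding is ours, for illustration.
FUNS = {'max': max, 'min': min}

def ev(d):
    """Evaluate a definition to the function it denotes."""
    if d[0] == 'proj':
        return lambda *xs: xs[d[1] - 1]
    if d[0] == 'const':
        return lambda *xs: d[1]
    gs = [ev(g) for g in d[1:]]
    return lambda *xs: FUNS[d[0]](*(g(*xs) for g in gs))

def witness(d):
    """Induction on the buildup: a constant c with f(~x) <= max(~x, c).
    Induction start: projections get 0, the constant c gets c.
    Induction step: for h ∘ ~g, take the max of the sub-witnesses (cf. Prop. 7)."""
    if d[0] == 'proj':
        return 0
    if d[0] == 'const':
        return d[1]
    return max(witness(g) for g in d[1:])

# f(x, y) = max(min(x, 7), y) -- built from a.b. initial functions only.
d = ('max', ('min', ('proj', 1), ('const', 7)), ('proj', 2))
f, c = ev(d), witness(d)
assert all(f(x, y) <= max(x, y, c) for x in range(8) for y in range(8))
```

The two clauses of witness are exactly the two obligations of a proof by induction on the buildup; forgetting the projections or the constants here would make the recursion crash, just as forgetting I ∪ N or comp invalidates the proof.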
Henceforth, by induction on f ∈ F abbreviates by induction on the buildup of f ∈ F. Finally, of course this method of proof works equally well for proving theorems of the type ∀a∈A (P(a)) whenever A is some inductively defined set of objects. We shall encounter various inductively defined sets other than idc.'s, and it will be clear when this method of proof is applied. The following quote (italics mine) is the initial remark by Péter preceding her proof of the theorem (see section 2.3.1 (p. 52)) that the Ackermann-Péter function is total general recursive, yet not primitive recursive.

At the same time this will again afford a good example for what we have pointed out [. . . the] advantage of the delimitation of the concept of recursive function; for, since the recursive functions can be built up from some initial functions by some fixed kinds of definition, it is possible to prove assertions valid for all these functions in a way similar to that employed for the assertions on natural numbers, viz. by showing first of all that the assertion holds "at the beginning", i.e. for the initial functions and that it devolves step by step to the "later" functions, i.e. to those which are built up from the already defined functions in an admitted way. – Péter [Pet67, p. 108]

Proposition 7 If all f ∈ X are a.b., and if all σ ∈ op are argument-bounding, then all f ∈ [X ; op] are a.b.

Proof: The first thing to do is to establish that comp is argument-bounding. If h, ~g are a.b., then ~y =def ~g(~x) ≤ max(~x, max(~c)), where gi(~x) ≤ max(~x, ci). Since h(~y) ≤ max(~y, ch), we obtain (h ∘ ~g)(~x) ≤ max(~x, max(~c), ch), and this part of the proof is finished. We proceed by induction on the buildup of f ∈ F.

induction start: All f ∈ X are a.b. by assumption. We must prove that projections and constants are a.b. Clearly I^k_i is strictly a.b. with top-index i, and c is a.b. as witnessed by c(x) = c ≤ max(x, c).

induction step: Let ~f be a.b. functions. By assumption σ ∈ op is argument-bounding, meaning that σ(~f) is a.b.
as well. q.e.d.

2.2 Basic Functions

In this section we will study various idc.'s of the form

[X ; ] =def X∘ =def X^COMP,

viz. an idc. with composition as the only schema, and a very few initial functions apart from I ∪ N. These idc.'s are for the most part 'almost trivial', with the notable exception of the idc. D encountered in section 2.2.4. Before we introduce our prospective basic functions, and investigate exactly how much or how little they achieve on their own, we first introduce the X-trees, which provide a useful intuition about the 'anatomy' of an X∘-function.

X-trees.

Definition 17 (X-trees) Let X be a set of functions (assume w.l.o.g. X ∩ (I ∪ N) = ∅), and let { xi, c | i > 0, c ∈ N } be a set of formal variables and numerals. The set of X-trees T(X) is defined inductively by:

(B) { xi, c | i > 0, c ∈ N } ⊆ T(X);
(I) If f ∈ X is ℓ-ary and t1, . . . , tℓ ∈ T(X), then∗ ⟨f, ~t⟩ ∈ T(X).

Define the set Var(t) of variables of t as outlined below:

(V1) Var(xi) = {xi};
(V2) Var(c) = ∅;
(V3) Var(⟨f, t1, . . . , tℓ⟩) = ⋃_{i≤ℓ} Var(ti).

We now want to view the X-trees as functions. In order to do this, one could spend a lot of space on stringently formalising the relationship between the formal variables xi and the arguments of a function f(~x), and so on. We shall not carry out this construction, as it should suffice to describe the desired result of such an effort. Thus, given an X-tree t with Var(t) = {x2, x3, x5}, we consider the variables as ordered by their indices, and define for each k ≥ 5 = max { i | xi ∈ Var(t) } a function t^k : N^k → N by:

t^k(~x) = x_i, if t ≡ xi; c, if t ≡ c; and (f ∘ (t1^k, . . . , tℓ^k))(~x), if t ≡ ⟨f, ~t⟩.

Note how any t corresponds to one k-ary function for each sufficiently large k.

Lemma 8 (a substitution lemma) Assume Var(t) ⊆ {x1, . . . , xm} for some X-tree t (meaning m ≥ max { i | xi ∈ Var(t) }), and let s1, . . .
, s_m be X-trees satisfying, for each i, Var(s_i) ⊆ ~x. Then

(t^m ◦ ~s^k)(~x) = (t[~x := ~s])^k(~x) ,

where t[~x := ~s] denotes the result of simultaneously substituting each occurrence of x_i in t with s_i.

∗ Exactly what the 'f' appearing within the brackets 'is' is irrelevant for our purposes: it is a name for f.

Proof: By induction on the build-up* of t.

induction start: If t ≡ x_i, then t[~x := ~s] ≡ s_i, and clearly (t^m ◦ ~s^k)(~x) = I^m_i(~s^k(~x)) = s^k_i(~x) = (x_i[~x := ~s])^k(~x) = (t[~x := ~s])^k(~x). If t ≡ c the result is equally obvious.

induction step: Then t ≡ ⟨f, t_1, . . . , t_ℓ⟩ for some ℓ-ary f ∈ X and ~t ∈ T(X), and thus t[~x := ~s] ≡ ⟨f, t_1[~x := ~s], . . . , t_ℓ[~x := ~s]⟩. Hence

(⟨f, ~t⟩^m ◦ ~s^k)(~x) = f((t^m_1 ◦ ~s^k)(~x), . . . , (t^m_ℓ ◦ ~s^k)(~x))
  = f((t_1[~x := ~s])^k(~x), . . . , (t_ℓ[~x := ~s])^k(~x))   (by the ih)
  = ⟨f, t_1[~x := ~s], . . . , t_ℓ[~x := ~s]⟩^k(~x) = (t[~x := ~s])^k(~x) ,

which concludes the proof. q.e.d.

Corollary 9 {t^k | t ∈ T(X) ∧ k ≥ max{i | x_i ∈ Var(t)}} = [X ; ].

Proof: The proof of the ⊆-direction is contained in the proof of Lemma 8, since clearly ⟨f, ~t⟩^k = f ◦ (t^k_1, . . . , t^k_ℓ) ∈ X◦ when each t^k_i ∈ X◦. For the other direction we proceed by induction on f ∈ X◦.

induction start: If f ∈ I ∪ N, then t_f ≡ x_i for f = I^k_i, or t_f ≡ c for f = c^k, suffice. If f ∈ X, set t_f ≡ ⟨f, x_1, . . . , x_k⟩, so that t^k_f(~x) = f(x^k_1(~x), . . . , x^k_k(~x)) = f(~x).

induction step: Now f = h ◦ ~g. Let t_h and ~t be X-trees corresponding to h and ~g respectively, as guaranteed by the ih. Then, by lemma 8, t_f ≡ t_h[~x := ~t] is an X-tree satisfying t^k_f(~x) = (h ◦ ~g)(~x) = f(~x), which concludes the proof of the ⊇-direction. q.e.d.

The key point here is that any function in an idc. with no other schema than composition has a very simple tree-like structure. Conversely, any X-tree t and sufficiently large k corresponds (as a pair) to a unique function in X◦.
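The evaluation map t ↦ t^k and the substitution lemma can be sketched in a few lines of Python. The tuple encoding of trees and the names eval_tree and substitute are ours, not the text's; this is only an illustration of the lemma, under the convention that variables project onto argument positions.

```python
# An X-tree is a nested tuple: ("var", i), ("const", c), or
# (f, t1, ..., tl) for an l-ary Python function f playing the role of f in X.

def eval_tree(t, xs):
    """Evaluate the function t^k on the argument list xs (k = len(xs))."""
    if t[0] == "var":
        return xs[t[1] - 1]          # x_i projects onto the i-th argument
    if t[0] == "const":
        return t[1]                  # a numeral denotes a constant function
    f, *subtrees = t
    return f(*(eval_tree(s, xs) for s in subtrees))

def substitute(t, sigma):
    """t[x := s]: replace each variable x_i by the tree sigma[i]."""
    if t[0] == "var":
        return sigma[t[1]]
    if t[0] == "const":
        return t
    f, *subtrees = t
    return (f, *(substitute(s, sigma) for s in subtrees))

# The substitution lemma: evaluating t on the values of the s_i equals
# evaluating t[x := s] directly.
t = (max, ("var", 1), ("const", 3))          # t^1(x) = max(x, 3)
sigma = {1: (min, ("var", 1), ("var", 2))}   # s_1^2(x1, x2) = min(x1, x2)
xs = [7, 5]
lhs = eval_tree(t, [eval_tree(sigma[1], xs)])
rhs = eval_tree(substitute(t, sigma), xs)
assert lhs == rhs == 5
```

Composition-only classes thus reduce to tree evaluation, which is what corollary 9 expresses.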
In particular, one consequence of this result is that one may assume w.l.o.g. that if f = h ◦ ~g ∈ X◦, then h ∈ X.

Observation 10 Thus, in a proof by induction on f ∈ X◦, when considering the case f = h ◦ (~g) – the inductive step – one may assume w.l.o.g. that h ∈ X – that is, h is an initial function. This is sound because corollary 9 ensures that f(~x) equals t^k(~x) = ⟨h, ~t⟩^k(~x) for some t ∈ T(X).

In a sense, such a function's complexity is inherent in the tree's structure, and just as we expect, only the functions of X play a significant rôle.

∗ See footnote on p. 22.

2.2.1 The maximum and minimum functions.

Definition 18 (maximum and minimum) The maximum function max : N² → N is defined by:

max(x, y) = x , if x ≥ y
            y , if x < y .

The minimum function min : N² → N is defined by:

min(x, y) = x , if x ≤ y
            y , if x > y .

Define M = [I ∪ N ∪ {min, max} ; comp]. According to our notational convention we also write {max}◦ for [max ; ].

Observation 11 Both max and min are strictly a.b. While max does not have a top-index, min has two* top-indices.

So {max}◦ and {min}◦ are our first candidates for minimal idc.'s. The ability to choose the larger or the lesser of two arguments seems like a rather natural feature, and most authors do add them to their idc.'s whenever not redundant. It is a fact that in e.g. E^0 the function max cannot be defined, but – as shown by Bel'tyukov (see e.g. [E&M98]) – when focus is on the relational classes, its inclusion makes no difference. First of all, it is trivial that:

Proposition 12 max is an and-function, and min is an or-function. q.e.d.**

Consequently, by proposition 5 (p. 21):

Proposition 13 If max ∈ F, then F? is closed under ∧. If min ∈ F, then F? is closed under ∨. q.e.d.

For the functions of {max}◦ and {min}◦ a complete characterisation is readily available.

Lemma 14 Let*** ~y ⊆ ~x (possibly ~y = ∅).
Define functions max^~y_n : N^k → N by:

max^~y_n(~x) = max(~y) , if max(~y) ≥ n
               n , if max(~y) < n ,

where k = |~x| is the arity of the defined function.

∗ Since min(x, y) ≤ x and min(x, y) ≤ y both hold for all (x, y) ∈ N², both 1 and 2 qualify as a top-index for min.
∗∗ See section 2.1.5.
∗∗∗ We freely write ~y ⊆ ~x when we mean that ~y is a sub-list of ~x. Similarly, e.g. (x_1, x_3) ∪ (x_3, x_4) means the list x_1, x_3, x_4, and (x_1, x_3) ∩ (x_3, x_4) means x_3. Finally, ~y = ∅ has the obvious interpretation.

Define analogously functions min^~y_n : N^k → N, with the exception that we allow n to assume the value 'ω', where min(ω, x) = x for all x ∈ N. Then

(i) {max}◦ = I ∪ N ∪ {max^~y_n | n ∈ N ∧ ~y ⊆ ~x};
(ii) {min}◦ = I ∪ N ∪ {min^~y_n | n ∈ N ∪ {ω} ∧ ~y ⊆ ~x}.

Proof: We only prove the result for max, as the proof for min is obtained by substituting 'min' for 'max' and 'ω' for '0' below. By induction on f ∈ {max}◦. That c(x) = max^∅_c(x), I^k_i(~x) = max^{x_i}_0(~x), and max(x_1, x_2) = max^{x_1,x_2}_0(x_1, x_2) takes care of the induction start.

induction step: By observation 10 and the ih, w.l.o.g. assume f = max ◦ (max^~y_{n_1}, max^~z_{n_2}). Thus f(~x) = max^{~y ∪ ~z}_{max(n_1,n_2)}(~x), which concludes the proof. q.e.d.

Now, a comparably simple characterisation of the functions of M is probably not possible, since min and max 'entangle' in a way that is not always reducible to a transparent expression. Still, recalling that T = T?, the following result is easily provable by viewing f ∈ M as a {min, max}-tree, and invoking observation 10 and proposition 6.

Theorem 15
(i) {max}◦? = T?∧ = T∩;
(ii) {min}◦? = T?∨ = T∪;
(iii) M? = T?∨∧, the closure of T? under ∨ and ∧ – equivalently – the closure of T under finite unions and intersections.

Proof: Since T? ⊆ {max}◦? and, by P.13, {max}◦? = ({max}◦?)∧, we have T?∧ ⊆ {max}◦?. That {max}◦? ⊆ T?∧ follows by induction on f ∈ {max}◦
thus:

induction start: We need only verify for f = max: max⁻¹(0) = {0} × {0} = ({0} × N) ∩ (N × {0}) ∈ T∩.

induction step: By O.10, f = max ◦ (g_1, g_2), where g_i⁻¹(0) ∈ T?. Thus

(max ◦ (g_1, g_2))⁻¹(0) = {~x ∈ N^k | g_1(~x) = 0 ∧ g_2(~x) = 0}
  = {~x ∈ N^k | g_1(~x) = 0} ∩ {~x ∈ N^k | g_2(~x) = 0} = g_1⁻¹(0) ∩ g_2⁻¹(0) .

Hence, by the ih, f⁻¹(0) = g_1⁻¹(0) ∩ g_2⁻¹(0) ∈ T∩. We omit the proofs of (ii) and (iii), as they are completely analogous. We remark here that T?∧ ⊥ T?∨ (equivalently T∩ ⊥ T∪), since e.g. (N × {0}) ∩ ({0} × N) = {0} × {0} = {(0, 0)} ∈ T∩ \ T∪, while (N × {0}) ∪ ({0} × N) ∈ T∪ \ T∩. q.e.d.

On the characteristic function. We also point out that we are still confined to a very unstable – as opposed to robust – territory: theorem 15 is dependent upon our definition of F? and {max}◦• = T•, since max ◦ (g_1, g_2) can be 0–1-valued only in the case that both g_i are constant. With min, on the other hand, we see that 'by chance' min(x, 1) = χ¹_{0}(x), so that in particular {0} ∈ {min}◦•. More generally, if χ_R ∈ {min}◦, then χ_{0}(χ_R(x)) = χ¹_R(x), so that {min}◦• = {min}◦?. Hence

{min}◦• = T?∨ and M• = T?∨∧ .

Still, there is no reason to think that min is in any way more 'brainy' than max, as this difference vanishes if we were to define a characteristic function for R as one vanishing outside of R. Note also how none of the classes above are closed under complements. If we kept our definition of T = {N^k, {0}}_{k∈N} while requiring χ¹_R(x) = 1 ⇔ x ∈ R, then {min}◦• ≠ T∨, since it would contain {1, 2, 3, . . .} but not {0}.

Having demonstrated that min and max are potentially useful functions, yielding some logical closure properties of any F? which they are members of, and also that they are rather uninteresting by themselves, we close this section here.

2.2.2 The case-functions.
Definition 19 (case-functions) The case-function C : N³ → N is defined by:

C(x, y, z) = x , if z = 0
             y , if z > 0 .

The truncated case-function C̄ : N⁴ → N is defined by:

C̄(x, y, z, b) = min(C(x, y, z), b) .

Note that this definition corresponds with our notation for the bounded C; see definition 4 (p. 6). The arithmetical sign-function sgn : N → N is defined by:

sgn(z) = 0 , if z = 0
         1 , if z > 0 .

These functions do not output a number 'by accident' (as do e.g. I^k_i), and there will be a few non-near-trivial problems in e.g. {C̄}◦•. Note that all the case-functions above are a.b., that C does not have a top-index, and that C̄ has strict top-index 4. The sign-function has strict top-index 1. As we shall see, C and C̄ are cut from the same cloth, and provide the same computational strength, while the sign-function – in an idc. context really a severely restricted variant of a case-function – is discussed briefly at the end of this section.

Proposition 16 (i) If C ∈ F, then F? is Boolean; (ii) χ¹_{0} ∈ F; (iii) F? = F•.

Proof: That (i) & (ii) ⇒ (iii) is obvious. Since C(1, 0, x), C(C(0, 1, x), 1, y) and C(0, 1, x) are, respectively, the negation-function, the and-function and the characteristic function for {0}, (i) and (ii) are also established. q.e.d.

Noting that C̄(c_0, c_1, z, c_2) = C(c_0, c_1, z) when c_2 ≥ max(c_0, c_1), we also see that e.g. C̄(1, 0, x, 1) is the negation-function.

Lemma 17 (i) min ∈ {C̄}◦; (ii) f ∈ {C}◦ ⇒ f̄ ∈ {C̄}◦.

Proof: For (i), min(x, y) = C̄(0, y, x, x) ∈ {C̄}◦ (we shall see below that actually min ∉ {C}◦). (ii) is by induction on f ∈ {C}◦; the induction start is trivial. For the induction step, if f ≡ C ◦ (g_0, g_1, g_2), then:

C̄(ḡ_0(~x, b), ḡ_1(~x, b), ḡ_2(~x, b), b) = min(ḡ_0(~x, b), b) , if ḡ_2(~x, b) = 0
min(ḡ_1(~x, b), b) , if ḡ_2(~x, b) > 0 .   (†)

Since min(ḡ_i(~x, b), b) = ḡ_i(~x, b) and ḡ_2(~x, b) = 0 ⇔ g_2(~x) = 0 ∨ b = 0, we see that the function computed by the expression (†) above is indeed min(C(g_0(~x), g_1(~x), g_2(~x)), b), which is what needed to be shown. q.e.d.

This lemma and proposition 16 (ii) immediately yield:

Proposition 18 (i) If C̄ ∈ F, then F? is Boolean; (ii) χ¹_{0} ∈ F; (iii) F? = F•; (iv) {C̄}◦? = {C}◦?. q.e.d.

Item (iv) above means that, as far as computable predicates are concerned, we need only investigate one of the classes {C̄}◦? and {C}◦?. Now, C's fundamental characteristic is the ability to ask whether an argument has the 'zeroness property', and, based on the answer, choose its output between two cases. The equivalence relation ∼₁ on N, defined by the partition of N into {0} and {1, 2, 3, . . .}, suggests itself as convenient for describing properties of {C}◦ (and {C̄}◦).

Definition 20 Let ∼₁ be the equivalence relation on N defined by the partition {{0}, {1, 2, 3, . . .}}. Extend ∼₁ coordinate-wise to an equivalence ∼ₖ on N^k, viz. ~x ∼ₖ ~y ⇔ ∀i≤k (x_i ∼₁ y_i). We simply write ~x ∼ ~y when the k is insignificant. Finally, set f ∼ g ⇔ ∀~x∈N^k (f(~x) ∼ g(~x)).

So e.g. 0 ∼ n iff n = 0, and (0, 1, 1) ∼ (0, 5, 6) ∼ (0, 1, 25) ≁ (14, 0, 3) ∼ (1, 0, 1). That is, each ∼ₖ-equivalence class contains a canonical representative which maps bijectively to bit-strings of length k, from which we deduce that ∼ₖ partitions N^k into 2^k equivalence classes.

Lemma 19 Let {E_j}_{1≤j≤2^k} be the 2^k elements of N^k/∼. Then*

f ∈ {C}◦ ⇒ ∀j≤2^k (f↾E_j ∈ I ∪ N) .   (†)

Furthermore, ~x ∼ ~x′ ⇒ f(~x) ∼ f(~x′).

Proof: We note first that ~x ∼ ~x′ ⇒ f(~x) ∼ f(~x′) is a direct consequence of f satisfying (†). For, f↾E_j ∈ I implies ~x ∼ ~x′ ⇒ f(~x) = x_i ∼ x′_i = f(~x′), while if f↾E_j ∈ N, then ~x ∼ ~x′ ⇒ f(~x) = f(~x′), so certainly f(~x) ∼ f(~x′). By induction on f ∈ {C}◦, for all k simultaneously.
induction start: For f ∈ I ∪ N there is nothing to prove; f↾A ∈ I ∪ N for any A ⊆ Dom(f). If f ≡ C, then f↾E_j is I³_{i+1}↾E_j when E_j is of the form A × A′ × [i]_∼ for i = 0, 1.

induction step: By O.10, assume w.l.o.g. that f = C ◦ (g_0, g_1, g), and let E_j be arbitrary. By the ih, g↾E_j ∈ I ∪ N, and thus f↾E_j = g_i↾E_j ∈ I ∪ N, depending on whether g(~x) ∼ 0 or g(~x) ∼ 1 on E_j. q.e.d.

Theorem 20
(i) {C}◦ = C = {f | f is of the form (†) from lemma 19};
(ii) {C}◦? = {C̄}◦? = T?B;
(iii) {C}◦ ⊥ {C̄}◦.

∗ Here f↾E_j ∈ I ∪ N means that f↾E_j ∈ {I^k_i↾E_j | 1 ≤ i ≤ k} ∪ {c^k↾E_j | c ∈ N}.

Proof: (i): The ⊆-direction is lemma 19, while the converse direction follows directly from the observations that each E_j used in the definition of f ∈ C is in the Boolean closure of T?, and that we can easily write an explicit definition of such functions using only C and I ∪ N-functions. (ii): Immediate by (i), proposition 18 (iv), and the obvious equality C? = T?B. (iii): C does not have a top-index, whence C ∉ {C̄}◦. As (n, n + 1) ∼ (n + 1, n) and min(n, n + 1) = min(n + 1, n) = n for all n ∈ N, min ∉ C. q.e.d.

We also see that χ_= ∉ {C}◦, since f(n + 1, n + 1) ∼ f(n + 1, n + 2) for all f ∈ {C}◦. That is, the predicate 'x = y' is not available. However, the relation f(~x) ∼ z is in C? when f ∈ C.

Proposition 21 Let f ∈ {C}◦ and (~x, z) ∈ R ⇔ f(~x) ∼ z. Then R ∈ {C}◦?.

Proof: χ_R(~x, z) = C(C(0, 1, z), C(1, 0, z), f(~x)) ∈ {C}◦ when f ∈ {C}◦. q.e.d.

Compared to the class M of the last section, we see that the two case-functions C and C̄ have the advantage of being able to also compute complements. That e.g. I²_1 is the characteristic function for R = {0} × N is not because I²_1 'knows' that, say, (0, 4) ∈ R; it is 'by accident'. Of course, C? also contains R, since I²_1 ∈ C. However, the function f(x, y) = C(0, 1, x) also vanishes on R, but this time it is no accident: this function 'knows' it to be true.
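The invariance property of lemma 19 is easy to observe concretely. The following Python sketch (the function names sim and f are ours) builds a typical {C}◦-function by composition and checks that ∼-equivalent inputs yield ∼-equivalent outputs, which is exactly why the equality predicate is unavailable there.

```python
# Illustrating lemma 19: every f built from C, projections and constants is
# invariant under ~, which only distinguishes zero from nonzero coordinates.

def C(x, y, z):
    """The case-function of definition 19."""
    return x if z == 0 else y

def sim(xs, ys):
    """The coordinate-wise extension ~k of ~1 to tuples."""
    return all((a == 0) == (b == 0) for a, b in zip(xs, ys))

# A typical {C}°-style function, a composition of C with projections/constants:
def f(x, y):
    return C(x, C(1, 0, y), y)

# ~-equivalent inputs give ~-equivalent outputs ...
assert sim((2, 3), (5, 1))
assert (f(2, 3) == 0) == (f(5, 1) == 0)

# ... so no such f can decide 'x = y': e.g. (4, 4) ~ (4, 5),
# yet equality distinguishes them.
assert sim((4, 4), (4, 5))
```

This mirrors the argument that χ_= ∉ {C}◦: any candidate characteristic function would have to separate ∼-equivalent pairs.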
Furthermore, a slight modification – say f(x, y) = C ◦ (1, 0, x) – decides {1, 2, . . .} × N, the complement of R. Again, this is because C may decide whether its last argument is non-zero or not. Even if, seemingly, the sign-function shares this feature, adding only sgn to I ∪ N is futile:

Observation 22 {sgn}◦• = {sgn}◦? = T?.

Proof: This follows by noting that sgn(x) ∼ id(x) for all x ∈ N. Hence, given a definition of f ∈ {sgn}◦, substituting id for sgn everywhere yields a function f′ ∈ T satisfying ∀~x∈N^k (f(~x) ∼ f′(~x) ∼ (sgn ◦ f′)(~x)). The rightmost expression is clearly 0–1-valued, whence the result follows. q.e.d.

The sign-function's shortcomings are further illustrated in a brief discussion of unary functions in section 2.2.3. Rather than comparing the min- and max-classes of the last section to the case-classes here, we postpone this to section 2.2.5.

On definition by cases.

Convention 5 Before we proceed, note that if C, f_1, . . . , f_ℓ, χ_{R_1}, . . . , χ_{R_ℓ} ∈ F, then so is the function

f(~x) = f_1(~x) , if ~x ∈ R_1
        ...
        f_ℓ(~x) , if ~x ∈ R_ℓ   (d.b.c.)

This means that we can treat definition by F?-cases as an explicit definition once we know that C ∈ F. Similarly, if C̄ ∈ F, then the function min(f(~x), b) ∈ F, where f is as above. Usually, when defining f via an expression like (d.b.c.) above, one has already verified that R_1, . . . , R_ℓ is a partition of N^k. This is an unnecessary assumption: if we consider the R_i's as ordered by their indices, then ⋃_{i≤ℓ} R_i = N^k is a sufficient condition for the equation (d.b.c.) above to define a function uniquely, by treating it top-down as a nested if-then-else-construction. Even this requirement can be circumvented in an idc. F containing C, by introducing a default-value for ~x ∉ ⋃_{i≤ℓ} R_i, viz. simply define f(~x) = 0 (or any other value) for such ~x. Henceforth, when C ∈ F, we take

f(~x) = f_1(~x) , ~x ∈ R_1
        ...
f_ℓ(~x) , ~x ∈ R_ℓ

to be an explicit definition of f in F, ordered by the indices of the R_i's; such an expression should thus also be interpreted as the claim that each R_i belongs to F?. On the other hand, the meta-definition

f(~x) = f_1(~x) , if ~x ∈ R_1
        ...
        f_ℓ(~x) , if ~x ∈ R_ℓ

with the 'if's in place, will constitute the implicit claim that the R_i's form a partition of N^k. We also abbreviate definition by cases as d.b.c.

On the robustness of F? in the presence of C. In this section we remark upon the fact that, as soon as C ∈ F, the induced relational class of F is largely independent of how we define a characteristic function. For example F• = F?, and we will close this section with emphasis on this phenomenon.

Definition 21 Let X be a set of functions, and ∅ ⊊ A ⊊ N. Define

X?^A = {f⁻¹(A) | f ∈ X} .

Thus we have that X? = X?^{0}.

Proposition 23 Let F be an idc. closed under d.b.c. in the sense above, and let ∅ ⊊ A ⊊ N. Then: A ∈ F? ⇔ F?^A = F?.

Proof: ⇐: Assume F?^A = F?. Then A = id⁻¹(A) ⇒ A ∈ F?.

⇒: Assume A ∈ F?, i.e. χ_A ∈ F.

F? ⊆ F?^A: Let B ∈ F?, i.e. there is a function χ_B ∈ F such that χ_B⁻¹(0) = B. Set a = min A, ā = min(N \ A), and b = max(a, ā). Then

g(~x) = a , χ_B(~x) = 0
        ā , χ_B(~x) > 0 ,

is a d.b.c. of g in F, and g⁻¹(A) = B is obvious. If C̄ is the available function, note that ∀~x (ḡ(~x, b) = g(~x)) (with b regarded as the constant), so that B ∈ F?^A in both cases.

F?^A ⊆ F?: Let f ∈ F be arbitrary. Consider B = f⁻¹(A) ∈ F?^A. Since χ_A ∈ F by hypothesis, we have

χ_A ◦ f(~x) = χ_A(f(~x)) = 0 , if f(~x) ∈ A ⇔ ~x ∈ B
                           1 , if f(~x) ∉ A ⇔ ~x ∉ B .

Thus χ_B = χ_A ◦ f, and so B ∈ F?. q.e.d.

By transitivity of '⇔' and equality we thus have F?^A = F?^B for all (non-trivial) A, B ∈ F?. Thus, if the computational strength of an idc. is reasonably well-measured by considering inverse images of {0}, then once d.b.c.
is available, the choice of in-set is arbitrary, as long as the chosen in-set is no more complex than {0} relative to the idc. under consideration*.

∗ It may be interesting to look into what happens if some in-set B not in F? is considered, but we have not investigated this any further.

2.2.3 The predecessor function P . . . and other unary functions.

Definition 22 (the predecessor-function) The predecessor-function P : N → N is defined by:

P(x) = max(0, x − 1) = 0 , if x = 0
                       x − 1 , if x > 0 .

This function carries the germ of being able to distinguish, not only between zero and non-zero arguments, but between say n and n + 1. However, P's serious handicap in a minimal setting like that of {P}◦ is that it is unary, and by this token alone, the number of possible candidates for membership in {P}◦ is severely restricted. We therefore first investigate some general properties of {U}◦ when U is unary, and conclude by applying these results to the case U = P. Of course, the sign-function mentioned in section 2.2.2 suffers from the same deficiency.

Unary functions. On the one hand, with no restrictions on the complexity of the unary functions involved, the idc. {U_1, U_2, U_3}◦• can be arbitrarily complex. Fix some enumeration f_1, f_2, . . . of the unary functions in an arbitrary idc. F, and then let U_1(n) be e.g. a Kleene-style computation-tree t(1, n) for f_1(n), U_2(t(i, n)) = t(i + 1, n), and U_3(t(i, n)) = f_i(n). Clearly

U_3 ◦ U_2 ◦ · · · ◦ U_2 ◦ U_1(n) = f_m(n)   (with U_2 applied m − 1 times),

so that at least {u ∈ F | u is unary} ⊆ [{U_1, U_2, U_3} ; ]. On the other hand, the naturalness of the U_i's above is highly questionable, and other questions, like whether or not only two unary functions could suffice for a similar construction, fall outside the scope of this dissertation.
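At the other extreme sits the predecessor just defined. A small Python sketch (the names iterate and P3 are ours) illustrates that, apart from projections and constants, the unary functions obtainable from P by composition are just the iterates P^(m)(x) = max(0, x − m), which vanish exactly on initial segments:

```python
# Unary {P}°-functions are the m-fold iterates of the predecessor.

def P(x):
    """The predecessor: P(0) = 0, P(x) = x - 1 otherwise."""
    return max(0, x - 1)

def iterate(f, m):
    """The m-fold composition f ° ... ° f."""
    def g(x):
        for _ in range(m):
            x = f(x)
        return x
    return g

P3 = iterate(P, 3)                     # P^(3)(x) = max(0, x - 3)
assert [P3(x) for x in range(6)] == [0, 0, 0, 0, 1, 2]

# P^(3) vanishes exactly on the initial segment {0, 1, 2, 3}:
assert {x for x in range(10) if P3(x) == 0} == {0, 1, 2, 3}
```

This is the computational content behind the characterisation of {P}◦? as the initial segments {0, . . . , m}.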
On the other end of the scale, the negation-function yields simply (where we consider only unary predicates, and indicate this by writing '=_u')

{χ_¬}◦• =_u {χ_¬}◦? =_u {∅, {0}, {1, 2, . . .}, N} .

In general, the results of section 2.2 on X-trees ensure that, when U is any set of unary functions, for any f ∈ U◦ the predicate f⁻¹(0) will be of the form N × · · · × N × A × N × · · · × N, where A = (U_1 ◦ · · · ◦ U_k)⁻¹(0) for some U_i ∈ U. This is evident, as we can write the functions in such an idc. on the form f(~x) = U_1 ◦ · · · ◦ U_k(x_i) for suitable U_j's and i. Hence it suffices to consider only the unary predicates in U◦? in order to study their complexity. We shall not spend more time on the subject, other than demonstrating what this means for the predecessor-function.

Theorem 24 (i) {P}◦? =_u {{0, . . . , m} | m ∈ N}; (ii) {P}◦• =_u T•.

Proof: A unary f ∈ {P}◦ is P^(m) for some m, or a constant. Clearly (i) P^(m)(x) = max(0, x − m) = 0 ⇔ x ≤ m, and (ii) P^(m) is not 0–1-valued. q.e.d.

We now move on to the truncated-difference function, which is the function from which the first interesting results emerge.

2.2.4 The truncated difference function.

This section is different from the preceding ones in the following ways: most of it is dedicated, not to the function mentioned in the heading, but to the successor-function and to the 1st-order language of Presburger arithmetic. This is because neither concept merits its own section in the context of this dissertation, but at the same time we are absolutely dependent upon them in order to state our results. After their introduction we state the main results pertaining to the difference-function in a theorem without proof. This is because all proofs are contained in the embedded article Bounded Minimalisation and Bounded Counting in Argument-bounded idc.'s [B09a], appearing in section 2.3.2 (pp.
61–87) on bounded minimalisation and bounded counting.

Definition 23 The (truncated) difference-function ∸ : N² → N is defined by:

x ∸ y = max(x − y, 0) = 0 , if x ≤ y
                        x − y , if x > y .

We immediately adopt the standard infix-notation 'x ∸ y' rather than the awkward '∸(x, y)', and also set D = {∸}◦.

In most studies of small idc.'s, one has access to some schema op involving recursions or iterations, and if P ∈ X one obtains '∸' in the resulting class. In fact, P, ∸ ∈ [ ; pr], where pr denotes the schema of primitive recursion*. The fact that we have constant-functions allows us to define e.g. P(x) = x ∸ 1 within D. However, the fact that ∸ is binary – and, unlike e.g. I²_2, depends crucially on both arguments – makes it that much more versatile than any of the other functions we have studied so far.

The reader might have noticed that the perhaps most canonical initial function of them all – the successor-function – has not been treated; in fact we have not even mentioned it thus far. The reason for this is that the successor is not a.b., and even though we will introduce it here, it is mostly in order to compare it to our a.b. functions, and to show how we can emulate it up to a certain point with a.b. functions.

Addition and successor.

Definition 24 (successor) The addition-function + : N² → N is defined by**: + = +_Z ↾ N². The successor-function S : N → N is defined by: S(x) = x + 1.

Observation 25 (i) {S}◦? = {S}◦• = {S, +}◦• = T•; (ii) {S, +}◦? = T?∩.

∗ See definition 30 (p. 49) and observation 33 (p. 89).
∗∗ We apologise for this awkward definition, which was included under much doubt, simply to keep faithful to our convention that (Z, +_Z, −_Z, ≤_Z) is our naive structure.

The result for S is contained in the argument from section 2.2.3 (p. 37), and that the equality extends to the idc.
where + is included is quite obvious for the •-case, since there are only the two 0–1-valued functions 0 and 1. Since e.g. f(~x) = x_2 + x_4 + x_5 = 0 ⇔ x_2 = x_4 = x_5 = 0, the closure under intersections in item (ii) is obvious. As such, these classes do not interest us at all here, and since these functions are not a.b., we are not really interested in them as initial functions either. However, we are very interested in questions of the type: can we remove S from the initial functions of some idc. and still retain its computational strength? or, if not, can we substitute an a.b. function for it? Indeed, this kind of question largely motivated the research presented in this chapter in the first place, and the theory of detour degrees, presented in chapter 3, depends on the subsequent sections in a crucial way.

Presburger arithmetic.

Definition 25 (Presburger arithmetic*) The 1st-order language of Presburger arithmetic, denoted PrA, is defined by: PrA = {0, S, +, <, =}. PrA^qf denotes the quantifier-free formulae of PrA, and PrA^[qf]? denotes those predicates which are definable by a PrA^[qf]-formula. This notation conforms with the notation found in [B09a], where a few auxiliary results are included. We can now summarise the main results from [B09a]:

Theorem 26 (from [B09a]) (i) All f ∈ D are a.b. with a top-index; hence C, max, +, S ∉ D. (ii) D? = D• = {min, max, C, P, ∸, S, +}◦? = PrA^qf?. q.e.d.

With the exception of D? = D•, this theorem is proved in a series of results in [B09a]. Given that adding C to D does not alter the induced relational class, D? = D• is obvious.

∗ See [B09a] for a more detailed definition.

2.2.5 Combinations.

In this section we investigate the various combinations of the functions we have studied so far. We start off with the idc. {C, P}◦, and note that, by proposition 18, all positive results on membership of various predicates in the induced relational class carry over to {C̄, P}◦?.
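Anticipating lemma 27 below, here is a Python sketch (the names chi_n and pred_m are ours) of how composition of C, P and constants decides a singleton {n}: x = n holds iff x ≤ n but not x ≤ n − 1, and x ≤ m is tested by the m-fold predecessor.

```python
# Deciding the singleton {n} in {C, P}° by composition alone.

def C(x, y, z):
    """The case-function of definition 19."""
    return x if z == 0 else y

def P(x):
    """The predecessor."""
    return max(0, x - 1)

def pred_m(x, m):
    """The m-fold predecessor; vanishes iff x <= m."""
    for _ in range(m):
        x = P(x)
    return x

def chi_n(n, x):
    """Characteristic function of {n}: 0 if x == n, 1 otherwise."""
    le_n = pred_m(x, n)                               # 0 iff x <= n
    le_n_minus_1 = pred_m(x, n - 1) if n >= 1 else 1  # 0 iff x <= n-1 (n >= 1)
    # x = n  iff  (x <= n) and not (x <= n - 1); C supplies the Boolean glue.
    return C(C(1, 0, le_n_minus_1), 1, le_n)

assert [chi_n(3, x) for x in range(6)] == [1, 1, 1, 0, 1, 1]
assert [chi_n(0, x) for x in range(3)] == [0, 1, 1]
```

The Boolean combination at the end is exactly the kind of explicit definition the Boolean closure of {C, P}◦? licenses.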
Lemma 27 ∀n∈N (χ_{n} ∈ {C, P}◦).

Proof: χ_{0} is id, and for n ≥ 1 we have that

χ¹_{n}(x) = 0 , (x ≤ n) ∧ n ≤ x
            1 , else

is an explicit definition of χ¹_{n}. By observing that when n ≥ 1, then n ≤ x ⇔ n − 1 < x ⇔ ¬(x ≤ n − 1), recalling that {C, P}◦? is Boolean, and that, by T.24, 'x ≤ n' ∈ {C, P}◦?, we are done. q.e.d.

Definition 26 Define*

A ⊂fin N ⇔ |A| ∈ N ∧ A ⊆ N   and   A ⊂cof N ⇔ Ā ⊂fin N .

When A ⊂fin N or A ⊂cof N, we say that A is finite or cofinite respectively. Next,

PC₀ = {R ⊆ N^k | k ∈ N ∧ R is a Cartesian product of finite or cofinite sets} ,

and define PC = (PC₀)^∩∪, viz. the closure of PC₀ under finite unions and intersections.

∗ '⊂fin' was defined in section 1.2, and is consistent with this definition.

Observation 28
1. R ⊆ N^k belongs to PC iff R is a finite union of sets of the form A_1 × · · · × A_k, where the A_i's are either finite or cofinite. Hence PC is closed under complements. To see this, note that e.g. the complement of A_1 × A_2 is (Ā_1 × A_2) ∪ (A_1 × Ā_2) ∪ (Ā_1 × Ā_2), and A_1 ∈ PC ⇔ Ā_1 ∈ PC. This means that PC = PC^B, viz. PC is Boolean.
2. A second useful observation is that

~x ∈ N^k \ f⁻¹(A) ⇔ f(~x) ∉ A ⇔ f(~x) ∈ Ā ⇔ ~x ∈ f⁻¹(Ā) ,

so that the complement of f⁻¹(A) equals f⁻¹(Ā) in general.

Theorem 29 {C, P}◦? = {C, P, min, max}◦? = PC.

Proof: Since {C, P}◦? is Boolean, {C, P}◦? ⊇ PC by L.27. We next prove the inclusion PC ⊇ {C, P, min, max}◦? by induction on f ∈ {C, P, min, max}◦, thus completing the proof.

induction start: The case f ∈ I ∪ N is obvious.
case f = P: P⁻¹(0) = {0, 1} ∈ PC;
case f = C: C⁻¹(0) = ({0} × N × {0}) ∪ (N × {0} × {1, 2, . . .}) ∈ PC;
case f = min: min⁻¹(0) = ({0} × N) ∪ (N × {0}) ∈ PC;
case f = max: max⁻¹(0) = {0} × {0} ∈ PC.

induction step: Now f = h ◦ ~g. The case h ∈ N is trivial; the case h ∈ I follows from the ih.
case h = P: Then f ≡ P ◦ g, and f⁻¹(0) = g⁻¹({0, 1}) ∈ PC†.

† That '∈' holds follows from combining proposition 23 (p. 34) and the ih on g.
case h = C: Then f ≡ C ◦ (g_0, g_1, g), and, by the ih, f⁻¹(0) = (g_0⁻¹(0) ∩ g⁻¹(0)) ∪ (g_1⁻¹(0) ∩ (N^k \ g⁻¹(0))) ∈ PC.
case h = min: Then f ≡ min ◦ (g_1, g_2), so, by the ih, f⁻¹(0) = g_1⁻¹(0) ∪ g_2⁻¹(0) ∈ PC.
case h = max: Then f ≡ max ◦ (g_1, g_2), so, by the ih, f⁻¹(0) = g_1⁻¹(0) ∩ g_2⁻¹(0) ∈ PC. q.e.d.

Since any R ∈ PC₀ can be obtained as a Boolean combination of {P}◦?-predicates, we also see that:

Corollary 30 ({P}◦?)^B = PC. q.e.d.

A 1st-order description of {C, P}◦. From theorem 29 we also infer that equality is not a computable predicate in {C, P}◦. Formally, '=' is {(n, n) | n ∈ N} ⊆ N², which is clearly not a finite Boolean combination of cofinite sets. On the other hand, 'equals n' is {n} ⊆ N, and belongs to {C, P}◦? for each n ∈ N.

Theorem 31 Let Linit be the 1st-order language {L_n, n̄ | n ∈ N}, without equality, and with intended structure N = (N, L_n^N, n̄^N), where L_n^N = {0, 1, . . . , n} and n̄^N = n. Then* PC = {R_φ | φ ∈ Linit}.

Proof: Let Linit^qf be the quantifier-free fragment of Linit. PC = {R_φ | φ ∈ Linit^qf} is obvious; hence we need to show that {R_φ | φ ∈ Linit^qf} = {R_φ | φ ∈ Linit}. It is clearly sufficient to prove that for any φ(~x) ∈ Linit^qf, there exists some φ′(x_2, . . . , x_k) ∈ Linit^qf such that:

N ⊨ (∃x_1 φ)(n_2, . . . , n_k) ⇔ N ⊨ φ′(n_2, . . . , n_k) .   (z)

The left side of (z) means that for some m ∈ N we have N ⊨ φ(~n)[x_1 := m] ≡ (φ[x_1 := m])(~n), and we know that by basic set-operations R_φ ⊆ N^k can be written as a finite union of intersections of sets in PC₀. Hence we may find m_1, . . . , m_ℓ ∈ N such that φ′ ≡ ⋁_{j≤ℓ} φ(m_j, x_2, . . . , x_k) is the required Linit^qf-formula. q.e.d.

∗ See section 2.1.5 (p. 22).

Summary of basic functions. The function '∸' was the last natural a.b. basic function we considered. At the same time, it is the first and only basic function which yielded an interesting and slightly unexpected result – in being as computationally strong as it turned out to be.
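That strength is easy to taste in a few lines of Python, assuming nothing beyond definition 23: x ∸ y vanishes exactly when x ≤ y, which is the kind of order predicate that makes D coincide with quantifier-free Presburger arithmetic on the relational level.

```python
# The truncated difference (monus) of definition 23, and the predecessor
# recovered from it via a constant, as noted in the text.

def monus(x, y):
    """x ∸ y = max(x - y, 0)."""
    return max(x - y, 0)

def P(x):
    return monus(x, 1)   # P(x) = x ∸ 1

assert monus(7, 3) == 4
assert monus(3, 7) == 0
assert [P(x) for x in range(4)] == [0, 0, 1, 2]

# The order predicate: x <= y iff x ∸ y = 0, so 'monus' itself serves as a
# characteristic function for '<=' under the convention R = chi^{-1}(0).
assert all((monus(x, y) == 0) == (x <= y)
           for x in range(5) for y in range(5))
```

Unlike the unary basic functions, ∸ genuinely uses both arguments, which is what the comparison with I²_2 in section 2.2.4 is pointing at.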
We have not bothered to look at e.g. {P, S}◦ or {P, min}◦ separately, since by now the reader can easily supply the details if curious. That is to say, we do not consider these classes very interesting, and e.g. the result on PC was merely included to demonstrate how one would proceed in order to obtain characterisations of such classes. It is clear that all such classes will be contained in D, and that most predicates 'just happen to be there' – as if by accident – rather than being put into the relational classes by the computational skills of the initial functions. Because theorem 26 provides us with sufficient information to relate this section's results and inclusions to the other functions encountered so far, we summarise the main features of the terrain of basic-functions-only: we have the chain of proper inclusions

T? ⊊ T?∧, T?∨ ⊊ T?∨∧ ⊊ T?B ⊊ PC ⊊ D? ,

together with the equalities T?∧ = {max}◦?, T?∨ = {min}◦? (where T?∧ ⊥ T?∨), T?∨∧ = M?, T?B = {C}◦? = {C, min, max}◦?, PC = {C̄, P}◦? = {C, P, min, max}◦? = ({P}◦?)^B = (Linit)?, and D? = {min, max, C, P, ∸, S, +}◦? = PrA^qf?.

When it comes to the functional classes, the equalities are scarcer. As we have pointed out, e.g. {C}◦ ⊥ {C̄}◦, and by top-index considerations this carries over to e.g. {C, P}◦ ⊥ {C̄, P}◦, but we shall not spend more time on the various different combinations of initial functions and the relations between the resulting idc.'s.

Finally, we would like to remark upon why we have not given the two functions rem, the remainder function, and the integer division function any attention thus far. Both functions are defined and treated in [B09a], but since, regrettably, we have not been able to obtain either interesting or complete results on these functions, we defer their treatment to that paper. Now that we have a thorough understanding of our chosen initial functions, we move on to investigating various schemata.
2.3 Schemata

2.3.1 Historical context and motivation.

In this section we recall and develop results about small idc.'s of the form [ ; op] and [X ; op], where X consists of one or a few of the initial functions studied in section 2.2. We open this section with a few historical remarks, serving to clarify what we will mean by the various terms employed, e.g. computable, recursive or sub-recursive. We then – unlike in our treatment of the basic functions, which for the most part were defined in their respective sub-sections – formally define those schemata which we shall have something to say about, either because we present new results, or because we want to place these results in a broader context. While doing this, we have also included a short account of the history of these schemata's names and place in the literature. As for the initial functions, we have made our selection with the ulterior motive of finding suitable idc.'s for the theory of relativised detour degrees of the next chapter. For this purpose we also state some properties of the schemata which are relevant for the next section, viz. to which extent they preserve argument-boundedness. This section also seemed a suitable place for giving a definition of the Péter–Ackermann function and the Grzegorczyk-hierarchy E = ⋃_{n∈N} E^n, as well as various other classes and hierarchies from the literature which we will relate our results to. We would also like to apologise to anyone who feels that we have omitted any person, method or result worthy of mentioning. The following pages should be considered a continuation of this thesis' introduction, and are intended to motivate and position our research in a historical context. It has not been the intention to write a complete historical overview, and there is no doubt that many contributors and theorems which should have been mentioned have been left out.

What is a sub-recursive schema, and recursive vs. computable.
First of all, the term recursive function may soon be an anachronism, and its sortie from the vocabulary of working mathematicians has been hastened in particular by R. Soare (see [Soa]). Today a recursive function is synonymous with a computable function – equivalently, any function computable by a Turing machine. Saying that some – possibly partial – function f is general recursive usually means that (i) f is computable, and furthermore (ii) this can be seen because f can be defined by a recursive definition (in a sense similar to that of Rogers', quoted below), possibly by use of the unbounded µ-operator. As we shall not deal formally with partial, or non-total, functions, it suffices to think of them e.g. as maps F : N^k ∪ {↑} → N ∪ {↑} satisfying F(↑) = ↑. With this rudimentary notion of a partial function, unbounded search is the schema for defining a map F from a map G by

  F(~x) =def min { y ∈ N | G(~x, y) = 0 ∧ ∀z<y (G(~x, z) ∈ N) } ,  where min ∅ =def ↑ .

Of course, knowing when during the 'computation of F' to conclude that the set { y ∈ N | G(~x, y) = 0 } is empty, so as to output ↑, is a very elusive question to say the least. For more on general recursive functions, see e.g. [Odi92] or [Sho67]. So what is a recursive function in our context? Consider the following quote:

  One method for characterizing a class of functions is to take, as members of the class, all functions obtainable by certain kinds of recursive definition. A recursive definition for a function is, roughly speaking, a definition wherein values of the function for given arguments are directly related to values of the same function for "simpler" arguments or to values of "simpler" functions. The notion "simpler" is to be specified in the chosen characterization—with the constant functions, amongst others, usually taken as the simplest of all.
  This method of formal characterization is useful for our purposes, in that recursive definitions can often be made to serve as algorithms.
  – Rogers [Rog67, pp. 5–6]

As I understand Rogers, the word 'recursive' has the meaning inductive above, and the excerpt can be viewed as a rudimentary and informal definition of an idc., albeit not necessarily of total functions, since he does not explicitly rule out the possibility of e.g. unbounded search. In this thesis, by sub-recursion theory, we mean the study of inductively defined classes of total (general) recursive functions. The set TR of total recursive functions is not itself an idc., as shown in great detail in e.g. Péter [Pet67, Ch. 15]. Thus sub-recursion theory is the study of proper sub-classes of TR. This also means that a recursive schema could simply mean a schema op (see definition 6 (p. 6)) which is defined in such a way that for each of the operators σ ∈ op and ~g ∈ TR, one can extract an algorithm for computing σ(~g)(x) from algorithms for computing the g_i on arbitrary arguments. As such, composition is a 'classical recursive schema' in the sense of classical sub-recursion theory. For, in order to compute the total function f =def h ∘ g, we depend upon values from the 'simpler' total functions h and g.

However, sometimes, when logicians or computer scientists assert that op is a recursive schema, they mean that the canonical algorithm by means of which the functionals σ ∈ op compute σ(g)(x) needs access not only to values g(y), as provided by the simpler function g, but also to values σ(g)(y) for 'y simpler than x'. Usually this means 'for y < x', but sometimes it can also mean 'for y ≺ x', where '≺' is some computable well-order on N. The canonical example of a recursive schema in this last meaning of the term is that of primitive recursion:

  Definition (Dedekind [1888]) A function f is defined from g and h by primitive recursion if
    f(~x, 0)     = g(~x)
    f(~x, y + 1) = h(~x, y, f(~x, y))
  – Definition I.1.3.
  in [Odi92, p. 20]

This definition from [Odi92] – where it is attributed to Dedekind and his 1888 monograph Was sind und was sollen die Zahlen? – also serves to illustrate that the idea of recursive definitions of functions is a fundamental and long-established mathematical idea. Now, if we write h R g for the defined function f, we see that the computation of e.g. (h R g)(~x, 2) depends not only upon various values g(~y) or h(~z), but also on the values (h R g)(~x, 1) and (h R g)(~x, 0). This means that not only is the definition of primitive recursion inductive, by relying on previously defined functions, but the generated function itself relies – in every computation (h R g)(~x, y) with non-zero y – recursively on itself. Alternatively, we can observe that this simply means that the sequence f(~x, 0), f(~x, 1), f(~x, 2), ... is inductively defined along the last coordinate.

The terminology we will adopt is that a recursive schema is any recursive schema in the sense of Rogers – which could also be called an inductive schema – while a recursion schema is a schema where, in order to compute σ(~g)(x), one may need access to σ(~g)(y) for y ≺ x. We fully acknowledge the lack of rigour in the discussion above, but as the aim merely is to give an intuitive meaning to the terms mentioned, and not to formally characterise the recursive vs. the non-recursive schemata, and since we only wish to use them in informal discussion, the above distinction should suffice for our needs.

The schemata and their definitions.

We have already defined the schema comp. The other schemata we will investigate are variants of bounded minimalisation, bounded counting, and various forms of primitive recursion. We will also need to define bounded, limited or truncated versions of some of these schemata in order to compare and relate our results to other research in the field, and knowledge of the schema of bounded sum will also come in handy.
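Dedekind's schema above is easily animated. The following Python sketch of the operator h R g is purely this editor's illustration, not part of the formalism; the bottom-up loop replaces the top-down recursion, which is harmless since only values at smaller y are ever consulted:

```python
def R(g, h):
    """Primitive recursion (h R g):
       f(xs, 0)   = g(xs)
       f(xs, y+1) = h(xs, y, f(xs, y))."""
    def f(*args):
        *xs, y = args
        acc = g(*xs)
        for i in range(y):          # unfold the recursion bottom-up
            acc = h(*xs, i, acc)
        return acc
    return f

# Addition from successor alone: add(x, 0) = x, add(x, y+1) = S(add(x, y)).
add = R(lambda x: x, lambda x, i, prev: prev + 1)
```

Here add(3, 4) unfolds four applications of the successor step; the transition function h sees the index i, which is exactly what distinguishes recursion from iteration below.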
As promised, we will be less stringent in specifying e.g. arities of the operators, and omit redundant phraseology of the kind 'Formally, the schema op is a family of operators...'. Also, we will frequently refer to schemata not yet defined here. This way of presenting the schemata will suit the reader who is familiar with the various concepts, who can concentrate on the particular definitions chosen here. For the reader new to the subject, reading this section will entail some jumping back and forth in order to appreciate the various comments and remarks.

Definition 27 (bounded minimalisation)
The schema bmin is called bounded minimalisation. Given functions g1(~x, z), g2(~x, z) and y ∈ N, let E_y = { z ≤ y | g1(~x, z) = g2(~x, z) }, and define

  µ_{z≤y}[g1, g2](~x, y) =def { min E_y , if E_y ≠ ∅ ;  0 , otherwise } .

We say that f =def µ_{z≤y}[g1, g2] is obtained by bounded minimalisation from g1 and g2.

The different names given to this schema, and the variations over the intuitive concept – searching for the least argument below some predefined threshold (y) – are many, and, as we will see, usually equivalent except in very small idc.'s. Definition 27 above conforms with that of Harrow [Har75, p. 417], except that, following Grzegorczyk, he calls the schema limited minimum. Actually, the schema Grzegorczyk calls limited minimum is a special case of our bounded minimalisation: according to Grzegorczyk [Grz53, p. 9], f is obtained from g by the operation of limited minimum if f = µ_{z≤y}[g, 0], a schema dubbed strict limited minimum by Harrow [Har75, p. 419]. If χ= is available, then strict limited minimum and limited minimum are equivalent, as

  µ_{z≤y}[χ= ∘ (g1, g2), 0] = µ_{z≤y}[g1, g2] ,

while in very small idc.'s they may not be.
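Definition 27 can be rendered executably; the sketch below (and the sample search) is this editor's illustration, with the failed-search value 0 taken verbatim from the definition:

```python
def bmin(g1, g2):
    """Bounded minimalisation mu_{z<=y}[g1, g2](xs, y): the least z <= y
    with g1(xs, z) = g2(xs, z), and 0 when the set E_y is empty."""
    def f(*args):
        *xs, y = args
        for z in range(y + 1):
            if g1(*xs, z) == g2(*xs, z):
                return z
        return 0
    return f

# Sample bounded search (illustrative): the least z <= y with z*z >= x,
# i.e. a rounded-up square root, provided the bound y is large enough.
isqrt_up = bmin(lambda x, z: 1 if z * z >= x else 0,
                lambda x, z: 1)
```

Note how the bound y is an explicit last argument of the generated function, and how a too-small bound silently yields the failed-search value 0 rather than an error, just as in the definition.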
A second way to produce a slightly different schema is to change the failed-search value: let µ^y_{z≤y}[g1, g2](~x, y) (note the super-scripted y) be defined just as µ_{z≤y}, with the exception that when E_y is empty, the output is y rather than 0. Set Y = µ^y_{z≤y}[g1, g2](~x, y). Then

  µ_{z≤y}[g1, g2](~x, y) = { Y , if χ=(g1(~x, Y), g2(~x, Y)) ;  0 , otherwise } ,

and this demonstrates that the presence of C and χ= ensures the two schemata are equivalent (reverse the rôles of Y and µ_{z≤y}[g1, g2](~x, y) in order to obtain the converse direction).

A second schema for generating functions by considering values along a coordinate of previously defined functions is that of bounded counting:

Definition 28 (bounded ℓ-ary counting)
The schema bcount^ℓ is called ℓ-fold bounded counting. Given functions g1(~x, ~z) and g2(~x, ~z), where |~z| = ℓ, define a function f =def #_{~z≤y}[g1, g2] by:

  f(~x, y) =def | { ~z ∈ N^ℓ | max(~z) ≤ y ∧ g1(~x, ~z) = g2(~x, ~z) } | .

The schema \overline{bcount}^ℓ is called truncated ℓ-fold bounded counting. Given functions g1(~x, ~z), g2(~x, ~z), where |~z| = ℓ, define a function f =def \overline{#}_{~z≤y}[g1, g2] by:

  f(~x, y) =def min( #_{~z≤y}[g1, g2](~x, y), y ) .

We usually omit the ℓ in the superscript, and a 'z' rather than a '~z' in the subscript will indicate whether we are dealing with unary counting or not. As we will demonstrate, the computational strength is in any case only boosted by going from unary to binary counting. Also note that the definition does not exclude the possibility of ~x being empty.

Many classical open questions in sub-recursion theory regard closure under unary counting. For example, whether or not the idc. R =def [S, × ; bmin] is closed under unary counting is equivalent to whether or not ∆^N_0 is closed under counting operations, a still-open problem studied extensively by e.g. Wrathall [Wra78], Paris & Wilkie [P&V85], Esbelin & More [E&M98] and others. We defer further discussion of these schemata to section 2.3.2.
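A Python sketch of definition 28 and its truncated variant may help; this rendering is this editor's (itertools.product enumerates the ℓ-tuples ~z with max(~z) ≤ y):

```python
from itertools import product

def bcount(g1, g2, l=1):
    """l-fold bounded counting #_{z<=y}[g1, g2](xs, y): the number of
    tuples z in {0, ..., y}^l with g1(xs, *z) = g2(xs, *z)."""
    def f(*args):
        *xs, y = args
        return sum(1 for z in product(range(y + 1), repeat=l)
                   if g1(*xs, *z) == g2(*xs, *z))
    return f

def bcount_trunc(g1, g2, l=1):
    """Truncated counting: the same tally, capped at the bound y."""
    g = bcount(g1, g2, l)
    return lambda *args: min(g(*args), args[-1])

# Binary counting of the diagonal z1 = z2 below the bound (illustrative).
diag = bcount(lambda z1, z2: z1, lambda z1, z2: z2, l=2)
diag_trunc = bcount_trunc(lambda z1, z2: z1, lambda z1, z2: z2, l=2)
```

The diagonal example already shows why truncation matters below E^0: the untruncated count y + 1 exceeds the bound y, while the truncated version stays argument-bounded.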
Definition 29 (bounded sum)
The schema bsum is called bounded sum. Given a function g(~x, y), the function f(~x, y) =def Σ_{i=0}^{y} g(~x, i) is defined in the obvious way. The schema \overline{bsum} of truncated bounded sum is defined as expected: \overline{bsum}(g)(~x, y, z) =def min{ bsum(g)(~x, y), z }.

As bsum can be called an inherently non-argument-bounding schema, we shall not be concerned with it here. The truncated version could have been studied here, were it not for lack of time.*

A fine point is that in e.g. Rose [Ros84, p. 4] (strict) bounded minimalisation is introduced under the name least number operator, and is regarded as a derived schema. In essence Rose proves that closure under bounded sum yields closure under g ↦ µ_{z≤y}[g, 0], following the tradition of Péter. This is quite natural given that neither of the authors is concerned with what happens below Grzegorczyk's E^0 – a class founded on limited recursion, and which thus includes truncated bounded sum in a straightforward way. For this reason, we suspect, the bounded minimalisation and bounded counting schemata are not always included amongst the classical recursive schemata. In this author's opinion, this is a linguistic mistake where the 'recursive' in recursive schema is interpreted as 'primitive recursion-like', rather than 'inductive'. Either way, our next schemata are definitely classical recursive schemata under any interpretation of the term:

Definition 30 (primitive recursion and iteration)
The schema pr^ℓ is called ℓ-fold simultaneous primitive recursion. Given functions g1, ..., gℓ, each of arity k, and h1, ..., hℓ, each of arity k + ℓ + 1, define ℓ functions f_i =def (~h R ~g)_i, each of arity k + 1, by:

  f_i(~x, y) =def { g_i(~x) , if y = 0 ;  h_i(~x, y − 1, f_1(~x, y − 1), ..., f_ℓ(~x, y − 1)) , if y > 0 } .

The schema it^ℓ is called ℓ-fold simultaneous iteration. Given g1, ..., gℓ of arity k and h1, ...
, hℓ of arity k + ℓ, define ℓ functions f_i =def (~h I ~g)_i, each of arity k + 1, by:

  f_i(~x, y) =def { g_i(~x) , if y = 0 ;  h_i(~x, f_1(~x, y − 1), ..., f_ℓ(~x, y − 1)) , if y > 0 } .

We omit the ℓ when irrelevant or clear from context. Following Esbelin & More (see [E&M98, p. 134]), we sometimes refer to an h_i (in the context ~h I ~g) as a transition function, and a g_i is sometimes referred to as a base function.

*I have no opinion about what e.g. [ ; bsum] would look like. Playing with it for a while would probably either quickly reveal it to be very weak, like {S, +}^◦, or one would discover that some auxiliary functions (like C or even −̇) are constructable, which would mean that quite a lot could be done within that framework. I would go with the former if forced to place a bet, but would prefer not to.

Note that ℓ = 1 yields standard primitive recursion and iteration. The two schemata above are named so as to conform with the naming convention from Kutylowski [Kut87] and Esbelin & More [E&M98], where bounded iteration-schemata are studied. In the early literature, the naming conventions have differed; consider e.g.

  Thus with one parameter, we have the four recursion schemes:
    Recursion:       F(u, 0) = Au,  F(u, Sx) = B(u, x, F(u, x)) .
    Pure Recursion:  F(u, 0) = Au,  F(u, Sx) = B(u, F(u, x)) .
    Iteration:       F(u, 0) = Au,  F(u, Sx) = B(x, F(u, x)) .
    Pure Iteration:  F(u, 0) = Au,  F(u, Sx) = BF(u, x) .
  – Robinson [RRo47, p. 998]

From this classification, which is also used by e.g. J. Robinson in [JRo50] and Severin (with the qualifier mixed to distinguish e.g. iteration from pure iteration) [Sev08], we see that what we have called iteration, that is it^1, should be called pure recursion. To add to the confusion, in the embedded article Barra [B08b], the (ℓ-fold simultaneous version of the) second schema from the top in Robinson's list is called pure iteration, an unfortunate slip on behalf of this author.
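The ℓ-fold simultaneous iteration of definition 30 can be sketched as an operator on tuples of functions. The rendering below is this editor's illustration; note that the transition functions see the previous values but not the index y − 1, which is exactly what separates it^ℓ from pr^ℓ:

```python
def it(hs, gs):
    """l-fold simultaneous iteration (h I g): l functions in lockstep.
       f_i(xs, 0)   = g_i(xs)
       f_i(xs, y+1) = h_i(xs, f_1(xs, y), ..., f_l(xs, y))."""
    def run(*args):
        *xs, y = args
        vals = tuple(g(*xs) for g in gs)
        for _ in range(y):          # the index is NOT passed to the h_i
            vals = tuple(h(*xs, *vals) for h in hs)
        return vals
    return run

# 2-fold iteration computing consecutive Fibonacci numbers (illustrative):
# the pair (a, b) steps to (b, a + b), starting from (0, 1).
fib_pair = it((lambda a, b: b, lambda a, b: a + b),
              (lambda: 0, lambda: 1))
```

The Fibonacci example also illustrates why simultaneity is useful: a single iterated unary function could not remember both of the two previous values at once.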
Kutylowski is one of the authors who use the same naming convention as we employ here, inasmuch as in e.g. [Kut87] a schema called bounded iteration corresponds to the limited (in the Grzegorczyk sense) version of Robinson's pure recursion. In this dissertation we will concentrate on the ℓ-fold simultaneous versions of the two top-most schemata. The main – and only – difference between them lies in the fact that during the computation of (h R g)(~x, y), the first schema grants the function h direct access to the argument y − 1 – called the index of the recursion in [E&M98, p. 134] – while during the computation of (h I g)(~x, y) the function h does not enjoy this perk.

Both schemata have well-known limited versions. The inventor of this type of schema was Grzegorczyk*:

  [The function f is defined by] the operation of limited recursion if it [is defined from functions] g, h, j [and satisfies]
    (a) f(u, 0) = g(u),
    (b) f(u, x + 1) = h(u, x, f(u, x)),   (I)
    (c) f(u, x) ≤ j(u, x),
  – Grzegorczyk [Grz53, p. 14]

Above, 'u' represents a list of arguments (viz. u is what we denote by ~x). For the discussion below we call this schema pr_p, and denote by h R_{≤j} g the function f defined by the schema above.

*In Odifreddi [Odi99, p. 266] the same schema, under the name of bounded recursion, is attributed to Skolem and his 1923 paper [Sko23]. However, though the use of recursions disguised as bounded quantifiers abounds in this paper, and these bounds or limits are paramount for his philosophical arguments, Skolem certainly does not call this procedure limited or bounded recursion, nor does he explicitly write down Grzegorczyk's expression for when a function f is to be defined from simpler functions g, h, j by such a procedure.

For the work of Grzegorczyk this way of defining a truncated version was exactly what he needed, and there are a mere handful of other papers in the field of sub-recursion theory which have been as influential as [Grz53].
Nevertheless, from a computability point of view, there is something unsatisfactory about this definition. It is not very extravagant to demand that the set of definitions (see remark 2 (p. 17)) be a decidable set, but in order to ensure this property, we really need to know when we can apply an operator to previously defined functions. That the domain of R_{≤} is not determined simply by arity considerations, but also by the relationship between h R g and j, unfortunately turns the general problem into an undecidable one.

The limited* version of it^ℓ has been studied mainly by Kutylowski in [Kut87]. Of course, the same objection as above applies also to limited iteration. The problem is thus, given g, h and j: (how) can we verify condition (c) above in general? On the other hand, is this really a problem? Though not really outside the scope of this dissertation, these questions are not addressed here, except in an 'extended remark' on p. 55 below. In the meantime, we first review the primitive recursive functions and the Grzegorczyk-hierarchy.

The primitive recursive functions.

  PR =def [S ; pr] .

The idc. PR defined above is the familiar set of primitive recursive functions, and the induced relational class PR_⋆ comprises 'most' of the natural predicates encountered in mathematics. It is to convey this fact that we have departed from our naming convention and given [S ; pr] its traditional name PR. PR is closed under primitive recursion by definition, but PR is also closed under several types of seemingly more powerful recursion schemata:

Theorem (Rose) Let ≺ be a well-order on N of order-type strictly less than ω^ω, and let P_≺ : N → N satisfy ∀y∈N (P_≺(y) ≺ y). Then, if g, h ∈ PR, so is the function f defined by:

  f(~x, y) =def { g(~x) , if y is the ≺-least element ;  h(~x, P_≺(y), f(~x, P_≺(y))) , otherwise } .
  q.e.d. – Rose [Ros84, p. 59]

Theorem (Hilbert–Bernays) ∀ℓ∈N ( PR = [S ; pr^ℓ] ) .
  q.e.d. – Péter [Pet67, p.
  62]

The interested reader will find a large number of other equivalent forms of recursion-like schemata in both [Pet67] and [Ros84]. Secondly, it is known that the seemingly weaker recursion schemata it^ℓ are also equivalent, e.g.:

Theorem (Gladstone) PR = [S ; it^1] .
  q.e.d. – Gladstone [Gla71]

*There under the name of bounded ℓ-fold iteration.

Note that when P is included in the idc. of the last theorem, the corresponding result was shown by Robinson in [RRo47]. The main contribution of Gladstone in [Gla71] was to show how to define P in [S ; it^1] – a highly non-trivial feat.

The Ackermann–Péter function.

In order to find a total general recursive function which is not primitive recursive we will thus have to resort to the powerful method of diagonalisation, and at this point the relationship between the majorisation relation and sub-recursive classes will start to emerge. In fact, the first transparent example of a general recursive, non-primitive recursive function was obtained by an argument relying on majorisation properties of the primitive recursive functions. This intermission also serves as the fundament for a bridge between the research of chapters 2–3 and the work found in chapter 4. At least it offers an explanation as to why topics with no transparent connection are treated within the same dissertation.

Returning to the identification of a non-primitive recursive function, the idea is to find a family of ≺-increasing functions {f_i}_{i∈N} ⊆ PR such that

  ∀f∈PR ∃i∈N (f ≺ f_i) ,   (z)

and then to consider the diagonal function F =def x ↦ f_x(x). Here and below, f ≺ g means that g strictly majorises f; see definition 1 (p. 5). Since F ⊀ f_i for any i, by (z) above, surely F ∉ PR. The function F is obviously total, and by our naive definition of the term general recursive – 'computable by some algorithm', or, Turing machine-computable – obviously general recursive, whence F ∈ TR \ PR.
Such a function F can also be defined by

  F(x, y) =def { x + 1 , if y = 0 ;
                 F(1, y − 1) , if x = 0 ∧ y > 0 ;
                 F(F(x − 1, y), y − 1) , if x > 0 ∧ y > 0 } ,

where the functions f_i =def x ↦ F(x, i) will constitute one family as described above. In order to see better why this works, we will use the opportunity to introduce the Grzegorczyk-hierarchy now.

The Grzegorczyk-hierarchy.

In our presentation of the Grzegorczyk-hierarchy, we will use slightly different functions than in Grzegorczyk's original paper [Grz53]. As is well-known, this is of no importance whatsoever as long as the back-bone functions f_n satisfy f_n ∈ E^n \ E^{n−1} and majorise the function x ↦ F(x, n) (for F as described above). Using the particular functions defined below will enable us to familiarise ourselves with the functions upon which much of chapter 4 is founded.

We define* two sequences of functions E_k and T_k by:

  E_0(x, y) =def T_0(x, y, z) =def max(x, y) ;
  E_1(x, y) =def x + y ,  T_1(x, y, z) =def z · x + y  and  E_2(x, y) =def x · y ;
  k ≥ 2 ⇒ E_{k+1}(x, y) =def T_k(x, 1, y)  and  T_k(x, y, z) =def { y , if z = 0 ;  E_k(x, T_k(x, y, z − 1)) , if z > 0 } .

As we will prove in more detail in chapter 4, we have e.g. E_3(x, y) = x^y and T_3(x, y, z) = x^{x^{·^{·^{x^y}}}}, an 'exponential tower of x's, of height z, topped with a y'. Hence E_3 coincides with usual exponentiation where defined. Also observe that E_4(x, y) =def T_3(x, 1, y) is a tower of x's of height y (topped with a 1). We also have

  ∀k,m∈N ( T_k(x, 1, m) ≺ E_{k+1}(2, x) ) ,

as is proved as corollary 103 (p. 167) in chapter 4. The significance of this proposition will become clear after we give a definition of the Grzegorczyk-hierarchy.

Definition 31 (Grzegorczyk-hierarchy [Grz53])
For each k ∈ N define** E^k =def [S, E_k ; pr_p], and set*** E =def ⋃_k E^k.

Now, the original definitions differed somewhat, most significantly in that max was not included at level 0 (for k > 0 it is derivable), so that in fact our E^0 properly contains the original E^0.
The properness of the inclusion is a consequence of a top-index-like property which functions in Grzegorczyk's E^0 can be shown to satisfy, but which functions in our E^0 do not share. A brief summary of the results from [Grz53] of main interest to us is****:

*These are the same functions as defined in definition 46 ahead, except that E_0 is not needed there.
**Here pr_p is the schema defined by Grzegorczyk above (p. 50); equivalently, the schema defined in definition 33 below.
***In [Grz53] itself, and in e.g. [Clo96], 'E' has been used to denote the set of the Kalmár-elementary functions, which in this hierarchy corresponds to E^3. For a good introduction to this class see e.g. [Ros84, pp. 3–11] (where it is denoted by 'E^0'). We prefer the convention that for a hierarchy F^0 ⊆ F^1 ⊆ · · ·, the 'F' is extracted from the notation to denote its union.
****For the original E^0, in item (1), for the case k = 0 the correct bound is ∃c∈N ∃i (f(~x) ≤ x_i + c).

Theorem (Grzegorczyk, Rose)
1. f ∈ E^k ⇒ ∃c∈N (f(~x) ≺ E_{k+1}(max(~x), c)) ;
2. f ∈ PR ⇒ ∃k∈N (f ∈ E^k), whence PR = E ;
3. E^0 ⊊ E^1 ⊊ E^2 ⊊ E^3 ⊊ E^4 ⊊ · · · ⊊ E ;
4. E^0_⋆ ⊆ E^1_⋆ ⊆ E^2_⋆ ⊊ E^3_⋆ ⊊ E^4_⋆ ⊊ · · · ⊊ E_⋆ ;
5. k ≥ 3 ⇒ E^k = [S, E_k ; bmin] .
q.e.d.

Item (1) as stated is actually a refinement from Rose [Ros84, p. 35]. With respect to item (4), it is well-known that [max, S ; pr_p]_⋆ = [S ; pr_p]_⋆ (see e.g. [Kut87]), so the inclusion of max has no impact on the induced relational classes. It is also a testimony to the importance of [Grz53] in sub-recursion theory* that (various formulations of) item (1) have also spawned the terminology that a function f is called n-bounded if it satisfies f(~x) ≺ E_{n+1}(max(~x), c) for some c ∈ N, which literally means that it is bounded by a function in E^n (see e.g. [Gan84]).
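The functions F, E_k and T_k above can be transcribed directly into Python. The sketch below is this editor's illustration and is feasible only for tiny arguments, since the values grow enormously by design:

```python
def F(x, y):
    """The Ackermann-Peter-style function from the text."""
    if y == 0:
        return x + 1
    if x == 0:
        return F(1, y - 1)
    return F(F(x - 1, y), y - 1)

def E(k, x, y):
    if k == 0:
        return max(x, y)
    if k == 1:
        return x + y
    if k == 2:
        return x * y
    return T(k - 1, x, 1, y)        # E_{k+1}(x, y) = T_k(x, 1, y) for k >= 2

def T(k, x, y, z):
    if k == 0:
        return max(x, y)            # T_0 agrees with E_0 (z is ignored)
    if k == 1:
        return z * x + y
    return y if z == 0 else E(k, x, T(k, x, y, z - 1))

# E(3, x, y) is exponentiation; T(3, x, y, z) is a tower of z x's topped by y.
```

For instance E(3, 2, 5) computes 2^5 and T(3, 2, 1, 3) the tower 2^(2^(2^1)); already E(4, 3, 3) is astronomically large, which is the point of the diagonal argument.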
To further improve intuition about how recursions and majorisation properties to a certain extent determine the computational strength of an idc., consider the following standard definition (see e.g. Clote [Clo96, p. 609]):

Definition 32 (The op1-rank of f in [X ; op1 ∪ op])
Define the op1-rank of a canonical definition** of f ∈ [X ; op1 ∪ op] by:

  rk_{op1}(f) =def { 0 , if f ∈ X ;
                     max_i(rk_{op1}(g_i)) , if f ≡ σ(~g) and σ ∉ op1 ;
                     max_i(rk_{op1}(g_i)) + 1 , if f ≡ σ(~g) and σ ∈ op1 } .

Theorem (Schwichtenberg [Sch69], Müller (see [Clo96, p. 626]))
  n ≥ 2 ⇒ { f ∈ [S ; pr] | rk_pr(f) ≤ n } = E^{n+1} .
q.e.d.

We shall not be concerned with operator-rank hierarchies in this exposition, but we have included this result here in an effort to illustrate some of the possible ways to look at sub-recursion theory, and to give our readers some pointers as to where they can study approaches different from ours.

*Note also the fact that the big-Oh notation, today mostly in use in computer science, is based on a concept reminiscent of 2-boundedness: a function g 'is O(f)' if g is majorised by f · c – viz. E_2(f, c) – for some constant c. This notion is not polynomially robust, but it is E_1-robust: the sum of two big-Oh-of-f functions is itself big-Oh of f.
**At this point the reader might want to review remark 2 (p. 17).

A remark on truncated and limited schemata.

During this section the following notation will be convenient: for X, Y sets of functions, we abbreviate

  X ≼ Y ⇔def ∀f∈X ∃g∈Y (f ≼ g) ,

where the g can be a function of the same arity as f or a unary function. Note also that for X = {f} and Y = {g} we have X ≼ Y ⇔ f ≼ g in the usual sense when f (and g) is unary – recall definitions 1 & 2. Now, we have already proven that if all f in X are a.b., then X^◦ ≼ {id}. That is, if all f ∈ X are a.b., then so are all f ∈ [X ; comp] (where we have made explicit mention of the inclusion of comp in the idc.).
It is also proved in [B08a, Proposition 19] and [B08b, Lemma 1] that [X ; bmin, bcount^1, it^ℓ] ≼ X, when X is a certain initial set of a.b. functions. That this generalises to arbitrary a.b. X, and is extendable to include pr^ℓ as well, is straightforward. The same is clearly not true for the schemata bsum or bcount^{ℓ+1} for ℓ ≥ 1, since e.g.

  #_{z1,z2<y}[I^1_2, I^1_2](y) = y^2  and  bsum(id)(y) = y(y + 1)/2 .

It is also easy to see that [X ; bmin, bcount^1] ≼ X^◦. Of course, any function is majorised by itself, so that X^◦ ≼ X^◦, and any function generated by applying a σ in either bmin or bcount^1 to any ~g will have a strict top index, and as such is clearly majorised by an appropriate projection. This feature is not shared by the primitive recursion-like schemata, and that is also the observation which makes the Grzegorczyk-hierarchy warrant the limited – or, as we could call it, the partial and limited – version of primitive recursion.

Definition 33 (truncated and limited schemata)
Let σ be an ℓ-ary operator such that ar(σ(~g)) = k. Define the (ℓ+1)-ary operator the truncated σ, denoted σ̄, by:

  σ̄(~g, j)(~x) =def min( σ(~g)(~x), j(~x) ) .

That is, σ̄ has domain Dom(σ) × N^{(N^k)}, and σ̄(~g, j) coincides with σ(~g) for those ~x such that σ(~g)(~x) ≤ j(~x). Similarly, define the partial operator the limited σ, denoted σ_p, by:

  σ_p(~g, j) =def { σ(~g) , if σ(~g) ≤ j ;  ↑ , otherwise } .

For a schema op, define the truncated op, denoted \overline{op}, by \overline{op} =def { σ̄ | σ ∈ op }, and the limited op, denoted op_p, by op_p =def { σ_p | σ ∈ op }.

Thus e.g. pr^1_p denotes the schema utilised by Grzegorczyk in his seminal work on the primitive recursive functions. It is all but trivial that [X ; op_p] ≼ X^◦ for any collection of operators op, and this observation makes for the stratification of the primitive recursive functions which he achieves.
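The difference between the truncated and the limited modification of definition 33 can be made concrete in a Python sketch. Everything here, including the toy operator sigma_sum, is this editor's illustration; None models ↑, and the universal bound check of the limited version is only decidable below because we merely test a finite sample of arguments:

```python
def truncated(sigma):
    """sigma-bar: always total, clipped pointwise by the bound j."""
    def op(gs, j):
        f = sigma(gs)
        return lambda *xs: min(f(*xs), j(*xs))
    return op

def limited(sigma, sample):
    """sigma_p: equals sigma(gs) when sigma(gs) <= j everywhere, and is
    undefined (None, our stand-in for the up-arrow) otherwise.  The
    universal check is undecidable in general; testing only the finite
    'sample' makes this an approximation, not the real operator."""
    def op(gs, j):
        f = sigma(gs)
        if all(f(*xs) <= j(*xs) for xs in sample):
            return f
        return None
    return op

# A toy operator (illustrative): pointwise sum of two unary functions.
sigma_sum = lambda gs: (lambda x: gs[0](x) + gs[1](x))
clipped = truncated(sigma_sum)((lambda x: x, lambda x: x), lambda x: 5)
```

The sketch makes the asymmetry of Observation 32 visible: the truncated operator always returns a total function, silently clipped where the bound is violated, while the limited operator refuses to return anything at all in that case.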
The schema comp_p has been studied with some success by Barra & Kristiansen in conjunction with pr^ℓ in [K&B05], and by Wainer in [Wai72]. Also, we shall see some results on the truncated version of bounded counting in the embedded article which comprises most of the next section. However, before we start presenting new results, we round off this introduction/historical review with a few remarks on the truncated and limited schemata.

In fact, both types σ̄ and σ_p may be quite unsatisfactory from a computability point of view. The trouble with the limited version is the fact that it is not total, and that, unless we have more information about σ – for instance some manageable recursive definition of it – we have no way of deciding whether σ_p(~g, j) is a function or an '↑'. The truncated version is unsatisfactory for slightly different reasons. Given that σ_p(~g, j) is undefined, can we predict what σ̄(~g, j) will be, or can we sneak unexpectedly complex predicates into an idc. which uses it? To make this more precise, observe first that the following is trivial, since σ̄ restricted to Dom(σ_p) equals σ_p:

Observation 32  [X ; op_p] ⊆ [X ; \overline{op}] .  q.e.d.

Thus, if non-empty, can we control [X ; \overline{op}]_⋆ \ [X ; op_p]_⋆? It would be convenient to be able to characterise in some way pairs of sets of initial functions X and schemata op where the truncated and the limited versions coincide in the sense [X ; \overline{op}]_⋆ = [X ; op_p]_⋆, and there are many questions about making the limited (partial) version of a schema effective in such a way that one can prove equality of the resulting effective version with the limited one. With regard to the schemata, we could say that a priori, obtaining nice idc.'s by means of standard recursive schemata, total and 'obviously computable', is the most satisfying in terms of ultimately analysing idc.'s as algorithms. From this perspective, the truncated ℓ-fold bounded counting is not so suspicious after all.
Since informally we are counting positive instances of a certain equality, i.e. when the two previously described functions agree on an argument, we know that once we have found y such arguments in a search, we may stop, since this number cannot decrease. This means that we need not continue any sub-computation involving larger numbers in order to infer what the result of the computation in progress will be. This defence of this particular truncated schema, in an idc. with a.b. initial functions, is further supported by our next results, which show that the truncated and the standard schema coincide in computational strength in our small idc.'s.

To conclude, we feel that there are many interesting questions regarding the interrelationship between, and the computability status of, truncated, limited and 'unbounded' schemata, both in general and with regard to particular classical schemata. These questions will have to be answered elsewhere and at a future time (and quite possibly results exist of which we are unaware). What is important here is that the schemata studied here display a straightforward quality in common with the initial functions from section 2.2, and that they also preserve majorisation properties, so as to be useful in chapter 3.

2.3.2 Bounded minimalisation and bounded counting.

This section consists entirely of the embedded article Bounded minimalisation and bounded counting in argument-bounded idc.'s [B09a], which elaborates on, and extends, the research presented in the conference paper A characterisation of the relations definable in Presburger Arithmetic [B08a], and contains proofs omitted in section 2.2.4 related to the function '−̇'. Secondly, several results about small idc.'s based on the schemata bounded counting and truncated bounded counting are included.
The questions asked are motivated by the work of Harrow in [Har73, Har75, Har78] (which answered some problems left open by Grzegorczyk) and by the recurring theme of this chapter: finding a.b. idc.'s equivalent to classical idc.'s with non-a.b. members. This paper also contains definitions of, and some results regarding, the remainder function rem and the integer division function ⌊·/·⌋, which, albeit natural initial functions, have not been given their own sections.

Bounded Minimalisation and Bounded Counting in Argument-bounded Idc.'s*

Mathias Barra
Dept. of Mathematics, University of Oslo, P.B. 1053, Blindern, 0316 Oslo, Norway
georgba@math.uio.no
April 20, 2009

1 Introduction

This paper is based on the talk given by the author at the TAMC 2008 conference, held in Xi'an, China, April 25th–29th, and emerges from some investigations into very small sub-recursive classes, so-called inductively defined classes (idc.'s). At the talk I presented results from the pre-proceedings paper A Characterisation of the Relations Definable in Presburger Arithmetic [Bar08], in addition to various results which either were unfinished at the deadline, or were omitted due to page-number limitations. The original motivation has been to discard all non-argument-bounded (definitions follow) functions from the set of initial functions of idc.'s, and to compare the resulting classes to otherwise similar classes. This approach – banning all growth – has proved successful in the past, and has repeatedly yielded surprising and enlightening results; see e.g. Jones [Jon99, Jon01]; Kristiansen and Voda [K&V03a, K&V03b] (with functionals of higher types, and imperative programming languages respectively); Kristiansen and Barra [K&B05] (with function algebras and λ-calculus); and Kristiansen [Kri05, Kri06].
Recently, argument-bounded idc.'s have found a new use in the context of detour degrees à la Kristiansen & Voda (see [K&V08]), and this author expects that some of the results presented here will be very useful in the further development of that theory. However, to find the source of inspiration for this particular work, we must look further back in time. The seminal 1953 paper by A. Grzegorczyk, Some classes of recursive functions [Grz53], was the source of great inspiration to many researchers during the decades to follow. One significant contribution emerged with Harrow's Ph.D. dissertation Sub-elementary classes of functions and relations [Har73], and his findings were later summarised and enhanced in Small Grzegorczyk classes and limited minimum [Har75]. There he answered several questions – originally posed by Grzegorczyk – with regard to the interchangeability of the schemata bounded primitive recursion and bounded minimalisation in the small Grzegorczyk-classes E i (i = 0, 1, 2). Another result from [Har73] is that G1? (this and other classes mentioned below are defined in the sequel) is identical to the set of predicates PrA? – those subsets of Nk which are definable by a formula in the language of Presburger Arithmetic. We will show that the classes Gi contain redundancies, in the sense that the increasing functions 'S', '+' and '×' can be substituted with their argument-bounded inverses predecessor, (truncated) difference, and integer division and remainder, without affecting the induced relational classes Gi?. That is, the growth provided by e.g. addition, in the restricted framework of composition and bounded minimalisation, does not contribute to the number of computable predicates. In fact, we show that the quantifier-free fragment of Presburger Arithmetic may be captured in a much weaker system: essentially only truncated difference and composition are necessary.

∗ This research is supported by a grant from the Norwegian Research Council.
Next, we investigate the seemingly stronger schemata of bounded counting and bounded n-ary counting, and show that an analogous result holds. Indeed, with bounded counting, not only are the increasing functions substitutable for their argument-bounded inverses – they are redundant altogether.

2 Notation and basic definitions

Unless otherwise specified, a function means a function f : Nk → N; the arity of f is then k. A function is argument-bounded¹ (a.b.) if, for some cf ∈ N, we have² f(~x) ≤ max(~x, cf) for all ~x ∈ Nk. We say that f has top-index i if f(~x) ≤ max(xi, cf). If cf = 0, we say that f is strictly argument-bounded, and that i is a strict top-index. Whenever a symbol 'x' occurs under an arrow, e.g. '~x', we usually do not point out the length of the list. We adopt the convention that ~x has length k, and ~g has length ℓ.

The bounded f, denoted f̂, is the (k+1)-ary function f̂(~x, b) =def min(f(~x), b). These bounded versions, in particular the bounded versions of increasing functions like Ŝ(x, b) = min(x + 1, b) (bounded successor), will be of major importance for the ensuing developments. The predecessor, denoted P, is defined by P(x) =def max(x − 1, 0). The case function, denoted C, and the (truncated) difference function, denoted −̇, are defined by:

C(x, y, z) =def x, if z = 0; y, else,   and   x −̇ y =def max(x − y, 0) = 0, if x ≤ y; x − y, if x > y.

When we use the symbol '−' without the dot in an expression or formula, we mean the usual minus on Z.

Let φ(x, y, n, r) ⇔def 0 ≤ r < y ∧ x = ny + r.

¹ In [Bar08] we employed the term nonincreasing rather than argument-bounded. Following the advice of one of the referees, we change it here.
² For the reader familiar with E 0, the bound which holds for f in G0 and E 0 is f(~x) ≤ max(~x) + cf – note the distinction. The latter bound is sometimes referred to as 0-boundedness.
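The base functions above are simple enough to sketch directly; the following minimal Python definitions (the helper names are mine, not the paper's) mirror the text.

```python
def monus(x, y):
    """Truncated difference x -. y = max(x - y, 0)."""
    return max(x - y, 0)

def P(x):
    """Predecessor: P(x) = max(x - 1, 0)."""
    return max(x - 1, 0)

def C(x, y, z):
    """Case function: x if z = 0, else y."""
    return x if z == 0 else y

def bounded(f):
    """The 'bounded f': f^(xs..., b) = min(f(xs...), b)."""
    return lambda *args: min(f(*args[:-1]), args[-1])

# Bounded successor S^(x, b) = min(x + 1, b)
S_hat = bounded(lambda x: x + 1)
```

Note that `monus`, `P` and `C` are all argument-bounded in the paper's sense, with cf = 0.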
The remainder function and the integer division function, denoted rem and ⌊·/·⌋ respectively, are defined by:

rem(x, y) =def x, if y = 0; r, if φ(x, y, n, r),   and   ⌊x/y⌋ =def x, if y = 0; n, if φ(x, y, n, r).

The choice of ⌊x/0⌋ = x makes the functions total on N². Secondly, we have ⌊x/y⌋·y + rem(x, y) = x for all x and y.

I is the set of all projections Iki(~x) = xi, and N is the set of all constant functions c(x) = c for c ∈ N.

A relation is a subset R of Nk for some k. Relations are interchangeably called predicates. Sets of predicates are usually sub-scripted with a '?'. For a set F? of relations, we say that F? is Boolean when F? is closed under finite intersections and complements.

When R = f⁻¹(0) =def {~x ∈ Nk | f(~x) = 0}, the function f is referred to as a characteristic function for R, and is denoted χR. This function is not unique, and we denote by χcR the unique characteristic function for R which satisfies χcR(~x) = c when ~x ∉ R. Let F be a set of functions. F? denotes the set of relations of F, viz. those subsets R ⊆ Nk with χR ∈ F; formally F? =def {f⁻¹(0) | f ∈ F}. The graph of f, denoted Γf, is the relation {(~x, y) ∈ Nk+1 | f(~x) = y}. We overload Γf to also denote its characteristic function. When A is a set, |A| denotes the cardinality of A, and f↾A denotes the restriction of the function f to the set A.

3 Schemata, Idc.'s and Overview

In this section the stage is set for the ensuing developments. We first introduce our most fundamental notions: schemata, or operations, for defining new functions from previously defined functions, and inductively defined classes, our computational model. Next we give a pointer as to what kind of results to expect. Since quite a lot of notation will be needed, we do not give an overview of the results until the last section, lest they be either too cumbersome to state or incomprehensible.
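The ⌊x/0⌋ = x and rem(x, 0) = x conventions can be checked against the stated identity ⌊x/y⌋·y + rem(x, y) = x with a small Python sketch (function names are mine):

```python
def div(x, y):
    """Integer division with the paper's totality convention: div(x, 0) = x."""
    return x if y == 0 else x // y

def rem(x, y):
    """Remainder with the matching convention rem(x, 0) = x."""
    return x if y == 0 else x % y
```

For y = 0 the identity holds because div(x, 0)·0 + rem(x, 0) = 0 + x = x.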
We thereafter prove a few useful lemmata, before introducing, section by section, our classes, our main results and their proofs.

3.1 Schemata.

In this paper we are concerned with the following schemata:

Definition 1 (Composition and bounded minimalisation) We say that f is generated from h and ~g by composition when f(~x) =def h(g1(~x), ..., gℓ(~x)). The schema of composition will be denoted comp, and we also write h ◦ ~g for the generated function.

We say that f is generated from g1 and g2 by bounded minimalisation³ when f(~x, y) equals the least z ≤ y satisfying the equation g1(~x, z) = g2(~x, z) – if such exists – and y else. The schema is denoted bmin, and we write µz≤y[g1(~x, z) = g2(~x, z)], or simply µz≤y[g1, g2], for the generated function.

In [Bar08] we considered a slightly different version of the bounded minimalisation schema. There, a failed search would return 0 rather than y. In most contexts the two versions are equivalent, but in some very restricted settings this may not be the case. We shall have more to say on this in section 4.3.

The third and fourth schemata we will study are actually families of schemata: for each n ≥ 1 we define the schema of n-ary bounded counting, and the schema of argument-bounded n-ary bounded counting:

Definition 2 (n-ary bounded counting) The function f is generated from g1(~x, ~z) and g2(~x, ~z) – where |~z| = n – by n-ary bounded counting when:

f(~x, y) =def |{~z | max(~z) < y ∧ g1(~x, ~z) = g2(~x, ~z)}|

The schema is denoted bcountn, and we write ]~z<y[g1(~x, ~z) = g2(~x, ~z)], or ]~z<y[g1, g2], for the generated function.

The function f is generated from g1(~x, ~z) and g2(~x, ~z) – where |~z| = n – by argument-bounded n-ary bounded counting when:

f(~x, y) =def min(y, |{~z | max(~z) < y ∧ g1(~x, ~z) = g2(~x, ~z)}|)

The schema is denoted b̄countn (the overline marking the argument-bounded variant), and we write ]̄~z<y[g1, g2] for the generated function.

3.2 Inductively defined classes.
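The three schemata of Definitions 1 and 2 can be sketched as higher-order operators in Python (a sketch under my naming, not the paper's; brute-force search stands in for the schema semantics):

```python
from itertools import product

def bmin(g1, g2):
    """mu_{z<=y}[g1(xs, z) = g2(xs, z)]: least z <= y where g1, g2 agree, else y."""
    def f(*args):
        *xs, y = args
        for z in range(y + 1):
            if g1(*xs, z) == g2(*xs, z):
                return z
        return y
    return f

def bcount(g1, g2, n):
    """n-ary bounded counting: |{zs in [0,y)^n : g1(xs, zs) = g2(xs, zs)}|."""
    def f(*args):
        *xs, y = args
        return sum(1 for zs in product(range(y), repeat=n)
                   if g1(*xs, *zs) == g2(*xs, *zs))
    return f

def abcount(g1, g2, n):
    """Argument-bounded variant: the same count, capped at y."""
    g = bcount(g1, g2, n)
    return lambda *args: min(args[-1], g(*args))
```

For n = 1 the count is at most y, so `bcount` and `abcount` coincide, as the paper observes in section 8.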
A notion fundamental to the work presented here is that of an inductively defined class of functions (idc.). An idc. is generated from a set X, called the initial, primitive or basic functions, as the least class containing X and closed under the schemata, functionals or operations of some set op of functionals. We write [X; op] for this class⁴. We will always assume that our classes contain the projections and constants, and that they are closed under composition, and we will omit I ∪ N and comp from our notation. Thus e.g. [{C}; bmin] abbreviates [I ∪ N ∪ {C}; comp, bmin]. This simply means that our idc.'s are closed under so-called explicit definitions.

Remark 1 The careful reader will notice that none of the proofs presented depend on the presence of any constants other than 0 and 1. Consequently, all results regarding the induced relational classes remain valid if I ∪ {0, 1} is substituted for I ∪ N, and all other results will also hold after minor modifications of some definitions. Informally, all results will hold almost everywhere. Whether to choose N or {0, 1} is largely a matter of style, and this author prefers the straightforwardness afforded by having N available over the (unnecessary) exception-handling incurred by the more minimalistic {0, 1}.

3.3 Results.

The forthcoming results fall roughly into one of three categories: (1) The first type of results concerns the strength of our schemata working on their own, viz. we investigate idc.'s of the type [ ; op], where op is bounded minimalisation or bounded counting. We also investigate the strength of the truncated difference function in the context [{−̇}; ].

³ a.k.a. bounded search, a.k.a. limited minimum.
⁴ This notation is adopted from Clote [Clo96], where an idc. is called a function algebra.
(2) The second category of results is concerned with the redundancy of basic functions when the (only) operator is counting, and with the equivalence of basic functions with their argument-bounded inverses when the operator is minimalisation. (3) The third kind of results are descriptive-complexity-like characterisations of the induced relational classes. That is, we characterise them by well-known fragments of 1st-order logics. We will introduce these fragments and suitable notation on the fly.

3.4 Preliminary lemmata.

Lemma 3 Let G be any idc. closed under bmin, and let f(~x) be any function satisfying either f(~x) ≤ xi or f(~x) ≤ c. Then: Γf ∈ G ⇒ f ∈ G.

proof. That Γf ∈ G means that some function Γf ∈ G satisfies Γf(~x, y) = 0 ⇔ f(~x) = y. By hypothesis we either have f(~x) ≤ xi for fixed i – in which case f(~x) = µz≤xi[Γf(~x, z) = 0] ∈ G – or we have f(~x) ≤ c for fixed c – in which case f(~x) = µz≤c[Γf(~x, z) = 0] ∈ G.

Lemma 4 Let G and G′ be arbitrary idc.'s. Assume χ= ∈ G, and that for every f ∈ G′ we have Γf ∈ G. Then G′? ⊆ G?.

proof. Let f′ ∈ G′ be arbitrary. We must show that the predicate R =def (f′)⁻¹(0) is also the pre-image of some function f ∈ G. By hypothesis χ=, Γf′ ∈ G, and thus so is the function f =def χ=(Γf′(~x, 0), 0). That

f(~x) = 0 ⇔ χ=(Γf′(~x, 0), 0) = 0 ⇔ Γf′(~x, 0) = 0 ⇔ f′(~x) = 0,

concludes the proof.

Lemma 5 Let h, ~g be argument-bounded. Then so are the functions h ◦ ~g, µz≤y[g1, g2] and ]̄~z<y[g1, g2]. Hence, if all f ∈ X are a.b., then all f ∈ [X; op] are a.b. when op is any of comp, bmin or b̄countn.

proof. By definition, for some ch, c1, ..., cℓ we have

h ◦ ~g(~x) ≤ max(g1(~x), ..., gℓ(~x), ch) ≤ max(~x, max(~c, ch)),

which proves the case of composition. The two remaining cases are trivial: µz≤y[g1, g2], ]̄~z<y[g1, g2] ≤ y by definition.

4 The class Fµ

The first class we will consider is the class Fµ =def [ ; bmin].
Note that there are no initial functions except the projections and constants.

4.1 Bootstrapping with bmin.

The following represents a recurrent theme of this paper: bootstrapping, or establishing the existence of various functions in our small idc.'s.

Proposition 6 We have that (i) χ=, χ=n ∈ Fµ; (ii) min ∈ Fµ; (iii) Fµ? is Boolean; (iv) Fµ? is closed under ∃z≤y-type quantifiers; (v) χ<, χ≤, χ≠ ∈ Fµ.

proof. Recall that χ1R denotes the characteristic function of R which is 1 on the complement. Set

f(x1, x2, y) = µz≤y[I31(x1, x2, z) = I32(x1, x2, z)].

Then χc=(x1, x2) = f(x1, x2, c), which is 0 if x1 = x2 and c else; since χ=n(x) = χc=(x, n), this proves (i). Clearly min(x, y) = µz≤y[x = z], which proves (ii). Hence, if χR ∈ Fµ, then so is χ1R(~x) = min(χR(~x), 1), and also χcR = µz≤c[χR(~x) = 0]. We now obtain (iii) by:

χcR∧S = χc= ◦ (χ1R, χ2S)   and   χc¬R = χc= ◦ (χ1R, 1).

Hence χc≠ ∈ Fµ. (iv) is especially easy to show; bounded minimalisation is tailor-made for the purpose:

∃z≤y R(~x, z) ⇔ χR(~x, µz≤y[χR(~x, z) = 0]) = 0.

Now (v) follows: x ≤ y ⇔ ∃z≤y (x = z) and x < y ⇔ ¬∃z≤x (y = z), and Fµ? is Boolean by (iii).

We can also do some argument-bounded basic arithmetic:

Proposition 7 Ŝ, P ∈ Fµ.

proof. Observe that Ŝ(x, b) =def min(x + 1, b) = µz≤b[χ<(x, z) = 0]. Next, set P0(x, y) = µz≤y[Ŝ(z, x) = x]. Then P(x) = P0(x, x) = µz≤x[Ŝ(z, x) = x].

Proposition 8 Ĉ ∈ Fµ.

proof. Since Ĉ(x, y, z, b) has strict top-index b, by lemma 3 it is sufficient to show that the graph of Ĉ belongs to Fµ?. But Ĉ(x, y, z, b) = u holds if and only if

(z = 0 ∧ x ≤ b ∧ u = x) ∨ (z ≠ 0 ∧ y ≤ b ∧ u = y) ∨ (z = 0 ∧ b ≤ x ∧ u = b) ∨ (z ≠ 0 ∧ b ≤ y ∧ u = b),

and the formula to the right is a Fµ?-formula by proposition 6.

4.2 Fµ vs. the classes G0 and PL.

Let G0 =def [{P, S}; bmin0] – where the superscript '0' on 'bmin' indicates that the schema in mind is the variant of bounded minimalisation mentioned above – a class originally defined by Grzegorczyk in [Grz53].
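Proposition 7's identities can be checked mechanically. The sketch below (helper names mine; `bmin` is a brute-force stand-in for the schema of Definition 1) derives Ŝ and P exactly as in the proof, using the paper's 0-means-true convention for characteristic functions:

```python
def bmin(g1, g2):
    """mu_{z<=y}[g1 = g2], returning y on a failed search (Definition 1)."""
    def f(*args):
        *xs, y = args
        for z in range(y + 1):
            if g1(*xs, z) == g2(*xs, z):
                return z
        return y
    return f

# chi_<(x, z) is 0 exactly when x < z (0-means-true convention)
chi_lt = lambda x, z: 0 if x < z else 1
zero   = lambda *args: 0

# S^(x, b) = mu_{z<=b}[chi_<(x, z) = 0] = min(x + 1, b)
S_hat = bmin(chi_lt, zero)

# P(x) = mu_{z<=x}[S^(z, x) = x]: the search finds the z with min(z + 1, x) = x
P = lambda x: bmin(lambda x_, z: S_hat(z, x_), lambda x_, z: x_)(x, x)
```

For x = 0 the search for P fails immediately and returns the bound 0, which is the correct value P(0) = 0.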
Hence G0 is the smallest Grzegorczyk-class E 0 with bounded minimalisation substituted for bounded primitive recursion. Grzegorczyk posed as open the problem of whether the inclusion G0 ⊆ E 0 was proper or not. Some twenty years later K. Harrow [Har75] answered the question in the negative by proving that:

Theorem (Harrow) G0 = E 0 ∩ PL.

Above, PL is the set of piecewise linear functions; see definition 9 below. Part of my motivation for this research has been to find classes of argument-bounded functions which characterise previously studied classes. The reader will have noticed that Fµ is 'G0 without successor', and that all f ∈ Fµ are argument-bounded (lemma 5). Our question is thus: how do G0 and Fµ compare?

Definition 9 (piecewise linear) A function is piecewise linear if it may be written on the form

f(~x) = y ⇔ ∨_{1≤i≤ℓ} (y = L3i(~x) ∧ L3i+1 ≤ L3i+2),

for some ℓ ∈ N, where each sub-scripted 'L' is either a constant or of the form xj −̇ c or xj + c. Note that there are finitely many clauses⁵.

Lemma 10 f ∈ PL ⇒ Γf ∈ Fµ.

proof. We have already shown that χ≤, P ∈ Fµ. By nesting the predecessor function, we have that the function x −̇ c is in Fµ for all (fixed) c ∈ N. Observe next that for x, y ∈ N we have x ≤ y + c ⇔ x −̇ c ≤ y, since x < c implies both x −̇ c = 0 ≤ y and x < c + y, and since c ≤ x implies x −̇ c = x − c. We also have the function Ŝ ∈ Fµ, so by nesting we obtain min(x + c, y) ∈ Fµ. Furthermore, x + c ≤ y ⇔ min(x + (c − 1), y) ≠ min(x + c, y). The above means we can decide any predicate of the form L3i+1 ≤ L3i+2. Let f be represented by clauses L3i, L3i+1 ≤ L3i+2 for i ≤ ℓ. Clearly

f(~x) = y ⇔ ∨_{1≤i≤ℓ} (y = L3i ∧ L3i+1 ≤ L3i+2),

which, by the above, is a Fµ?-predicate.

Theorem 11 Fµ? = G0? = PL? and Fµ ⊊ G0.

proof. Since Fµ ⊆ G0 ⊆ PL, we need only show that PL? ⊆ Fµ?. Since χ= ∈ Fµ by proposition 6, the theorem now follows from lemmata 4 & 10. That Fµ ⊊ G0 as function-classes is clear: S ∈ G0, yet S(x) > x is not a.b.
4.3 A remark on the failed-search value of bmin.

As remarked earlier, in [Bar08] we employed a µz≤y-schema which returned the value '0' upon a failed search rather than the value 'y'. During this discussion, denote these variants µ0z≤y and µyz≤y respectively, let Fµ0 and Fµy have the obvious meaning, and also define Cµ0 =def [{Ĉ}; µ0z≤y].

Now, it is quite easy to see that Fµ0 ⊆ Fµy:

µ0z≤y[g1, g2] = Ĉ(µyz≤y[g1, g2], 0, χ=(g1(~x, µyz≤y[g1, g2]), g2(~x, µyz≤y[g1, g2])), y),

and the function to the right belongs to Fµy, since it involves composing functions which have already been shown to belong to Fµy.

By verifying that Ĉ(1, 0, x, 1) = 1 −̇ x, and that Ĉ(0, Ĉ(0, 1, y, 1), x, 1) is zero exactly when either x or y is zero, we conclude that Cµ0? is Boolean. Also, that

µ0z≤y[z = x] = x, if x ≤ y; 0, otherwise,

implies χ≤(x, y) = Ĉ(0, 1 −̇ µ0z≤y[z = x], x, 1) ∈ Cµ0. Since Cµ0? is Boolean, we also have characteristic functions for the other standard order-predicates. Since

µyz≤y[g1, g2] = Ĉ(µ0z≤y[g1, g2], y, χ=(g1(~x, µ0z≤y[g1, g2]), g2(~x, µ0z≤y[g1, g2])), y),

we also see that Fµy = Cµ0. This means that:

Theorem 12 Fµ0 ⊆ Fµy = Cµ0.

The status of the inclusion above remains open. I conjectured in [Bar08] that it is proper, but a proof is lacking. The class Fµ0 seems to be very ill-behaved in the sense that if f(~x) ∈ Fµy, then there is some function f′(~x, ~y) ∈ Fµ0 and a c such that f′(~x, ~y) = f(~x) when max(~x) + c < min(~y), but where the function degenerates otherwise.

⁵ E.g. f(x1, x2) = 5, if x1 ≤ x2 ∧ 3 ≤ x1; x2 + 4, else, is a member of PL, since the 'else clause' can be split into the two clauses x2 + 1 ≤ x1 and x1 ≤ 2.

5 The class D

In this section we study the class D =def [{−̇}; ]. The reader is asked to note that composition is the sole closure operation of D.

5.1 Bootstrapping with −̇.
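The first interdefinability identity of section 4.3 can be verified exhaustively for small arguments; the Python sketch below (names mine) implements both failed-search conventions and the Ĉ-based conversion:

```python
def C_hat(x, y, z, b):
    """C^(x, y, z, b) = min(C(x, y, z), b)."""
    return min(x if z == 0 else y, b)

chi_eq = lambda a, b: 0 if a == b else 1   # a characteristic function for '='

def mu_y(g1, g2, xs, y):
    """bmin returning the bound y on a failed search."""
    for z in range(y + 1):
        if g1(*xs, z) == g2(*xs, z):
            return z
    return y

def mu_0(g1, g2, xs, y):
    """The [Bar08] variant, returning 0 on a failed search."""
    for z in range(y + 1):
        if g1(*xs, z) == g2(*xs, z):
            return z
    return 0

def mu_0_via_mu_y(g1, g2, xs, y):
    """Section 4.3's conversion: C^(m, 0, chi_=(g1(xs, m), g2(xs, m)), y)."""
    m = mu_y(g1, g2, xs, y)
    return C_hat(m, 0, chi_eq(g1(*xs, m), g2(*xs, m)), y)
```

On a successful search the case function passes the found value through; on a failed search g1 and g2 disagree at m = y, so the case function yields 0.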
The lemma below is the starting point for showing that the class D is surprisingly powerful.

Lemma 13 (bounded addition) The function min(x + y, z) belongs to D.

proof. Set f(x, y, z) =def z −̇ ((z −̇ x) −̇ y). If x + y ≥ z, then (z −̇ x) −̇ y = 0, which yields z −̇ ((z −̇ x) −̇ y) = z −̇ 0 = z. On the other hand, if z > x + y ≥ x, then z −̇ x = z − x > y. Hence (z −̇ x) −̇ y = (z − x) − y > 0. But now 0 < z − (x + y) ≤ z, and so

z −̇ ((z −̇ x) −̇ y) = z − (z − (x + y)) = z − z + (x + y) = x + y.

This function is the key to proving several properties of D and D?.

Proposition 14 We have that (i) min(x, y) ∈ D; (ii) D? is Boolean; (iii) χ=, χ< ∈ D; (iv) Γmax, ΓC ∈ D; (v) if A or Nk \ A is finite, then A ∈ D?.

proof. Clearly min(x, y) = min(x + 0, y), thus (i). Next, given χR1 and χR2, we have

χR1∩R2 = min(χR1 + χR2, 1)   and   χNk\R = 1 −̇ χ1R,

hence (ii). Since x −̇ y = 0 ⇔ x ≤ y and y −̇ x = 0 ⇔ y ≤ x, (iii) follows in conjunction with (ii). Next,

max(x, y) = z ⇔ (y ≤ x ∧ z = x) ∨ (x < y ∧ z = y).

Also, C(x, y, z) = w ⇔ (z = 0 ∧ w = x) ∨ (z > 0 ∧ w = y), hence (iv). Finally, (v) follows from (ii), (iii) and the observation that any singleton {n} has its characteristic function in D via χ{n}(x) = χ=(x, n).

Armed with proposition 14, we may prove the following lemma:

Lemma 15 Let r ∈ {<, =}, and let 1 ≤ j < k ∈ N be arbitrary. Then the following relations belong to D?:

x1 + · · · + xj r xj+1 + · · · + xk.

proof. Observe first that the function f(x, ~y) = x −̇ (y1 + · · · + yk) = x −̇ y1 −̇ · · · −̇ yk belongs to D, by k consecutive applications of composition. Note also that when R(~x) ∈ D? and f~ ∈ D, the relation S(~y) defined by S(~y) ⇔def R(f1(~y), ..., fk(~y)) belongs to D?. We prove the lemma by induction on k; proposition 14 constitutes the induction start. Note that

x1 + · · · + xj = xj+1 + · · · + xk ⇔ ¬(x1 + · · · + xj < xj+1 + · · · + xk) ∧ ¬(xj+1 + · · · + xk < x1 + · · · + xj).

Hence, it is sufficient to perform the induction step when r is '<'. Moreover, as D?
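Lemma 13's identity is easy to test exhaustively for small arguments; a minimal Python sketch (names mine):

```python
def monus(a, b):
    """Truncated difference a -. b = max(a - b, 0)."""
    return max(a - b, 0)

def bounded_add(x, y, z):
    """Lemma 13: min(x + y, z) = z -. ((z -. x) -. y), using only -. and composition."""
    return monus(z, monus(monus(z, x), y))
```

The design point is that all growth stays below the third argument z, so the defined function is argument-bounded even though it simulates addition.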
is Boolean, we may invoke the induction hypothesis (i.h.) for r ∈ {≥, ≤, <, >, =}. We proceed to the case k + 1, with 2 ≤ k, and we shall first prove the special case j = k. Clearly

x1 + · · · + xk < xk+1 ⇔ 0 < xk+1 −̇ x1 −̇ · · · −̇ xk.

This shows that x1 + · · · + xk < xk+1 ∈ D? by the initial remarks. Next consider the general case 1 ≤ j < k:

x1 + · · · + xj < xj+1 + · · · + xk + xk+1 ⇔ (x1 + · · · + xj < xk+1) ∨ (ψ ∧ φ),

where ψ is the conjunct xk+1 ≤ x1 + · · · + xj, and φ is the conjunct (x1 + · · · + xj) −̇ xk+1 < xj+1 + · · · + xk. The conjunct marked φ needs special attention. Consider the function

gj(x1, ..., xj, xk+1) = (x1 + · · · + xj) −̇ xk+1.

Now, gj is not in D; however, we see that

gj(~x, y) = (x1 −̇ y) + (x2 −̇ (y −̇ x1)) + · · · + (xj −̇ (y −̇ x1 −̇ x2 −̇ · · · −̇ xj−1)).   (†)

Furthermore, when ψ is true, we have gj(~x, y) = x1 + · · · + xj − y. Importantly for us, the expression (†) is a sum of j summands, each summand being a D-function of the variables involved. Hence

ψ ∧ φ ⇔ ψ ∧ φ′,

where φ′ is the comparison of the sum (†) with xj+1 + · · · + xk; since φ′ is in D? by the i.h. and the initial remarks, this concludes the proof when r is '<'.

5.2 Presburger Arithmetic and D?.

Let PrA be the 1st-order language {0, S, +, <, =} with the intended structure N =def (N, 0, S, +, <) – the natural numbers with the usual order, successor and addition. Terms, (atomic) formulae, and the numerals m̄ are defined in the standard way. As usual we use abbreviations extensively, e.g. t ≤ s for t = s ∨ t < s. Many readers will no doubt have recognised PrA as the language of Presburger Arithmetic; see e.g. Enderton [End72, p. 188]. It is known that the theory of N is decidable (we return to this point in section 6), and that it does not admit quantifier elimination. We overload the symbol PrA to also denote the set of PrA-formulae. For φ(~x) ∈ PrA, define Rφ ⊆ Nk by Rφ =def {m~ ∈ Nk | N |= φ(m~)}, and set PrA? =def {Rφ | φ ∈ PrA}. PrAqf denotes the set of quantifier-free PrA-formulae; PrA∆0 the ∆0-formulae. PrAqf? and PrA∆0?
are defined as expected.

Any PrA-term t is clearly equivalent to some term Σi≤k ai·xi + m̄, which is shorthand for

(x1 + · · · + x1) + · · · + (xk + · · · + xk) + m̄,

with a1 copies of x1, ..., ak copies of xk. Thus, any atomic formula φ(~x) ∈ PrA is equivalent to a formula of the form⁶

Σ ai·xi + a r Σ bi·yi + b.   (‡)

The main result of this section is

Theorem 16 PrAqf? = D?.

The previous section's lemma 15 provides us with most of the proof of PrAqf? ⊆ D?.

proof. (of PrAqf? ⊆ D?) Let R ∈ PrAqf?. Then, by definition, R = Rφ for some φ(~x) ∈ PrAqf. Since D? is Boolean, it is sufficient to prove the claim for φ atomic. Let φ(~x) be atomic; hence it is of the form specified in (‡) above. If we let R be the relation

R(~x, ~y, z, w) ⇔def Σ ai·xi + z r Σ bi·yi + w,

we have by lemma 15 that χR ∈ D. But then

χR(~n, m~, a, b) = 0 ⇔ Σ ai·ni + a r Σ bi·mi + b ⇔ N |= φ(~n, m~),

and so χR(~x, ~y, a, b) = χRφ. Hence χRφ ∈ D.

To facilitate the proof of the opposite inclusion, we first consider the language PrA−̇ =def PrA ∪ {−̇, −}, viz. PrA augmented with new function symbols '−̇' and '−', and with intended model Z =def (Z, 0, S, +, −̇, −, <). Note that −̇ is well-defined on all of Z² by its original definition max(x − y, 0). We also remark that for variable-free φ ∈ PrA we have Z |= φ ⇔ N |= φ. Secondly, to every function f ∈ D of arity k there is a PrA−̇-term tf(~x) such that, for all ~n, m ∈ N, we have

Z |= tf(~n) = m̄ ⇔ f(~n) = m.

Lemma 17 Let φ(~x) ∈ PrA−̇ be atomic. Then there is a φ′(~x) ∈ PrAqf such that, for all ~n ∈ N, we have N |= φ′(~n) ⇔ Z |= φ(~n).

proof. For a PrA−̇-term t, we define rk−̇(t) to be the number of occurrences of the symbol −̇ in t. Note that when φ ∈ PrA−̇ is atomic it is of the form t1 r t2, and that rk−̇(t1) + rk−̇(t2) = 0 implies that either φ ∈ PrAqf, or it is equivalent to some φ′ ∈ PrAqf by basic arithmetical considerations.

⁶ We continue to use r as a meta-variable, ranging over {<, =} if not otherwise specified.
The proof is by induction on rk−̇(t1) + rk−̇(t2); the comments above effect the induction start.

Induction step: Let rk−̇(t1) + rk−̇(t2) = ℓ + 1. At least one of the ti must contain a sub-term s of the form s1 −̇ s2 with rk−̇(s) = 1, i.e. s may be chosen to satisfy rk−̇(s1) = rk−̇(s2) = 0. Next, consider the terms defined by ti− =def ti[s := s1 − s2] and ti0 =def ti[s := 0], where ti[s := s′] denotes the result of substituting s′ for all occurrences of the sub-term s. We note that rk−̇(ti−) = rk−̇(ti0) < rk−̇(ti) for at least one i, since we have removed at least one occurrence of −̇ from one of the ti in each construction. Thus we have rk−̇(t1−) + rk−̇(t2−) = rk−̇(t10) + rk−̇(t20) ≤ ℓ, and moreover

t1(~x) r t2(~x) ⇔ (s1(~x) < s2(~x) ∧ t10(~x) r t20(~x)) ∨ (s2(~x) ≤ s1(~x) ∧ t1−(~x) r t2−(~x)).

By the i.h., the relation t1(~x) r t2(~x) is in PrAqf?, since each disjunct above is a conjunction of atomic PrA−̇-formulae of rank strictly less than ℓ + 1.

We are now ready to finish the proof of theorem 16.

proof. (of D? ⊆ PrAqf?) Let R ∈ D?. Then R = f⁻¹(0) for some f ∈ D of arity k. Next, fix a PrA−̇-term tf such that Z |= tf(~x) = y iff f(~x) = y. Apply lemma 17 to obtain φ ∈ PrAqf satisfying N |= φ(~x, y) ⇔ Z |= tf(~x) = y. But then also ~x ∈ f⁻¹(0) ⇔ N |= φ(~x, 0), and thus f⁻¹(0) = Rφ ∈ PrAqf?.

Considering that no increasing functions are available in D – and, perhaps even more striking, that composition is the only schema – the class D really delivers more than one would at first glance expect⁷. It is also of interest that, when only composition is available, none of the standard linear initial functions add anything to D?. More precisely:

Theorem 18 [{max, min, C, S, P, +, −̇}; ]? = D?.

proof. It is sufficient to show that [I ∪ N ∪ {C, −̇, +}; comp]? ⊆ PrAqf?

⁷ In contrast, e.g. [{C}; ]? essentially consists of ∅, {0}, N\{0} and N (and their products), and most other familiar functions, like S, P or +, even fail to produce a Boolean set of relations.
, since max, min, S and P are clearly definable in this idc. To this end, augment the language PrA with function symbols '−̇', 'C' and '−', and define a (−̇, C)-rank analogously to the '−̇'-rank defined above. The proof is now almost identical to that of lemma 17, except that in the induction step we must also allow for the case where the minimal positive-rank sub-term is C(s0, s1, s2) rather than s1 −̇ s2. We obtain:

Induction step (case s ≡ C(s0, s1, s2)): Then

t1(~x) r t2(~x) ⇔ (s2(~x) = 0 ∧ t1[s := s0](~x) r t2[s := s0](~x)) ∨ (s2(~x) > 0 ∧ t1[s := s1](~x) r t2[s := s1](~x)).

As before, the formula to the right has strictly smaller rank than the formula to the left, and we are done. We need not worry about '+'; it is already accommodated in the language PrA.

6 The class Dµ

In this section we turn our attention to the class Dµ =def [{−̇}; bmin], that is, the merging of D and Fµ. Obviously D, Fµ ⊆ Dµ, and all of the bootstrapping needed has already been carried out in the previous sections. Indeed, the first thing we do is prove a proposition which limits the possible candidates for membership in Dµ. We prove a slightly stronger version than we need – we include b̄countn – so that we may re-use the result later on.

Proposition 19 (top-index) Let f ∈ [{−̇}; bmin, b̄countn]. Then f has a top-index. Furthermore, if f(Nk) is infinite, then the top-index is strict.

proof. By induction on f. For the induction start, i.e. f ∈ I ∪ N ∪ {−̇}, the result is obvious: constants are bounded by themselves, Iki(~x) ≤ xi and x −̇ y ≤ x. For the induction step we must analyse three cases:

Case (1) – f = h ◦ ~g: We have that f is bounded by the xi such that i is the top-index of gj, where j is the top-index of h.
The cf may be fixed to max(ch, ~c). A prerequisite for f(Nk) to be infinite is that h(Nℓ) is infinite; hence j is a strict top-index for h by the i.h. Secondly, since h(~y) ≤ yj, if f is to have infinite image then gj must have infinite image, so a second appeal to the i.h. implies that i is a strict top-index for gj. But then i is a strict top-index for f.

The remaining two cases are (2) f = µz≤y[g1, g2] and (3) f = ]̄~z<y[g1, g2]. In both cases f has strict top-index y by definition.

6.1 Presburger Arithmetic and Dµ?.

Define PrA∆V as the set of PrA-formulae where all quantifiers occur in the context ∃z (z ≤ y ∧ φ), which we abbreviate ∃z≤y φ. The 'V' in '∆V' is meant to reflect the requirement that the quantified variable must be bounded by a variable, and not by a general term; this is also known as finite or linear quantification.

Lemma 20 If f ∈ Dµ, then Γf ∈ PrA∆V?.

proof. By induction on f. That D? = PrAqf? effects the induction start.

Case f = h ◦ ~g: Let ij (for 1 ≤ j ≤ ℓ) be the strict top-index of gj if gj(Nk) is infinite, and max gj(Nk) otherwise. Fix PrA∆V-formulae φh(~z, y) and φj(~x, zj) representing the graphs of h and of the gj's respectively. Then

(h ◦ ~g)(~x) = y ⇔ ∃z1≤t1 · · · ∃zℓ≤tℓ (φ1(~x, z1) ∧ · · · ∧ φℓ(~x, zℓ) ∧ φh(~z, y)),

where tj is the variable xij for unbounded gj, and the constant ij else. Clearly, quantification bounded by a constant is merely a finite disjunction, and as such even PrAqf? is closed under ∃x≤c-type quantifiers.

Case f = µz≤y[g1, g2]: By the i.h. and proposition 19, the predicate g1(~x, z) = g2(~x, z) is represented by some ψ(~x, z) ∈ PrA∆V (the common value may be bounded by a variable). Define

φ(~x, y, w) ⇔ (w = y ∧ ¬∃z≤y ψ(~x, z)) ∨ ∃z≤y (w = z ∧ ψ(~x, z) ∧ ∀u≤z (¬ψ(~x, u) ∨ u = z)).

Then φ ∈ PrA∆V, and N |= φ(~x, y, w) ⇔ f(~x, y) = w, as required.

Corollary 21 Dµ? = PrA∆V?.

proof. The Dµ? ⊆ PrA∆V? direction follows immediately from lemma 20. The opposite direction follows from the definition of PrA∆V, since PrAqf? ⊆ D? ⊆ Dµ?, and the fact that, just like Fµ?
, the class Dµ? is closed under linear quantification.

Our next theorem is:

Theorem 22 Dµ? = PrA?.

We obtain theorem 22 via the original proof that the theory of PrA is decidable. In 1930 Mojżesz Presburger demonstrated this now well-known fact in⁸ Über die Vollständigkeit eines gewissen Systems der Arithmetik ganzer Zahlen, in welchem die Addition als einzige Operation hervortritt [Pre30], by proving that the theory of the intended structure N≡ =def (N, 0, S, +, <, ≡2, ≡3, ...) for the language PrA≡ =def PrA ∪ {≡2, ≡3, ...} does admit quantifier elimination. In particular, for any PrA-formula φ, there is a φ′ ∈ PrA≡,qf such that

N≡ |= φ′ ⇔ N≡ |= φ ⇔ N |= φ.

⁸ The author has consulted the excellent translation by D. Jacquette, On the Completeness of a Certain System of Arithmetic of Whole Numbers in Which Addition Occurs as the Only Operation [P&J91].

The relation 'congruence modulo λ' is definable by

x ≡λ y ⇔ ∃u≤x (x = y + u + · · · + u) ∨ ∃u≤y (y = x + u + · · · + u),

with λ copies of u in each sum. Since the right-hand side is clearly in PrA∆V?, by corollary 21 these predicates belong to Dµ? for all λ ∈ N. But we can do even better.

Lemma 23 Let p, q ∈ N[~x] be linear polynomials, and let λ ∈ N. Then the relation p(~x) ≡λ q(~x) is in Dµ?.

proof. First note that the unary function remλ(x) =def rem(x, λ) belongs to Dµ for each fixed λ ∈ N, since remλ(x) = µz≤λ[χ≡λ(x, z) = 0].

Write p(~x) = a1x1 + · · · + akp·xkp + mp, and set Ap = a1 + · · · + akp; note that Ap is independent of ~x. Also, since remλ(x) < λ, we have

sp(~x) =def a1·remλ(x1) + · · · + akp·remλ(xkp) + mp < Ap·λ + mp.

Similarly for q(~x). Then

p(~x) ≡λ q(~x) ⇔ sp(~x) ≡λ sq(~x).

It remains to show that the relation sp(~x) ≡λ sq(~x) is in Dµ?. Since bounded addition is in D, we also have ŝp(~x, z) =def min(sp(~x), z) in Dµ. But then, for A =def max(Ap·λ + mp, Aq·λ + mq), we have p(~x) ≡λ q(~x) ⇔ ŝp(~x, A) ≡λ ŝq(~x, A).
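The residue-reduction step in the proof of lemma 23 is easy to check numerically: replacing each variable by its residue modulo λ preserves the congruence class of a linear polynomial, while bounding its value independently of ~x. A Python sketch (names mine):

```python
def p_value(coeffs, const, xs):
    """Value of the linear polynomial p(xs) = sum(a_i * x_i) + const."""
    return sum(a * x for a, x in zip(coeffs, xs)) + const

def residue_reduction(coeffs, const, xs, lam):
    """s_p(xs): each x_i replaced by its residue mod lam.

    The value is bounded by lam * sum(coeffs) + const, independently of xs."""
    return sum(a * (x % lam) for a, x in zip(coeffs, xs)) + const
```

The bound is what makes the min-with-a-constant trick (ŝp) harmless: taking the minimum with A never changes the value.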
Because lemma 23 yields a decision procedure within Dµ? for all atomic PrA≡-formulae, and since Dµ? is Boolean, we have PrA≡,qf? ⊆ Dµ?. Whence, via Presburger's original results,

Dµ? ⊇ PrA≡,qf? = PrA? ⊇ PrA∆V? = Dµ?,

which constitutes a proof of theorem 22. Note that this also proves:

Corollary 24 PrA∆V? = PrA?.

6.2 Dµ vs. the class G1.

The class G1 is defined by G1 =def [{0, S, +}; bmin0]. K. Harrow proved in [Har75] that:

Theorem (Harrow) PrA? = G1? ⊊ E 1?.

Here E 1 is the familiar Grzegorczyk-class; please refer back to the discussion in section 4.2. As with G0 and Fµ, we obtain a theorem on the induced relational classes:

Theorem 25 Dµ? = G1? and Dµ ⊊ G1.

proof. The only part lacking a proof is Dµ ⊊ G1. Again S ∈ G1 \ Dµ suffices as proof: S is not argument-bounded, while all functions of Dµ are.

7 The class DDµ

Let PA =def PrA ∪ {×} be the language of Peano Arithmetic, and adopt the notational conventions from the previous sections. The relational class PA∆0? can thus be described as those predicates which are definable by a ∆0-formula in the language PA of Peano Arithmetic. This class has been the focus of intense and varied studies over the years, and is known by many names; perhaps the most widely used are ∆N0 and R. The class is known to equal the set of Rudimentary relations and the class of Constructive Arithmetic (both defined by Smullyan [Smu61]), and has many other characterisations.

The definition of the class G2 – minted in our notation – is G2 =def [{0, S, +, ×}; bmin0]. We have:

Theorem (Harrow) PA∆0? = G2? and PA∆V? = PA∆0?.

Of course, that PA∆0? ⊊ PA? is well-known.

Let the class DDµ =def [{−̇, ⌊·/·⌋}; bmin], viz. informally DDµ is obtained by adding integer division to Dµ. Now, since division is somehow to multiplication what difference is to addition, a natural question to ask is whether the inclusion of this new function makes DDµ to G2 what Dµ is to G1. The answer is yes.
We need the following proposition:

Proposition 26. rem ∈ DD^μ.

Proof. We have already seen that Ĉ and Ŝ belong to sub-classes of DD^μ. Now

  rem(x, y) = Ĉ(0, min(μz≤x [⌊x/y⌋ ≠ ⌊(x ∸ z)/y⌋ + 1], x), y, x).

Because rem(x, y) ≤ y we have rem ∈ DD^μ by lemma 3.

Theorem 27. DD^μ_* = PA^{∆₀}_* = G²_*.

Proof. Observe that

  xy = z ⇔ (y > 0 ∧ x = ⌊z/y⌋) ∨ φ ⇔ (y > 0 ∧ rem(z, y) = 0 ∧ ⌊z/y⌋ = x) ∨ φ,

where φ(x, y, z) is e.g. y = 0 ∧ z = 0, and so the graph of multiplication is computable in DD^μ_*. Next, the class CA of Constructive Arithmetic predicates, as introduced by Smullyan (see [Smu61]), is defined as the closure of the graphs of addition and multiplication under explicit transformations, Boolean operations and quantification bounded by a variable. Hence CA ⊆ DD^μ_*. Next, since Harrow proved in [Har75] that G²_* = PA^{∆₀}_*, and since DD^μ ⊆ G² trivially, we clearly have DD^μ_* ⊆ PA^{∆₀}_*. This is sufficient, since results by Bennett [Ben62], Wrathall [Wra78], and Lipton [Lip79] imply the nontrivial identities CA = RUD = LH = PA^{∆₀}_*, which completes a proof of theorem 27. (Here RUD are Smullyan’s Rudimentary relations, and LH the Linear Hierarchy.)

8 The classes F^{n,#} and F̄^{n,#}

In this section we study the classes

  F^{n,#} =def [∅ ; bcount^{n+1}]  and  F̄^{n,#} =def [∅ ; b̄count^{n+1}].

We first observe that for the case n = 0 above, the two schemata bcount¹ and b̄count¹ are the same. When one counts solutions #z<y [g₁(x⃗, z) = g₂(x⃗, z)], the answer is already bounded by y = |{0, …, y − 1}|. Accordingly, we will identify the two, and we will simply write F^# for F̄^{0,#} = F^{0,#}. Secondly, we see that the schema bcount^{n+1} for n ≥ 1 does not necessarily generate argument-bounded functions from argument-bounded functions, and as such, the resulting classes F^{n,#} do not really fit the profile of the other classes introduced in this paper. However, as we shall see in sections 8.3 & 8.4 below, we actually have F̄^{1,#}_* = F^{1,#}_* = F̄^{2,#}_* = F^{2,#}_* = ⋯.
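The first equivalence in the proof of theorem 27 above — deciding x·y = z from division and remainder alone — can be checked exhaustively; a small sketch (Python, names mine):

```python
def mult_graph(x, y, z):
    # x*y = z, decided without multiplying:
    # (y > 0 and rem(z, y) = 0 and z div y = x)  or  phi(x, y, z)
    if y == 0:
        return z == 0        # the formula phi(x, y, z): y = 0 and z = 0
    return z % y == 0 and z // y == x
```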
Also, bcount will prove to be the strongest schema so far considered. So-called counting-quantifiers and counting-operations have been subjected to extensive studies in the literature, see e.g. Paris & Wilkie [P&V85], Schweikardt [Sch05], Esbelin [Esb94] and Esbelin & More [E&M98]. We first take a closer look at F^#.

8.1 Bootstrapping with bcount¹.

Let f be any function, and let R = f⁻¹(0), so that f is a characteristic function for R. Consider the function

  F(x⃗, y) =def #z<y [f(x⃗) = 0] = { y, if x⃗ ∈ R ; 0, if x⃗ ∉ R }.

In particular F(x⃗, c) is the particular characteristic function for ¬R which we denoted χ^c_{¬R} earlier. Thus, by applying the above construction twice if necessary, when a relation R belongs to F^#_*, we may form χ^c_R and χ^c_{¬R} for all c > 0. Hence:

(i) F^#_* is closed under negation.

Moreover – and not surprisingly – when R ∈ F^#_* we may count the number of z < y for which R(x⃗, z). Let us denote by χ^#_R the function which satisfies

  χ^#_R(x⃗, y) =def |{z < y | R(x⃗, z)}|.

Clearly χ^#_R(x⃗, y) = #z<y [χ_R(x⃗, z) = 0], thus:

(ii) R ∈ F^#_* ⇒ χ^#_R ∈ F^#.

Next, we have that

  #z<y [x = z] = { 0, if x ≥ y ; 1, if x < y }  and  #z<1 [x₁ = x₂] = { 0, if x₁ ≠ x₂ ; 1, if x₁ = x₂ }

belong to F^# by applying bcount to suitable projections, whence we conclude that:

(iii) χ≤, χ≠ ∈ F^#.

Closure under negation yields χ<, χ= ∈ F^#. Since χ_{R∧S} = χ= ∘ (χ¹_R, χ²_S) we also have closure under logical ‘and’, whence:

(iv) F^#_* is Boolean.

Consider the function χ^#_≤. By (i–iii) we have χ^#_≤ ∈ F^#. This function counts the set D, defined by:

  D =def {z < y | x ≤ z} = {z | x ≤ z < y}.

Since y ≤ x ≤ z < y is contradictory, when y ≤ x we have |D| = |∅| = 0 = y ∸ x. On the other hand, if y > x, say y = x + c, then

  |D| = |{x, x + 1, x + 2, …, x + (c − 1)}| = c = y ∸ x.

We conclude:

(v) χ^#_≤ = ∸ ∈ F^#. Hence D ⊆ F^#.

We can now prove closure of F^# under bmin. Recall that the schema we consider returns y upon a failed search.
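Before running that proof, the bootstrapping steps (i)–(v) can themselves be animated; a sketch (Python, names mine) of the schema bcount¹ and the derivation of ∸ as the counting function χ^#_≤ from step (v):

```python
def bcount(g1, g2, xs, y):
    # the schema  #z<y [ g1(xs, z) = g2(xs, z) ]
    return sum(1 for z in range(y) if g1(*xs, z) == g2(*xs, z))

def chi_leq(x, z):
    # a characteristic function of x <= z (value 0 encodes 'true', as in the paper)
    return 0 if x <= z else 1

def monus(y, x):
    # step (v):  y -. x  =  chi#_<=(x, y)  =  |{ z < y : x <= z }|
    return bcount(chi_leq, lambda x_, z: 0, (x,), y)
```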
Set f = 1 ∸ (1 ∸ #v<z [g₁, g₂]). Verify that f is 0–1-valued and satisfies

  f(x⃗, z) = 1 ⇔ ∃v<z (g₁(x⃗, v) = g₂(x⃗, v)).

Set h(x⃗, y) = #z<y [f(x⃗, z) = 1]. Given x⃗, assume v₀ is the least element of {v | g₁(x⃗, v) = g₂(x⃗, v)}. Now, because of the way we defined f, the function h counts the set

  M_y =def {z < y | v₀ < z} = {z | v₀ < z < y}.

We have:

  M_y = { ∅, if y ≤ v₀ + 1 ; {v₀ + 1, …, v₀ + (n + 1)}, if y = v₀ + (n + 2) }
  ⇒ |M_y| = { 0, if y ≤ v₀ + 1 ; n + 1, if y = v₀ + (n + 2) }.

The above implies that

  h′(x⃗, y) =def y ∸ h(x⃗, y) = { y, if y ≤ v₀ + 1 ; v₀ + 1, if y = v₀ + (n + 2) } = { y, if y ≤ v₀ ; v₀ + 1, if v₀ < y }.

Finally, for

  h″(x⃗, y) =def h′(x⃗, y) ∸ χ≠(g₁(x⃗, h′(x⃗, y) ∸ 1), g₂(x⃗, h′(x⃗, y) ∸ 1)),

we obtain

  h″ = { y ∸ χ≠(g₁(x⃗, y ∸ 1), g₂(x⃗, y ∸ 1)), if y ≤ v₀ ; (v₀ + 1) ∸ χ≠(g₁(x⃗, v₀), g₂(x⃗, v₀)), if v₀ < y }.

By the definition of v₀, we see that χ≠(g₁(x⃗, y ∸ 1), g₂(x⃗, y ∸ 1)) = 0 when y ≤ v₀, and that χ≠(g₁(x⃗, v₀), g₂(x⃗, v₀)) = 1. We conclude that

  h″ = { y ∸ 0 = y, if y ≤ v₀ ; (v₀ + 1) ∸ 1 = v₀, if v₀ < y } = { y, if y < v₀ ; v₀, if v₀ ≤ y } = μz≤y [g₁, g₂].

If there is no solution v₀, the same function works, since we are conceptually always in the case y < v₀. Hence:

(vi) F^# is closed under bmin.

Clearly (i)–(vi) make for a complete proof of proposition 28 below:

Proposition 28. D^μ ⊆ F^#. Hence PrA_* ⊆ F^#_*.

8.2 Counting-quantifiers and counting-operations.

Counting in PA^{∆₀} has been extensively studied in the literature. Most notably, while we know that when R(x⃗, y) ∈ E⁰_*, then so is the predicate

  S(x⃗, z, y) ⇔def z = |{u < y | R(x⃗, u)}|,

the analogous statement with respect to PA^{∆₀} is an open problem. The predicate S above can be viewed as generated from R by a so-called counting operation. This terminology is found in e.g. Esbelin & More [E&M98]. That

  z = |{u < y | R(x⃗, u)}| ⇔ #u<y [χ_R(x⃗, u) = 0] = z,

yields:

Proposition 29. F^#_* is closed under the counting-operation.
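The construction of step (vi) above is completely concrete and can be executed; a sketch (Python, names mine) building h, h′ and h″ from the counting schema and checking the result against the intended semantics of μz≤y[g₁, g₂] (least solution below the bound, y upon failure):

```python
def bcount(g1, g2, xs, y):
    # #z<y [ g1(xs, z) = g2(xs, z) ]
    return sum(1 for z in range(y) if g1(*xs, z) == g2(*xs, z))

def monus(a, b):
    return max(0, a - b)          # already obtained from bcount in step (v)

def bmin_via_count(g1, g2, xs, y):
    # f(xs, z) = 1  iff  some v < z solves g1 = g2
    f = lambda *a: monus(1, monus(1, bcount(g1, g2, a[:-1], a[-1])))
    h = bcount(f, lambda *a: 1, xs, y)      # h = |{ z < y : v0 < z }|
    h1 = monus(y, h)                        # h' : y if y <= v0, else v0 + 1
    w = monus(h1, 1)
    # chi_neq is 0 exactly when the two values differ (0 encodes 'true')
    chi_neq = 1 if g1(*xs, w) == g2(*xs, w) else 0
    return monus(h1, chi_neq)               # h''
```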
In Schweikardt [Sch05] the concept of counting-quantifiers is surveyed and studied. In [Sch05] the approach taken is one of extending 1st-order logic with counting-quantifiers ‘∃^{=z}_u’, with the intended interpretation that ∃^{=z}_u R(x⃗, u) holds iff z = |{u | R(x⃗, u)}|. These counting-quantifiers are unbounded a priori, but can obviously be made bounded, in the sense that e.g.

  S(x⃗, z, y) =def ∃^{=z}_u (u < y ∧ R(x⃗, u)),

takes on the meaning: there are u₀ < u₁ < ⋯ < u_{z−1} < y such that (i) for each i < z we have R(x⃗, u_i), and moreover (ii) if v < y satisfies R(x⃗, v), then for some i < z we have v = u_i. We abbreviate this construction by ∃^{=z}_{u<y} R(x⃗, u).

Secondly, since ∀u<y R(x⃗, u) ⇔ ∃^{=y}_{u<y} R(x⃗, u), and since ∃u≤y R(x⃗, u) ⇔ R(x⃗, y) ∨ ∃u<y R(x⃗, u), we see that the closure of a set of relations R_* under the bounded counting-operation – which we denote by R_# – is also closed under variable-bounded (∃z≤y-type) quantifiers. Incidentally:

Theorem (Schweikardt). PrA_* = PrA_#.

The theorem found in [Sch05] actually asserts that one can extend the underlying 1st-order logic of Presburger Arithmetic with full unbounded counting-quantifiers, and still retain this equality. The important thing for us is that we can now easily prove:

Theorem 30. F^#_* = PrA_*.

Proof. The proof is by induction on f. The base cases are obvious – we only need to consider I ∪ N-functions – and the induction step – case comp – is exactly as in the proof of lemma 20.

Induction step – case f = #u<y [g₁, g₂]: For #u<y [g₁(x⃗, u) = g₂(x⃗, u)], let φᵢ(x⃗, u, w) represent the graphs of the gᵢ’s, and consider the formula⁹:

  ψ(x⃗, y, z) =def ∃^{=z}_{u<y} ∃w≤max(x⃗,u) (φ₁(x⃗, u, w) ∧ φ₂(x⃗, u, w)).

Clearly f(x⃗, y) = z ⇔ N ⊨ ψ(x⃗, y, z).

⁹ In the expression, formally the ‘∃w≤max(x⃗,y)’-quantifier is shorthand for the finite disjunction ∃w≤x₁ φ ∨ ⋯ ∨ ∃w≤x_k φ ∨ ∃w≤y φ, which is a PrA-formula when φ is.
Since Schweikardt’s proof implies that one may find an equivalent PrA-formula ψ′ when the φᵢ ∈ PrA, we are done. That is, in a quite precise sense: Presburger Arithmetic is exactly counting!

8.3 The classes F^{n,#} for n ≥ 2.

The first thing to note is that n-ary bounded counting, as defined above, is not an argument-bounded schema. The bound is actually polynomial in the sense that the number of n-tuples z⃗ which may be counted by #z⃗<y [g₁, g₂] is bounded only by yⁿ. Still, a top-index-like phenomenon arises for functions in F^{n,#}. We generalise the top-index notion by saying that f(x⃗) has polynomial top-index i if, for some c, n ∈ N, we have f(x⃗) ≤ max(xᵢⁿ, c); i is a strict (polynomial) top-index if c = 0.

Lemma 31 (polynomial top-index). Let f ∈ F^{n,#}. Then f has a polynomial top-index. Furthermore, if f(N^k) is infinite, then the top-index is strict.

Proof. For f ∈ I ∪ N the assertion has been proved. Induction step – case f = h ∘ g⃗: Let h(y⃗) ≤ max(y_j^{n_h}, c_h) and g_j(x⃗) ≤ max(x_i^{n_j}, c_j). Then

  f(x⃗) ≤ max((g_j(x⃗))^{n_h}, c_h) ≤ max((max(x_i^{n_j}, c_j))^{n_h}, c_h) ≤ max(x_i^{n_h·n_j}, max(c_j^{n_h}, c_h)).

As before, if f is to have infinite image, then h must have infinite image, so c_h = 0. Thus f(x⃗) ≤ (g(x⃗))^{n_h}. Unless g has infinite image, this function is bounded, so c_j = 0 as well.

Induction step – case f = #z⃗<y [g₁, g₂]: Now |z⃗| = n trivially implies strict top-index yⁿ.

Hence, functions like max, C, or x·y are excluded from F^{n,#}. Secondly, any function in F^{n,#} is bounded by a polynomial, which can be expressed as F^{n,#} ⊆ E² for all n ∈ N.

Next, observe that g(x) =def #_{z₁,z₂<x} [0 = 0] satisfies g ∈ F^{1,#} and g(x) = x². Hence, by composing g with itself k times, we have g^{(k)}(x) = x^{2^k} ∈ F^{1,#}. In particular, any polynomial p ∈ N[x] with constant term equal to zero¹⁰ is in F^{1,#}.

¹⁰ E.g. p(x) = x³ + 5 cannot be in F^{n,#} since it has infinite image, yet p(1) = 6 > 1^m for any m.
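The counting function behind x² is one line; a sketch (Python, names mine) of g and its iterates:

```python
def g(x):
    # g(x) = #_{z1,z2 < x} [0 = 0] : count all pairs below x, i.e. x**2
    return sum(1 for z1 in range(x) for z2 in range(x))

def g_iter(k, x):
    # k-fold composition g(g(...g(x)...)) = x**(2**k)
    for _ in range(k):
        x = g(x)
    return x
```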
Let f(x, y, b) =def #_{z₁,z₂<b} [z₁ < x ∧ z₂ < y]. Then f(x, y, b) = min(x·y, b²) ∈ F^{1,#}. Equipped with this function from the ‘bounded multiplication family’ we can easily define enough to get F^{1,#} off the ground. We have that

  ⌊x/y⌋ = v ⇔ ((y = 0 ∨ y > x) ∧ v = 0) ∨ (0 < v ∧ y ≤ x ∧ min(y(v ∸ 1), x²) < x ∧ ∃u≤y (min(vy, x²) + u = x)).

Since this predicate is obtainable by substituting the F^{1,#}-function min(xy, b²) into a PrA_*-predicate, we conclude that ÷ ∈ F^{1,#}. Thus DD^μ_* ⊆ F^{1,#}_*. Since F^{n,#} is trivially closed under the counting-operation we obtain:

Proposition 32. PA^{∆₀}_# ⊆ F^{1,#}_*.

Definition 33. Let B^k_y =def {x⃗ ∈ N^k | max(x⃗) < y}. Define e^k_y : N^k → N by:

  e^k_y(x⃗) = Σ_{i=1}^{k} x_i · y^{i−1}.

It is well-known that e^k_y restricted to B^k_y is a bijection between B^k_y and {0, …, y^k − 1}, where |B^k_y| = y^k. Denote by π^{k,y}_i(z) any function such that π^{k,y}_i(e^k_y(x⃗)) = x_i for x⃗ ∈ B^k_y. Exactly what π^{k,y}_i does for z = e^k_y(x⃗) when x⃗ ∉ B^k_y is insignificant; we may simply let π^{k,y}_i(z) = z for such z ≥ y^k.

Secondly, for each fixed k > 0, the functions

  E^k : N^{k+1} → N; E^k(x⃗, y) = e^k_y(x⃗)  and  Π^k_i : N² → N; Π^k_i(z, y) = π^{k,y}_i(z)

all have PA^{∆₀}-graphs. Because of how we defined π^{k,y}_i, we see that Π^k_i(x, y) ≤ x. This means Π^k_i has a top-index, whence Π^k_i ∈ F^{1,#} directly. Secondly, since E^k(x⃗, y) < y^k and is a polynomial in x⃗ and y, we have E^k ∈ F^{1,#} since it is equal to min(E^k(x⃗, y), y^k).

Esbelin & More have shown (see [E&M98]) that PA^{∆₀}_# is closed under polynomially bounded quantification. We therefore easily obtain F^{n,#}_* ⊆ PA^{∆₀}_#, or, equivalently:

Lemma 34. f ∈ F^{n,#} ⇒ Γ_f ∈ PA^{∆₀}_#.

Proof. The proof is by induction on f, for all n simultaneously. The induction start is just as in previous proofs.

Induction step – case f = h ∘ g⃗: Let φ_h(y⃗, w) and φ_j(x⃗, z_j) be the representing formulae of h and the g_j’s respectively. Consider

  ψ_m(x⃗, w) ⇔def ∃z⃗≤max(x₁,…,x_k)^m (⋀_{1≤j≤ℓ} φ_j(x⃗, z_j) ∧ φ_h(z⃗, w)).

As usual the bound in the quantifier poses no problem with respect to taking a max. Since PA^{∆₀}_# is closed under polynomially bounded quantification, ψ_m is in PA^{∆₀}_# for all m. By choosing m sufficiently large, viz. so that m is larger than the maximal degree of the polynomial top-indices of the g_j’s, we see that ψ_m represents the graph of h ∘ g⃗.

Induction step – case f = #z⃗<y [g₁ = g₂]: Letting φ(x⃗, z⃗) be the PA^{∆₀}_#-formula asserting that g₁(x⃗, z⃗) = g₂(x⃗, z⃗), we have

  f(x⃗, y) = v ⇔ v = |{u < yⁿ | ∃z⃗<y (⋀_{i=1}^{n} (z_i = Πⁿ_i(u, y)) ∧ φ(x⃗, z⃗))}|,

which is again clearly a PA^{∆₀}_#-formula.

Define F^{ω,#} =def ⋃_{n∈N} F^{n,#}. Then the above means that:

Theorem 35. F^# ⊊ F^{1,#} = F^{ω,#}. Furthermore F^{ω,#}_* = PA^{∆₀}_#.

Proof. By proposition 32, we have PA^{∆₀}_# ⊆ F^{1,#}_*. Combined with lemma 34 this implies that the graph of any f ∈ F^{ω,#} belongs to F^{1,#}_*, and thus also f⁻¹(0) for any such function.

This is indeed not very surprising, and the result is included merely for completeness. There is no way of achieving exponential growth in F^{n,#} for any n, and growth-wise F^{1,#} dominates the whole ‘hierarchy’. In fact it is easy to see that:

Corollary 36. [{×} ; bcount]_* = PA^{∆₀}_#.

This is so since we can still capture the graph of a [{×} ; bcount]-function in PA^{∆₀}_# by essentially the same formula as above: using that PA^{∆₀}_# is closed under polynomial substitutions we do not depend upon a top-index.

8.4 The classes F̄^{n,#} for n ≥ 2.

Recall that for these classes, the situation returns to ‘normal’ – we have already proved the relevant top-index result as lemma 19. Is the n-ary schema any stronger than the unary? That is, do we have e.g. F̄^{0,#} ⊊ F̄^{1,#} when all functions are argument-bounded? The answer is yes. We may still count pairs, and pairs have to do with multiplication.
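Counting pairs below a bound b yields exactly bounded multiplication, min(x·y, b); a quick exhaustive check (Python sketch, names mine):

```python
def bmult(x, y, b):
    # f(x, y, b) = min(b, #_{z1,z2<b}[ z1 < x and z2 < y ]) = min(x*y, b)
    count = sum(1 for z1 in range(b) for z2 in range(b) if z1 < x and z2 < y)
    return min(b, count)
```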
The point is that we can do most of what was done in the previous section by observing that a bounded multiplication-function is still available:

  f(x, y, b) = min(b, #_{z₁,z₂<b} [z₁ < x ∧ z₂ < y]) = min(b, min(x·y, b²)) = min(x·y, b).

This means that the analogue of proposition 32 still holds of F̄^{1,#}, since we can substitute x for the two occurrences of x² without destroying the equivalence. Hence:

Proposition 37. PA^{∆₀}_# ⊆ F̄^{1,#}_*.

Proof. The proof is simply that PA^{∆₀}_* = DD^μ_* ⊆ F̄^{1,#}_*, and that we inherit closure under the counting-operation from the fact that F̄^{1,#}_* is closed under bounded counting.

By combining the obvious fact that F̄^{n,#} ⊆ F^{n,#} and lemma 34 we have:

Lemma 38. f ∈ F̄^{n,#} ⇒ Γ_f ∈ PA^{∆₀}_#.

Define F̄^{ω,#} =def ⋃_{n∈N} F̄^{n,#}. Again:

Theorem 39. F^# ⊊ F̄^{1,#} = F̄^{ω,#}. Furthermore F̄^{ω,#}_* = PA^{∆₀}_#.

We see that the only essential use of the polynomial growth possible in F^{1,#} is in defining ÷: the computational difference between PrA_* and PA^{∆₀}_# can be thought of as the ability to divide.

Theorem 40. We have: F̄^{1,#} = [{÷} ; b̄count] =_* [{×} ; b̄count], where ‘=_*’ indicates that the equality only holds for the relational classes.

9 Summary of results and concluding discussion

First of all, let us take a look at a summary which comprises the results of this paper:

  F^μ_* = G⁰_* ⊇ D_* = PrA^{qf}_* ⊊ D^μ_* = G¹_* = F^#_* = PrA_* = PrA_# ⊊ DD^μ_* = G²_* = PA^{∆₀}_* ⊆ PA^{∆₀}_# = F^{1,#}_* = ⋯ = F^{ω,#}_* = F̄^{1,#}_* = ⋯ = F̄^{ω,#}_*,

where the equality PrA_* = PrA_# is Schweikardt’s, and where only the inclusion PA^{∆₀}_* ⊆ PA^{∆₀}_# is not known to be proper or improper.

The first features in this summary we will remark on – and which was the original motivation for this research – are the three equalities F^μ_* = G⁰_*, D^μ_* = G¹_* and DD^μ_* = G²_*. In each equality we have that (1) on the functional level there is strict inclusion to the right; (2) the left class results from the right class by substituting an argument-bounded almost-everywhere inverse for the non-argument-bounded functions of that class. By this we mean that with e.g.
G¹, the non-argument-bounded function is +, and ∸ is an a.e. inverse in the sense that for n fixed (n + m) ∸ m = n =^{a.e.} (n ∸ m) + m. A similar equation holds for multiplication and the pair ÷ and rem. It is sensible to view an idc. as a structure which has some connection with the notion of an algorithm; functions are inductively built up from more basic functions. Our findings thus say that with the non-iterative schemata bmin and bcount it is their ability or non-ability to define and compute with P, ∸, ÷ and rem which decides the induced relational class.

A second noteworthy detail is the way that the class D_* provides a fresh view of what happens in Presburger Arithmetic with respect to quantification. There is a D–D^μ dichotomy, which induces a PrA^{qf}_*–PrA_* dichotomy. Intuitively, this result states that in D^μ bounded minimalisation plays exactly the role of a quantifier: if R is the matrix of some PrA-formula on prenex normal form, then R is a D_*-predicate.

In contrast, in the case of DD vs. DD^μ (where DD is ‘DD^μ without bmin’) there is no such dichotomy. Our proof that DD^μ_* = PA^{∆₀}_* was only seemingly easy. What we proved in this paper was simply that CA ⊆ DD^μ_* ⊆ PA^{∆₀}_*. The proof of the missing inclusion – PA^{∆₀}_* ⊆ CA – is highly non-trivial.

Thirdly, the pair of equalities F^#_* = PrA_* and F^{1,#}_* = PA^{∆₀}_# are rather striking:

  PrA_# is simply unary counting – PA^{∆₀}_# is simply binary counting.

Finally, in this paper we have studied non-iterative schemata, where the term non-iterative is not very precise, but simply refers to the contrast to schemata like primitive recursion or iteration. Whether bcount should be counted as a non-iterative schema is not totally clear. Let I⁻ =def [P ; it], where it is the schema of pure iteration. Esbelin proved the second inclusion in the chain below:

  PA^{∆₀}_* ⊆ PA^{∆₀}_# ⊆ I⁻_*;

whether they are proper or not is still unknown. Furthermore PA^{∆₀}_# = I⁻_* implies PA^{∆₀}_# = E²_*,
as shown in Esbelin & More [E&M98]. Thus, informally, bcount is ‘quasi-iterative’ since its idc. hovers between PA^{∆₀}_*, where many believe one cannot count, and the idc. I⁻_*, where one can count – precisely because one can iterate.

References

[Bar08] Barra, M.: A characterisation of the relations definable in Presburger Arithmetic, in Theory and Applications of Models of Computation (Proceedings of the 5th Int. Conf. TAMC 2008, Xi’an, China, April 25th–29th), LNCS 4978, Springer-Verlag Berlin Heidelberg (2008) 258–269

[Ben62] Bennett, J. H.: On Spectra, Ph.D. thesis, Princeton University (1962)

[Clo96] Clote, P.: Computation Models and Function Algebras, in Handbook of Computability Theory, Elsevier (1996)

[End72] Enderton, H. B.: A mathematical introduction to logic, Academic Press, Inc., San Diego (1972)

[Esb94] Esbelin, H.-A.: Une classe minimale de fonctions récursives contenant les relations rudimentaires (French) [A minimal class of recursive functions that contains rudimentary relations], in C. R. Acad. Sci. Paris Série I 319(5) (1994) 505–508

[E&M98] Esbelin, H.-A. and More, M.: Rudimentary relations and primitive recursion: a toolbox, in Theoret. Comput. Sci. 193(1–2) (1998) 129–148

[Grz53] Grzegorczyk, A.: Some classes of recursive functions, in Rozprawy Matematyczne, No. IV, Warszawa (1953)

[Har73] Harrow, K.: Sub-elementary classes of functions and relations, Ph.D. thesis, New York University (1973)

[Har75] Harrow, K.: Small Grzegorczyk classes and limited minimum, in Zeitschr. f. math. Logik und Grundlagen d. Math. 21 (1975) 417–426

[Jon99] Jones, N. D.: logspace and ptime characterized by programming languages, in Theoretical Computer Science 228 (1999) 151–174

[Jon01] Jones, N. D.: The expressive power of higher-order types or, life without CONS, in J.
Functional Programming 11 (2001) 55–94

[Kri05] Kristiansen, L.: Neat function algebraic characterizations of logspace and linspace, in Computational Complexity 14(1) (2005) 72–88

[Kri06] Kristiansen, L.: Complexity-Theoretic Hierarchies Induced by Fragments of Gödel’s T, in Theory of Computing Systems (2007) (see http://www.springerlink.com/content/e53l54627x063685/)

[K&B05] Kristiansen, L. and Barra, M.: The small Grzegorczyk classes and the typed λ-calculus, in New Computational Paradigms, LNCS 3526, Springer Verlag (2005) 252–262

[K&V03a] Kristiansen, L. and Voda, P. J.: The surprising power of restricted programs and Gödel’s functionals, in Computer Science Logic, LNCS 2803, Springer Verlag (2003) 345–358

[K&V03b] Kristiansen, L. and Voda, P. J.: Complexity classes and fragments of C, in Information Processing Letters 88 (2003) 213–218

[K&V08] Kristiansen, L. and Voda, P. J.: The structure of Detour Degrees, in Theory and Applications of Models of Computation (5th Int. Conf. TAMC 2008, Xi’an, China, April 2008, Proceedings), LNCS 4978, Springer-Verlag Berlin Heidelberg (2008) 148–159

[Lip79] Lipton, R. J.: Model theoretic aspects of computational complexity, in Proc. 19th Annual Symp. on Foundations of Computer Science, IEEE Computer Society, Silver Spring MD (1978) 193–200

[P&V85] Paris, J. and Wilkie, A.: Counting problems in bounded arithmetic, in Methods in mathematical logic. Proceedings, Caracas 1983, Lecture Notes in Mathematics 1130, Springer Verlag (1985) 317–340

[Pre30] Presburger, M.: Über die Vollständigkeit eines gewissen Systems der Arithmetik ganzer Zahlen, in welchem die Addition als einzige Operation hervortritt, in Sprawozdanie z I Kongresu Matematyków Słowiańskich (1930) 92–101

[P&J91] Presburger, M. and Jacquette, D.: On the Completeness of a Certain System of Arithmetic of Whole Numbers in Which Addition Occurs as the Only Operation, in History and Philosophy of Logic 12 (1991) 225–233

[Smu61] Smullyan, R.
M.: Theory of formal systems (revised edition), Princeton University Press, Princeton, New Jersey (1961)

[Sch05] Schweikardt, N.: Arithmetic, First-Order Logic, and Counting Quantifiers, in ACM Trans. Comp. Log. 6(3) (2005) 634–671

[Wra78] Wrathall, C.: Rudimentary predicates and relative computation, in SIAM J. Comput. 7(2) (1978) 194–209

CHAPTER 2. MINIMAL IDC.’S
2.3. SCHEMATA

2.3.3 Primitive recursive schemata.

We now take a look at primitive-recursive schemata, that is, various restrictions of∗ pr^ℓ. First we review some relevant results by Esbelin, Kristiansen & Voda, Kutylowski and others. After this short review, we present the second embedded article, Pure Iteration and Periodicity [B08b], which supplies some of the missing details, and which constitutes this section's original results. The section is rounded off with some remarks on other iterative schemata – unfortunately time has not been sufficient to study these any further.

Primitive-recursive schemata. The picture here is fairly complete, and the schema pr¹ is very versatile in conjunction with explicit definitions. The bootstrapping is made easy by the fact that the transition function h in h R g has direct access to both the arguments x⃗ and to the argument y − 1, which carries with it information about which recursive call is currently being computed. When y > 0 then

  f(x⃗, y) = h(x⃗, y − 1, f(x⃗, y − 1)).

Secondly (see e.g. Barra & Kristiansen [K&B05]) one can exploit the schema’s inherent ‘subtraction of one’ and ‘test for zero’ to obtain:

Observation 33. P = I²₂ R 0 and C = I⁴₂ R I²₁. Hence:

  E⁻ =def [{∸} ; pr¹] = [{C, min, max, P, S̄, ∸} ; pr¹]. q.e.d.

Since a recursive schema which allows the transition function access to the index of recursion facilitates counting in a straightforward manner, clearly PA^{∆₀}_# ⊆ E⁻_*.
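Observation 33 can be checked by implementing the schema pr¹ directly; in this sketch (Python, names mine) the transition function receives x⃗, the index y − 1, and the previous value, and suitable projections yield P and C:

```python
def pr1(g, h):
    # f(xs, 0) = g(xs);  f(xs, y) = h(xs, y-1, f(xs, y-1))  for y > 0
    def f(*args):
        *xs, y = args
        acc = g(*xs)
        for i in range(y):
            acc = h(*xs, i, acc)
        return acc
    return f

# the predecessor: base 0, transition projects onto the index y-1
P = pr1(lambda: 0, lambda i, prev: i)
# the case function C(x, y, z) = x if z == 0 else y: transition projects onto y
C = pr1(lambda x, y: x, lambda x, y, i, prev: y)
```

The 'subtraction of one' is visible in P, and the 'test for zero' in C: the transition function fires at all only when the recursion argument is positive.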
As we will see, this is not the case when it is substituted for pr, and it is an open problem whether explicitly adding P to it is enough to compensate h in h I g for the loss of the information carried by y − 1. Most of the published research on small iteration-classes is due to Esbelin [Esb94] (in conjunction with the predecessor) and Kutylowski [Kut87] (examining forms of 2-fold limited iteration with predecessor), and the picture is completed in [B08b] for general ℓ-fold iteration without initial functions. The case of ℓ-fold primitive recursion with no initial functions has been treated by Kristiansen in [Kri05]. A compact picture of the situation here can be summarised by∗∗:

  [N ; it¹] = [N ; it^ℓ] ⊆^† [∅ ; pr^ℓ] ⊆^‡ [P ; it^{ℓ+1}] ⊆ E²,
  PrA^{qf}_* ⊊? [N ; it^ℓ]_*,  ∆^N_# ⊊?^{[Esb94]} [P ; it^ℓ]_* ⊆?^{[Kri05]} ⋃_{ℓ∈N} [∅ ; pr^{ℓ+1}]_*.

∗ See also the discussion in section 2.3.1.
∗∗ For the ‘starred relations’, see convention 2 (p. 8).

Above, the ℓ is of course arbitrary, and one of the main results of [Esb94] was to prove the inclusion ‘⊊?^{[Esb94]}’ above for ℓ = 1. Recall that ∆^N_# is the closure of the set of predicates obtainable by adding counting quantifiers to Peano Arithmetic, and then considering the ∆₀-fragment – see [B08a] for a definition∗. We set∗∗ I⁻ =def [P ; it¹] in what follows. Because primitive recursion subsumes iteration – meaning that for general X and op we have [X ; it^ℓ, op] ⊆ [X ; pr^ℓ, op] – the (†)-inclusion is obvious. The (‡)-inclusion is also straightforward and probably known, but it is easier to simply prove it than to search the literature for a reference:

Observation 34. ∀ℓ∈N [X ; pr^{ℓ+1}, op] ⊆ [X, P ; it^{ℓ+2}, op].

Proof: We note first that D ⊆ [X, P ; it^{ℓ+1}, op], so that S̄ ∈ [X, P ; it^{ℓ+1}, op]. Let h_i(x⃗, x_{k+1}, z⃗), g_i(x⃗) be given for 1 ≤ i ≤ ℓ = |z⃗|. Set g₀(x⃗, u) = 0 and h₀(x⃗, u, z₀, z₁, …, z_ℓ) = S̄(z₀, u). Next, for i ∈ {1, …, ℓ} set h′_i(x⃗, u, z₀, z₁, …,
z_ℓ) = h_i(x⃗, z⃗), g′_i(x⃗, u) = g_i(x⃗), and f⃗ = (h⃗′ I g⃗′) (so the iteration is (ℓ + 1)-fold). An easy induction on y shows that f₀(x⃗, z, y) = min(z, y) and thus also

  1 ≤ i ≤ ℓ ⇒ f_i(x⃗, y, y) = (h⃗ R g⃗)_i(x⃗, y). q.e.d.

It is an open problem whether the hierarchy induced by the superscript ℓ to pr or it – the width of the simultaneous iteration or recursion – collapses or not, and this problem is equivalent to that of E⁰_* vs. E²_*. More precisely, for∗∗∗ E^{−(ℓ)} =def [∅ ; pr^{ℓ+1}], and setting S⁻ =def ⋃_{ℓ<ω} E^{−(ℓ)} = [∅ ; ⋃_{ℓ<ω} pr^{ℓ+1}], we have:

Theorem 35.

  E^{−(0)}_* =^{[K&V08]} E⁰_* ⊆? E¹_* ⊆? E^{−(1)}_* ⊆? E^{−(2)}_* ⊆? ⋯ ⊆? E²_* = ⋃_{ℓ<ω} E^{−(ℓ)}_*,

while, for I^{−(ℓ)} = [P ; it^{ℓ+1}], the situation is:

  I^{−(0)}_* =^{T.78} I⁰_* =^{KUT} I¹_* ⊆? E⁰_* ⊆? E¹_* ⊆? I^{−(1)}_* ⊆^{KRI} E^{−(1)}_* ⊆^{O.34} I^{−(2)}_* ⊆ ⋯ ⊆ E²_* = ⋃_{ℓ<ω} I^{−(ℓ)}_*. q.e.d.

The equality ‘=^{T.78}’ will be proved in section 3.3.1, where also the ‘=^{[K&V08]}’ equality is discussed. The reference to [K&V08] is a reference to a proof of the equality in that paper, but this equality appears to have been proven by Bel’tyukov already in 1979, only to remain unnoticed except by a very limited number of researchers.

∗ There ∆^N_# is called PA^{∆₀}_#.
∗∗ This idc. is called PI in [Esb94] and I₋₁ in [E&M98].
∗∗∗ The class denoted E^{−(ℓ)} here was called L^ℓ in [K&B05].

Similarly, Kutylowski defined classes E^{0(ℓ)} =def [{S, max} ; bpr^{ℓ+1}] and iteration classes I^{0(ℓ)} =def [{S, max, P} ; bit^{ℓ+1}] and proved various results about these idc.’s in [Kut87]. Now we insert our second embedded article, which shows that the ℓ-fold version of iteration is nowhere near the ℓ-fold version of recursion when it comes to compensating for ‘lack of brains’. More precisely, we investigate the hierarchy∗

  IT^{−(ℓ)} =def [∅ ; it^{ℓ+1}].

We know from Esbelin [Esb94] that adding P to the initial functions of IT^{−(0)} brings it up to at least ∆^N_#. As the main result in [B08b] demonstrates, P is not substitutable by increasing recursion-width.
In fact, the hierarchy collapses, meaning that nothing is gained by widening the iterations unless some brain-function – like P – is included.

∗ The classes IT^{−(ℓ)} are called simply IT^ℓ in [B08b]. The renaming here is more appropriate in this more general setting.

Pure Iteration and Periodicity
A Note on Some Small Sub-recursive Classes

Mathias Barra
Dept. of Mathematics, University of Oslo, P.B. 1053, Blindern, 0316 Oslo, Norway
georgba@math.uio.no
http://folk.uio.no/georgba

Abstract. We define a hierarchy IT = ⋃_n IT^n of small sub-recursive classes, based on the schema of pure iteration. IT is compared with a similar hierarchy, based on primitive recursion, for which a collapse is equivalent to a collapse of the small Grzegorczyk-classes. Our hierarchy does collapse, and the induced relational class is shown to have a highly periodic structure; indeed a unary predicate is decidable in IT iff it is definable in Presburger Arithmetic. The concluding discussion contrasts our findings to those of Kutylowski [12].

1 Introduction and Notation

Introduction: Over the last decade, several researchers have investigated the consequences of banning successor-like functions from various computational frameworks. Examples include Jones [5,6]; Kristiansen and Voda [9] (functionals of higher types) and [10] (imperative programming languages); Kristiansen and Barra [8] (idc.’s and λ-calculus); and Kristiansen [7] and Barra [1] (idc.’s). This approach—banning all growth—has proved successful in the past, and has repeatedly yielded surprising and enlightening results.

This paper emerges from a broad and general study of very small inductively defined classes of functions. In 1953, A. Grzegorczyk published the seminal paper Some classes of recursive functions [4]; an important source of inspiration to a great number of researchers. Several questions were raised, some of which have been answered, some of which remain open to date.
Most notorious is perhaps the problem of the statuses of the inclusions E⁰_* ⊆? E¹_* ⊆? E²_* (see definitions below). In 1987, Kutylowski presented various partial results on the above chain in Small Grzegorczyk classes [12], a paper which deals with function classes based on variations over the schema of bounded iteration. The paper at hand is the result of a first attempt to combine the work of Kutylowski with the successor-free approach. The results presented here do not represent a final product. However, in addition to being interesting in their own right—by shedding light on the intrinsic nature of pure iteration contrasted to primitive recursion—perhaps they may provide a point of origin for a viable route to the answer to the above mentioned problem.

A. Beckmann, C. Dimitracopoulos, and B. Löwe (Eds.): CiE 2008, LNCS 5028, pp. 42–51, 2008. © Springer-Verlag Berlin Heidelberg 2008

Notation: Unless otherwise specified, a function means a function f : N^k → N. The arity of f, denoted ar(f), is then k. The notation f^{(n)}(x) has the usual meaning: f^{(0)}(x) =def x and f^{(n+1)}(x) =def f(f^{(n)}(x)). P is the predecessor function: P(x) =def max(0, x − 1); S is the successor function: S(x) =def x + 1; and C is the case function: C(x, y, z) =def x if z = 0 and y else. I is the set of projections: I^k_i(x⃗) =def x_i; N is the set of all constant functions c(x⃗) =def c.

An inductively defined class of functions (idc.) is generated from a set X, whose elements are called the initial, primitive or basic functions, as the least class containing X and closed under the schemata, functionals or operations of some set op of functionals. We write [X ; op] for this class¹. Familiarity with the schema composition, denoted comp, is assumed. We write h ∘ g⃗ for the composition of functions g⃗ into the function h.

A k-ary predicate is a subset R ⊆ N^k. Predicates are interchangeably called relations.
A set of predicates which is closed under finite intersections and complements is called Boolean, or an algebra. A partition of N^k is a finite collection of predicates P = {P_i}_{i≤n} satisfying (i) ⋃_i P_i = N^k, and (ii) i ≠ j ⇒ P_i ∩ P_j = ∅. The algebra generated by P is the smallest algebra containing P. It can easily be shown that the algebra generated by P is the set of all unions of sets from P, and that this algebra is finite.

When R = f⁻¹(0), the function f is referred to as a characteristic function of R, and is denoted χ_R. Note that χ_R is not unique. Let F be a set of functions. F_* denotes the set of relations of F, viz. those subsets R ⊆ N^k with χ_R ∈ F for some χ_R. Formally

  F^k_* = {f⁻¹(0) ⊆ N^k | f ∈ F^k}  and  F_* = ⋃_{k∈N} F^k_*,

where F^k denotes the k-ary functions of F. Whenever a symbol occurs under an arrow, e.g. x⃗, we usually do not point out the length of the list—by convention it shall be k for variables and ℓ for functions unless otherwise specified.

2 Background and Motivation

The standard schema of ℓ-fold simultaneous primitive recursion, denoted pr^ℓ, defines functions f⃗ from functions h⃗ and g⃗ when

  f_i(x⃗, y) = { g_i(x⃗), if y = 0 ; h_i(x⃗, y − 1, f⃗(x⃗, y − 1)), if y > 0 }

for 1 ≤ i ≤ ℓ. A less studied schema is that of ℓ-fold simultaneous (pure) iteration, denoted it^ℓ, where functions f⃗ are defined from functions h⃗ and g⃗ by

  f_i(x⃗, y) = { g_i(x⃗), if y = 0 ; h_i(x⃗, f⃗(x⃗, y − 1)), if y > 0 }

for 1 ≤ i ≤ ℓ. We omit the superscript when the iteration or recursion is 1-fold. We refer to the ℓ in it^ℓ as the iteration width. It is well-known that in most contexts it¹ is equivalent to pr^ℓ for arbitrary ℓ ∈ N. More precisely

  [I ∪ {0, S, P} ; comp, it¹] = [I ∪ N ∪ {S} ; comp, pr^ℓ].

Quite a lot of coding is required, and proofs are technical. For this, and more on these schemata, consult e.g. Rose [13].

¹ This notation is adopted from Clote [2], where an idc. is called a function algebra.
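A sketch of the ℓ-fold iteration schema (Python, names mine); as a usage example, a 2-fold iteration computing the parity pair (y mod 2, 1 − y mod 2) without any increasing initial functions—exactly the kind of periodic behaviour this paper is about:

```python
def it_l(gs, hs):
    # l-fold simultaneous pure iteration:
    # f_i(xs, 0) = g_i(xs);  f_i(xs, y) = h_i(xs, f_vec(xs, y-1))
    def f_vec(*args):
        *xs, y = args
        acc = tuple(g(*xs) for g in gs)
        for _ in range(y):
            acc = tuple(h(*xs, *acc) for h in hs)
        return acc
    return f_vec

# 2-fold example: swap the pair (0, 1) a total of y times
parity = it_l((lambda: 0, lambda: 1),
              (lambda a, b: b, lambda a, b: a))
```

Here parity(y)[0] equals y mod 2 for every y: with only projections and constants to iterate, the orbit of the initial tuple is eventually periodic, which is the intuition behind the periodicity results below.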
For the schemata of bounded ℓ-fold simultaneous primitive recursion (bpr^ℓ) and bounded ℓ-fold simultaneous pure iteration (bit^ℓ), one requires an additional function b(x, y) such that f_i(x, y) ≤ b(x, y) for all x, y and i. To the author's best knowledge, bpr was introduced under the name of limited primitive recursion by Grzegorczyk in [4], where the schema was used to define his famous hierarchy E = ⋃_n E^n. A modern rendering of the original definitions would be e.g. E^n = [I ∪ N ∪ {S, A_n} ; comp, bpr], where the function A_n is the nth Ackermann branch. If we let PR denote the set of all primitive recursive functions, one of the main results of [4] is that E = PR. While it is known that the hierarchy of functions is strict at all levels, and that the hierarchy of relational classes E⋆^n is strict from n = 2 and upwards, the problem of whether any of the inclusions E⋆^0 ⊆ E⋆^1 ⊆ E⋆^2 is proper remains open to date.

In 1987 Kutylowski experimented with variations over bounded iteration classes, and published his results in Small Grzegorczyk classes [12]. More precisely, Kutylowski defined classes I^n = [I ∪ N ∪ {P, A_n, max} ; comp, bit], viz. I^n is essentially² E^n with bit substituted for bpr. That I⋆^n = E⋆^n for n ≥ 2 is fairly straightforward. However, for n = 0, 1, the questions of whether I⋆^n = E⋆^n, or whether I⋆^1 = I⋆^2, are as hard as the problem of E⋆^0 ⊆ E⋆^1 ⊆ E⋆^2, and so the equivalence of bit and bpr is an open problem in this context. Some interdependencies are nicely summarised in [12]. Of particular interest to us are Theorems D and F from [12], which establish that I⋆^0 = I⋆^1, and that I⋆^0 = E⋆^0 ⇔ E⋆^0 = E⋆^2, respectively.

Consider the classes L^n = [I ∪ N ; comp, pr^{n+1}]. In Kristiansen and Barra [8] the hierarchy L = ⋃_n L^n is defined. It follows from earlier results in Kristiansen [7] that L⋆ = E⋆^2, and more recently Kristiansen and Voda [11] established L⋆^0 = E⋆^0. Furthermore, the hierarchy L collapses iff E⋆^0 = E⋆^2 (see [8]).
Hence the following diagram:

L⋆^0 = E⋆^0 ⊆ E⋆^1 ⊆ E⋆^2 = L⋆ .

Note that L^n contains no increasing functions, and that the recursive schema is unbounded. The paper at hand emerges from a desire to investigate the intersection of the two approaches: both omitting the increasing functions, and replacing pr^n with it^n.

3 The Hierarchy IT

Definition 1. For n ∈ IN, IT^n = [I ∪ N ; comp, it^{n+1}]; IT = ⋃_{n∈IN} IT^n.

² The appearance of 'max' is inessential, and it can be added also to E^n without altering the induced relational class.

The next lemma is rather obvious, and we skip the proof.

Lemma 1. Say that a function f is nonincreasing if, for some c ∈ IN, we have f(x) ≤ max(x, c) for all x ∈ IN^k. Then all f ∈ IT^n are nonincreasing.

We next define a family of very simple and regular partitions, which will be useful for describing the relational classes induced by the IT^n.

Henceforth, let 0 < μ ∈ IN. For 0 ≤ m < μ let m + μIN have the usual meaning, i.e. 'the equivalence class of m under congruence modulo μ'. The standard notation x ≡_μ y ⇔ |x − y| ∈ μIN will also be used. It is well-known that, viewed as a relation on IN, '≡_μ' defines an equivalence relation, and that equivalences induce partitions. Hence the following definition.

Definition 2. Let M(μ, 0) = {m + μIN | 0 ≤ m < μ} be the partition of IN induced by '≡_μ'. For a ∈ IN, let M(μ, a) denote the partition obtained from M(μ, 0) by placing each of the numbers 0 through a − 1 in their own equivalence class as singletons³, and let x ≡_{μ,a} x′ have the obvious meaning. Let '≡^k_{μ,a}' be the coordinate-wise extension of '≡_{μ,a}' to an equivalence on IN^k, i.e. x ≡^k_{μ,a} x′ ⇔ ∀i (x_i ≡_{μ,a} x′_i), and let M^k(μ, a) denote the corresponding partition⁴. Set M^k = { M^k(μ, a) | μ, a ∈ IN }, and finally

M⋆^k = ⋃_{μ,a∈IN} A^k(μ, a)  and  M⋆ = ⋃_{k∈IN} M⋆^k ,

where A^k(μ, a) is the smallest algebra containing M^k(μ, a). So a predicate P ∈ M⋆^k is a union of M^k(μ, a)-classes for some μ, a.
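The equivalence '≡_{μ,a}' underlying Definition 2 can be sketched directly (our helper: numbers below a form singleton classes, numbers ≥ a are compared modulo μ):

```python
def equiv(mu, a):
    """x ~ y in M(mu, a): equal if either lies below a (those classes are
    singletons), otherwise congruent modulo mu."""
    def eq(x, y):
        if x < a or y < a:
            return x == y
        return (x - y) % mu == 0
    return eq
```

For instance, in M(3, 2) the numbers 2 and 5 are equivalent (both in 2 + 3IN), while 1 and 4 are not, since {1} is a singleton class.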
We suppress μ and a when convenient. If we set⁵

M(μ, a) ⊑ M(μ′, a′) ⇔ μ|μ′ ∧ a ≤ a′ ,

it is easy to see that '⊑' defines a partial order on M. When M ⊑ M′, we say that M′ is a refinement of M. Hence, for any pair of partitions there is a least common refinement, given by M(μ_1, a_1), M(μ_2, a_2) ⊑ M(lcm(μ_1, μ_2), max(a_1, a_2)), and which itself is an element of M.

Definition 3. Let M^k = {M_1, . . . , M_ℓ}. A function f is an M^k-case if there exist f_1, . . . , f_ℓ ∈ I ∪ N such that f(x) = f_i(x) for x ∈ M_i (1 ≤ i ≤ ℓ). An M-case is an M^k-case for some M^k ∈ M^k. Let M denote the set of all M-cases, and let M^k denote the k-ary functions in M.

³ Thus e.g. M(3, 2) = {{0}, {1}, 3IN \ {0}, (1 + 3IN) \ {1}, 2 + 3IN}. Note that M(1, a) is the partition consisting of the singletons {n}, for n < a, together with {a, a + 1, a + 2, . . .}, and that M(1, 0) = {IN}.
⁴ Hence M^k(μ, a) consists of all 'boxes' M_1 × · · · × M_k where each M_i ∈ M(μ, a).
⁵ 'n|m' means 'n divides m', and 'lcm' denotes 'the least common multiple'.

Informally, an M^k(μ, a)-case is arbitrary on {0, . . . , a − 1}^k, and then 'class-wise', or 'case-wise', a projection or a constant on each of the finitely many equivalence classes. We write [x]_{μ,a} for the equivalence class of x under ≡^k_{μ,a} (omitting the subscript when convenient). Henceforth, when f is an M-case, a subscripted f is tacitly assumed to be an element of I ∪ N, and we also write f(x) = f_{[x]}(x) = f_M(x) when [x] = M ∈ M^k. Yet another formulation is that⁶ f is an M-case iff f↾M ∈ I ∪ N for each M ∈ M.

The proposition below is easy to prove.

Proposition 1. (i) If f is an M-case, and M ⊑ M′, then f is an M′-case; (ii) if {M_i}_{i≤ℓ} ∈ M^k, and f(x, y) = f_i(x, y) ⇔ x ∈ M_i, then f is an M-case; (iii) for all predicates P ∈ M⋆, there is some M-case f such that P = f^(−1)(0), and conversely, if f is an M-case, then f^(−1)(0) ∈ M⋆; hence (iv) the relational class induced by the M-cases is exactly M⋆.

Our main results are all derived from the theorem below.

Theorem 1.
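For k = 1, an M(μ, a)-case amounts to a finite lookup: each singleton {n} (n < a) and each residue class is assigned either a constant or the identity (the only unary projection). A sketch, with a representation of our own choosing:

```python
def m_case1(mu, a, single, residue):
    """Unary M(mu, a)-case.  single[n] (for n < a) and residue[m] (for
    m < mu) are either ('const', c) or ('id',): a constant or the
    identity on that class."""
    def f(x):
        spec = single[x] if x < a else residue[x % mu]
        return spec[1] if spec[0] == 'const' else x
    return f
```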
IT^0 = IT = M. That is, the hierarchy IT^n collapses, and its functions are exactly the M-cases.

We begin by establishing some basic facts about IT^0. It is straightforward to show that characteristic functions for the intersection P ∩ P′ and the complement of P are easily defined by composition from C, I ∪ N, χ_P and χ_{P′}. Observing that C ∈ IT^0 – since it is definable by iterating h = I^3_2 on g = I^2_1 – we conclude that IT⋆^0 is an algebra. Furthermore, given any finite partition P = {P_i}_{i≤n} of IN^k, once we have access to C and {χ_{P_i}}_{i≤n}, we can easily construct a P-case by nested compositions. That is, when {χ_{P_i}}_{i≤n} ⊆ IT^0, then any P-case belongs to IT^0. Thus, a proof of the inclusion M ⊆ IT^0 is reduced to proving that all M(μ, a)-classes belong to IT⋆^0.

Lemma 2. 2IN, 1 + 2IN ∈ IT⋆^0.

Proof. Since IT⋆^0 is an algebra, it is sufficient to show that 2IN ∈ IT⋆^0. Consider the function

f(x) = 0 if x = 0 , and f(x) = C(1, 0, f(x − 1)) if x > 0 .

By induction on n, we show that f(2n) = 0 and f(2n + 1) = 1. For the induction start, by definition f(0) = 0, and thus f(1) = C(1, 0, f(0)) = 1. Case (n + 1): since 2(n + 1) − 1 = 2n + 1, we obtain, by the definition of f and the i.h.,

f(2(n + 1)) = C(1, 0, f(2n + 1)) = C(1, 0, 1) = 0 .   (†)

Clearly f(2(n + 1) + 1) = C(1, 0, f(2(n + 1))) = C(1, 0, 0) = 1 by (†), so f = χ_{2IN}.

⁶ f↾M denotes the restriction of f to M.

Viewed as a binary predicate, '≡_μ' is the set {(x, y) | |x − y| ∈ μIN}. (So '≡_1' is all of IN², and '≡_0' is 'equality'. Recall the restriction μ > 0.) A reformulation of Lemma 2 is thus that the unary predicates 'x ≡_2 0' and 'x ≡_2 1' both belong to IT⋆^0. Below, the idea of the last proof is iterated for the general result.

Lemma 3. For all μ, the binary predicate 'x ≡_μ y' is in IT⋆^0.

Proof. The proof is by induction on μ ≥ 2; the case μ = 1 is trivial. Lemma 2 effects the induction start, since x ≡_2 y ⇔ C(χ_{2IN}(y), χ_{1+2IN}(y), χ_{2IN}(x)) = 0. Case μ + 1: By the i.h.
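The function in the proof of Lemma 2 can be run directly: iterating y ↦ C(1, 0, y) from 0 alternates 0, 1, 0, 1, . . . (an illustrative sketch):

```python
def C(x, y, z):
    """Case function: x if z == 0, else y."""
    return x if z == 0 else y

def chi_even(x):
    """f(0) = 0, f(x) = C(1, 0, f(x - 1)): the characteristic function of
    2N from the proof of Lemma 2 (value 0 on evens, 1 on odds)."""
    v = 0
    for _ in range(x):
        v = C(1, 0, v)
    return v
```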
the partition M(μ, 0) is in IT⋆^0: for m < μ we have χ_{m+μIN}(x) = χ_{≡μ}(x, m). Thus M(μ, 1) also belongs to IT⋆^0, since we have x ∈ μIN \ {0} ⇔ C(1, χ_{μIN}(x), x) = 0, and χ_{{0}}(x) = C(0, 1, x). Next, consider the M(μ, 1)-case

f(x) = μ if x ∈ {0} , and f(x) = m − 1 if x ∈ (m + μIN) \ {0} , for 1 ≤ m ≤ μ .

Note that (i) f↾{0, . . . , μ} is the permutation (μ 0 1 · · · (μ − 1)). Moreover, for all n ∈ IN and m ∈ {0, . . . , μ}, we have that (ii) f^(n(μ+1))(m) = m; and (iii) f^(m)(m) = 0. Hence, if y ≡_{μ+1} m′ and 0 ≤ m, m′ < μ + 1, then

f^(n(μ+1)+m′)(m) = f^(m′)(m) by (ii), and f^(m′)(m) = 0 ⇔ m′ = m by (i) and (iii).

Thus, the function f^(y)(m) is a characteristic function for m + (μ + 1)IN. This function is clearly definable in IT^0 from f, constant functions and it^1. Since IT⋆^0 is boolean, and since x ≡_{μ+1} y ⇔ ⋁_{m=0}^{μ} (x ≡_{μ+1} m ∧ y ≡_{μ+1} m), we are done.

Corollary 1. For all n ∈ IN, the predicate 'x = n' is in IT⋆^0.

Proof. For n = 0 the proof is contained in the proof of Lemma 3. Secondly, for n > 0, we have that the M(n + 1, 0)-case

h(x) = n + 1 if x ≡_{n+1} 0 ; h(x) = m + 1 if x ≡_{n+1} m for 1 ≤ m < n ; and h(x) = 0 if x ≡_{n+1} n

is in IT^0. Consider the function f(x) = h^(x)(1). Since 0 < n implies 1 ≢_{n+1} 0, we obtain 0 ≤ x < n ⇒ h^(x)(1) = x + 1 > 0, and h^(n)(1) = 0. By definition h(0) = h(n + 1) = n + 1 > 0. But then f(x) = 0 iff x = n, viz. f = χ_{{n}}.

Proposition 2. M ⊆ IT^0.

Proof. Combining Lemma 3, Corollary 1, and an appeal to the boolean structure of IT⋆^0 yields M(μ, a) ⊆ IT⋆^0 for all μ, a ∈ IN. Hence, any M-case is definable in IT^0. The proposition now follows from Proposition 1 (iii).

It is trivial that IT^0 ⊆ IT; thus the inclusion IT ⊆ M would complete a full proof of Theorem 1. Since the basic functions I ∪ N of IT^n obviously belong to M, we have reduced our task to proving closure of M under comp and it^n.

Lemma 4. M is closed under composition.

Proof. Consider h ∈ M^ℓ, and g_1, . . . , g_ℓ ∈ M^k.
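The permutation trick in Lemma 3 can be simulated: iterating the M(μ, 1)-case f from the proof, f^(y)(m) vanishes exactly when y ≡_{μ+1} m (a sketch; the names are ours):

```python
def perm_case(mu):
    """The M(mu, 1)-case of Lemma 3: 0 -> mu, and x in (m + mu*N)\\{0}
    maps to m - 1 for 1 <= m <= mu (so positive multiples of mu -> mu - 1)."""
    def f(x):
        if x == 0:
            return mu
        m = x % mu
        return mu - 1 if m == 0 else m - 1
    return f

def chi_residue(m, mu):
    """chi_{m + (mu+1)N}(y) = f^(y)(m): zero iff y is congruent to m
    modulo mu + 1, as in the proof."""
    f = perm_case(mu)
    def chi(y):
        v = m
        for _ in range(y):
            v = f(v)
        return v
    return chi
```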
Because f ∈ M^k means that for some M, f is a case on the coordinate-wise extension of M to M^k, we can find M(μ_h, a_h) and M(μ_j, a_j) for 1 ≤ j ≤ ℓ, corresponding to h and the g_j's respectively, and for which we may find a common refinement M. Hence we may consider the g_j's M^k-cases, and h an M^ℓ-case. It is easy to show that x ≡^k x′ ⇒ g(x) ≡^ℓ g(x′) when the g_j are M^k-cases; formally, ∀M ∈ M^k ∃M′ ∈ M^ℓ (g(M) ⊆ M′). But then, where h↾M′ = h_{M′} ∈ I ∪ N and g_j↾M = g_{jM} ∈ I ∪ N, we have

f↾M = h_{M′} ◦ g_M ∈ I ∪ N ,

since I ∪ N is closed under composition. Since f is an M-case precisely when its restriction to each M-class belongs to I ∪ N, the conclusion of the lemma now follows.

Remark 1. Note the property of M-functions which enters the proof: when the g_j are M^k-cases, they preserve equivalence: x ≡^k x′ ⇒ g(x) ≡^ℓ g(x′).

Lemma 5. M is closed under simultaneous pure iteration.

Proof. Let ℓ be arbitrary, and let h_1, . . . , h_ℓ ∈ M^{k+ℓ}, g_1, . . . , g_ℓ ∈ M^k. As before – considering a least common refinement if necessary – assume w.l.o.g. that the h_j's are M^{k+ℓ}(μ, a)-cases, and the g_j's are M^k(μ, a)-cases. Moreover, let C equal the value of the largest constant function corresponding to some g_j or h_j, and note that because separating off more singletons does not necessitate the introduction of any new constant functions, there is no loss of generality in assuming a > C.

It is sufficient to show that for arbitrary M ∈ M^k(μ, a), each f_j↾(M × IN) is a case on the (k + 1)st coordinate, viz. that there is M(μ_M, a_M) ∈ M such that f_j↾(M × IN)(x, y) = f_{j,[y]_{μ_M,a_M}}(x, y). If this can be shown, each f_j will be a P^{k+1}-case for any common refinement P ∈ M of M(μ, a) and the finitely many M(μ_M, a_M)'s.

Thus let M ∈ M^k(μ, a) be arbitrary. Write f_{j,y}(x) for f_j(x, y), and f_y for (f_{1,y}, . . . , f_{ℓ,y}). It is easy to show that f_j(x, y) ≤ max(x, C). Hence, for fixed z ∈ M, the sequence of ℓ-tuples (f_y(z))_{y∈IN} is contained in {0, . . . , max(z, C)}^ℓ.
By the pigeonhole principle there are then indices a_M and b_M such that C < a_M < b_M and f_{a_M}(z) = f_{b_M}(z). If we let these indices be minimal and set μ_M = b_M − a_M, this observation translates to:

f_y(z) = f_y(z) if y < a_M , and f_y(z) = f_{a_M+m}(z) if a_M ≤ y ∈ m + μ_M IN and 0 ≤ m < μ_M .

So far we have that f restricted to {z} × IN is an M(μ_M, a_M)-case on the (k + 1)st variable. Secondly, for arbitrary x ∈ IN^k, by exploiting the fact that M-cases preserve ≡ in the sense of Remark 1, it is straightforward to show by induction on y that the sequence of M^ℓ(μ, a)-classes [f_0(x)], [f_1(x)], [f_2(x)], . . . is uniquely determined by the M^k(μ, a)-class of x.

Let z, z′ ∈ M. Define a map φ_{zz′} : IN → IN by

φ(x) = x if x ∉ {z_1, . . . , z_k} , and φ(z_i) = z′_i .

Below, we simply write φ for φ_{zz′}, and φ(x) for (φ(x_1), . . . , φ(x_ℓ)). Obviously φ(z) = z′. We next show by induction on y that φ(f_y(z)) = f_y(z′).

Induction start: We have

f_{j,0}(z) = g_{jM}(z) = z_i if g_{jM} = I^k_i , and f_{j,0}(z) = c if g_{jM} = c ,

whence

φ(f_{j,0}(z)) = φ(z_i) = z′_i if g_{jM} = I^k_i , and φ(f_{j,0}(z)) = φ(c) = c if g_{jM} = c ,   (†)

so that φ(f_{j,0}(z)) = g_{jM}(z′) = f_{j,0}(z′). The equality marked (†) is justified thus: since a > C ≥ c, we infer that the M(μ, a)-class of such a c is in fact {c}. Hence, if c ∈ {z_1, . . . , z_k}, say c = z_i, then z′_i = z_i, and so φ(c) = c. If c ∉ {z_1, . . . , z_k}, then φ(c) = c by definition.

Induction step: Recall that the M^ℓ(μ, a)-classes of f_y(z) and f_y(z′) coincide, and set M_y = M × [f_y(z)]_{M^ℓ(μ,a)}. Then

f_{j,y+1}(z) = h_j(z, f_y(z)) = h_{jM_y}(z, f_y(z)) = z_i if h_{jM_y} = I^{k+ℓ}_i and 1 ≤ i ≤ k ; = f_{i,y}(z) if h_{jM_y} = I^{k+ℓ}_{k+i} and 1 ≤ i ≤ ℓ ; = c if h_{jM_y} = c .

For f_{j,y+1}(z′), simply add primes to the z's above. By invoking the i.h. for the case of h_{jM_y} = I^{k+ℓ}_{k+i}, the conclusion follows by the same argument employed in the induction start.
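The pigeonhole step of Lemma 5 – a trajectory confined to a finite set must become periodic – can be sketched generically:

```python
def threshold_and_period(step, start):
    """For the sequence v_0 = start, v_{y+1} = step(v_y), ranging over a
    finite set, return (a, p) with v_a = v_{a+p}: the point where the
    trajectory enters its cycle, and the cycle length (pigeonhole)."""
    seen = {}
    v, y = start, 0
    while v not in seen:
        seen[v] = y
        v = step(v)
        y += 1
    a = seen[v]
    return a, y - a
```

For example, iterating a step that cycles the first coordinate of a pair modulo 3 gives threshold 0 and period 3, while a step that increases up to a ceiling and then stays put gives a positive threshold and period 1.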
Since f(z, y) is an M(μ_M, a_M)-case when restricted to {z} × IN, and since we have indeed shown that, for z, z′ ∈ M, if f(z, y) = z_i then f(z′, y) = z′_i, and similarly, if f(z, y) = c then f(z′, y) = c, we are done.

Proposition 3. IT ⊆ M.

Theorem 2. IT⋆^0 = IT⋆ = M⋆.

Theorem 2 is a direct corollary to Theorem 1, the proof of which is completed by Proposition 3. We now have a complete characterisation of the induced relational class in terms of very simple and regular finite partitions of IN. Before the concluding discussion, we include a result which relates IT to Presburger Arithmetic.

3.1 A Note on Presburger Arithmetic and IT

Let PrA be the 1st-order language {0, S, +, <, =} with the intended structure N_A = (IN, 0, S, +, <) – the natural numbers with the usual order, successor and addition. Many readers will recognize PrA as the language of Presburger Arithmetic, see e.g. Enderton [3, pp. 188–193]. Let PrA⋆ denote the predicates definable by a PrA-formula. Consider the following [3, p. 192, Theorem 32F]: a set of natural numbers D belongs to PrA⋆ iff it is eventually periodic, where eventually periodic means that for some μ, a ∈ IN we have n > a ⇒ (n ∈ D ⇔ n + μ ∈ D). Since this is exactly what it means for D to be in M⋆, if we let PrA⋆^u and IT⋆^u denote the unary predicates of the respective classes, we immediately see that PrA⋆^u and IT⋆^u coincide. However, this result does not hold for higher arities, since we have the following corollary to Theorem 1:

Corollary 2. χ_<, χ_= ∉ IT.

Proof. If χ_< ∈ IT = M, then it is an M(μ, a)-case for some μ, a ∈ IN. Clearly a ≡_μ a + μ. Hence, we have 1 = χ_<(a, a) = χ_<(a, a + μ) = 0; a contradiction. Similarly, 0 = χ_=(a, a) = χ_=(a, a + μ) = 1.

As both 'equals' and 'less than' are primitive to PrA⋆, we obtain:

Theorem 3. (i) IT⋆^u = PrA⋆^u; (ii) IT⋆ ⊊ PrA⋆.

4 Discussion and Directions for Further Research

Consider the assertion: 'iteration is inherently weak and periodic'.
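The eventual-periodicity condition quoted above is easy to test on an initial segment (a sketch; the cutoff parameter is an assumption of ours, not part of the characterisation):

```python
def eventually_periodic(D, mu, a, check_up_to=200):
    """Check n > a  =>  (n in D  <=>  n + mu in D) for n up to a cutoff,
    i.e. that D looks eventually periodic with period mu, threshold a."""
    return all((n in D) == (n + mu in D) for n in range(a + 1, check_up_to))
```

The even numbers pass with μ = 2, a = 0; the squares fail for the trivial period μ = 1, matching the intuition that they are not unary-Presburger-definable.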
What we have shown beyond doubt is that iteration is weak and periodic when working alone. Secondly, we have shown with Theorem 1 that iteration width, or simultaneity – in the very weak context of this paper – does not add any computational strength. Contrasted to the recursion-based hierarchy L, which does not collapse unless E⋆^0 = E⋆^2, we see that in other weak contexts simultaneity may in fact be stronger.

Recall that I^0 is essentially IT^0 with predecessor and successor. In what follows we omit the explicit mention of I ∪ N and comp. Let PIT = [{P} ; it], and so we have

IT⋆ ⊊ PIT⋆ ⊆ I⋆^0 = I⋆^1 ⊆ E⋆^0 ,

where the equality is Kutylowski's Theorem D, and whether either of the last two inclusions is proper is open. Note that iteration width is restricted to 1, and that if two-fold iteration over P is allowed, one obtains at least one-fold primitive recursion. The first, proper containment follows from e.g. χ_= ∈ PIT.

The schema of bounded minimalisation, denoted bmin, defines a function f from functions g_1, g_2 by f(x, y) = μz ≤ y [g_1(x, z) = g_2(x, z)]. Clearly, if an idc. F includes bmin, the resulting F⋆ will be closed under bounded quantification. Even though bmin and bit are hard to compare directly, the following fact is quite enticing. It is shown in Barra [1] that [{P} ; bmin]⋆ = [{P, S} ; bmin]⋆, and that [{∸} ; bmin]⋆ = [{+} ; bmin]⋆ = PrA⋆, where '∸' denotes modified subtraction. That is, there is no loss of predicates by simply removing S in the above context. A natural open question is thus how close PIT⋆ and I⋆^0 are.

We feel that some evidence has been mounted in support of the opening statement of this section. In a context such as Kutylowski's I^2, bounded iteration is equivalent to bounded primitive recursion. But what about I^0? By Kutylowski [12], we also know that equality between E⋆^0 and E⋆^2 relies on I⋆^0, and hence I⋆^1, being equal to E⋆^0.
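The schema bmin can be sketched as a bounded search. The value returned when no witness z ≤ y exists is not fixed by the text above; returning y + 1 is one common convention, assumed here:

```python
def bmin(g1, g2):
    """f(x, y) = least z <= y with g1(x, z) == g2(x, z); if no such z
    exists, return y + 1 (an assumed convention, see the lead-in)."""
    def f(xs, y):
        for z in range(y + 1):
            if g1(*xs, z) == g2(*xs, z):
                return z
        return y + 1
    return f

# Example: the least z <= y with z*z >= x, a bounded integer square root.
sqrt_up = bmin(lambda x, z: 1 if z * z >= x else 0, lambda x, z: 1)
```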
In light of the paper at hand, is there a chance that the equality I⋆^0 = I⋆^1 is due – not to the ability of iteration to 'raise I⋆^0 up to the level of I⋆^1' – but rather a case of iteration being so weak as to 'lower I⋆^1 to the level of I⋆^0'? Since the characteristic function of e.g. the primes is obviously non-periodic, and belongs to E^0, at some stage between IT^0 and I^0 one must be able to escape the periodicity inherent in iteration. Can one exploit the periodic behaviour of functions defined by pure iteration in order to prove I⋆^0 ≠ E⋆^0? Or – perhaps via a successful attempt at escaping periodicity – can one obtain the strength of recursion, whence I⋆^0 = E⋆^0 would follow? Since both possibilities would provide a solution to many long-standing conundrums of sub-recursion theory, further investigations should be well worth the effort.

References

1. Barra, M.: A characterisation of the relations definable in Presburger Arithmetic. In: Proceedings of TAMC 2008. LNCS, vol. 4978, pp. 258–269. Springer, Heidelberg (2008)
2. Clote, P.: Computation models and function algebras. In: Handbook of Computability Theory. Elsevier, Amsterdam (1996)
3. Enderton, H.B.: A Mathematical Introduction to Logic. Academic Press, San Diego (1972)
4. Grzegorczyk, A.: Some classes of recursive functions. Rozprawy Matematyczne, No. IV, Warszawa (1953)
5. Jones, N.D.: LOGSPACE and PTIME characterized by programming languages. Theoretical Computer Science 228, 151–174 (1999)
6. Jones, N.D.: The expressive power of higher-order types or, life without CONS. J. Functional Programming 11, 55–94 (2001)
7. Kristiansen, L.: Neat function algebraic characterizations of LOGSPACE and LINSPACE. Computational Complexity 14(1), 72–88 (2005)
8. Kristiansen, L., Barra, M.: The small Grzegorczyk classes and the typed λ-calculus. In: Cooper, S.B., Löwe, B., Torenvliet, L. (eds.) CiE 2005. LNCS, vol. 3526, pp. 252–262. Springer, Heidelberg (2005)
9.
Kristiansen, L., Voda, P.J.: The surprising power of restricted programs and Gödel's functionals. In: Baaz, M., Makowsky, J.A. (eds.) CSL 2003. LNCS, vol. 2803, pp. 345–358. Springer, Heidelberg (2003)
10. Kristiansen, L., Voda, P.J.: Complexity classes and fragments of C. Information Processing Letters 88, 213–218 (2003)
11. Kristiansen, L., Voda, P.J.: The structure of detour degrees. In: Proceedings of TAMC 2008. LNCS, vol. 4978, pp. 148–159. Springer, Heidelberg (2008)
12. Kutylowski, M.: Small Grzegorczyk classes. J. London Math. Soc. (2) 36, 193–210 (1987)
13. Rose, H.E.: Subrecursion. Functions and Hierarchies. Clarendon Press, Oxford (1984)

2.3. SCHEMATA

A remark on pure iteration and pure recursion. There are two more primitive-recursion-like schemata on the list of variations considered by Robinson (see p. 50) which we have not investigated further. It is an unfortunate fact that time is a scarce resource, and this author has not had the time to pursue them. However, a few properties are easily deduced. Note that the nomenclature used here is inconsistent with that of the article preceding this section – we use the qualifier pure here in the following sense:

Definition 34 (pure primitive recursion and pure iteration) The schema ppr^ℓ is called ℓ-fold simultaneous pure primitive recursion. Given g_1, . . . , g_ℓ of arity k and h_1, . . . , h_ℓ of arity ℓ + 1, define ℓ functions f_i = (~h R_p ~g)_i, each of arity k + 1, by:

f_i(~x, y) = g_i(~x) if y = 0 , and f_i(~x, y) = h_i(y − 1, f_1(~x, y − 1), . . . , f_ℓ(~x, y − 1)) if y > 0 .

The schema pit^ℓ is called ℓ-fold simultaneous pure iteration. Given g_1, . . . , g_ℓ of arity k and h_1, . . . , h_ℓ of arity ℓ, define ℓ functions f_i = (~h I_p ~g)_i, each of arity k + 1, by:

f_i(~x, y) = g_i(~x) if y = 0 , and f_i(~x, y) = h_i(f_1(~x, y − 1), . . . , f_ℓ(~x, y − 1)) if y > 0 .

Remark 3 Our naming convention is not consistent with that of Robinson.
We thus advocate the convention that the defining feature in the dichotomy recursion–iteration is whether or not the transition function h has access to the index of recursion y, while the qualifier pure is taken to mean that the transition function does not have access to the additional parameters/arguments. This naming convention is consistent with that of Kutylowski and of Esbelin and More.

The following classes suggest themselves: [ ; pit^ℓ], [P ; pit^ℓ], [ ; ppr^ℓ] and [P ; ppr^ℓ]. Note first that [ ; ppr^ℓ] = [P ; ppr^ℓ], since by a slight modification of Observation 33 we see that P is readily definable (the membership of C is a slightly more delicate matter). Hence we are left with only three ℓ-hierarchies. Next, it is straightforward to prove∗:

Theorem 36
1. ⋃_{ℓ<ω} [ ; pit^{ℓ+1}] = IT^{−(0)};
2. ⋃_{ℓ<ω} [P ; pit^{ℓ+1}] = ⋃_{ℓ<ω} [ ; ppr^{ℓ+1}] = ⋃_{ℓ<ω} E^{−(ℓ)} =⋆ E^2 (equality of the induced relational classes).

∗ Recall that IT^{−(ℓ)} = [ ; it^{ℓ+1}], also called IT^ℓ in [B08b].

Proof: We first consider item 1, where [ ; pit^{ℓ+1}] ⊆ IT^{−(ℓ)} trivially. To see that F = [ ; it^1] ⊆ ⋃_{ℓ<ω} [ ; pit^{ℓ+1}], let h(~x, z), g(~x) ∈ F, and define

h_j = I^{k+1}_j for j ≤ k , h_{k+1} = h , and g_j = I^k_j for j ≤ k , g_{k+1} = g .

Then, for ~f = ~h I_p ~g, we have h I g = f_{k+1}, since the first k components simply copy the parameters ~x along; the point is that we can employ arbitrarily wide iterations. Hence:

IT^{−(0)} ⊆ ⋃_{ℓ<ω} [ ; pit^{ℓ+1}] ⊆ ⋃_{ℓ<ω} [ ; it^{ℓ+1}] ⊆ IT^{−(0)} ,

the last inclusion by the collapse of the IT-hierarchy. For 2. the proof is more or less identical, except that actually writing out an explicit definition specifying the exact projections with which to capture an ℓ-fold iteration/recursion by an ℓ·(k+1)-wide pure recursion/iteration is a rather time-consuming affair. The main point is that for every ℓ there is clearly an ℓ′ such that [P ; ppr^{ℓ′}] ⊇ [P ; pr^ℓ] (similarly for pit), whence the result follows directly from Observation 34 and the theorem of Kristiansen on ⋃_{ℓ<ω} E^{−(ℓ)}. q.e.d.
Examining the proofs from [B08b] it is also clear that already with [ ; pit^2] the function C is definable, and our supply of constants also ensures that all the other constructions are possible already in this idc. We also remark here that even though this question was neglected in [B08b], it is not clear whether or not [{0, 1} ; it^1] = [I ∪ N ; it^1] – the bootstrapping there depended upon the constants in a non-trivial way. We thus have the following open problems:

1. Does [ ; pit^1] ⊊ IT^{−(0)} hold?
2. Does [P ; pit^1] ⊊ I^{−(0)} hold?
3. Does [ ; ppr^1] ⊊ E^{−(0)} hold?

I expect the first problem on the list to have a positive answer, and believe that a 1st-order characterisation in the spirit of [B08b] would be feasible. W.r.t. the two last items, it is clear that since '∸' is obtainable in either of them, with P as transition function, both relational classes comprise the class D⋆ from [B08a]. Furthermore, the predicates 2N and 1 + 2N can also be shown to belong there, whence – in the event that the inclusion is proper – it could be that much harder to pin-point the relational class exactly. The first priority in such a pursuit should be to try and determine whether or not these classes are closed under bounded minimalisation.

Lastly, though Robinson's table contains the most common versions of primitive recursion, only the imagination limits the number of variations possible. Consider for example the following schema, considered (in a limited version) by Kutylowski and Loryś in [K&L87]: let f be defined from g_0, g_1 and h by

f(~x, y) = g_0(~x) if y = 0 ; f(~x, y) = g_1(~x) if y = 1 ; and f(~x, y) = h(~x, y − 2, f(~x, y − 1)) if y > 1 .

Of course, iteration/limited/ℓ-fold variations over this schema are also immediately constructable, and investigating such schemata in order to determine their inherent computational strength would be interesting. How to map such small idc.'s to specific detour degrees, the topic of chapter 3, should also be worthwhile.
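The Kutylowski–Loryś-style schema just displayed can be sketched as follows (the function names are ours):

```python
def two_step(g0, g1, h):
    """f(x, 0) = g0(x); f(x, 1) = g1(x);
    f(x, y) = h(x, y - 2, f(x, y - 1)) for y > 1."""
    def f(xs, y):
        if y == 0:
            return g0(*xs)
        if y == 1:
            return g1(*xs)
        return h(*xs, y - 2, f(xs, y - 1))
    return f

# Example with no parameters: f(0) = 0, f(1) = 1, f(y) = f(y - 1) + y,
# realised by h(z, v) = v + z + 2 (since y = z + 2): the triangular numbers.
tri = two_step(lambda: 0, lambda: 1, lambda z, v: v + z + 2)
```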
2.4 Summary of Results

The following diagram∗ is not intended to provide a complete 'historical overview'. Its purpose is rather to include all the new results from this chapter, and we have interspersed what we feel is a representative selection of classical results which can serve as bridges to other, more complete, diagrams found in e.g. [Clo96] or [Ros84]. The two top-most proper inclusions and the incomparability of IT⋆^{−(0)} and F⋆^µ are proved below.

Concluding diagram. [A two-dimensional diagram of inclusions and incomparabilities among the relational classes of this chapter: {C̄, P}◦⋆, G⋆^0 = F⋆^µ, PrA⋆^{qf} = PrA⋆ = ∆^N_0, (L_init)⋆, IT⋆^{−(0)}, D⋆^µ, the hierarchies I⋆^{−(ℓ)} and E⋆^{−(ℓ)}, and E⋆^2.]

Concluding chain-of-inclusions. [A linear chain refining the diagram: D⋆ = PrA⋆^{qf} ⊊ PrA⋆ = PrA⋆^] = F⋆^] ⊊ ∆^N_0 ⊆ ∆^N_] = F⋆^{1,]} = F⋆^{ω,]}, together with I⋆^− = I⋆^0 = I⋆^1 ⊆ E⋆^− = E⋆^0, the equalities being due to Kutylowski and to Kristiansen–Voda–Barra respectively.]

∗ See [B08b, pp. 4–5] for definitions of M(m, n) and M⋆.

Proof: Recall that PL⋆ = F⋆^µ (see [B09a]), and the characterisation of IT⋆^− as M⋆ from [B08b]. That (L_init)⋆ ⊊ M⋆ is obvious, since each (L_init)⋆-predicate is composed of M⋆-predicates, while e.g. M(2, 0) ∈ M⋆ \ (L_init)⋆. To see that PL⋆ ⊥ M⋆, note that M(2, 0) ∈ M⋆ \ PL⋆, since 2N cannot be the preimage of a piecewise linear unary function. An example of a PL⋆ \ M⋆ predicate is e.g. f^(−1)(0) for the PL-function

f(x, y) = x if x ≤ y , and f(x, y) = y if y ≤ x + 1 .

For (L_init)⋆ ⊆ PL⋆ simply observe that the latter is Boolean and contains PC_0 in a trivial way, while for f as above, f^(−1)(0) also verifies that (L_init)⋆ ⊊ PL⋆. q.e.d.

The contrast is striking: with iterative schemata – iteration or recursion – non-a.b. functions seem to be able to boost the computational strength, while with non-iterative schemata – bounded minimalisation or bounded counting – they do not.
While we do not have an X-tree result (recall section 2.2) as strong for [X ; bmin] as for X◦, there is still a fixed, tree-like structure within which the computation must take place; but with bmin one obtains the ability to 'non-deterministically' guess values with a µ-search a fixed number of times. This feature adds computational strength, but nothing is gained by adding growing functions too small to encode sequences of arbitrary length. With the iterative schemata, on the other hand, 'large' numbers enable the tree-template to expand the branches of the function-tree recursively, manifolding and growing. This seems to give rise to increased complexity. There is also a trade-off phenomenon with respect to recursion-width and polynomial detours in the context of small idc.'s; see e.g. [K&B05]. However, we have also seen that iterations alone are very weak. Only in conjunction with P can the iterations be exploited.

It is our hope that the reader will have enjoyed this chapter's results and reviews, and that this subject will continue to interest researchers in the future. There remain unanswered questions, and we hope that we have made a modest contribution towards characterising and understanding idc.-theory 'from scratch'. In the next chapter we hope to further motivate why idc.-theory deserves the attention and effort of researchers, by developing the concept of detour degrees, forged by Kristiansen and Voda in [K&V08], into a general theory for small a.b. idc.'s.

Relativised Detour Degrees

It is not that I am worthy to occupy myself with mathematics, but rather that mathematics is worthy to occupy oneself with.
– Attributed to an unknown colleague by Rózsa Péter

3.1 Introduction

In this chapter we introduce a general theory of F-detour degrees, where F is an idc. of argument-bounded functions.
In a sense we have now come full circle; the results surveyed and developed in the preceding chapters have been paving the way towards this theory, and though I find the topics of chapter 2 interesting in their own right, my initial interest in them was to be better able to understand and formulate a theory of inherently bounded functions.

Below follow parts of the introduction from the unpublished manuscript with the working title Inherently bounded functions and complexity classes [IBF]. This draft of working notes was the first thing I wrote down just after I enrolled as a PhD student at the University of Oslo in February 2006, and I was already quite familiar with the E⋆^0 ⊆ E⋆^2 inclusion, Kutylowski's work, and some of the results by Esbelin and More. The problem of that inclusion seemed as intriguing as insurmountable; thus I decided that even though actually solving it was an unrealistic goal – at least in as brief a period as merely 3–4 years – I could perhaps find some related problems which would enable me to contribute some original research to the field, while at the same time studying the problem in more detail. Below is an excerpt from the introduction to [IBF]:

It is well-known that ∆^N_0 ⊆ E⋆, yet the status of the inclusion ∆^N_0 ⊆ E⋆^i is still uncertain for i = 0, 1, 2. Actually separating the classes proves a very hard problem, and it seems increasingly difficult as one familiarises oneself with, say, ∆^N_0: it is quite large! Over the years ever more relations have been eased into the lower classes by ever more sophisticated methods and make-shift coding schemata, see e.g. Esbelin & More [E&M98]. For enlightening partial results concerning the lower Grzegorczyk classes see e.g. Kutylowski [Kut87]. Hence, finding candidates at all for membership in E⋆^2 \ ∆^N_0 is not always easy, let alone natural ones. This phenomenon arises, to quote Kutylowski, because [. . .
] what differentiates E^2 and E^1 from E^0 is not their 'brain' but their 'body' [Kut87, p. 196]. Or to rephrase: most E^2-functions not in E^0 are actually quite 'stupid'; all they know is how to grow faster. There seemingly are not any good ways of converting their muscle into coding skills or other more sophisticated tricks. Indeed, as Calude's paper [Cal87] shows, the graphs of other such 'stupid' functions are, with due effort, often captured as low as in ∆^N_0, even if the function itself is super-primitive recursive. This paper thus aims at investigating this phenomenon: analysing what distinguishes the bodybuilders from the brainiacs, and what happens when the two features merge. – Barra [IBF]

The general idea is perhaps best illustrated by the following well-known boosting theorem: for any f ∈ E^2, there exists a polynomial p ∈ N[~x] and an f̂ ∈ E^0 such that f(~x) = f̂(~x, p(~x)). Combine this with the fact that all polynomials have E⋆^0-graph, and the result above poignantly illustrates Kutylowski's metaphor: if indeed E⋆^2 is computationally more complex than E⋆^0, it is only in its ability to 'cultivate' its own polynomials – a feature which E^0 lacks. The general problem of small relational classes could therefore possibly be solved by first stripping the larger 'small classes' of all non-a.b. functions, and then investigating whether a general theory for gradually adding more muscle could lead to some new insights or breakthroughs (or provide researchers with information as to why such an approach cannot lead to a resolution).

Luckily, the work was put aside in order to study and complete the theory of argument-bounded idc.'s – a labour whose fruits were presented in the previous chapter. Luckily, because at the time I travelled to China to present [B08a] (which became [B09a]) in Xi'an, I had the opportunity to read the paper The Structure of Detour Degrees (DD) [K&V08] by Kristiansen and Voda, which was presented at the same conference.
Amongst other deficiencies, the approach from [IBF] involved an ad hoc way of defining inherent boundedness, resorting to Kleene-style computation trees. Although more cumbersome – and certainly less elegant than the detours from DD – essentially the same class would have been obtained if the same set of detour functions had been isolated. This chapter thus presents an integration of the elegant degree-theoretic formulation of detour degrees found in DD with the latent generalisations of those results which were hidden in [IBF] – relativisations which, in this new and exciting framework, finally find a satisfactory form. The theory has a flavour of honest degree theory, where an honest function is a total recursive monotone increasing function with elementary graph. See DD for a further discussion of this connection, and see e.g. [Kri98] for a brief introduction to honest degree theory and more references.

Note that we deviate slightly from the original definitions. Kristiansen and Voda only consider the unary predicates which have characteristic function in $\mathcal{F}(f)$, and they refer to unary predicates as problems. In this exposition, predicates of all arities are included.

3.2 $\mathcal{F}$-degrees – Relativised Detour Degrees

3.2.1 $\mathcal{X}$-degrees.

The first concept we will define is that of a detour (function). Informally, the detours will boost the computational strength of functions in some idc. $\mathcal{F}$ by supplying more resources – in the form of larger numbers with which to code, decode or execute a long iteration or recursion. However, the detour function should not bestow the functions of $\mathcal{F}$ with any additional computational strength inherent in the detour itself. That is, the computational complexity of the detour should not exceed that of $\mathcal{F}$ in the sense that $\Gamma_f$ should belong to $\mathcal{F}_\star$.
Indeed, most natural detours will have a rudimentary graph – they should provide muscle, but no additional brainpower. Below we define the set $D_{\mathcal{X}}$ of detour functions relative to a family of functions $\mathcal{X}$, and give a p.o. $\sqsubseteq_{\mathcal{X}}$ on $D_{\mathcal{X}}$. The intended use of these concepts is for argument-bounded idc.'s. We give the definitions in their most general form here before proceeding to investigate what can be inferred by substituting some of the idc.'s studied in the previous chapter for $\mathcal{X}$.

Definition 35 A function $f : \mathbb{N} \to \mathbb{N}$ is a detour if
$$\forall x \in \mathbb{N}\, \big(x \le f(x) \le f(x+1)\big).$$
If $\mathcal{X}$ is a family of functions, a detour $f$ is an $\mathcal{X}$-detour if $\Gamma_f \in \mathcal{X}_\star$. Let $D_{\mathcal{X}}$ denote the set of $\mathcal{X}$-detours. A function $\phi : \mathbb{N}^k \to \mathbb{N}$ is $\mathcal{X}$-computable in $f$ if
$$\exists \hat{\phi}^f \in \mathcal{X}\, \forall \vec{x} \in \mathbb{N}^k\, \big(\hat{\phi}^f(\vec{x}, f(\max(\vec{x}))) = \phi(\vec{x})\big).$$
For $f \in D_{\mathcal{X}}$ define
$$\mathcal{X}(f) \stackrel{\text{def}}{=} \{\phi \mid \phi \text{ is } \mathcal{X}\text{-computable in } f\}.$$

Note the superscript '$f$' on '$\hat{\phi}$' – a priori the $\hat{\phi}$ witnessing the $\mathcal{X}$-computability in $f$ may depend strongly on $f$. As lemma 42 below will show, for relatively well-behaved $\mathcal{X}$, just how intimately a $\hat{\phi}^f$ depends on the $f$ may be relaxed, so that $\hat{\phi}^f$ will compute $\phi$ for any $g \ge f$ (in the sense 'everywhere greater than'). This subtlety was totally suppressed in DD but – as lemma 42 emphasises – cannot be ignored in a complete and rigid development of the theory.

Convention 6 As all detours are unary by definition, there will be no ambiguity in writing simply $f(\vec{x})$ for $f(\max(\vec{x}))$. Thus one may think of the detours as functions $f : \mathbb{N}^{<\omega} \to \mathbb{N}$, where $\mathbb{N}^{<\omega}$ is the set of finite sequences of natural numbers. Secondly, for the remainder of this chapter, Latins $f, g, h, \ldots$ range over detours, while Greeks $\phi, \psi, \ldots$ range over functions. When a particular $f$ is clear from context, we may omit the superscript on $\hat{\phi}^f$. Furthermore, as we shall often consider the composition of detours, we abbreviate $f \circ g$ by $fg$ in the continuation.

Intuitively, there are many ways to think of $\mathcal{X}(f)$.
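Before turning to the intuition, the purely combinatorial content of Definition 35 can be checked mechanically on an initial segment. A minimal sketch (illustration only; the sample functions are mine, not from the thesis):

```python
# Checking the detour conditions  x <= f(x) <= f(x+1)  of Definition 35
# on an initial segment of N.

def is_detour_prefix(f, bound):
    """True iff x <= f(x) <= f(x+1) holds for all x < bound."""
    return all(x <= f(x) <= f(x + 1) for x in range(bound))

ident   = lambda x: x           # the identity is always a detour
succ_sq = lambda x: x * x + 1   # monotone and majorises its argument
const10 = lambda x: 10          # fails x <= f(x) as soon as x > 10
```

Of course a finite check cannot certify the universally quantified property; it only illustrates the definition.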
One way is to consider each $f \in D_{\mathcal{X}}$ as a functional $f : \mathrm{TR} \to \mathrm{TR}$ acting on a $\phi \in \mathcal{X}$ as defined by* $f(\phi)(\vec{x}, x_{k+1}) = \phi(\vec{x}, f(\max(\vec{x})))$. From this point of view $\mathcal{X}(f)$ is simply $f(\mathcal{X})$ – the image of $\mathcal{X}$ under $f$ considered as a functional. From a complexity point of view, when $\mathcal{X}$ is an idc., one may think of $f$ as an additional resource to be provided to the $\mathcal{X}$-machine $\hat{\phi}$, thus enabling it to compute more complex predicates. Alternatively, one may think of the $\phi \in \mathcal{X}$ as the functionals, acting on $f \in D_{\mathcal{X}}$ by $\phi(f)(\vec{x}, x_{k+1}) = \phi(\vec{x}, f(\vec{x}))$. From this perspective $\mathcal{X}(f) = \{\phi(f) \mid \phi \in \mathcal{X}\}$.

Definition 36 Let $\mathcal{X}$ be fixed. Define a relation $\sqsubseteq_{\mathcal{X}}$ on $D_{\mathcal{X}}$ by:
$$f \sqsubseteq_{\mathcal{X}} g \stackrel{\text{def}}{\Leftrightarrow} \mathcal{X}(f)_\star \subseteq \mathcal{X}(g)_\star.$$
The relations $f \sqsubset_{\mathcal{X}} g$ and $f \equiv_{\mathcal{X}} g$ are derived from $\sqsubseteq_{\mathcal{X}}$ as expected. The relation '$\sqsubseteq_{\mathcal{X}}$' inherits transitivity and reflexivity from '$\subseteq$', whence '$\equiv_{\mathcal{X}}$' is an equivalence relation, and '$\sqsubseteq_{\mathcal{X}}$' induces a p.o. '$\le_{\mathcal{X}}$' on the set $D_{\mathcal{X}}/{\equiv_{\mathcal{X}}} \stackrel{\text{def}}{=} \{[f]_{\equiv_{\mathcal{X}}} \mid f \in D_{\mathcal{X}}\}$ of $\equiv_{\mathcal{X}}$-equivalence classes.

Definition 37 (the $\mathcal{X}$-degrees) The set of $\mathcal{X}$-detour degrees is the set
$$D_{\mathcal{X}}/{\equiv_{\mathcal{X}}} \stackrel{\text{def}}{=} \{[f]_{\equiv_{\mathcal{X}}} \mid f \in D_{\mathcal{X}}\}.$$

Convention 7 We overload the symbol '$D_{\mathcal{X}}$' to also mean $D_{\mathcal{X}}/{\equiv_{\mathcal{X}}}$ when convenient, as it will be clear whether we talk about the detour $f$ or the equivalence class $[f]_{\equiv_{\mathcal{X}}}$. We also write $[f]_{\equiv_{\mathcal{X}}}$ as simply $[f]_{\mathcal{X}}$ or $\mathrm{dg}_{\mathcal{X}}(f)$. Boldface Latins $\mathbf{a}, \mathbf{b}, \mathbf{c}, \ldots$ usually denote degrees, and, when $\mathcal{X}$ is clear from context, we even sometimes write simply $f$ for $[f]_{\mathcal{X}}$. We use $\le_{\mathcal{X}}$ for the ordering induced on the $\mathcal{X}$-degrees by $\sqsubseteq_{\mathcal{X}}$, and derive $<_{\mathcal{X}}$ and $=_{\mathcal{X}}$ the standard way. Of course, this ordering may not be total, so $\mathbf{a} \perp_{\mathcal{X}} \mathbf{b}$ has the usual meaning. Finally, in all of the above notation, a sub- or superscripted $\mathcal{X}$ may be omitted whenever clear from context or irrelevant.

Some readers might ponder why the monotonicity requirement
$$x \le f(x) \le f(x+1) \qquad (\text{M})$$
is not sufficient. The explanation for why one must lay restrictions upon the graph of $f$ is outlined here, and amplified successively throughout the chapter.
Assume $\phi$ satisfies (M), and that $\phi \preceq f$ for some $\mathcal{X}$-detour $f$. The scenario where $\phi(x) \in R \Leftrightarrow x \in S$ and $R \in \mathcal{X}_\star$ (i.e. $R$ is of the same complexity as $\mathcal{X}_\star$) while $S \notin \mathcal{X}(f)_\star$ (high complexity relative to $\mathcal{X}_\star$ and $\mathcal{X}(f)_\star$) is quite possible (for an example see p. 147). This would mean that we could not use majorisation as a first coarse measure of boosting power (which we want, see corollary 44), and we would lose control over the $\sqsubseteq_{\mathcal{X}}$-relation. The key idea is that detours grow in an 'uninformed' manner, accommodating a brain–muscle dichotomy within the induced p.o. $(\{\mathcal{X}(f)_\star \mid f \in D_{\mathcal{X}}\}, \subseteq)$.

*For this to be well-defined, define e.g. $f(\max(\varepsilon)) = 0$, where $\varepsilon$ is the empty sequence.

The definitions are so far very general, and unless $\mathcal{X}$ is a given sub-recursive – algorithmic – structure, we cannot say very much about the resulting degree structure. However, when $\mathcal{X}$ is some idc. $\mathcal{F}$, with its inherent algorithmic interpretation, the resulting structure will be an intriguing one. In particular, when $\mathcal{F}$ and $\mathcal{G}$ are idc.'s

1. consisting of argument-bounded functions only,
2. closed under bmin, and satisfying
3. $\Delta^{\mathbb{N}}_0 \subseteq \mathcal{F}_\star, \mathcal{G}_\star$,

one can develop an interesting theory for investigating the properties of $\sqsubseteq_{\mathcal{F}}$, and for comparing $\sqsubseteq_{\mathcal{F}}$ and $\sqsubseteq_{\mathcal{G}}$.

3.2.2 Foundations.

Definition 38 (Foundation) An idc. $\mathcal{F}$ satisfying

1. $\forall \phi \in \mathcal{F}\, \exists c_\phi \in \mathbb{N}\, (\phi(\vec{x}) \le \max(\vec{x}, c_\phi))$,
2. $\mathcal{F} = [\mathcal{F}; \mathrm{bmin}]$ and
3. $\Delta^{\mathbb{N}}_0 \subseteq \mathcal{F}_\star$

will be called a foundation. Item 1 states that all functions in a foundation are a.b., while 2 asserts that $\mathcal{F}$ is closed under bounded minimalisation. Combined with item 3, which requires that $\mathcal{F}_\star$ is a quite rich collection of predicates, items 1 and 2 mean that $\mathcal{F}$ and $\mathcal{F}_\star$ will contain sufficiently many functions and predicates to ensure an array of nice closure properties. In the continuation, all $\mathcal{F}$'s are foundations, but we will mark theorems etc. with an (F) to point out that we always assume that 1–3 above hold.
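The closure under bmin required of a foundation can be read operationally; the following is a minimal sketch (illustration only, with my own sample predicates), using the usual convention that a failed bounded search returns the bound itself:

```python
# Bounded minimalisation  mu_{z<=y}[phi1 = phi2]  as a plain loop:
# the least z <= y with phi1(x, z) == phi2(x, z), and y if there is none.

def bmin(phi1, phi2, x, y):
    for z in range(y + 1):
        if phi1(x, z) == phi2(x, z):
            return z
    return y

# Example: the least divisor of x greater than 1, capped at y.
phi1 = lambda x, z: x % z if z > 1 else 1
phi2 = lambda x, z: 0
```

The cap is what keeps the operation within an argument-bounded setting: the search can never produce a value exceeding its bound.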
The list of functions which are known to have rudimentary graph is long, and includes the logical functions $\chi_=, \chi_\le, \chi_\wedge, \chi_\neg, \ldots$, functions like $\min, \max, C, \dot{-}, S, P, \times, x^y, \ldots$, and – when composed with either* $\lfloor\cdot\rfloor$ or $\lceil\cdot\rceil$ – maps like $\sqrt{\cdot}$, $\sqrt[k]{\cdot}$, $\log_k(\cdot)$, $e^{\cdot}$, ….

*Here $\lfloor\cdot\rfloor, \lceil\cdot\rceil : \mathbb{R} \to \mathbb{Z}$ are defined by $\lfloor r\rfloor \stackrel{\text{def}}{=} \max\{n \in \mathbb{N} \mid n \le r\}$ and $\lceil r\rceil \stackrel{\text{def}}{=} \min\{n \in \mathbb{N} \mid n \ge r\}$. $\lfloor\cdot\rfloor$ is often called the integer part function, and clearly $\lceil r\rceil = \lfloor r\rfloor + 1$ if $r \notin \mathbb{Z}$, and $\lceil r\rceil = \lfloor r\rfloor$ if $r \in \mathbb{Z}$. For any map $\phi : \mathbb{R} \to \mathbb{R}$ satisfying $\forall n \in \mathbb{N}\,(\phi(n) \ge 0)$, we have $\lfloor\cdot\rfloor \circ \phi : \mathbb{N} \to \mathbb{N}$.

3.2.3 Bootstrapping, remarks and subtleties.

Items 2 and 3 above ensure that:

Proposition (F) 37 Let $\Gamma_\phi \in \Delta^{\mathbb{N}}_0$. Then (i) $\bar{\phi} \in \mathcal{F}$ and (ii) if $\phi$ has a strict top-index or is bounded by a constant, then $\phi \in \mathcal{F}$. Moreover, (iii) $\max \in \mathcal{F}(\mathrm{id})$, (iv) all $\xi \in \mathcal{F}(\mathrm{id})$ are a.b. and (v) $\mathcal{F}(\mathrm{id})$ is closed under composition. Also (vi) if $\phi$ is a.b., then $\phi \in \mathcal{F}(\mathrm{id})$.

Proof: (i): Recall that $\bar{\phi} \stackrel{\text{def}}{=} \min(\phi(\vec{x}), y)$ is a $(k+1)$-ary function. We immediately see that
$$\bar{\phi}(\vec{x}, y) = \mu_{v \le y}[\chi_{\Gamma_\phi}(\vec{x}, v) = 0].$$
(ii): If $\phi(\vec{x}) \le x_i$ ($\phi$ has a strict top-index), then $\phi(\vec{x}) = \mu_{v \le x_i}[\chi_{\Gamma_\phi}(\vec{x}, v) = 0]$, while, if $\phi(\vec{x}) \le c$ ($\phi$ is bounded by a constant), then $\phi(\vec{x}) = \mu_{v \le c}[\chi_{\Gamma_\phi}(\vec{x}, v) = 0]$.
(iii): Simply verify that $\widehat{\max}^{\mathrm{id}}(\vec{x}, z) = \overline{\max}(\vec{x}, z)$. Hence $\max \in \mathcal{F}(\mathrm{id})$.
(iv): Since all $\hat{\xi}^{\mathrm{id}} \in \mathcal{F}$ are a.b., and since $\xi \in \mathcal{F}(\mathrm{id})$ satisfies
$$\xi(\vec{x}) = \hat{\xi}^{\mathrm{id}}(\vec{x}, \mathrm{id}(\vec{x})) \le \max(\vec{x}, \max(\vec{x}), c_{\hat{\xi}^{\mathrm{id}}}) = \max(\vec{x}, c_{\hat{\xi}^{\mathrm{id}}}),$$
we see that $\xi$ is a.b.
(v): Assuming now that $\xi, \vec{\gamma} \in \mathcal{F}(\mathrm{id})$, fix $\hat{\xi}$ and $\hat{\gamma}_i$'s in $\mathcal{F}$. Then
$$\psi(\vec{x}, z) \stackrel{\text{def}}{=} \hat{\xi}(\hat{\gamma}_1(\vec{x}, z), \ldots, \hat{\gamma}_\ell(\vec{x}, z), z) \in \mathcal{F}$$
satisfies
$$\psi(\vec{x}, \mathrm{id}(\vec{x})) = \hat{\xi}(\hat{\gamma}_1(\vec{x}, \mathrm{id}(\vec{x})), \ldots, \hat{\gamma}_\ell(\vec{x}, \mathrm{id}(\vec{x})), \mathrm{id}(\vec{x})) = \hat{\xi}(\gamma_1(\vec{x}), \ldots, \gamma_\ell(\vec{x}), \mathrm{id}(\vec{x})) = \xi(\gamma_1(\vec{x}), \ldots, \gamma_\ell(\vec{x})) = \xi \circ \vec{\gamma}(\vec{x}).$$
That is, $\psi = \widehat{\xi \circ \vec{\gamma}}^{\mathrm{id}}$, so $\xi \circ \vec{\gamma} \in \mathcal{F}(\mathrm{id})$.
(vi): Now, if $\phi$ is a.b., then $\phi(\vec{x}) = \bar{\phi}(\vec{x}, \max(\vec{x}, c_\phi))$, and the right side of this last expression belongs to $\mathcal{F}(\mathrm{id})$ by (i)–(v).
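Item (i) of Proposition 37 has a very concrete reading: a function with a decidable graph is recovered, in min-truncated form, by one bounded search over its graph. A toy sketch (illustration only; the sample function $x \mapsto \lfloor x/3 \rfloor$ is mine), with the convention that characteristic functions return 0 for 'true':

```python
# Recovering  phi_bar(x, y) = min(phi(x), y)  from the graph of phi by a
# single bounded mu-search, as in Proposition 37(i).

def chi_graph(x, v):
    # graph of the sample function phi(x) = x // 3, coded 0 = 'true'
    return 0 if v == x // 3 else 1

def phi_bar(x, y):
    for v in range(y + 1):
        if chi_graph(x, v) == 0:
            return v
    return y        # a failed search returns the bound, giving min(phi(x), y)
```

Note how the truncation at $y$ falls out automatically from the search convention: when $\phi(x) > y$ the loop exhausts its budget and returns $y$ itself.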
This proposition and its proof are tedious and straightforward, but represent necessary bootstrapping. We shall give more concise and enlightening results below, but need to start somewhere. It is worth noting, though, that when $\max \notin \mathcal{F}$ there are some subtleties which need to be dealt with carefully. For instance, if $\max \notin \mathcal{F}$ then $\mathcal{F} \ne \mathcal{F}(\mathrm{id})$. Before treating this issue further, we need to carry out some more groundwork. We also make brief pauses in this section in order to discuss and elaborate on some fine points regarding the $\mathcal{F}$-degrees. Hopefully this will aid the reader in building some intuition about the emerging theory*.

*The reader who immediately recognises the results here as true may skip to the next section.

Our next lemma is quite obvious, but we include it for completeness; it gives two useful closure properties of $\mathcal{F}_\star$, and furthermore enables us to continue to blur the distinction between '$\vec{x}$' and '$\max(\vec{x})$' in various contexts.

Proposition (F) 38 Let $\mathcal{F}$ be a foundation. Then (i) $\mathcal{F}_\star$ is Boolean and closed under variable-bounded quantification; (ii) $R(\vec{x}, z) \in \mathcal{F}_\star \Rightarrow \exists z \le \max(\vec{y})\, R(\vec{x}, z) \in \mathcal{F}_\star$; (iii) let $f \in D_{\mathcal{F}}$; then the relation '$f(\max(\vec{x})) = v$', i.e. $\Gamma_\phi$ for $\phi = f \circ \max$, belongs to $\mathcal{F}_\star$.

Proof: As noted, $\mathcal{F}_\star$ will contain the logical functions because they have $\Delta^{\mathbb{N}}_0$-graph and are bounded by 1, whence $\mathcal{F}_\star$ is Boolean. Now, if $R(\vec{x}, z) \in \mathcal{F}_\star$, then the function
$$\phi(\vec{x}, y) \stackrel{\text{def}}{=} \chi_R(\vec{x}, \mu_{z \le y}[\chi_R(\vec{x}, z) = 0])$$
is the characteristic function for $\exists z \le y\, R(\vec{x}, z)$. For (ii), we observe that $\exists z \le \max(\vec{y})\, R(\vec{x}, z) \Leftrightarrow \bigvee_{i \le \ell} \exists z \le y_i\, R(\vec{x}, z)$. Next, (iii) follows from (ii), since $f(\vec{x}) \stackrel{\text{def}}{=} f(\max(\vec{x})) = v \Leftrightarrow \exists z \le \max(\vec{x})\, \Gamma_f(z, v)$, which is in $\mathcal{F}_\star$. q.e.d.

On $\mathcal{F}$ vs. $\mathcal{F}(\mathrm{id})$. The next proposition yields the minimality of id in the $\sqsubseteq_{\mathcal{F}}$-order on the detours as a corollary. It also allows us to identify $\mathcal{F}$ and $\mathcal{F}(\mathrm{id})$ in most settings.

Proposition (F) 39 $\mathcal{F}(\mathrm{id})_\star = \mathcal{F}_\star$. Furthermore $\mathcal{F} \subseteq \mathcal{F}(f)$ for all $f \in D_{\mathcal{F}}$.
def Proof: If φ ∈ F and ψ is any unary function, by defining φ0 (~x, z) = φ(~x), informally we see that ‘φ0 = φ̂ψ ’, so that ‘φ ∈ F(ψ)’. In particular, when ψ = f ∈ DF , then φ ∈ F(f ). Hence F ⊆ F(f ) for any f , and, in particular for f = id. Hence F ⊆ F(id), whence F? ⊆ F(id)? . def To see that F(id)? ⊆ F? , we note that R ∈ F(id)? ⇔ χ̂id R ∈ F, and this function satisfies: 0 , if ~x ∈ R χ̂id (~ x , max(~ x )) = . R 1 , if ~x ∈ 6 R Because (~x, v) ∈ Γφ ⇔ χ= (φ(~x), v) = 0 , and χ= have ∆N 0 -graph and is bounded by 1 we have in general that ΓF ⊆ F? . Hence Γχ̂idR ∈ F? . But then, ~x ∈ R ⇔ ∃v≤~x (Γmax (~x, v) ∧ Γχ̂idR (~x, v, 0)) , and the right expression defines an F? -predicate by lemma 38 and the fact that q.e.d. max have ∆N 0 -graph. Motivated by this result we define: def Definition 39 0F = dgF (id) . The justification for this definition consists of the observation that f @F id contradicts F ⊆ F(f ) and F(id)? = F? , and thus ∀aF ∈DF (0F ≤F aF ). That is, 0F is the vF -minimum (for all foundations F). As usual, superscripts ‘F’ are usually omitted. We now return to the discussion of the case where possibly max 6∈ F . 118 CHAPTER 3. RELATIVISED DETOUR DEGREES def def For F = [X ; op], let F max = [X ∪ {max} ; op] , and observe that F ⊆ F max trivially. Now, if max ∈ F then F ⊆ F(id) by property 39, and the inclusion F(id) ⊆ F max is also easy to prove: if φ ∈ F(id), let φ̂id ∈ F ⊆ F max , so that φ(~x) = φ̂id (~x, max(~x)) ∈ F max . On the other hand, when max 6∈ F, to prove that F max ⊆ F(id) is a more delicate matter. The obvious attempt would entail e.g. induction on f ∈ F max . But, in the inductive step, where φ = σ(~γ ) for some σ ∈ op, we simply do not have enough information to proceed. Informally, we need to know something about how σ(~γ )(~x) is computed, and we depend on some way of propagating the ‘z’ in say γ̂i (~x, z) via an operator σ 0 which accommodates the change of arity from ar(γi ) to ar(γi ) + 1. However, if Γφ ∈ F? 
for all φ ∈ F max , then the equality ‘F(id) = F max ’ comes easily: for φ ∈ F max we obtain φ̂id (~x, z) = µv≤z [Γφ (~x, v) = 0] . Observation (F) 40 ΓF max ⊆ F? ⇒ F =? F(id) = F max . q.e.d. It is folklore that Bel’tyukov proved that E 0 max =? E 0 , and what we would like is to give some general condition for when F =? F max . What observation 40 gives is one such sufficient condition when F is a foundation. It may be interesting to investigate this matter further, but we will not here in this dissertation. In any case, for all foundations F we will consider in the sequel, either max ∈ F, or ΓF max ⊆ F? , and we do not wish to delve more on this issue. However, the assumption ΓF max ⊆ F? will be needed for some results below. On F(f ) vs. F(g). Note that f is F-computable in f because fˆf (x, z) = µv≤z [Γf (x) = v], and it is evident that when f ≤ g then fˆf (x, g(x)) = f (x) for this particular f . However, it is not the case that f = g implies that either is F-computable in ae the other. In fact, since φ̂(~x, g(~x)) ≤ max(~x, g(~x)) = g(~x), we see that any φ – and thus f – can be F-computable in g only if φ g . Observation (F) 41 φ ∈ F(f ) ⇒ ∃cφ ∈N (φ(~x) ≤ max(~x, f (~x), cφ ) = max(f (~x), cφ )) . Rephrased: φ ∈ F(f ) ⇒ φ f . If we call a function φ f -bounded when it is majorised by f , the terminology fits nicely with the terminology∗ of 0-boundedness, 1-boundedness etc. A function is e.g. i-bounded iff it is f -bounded by a detour in E i with rudimentary-graph. ∗ See e.g. Gandy [Gan84]. 3.2. F-DEGREES – RELATIVISED DETOUR DEGREES 119 On φ̂f vs. φ̂g . The next result is also important for the further developments. It states that the nature of the dependence of φ̂f on f , is essentially tied to the -properties of f . Lemma (F) 42 If φ ∈ F(f ), then φ̂f can be chosen so that φ(~x) , if f (~x) ≤ z φ̂f (~x, z) = . z , otherwise Proof: Let φ be F-computable in f , fix φ̂f ∈ F, and recall that Γf ∈ F? by definition. 
Consider the function ψ(~x, z) defined by: def ψ(~x, z) = φ̂f (~x, µv≤z [Γf (~x, v) = 0]) . Clearly, when z ≥ f (~x), then µv≤z [Γf (~x, v) = 0]) = f (~x), whence we infer that def ψ(~x, z) = φ̂f (~x, f (~x)) = φ(~x). q.e.d. This lemma confirms our intuition that majorisation and vF are connected. This connection will be emphasised further by the corollary 44∗ below. On the notation Ff . We introduce our last new notation for a while: define Ff = F(f )? . By the above, this notation would not be well-defined if Ff was used for the set of def functions F(f ), and since f, g ∈ a ⇔ F(f )? = F(g)? , this notation should be rather unambiguous. def def P. 39 Thus, recalling that 0 = [id] and F0 = F(id)? = F? , also justifies using F0 and F? interchangeably. This notation will be very versatile and well suited for expressing and discussing sub-recursion theory. Especially when the theory developed here is applied to the classes investigated in chapter 2, the clarity it brings to the subject will be appreciable. 3.2.4 Closure properties of F(f ) and Ff . In general F(f )’s closure properties will not match the closure properties of the underlying foundation F. On the contrary, when in the sequel F is, say [ ; pr], then F(f ) is not closed under primitive recursion, when f is non-a.b. Indeed, F(f ) will not even be closed under composition, but for in a very weak and obvious sense: Lemma (F) 43 If φ ∈ F(f ) and ~γ ∈ F(g), then φ ◦ ~γ ∈ F(f g). Proof: Consider def ψ(~x, z) = φ̂(γ̂1 (~x, z), . . . , γ̂` (~x, z), z) . ∗ Corollary to lemmata 42 & 43 120 CHAPTER 3. RELATIVISED DETOUR DEGREES L. 42 That g ≤ f g ensures ψ(~x, f g(~x)) = φ̂(~y , f (g(~x))) , where yi = γi (~x). By ae fg ae q.e.d. hypothesis g(~x) ≥ max(~y ) which means that ψ = φ[ ◦ ~γ . ae Corollary (F) 44 If φ0 = φ, and φ0 ∈ F(f ), then φ ∈ F(f ). In particular f g ⇒ f vF g . Thus, we can assume w.l.o.g. that f g ⇒ φ̂f (~x, g(~x)) = φ(~x). P. 
37 ae Proof: Since max, C ∈ F(id), when φ = φ0 ∈ F(f ), we can construct φ(~x) by by explicit definition by cases, first treating the finitely many cases where φ0 (~x) differ from φ(~x), ended by a ‘φ0 (~x) otherwise’-clause. This composition is a composition of the F(f )-functions φ and constants c~y , for ~y ∈ {~x | φ0 (~x) 6= φ(~x) }, L. 46 into an F(id)-function, whence φ ∈ F(id ◦ f ) = F(f ) . ae Hence, if f g and φ ∈ F(f ), by lemma 42 we have φ̂f (~x, g(~x)) = φ(~x), whence φ ∈ F(g). q.e.d. Corollary (F) 45 Let φ ∈ F(f ) and ~γ ∈ F(g). If f g h, then φ ◦ ~γ ∈ F(h). In particular, since f ◦id = id◦f = f , when φ ∈ F(f ) and ~γ ∈ F(id) or φ ∈ F(id) and ~γ ∈ F(f ), then φ ◦ ~γ ∈ F(f ). q.e.d. On the other hand, F(f ) will be closed under bmin, and, if F is closed under bcount, this closure property is also preserved by all F(f ). Lemma (F) 46 F(f ) is closed under bmin. Furthermore, if F is closed under bcount, then F(f ) is closed under bcount. def Proof: Let φ1 , φ2 ∈ F(f ), and consider ψ = µz≤y [φ1 , φ2 ]. Choose φ̂1 , φ̂2 ∈ F and set def ξ(~x, y, u) = µz≤y [φ̂1 (~x, z, u) = φ̂2 (~x, z, u)] . Then ψ̂ = ξ. def Similarly, if F is closed under bcount and ψ = ]z≤y [φ1 , φ2 ], choose φ̂1 , φ̂2 ∈ F and set def ξ(~x, y, u) = ]z≤y [φ̂1 (~x, z, u) = φ̂2 (~x, z, u)] . q.e.d. It is important to note that this lemma does not imply closure under f -bounded minimalisation, nor f -bounded counting. First of all, this would contradict theorem 48 below, since that result implicitly states that essentially no more than one f -bounded search can be performed. Consider for example the function∗ ψ = µz≤f (x) [µv≤f (z) [χˆR ff (~x, v), 0] = 0] . If F(f ) was closed under f -bounded µ-search, the function ψ above would be admitted into F(f ) in general. But this definition ‘cheats’: the construction of ∗ Recall that ff abbreviates f ◦ f . 121 3.2. 
F-DEGREES – RELATIVISED DETOUR DEGREES ψ above creates a χ̂fR from a χ̂ff R , and as we will show in the sequel there are examples of triplets F, R, f such that R ∈ Fff \ Ff . Proposition (F) 47 Ff is Boolean and closed under variable-bounded quantification. Proof: The logical functions belong to F(id), whence a Boolean combination of Ff -predicates belong to F (id ◦ f )? = Ff . If χ̂fR ∈ F, then χ̂f∃z≤yR = χ̂fR (~x, µz≤y [χ̂fR (~x, v, z) = 0], z) ∈ F . q.e.d. We also have the following descriptive-complexity-like characterisation of Ff : Theorem (F) 48 R(~x) ∈ Ff ⇔ ∃R0 ∈ F0 R(~x) ⇔ ∃z≤f (~x) R0 (~x, z) . Proof: Let R ∈ Ff , fix χ̂R ∈ F , and recall that Γχ̂R ∈ F? . Moreover, Γf ∈ F0 by definition, and as F0 is Boolean, the matrix R0 , defined by R0 (~x, z) ⇔ Γf (~x, z) ∧ Γχ̂R (~x, z, 0) , belongs to F0 . Hence R(~x) ⇔ ∃z≤f (~x) (R0 (~x, z)) , which proves the ⇒-direction. Next, let R0 (~x, z) be an arbitrary F0 -predicate, i.e. χR0 ∈ F. We must prove ∃z≤f (~x) (R0 (~x, z)) ∈ Ff . To this end, set def φ(~x, u) = χR0 (~x, µz≤u [χR0 (~x, z) = 0]) . That φ ∈ F is routine to verify. Assume there exists z ≤ f (~x) such that R0 (~x, z). Then φ(~x, f (~x)) = 0 since (~x, µz≤f (~x) [χR0 (~x, z) = 0]) ∈ R0 . Similarly, if for all z ≤ f (~x) we have (~x, z) 6∈ R0 , then µz≤f (~x) [χR0 (~x, z) = 0] = f (~x), and so φ(~x, f (~x)) = 1. That is φ = χ̂fR , and so R ∈ Ff . This concludes the proof. q.e.d. The theorem above yields a normal form for Ff -predicates: an F0 -matrix, with a simple, f -bounded existential prefix. One convenient consequence is that for detours f and g the set def B = ~x ∈ Nk | f (~x) ≥ g(~x) belongs in Ff because ~x ∈ B ⇔ ∃z≤f (~x) (g(~x) = z) , and the graph of g is F0 by definition. We shall have use for this below. 122 CHAPTER 3. RELATIVISED DETOUR DEGREES 3.2.5 The structure (DF , vF , ∩F , ∪F ) . We now proceed by reproving, in the general setting of foundations, many of the results from DD regarding the structure (DF , vF ). 
That is, we shall define binary operations ∩F (meet or cap) and ∪F (join or cup) on DF , such that (DF , vF , ∩F , ∪F ) become a distributive lattice ∗ . Lemma (F) 49 Let x, y ≤ φ(x, y) ≤ φ(x + 1, y), φ(x, y + 1), Γφ ∈ F0 and f, g ∈ DF . Then φ ◦ (f, g), f g ∈ DF . Proof: Because of the monotonicity of φ φ ◦ (f, g)(x) = v ⇔ ∃zf ,zg ≤v (Γf (x, zf ) ∧ Γg (x, zg ) ∧ Γφ (zf , zg )) , so that Γφ◦(g,f ) ∈ F0 . Monotonicity of φ ◦ (f, g) follows from def x ≤ f (x), g(x) ⇒ x ≤ φ(f (x), g(x)) ≤ φ(f (x + 1), g(x + 1)) = φ ◦ (f, g)(x + 1) . That f g ∈ DF is equally straightforward. q.e.d. Corollary (F) 50 If f, g ∈ DF , then∗∗ Ek ◦ (f, g) ∈ DF . Hence, any unary f in the idc. {DF , E0 , E1 , E2 , . . .}◦ will be an F-detour. q.e.d. We defer the proof of the monotonicity properties of the Ek ’s to chapter 4, which is exclusively concerned with the majorisation-relation (see e.g. proposition 86 (p. 156)). That all∗∗∗ p ∈ N[x] have ∆N 0 -graph is well-known folklore, and that each Ek have rudimentary graph follows from general results in e.g. Esbelin and More [E&M98] – indeed, even the super-primitive recursive x 7→ Ex (x, x) has a rudimentary graph, as demonstrated by Calude [Cal87]. Lemma (F) 51 If f, g ∈ DF , then min ◦ (f, g) ∈ DF . Proof: That Γmin◦(f,g) ∈ DF follows from _ Γf (x, v) ∧ Γg (x, v) Γf (x, v) ∧ ∀z≤v ¬Γg (x, z) min ◦ (f, g)(x) = v ⇔ Γg (x, v) ∧ ∀z≤v ¬Γf (x, z) . Furthermore, x ≤ min(f (x), g(x)) ≤ min(f (x + 1), g(x + 1)) is a direct consequence of the monotonicity properties of f and g. q.e.d. Lemma (F) 52 Let ψ(x) ≤ ψ(x + 1) satisfy Γψ ∈ F0 , and let f ∈ DF . Then max ◦ (ψ, f ) ∈ DF . ∗ A lattice is a p.o. in which any pair of elements a, b have a least upper bound a ∪ b and a greatest lower bound a ∩ b. In a distributive lattice ∩ and ∪ distributes over each other, viz.: a ∩ (b ∪ c) = (a ∩ b) ∪ (a ∩ c) and a ∪ (b ∩ c) = (a ∪ b) ∩ (a ∪ c). See e.g. [Rog67, p. 223] for more details. 
∗∗ Recall that E (x, y) = max(x, y) , E (x, y) = x + y , E = x · y , E (x, y) = xy , . . . (see 0 1 2 3 definition 2.3.1) ∗∗∗ N[x] is the set of polynomials with natural coefficients. 123 3.2. F-DEGREES – RELATIVISED DETOUR DEGREES Proof: Set φ = max ◦ (ψ, f ) That Γφ ∈ F0 follows from Γφ (x, v) ⇔ (Γf (x, v) ∧ ∀z≤v ¬Γψ (x, z)) ∨ (Γψ (x, v) ∧ ∀z<v ¬Γf (x, z)) . def Next x ≤ f (x) implies x ≤ max(ψ(x), f (x)) = φ(x) , and finally max(ψ(x), f (x)) ≤ max(ψ(x + 1), f (x + 1)) follows from ψ(x) ≤ ψ(x + 1) and f (x) ≤ f (x + 1) . q.e.d. Hence max◦(f, g) ∈ DF . Combined with this last lemma, lemma 49 also ensures the existence of slow-growing detours which can be ‘wedged between’ canonical ones. E.g. x 7→ x2 ≤ x 7→ max(x2 , x2 + blog2 xc) = x 7→ x2 + blog2 xc ≤ x 7→ x2 + x . Lemma (F) 53 Let e, f, g, h ∈ DF . Then (i) h vF f, g ⇒ h vF min ◦ (f, g); (ii) f, g vF h ⇒ max ◦ (f, g) vF h. (iii) F(min ◦ (f, g))? = Ff ∩ Fg ; Assume furthermore that e ≡F f and g ≡F h. Then (iv) min ◦ (e, g) ≡F min ◦ (f, h), (v) max ◦ (e, g) ≡F max ◦ (f, h) Proof: (i): Let m(x) denote min ◦ (f, g)(x). Consider that def ∃z≤m(~x) Γf (~x, z) ⇔ ~x ∈ B = ~x ∈ Nk | f (~x) ≤ g(~x) , T. 48 and that ∃z≤m(~x) Γf (~x, z) ∈ Fm . Let A ∈ Fh . By definition of vF we have A ∈ Ff , Fg , so we can fix χ̂fA and χ̂gA in F such that e.g. z ≥ f (~x) ⇒ χˆA f (~x, z) = χA (~x). But then def φ(~x, z) = C̄(χ̂fA (~x, z), χ̂gA (~x, z), χ̂m x, z), 1) B (~ verifies A ∈ Fm , which proves (i). (ii): Let M denote max ◦ (f, g)(x), and let A ∈ FM . We note first that the sets def def Bf = ~x ∈ Nk | f (~x) > g(~x) and Bg = ~x ∈ Nk | g(~x) ≥ f (~x) belong to Ff and Fg respectively, whence they both belong to Fh by assumption. def Also, Af = Bf ∩ A ∈ Ff because χ̂fAf (~x, z) = χ̂M x, z) , χ̂fBf (~x, z) = 0 A (~ 1 , otherwise 124 CHAPTER 3. RELATIVISED DETOUR DEGREES and similarly Ag ∈ Fg . Since Fh is Boolean, and since A = Af ∪ Ag we are done. (iii): Set m = min ◦ (f, g) and assume A ∈ Fm . 
Since m ≤ f, g, we have by lemma 42 that for arbitrary ~x ∈ Nk : χ̂m x, f (~x)) = χ̂m x, g(~x)) = χ̂m x, m(~x)) = χA (~x) , A (~ A (~ A (~ which shows that A ∈ Ff ∩ Fg . If A ∈ Ff ∩ Fg , fix χ̂fA , χ̂gA ∈ F, and set def B = ~x ∈ Nk | f (~x) ≤ g(~x) . Now ~x ∈ B ⇔ ∃z≤m(~x) (Γf ◦max (~x, z)) , whence B ∈ Fm . But then, we have for def φ(~x, z) = C̄(χ̂fA (~x, z), χ̂gA (~x, z), χ̂m x, z), 1) , B (~ that φ ∈ F and φ = χ̂m A. (iv): Let e, f, g, h be as specified under (iv). This item now follows directly from (iii) because by assumption we have Fe ∩ F g = Ff ∩ F h . (v): Let e, f, g, h be as specified under (iv). Set M = max ◦ (e, g), and def def let A ∈ FM . Consider Ae = A ∩ ~x ∈ Nk | g(~x) ≤ e(~x)) and Ag = A ∩ ~x ∈ Nk | e(~x) ≤ g(~x)) , so that Ae ∪ Ag = A. Observe next that ~x ∈ Ae ⇔ χ̂eA (~x, e(~x)) = 0 ∧ ∃z≤e(~x) Γg (~x, z) , so that Ae ∈ Fe , and thus by e ≡F f we have Ae ∈ Ff . The analogue construction also yields Ag ∈ Fh . But then, since f, h max ◦ (f, h), we have Ae , Ag ∈ F(max ◦ (f, h))? , and thus A = Ae ∪ Ag as well. This proves FM ⊆ F(max ◦ (f, h))? , and the proof of the converse inclusion is completely symmetrical. q.e.d. The observant reader will have noticed the ‘missing item’ from the list (i)–(v) in ? lemma 53: ‘F(max ◦ (f, g)) = Ff ∪ Fg ’ . The question of whether this equality holds is an open problem which we shall not investigate further here. However, it is interesting to note that a similar phenomenon arises in honest degree theory, see Kristiansen [Kri01]. Definition 40 For a, b ∈ DF , let f ∈ a and g ∈ b. Define a ∩F b = dgF (min ◦ (f, g)) and a ∪F b = dgF (max ◦ (f, g)) . lemma 53 ensures that ∩F and ∪F are well-defined binary operations on DF , and – observing that maximum distributes over minimum and vice versa – also proves the next theorem: Theorem (F) 54 (The Lattice of F-Degrees) The structure (DF , vF , ∪F , ∩F ) is a distributive lattice∗ . ∗A q.e.d. distributive lattice is a p.o. 
(D, v) with binary operations bl bg such that bl (a, b) is a greatest lower bound for a and b w.r.t. v, b2 (a, b) is a least upper bound for a and b w.r.t. v, and such that b1 distributes over b2 and vice versa. 125 3.2. F-DEGREES – RELATIVISED DETOUR DEGREES 3.2.6 An enhanced lemma. The theorem below is an enhanced version of the Lemma 7. from DD, which states that when f, g, h are detours and h is strictly monotone increasing, then f vF g ⇒ f h vF gh. The proof found in DD is quite contrived, and much less general than the one we give here: in DD closure of F under bcount1 is used in an essential way, in addition to drawing on the auxiliary notion of inversedetours. Thirdly, in our proof of the result for general idc.’s, we weaken the hypothesis that h should be strictly monotone increasing – a relaxation which will be exploited in an essential way in e.g. corollary 57 below. Theorem (F) 55 Let f, g, h ∈ DF , then: f vF g ⇒ f h vF gh . Proof: Recall that the graphs of all the involved detours f, g, h, f h and f g belong in F by definition and closure of DF under composition. Let A ∈ Ffh , and fix χ̂fAh ∈ F. Clearly, for ~x such that f h(~x) ≤ gh(~x), then L. 42 χ̂fAh (~x, gh(~x)) = χ̂fAh (~x, f h(~x)) = χA (~x) . Next, consider the F0 -predicate B 0 (~x, z, u) defined by: def B 0 (~x, z, u) ⇔ Γh (~x, z) ∧ ∃v≤u Γgh (~x, v) ∧ χ̂fAh (~x, v) 6= χ̂fAh (~x, u) . We first observe that since h is a detour, Γh (~x, z) ⇒ max(~x, z) = z. Consequently Γh (~x, z) ⇒ f (~x, z) = f (z) . (†) def Set B(~x, z) ⇔ B 0 (~x, z, f (z)), so that def B(~x, z) ⇔ Γh (~x, z) ∧ ∃v≤f (z) Γgh (~x, v) ∧ χ̂fAh (~x, v) 6= χ̂fAh (~x, f (z)) . Then B ∈ Ff , since χB 0 = χ̂fB . As f vF g by assumption, we infer that also B ∈ Fg . † Next ∀x∈N (h(x), g(x) ≤ gh(x)) ⇒ g, h vF gh, which means that B, and h are both F-computable in gh, and thus – by h(x) ≤ gh(x) for all x – we can find gh χ̂gh ∈ F. 
B , ĥ Examining the definition of B, we see that it consists of those (~x, z) for which h(~x) = z, such that and the value v = g(z) = gh(~x) computes χA incorrectly via χ̂fAh . But then, by defining def ξ(~x, z) = · fh 1− χ̂A (~x, z) , if χ̂gh x, ĥ(~x, z), z) = 0 B (~ fh gh χ̂A (~x, z) , if χ̂B (~x, ĥ(~x, z), z) = 1 , we clearly have ξ ∈ F and ξ(~x, gh(~x)) = χA (~x), and so A ∈ Fgh . This concludes the proof. q.e.d. Informally, Fgh knows exactly when χ̂fAh fails to compute A via gh! Hence we can compute A via χ̂fAh and gh never the less. 126 CHAPTER 3. RELATIVISED DETOUR DEGREES This lemma can also be interpreted as saying that f vF g reflects that an Ffunction φ̂ is always boosted at least as much by g(z) as by f (z). Thus, the particular value z = h(x) is no different in this respect. In particular, if f = g, then both fh vF gh and gh vF fh: Corollary (F) 56 f = g ⇒ fh = gh. q.e.d. Thus any detour h defines a map h : DF → DF , by h(a) = ah = fh for arbitrary f ∈ a. Furthermore, each h is monotonous as a map on DF in the sense that a ≤ b ⇒ h(a) ≤ h(b). Definition 41 Define, for h ∈ DF the operator Φh : DF → DF by: def def Φh (a) = ah = dgF (f h) (for some f ∈ a) . Contrasting this result, for right-composition the situation could be the opposite; even though in ‘most’ cases f ≡F f + 1 (see section 3.3.1), we may have h ◦ f 6≡F h ◦ (f + 1) for many F and h ∈ DF . This also means that attempting ‘composing degrees a, b by representatives’ is most probably a doomed venture. Investigating how one may define operators on DF seems to be a worthwhile and intrinsically interesting subject. Both as a means for understanding the general F-lattice, and also with the end of discovering more about particular lattices (viz. with a fixed F in mind). However, we shall not have time to pursue this issue much further, except for including one nice and straightforward result about so called jumps. 
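At the crude level of pointwise majorisation, the monotonicity of left-composition in Theorem 55 can be illustrated numerically. The sketch below (illustration only; the sample functions are mine) checks the pointwise analogue $f \le g \Rightarrow fh \le gh$, which is of course far weaker than the $\sqsubseteq_{\mathcal{F}}$-statement of the theorem, but shows the direction in which composition with an inner detour $h$ transports domination:

```python
# Pointwise analogue of Theorem 55's monotonicity of left-composition:
# if f(x) <= g(x) everywhere, then f(h(x)) <= g(h(x)) everywhere.

f = lambda x: x + 1
g = lambda x: 2 * x + 1          # g dominates f pointwise
h = lambda x: x * x + x          # a sample detour used as inner function

fh = lambda x: f(h(x))           # fh abbreviates f o h
gh = lambda x: g(h(x))
```

The check is immediate since domination is preserved under substituting the common value $h(x)$; the substance of Theorem 55 is that the corresponding statement holds for the far finer relation $\sqsubseteq_{\mathcal{F}}$.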
One can define a jump operator $\Phi : D_{\mathcal{F}} \to D_{\mathcal{F}}$ as any operator such that $\mathbf{a} < \Phi(\mathbf{a})$ for all $\mathbf{a} \in D_{\mathcal{F}}$. Often one denotes by $\mathbf{a}'$ the jump of $\mathbf{a}$. A slightly weaker definition asks only that $\mathbf{0} < \mathbf{0}' < \mathbf{0}'' < \cdots$. In DD it is proved that for the detour $j(x) = 2^x$, the operator $\Phi_j$ is a jump operator on $D_{\mathrm{pr}}$ in this weaker sense, and it follows from results (which we shall not include here) that $j$ is (in this weaker sense) in fact a jump operator on all the foundations mentioned in the next section. It is possible to infer directly from theorem 55 that:

Corollary 57 $\mathbf{a} < \mathbf{a}f \Leftrightarrow \exists n \in \mathbb{N}\, (\mathbf{a} < \mathbf{a}f^n)$. q.e.d.

To see this, it suffices to note that if $\mathbf{a}f = \mathbf{a}$, then we also have $\mathbf{a}f^2 = \mathbf{a}f$ etc., so that $\mathbf{a}f^n = \mathbf{a}$ for all $n$. In particular, a relatively slow-growing detour will either make a degree jump upwards at the first application, or it never will, after any number of iterated applications. This result is obvious from the degree perspective, but intriguing and perhaps unexpected when pulled back to the underlying structure of idc.'s. Unfortunately, as for the jump operators, we cannot afford to pursue this any further at present, except for illustrating the phenomenon with a few examples in the last section of this chapter.

3.2.7 Three open problems and some partial answers.

Our next lemma bestows us with some insight into the relative complexity between a function and its graph. It asserts that when a function $\phi(\vec{x})$ is $\mathcal{F}$-computable in $f(\vec{x})$, its graph – with characteristic function $\chi_{\Gamma_\phi}(\vec{x}, z)$ – is $\mathcal{F}$-computable in $f(\vec{x})$. It is paramount to note here that in general, to compute $\psi(\vec{x}, z)$ correctly via $\hat{\psi}(\vec{x}, z, f(\vec{x}, z))$, a priori the value $f(\vec{x}, z)$ is required. The lemma thus states that for the particular case where $\psi$ is the characteristic function for the graph of an $\mathcal{F}$-computable-in-$f$ function $\phi$, we may omit the '$z$' from $f$'s arguments:

Lemma (F) 58 If $\phi(\vec{x}) \in \mathcal{F}(f)$, then $\Gamma_\phi$ is $\mathcal{F}$-computable in $f(\vec{x})$.
Proof: Assume φ ∈ F(f), so that we have a φ̂^f ∈ F. Define ψ(x⃗, z, u) := χ_=(φ̂^f(x⃗, u), z) ∈ F. Then

    ψ(x⃗, z, f(x⃗)) = χ_=(φ̂^f(x⃗, f(x⃗)), z) = χ_=(φ(x⃗), z) = 0, if φ(x⃗) = z,
                                                            1, if φ(x⃗) ≠ z.

Thus ψ is the required χ̂_{Γφ}. q.e.d.

We remark that φ̂^f(x⃗, f(x⃗, y⃗)) = φ(x⃗) for any y⃗, since max(x⃗) ≤ max(x⃗, y⃗) in general. Hence the ψ defined in the last proof is χ̂_{Γφ}^f in the usual sense. Lemma 58 thus asserts that the graph of a function can be no more complex than the function itself: φ ∈ F(f) ⇒ Γ_φ ∈ F(f). We also know that φ ∈ F(f) implies φ ⪯ f:

Observation (F) 59   φ ∈ F(f) ⇒ φ ⪯ f ∧ Γ_φ ∈ F_f. q.e.d.

We would very much like to know whether the converse result holds; that is:

Open Problem (F) 1   Will φ ⪯ f ∧ Γ_φ ∈ F_f ⇒ φ ∈ F(f)?

The trouble with this direction is best illustrated by the following attempt at a naive F(f)-definition of φ:

    φ(x⃗) := µ_{v≤f(x⃗)}[Γ_φ(x⃗, v)].   (†)

This definition 'cheats' in the sense that it hides a double application of an f-bounded µ-search. If we wanted to define φ via χ̂_{Γφ}(x⃗, v, z), we are forced into constructing first

    ψ(x⃗, z) := µ_{v≤z}[χ̂_{Γφ}(x⃗, v, z)],

whence

    ψ(x⃗, f(x⃗)) = µ_{v≤f(x⃗)}[χ̂_{Γφ}(x⃗, v, f(x⃗))].

From the mere assumption Γ_φ(x⃗, v) ∈ F_f – viz. the predicate Γ_φ – all we know is that χ̂_{Γφ}(x⃗, v, f(x⃗, v)) will compute it correctly – and not necessarily χ̂_{Γφ}(x⃗, v, f(x⃗)). We depended upon φ̂^f ∈ F in order to ensure that χ̂_{Γφ}(x⃗, v, f(x⃗)) = χ_{Γφ}(x⃗, v) in the proof of lemma 58, but assuming this now would be a case of begging the question. Indeed, if φ(x⃗) = f(x⃗) for a particular x⃗ ∈ N^k, then f(x⃗, v) = f(x⃗, f(x⃗)) = ff(x⃗) may be needed by χ̂_{Γφ} lest it should fail.

Another way to say the same is that in (†) we compose two functions from F(f), and so the default bound we obtain from lemma 43 (p. 119) is φ ∈ F(ff). On the other hand, we will never obtain a false 'low' value.
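The construction in the proof of lemma 58 can be made concrete. The following Python sketch is purely illustrative – the example detour f(x) = 2x and the function φ(x) = x + 3 are assumptions, not from the text – and it uses the convention above that characteristic functions return 0 for 'true'.

```python
# Sketch of the proof of lemma 58 (all concrete choices are illustrative).
# Convention from the text: characteristic functions return 0 for "true".

def f(x):                 # an example detour: f(x) = 2x (an assumption)
    return 2 * x

def phi(x):               # an example function, F-computable in f
    return x + 3

def phi_hat(x, u):        # a witness phi^f: correct whenever u >= f(x)
    return phi(x) if u >= f(x) else 0   # garbage value below the detour

def chi_eq(a, b):         # chi_=(a, b): 0 iff a == b
    return 0 if a == b else 1

def psi(x, z, u):         # psi(x, z, u) = chi_=(phi_hat(x, u), z)
    return chi_eq(phi_hat(x, u), z)

# psi(x, z, f(x)) is then the characteristic function of the graph of phi:
assert all(psi(x, z, f(x)) == (0 if phi(x) == z else 1)
           for x in range(20) for z in range(50))
```

Note that ψ only ever consults the detour value f(x⃗), never f(x⃗, z), which is exactly the point of the lemma.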
The following lemma makes this precise:

Lemma (F) 60   Γ_φ ∈ F_f ⇒ φ̄ ∈ F(f).

Proof: We prove Γ_{φ̄} ∈ F_f:

    φ̄(x⃗, y) = v ⇔ min(φ(x⃗), y) = v
               ⇔ (φ(x⃗) ≤ y ∧ v = φ(x⃗)) ∨ (φ(x⃗) > y ∧ v = y)
               ⇔ ∃z≤y (φ(x⃗) = z ∧ v = z) ∨ (∀z≤y ¬(φ(x⃗) = z) ∧ v = y).   (‡)

By assumption the relation φ(x⃗) = z belongs to F_f, thus the last formula in (‡) above is clearly in F_f by proposition 47. q.e.d.

This lemma gives us a partial answer to the problem posed above:

Definition 42   For c ∈ N, define φ̌_c by φ̌_c(x⃗) := min(φ(x⃗), max(x⃗, c)). Define X̌ := { φ̌_c | φ ∈ X, c ∈ N }.

Note the difference between φ̄ and φ̌_c: φ̄ is (k+1)-ary, while φ̌_c is k-ary, and we ask the reader to note that when φ is a.b. – viz. ∃c_φ∈N (φ(x⃗) ≤ max(x⃗, c_φ)) – then φ̌_{c_φ} = φ.

Proposition (F) 61   Γ_φ ∈ F_f ⇒ ∀c∈N (φ̌_c ∈ F(f)).

Proof: Since φ̄ ∈ F(f) by lemma 60, max ∈ F(id), and because φ̌_c(x⃗) =_ae φ̄(x⃗, max(x⃗)), we are done by corollaries 44 & 45. q.e.d.

The difference between φ̌_c and φ̄ is a subtle one, but the former has several advantages over φ̄, illustrated by the next observation. The reason for using φ̄ in the most general cases is that φ̄-results do not depend on max, while φ̌-results usually require some appeal to max ∈ F(id).

Observation (F) 62   F̌(f) = { γ ∈ F(f) | γ is a.b. }.

Proof: Let φ ∈ { γ ∈ F(f) | γ is a.b. }. Then φ = φ̌_{c_φ}, and so φ ∈ F̌(f). Next, let ψ ∈ F̌(f), viz. ψ = φ̌_c for some φ ∈ F(f) and c ∈ N. In particular, ψ is a.b. trivially. That ψ ∈ F(f) follows from the fact that Γ_φ ∈ F_f, and so φ̌_c ∈ F(f) for all c ∈ N by proposition 61. q.e.d.

Thus, the best partial converse(s) we have been able to produce are:

Corollary (F) 63
1. φ ⪯ id ∧ Γ_φ ∈ F_f ⇒ φ ∈ F(f).
2. φ ⪯ f ∧ Γ_φ ∈ F_f ⇒ φ̌_c ∈ F(f). q.e.d.

This is not entirely satisfying, since it does not give the full answer (which could be negative; see below). We see that there is some sort of trade-off going on. Either – if φ is id-bounded (a.b.)
with F_f-graph, then we can ease φ into F(f) – but – if φ is f-bounded with F_f-graph, then we must settle for some φ̌_c. Nevertheless, corollary 63 and φ̌_c will be of some use in the continuation.

The general problem may very well have a negative solution though. For, assume 0 < f < ff, and let A ∈ F_{ff} \ F_f. Define

    φ(x⃗) := f(x⃗),       if x⃗ ∈ A,
            f(x⃗) ∸ 1,   if x⃗ ∉ A.

We immediately see that φ ∈ F(ff) and that φ(x⃗) ⪯ f(x⃗). But, is Γ_φ ∈ F_f? We have that

    φ(x⃗) = v ⇔ (Γ_f(x⃗, v) ∧ χ̂_A^{ff}(x⃗, f(x⃗, v)) = 0) ∨ (Γ_f(x⃗, v + 1) ∧ χ̂_A^{ff}(x⃗, f(x⃗, v + 1)) = 1).

When v = f(x⃗), we have f(x⃗, v) = ff(x⃗), and so it does not seem impossible that the graph of φ could be computed in f. However, the detour f(x + 1) = f ∘ (id + 1) – necessary to guarantee the successful computation of φ(x) = v for x ∉ A – could be much larger than f in the ⪯-order. There is a potentially ⊑_F-essential difference between (f ∘ f) ∸ 1 and f ∘ (f ∸ 1), suggesting that the graph of φ might not be F-computable in f.

Inter-hierarchical questions.

We have arrived at a point where it seems natural to start to talk about what can be inferred about how different foundations F and G – and the degree-structures erected upon them – relate to each other. In particular, we will consider the case where F_0 ⊆ G_0. We first observe that∗

Proposition (F) 64   F(id) ⊆ G(id) ⇔ F_0 ⊆ G_0.

Proof: The ⇒-inclusion is trivial. Thus assume F_0 ⊆ G_0, and let φ ∈ F. Since φ is a.b. and since Γ_F ⊆ F_0 ⊆ G_0, we are done. q.e.d.

Two natural questions are:

Open Problem (F) 2   Let f ∈ D_F ∩ D_G. Will F(f) = G(f) ⇔ F_f = G_f?

Open Problem (F) 3   Assume f ⊑_{F,G} g. Will F_f = G_f ⇒ F_g = G_g?

We will prove the following partial answer to problem 2 (note the F̌(f) to the left):

∗ Note that in an (F)-marked result, both the F and the G mentioned in the hypothesis are assumed to be foundations.

Proposition (F) 65   Let f ∈ D_F ∩ D_G. Then
    F̌(f) = Ǧ(f) ⇔ F_f = G_f.

Our best partial answer to problem 3 will be:

Proposition (F) 66   Let f, h ∈ D_F ∩ D_G. Then F_f = G_f ⇒ F_{fh} = G_{fh}.

We just remark that the ⇒-direction of proposition 65 is immediate by observation 62 and the fact that characteristic functions can be assumed 0–1-valued. Before we give their proofs we need a lemma asserting that, in a sense, F(f) is to F(fh) what F is to F(h):

Lemma 67   Let ψ be a (unary) function such that Γ_ψ ∈ F_0 and fψ ∈ D_F, and let φ ∈ F(fψ). Then for some φ̃^ψ ∈ F(f) we have z ≥ ψ(x⃗) ⇒ φ̃^ψ(x⃗, z) = φ(x⃗). Furthermore, φ̃^ψ(x⃗, z) can be chosen to satisfy φ̃^ψ(x⃗, z) ≤ φ(x⃗).

Proof: Fix φ̂^{fψ} ∈ F witnessing φ ∈ F(fψ), and recall that f ∈ F(f). Hence, by lemma 43 we have

    φ̃^ψ(x⃗, z) := φ̂^{fψ}(x⃗, f(z)) ∈ F(f).

But then, by lemma 42,

    z ≥ ψ(x⃗) ⇒ φ̃^ψ(x⃗, z) := φ̂^{fψ}(x⃗, f(z)) = φ̂^{fψ}(x⃗, f(ψ(x⃗))) = φ(x⃗).

That we can assume φ̃^ψ(x⃗, z) ≤ φ(x⃗) follows by considering e.g.:

    φ̃′(x⃗, z) := φ̃^ψ(x⃗, z),   if ∃v≤z Γ_ψ(x⃗, v),
               0,            otherwise.   (†)

This last function still computes φ for sufficiently large z, and it belongs in F(f) by lemma 43, since it is defined by composition of the F(f)-function φ̃^ψ into basic F(id)-functions like C and Γ_ψ. That φ̃′ ≤ φ now follows from (†), since φ̃′(x⃗, z) ∈ {φ(x⃗), 0} by definition. q.e.d.

Proof: (of propositions 65 & 66) We first prove the ⇐-direction of proposition 65. Thus assume F_f = G_f and let φ ∈ G(f) be a.b. Then Γ_φ ∈ G_f = F_f, whence φ ∈ F(f) by corollary 63. Since the roles of F and G can be reversed, this proves 65.

Assume F_f = G_f, and let A ∈ G_{fh}. We must prove that A ∈ F_{fh}. Set φ = χ_A, and note that φ̌_1 = φ. We now have φ ∈ G(fh), and thus – where φ̃^h is defined as in lemma 67 – we can find ξ := φ̃^h ∈ G(f) satisfying

    ξ(x⃗, h(x⃗)) = φ(x⃗).

Furthermore ξ̌_1 = ξ, since ξ ≤ φ ≤ 1, and hence ξ is a.b. Moreover

    ξ̌_1 ∈ Ǧ(f) = F̌(f)   (by proposition 65).

This means that indeed ξ ∈ F(f).
Since h ∈ F(h), we have that ξ(x⃗, h(x⃗)) ∈ F(fh) by lemma 43. As noted,

    ξ(x⃗, h(x⃗)) = φ(x⃗) = χ_A(x⃗),

meaning that A ∈ F_{fh}. Since the roles of F and G can be reversed, this proves 66. q.e.d.

It is tempting to try to prove proposition 66 with a ψ such that fψ ∈ D_F in place of the h, but this would be a far stronger result – even stronger than the open question it partially answers. Consider the three variants here. Let f, h, fψ, g ∈ D_F ∩ D_G and assume f ⊑_{F,G} g.

(A) F_f = G_f ⇒ F_{fh} = G_{fh};   (proposition 65 & 66)
(B) F_f = G_f ⇒ F_g = G_g;   (?)
(C) F_f = G_f ⇒ F_{fψ} = G_{fψ}.   (?)

So (A) is proposition 66, (B) is problem 3, and (C) is the hinted-at variant. Both (A) and (B) are assertions about what can be inferred about the two degree-structures (D_F, ⊑_F) and (D_G, ⊑_G) above f (given the hypothetical information F_f = G_f). The full version (B) would be even better than 'just' (A), but based on (A) alone we can still conclude that:

Corollary (F) 68   ∃f∈D_F∩D_G (F_f ≠ G_f) ⇒ F_0 ≠ G_0.

Proof: F_{0f} = F_f, G_{0f} = G_f, and F_0 = G_0 ⇒ F_{0f} = G_{0f}. q.e.d.

In other words, any structural difference between (D_F, ⊑_F) and (D_G, ⊑_G), however far up in the order, implies that the foundations differ in computational power. The assertion (C), on the other hand, would allow us to conclude downwards in the order by choosing ψ suitably. More precisely, given f ∈ D_F, one could define a 'near-inverse' f⁻¹ such that f ∘ f⁻¹ would be a detour but such that f ∘ f⁻¹ ≺ f – and possibly also dg_F(f⁻¹ ∘ f) ⊏_F f – in turn spawning results of the type

    ∃f∈D_F∩D_G (F_f ≠ G_f ⇒ F_{fh} ≠ G_{fh}),   (‡)

by choosing ψ to be a near-inverse to h. But (‡) is a very dubious sentence, as it would yield '⇔' in corollary 68. For, if F = [∸ ; bmin] and G = [ ; pr], it is conceivable – and consistent with what is known at the time of writing – that∗

    G_0 = E⁰_? ⊋ ∆^N_0 = F_0,

while for some large detour, say f = E_4(x, x), they coincide in the sense that F_f = G_f.
∗ That F_0 = ∆^N_0 is a main result of [B08a], and that G_0 = E⁰_? holds is proved in [K&V08] or [Bel79].

3.3 Specific F-degrees.

It is now time for the promised application of the framework developed in section 3.2 to some of the idc.'s we have acquainted ourselves with in chapter 2. The first thing we will do is to identify the foundations and point out their inter-relationships. We will also point out where many of the canonical or well-known sub-recursive classes are situated in the resulting hierarchies (or lattices). The following proposition has been proved in [B09a] for items 1. and 2., and by Kristiansen in [Kri05] for item 3.:

Observation 69   The following idc.'s are foundations (for arbitrary ℓ ∈ N): 1. [∸ ; bmin^ℓ], 2. [∸ ; bcount^ℓ] and 3. [ ; pr^ℓ]. q.e.d.

Hence also [P ; it^ℓ] is a foundation, where the particular case ℓ = 1 is highly non-trivial and due to Esbelin [Esb94]. We next note that an arbitrary union of foundations is itself trivially a foundation. Denote by pr^{<ω} := spr the union ⋃_{ℓ∈N} pr^ℓ. For this section we will be concerned with the idc.'s

    F^bmin := F^µ := [∸ ; bmin]
    F^bcount := F^♯ := [∸ ; bcount¹]
    F^it := I⁻ := [P ; it¹]
    F^pr := E⁻ := [ ; pr¹]
    F^spr := S⁻ := [ ; spr]

Convention 8   In the continuation we use op as a meta-variable, ranging over op ∈ {bmin, bcount¹, it¹, pr¹, pr^{<ω}}. We identify bmin with µ, and bcount¹ with ♯, and omit the superscript '1' from bcount¹, it¹ and pr¹. We consider the schemata as ordered by µ ≤ ♯ ≤ it ≤ pr ≤ spr, motivated by the inclusions∗:

    F^µ ⊆ F^♯ ⊆ I⁻ ⊆ E⁻ ⊆ S⁻.

We will abbreviate D_{F^op} as D^op, and we ask the reader to kindly suppress from memory the former use of D^µ for the idc. [∸ ; µ], and to accept the F^µ-detours as D^µ's new referent.

Theorem 70 (Barra, Esbelin, Kristiansen, Voda)   (i) For op as above, each F^op is a foundation, and (ii) – with the exception of op = µ – each F^op is closed under bcount.

∗ For pr and spr we actually have pr ⊆ spr.
∗∗ Here F^{op,max} is the idc. resulting from augmenting the set of initial functions of F^op with max.
(iii) For each F^op, we have either∗∗ max ∈ F^op or Γ_{F^{op,max}} ⊆ F^op_0. Furthermore, (iv) F^{op₁}(id) = F^{op₂}(id) ⇔ F^{op₁}_0 = F^{op₂}_0.

Proof: That ∆^N_0 ⊆ F^op_0, and that F^op is closed under µ, has been proved in [B09a] for µ and ♯, in [Esb94] for it, and in e.g. [K&B05] for pr and spr. That each op preserves argument-boundedness has been established in the previous sections. Closure under ♯ is trivial when op = ♯, was established in [Esb94] for op = it, and proved in e.g. [K&V08] for pr ⊆ spr. We see from

    φ(x, y, z) := I²₁(x, y),                  if z = 0,
                 I³₂(x, y, φ(x, y, z − 1)),   if z > 0,

and max(x, y) = φ(x, y, χ≤(y, x)), that max ∈ [χ≤ ; it], whence max ∈ F^op for op ≥ it. For φ ∈ F^op and op ≤ ♯ we proceed by induction on φ ∈ F^{op,max}. The induction start is trivial.

Induction step – case φ = ψ ∘ γ⃗: By the ih we have that Γ_ψ, Γ_{γ_i} ∈ F^op_0. Clearly

    φ(x⃗) = v ⇔ ∃y⃗≤max(x⃗) ( ⋀_{i≤ℓ} Γ_{γ_i}(x⃗, y_i) ∧ Γ_ψ(y⃗, v) ),

and the right-hand formula belongs in F^op by proposition 47.

Induction step – case φ = µ_{z≤y}[g₁, g₂] or φ = ♯_{z<y}[g₁, g₂]: That

    φ(x⃗, y) = v ⇔ ∃z≤y ( ∃w≤max(x⃗) (Γ_{g₁}(x⃗, z, w) ∧ Γ_{g₂}(x⃗, z, w))
                          ∧ ∀u<z ∀w≤max(x⃗) ¬(Γ_{g₁}(x⃗, u, w) ∧ Γ_{g₂}(x⃗, u, w)) ∧ v = z )
                  ∨ ( ∀z≤y ∀w≤max(x⃗) ¬(Γ_{g₁}(x⃗, z, w) ∧ Γ_{g₂}(x⃗, z, w)) ∧ v = y )

takes care of the induction step for the µ-case, while for the ♯-case we have

    φ(x⃗, y) = v ⇔ ∃^{=v}_{z<y} ∃w≤max(x⃗) (Γ_{g₁}(x⃗, z, w) ∧ Γ_{g₂}(x⃗, z, w)).

This proves (iii) by the results from [B09a]. Item (iv) follows from proposition 64. q.e.d.

Our immediate first aim is to establish the missing link in the chain of inclusions

    ∆^N_0 =₁ F^µ_0 ⊆₂ F^♯_0 =₃ ∆^♯ ⊆₄ I⁻_0 =₅ I⁰_? =₆ I¹_? ⊆₇ E⁻_0 =₈ E⁰_? ⊆₉ E¹_? ⊆₁₀ S⁻_0 =₁₁ E²_? =₁₂ linspace.   (CoI)

Above, 1., 2. and 3. were proved in [B09a], 4. is due to Esbelin [Esb94], 6. is due to Kutylowski [Kut87], and 7. is trivial given 6. 8.
was proved by Kristiansen and Voda in [K&V08b], but was, unbeknownst to this author, already known to at least Bel'tyukov, Kutylowski and Loryś in 1979. In section 3.3.1 this equality and the one numbered 5 are given their own section. 9. and 10. are trivial, 11. is due to Kristiansen [Kri05], while 12. was proved by Ritchie in [Ric69]. Recall that whether ∆^N_0 = E²_? is open, so that the status of each inclusion with regard to strictness is unknown.

The missing link is thus '=₅', which we will prove by slight modifications to Kristiansen & Voda's proof of the equality '=₈'. The proof is from the unpublished paper Constant Detour do not matter and so P⁻_? = E⁰_? [K&V08b], and can be stated as:

theorem (Bel'tyukov, Kristiansen & Voda)

    ∀f∈D^{pr¹} ( dg^{pr¹}(f) ≡_{pr¹} dg^{pr¹}(f + 1) ). q.e.d.

from which it follows by induction on k – using theorem 55,

    f + k = (id + (k − 1)) ∘ (f + 1) ≡_{pr¹} (id + k) ∘ (f + 1) = f + (k + 1)

– that dg^{pr¹}(f) ≡_{pr¹} dg^{pr¹}(f + k) for all k ∈ N.

We also remark here that '=₁₁' follows directly from =₈ and the result by Warkentin (see [Ros84]) that E²_? = ⋃_{ℓ<ω} E^{0(ℓ)}_?, where the ℓ-fold recursion is limited (in the Grzegorczyk sense). Incidentally, (CoI) also means that F^µ is a minimal foundation, since part of the definition is that ∆^N_0 ⊆ F_? for any foundation. Considering the properties of F^µ, it is hard to imagine any natural foundation properly contained in F^µ.

For the time being, and independently of whether 7. is an equality (it is trivially an inclusion), let us motivate why we feel (CoI) should be considered justification for investing time in this line of research. First of all, it follows from the definition of D^op and (CoI) that∗

    D^µ ⊆ D^♯ ⊆ D^it ⊆ D^pr ⊆ D^spr,

with equality in any of the links iff there is equality in the corresponding link in (CoI). In particular, the set D^µ, which contains most of the 'natural detours', can be viewed as an arsenal with which to investigate all the lattice-hierarchies.
Secondly, since majorisation subsumes ⊑^op (for all op), any ⪯-linearly∗∗ ordered subset A of D^µ induces hierarchies

    ⋃_{f∈A} F^op(f)   and   ⋃_{f∈A} F^op_f   (for all op).

Recalling next that, by proposition 66,

    F_f = G_f ⇒ F_{fh} = G_{fh},

the above means that if f ∈ D^µ is arbitrary and

    F^{op₁}(f) ⊊ F^{op₂}(f)   or, equivalently,   F^{op₁}_f ⊊ F^{op₂}_f,

then ∆^N_0 ⊊ E²_?, where the proper containment may be specified further by the particular values of op₁ and op₂. One could of course also search for e.g. an f ∈ D^pr witnessing F^pr(f) ⊊ F^spr(f).

∗ We ask the reader to bear with the inconsistent notation here: in this chapter D^µ is not the idc. from [B09a], but the set of detours D_{F^bmin}.
∗∗ The isolation of well-ordered subsets of D^µ will be a by-product of the results developed in chapter 4, and this subject is an interesting and rewarding field of research in its own right.

This quite naturally provides a uniform notational system for, and a unifying perspective on, numerous classical results from sub-recursion theory. The theory of foundations is a novel one, and there are inevitably many more questions and problems than there are answers and theorems. Some questions regard the structure of the degrees: where theorem 55 states that ah is well defined, one can ask for similar results – negative or positive – for whether or not e.g. a + f can be defined. The question of jump operators is all but ignored here, even though some preliminary results (not really interesting on their own) have been established. Kristiansen and Voda mention them in [K&V08].

Before we end this chapter by illustrating the unifying quality of the detour-degree notation for the field of subrecursion theory, and outlining paths for further research, we include a section dedicated to the equalities numbered 5 and 8 in (CoI) on page 134.

3.3.1 On id vs. id + 1 when op = it.

An important remark on the theorem E⁰_? = E⁻_?.
Prior to proving the equality 5. from the chain of inclusions found on p. 134, we must remark on the theorem that E⁰_? = E⁻_?. This author learnt of this equality in the paper The structure of the detour degrees [K&V08] – in which Kristiansen and Voda invoke this theorem in their development of detour degrees w.r.t. the class E⁻. The reference to Bel'tyukov's proof of the equality appeared in a paper by Kutylowski and Loryś [K&L87], which this author browsed early autumn 2009 while searching MathSciNet for small-idc. results before finishing the work on this dissertation∗. The paper is much less known than Kutylowski's [Kut87] – which is a canonical read for anyone within this field. (The equality E⁰_? = E⁻_? is invoked in [K&L87] in order to obtain new results on yet another variation of limited primitive recursion.) The paper by Bel'tyukov is, to the best of my knowledge, only available in Russian, which, if true, might help to explain why it is not more widely known. The attribution of this result to the trio Bel'tyukov–Kristiansen–Voda as independent discoverers seems appropriate to this author. Moreover, in spite of the fact that examining the details of Bel'tyukov's 1979 proof has not yet been possible – for lack of time and resources for a proper translation – the method of proof seems to be essentially different from that employed by Kristiansen and Voda. The latter proof is direct, transparent, and, as we shall see below, suitable for adaptation to the schema of iteration.

Proof of the theorem I⁻_? = I⁰_?.

Definition 43   Define, for u, v, f ∈ N,

    ⟨u, v⟩ ≤_f ⟨1, 0⟩ ⇔ (u = 1 ∧ v = 0) ∨ (u = 0 ∧ v < f).

Let φ₁, φ₂ be defined by 2-fold iteration from ψ⃗, γ⃗. We say that the 2-fold iteration is base-max+1-limited if:
    ∀x⃗,y∈N^{k+1}   φ⃗(x⃗, y) ≤_{max(x⃗,y)+1} ⟨1, 0⟩.

Remark 4   For the reader familiar with Kutylowski's classes I^{0,c} and E^{0,c} from [Kut87], who finds the schema above reminiscent of his – hold on to that intuition – but also make a brief pause here to understand their differences. The schema employed by Kutylowski is somewhat more liberal than base-max+1-limited 2-fold iteration/recursion. Whereas Kutylowski's schema is designed to allow an extra bit of information to be propagated through the unfolding of iterations/recursions – and thus in effect allows for the encoding of arguments in the range 0–2·(max(x⃗) + c) by arguments in the range 0–max(x⃗) + c – the schema above is designed to accommodate encoding of arguments in the range 0–max(x⃗) + 1 by arguments in the range 0–max(x⃗) only.

∗ In fact, the existence of this paper caused minor delays in preparing the final version of the thesis, as, amongst other adjustments, this remark needed to be incorporated into it.

Still, any intuition about Kutylowski's method should be valuable for understanding the proof(s) of this section's main theorem. The solution is definitively in Kutylowski's tradition, as it consists of the same components: in both cases one has classes F ⊆ F′′ and F′, where the only difference is that the main operator op′ of F′ is seemingly more general than F's main operator op, viz. op ⊆ op′. One next proves that indeed op′ ⊆ op, yielding F = F′. Next, iterated use of the proof of op′ ⊆ op ensures that F′′ ⊆ F, yielding the desired F = F′′.

Definition 44   Define E, D₁ and D₂ by:

    E(c, e, u, v) := c,   if ⟨u, v⟩ = ⟨1, 0⟩,
                    e,   if ⟨u, v⟩ = ⟨0, c⟩,
                    v,   otherwise,

and

    D⃗(c, v) := ⟨1, 0⟩,   if v = c,
               ⟨0, v⟩,   otherwise.

Next, given functions ψ₁, ψ₂, γ₁, γ₂, define functions G_γ⃗(c, e, x⃗) := E(c, e, γ⃗(x⃗)) and

    H_ψ⃗(c, e, x⃗, z) := e,                                           if z = e,
                       E(c, e, ψ₁(x⃗, D⃗(c, z)), ψ₂(x⃗, D⃗(c, z))),   otherwise.

Define F_{ψ⃗,γ⃗} := H_ψ⃗ I G_γ⃗.
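The coding of definition 44 can be checked mechanically. The Python sketch below is illustrative only – the bound f = 10 and the sample values are assumptions – and it verifies the round-trip property stated as claim 71 below, together with the behaviour of the exceptional pair ⟨0, c⟩.

```python
# Mechanical check of the coding in definition 44 (pairs as Python tuples).
# <u, v> <=_f <1, 0> means (u, v) == (1, 0), or u == 0 and v < f.

def E(c, e, u, v):
    if (u, v) == (1, 0):
        return c
    if (u, v) == (0, c):
        return e
    return v

def D(c, v):
    return (1, 0) if v == c else (0, v)

# Claim 71 below: for admissible pairs with v != c, D inverts E.
f = 10                                   # an illustrative bound
pairs = [(1, 0)] + [(0, w) for w in range(f)]
for c in range(f):
    for e in range(f):
        for (u, v) in pairs:
            if v != c:
                assert D(c, E(c, e, u, v)) == (u, v)

# The exceptional pair <0, c> is mapped to the error value e, and
# (when c != e) decodes back as <0, e>, cf. (+) in the text:
assert E(3, 7, 0, 3) == 7 and D(3, 7) == (0, 7)
```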
Note that ψ⃗, γ⃗ ∈ I⁻ ⇒ G_γ⃗, H_ψ⃗ ∈ I⁻; consequently F_{ψ⃗,γ⃗} := H_ψ⃗ I G_γ⃗ ∈ I⁻. We will need:

Claim 71   (⟨u, v⟩ ≤_{max(u,v)+1} ⟨1, 0⟩) ∧ (v ≠ c) ⇒ D⃗(c, E(c, e, u, v)) = ⟨u, v⟩.

Proof: Clearly:

    D⃗(c, E(c, e, u, v)) = D⃗(c, c) = ⟨1, 0⟩,   if u = 1, v = 0,
                          D⃗(c, e),            if u = 0, v = c,
                          D⃗(c, v) = ⟨0, v⟩,   if u = 0, v ≠ c (otherwise).

The above also implies that:

    c ≠ e ⇒ D⃗(c, e) = ⟨0, e⟩.   (†)

q.e.d.

Claim 72   Assume φ⃗ := ψ⃗ I γ⃗ is a case of base-max+1-limited iteration, and set

    R(c, e, x⃗, y₀) :⇔ e ≠ c ∧ ∀y≤y₀ (φ⃗(x⃗, y) ≠ ⟨0, c⟩).

Then (where we write F for F_{ψ⃗,γ⃗}):

1. R(c, e, x⃗, y₀) ∧ y ≤ y₀ ⇒ F(c, e, x⃗, y) ≠ e;
2. R(c, e, x⃗, y₀) ∧ y ≤ y₀ ∧ F(c, e, x⃗, y) ≠ e ⇒ φ⃗(x⃗, y) = D⃗(c, F(c, e, x⃗, y)).

So, under the hypothesis of 1. above, the function F encodes the pair φ⃗(x⃗, y) w.r.t. D⃗.

Proof: Proof of 1.–2. by simultaneous induction on y (for all y₀ and x⃗).

Induction start (y = 0 ≤ y₀): Assume R(c, e, x⃗, y₀) ∧ y ≤ y₀. Then, since γ⃗(x⃗) = φ⃗(x⃗, 0) ≠ ⟨0, c⟩,

    F(c, e, x⃗, 0) := G(c, e, x⃗) := E(c, e, γ⃗(x⃗)) ≠ e.

Hence, by claim 71 and e ≠ c, we also see that 2. holds for the induction start.

Induction step (y + 1 ≤ y₀): Consider:

    F(c, e, x⃗, y + 1) := H(c, e, x⃗, F(c, e, x⃗, y)) = E(c, e, ψ⃗(x⃗, D⃗(c, F(c, e, x⃗, y)))).

Assuming R(c, e, x⃗, y₀) ∧ y ≤ y₀, we have by the ih(1.) that F(c, e, x⃗, y) ≠ e, and therefore also:

    E(c, e, ψ⃗(x⃗, D⃗(c, F(c, e, x⃗, y)))) =_{ih(2.)} E(c, e, ψ⃗(x⃗, φ⃗(x⃗, y))) := E(c, e, φ⃗(x⃗, y + 1)).

Now, if φ⃗(x⃗, y + 1) = ⟨0, e⟩, then E(c, e, φ⃗(x⃗, y + 1)) = e, and 'all bets are off'. If, on the other hand, φ⃗(x⃗, y + 1) = ⟨u, v⟩ ≠ ⟨0, e⟩, then – by hypothesis – ⟨u, v⟩ is neither ⟨0, e⟩ nor ⟨0, c⟩, and thus claim 71 and the definitions of E and D⃗ conclude the proof. q.e.d.

Claim 73   Let G, H and F be defined as above w.r.t. ψ⃗ and γ⃗, for which φ⃗ := ψ⃗ I γ⃗ is a base-max+1-limited iteration.
Then:

    c ≠ e ⇒ ( ∃z≤y (φ⃗(x⃗, z) = ⟨0, c⟩) ⇒ F(c, e, x⃗, y) = e ).

Proof: First, observe that if F(c, e, x⃗, z₀) = e, then z ≥ z₀ ⇒ F(c, e, x⃗, z) = e: this is true by hypothesis for z = z₀, and

    F(c, e, x⃗, z + 1) := H(c, e, x⃗, F(c, e, x⃗, z)) = H(c, e, x⃗, e) := e.

Hence, if ∃z≤y (φ⃗(x⃗, z) = ⟨0, c⟩), let z₀ be the least such z. If z₀ = 0, then, as F(c, e, x⃗, 0) := E(c, e, 0, c) := e, indeed y ↦ F(c, e, x⃗, y) is identically e. If z₀ > 0, set w := F(c, e, x⃗, z₀ − 1) and consider

    F(c, e, x⃗, z₀) := H(c, e, x⃗, F(c, e, x⃗, z₀ − 1)) =† E(c, e, ψ₁(x⃗, D⃗(c, w)), ψ₂(x⃗, D⃗(c, w))),

where =† simply invokes the definition of H. By the minimality of z₀ and the assumption that c ≠ e, we now have – for R as in claim 72 – that R(c, e, x⃗, z₀) holds, whence by the same claim D⃗(c, w) = φ⃗(x⃗, z₀ − 1), which in turn implies that:

    ⟨ψ₁(x⃗, D⃗(c, w)), ψ₂(x⃗, D⃗(c, w))⟩ = ⟨0, c⟩.

Hence F(c, e, x⃗, z₀) = e, which concludes this proof by the initial observation. q.e.d.

It is the contrapositive implication we are after:

    c ≠ e ⇒ ( F(c, e, x⃗, y) ≠ e ⇒ ∀z≤y (φ⃗(x⃗, z) ≠ ⟨0, c⟩) ).

We can finally prove one of our main lemmata:

Lemma 74   I⁻ is closed under base-max+1-limited 2-fold iteration.

Proof:∗ We note first that if φ⃗ is definable by base-max+1-limited iteration, then

    { ⟨φ₁(x⃗, z), φ₂(x⃗, z)⟩ | z ≤ y } ⊆ { ⟨0, 0⟩, ⟨0, 1⟩, …, ⟨0, max(x⃗, y)⟩, ⟨1, 0⟩ },   (†)

so that the first set has at most max(x⃗) + 1 members.

The idea behind the proof is to search for values c, e in such a way that c may represent the pair ⟨1, 0⟩ in a 1-fold iteration simulating the base-max+1-limited iteration, and where e will serve as an error-value. It is claim 73 which enables us to do this without circularity: we can find a suitable c without knowing which error-value to use in the final simulation of φ⃗. If now y₀ + 3 ≤ max(x⃗, y), and φ⃗ are defined as above, the sequence of pairs φ⃗(x⃗, 0), …
, φ⃗(x⃗, y₀) trivially consists of at most y₀ + 1 different pairs, while there are at least y₀ + 4 many elements in {0, …, y₀ + 3} ⊆ {0, …, y}. In particular, the existence of c ≠ e satisfying ⟨0, c⟩, ⟨0, e⟩ ∉ { φ⃗(x⃗, 0), …, φ⃗(x⃗, y₀) } is guaranteed: because there are three missing values – even if ⟨1, 0⟩ is one of them – there are two left which have to be of the form ⟨0, u⟩.

Let F be as in the previous claims and define:

    ξ(x⃗, y, e) := µ_{c≤max(x⃗,y)}[ F(c, e, x⃗, y ∸ 3) ≠ e ∧ c ≠ e ].

Then, if y ≥ 3, because some pair ⟨0, c⟩ is avoided by φ⃗(x⃗, y ∸ 3), and by combining claims 72 & 73, we have:

    e ≠ c ⇒ ( F(c, e, x⃗, y) ≠ e ⇔ ∀z≤y (φ⃗(x⃗, z) ≠ ⟨0, c⟩) ),

so that c(x⃗, y) := ξ(x⃗, y ∸ 3, 1) is one such pair. By the same argument,

    e(x⃗, y) := ξ(x⃗, y, c(x⃗, y))

is a good error-value.

∗ The proofs of this lemma, the claims 71, 72, 73, and the lemma 76 below are adapted from Kristiansen and Voda's proof of closure of E⁻ under the corresponding version of base-max+1-limited recursion. Though it was very compact and contained minor errors, their proof contained the entire idea.

Now, for i = 1, 2, the functions G_i(x⃗, y) := D_i(c, e, F(c, e, x⃗, y)) enable us to conclude the proof by observing that:

    φ⃗(x⃗, y) = γ⃗(x⃗),                               if y = 0,
              ψ⃗(x⃗, γ⃗(x⃗)),                         if y = 1,
              ψ⃗(x⃗, ψ⃗(x⃗, γ⃗(x⃗))),                  if y = 2,
              ψ⃗(x⃗, ψ⃗(x⃗, ψ⃗(x⃗, G⃗(x⃗, y ∸ 3)))),    if y ≥ 3.

q.e.d.

Definition 45   Define functions M₁, M₂ by:

    M⃗(f, v) := ⟨0, v⟩,   if v < f,
               ⟨1, 0⟩,   if f ≤ v.

We note that M⃗ ∈ I⁻, since they are defined by d.b.c. over a ∆^N_0-predicate. Obviously M⃗(f, f) = ⟨1, 0⟩ for all f, and we recall the identity '10_b = b' – viz. '10 is f + 1 in base f + 1'. For the rest of this section we adopt the notation from [K&V08b], where M⃗(f, x⃗) abbreviates the 2k-long sequence

    M₁(f, x₁), M₂(f, x₁), …, M₁(f, x_k), M₂(f, x_k).

This convention applies to the M_i's only.
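The machinery of claims 72 and 73 can be exercised on a toy example. The sketch below is illustrative only – the countdown iteration and the values c = 5, e = 6 are assumptions, chosen so that ⟨0, c⟩ and ⟨0, e⟩ are avoided – and it simulates a base-max+1-limited 2-fold iteration by the 1-fold iterate F of definition 44, recovering the pair sequence through D⃗.

```python
# Toy run of the simulation behind lemma 74 (all concrete choices illustrative).
# 2-fold iteration: gamma(x) = (0, x); psi counts down and parks at (1, 0).

def gamma(x):
    return (0, x)

def psi(x, u, v):
    if (u, v) == (1, 0) or v == 0:
        return (1, 0)
    return (0, v - 1)

def phi(x, y):                       # phi = psi I gamma (2-fold iteration)
    p = gamma(x)
    for _ in range(y):
        p = psi(x, *p)
    return p

def E(c, e, u, v):                   # definition 44
    if (u, v) == (1, 0):
        return c
    if (u, v) == (0, c):
        return e
    return v

def D(c, v):
    return (1, 0) if v == c else (0, v)

def F(c, e, x, y):                   # F = H I G, a single 1-fold iteration
    z = E(c, e, *gamma(x))           # G(c, e, x)
    for _ in range(y):
        z = e if z == e else E(c, e, *psi(x, *D(c, z)))   # H(c, e, x, z)
    return z

# With x = 3 the pair sequence is (0,3), (0,2), (0,1), (0,0), (1,0), (1,0), ...
# The pairs (0,5) and (0,6) never occur, so c = 5, e = 6 are good values,
# and F encodes phi w.r.t. D exactly as claim 72 predicts:
c, e = 5, 6
assert all(D(c, F(c, e, 3, y)) == phi(3, y) for y in range(10))
```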
This next observation will be important:

Observation 75   M⃗(f, x) ≤_{(f∸1)+1} ⟨1, 0⟩. q.e.d.

Lemma 76   ∀φ∈I⁻ ∃φ⃗∈I⁻ ( max(x⃗) ≤ f > 0 ⇒ M⃗(f, φ(x⃗)) = φ⃗(f ∸ 1, M⃗(f, x⃗)) ).

Proof: By induction on φ. For a function φ of arity k, by construction the φ_i's must have arity 2k + 1. We denote the extra arguments by z⃗, so that if x⃗ are the arguments of φ, then the arguments to the φ_i's are f, w⃗ ≡ f, z₁, x₁, …, z_k, x_k.

The induction start is obvious for constants and projections. If φ = P, then φ(x) < f by assumption, so that M⃗(f, P(x)) = ⟨0, P(x)⟩, and the result is immediate.

Induction step (case φ = ψ ∘ γ⃗): We then have ψ⃗ and the γ⃗^j's by the ih, for which it is routine to verify that

    φ_i(f, w⃗) := ψ_i(f, γ⃗¹(f, w⃗), …, γ⃗^ℓ(f, w⃗))

solves the problem.

Induction step (case φ = ψ I γ): Fix first ψ⃗ and γ⃗, and define:

    Φ⃗′(f, u, v) := ⟨u, v⟩,   if (u = 1 ∧ v = 0) ∨ (u = 0 ∧ v ≤ f),
                  ⟨0, 0⟩,   otherwise.

We emphasize here the fact that:

    ⟨u, v⟩ ≤_{(f∸1)+1} ⟨1, 0⟩ ⇒ Φ⃗′(f, u, v) = ⟨u, v⟩.   (z)

Next,

    Φ⃗(f, w⃗, y) := Φ⃗′(f, γ⃗(f, w⃗)),                    if y = 0,
                  Φ⃗′(f, ψ⃗(f, w⃗, Φ⃗(f, w⃗, y − 1))),    if y > 0.

That the Φ⃗ are defined by base-max+1-limited iteration follows from (z), and that all I⁻-functions are a.b. ensures that during sub-computations of the iteration, all values are bounded by∗ max(w⃗, f, y, 1). Hence Φ⃗ ∈ I⁻ by lemma 74. We next define, for i = 1, 2,

    φ⃗(f, w⃗, z, y) := Φ⃗(f, w⃗, y),             if z = 0,
                     ψ⃗(f, w⃗, Φ⃗(f, w⃗, f)),   otherwise.

Note that this last definition is a d.b.c., viz. it is not an iteration, whence φ⃗ ∈ I⁻ is immediate. We have that:

    M⃗(f, φ(x⃗, y)) = M⃗(f, γ(x⃗)) =_{ih(γ)} γ⃗(f ∸ 1, M⃗(f, x⃗)),                                   if y = 0,
                    M⃗(f, ψ(x⃗, φ(x⃗, y − 1))) =_{ih(ψ)} ψ⃗(f ∸ 1, M⃗(f, x⃗, φ(x⃗, y − 1))),          if y > 0.

We finish the proof by sub-induction on y. The sub-induction hypothesis will be augmented to include the assertion that:

    y ≤ f − 1 ⇒ Φ⃗(f ∸ 1, M⃗(f, x⃗), y) = φ⃗(f ∸ 1, M⃗(f, x⃗), 0, y).
Induction start (y = 0): Then, if we set w⃗ := M⃗(f, x⃗), and because M⃗(f, 0) = ⟨0, 0⟩, we have:

    φ⃗(f ∸ 1, M⃗(f, x⃗, 0)) = φ⃗(f ∸ 1, w⃗, 0, 0) := Φ⃗(f ∸ 1, w⃗, 0) := Φ⃗′(f ∸ 1, γ⃗(f ∸ 1, w⃗)) = Φ⃗′(f ∸ 1, γ⃗(f ∸ 1, M⃗(f, x⃗))).

Hence, by (z), we are done if γ⃗(f ∸ 1, M⃗(f, x⃗)) ≤_{(f∸1)+1} ⟨1, 0⟩. But

    γ⃗(f ∸ 1, M⃗(f, x⃗)) =_{ih(γ)} M⃗(f, γ(x⃗)) ≤_{(f∸1)+1} ⟨1, 0⟩   (by observation 75).

Induction step (y + 1): We have two sub-cases to investigate – when y + 1 = f and when y + 1 < f – which correspond to M⃗(f, y + 1) equaling ⟨1, 0⟩ and ⟨0, y + 1⟩ respectively. For the first sub-case, note that f − 1 = y, that now M⃗(f, y + 1) = ⟨1, 0⟩, and so (for w⃗ as above) we obtain:

    φ⃗(f ∸ 1, M⃗(f, x⃗, y + 1)) = φ⃗(f ∸ 1, w⃗, 1, 0) := ψ⃗(f ∸ 1, w⃗, Φ⃗(f ∸ 1, w⃗, f ∸ 1))
        =_{ih} ψ⃗(f ∸ 1, w⃗, φ⃗(f ∸ 1, w⃗, 0, y)) =_{ih} ψ⃗(f ∸ 1, w⃗, M⃗(f, φ(x⃗, y))) = M⃗(f, ψ(x⃗, φ(x⃗, y))).

For the last sub-case, M⃗(f, y + 1) = ⟨0, y + 1⟩, and now

    φ⃗(f ∸ 1, M⃗(f, x⃗, y + 1)) = φ⃗(f ∸ 1, w⃗, 0, y + 1) := Φ⃗(f ∸ 1, w⃗, y + 1)
        := Φ⃗′(f ∸ 1, ψ⃗(f ∸ 1, w⃗, Φ⃗(f ∸ 1, w⃗, y)))
        =_{ih} Φ⃗′(f ∸ 1, ψ⃗(f ∸ 1, w⃗, φ⃗(f ∸ 1, w⃗, 0, y)))
        =_{ih} Φ⃗′(f ∸ 1, ψ⃗(f ∸ 1, M⃗(f, x⃗, φ(x⃗, y)))).

That ψ⃗(f ∸ 1, w⃗, φ⃗(f ∸ 1, w⃗, 0, y)) ≤_{(f∸1)+1} ⟨1, 0⟩ now follows from the ih on ψ, (z) and observation 75, just as above, and suffices to conclude this proof. q.e.d.

∗ Here we rely on corollary 44 in a straightforward but essential way. For values for which this bound does not hold, of which there are but finitely many, simply redefine the involved functions to satisfy it. Later, one can then recuperate the original function by explicit d.b.c.

We can finally prove:

Corollary 77   ∀f∈D^it (f ≡_it f + 1).

Proof: Let f ∈ D^it and R ∈ I⁻_{f+1} be arbitrary. We need to construct a χ̂_R ∈ I⁻(f). Towards this end, fix ψ = χ̂_R^{f+1} ∈ I⁻, viz.

    ψ(x⃗, f(x⃗) + 1) = χ_R(x⃗).
By lemma 76, we have ψ₁ and ψ₂ satisfying:

    ψ⃗(f(x⃗), 1, 0, 0, x₁, 0, x₂, …, 0, x_k) = M⃗(f(x⃗) + 1, ψ(x⃗, f(x⃗) + 1)).

By lemma 74, ψ⃗ ∈ I⁻. Because ψ is 0–1-valued (w.l.o.g.), we have

    χ_R(x⃗) = ψ(x⃗, f(x⃗) + 1) = ψ₂(f(x⃗), 1, 0, 0, x₁, …, 0, x_k).

The function to the right clearly belongs to I⁻(f), since it is an explicit definition involving one composition of an I⁻(f) function into an I⁻(id) function. q.e.d.

With this, we have obtained one of the results we desired: the equality 5. in (CoI) on p. 134.

3.4 Directions for further research.

One of the most obvious projects to undertake, and which should not be very hard in many cases, is to map well-known idc.'s C to pairs (F^op, f ∈ D^op) so that

    C_? = F^op(f)_?.

Often, one has to associate C not with one specific detour, but with a ⪯-increasing family {f_α}_{α<λ} ⊆ D^op, where α < β ⇒ f_α ⪯ f_β, and where C_? = ⋃_{α<λ} F^op(f_α)_?. One example we have seen here, with {f_n}_{n<ω} = { x ↦ xⁿ | n ∈ N }, is that

    E²_? = S⁻_? = ⋃_{n<ω} E⁻(f_n)_? = ⋃_{n<ω} I⁻(f_n)_?.

In many cases such characterisations yield the possibility of finding (linear) sub-orders of the p.o. ⊑^op, giving rise to various (refined) hierarchies, and often one can refine classical hierarchies – like the Grzegorczyk hierarchy – almost indefinitely. It would also be of interest to map more exotic variations of the most canonical idc.'s, like the variation on limited primitive recursion found in [K&L87], to specific F-degrees.

Another way to proceed within the framework of detour degrees is to continue to obtain results like those summarised in the following theorem:

Theorem 78 (Barra, Bel'tyukov, Esbelin, Kristiansen, More, Voda)
1. op = pr ⇒ ∀k∈N, f∈D^op (f ≡_op f + k);
2. op = it ⇒ ∀k∈N, f∈D^op (f ≡_op k · f);
3. op ∈ {µ, ♯, spr} ⇒ ∀p∈N[x], f∈D^op (f ≡_op p ∘ f).

Proof: Item 1. was discussed in section 3.3.1.
For 2., by corollary 77 we first obtain id ≡_IT id + k for arbitrary k. Hence, by theorem 55, we have:

  f = id ∘ f ≡_IT (id + k) ∘ f = f + k .

In the paper Small Grzegorczyk Classes [Kut87] Miroslaw Kutylowski proves that I^0⋆ = I^1⋆. Now it is straightforward to show that

  I^0 = ⋃_{k∈N} I−(id + k)  and that  I^1 = ⋃_{k∈N} I−(kx) .

By this and id ≡_IT id + k we thus have I−(id) ⊇ I−(kx) for arbitrary k, from which the item follows directly. The reader is asked to note how this result relies on all three of corollary 77, theorem 55 and Kutylowski's theorem that I^0⋆ = I^1⋆.

For the item 3., when p ∈ N[x], that id ≡^OP p for op = µ, ♯ follows from e.g. Theorem 27 and Corollary 36 from [B09a]. That this assertion is true also for op = spr follows from results in [K&B05]. Of course, p = id + 1 is a special case of this more general result. Again theorem 55 yields:

  f = id ∘ f ≡^OP p ∘ f = pf . q.e.d.

Obtaining more results of this type could be potentially very rewarding – either of the form f ≡^OP g or f ≢^OP g. All such results would say something about how close or how different the various degree-structures really are.

Some examples of the notation in use.

Lemma 79 ∀x∈N (φ(x) ≤ φ(x + 1)) ∧ Γ_φ ∈ F⋆ ∧ f ∈ D^OP ⇒ f + φ ∈ D^OP. Furthermore f × max(φ, 1) ∈ D^F.

Proof: That Γ_{f+φ} ∈ F⋆ is a consequence of closure under bounded quantifiers and of the monotonicity of f and φ:

  f(x) + φ(x) = v ⇔ ∃z_1,z_2≤v (Γ_f(x, z_1) ∧ Γ_φ(x, z_2) ∧ z_1 + z_2 = v) .

Secondly

  x ≤ f(x) ≤ f(x) + φ(x) = (f + φ)(x) ≤ f(x + 1) + φ(x + 1) = (f + φ)(x + 1) .

Similarly

  f(x) × φ(x) = v ⇔ ∃z_1,z_2≤v (Γ_f(x, z_1) ∧ Γ_φ(x, z_2) ∧ z_1 z_2 = v) .

We need to use max(φ, 1) to ensure x ≤ f(x) ≤ f(x) × max(φ(x), 1) in case φ(x) = 0 for some x > 0. q.e.d.

Theorem 80 id ≡ x + ⌊x/m⌋ ⇔ id ≡ 2x .

Proof: Set f = id and h =def x + ⌊x/m⌋. Since φ(x) = ⌊x/m⌋ ∈ F^OP and because φ(x) ≤ φ(x + 1) we have by Lemma 79 that h ∈ D^OP.
It is easy to show that h^n(x) ≥ x + n⌊x/m⌋, and that

  x <ae 2x <ae x + 2m⌊x/m⌋ ≤ h^{2m}(x) .

We thus have that id ≼ h ≼ 2x ≼ h^{2m}. If id ≡ x + ⌊x/m⌋, then the entire h-chain is ≡. If id ≺ x + ⌊x/m⌋, clearly id ≺ 2x. q.e.d.

Similarly, we have:

Theorem 81 id ≡ x · ⌊^k√x⌋ ⇔ id ≡ x^2 . q.e.d.

Another result, which is very easily stated in terms of detour degrees, is:

Theorem 82 k ≥ 2 ⇒ id ≡_PR id + ⌊^k√x⌋ .

Proof: It is known from results in Bel'tyukov [Bel82] that

  2 ≤ k ⇒ E−(x + ⌊^k√x⌋)⋆ ⊆ ⋃_{n∈N} E−(id + n)⋆ ,

where we have written E^0⋆ as the union over n ∈ N of E−(id + n)⋆. By Kristiansen and Voda [K&V08b] we know that the hierarchy to the right collapses, yielding E−(id + n)⋆ = E_0−⋆ = E^0⋆. q.e.d.

This does not solve any of the notorious open problems, as for h = id + ⌊^k√x⌋ we have h^n(x) <ae 2x.

Theorem 83 E^0⋆ = E^1⋆ ⇔ ∀n,m>0 (mx ≡_pr nx) .

Proof: Because E_0−⋆ = E^0⋆, if E^0⋆ = E^1⋆ the proof of the ⇒-direction is analogous to the proof above that x ≡_it kx. For the ⇐-direction, if id ≡_pr 2x, set h(x) = 2x. Then h^n(x) = 2^n x, and also id ≡ h ≡ h^2 ≡ ···. Since E^1⋆ = ⋃_{k∈N} E−(kx)⋆, we would have E^1⋆ = E−(2x)⋆ = E−(id)⋆ = E^0⋆. q.e.d.

In Kutylowski [Kut87] it is shown that E^0⋆ = E^1⋆ ⇔ E^0 = E^{0,2}, where

  E^{0,2} =def [ I ∪ N ∪ {S, max} ; comp, pr_{x,2} ]

and where pr_{x,2} is a certain schema of simultaneous recursion where the second function should be {0, 1}-valued. It can be shown that E^{0,2}⋆ = E−(2x)⋆.

The promised example of a detour f – in the sense x ≤ f(x) ≤ f(x + 1) – but with a too complex graph to serve as an F-detour can also be fitted here. It is very easy to prove that E^3 = ⋃_{n∈N} E−(f_n)⋆, where f_n(x) is a tower of n 2's with an x on top, and that E^2 = ⋃_{n∈N} E−(x^n)⋆. As is well-known E^3⋆ \ E^2⋆ ≠ ∅, thus choose some predicate R in this set. Define next a function f by f(0) = 0 and

  f(x + 1) = µ_{z>f(x)}[z is even] , if x + 1 ∈ R ,
  f(x + 1) = µ_{z>f(x)}[z is odd] , if x + 1 ∉ R .

This function is clearly a detour, and we see that f ≤ 2x ≺ x^2. However, the graph cannot belong to E^2⋆, whence not to E−(x^n)⋆ for any n, since this would enable us to decide whether x belongs to R or not by the simple construction χ̂_R^f = χ_{2N} ∘ I_2^2 (just assume w.l.o.g. that 0 ∈ R). In other words, the requirement that the graph of a detour belongs to F⋆ in order to admit it as a boosting function for F's programs is there to avoid such encoding of a complex predicate into a function whose majorisation properties alone are insufficient to compute that very same predicate.

We feel that the above testifies to the usefulness of the detour degrees as a conceptual framework for discussing and stating results in small idc-theory, and it is our hope that the framework will be exploited to its full potential by researchers in this field – a tool for describing and interrelating results based on different recursive schemata, and possibly discovering new ones.

Secondly, there are many questions of interest related to jump-operators on the degree structures which deserve attention, and which might shed some light also on the relative strength of the specific foundations. We have touched this subject here, and DD also contains some results on jump-operators on D_pr. In the same category of open problems are also questions regarding the existence of incomparable degrees. How does one construct f, g ∈ D^OP such that f ⊥ g? Again Kristiansen and Voda report on some results in [K&V08], but we have nothing new of any significance to report on in this department either. On a personal note this author must admit that the construction given in [K&V08] for two incomparable pr-degrees a_1 and a_2 does not seem completely satisfying, inasmuch as a detailed proof that 0 < a_i < 0′ is missing.
Indeed, the details of a_1 ∩ a_2 = 0 and a_1 ∪ a_2 = 0′ are provided, but the possibility 0 = a_1 < a_2 = 0′ is not convincingly argued. However, the reason why this (missing) part of the proof is difficult at all is quite intriguing.

Another question which has been neglected is e.g. how definitions increase in size when one eliminates the successor from a [{+} ; bmin]-function to obtain an equivalent [{−̇} ; bmin]-function, or how they grow when an (id + k)-detour is to be simulated by an id-detour. There is a strong tradition for investigating such questions in logic, even though this falls outside the scope of idc.-theory as such. It would also be interesting to analyse how efficient the algorithms extracted from an idc.-definition are when run on a TM as a rewriting system. Such questions were studied in e.g. this author's MSc thesis [B04], and by Oitavem in [Oit02] and Beckmann and Weiermann in [B&W96].

It is our belief that working with the detour degrees will prove rewarding and challenging for any researcher in this field. In the next and final chapter, a quite different topic is treated. However, keeping in mind that the functions studied there are all Δ^N_0-detours, the functions of chapter 4 may prove to be important also for understanding the theory discussed in this chapter.

Majorisation

  Unity is strength... when there is teamwork and collaboration, wonderful things can be achieved.
  – Mattie Stepanek

This entire chapter – including the embedded article Skolem + Tetration is well-ordered [B&G09] – is the result of the joint enterprise of Philipp Gerhardy and the author of this thesis to better understand the majorisation relation. I am indebted to Philipp – both as a friend and as a colleague – for having undertaken this task with me. The question I pitched to Philipp in the spring of 2008 was simply whether he had any ideas about how to determine the ≼-relationship between f^g and g^f for certain detours f and g.
Luckily for me, the question immediately caught Philipp's interest, and roughly a year later we have been able to answer not only my original question, but also to develop new tools for studying the majorisation relation, answer a long-forgotten problem of Skolem's, and – as always in mathematics – uncover an array of interesting problems for logicians to pit our skills against. In order to better fit in the context of this dissertation, I have taken the liberty of rewriting our working-notes so as to enable the reuse of notation and terminology, and to avoid unnecessary repetitions. All results and proofs are the product of text-book mathematical collaboration: coffee, blackboards, long nights and – occasionally – a few beers!

4.1 Definitions

This chapter deals with the majorisation relation '≼'. Recall from definition 1 that

  f ≼ g ⇔def ∃N∈N ∀x∈N (x ≥ N ⇒ f(x) ≤ g(x)) , viz. f ≤ae g .

Our 'Ackermann-Péter'-functions.

Definition 46 We define two sequences of functions E_k and T_k by:

  E_1(x, y) =def x + y ,  T_1(x, y, z) =def z · x + y  and  E_2(x, y) =def x · y ,

and, for k ≥ 2,

  E_{k+1}(x, y) =def T_k(x, 1, y)  and  T_k(x, y, z) =def { y , z = 0 ; E_k(x, T_k(x, y, z − 1)) , z > 0 } .

Definition 47 Define∗ the operator of rank-k exponentiation e_k : N^N × N^N → N^N by

  e_k(f, g) =def x ↦ E_k(f(x), g(x)) = E_k ∘ (f, g) .

Define:

  A_k =def [ 1, id ; e_1, . . . , e_k ]  and  A =def ⋃_{k∈N} A_k .

Also, we consider this idc.-like description of the classes A_k to be a sufficient substitute for a more cumbersome syntactic definition of terms (or something similar) in order to discuss them from a syntactic point of view.

The reader will recognize E_n as roughly the nth Ackermann branch or the nth Grzegorczyk function, and thus each E_n ∈ E^n, while x ↦ E_x(x) is super-primitive recursive. Note also that E_3 is the usual exponential function. It is also a mere observation that A_k ⊂ E^k.
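As a small aside, the recursions of definition 46 translate directly into code. The following sketch (with the ad hoc names E and T, and the tower-recursion unwound into a loop) confirms, for tiny arguments, that E_3 is ordinary exponentiation and that T_3 builds towers:

```python
def E(k, x, y):
    # Definition 46: E_1 = addition, E_2 = multiplication, E_{k+1}(x, y) = T_k(x, 1, y).
    if k == 1:
        return x + y
    if k == 2:
        return x * y
    return T(k - 1, x, 1, y)

def T(k, x, y, z):
    # Definition 46: T_1(x, y, z) = z*x + y; for k >= 2, T_k applies E_k(x, .) to y, z times.
    if k == 1:
        return z * x + y
    for _ in range(z):
        y = E(k, x, y)
    return y

print(E(3, 2, 10))    # E_3 is exponentiation: 2^10 = 1024
print(T(3, 2, 1, 4))  # a tower of four 2's: 2^(2^(2^2)) = 65536
print(E(4, 2, 3))     # E_4(2, 3) = T_3(2, 1, 3) = 16
```

The values explode so quickly that only the very smallest arguments are feasible, but for those the recursions behave exactly as described.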
Furthermore, if we had included comp in the A_k, then the unary functions of the resulting class would coincide with A_k as we have defined it now. Thirdly, A_k can be described as the intersection between A and E^k.

For the suspicious expert we also point out that this characterisation of A_k does not conflict with Schwichtenberg's result from [Sch69] that E^3 is the second class in a hierarchy induced by so-called recursion-rank. Indeed, we do not have recursion at all in our sets – there is a finite number of operators in the idc. characterisation of the A_k's.

We have that T_2(x, y, z) = x^z · y when x ≥ 1. Hence E_3(x, 0) =def T_2(x, 1, 0) =def 1, which means that if we consider 1 = T_2(0, 1, 0) = 0^0 · 1, then the 'inherited' interpretation of '0^0' is 1. We also see that T_3(x, y, k) is a tower of k x's topped off with a y, while E_4(x, y) is a tower of y x's:

  T_3(x, y, k) = x^{x^{⋰^{x^y}}}  (k x's)  and  E_4(x, y) = x^{x^{⋰^{x}}}  (y x's),

where the exponentiation associates upwards: a tower of x's of height k + 1 is x raised to a tower of x's of height k.

Observation 84 2 ≤ x, y ⇒ x + y ≤ x · y ≤ x^y .

Observation 85 Note that defining the E_k's by:

  E_k(x, 1) =def x  and  E_{k+1}(x, y + 1) =def E_k(x, E_{k+1}(x, y))

would yield the same E-functions (for non-zero arguments). However, in addition to providing versatile notation∗∗, conceptually the T_k-functions will provide a means for generalising the familiar notion of degree as associated with polynomials, to functions further up in the hierarchy.

∗ Please note that the convention of including the schema of composition in an idc. does not apply in this chapter. Thus calling A_k an idc. is an abuse of the term if we are to be consistent with the discussion in chapter 3. Let us agree that this is a minor issue, since we will not be appealing to any idc. results here, except for the convenient fact that we will be able to prove results about functions in A by induction on the build-up.

∗∗ Think generalised Exponential, and generalised exponential Tower, and note that e.g. T_3(x, y, z) is a tower of x's, of height z, topped off with a y.

The subject matter.
Our long-range goal is to characterise the structure (A, ≼) as far as possible. As pointed out, (N^N, ≼) is reflexive, transitive and almost-anti-symmetrical: it is an 'almost-p.o.'. That is, if we consider N^N/=ae – the set N^N divided out by the equivalence relation '=ae' – then ≼ is well-defined and a p.o. on N^N/=ae. In particular we are curious as to whether (A, ≼) is a linear order, even perhaps a well-order – and if so – can we say anything about its order type, or the order type of some initial segment (A_n, ≼).

4.2 On the Order

We assume that the reader is familiar with the concept of an ordinal (and that of a well-order) and their arithmetic. For more on ordinals, consult e.g. Sierpiński [Sie65].

That A_2 = N[x] is rather obvious, whence O(A_2, ≼) = ω^ω as is well-known, and there simply is not much more to be said. The well-order (A_3, ≼) is much more enigmatic. It has been studied extensively in the past, and we recall here that ε_0 ≤ O(A_3, ≼) ≤ τ_0.

Above, ε_0 is the least solution to the (ordinal) equation ω^α = α, viz. ω^{ε_0} = ε_0. This number is known as epsilon-nought, or the least epsilon number. There are as many epsilon-numbers as there are ordinals, and the epsilon numbers are alternatively characterised as the principal exponential numbers due to the truth of the following assertion: β, γ < α ⇒ β^γ < α; i.e. the epsilon numbers are closed under ordinal exponentiation. τ_0 is the least solution to the equation ε_α = α, viz. ε_{τ_0} = τ_0, and is also known by the name the least critical epsilon number. In particular, τ_0 is an epsilon number. Both ordinals also feature in the Veblen hierarchy, where they correspond to φ(1, 0) and φ(2, 0) respectively. For more on the Veblen-function φ mentioned above, consult e.g. [Veb08].

That (A_3, ≼) is a total linear order was shown by D. Richardson [Ric69]; that it is indeed a well-order is due to Ehrenfeucht [Ehr73]. Furthermore, it is known that ε_0 ≤ O(A_3, ≼) ≤ τ_0.
The lower bound follows from Levitz's result that (A_3, ≼) contains a sub-order of length ε_0 ([Lev77]). The same author provided the upper bound shortly after in [Lev78], where the following results – translated into our notation – were also proved:

Theorem L (Levitz [Lev78])
(i) n < m ⇒ T_3(2, x, n) ≺ T_3(2, x, m) ;
(ii) ∀n,a,b∈N ∃c∈N (T_3(2, x^a, n) · T_3(2, x^b, n) ≼ T_3(2, x^c, n)) ;
(iii) ∀a∈N (T_3(2, x^a, n) ≺ T_3(2, x, n + 1)) ;
(iv) f ≺ T_3(2, x, n + 1) ⇒ ∃a∈N (f ≼ T_3(2, x^a, n)) ;
(v) f ≺ T_3(2, x, n + 1) ∧ g ≺ T_3(2, x, n) ⇒ f^g ≺ T_3(2, x, n + 1) . q.e.d.

Now, the above theorem contains majorisation-results, that is, results describing the eventual relationship between unary functions. They are almost everywhere-results. We will prove various majorisation results – amongst them a uniform-in-k version of the items (i)–(v) above – in section 4.2.3 below. First we will prove several monotonicity-results in the next section. What distinguishes monotonicity from majorisation is that the latter depends on the former type of result, and that monotonicity makes sense also for several arguments directly. It is straightforward to define f ≺ g when f : N^k → N and g : N → N, and the most common way is to define

  f ≺ g ⇔def f(~x) ≤ae g(max(~x)) .

This gives no way of comparing two k-ary functions directly, and this relation is not of any particular use for us in this work.

4.2.1 Monotonicity properties.

Proposition 86 Let k ≥ 2. Then:
(a) ∀x,y (T_k(x, 1, y) = T_{k+1}(x, y, 1)) ;
(b) ∀x (T_k(x, 1, 1) = E_k(x, 1) = x) ;
(c) ∀x≥1 ∀y (E_k(x, y) ≤ E_k(x, y + 1)) ;
(d) ∀x,y≥2 (E_k(x, y) ≤ E_{k+1}(x, y)) ;
(e) ∀y≥1 ∀x (E_k(x, y) < E_k(x + 1, y)) ;
(f) ∀x,y,z≥2 (x, y, z < T_k(x, y, z)) ;
(g) ∀x,y,z≥2 (T_k(x, y, z) + 1 < T_k(x + 1, y, z), T_k(x, y + 1, z), T_k(x, y, z + 1)) ;
(h) ∀x,y≥2 (E_k(x, y) ≤ E_k(x, y + 1) − x) .

Proof:
(a): T_{k+1}(x, y, 1) =def E_{k+1}(x, T_{k+1}(x, y, 0)) =def E_{k+1}(x, y) =def T_k(x, 1, y) .
(b): By induction on k: For k = 2 we have T_2(x, 1, 1) = x^1 · 1 = x. Next,

  T_{k+1}(x, 1, 1) =def E_{k+1}(x, T_{k+1}(x, 1, 0)) =def E_{k+1}(x, 1) =def T_k(x, 1, 1) =IH x .

So in fact this holds for all x.

(c): By induction on k.
case k = 2: E_2(x, y) =def x · y ≤ x · y + x = x · (y + 1) =def E_2(x, y + 1) .
case k + 1: By sub-induction on y. Note that z_0 ≤ z_1 ⇒IH(k) E_k(x, z_0) ≤ E_k(x, z_1) .
sub-case y = 0: Then

  E_{k+1}(x, y) = E_{k+1}(x, 0) =def T_k(x, 1, 0) =def 1 ≤ x =(b) E_{k+1}(x, 1) = E_{k+1}(x, y + 1) .

sub-case y + 1: Then:

  E_{k+1}(x, y + 1) =def T_k(x, 1, y + 1) =def E_k(x, T_k(x, 1, y)) =def E_k(x, E_{k+1}(x, y)) ,

and similarly E_{k+1}(x, y + 2) = E_k(x, E_{k+1}(x, y + 1)). The result now follows from inferring first

  z_0 =def E_{k+1}(x, y) ≤IH(y) E_{k+1}(x, y + 1) =def z_1 ,

whence E_k(x, z_0) ≤IH(k) E_k(x, z_1) .

(d): By induction on k – the case k = 2 is observation 84.
case k + 1: By sub-induction on y:
sub-case y = 2: Then:

  E_{k+2}(x, 2) =def T_{k+1}(x, 1, 2) =def E_{k+1}(x, T_{k+1}(x, 1, 1)) =(b) E_{k+1}(x, x) ,

while similarly E_{k+1}(x, 2) = E_k(x, x). The sub-case now follows by the ih(k).
sub-case y + 1: We have:

  E_{k+2}(x, y + 1) =def T_{k+1}(x, 1, y + 1) =def E_{k+1}(x, T_{k+1}(x, 1, y)) =def E_{k+1}(x, E_{k+2}(x, y)) ,

and also E_{k+1}(x, y + 1) = E_k(x, E_{k+1}(x, y)). For z_0 =def E_{k+1}(x, y) and z_1 =def E_{k+2}(x, y) we now have z_0 ≤IH(y) z_1, whence

  E_k(x, z_0) ≤(c) E_k(x, z_1) ≤IH(k) E_{k+1}(x, z_1) ,

which completes the proof of (d).

(e): By induction on k – the induction start is covered by (c).
case k + 1: By sub-induction on y.
sub-case y = 0: Observe first that E_{k+1}(x, 0) =def T_k(x, 1, 0) =def 1 for all x, thus E_{k+1}(x, 0) = E_{k+1}(x + 1, 0) .
sub-case y + 1: Then E_{k+1}(x, y + 1) = E_k(x, E_{k+1}(x, y)), while E_{k+1}(x + 1, y + 1) = E_k(x + 1, E_{k+1}(x + 1, y)). For z_0 =def E_{k+1}(x, y) and z_1 =def E_{k+1}(x + 1, y) we conclude that z_0 ≤IH(y) z_1, whence

  E_k(x, z_0) ≤(c) E_k(x, z_1) <IH(k) E_k(x + 1, z_1) ,

whence the conclusion follows.
(f) & (g): Follow easily from the previous items.

(h): If k = 2 we have equality, while for k > 2 we have that

  E_k(x, y + 1) =def E_{k−1}(x, E_k(x, y)) ≥ E_1(x, E_k(x, y)) = x + E_k(x, y) . q.e.d.

Note that for almost all arguments x, y, z the above inequalities are 'very strict', and could most probably be replaced by much stronger majorisation results. First of all, majorisation will be the subject of the next section, but then for unary functions. In this section we focus on monotonicity in several variables, properties which testify to how fast the hierarchy accelerates. The next proposition gives some special identities for the E_k's.

Proposition 87 Let k ≥ 2. Then:
(0) E_k(0, x) ∈ {0, 1} ;
(1) E_{k+1}(x, 0) = 1 and E_k(x, 1) = x ;
(2) E_k(x, 2) = E_{k−1}(x, x) ;
(4) E_k(2, 2) = 4 .

Proof: We prove (0) by induction on k.
case k = 2: E_2(0, x) = E_2(x, 0) = 0 .
case k + 1: By sub-induction on x.
sub-case x = 0: This case also proves the first part of (1). We have:

  E_{k+1}(y, 0) =def T_k(y, 1, 0) =def 1 ∈ {0, 1} ,

which in particular proves this case for y = 0.
sub-case x > 0: We have:

  E_{k+1}(0, x) =def T_k(0, 1, x) =def E_k(0, T_k(0, 1, x − 1)) ∈IH(k) {0, 1} .

This proves (0). We have already seen that E_{k+1}(x, 0) = 1. That E_k(x, 1) = x, and the items (2) and (4), follow directly from the definitions. q.e.d.

The numbers 2 and 4 play a special rôle, and by the above we have e.g. T_k(2, 2, 1) = T_k(2, 1, 2) = 4 while T_k(2, 2, 2) = E_k(2, 4). It is also easy to prove that k ≥ 3 ⇒ E_k(1, x) = 1 for all x – in particular E_k(1, 2) = 1 – and E_k(2, 1) = 2 holds for all k. Thus if z ∈ {0, 1} and k ≥ 3, then we have

  E_k(x, y):   y = 0   y = 1   y = 2   y = 3   ···
  x = 0:         1       z       z       z     ···
  x = 1:         1       1       1       1     ···
  x = 2:         1       2       4      a_k    ···
  x = 3:         1       3      b_k     c_k    ···

Hence, 'things only start to happen above 2', and we may have an odd behaviour of our E_k's when the first argument is 0.
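As an aside, the special values of proposition 87 – and the table of small values – are easy to check mechanically; a small sketch, with ad hoc names E and T implementing the recursions of definition 46:

```python
def E(k, x, y):
    # Definition 46: E_1 = addition, E_2 = multiplication, E_{k+1}(x, y) = T_k(x, 1, y).
    if k == 1:
        return x + y
    if k == 2:
        return x * y
    return T(k - 1, x, 1, y)

def T(k, x, y, z):
    # Definition 46, with the tower recursion unwound into a loop.
    if k == 1:
        return z * x + y
    for _ in range(z):
        y = E(k, x, y)
    return y

for k in range(3, 6):
    assert E(k, 2, 2) == 4                            # item (4)
    for x in range(6):
        assert E(k, x, 1) == x and E(k, x, 0) == 1    # item (1)
        assert E(k, 0, x) in (0, 1)                   # item (0)
    for x in range(4):
        assert E(k, x, 2) == E(k - 1, x, x)           # item (2)

# the 'z'-row of the table: for k = 4 the values E_k(0, y), y >= 1, alternate
print([E(4, 0, y) for y in range(1, 5)])  # -> [0, 1, 0, 1]
```

The last line exhibits the alternating z-values in the x = 0 row.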
This is to be expected, since implicitly '0^0' is embellished with a determinate value∗ by our definitions. What proposition 87 shows is that this is no problem, since the anomalies are confined to a very few special initial values, and our interest lies with the majorisation relation – which is only connected with the properties of the tail∗∗.

Observation 88 It is useful to note that when x, y, z ≥ 2, and k ≥ 3, then T_k(x, y, z) and E_k(x, y) are powers x^a for some a ∈ N. This is obvious for k = 3, and follows by induction on k from the identity E_{k+1}(x, 2) = E_k(x, x) and sub-induction on y ≥ 2 in E_k(x, y). In particular, this means that e.g. x · T_k(x, y, z) = x^{a+1} ≤ T_k(x, y + 1, z), T_k(x, y, z + 1), since each term to the right is strictly greater than T_k(x, y, z) and a power of x. q.e.d.

The next proposition generalises the familiar fact (in standard tower-of-twos notation 2^x_n) that 2^x_{n+m} = 2^{(2^x_n)}_m .

Proposition 89 min(k, x, y, n) ≥ 1 ⇒ T_k(x, y, n + m) = T_k(x, T_k(x, y, n), m) .

Proof: For k = 1, 2 the proofs are easy calculations. When k ≥ 3 we proceed by induction on m, for all x, y, n simultaneously (so we omit the k from notation). The case m = 0 is trivial.
case m ≥ 1: Set m = m′ + 1, and consider that

  T(x, y, n + m) =def E(x, T(x, y, n + m′)) =IH E(x, T(x, T(x, y, n), m′)) =def T(x, T(x, y, n), m′ + 1) ,

which concludes the proof. q.e.d.

Our next theorem says that in the function T_k(x, y, z), in a quite precise sense, the argument z is more significant than y, which in turn is more significant than x. Similarly for E_k(x, y). Of course, this intuition is simply false for E_2 by commutativity of multiplication.

Theorem 90 k ≥ 3 ∧ y ≥ 2 ∧ x, z ≥ 1 ⇒ E_k(E_k(x, y), z) ≤ E_k(x, E_{k−1}(y, z)) . (ih(k))

Proof: We show the result by induction on k, with sub-induction on z. The case k = 3 reduces to simple calculations.
case k + 1:
sub-case z = 1: We have

  E_{k+1}(E_{k+1}(x, y), 1) = E_{k+1}(x, y) = E_{k+1}(x, E_k(y, 1)) .
∗ Since T_2(x, y, z) = x^z · y when not both x and y are zero, we have 0^0 · 1 = 1 – suggesting 0^0 = 1.
∗∗ The interested reader will be able to verify that the sequence of z's will be 010101···.

sub-case z + 1:

  E_{k+1}(E_{k+1}(x, y), z + 1) =def E_k(E_{k+1}(x, y), E_{k+1}(E_{k+1}(x, y), z)) ≤IH E_k(E_{k+1}(x, y), E_{k+1}(x, E_k(y, z))) .

We will need:

claim i x, u, v ≥ 1 ⇒ E_k(E_{k+1}(x, u), E_{k+1}(x, v)) ≤ E_{k+1}(x, u + v) .

Proof of claim: By induction on u.
case u = 1: Observe that

  E_k(E_{k+1}(x, 1), E_{k+1}(x, v)) =def E_k(x, E_{k+1}(x, v)) = E_{k+1}(x, v + 1) .

case u + 1: We obtain

  E_k(E_{k+1}(x, u + 1), E_{k+1}(x, v)) = E_k(E_k(x, E_{k+1}(x, u)), E_{k+1}(x, v)) ≤ih(k) E_k(x, E_{k−1}(E_{k+1}(x, u), E_{k+1}(x, v))) ≤ih(u) E_k(x, E_{k+1}(x, u + v)) =def E_{k+1}(x, u + v + 1) ,

which completes the proof of claim i.

Finally:

  E_k(E_{k+1}(x, y), E_{k+1}(x, E_k(y, z))) ≤C.I E_{k+1}(x, E_k(y, z) + y) ≤ E_{k+1}(x, E_k(y, z + 1)) . q.e.d.

Remark 5 Of course, since E_{k−1}(x, y) ≤P.86(d) E_k(x, y), in conjunction with proposition 86(c) the theorem 90 can be used to infer that

  E_k(E_{k−1}(x, y), z) ≤ E_k(E_k(x, y), z) ≤ E_k(x, E_{k−1}(y, z)) ≤ E_k(x, E_k(y, z)) .

That is, in an 'E_k-context' we can infer an inequality e.g. 'from k − 1 to k − 1' or 'from k to k'.

We have two immediate corollaries. The first generalises the inequality (a tower of u x's) · (a tower of v x's) ≤ (a tower of max(u, v) + 1 x's):

Corollary 91 k ≥ 3 ∧ x, u, v ≥ 2 ⇒ E_k(E_{k+2}(x, u), E_{k+2}(x, v)) ≤ E_{k+2}(x, max(u, v) + 1) .

Proof:

  E_k(E_{k+2}(x, u), E_{k+2}(x, v)) =def E_k(E_{k+1}(x, E_{k+2}(x, u − 1)), E_{k+1}(x, E_{k+2}(x, v − 1))) ≤T.90(C.I) E_{k+1}(x, 2 · E_{k+2}(x, max(u − 1, v − 1))) ≤ E_{k+1}(x, E_{k+2}(x, max(u, v))) = E_{k+2}(x, max(u, v) + 1) . q.e.d.

We also have the following corollary – which for k = 3 states that for most∗ x, y, z we have that a tower of z x's, raised to the power y, equals x raised to (y times a tower of z − 1 x's), which is at most T_3(x, y, z):

Corollary 92 k ≥ 3 ∧ y ≥ 2 ∧ x, z ≥ 1 ⇒ T_k(T_k(x, 1, z), y, 1) ≤ T_k(x, y, z) .
Proof: Recall that T(x, y, 1) = E(x, y), whence T(T(x, 1, z), y, 1) = E(T(x, 1, z), y). By induction on z ≥ 1.
case z = 1: Then both sides evaluate to E(x, y).
case z + 1: For such z we obtain:

  T(x, y, z + 1) =def E(x, T(x, y, z)) ≥IH E(x, E(T(x, 1, z), y)) ≥T.90 E(E(x, T(x, 1, z)), y) =def E(T(x, 1, z + 1), y) ,

which concludes the proof (we have appealed to remark 5 above). q.e.d.

Lemma 93 k ≥ 2 ∧ x ≥ 4 ∧ y ≥ 2 ⇒ E_k(y, x) ≤ E_{k+1}(y, x − 1) .

Proof: For k = 2, the assertion is that y · x ≤ y^{x−1}. We have

  y · x ≤ y^{x−1} ⇔ log(x) + log(y) ≤ log(y)(x − 1) ⇔ log(x)/log(y) + 2 ≤ x .

Because y ≥ 2, we also have log(x)/log(y) ≤ log(x), and it is easy to verify that x ≥ 4 ⇒ log(x) + 2 ≤ x.

For k ≥ 3, the case is by induction on x for all k and y simultaneously.
case x = 4: Then

  E_k(y, 4) ≤(y≥2 & P.87(4)) E_k(y, E_k(y, y)) =P.87(2) E_k(y, E_{k+1}(y, 2)) = E_{k+1}(y, 3) .

case x + 1: Then

  E_k(y, x + 1) = E_{k−1}(y, E_k(y, x)) ≤IH E_{k−1}(y, E_{k+1}(y, x − 1)) ≤ E_{k+1}(y, x) . q.e.d.

That E_3(x, 2) = x^2 ≤ 2^x = E_3(2, x) 'from x = 4' is easily seen. This feature is uniform in k, as the next lemma shows.

Lemma 94 k ≥ 3 ∧ x ≥ 4 ⇒ E_k(x, 2) ≤ E_k(2, x) .

Proof: The proof is by induction on k ≥ 3 with sub-induction on x ≥ 4. The induction start simply states that x^2 ≤ 2^x when x ≥ 4, a rather obvious fact. For the sub-induction step we will need

  ℓ ≥ 2 ∧ x ≥ 3 ⇒ x + 1 ≤ 2(x − 1) ≤ E_ℓ(2, x − 1) . (z)

induction start x = 4:

  E_{k+1}(4, 2) =P.87(2) E_k(4, 4) =P.87(4) E_k(E_k(2, 2), E_{k+1}(2, 2)) ≤T.90 E_k(2, E_k(2, E_{k+1}(2, 2))) = E_{k+1}(2, 4) .

induction step x + 1:

  E_{k+1}(x + 1, 2) =P.87(2) E_k(x + 1, x + 1) ≤(z) E_k(E_{k−1}(2, x − 1), x + 1) ≤T.90 E_k(2, E_{k−1}(x − 1, x + 1)) ≤L.93 E_k(2, E_k(x − 1, x)) ≤P.87(2),IH(x) E_k(2, E_{k+1}(2, x)) = E_{k+1}(2, x + 1) . q.e.d.

∗ E.g. in the sense 'when min(x, y, z) ≥ 2'.

Theorem 95 k ≥ 3 ∧ y ≥ 2 ⇒ E_k(2, E_k(y, z)) ≤ T_k(2, y, z) .

Proof: By induction on z ≥ 1, for all k simultaneously (z = 0 is proved in the end).
case z = 1: Then T(2, y, 1) = E(2, y) =P.87(1) E(2, E(y, 1)).
case z + 1: Then:

  T(2, y, z + 1) =def E(2, T(2, y, z)) ≥IH E(2, E(2, E(y, z))) ≥L.94 E(2, E(E(y, z), 2)) =P.87(2) E(2, E_{k−1}(E(y, z), E(y, z))) ≥(y≥2) E(2, E_{k−1}(y, E(y, z))) = E(2, E(y, z + 1)) .

Finally, observe that z = 0 yields E(2, E(y, 0)) = E(2, 1) = 2 ≤ y = T(2, y, 0), so indeed the result holds for all z. q.e.d.

Lemma 96
(i) k ≥ 3 ∧ x, y ≥ 2 ∧ z ≥ 1 ⇒ x · E_k(y, z) ≤ E_k(y, z + x − 1) ;
(ii) k ≥ 3 ∧ x ≥ 2 ∧ y, z ≥ 1 ⇒ E_k(xy, z) ≤ E_k(x, yz) ;
(iii) k ≥ 3 ∧ x, y ≥ 1 ⇒ E_k(x, y) ≤ E_k(2, xy) ;
(iv) k ≥ 3 ∧ x ≥ 2 ∧ y ≥ z ≥ 2 ⇒ E_k(x, ⌊y/z⌋) ≤ ⌊E_k(x, y)/z⌋ .

Proof:
(i): By induction on x ≥ 1 for all k ≥ 3 and y ≥ 2 simultaneously – the induction start is a trivial equality.
case x + 1:

  (x + 1) · E_k(y, z) = x · E_k(y, z) + E_k(y, z) ≤IH E_k(y, z + x − 1) + E_k(y, z) ≤ 2 · E_k(y, z + x − 1) ≤ E_{k−1}(2, E_k(y, z + x − 1)) ≤ E_{k−1}(y, E_k(y, z + x − 1)) =def E_k(y, z + x) .

That (ii) ⇒ (iii) follows from the simple observation that x ≤ 2x.

(ii): We first observe that for k = 3 the assertion is obvious: (xy)^z ≤ x^{yz} ⇔ y^z ≤ (x^{y−1})^z, and since x ≥ 2 we have y ≤ x^{y−1}. For k + 1 the proof is by sub-induction on z ≥ 1:
case z = 1: E_{k+1}(xy, 1) = xy = E_2(x, y) ≤ E_{k+1}(x, y) .
case z + 1:

  E_{k+1}(xy, z + 1) =def E_k(xy, E_{k+1}(xy, z)) ≤IH(k,y) E_k(x, y · E_{k+1}(x, yz)) ≤(i) E_k(x, E_{k+1}(x, zy + y − 1)) =def E_{k+1}(x, zy + y) = E_{k+1}(x, y(z + 1)) .

For (iv), write y = zq + r where 0 ≤ r < z, and note that q ≥ 1. Then:

  E_k(x, ⌊y/z⌋) = E_k(x, q) ≤ E_k(x, z(q − 1) + 1) = (z · E_k(x, z(q − 1) + 1))/z ≤(i) E_k(x, z(q − 1) + 1 + (z − 1))/z = E_k(x, zq)/z ≤ E_k(x, zq + r)/z = E_k(x, y)/z . q.e.d.

By combining (i) and (ii) we also see that x · E_k(y, z) ≤ E_k(y, xz) as expected.

Observation 97 Let x = py + r where p ≥ 2, y > r ≥ 0 and y ≥ 2; viz. x ≥ 2y ≥ 4. Then x − y ≥ ⌊x/y⌋, since

  x − y = (p − 1)y + r ≥ 2(p − 1) ≥ p = ⌊x/y⌋ . (z) q.e.d.
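Several of the inequalities above can be spot-checked numerically for tiny arguments (the values explode so fast that nothing more is feasible); a sketch, again with ad hoc names E and T implementing the recursions of definition 46:

```python
def E(k, x, y):
    # Definition 46: E_1 = addition, E_2 = multiplication, E_{k+1}(x, y) = T_k(x, 1, y).
    if k == 1:
        return x + y
    if k == 2:
        return x * y
    return T(k - 1, x, 1, y)

def T(k, x, y, z):
    # Definition 46, with the tower recursion unwound into a loop.
    if k == 1:
        return z * x + y
    for _ in range(z):
        y = E(k, x, y)
    return y

# Proposition 89: T_k(x, y, n + m) = T_k(x, T_k(x, y, n), m)
for (x, y, n, m) in [(2, 3, 1, 1), (3, 2, 1, 2), (2, 2, 2, 1)]:
    assert T(3, x, y, n + m) == T(3, x, T(3, x, y, n), m)

# Lemma 93: E_k(y, x) <= E_{k+1}(y, x - 1) for x >= 4, y >= 2
assert E(2, 2, 4) <= E(3, 2, 3) and E(3, 2, 5) <= E(4, 2, 4)

# Lemma 94: E_k(x, 2) <= E_k(2, x) for x >= 4
assert E(3, 5, 2) <= E(3, 2, 5) and E(4, 4, 2) <= E(4, 2, 4)

# Lemma 96 (i) and (iii), sampled at k = 3
assert 3 * E(3, 2, 2) <= E(3, 2, 2 + 3 - 1)
assert E(3, 3, 4) <= E(3, 2, 12)
```

Such spot-checks of course prove nothing, but they are a useful guard against misremembered side conditions.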
Lemma 98 For x, y as in observation 97 we have:

  k ≥ 3 ∧ z ≥ 2 ⇒ E_k(z, x)/E_k(z, y) ≥ E_k(z, ⌊x/y⌋) .

Proof: By induction on k ≥ 3. For k = 3 the assertion follows from (z) above, since

  E_3(z, x)/E_3(z, y) = z^x/z^y = z^{x−y} ≥ z^{⌊x/y⌋} = E_3(z, ⌊x/y⌋) .

induction step k + 1: Assume first 4 = x = 2y. Then, by expanding the numerator in E_{k+1}(z, 4)/E_{k+1}(z, 2) and applying lemma 96(iv) twice, we see that the result holds (see also below). Assume next that x = 2(y + 1). Then

  E_{k+1}(z, 2(y + 1))/E_{k+1}(z, y + 1) =def E_k(z, E_k(z, E_{k+1}(z, 2y)))/E_k(z, E_{k+1}(z, y)) ≥IH(k) E_k(z, ⌊E_k(z, E_{k+1}(z, 2y))/E_{k+1}(z, y)⌋) ≥L.96(iv) E_k(z, E_k(z, ⌊E_{k+1}(z, 2y)/E_{k+1}(z, y)⌋)) ≥IH(y) E_k(z, E_k(z, E_{k+1}(z, y))) =def E_{k+1}(z, y + 2) ≥ E_{k+1}(z, y + 1) .

If x > 2y, then E_{k+1}(z, x) ≥ E_{k+1}(z, 2y), which concludes the proof. q.e.d.

Lemma 99 Let T_k(2, 1, a) ≥ b. Then:

  T_k(2, y, a) ≥ T_k(b, y, 1) ⇒ ∀m (T_k(2, y, ma) ≥ T_k(b, y, m)) .

Proof: By induction on m, for all k ≥ 3 simultaneously. For m = 0 we have that both expressions evaluate to y, while m = 1 yields T(2, y, a) ≥ T(b, y, 1), which is the hypothesis of the lemma. We proceed by induction on m ≥ 1:
induction step m + 1: Then

  T(2, y, (m + 1)a) =L.89 T(2, T(2, y, ma), a) ≥IH T(2, T(b, y, m), a) ≥C.92 T(T(2, 1, a), T(b, y, m), 1) ≥(z) T(b, T(b, y, m), 1) =dfd T(b, y, m + 1) ,

where '=dfd' follows directly from the definitions and (z) is the standing assumption T(2, 1, a) ≥ b. q.e.d.

In particular: T(2, 1, a) ≥ b = T(b, 1, 1) ⇒ T_k(2, 1, am) ≥ T_k(b, 1, m) .

We register that we have been able to prove quite strong monotonicity properties. In particular, theorems 90 and 95 will prove versatile tools later. Another thing to notice is that in most of the steps in any of the chains of inequalities presented in the proofs above, an occurrence of say '≤T.90' is in fact very strict for most arguments, and most results can probably be improved and strengthened as majorisation results.
The crucial attribute of this section's results is that they are monotonicity results in several variables, and as such quite strong nevertheless. We now turn our attention to the subject matter – majorisation.

4.2.2 Majorisation properties.

In this section we exploit the monotonicity properties developed in the previous section to obtain interesting and useful majorisation-results.

On the rank-k-monomials. Consider the function:

  M_k(m) = x ↦ T_k(x, 1, m) .

For k = 2 we obtain M_2(m) = x^m, so that M_2(m) is a polynomial of degree m; more specifically, {M_2(m) | m ∈ N} are exactly the monomials. Now, let p, q ∈ N[x], recall that E_1(x, y) =def x + y, and note that if p, q ≺ M_2(m), then also p + q ≺ M_2(m):

  p, q ≺ T_2(x, 1, m) ⇒ E_{2−1}(p, q) ≺ T_2(x, 1, m) .

We also have that ∀p∈N[x] (p ≺ 2^x =def E_3(2, x)), and that:

  ∀p,q∈N[x] (E_2(p, q) ≺ E_{2+1}(2, x) =def T_2(2, 1, x)) .

Anticipating a similar behaviour 'above 2' – we define:

Definition 48 The function M_k(m)(x) =def T_k(x, 1, m) = E_{k+1}(x, m) is called the rank-k-monomial of degree m. The function E_{k+1}(2, x) =def T_k(2, 1, x) is called the minimal k-exponential.

We thus have M_k(0) = 1 and M_k(1) = x for all k. The function E_4(x, x) = E_5(x, 2) = M_4(2) also coincides with the function denoted '^x x' in [B&G09], and for m = 2 the picture looks like this:

  M_k(2) = E_{k+1}(x, 2):  for k = 2 it is E_3(x, 2) = x^2 = x · x; for k = 3 it is E_4(x, 2) = x^x; for k = 4 it is E_5(x, 2), a tower of x x's; for k = 5 it is E_6(x, 2); and so on.

For k = 3 or 4 we have that M_3(m) is a tower of m x's, while M_4(m) is the m-fold iterate of x ↦ E_4(x, ·) – a tower of towers – and later we will ask the reader to verify that M_4(n + 2) = ψ_n, when {ψ_n}_{n∈N} are the functions from Theorem 2 of [B&G09].

We begin by making some observations about the behaviour of usual monomials w.r.t. other monomials, and their behaviour when they occur as the exponent in a tower of two's.
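The rank-k monomials of definition 48 are likewise directly computable, and the analogue of x^u · x^v = x^{u+v} – namely claim i of theorem 90, E_k(M_k(u), M_k(v)) ≤ M_k(u + v) – can be spot-checked for tiny arguments. A sketch with ad hoc names:

```python
def E(k, x, y):
    # Definition 46: E_1 = addition, E_2 = multiplication, E_{k+1}(x, y) = T_k(x, 1, y).
    if k == 1:
        return x + y
    if k == 2:
        return x * y
    return T(k - 1, x, 1, y)

def T(k, x, y, z):
    # Definition 46, with the tower recursion unwound into a loop.
    if k == 1:
        return z * x + y
    for _ in range(z):
        y = E(k, x, y)
    return y

def M(k, m, x):
    # Definition 48: rank-k monomial of degree m, M_k(m)(x) = T_k(x, 1, m) = E_{k+1}(x, m).
    return T(k, x, 1, m)

assert M(2, 3, 5) == 125    # M_2(3) = x^3
assert M(3, 2, 3) == 27     # M_3(2) = x^x
assert M(3, 3, 2) == 16     # M_3(3) at x = 2: 2^(2^2)
assert all(M(k, 0, 7) == 1 and M(k, 1, 7) == 7 for k in (2, 3, 4))

# claim i of theorem 90: E_k(M_k(u), M_k(v)) <= M_k(u + v), sampled at k = 3
for (u, v, x) in [(1, 1, 3), (1, 2, 2), (2, 1, 2)]:
    assert E(3, M(3, u, x), M(3, v, x)) <= M(3, u + v, x)
```

Note that the sampled instances of claim i all happen to hold with equality at rank 3 – the inequality first becomes slack for larger arguments.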
We know that E_2(M_2(m), M_2(n)) = x^m · x^n = x^{m+n}, and thus we can hope that e.g. E_k(M_k(m), M_k(n)) ≤ M_k(m + n). In fact, this is exactly claim i from theorem 90. As remarked upon earlier, with respect to majorisation, we expected that many of the monotonicity results from section 4.2.1 could be improved if inequality was replaced by majorisation. The lemma 100 generalises that

  x^{m+1} − n · x^m = x^m(x − n) ≻ x^m (†)

for arbitrary n ∈ N.

Lemma 100 k ≥ 3 ⇒ ∀a,b,m≥1 (a · M_k(m) ≺ ⌊M_k(m + 1)/(b · M_k(m))⌋) .

Proof: By induction on k, sub-induction on m, uniformly in a, b ≥ 1. Furthermore, note that the lemma asserts that 2a · M_k(m) ≤ ⌊M_k(m + 1)/(b · M_k(m))⌋ ≤ M_k(m + 1) for any a (and b) for sufficiently large x, whence it follows that

  a · M_k(m) ≺ M_k(m + 1) −̇ a · M_k(m) . (z)

induction start k = 3: When m = 1, we have

  a · M_3(1) = ax ≺ ⌊x^{x−1}/b⌋ =ae ⌊x^x/(b · x)⌋ = ⌊M_3(2)/(b · M_3(1))⌋ ,

which truth is a mere observation.

sub-induction step m + 1: We have:

  ⌊M_3(m + 1)/(b · M_3(m))⌋ = ⌊x^{M_3(m)}/(b · x^{M_3(m−1)})⌋ = ⌊x^{M_3(m) −̇ M_3(m−1)}/b⌋ ≥O.97 ⌊x^{⌊M_3(m)/M_3(m−1)⌋}/b⌋ ≻IH ⌊x^{ab·M_3(m−1)}/b⌋ ≽ ab · x^{M_3(m−1)}/b = a · M_3(m) .

This concludes the sub-induction when k = 3. For k ≥ 3, since M_k(1) = M_3(1) and M_k(2) ≥ M_3(2), the sub-induction start follows directly as for the case k = 3. In the case m + 1 we see that:

  ⌊M_k(m + 1)/(b · M_k(m))⌋ = ⌊E_k(x, M_k(m))/(b · E_k(x, M_k(m − 1)))⌋ ≥L.98 ⌊E_k(x, ⌊M_k(m)/M_k(m − 1)⌋)/b⌋ ≻IH ⌊E_k(x, ab · M_k(m − 1))/b⌋ ≽† ab · E_k(x, M_k(m − 1))/b = a · M_k(m) ,

as required. We consider '≽†' as obvious. q.e.d.

Theorem 101 k ≥ 3 ∧ m, x ≥ 1 ∧ n ≥ 0 ⇒ M_k(m) ≺ ⌊T_k(2, x, m)/x^n⌋ .

Proof: By induction on m ≥ 1, for all n, k simultaneously.
case m = 1: Note that M_k(1) =def T_k(x, 1, 1) = x. When k = 3 and m = 1 we obtain:

  x ≺† ⌊2^x/x^n⌋ = ⌊T_3(2, x, 1)/x^n⌋ ,

where '≺†' is well-known (and equivalent to x^{n+1} ≺ 2^x).
Since Tk (x, y, z) ≥ T3 (x, y, z) in general, this take care of the induction start in a proof of the lemma uniformly in k by induction on m ≥ 1. For the inductive step we will need: b b Claim I a ≥ 1 ∧ an+1 ≤ b ⇒ a · an+1 ≤ an . n+1 n+1 Proof ofclaim: p + r, for 0 ≤ r < an+1 , By a ≤bbwe may write r b=a b so that a · an+1 = ap and an = ap + an . This proves the claim. We may now proceed with the induction thus: case m + 1: Let n ≥ 1 be arbitrary. We get: dfd IH M(m + 1) = E(x, T(x, 1, m)) ≺ L. 96(iii) C. I T(2, x, m) T(2, x, m) ≤ E 2, x · E x, n+1 n+1 x x L. 96(iv) T(2, x, m) E(2, T(2, x, m)) T(2, x, m + 1) E 2, ≤ = , xn xn xn which was what needed to be shown. q.e.d. It is really the case n = 0 which interests us. That is, since Tk (2, x, m) Mk (m) ≺ ≺ Tk (2, x, m) ≺ Tk (x, x, m) = Mk (m + 1) , xn we obtain: Theorem 102 k ≥ 3 ⇒ Mk (m) ≺ Tk (2, x, m) ≺ Mk (m + 1) . q.e.d. Corollary 103 ∀m∈N (k ≥ 2 ⇒ Mk (m) ≺ Ek+1 (2, x)) . Proof: The simplest argument for this proposition consist in observing that Mk ∈ E k (for any m ∈ N) while Ek+1 (2, x) ∈ E k+1 \E k precisely for majorisationT. 102 def reasons. However, that Mk (m) ≺ Tk (2, x, m) ≤ Tk (2, 1, x) = Ek+1 (2, x) for all m ∈ N – e.g. when x ≥ 2m + m – should be equally obvious by now. q.e.d. 168 CHAPTER 4. MAJORISATION Thus the general situation is: Ek (2, x) ≺ Mk (2) ≺ Tk (2, x, 2) ≺ Mk (3) ≺ Tk (2, x, 3) ≺ · · · ≺ Ek+1 (2, x) . For example, when k = 3 we recognise this chain as def 2x E3 (2, x) = 2 ≺ x ≺ 2 x x ≺x xx x 22 ≺2 x ≺x xx · ≺ ··· ≺ 2 · ·2o x def = E4 (2, x) . The next lemma is another step towards justifying our terminology, as it generalises the fact that the product p · q of two polynomials is majorised by a monomial of suitable degree. For example, when k = 3: Lemma 104 Set m = max(m0 , m1 ). Then: k≥3 ⇒ Proof: f ≺ Mk (m0 ) ∧ g ≺ Mk (m1 ) ⇒ Ek (f, g) ≺ Mk (m + 2) . L. 100 Ek (f, g) ≺ Ek (Mk (m0 ), Mk (m1 )) ≺ Mk (m + 1) Mk (m + 1) L. 
96(iii) ≤ Ek 2, Mk (m) · ≤ Ek Mk (m), Mk (m) Mk (m) def Ek (2, Mk (m + 1)) ≺ Ek (x, Mk (m + 1)) = Mk (m + 2) . If either m0 or m1 equals one, the lemma is obvious. q.e.d. We also note that Ek (x, Mk (m)) = Mk (m + 1) so that this lemma cannot be improved. Theorem 105 k ≥ 2 ⇒ ∀f ∈A (f ≺ Ek+1 (2, x) ⇒ ∃m∈N (f ≺ Mk (m))) . Proof: By induction on f ∈ A, uniformly in k, the induction start is trivial. Thus let f = E` (g, h) ≺ Ek (2, x). Of course, then g, h ≺ Ek+1 (2, x) , and if g = 1 the statement become trivial. Thus by the ih we can find mg , mh such that g ≺ Mk (mg ) and h ≺ Mk (mh ) . Secondly, assuming h x, we have ` ≤ k lest h x ` > k E` (g, h) ≥ Ek+1 (g, h) ≥ Ek+1 (2, x) . Hence: L. 104 E` (g, h) ≺ Ek (Mk (mg ), Mk (mh )) ≺ Mk (max(mg , mh ) + 2) . If h = c ∈ N, we still must have g ≺ Mk (mg ) for some mg ∈ N, and the case where ` ≤ k is as above. If now ` ≥ k + 2, we have g ≺ Mk (mg ) E` (g, h) ≥ Ek+2 (g, c) = Ek+1 (g, Ek+2 (g, c − 1)) ,c = 1 , c > 1 (contradiction!) If ` = k + 1 we have – by induction on c ≥ 1 – that E` (g, h) = ( g ≺ Mk (mg ) Ek+1 (g, c) = IH(c) Ek (g, Ek+1 (g, c − 1)) ≺ Mk (max(mg , mG ) + 2) IH(c) where Ek+1 (g, c − 1) ≺ Mk (mG ). ,c = 1 ,c > 1 . , q.e.d. 169 4.2. ON THE ORDER 4.2.3 Generalised towers of two’s. We continue to develop our analogue to the relationship between monomials xm x ·2 o · n. First, the last new definition: and tower of two’s 2 · def Definition 49 Define 2m n (k) = Tk (2, Mk−1 (m), n). We omit the ‘k’ from notation when irrelevant or clear from context. def def Note that 21n (k) = Tk (2, Mk−1 (1), n) = Tk (2, x, n) , while 20n (k) is simply a def def ‘large’ constant, and 211 (k) = Tk (2, Mk−1 (1), 1) = Ek−1 (2, x) . In particular C. 103 Mk−1 (m) ≺ 211 (k) for all m ∈ N. The rest of this section is dedicated to generalising Levitz’s theorem L (see p. 4.2). Observation 106 (Theorem L(i) generalised) k≥2 ⇒ n < m ⇒ 21n (k) ≺ 21m (k) . q.e.d. 
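The generalised towers of Definition 49 are easy to evaluate at tiny arguments, which can make the statements of this subsection more tangible. A sketch (our own function names; the encoding of 2^m_n(k) as an n-fold iteration of E_k(2, ·) on M_{k−1}(m) follows the definition):

```python
def E(k, x, y):
    """E_1 = addition, E_2 = multiplication, and E_{k+1}(x, y) is the
    y-fold iteration of E_k(x, .) starting from 1 (so E_3(x, y) = x**y)."""
    if k == 1:
        return x + y
    if k == 2:
        return x * y
    r = 1
    for _ in range(y):
        r = E(k - 1, x, r)
    return r

def two_tower(k, m, n, x):
    """2^m_n(k) at the point x: T_k(2, M_{k-1}(m), n), i.e. the n-fold
    iteration of E_k(2, .) starting from M_{k-1}(m)(x) = E_k(x, m)."""
    r = E(k, x, m)
    for _ in range(n):
        r = E(k, 2, r)
    return r

# 2^1_n(3) = T_3(2, x, n) are the towers interleaving the monomials M_3(m)
# in the chain following Corollary 103; a spot check at x = 3:
assert two_tower(3, 1, 2, 3) == 2 ** (2 ** 3)
```

For instance two_tower(3, 2, 1, 3) evaluates 2^{x^2} at x = 3, giving 512.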
Our next lemma subsequently proves a result which roughly states that the function Mk−1 (m) has the rôle of a monomial with respect to functions above Ek (2, x). M Lemma 107 k ≥ 3 ⇒ ∀m,n∈N ∃M ∈N Ek (2(k)m n , 2) ≺ 2(k)n . Proof: If m = 0 the result is trivial: the left side of the inequality is a (potentially ‘large’) constant. By induction on n, uniformly in k ≥ 1, m ≥ 1. case n = 0: We have def Ek (2m 0 , 2) = Ek (Mk−1 (m), 2) = Ek−1 (Mk−1 (m)), Mk−1 (m)) def T. 90(C. I) ≤ def Ek−1 (x, Mk−1 (2m)) = Mk−1 (2m + 1) = 22m+1 . 0 M Hence Ek (2m 0 , 2) ≺ 20 for M = 2(m + 1) . dfd m case n + 1: Verify first that 2m n+1 = Ek (2, 2n ) , whence T. 90 IH def m m M M Ek (2m n+1 , 2) = Ek (Ek (2, 2n ), 2) ≤ Ek (2, Ek (2n , 2)) ≺ Ek (2, 2n ) = 2n+1 , which concludes the induction. q.e.d. As usual, when we invoke theorem 90 in this very weak sense, we expect this result to have an ‘exponential speed-up’, and for k > 3 probably M = m + 1 suffice. The next lemma is just theorem 105 in a new guise. 170 CHAPTER 4. MAJORISATION Lemma 108 ∀f ∈A f ≺ 211 (k + 1) ⇒ ∃n∈N f ≺ 21n (k) . q.e.d. The next proposition generalises Levitz’s theorem(ii) which states that 2(3)an · 2(3)bn is majorised by 2(3)cn for a suitable c (see p. 4.2): Proposition 109 (theorem L(ii) generalised) m2 M 1 ` < k ⇒ ∀m1 , m2 , n∈N f ≺ 2m n (k) ∧ g ≺ 2n (k) ⇒ ∃M ∈N E` (f, g) ≺ 2n (k) P. 87(2) m2 1 Proof: Set m = max(m0 , m1 ) so that E` (2m n , 2n ) ` < k ≤ L. 107 E`+1 (2m n , 2) . But m M then E`+1 (2m n , 2) ≤ Ek (2n , 2) ≺ 2n , for a suitable M q.e.d. The next proposition generalises the fact that height of exponential towers stratifies the elementary functions with respect to majorisation. Proposition 110 (Theorem L(iii) generalised) 1 k ≥ 3 ⇒ ∀m, n∈N 2(k)m n ≺ 2(k)n+1 . Proof: By induction on n ≥ 0, uniformly in k ≥ 3 and m ∈ N (again m = 0 is a trivial special case). P. 103 dfd dfd 1 case n = 0: This is proposition 103: 2m 0 = Mk−1 (m) ≺ Ek (2, x) = 21 . 
def IH def 1 1 m case n + 1: 2m n+1 = E(2, 2n )) E(2, 2n+1 ) = 2n+2 . q.e.d. M m2 1 Lemma 111 k ≥ 3 ⇒ ∀m0 , m1 , n∈N ∃M ∈N Ek (2(k)m n+1 , 2(k)n ) ≺ 2(k)n+1 Proof: (Uniform in k, n, m0 , m1 ) . T. 90 m2 m1 m2 1 E(2m n+1 , 2n ) = E(E(2, 2n ), 2n ) ≤ P. 109 def M M m2 1 E(2, Ek−1 (2m n , 2n )) ≺ E(2, 2n ) = 2n+1 . q.e.d. Lemma 112 (Theorem L(iv) generalised) ∀f ∈A f ≺ 21n+1 (k) ⇒ ∃m∈N (f ≺ 2m . n (k)) def Proof: Note that since 2m 0 (k) = Mk−1 (m), when n = 0, the result is a special case of corollary 103 – thus assume n ≥ 1. By induction on f ∈ A with trivial induction start. Let f = E` (g, h) ≺ 21n+1 (k). Of course, g must precede 21n+1 (k) so that the ih yield mg for which g ≺ 2m n (k). Secondly, ` > k leads to a contradiction (unless g = 1 or h is a constant, which are trivial special cases), and we consider the case ` < k as def obvious. When k = `, then 21n h implies Ek (g, h) Ek (2, 21n (k)) = 21n+1 (k), whence we may find mh for which h ≺ 2m n−1 (k). . 171 4.2. ON THE ORDER Combining the above we have: L. 111 m M f = E` (g, h) ≺ Ek (2m n (k), 2n−1 (k)) ≺ 2n (k) for a suitable M ∈ N. q.e.d. Corollary 113 (Theorem L(v) generalised) ∀g,h,∈A g ≺ 21n+1 ∧ h ≺ 21n ⇒ Ek (g, h) ≺ 21n+1 . Proof: As before, assume w.l.o.g. that n ≥ 1. For g, h as in the assumption, m lemma 112 yield g ≺ 2m n (k) and h ≺ 2n−1 (k) for suitable m. An appeal to lemma 111 concludes the proof. q.e.d. We have now proved Levitz’s theorem for generalised towers and monomials. The importance of this is that these results are sufficient ingredients for proving the following interesting theorem. Theorem 114 ∀f ∈A ∀k,n∈N f ≺ 21n (k) ∨ 21n (k) f . That is, each 21n (k) is comparable to all f ∈ A by the majorisation relation. Proof: By induction on f ∈ A we show that for arbitrary k and n that either f is strictly -below 21n , or 21n is -below f . The induction start is trivial. Furthermore, that Ek+1 (2, x) majorises any function which is built up with only functions E1 , . . . , Ek follows by e.g. 
the work of Grzegorczyk, Péter or Ackermann. That any function which includes Ek+1 succeeds (or equals) Ek+1 (2, x) is equally obvious, and follows immediately from results above. That these functions all relate to each other is simple monotonicity. When f = E` (g1 , g2 ) we have by the ih that the gi ’s are comparable to all 21n0 (k 0 ). Hence def ki = min k ∈ N gi ≺ 211 (k + 1) is a well defined number, and likewise def ni = min n ∈ N gi ≺ 21n+1 (ki ) . Finally, we can define def mi = min m ∈ N gi ≺ 2m . ni (ki ) i Now 21ni (ki ) gi ≺ 2m 6 ` yield: ni (ki ), and we have, for k = max(k1 , k2 ), that k = (`, 1) , if ` > k 0 0 0 21n0 (k 0 ) E` (g1 , g2 ) ≺ 2M n0 +1 (k ) where (k , n ) = (k, max(n1 , n2 )) , if ` < k When k = `, we must distinguish between k1 = k2 and k1 6= k2 . If k1 6= k2 , then: 21n0 (k) Ek (g1 , g2 ) ≺ 2M n0 +1 (k) where n = 0 n1 n2 + 1 , if k2 < ` , if k1 < ` , . 172 CHAPTER 4. MAJORISATION for suitable M . (That this covers this case exhaustively follows from k = max(k1 , k2 ) = `.) If k1 = k2 then: 0 21n0 (k) Ek (g1 , g2 ) ≺ 2M n0 +1 (k) where n = max(n1 , n2 + 1) . q.e.d. We have shown that the functions 21n (k) represent points in the p.o. which partition (A, ) into ‘half-open’ lattices def def ∆nk = [2n1 (k), 2n+1 (k)) = f ∈ A 2n1 (k) f ≺ 2n+1 (k) 1 1 ensuring that f ∈ ∆nk is -comparable to all g ∈ A \ ∆nk . The 2nk ’s thus represent divides, or ‘check-points’, in the order. Remarks. We have now arrived at a point where it seems natural to summarise our results so far. Doubtlessly, the interested reader will have several ideas as to how to extend and refine the results from the previous sections. In order to further refine the theorem 114 one could e.g. attempt to prove that each 2m n (k) is comparable to all f ∈ A. One way to do this, which would probably work, is a(x) to prove theorem L-like results for functions 2n (k), where the function a(x) m could be e.g. 2n (k − 1) . 
However, this seems like a genuine Sisyphus-enterprise unless one can first find a way to perform the refinements uniformly, e.g. by some inductive construction. The methods and results above are strong and sharp enough to obtain some interesting results – as the next section and its embedded article demonstrate – but they seem far too weak to tackle (at least directly) the problem of whether (A, ⪯) is a well-order or not; indeed, too weak to handle even A^3. We now turn our attention to a problem left open by Thoralf Skolem, and return with a few concluding comments in the final section.

4.3 Skolem's problem

This section's main results are contained in the embedded article Skolem + Tetration is Well-ordered [B&G09]. Before the paper is presented, we include a few definitions which explain the notational differences. We also present the proof of Theorem 2. Immediately after the paper, we have also included parts of other proofs from [B&G09] which were omitted there due to the page-number limitation.

4.3.1 Notational bridge.

In the paper we do not employ the notation developed so far, and we have relied upon the reader's intuition to a much greater extent than hitherto. Of course, with a 10-page limit, this is the only feasible approach. Secondly, the set S∗ on which we prove in [B&G09] that ⪯ induces a well-order is a quite severe restriction of the full set A^4 above. We thus first define our class (as one of many) in our new notation.

Definition 50 Define the operator of base-x rank-k exponentiation e^k_− : N^N → N^N by

  e^k_−(f) =def x ↦ E_k(x, f(x)) = E_k ∘ (id, f)

(notice the subscripted '−'). Define:

  A^k_− =def [1, id ; e^1, e^2, e^3_−, ..., e^k_−, ...].

So the class S∗ from [B&G09] is simply A^4_−, the class S* (as studied by Levitz) is A^3, and Skolem's class S is A^3_−.

We observe that A^k ⊄ A^{k+1}_− – since e.g. E_k(2, x) ∈ A^k \ A^{k+1}_− – and that A^{k+1}_− ⊄ A^k since E_{k+1}(x, x) majorises all functions in A^k.
Thus A^k ⊥ A^{k+1}_−. Hence we have the following diagram:

  S  = A^3_− ⊊ A^4_− ⊊ A^5_− ⊊ ··· ⊊ A^k_− ⊊ A^{k+1}_− ⊊ ···
  S* = A^3  ⊊ A^4  ⊊ ··· ⊊ A^{k−1} ⊊ A^k ⊊ A^{k+1} ⊊ ···

where S∗ = A^4_−, each A^k_− ⊊ A^k, and A^k ⊥ A^{k+1}_−.

Including the main theorem from [B&G09], we also have that

  O(A^3_−) = ε_0 ≤ O(A^3) ≤ τ_0 ≤ O(A^4_−)

(the equality by Skolem, the two middle bounds by Levitz, the last by B&G), with a pretty strong conjecture of equality to the right. Thus, a pattern is not impossible. Of course, except for the three classes mentioned above, it is very much open whether O(A^k) or O(A^k_−) exists – viz. whether the (A^k_{[−]}, ⪯)'s are well-orders or not.

We expect that methods similar to the ones employed in [B&G09] will be suitable for generalisation to the cases A^k_− for k > 4. Indeed, the solution presented is a generalisation of Skolem's solution to the A^3_−-problem, in the sense that finding suitable normal-forms is a crucial part of the proof. We hope to be able to make some progress with respect to these classes in the future.

W.r.t. the A^k, entirely different methods seem to be required. Indeed, as the possibility of A^3 being an undecidable order (given the definitions of two terms, decide which is the ⪯-larger) is open, it is unlikely that a normal-form based result can be used to prove well-orderedness, since such results usually embody a decision procedure.

We next present the embedded article Skolem + Tetration is well-ordered, only pausing first to give the proof of [B&G09]-Theorem 2.

4.3.2 Proof of Theorem 2.

Please note that the functions ψ_n defined on p. 5 in conjunction with Theorem 2 in [B&G09] coincide with the functions M_4(n + 2) from Definition 48 (and that Skolem's φ_n is M_3(n)).

Lemma 115 k ≥ 2 ⇒ ∀n,m∈N (T_{k+1}(x, M_k(n), m) ≺ M_{k+1}(m + 2)).

Proof: When k = 2, the lemma is obvious: e.g. m = 0 corresponds to x^n ≺ x^x, while m = 2 corresponds to x^{x^{x^n}} ≺ x^{x^{x^x}}. For k ≥ 3, by induction on m ≥ 0 for all n ∈ N simultaneously. The case m = 0 is M_k(n) =def E_{k+1}(x, n) ≺ E_{k+1}(x, x) = M_{k+1}(2).
For m + 1 we obtain: IH def Mk+1 (m + 3) = Ek+1 (x, Mk+1 (m + 2)) def Ek+1 (x, Tk+1 (x, Mk (n), m)) = Tk+1 (x, Mk (n), m + 1) . q.e.d. When n = 1 above, the lemma just says that Mk (m + 1) ≺ Mk (m + 2), a rather obvious statement. The lemmata to follow can probably be uniformly proved in k rather than ‘in 4’, but we have not included these here since we will only need this particular version now. First, we just remark upon the fact that Tk (x, y, z) + 1 ≤ Tk (x + 1, y, z), Tk (x, y + 1, z), Tk (x, y, z + 1) for all x, y, z ≥ 1 as the reader can easily verify. Lemma 116 ∀n,m∈N ∃N ∈N (E3 (x, T4 (x, M3 (n), m) ≺ T4 (x, M3 (N ), m)) . 175 4.3. SKOLEM’S PROBLEM Proof: By induction on m ≥ 0 for all n simultaneously. The case m = 0 is settled by the observation that E3 (x, M3 (n)) = M3 (n + 1) , and for m + 1 we have: def E3 (x, T4 (x, M3 (n), m + 1) = E3 (x, E4 (x, T4 (x, M3 (n), m))) T. 90(C. I) ≤ E4 (x, T4 (x, M3 (n), m) + 1) ≤ E4 (x, T4 (x, M3 (n) + 1, m)) ≺ T4 (x, M3 (N ), m + 1) for suitable N . q.e.d. Corollary 117 ∀n,m∈N (g ≺ T4 (x, M3 (n), m) ⇒ ∃N ∈N (E3 (x, g) ≺ T4 (x, M3 (N ), m))) . q.e.d. Proof: (of Theorem 2.) Recall that this theorem asserts that any function def def in A4− = S∗ is comparable to ψn = M4 (n + 2) for arbitrary n ∈ N. That this assertion holds for id and 1 is obvious, and it is equally clear (by e.g. the results of Grzegorczyk) that any function in A3 – thus also A3− – is comparable to the ψn ’s. That they compare with each other is also clear. We will next prove by induction on f ∈ A4− that: ∀f ∈A4− ∃n,m∈N (Mk (m) f ≺ T4 (x, M3 (n), m − 1)) . Since T4 (x, M3 (n), m − 1) corollary. L. 115 ≺ Mk (m + 1) this yield the desired result as a induction step: First of all, it is clear that: M4 (m) g ≺ T4 (x, M3 (n), m−1) ⇒ Mk (m+1) E4 (x, g) ≺ T4 (x, M3 (n), m) . which proves the case f = E4 (x, g). When f = E3 (x, g), we have that C. 
117 M4 (m) g ≺ T4 (x, M3 (n), m−1) ⇒ M4 (m) ≺ E3 (x, g) ≺ T4 (x, M3 (N ), m−1) , which takes care of this particular case. The two remaining cases – that is f = g + h or f = g · h – follows by the observation that both functions succeeds M4 (max(mg , mh )) (where e.g. M4 (mg ) g), and that both functions are succeeded by T4 (x, M3 (N ), max(mg , mh ) − 1) for some suitable N . q.e.d. 176 Skolem + Tetration Is Well-Ordered Mathias Barra and Philipp Gerhardy Dept. of Mathematics, University of Oslo, P.B. 1053, Blindern, 0316 Oslo, Norway georgba@math.uio.no, philipge@math.uio.no http://folk.uio.no/georgba http://folk.uio.no/philipge Abstract. The problem of whether a certain set of number-theoretic functions – defined via tetration (i.e. iterated exponentiation) – is wellordered by the majorisation relation, was posed by Skolem in 1956. We prove here that indeed it is a computable well-order, and give a lower bound τ0 on its ordinal. 1 Introduction In this note we solve a problem posed by Thoralf Skolem in [Sko56] regarding the majorisation relation on NN restricted to a certain subset S∗ . Definition 1 (Majorisation). Define the majorisation relation ‘’ on NN by: f g ⇔ ∃N ∈N ∀x≥N (f (x) ≤ g(x)) . def We say that g majorises f when f g, and as usual f ≺ g ⇔ f g ∧ g f . We say that f and g are comparable if f ≺ g or f = g or g ≺ f . def Hence g majorises f when g is almost everywhere (a.e.) greater than f . The relation is transitive and ‘almost’ anti-symmetric on NN ; that is, we cannot a.e. have both f ≺ g and g ≺ f , and f g ∧ g f ⇒ f = g. N Given A ⊆ N , one may ask whether (A, ) is a total order? if it is a wellorder? – and if so – what is its ordinal? In his 1956-paper An ordered set of arithmetic functions representing the least -number [Sko56], Skolem introduced the class of functions S, defined by: 0, 1 ∈ S and f, g ∈ S ⇒ f + g, xf ∈ S . In his words (our italics): ‘we use the two rules of production [which] from [. . . 
] functions f (x) and g(x) we build f (x) + g(x)’. That is, S is a typical inductively defined class, or an inductive closure. In [Sko56] the set S is stratified into the hierarchy n∈N Sn in a natural way: For f ∈ S define the Skolem-rank ρS (f ) of f inductively by: def def def def ρS (0) = ρS (1) = 0; ρS (f + g) = max(ρS (f ), ρS (g)) and ρS (xf ) = ρS (f ) + 1 , Both authors are supported by a grant from the Norwegian Research Council. K. Ambos-Spies, B. Löwe, and W. Merkle (Eds.): CiE 2009, LNCS 5635, pp. 11–20, 2009. c Springer-Verlag Berlin Heidelberg 2009 177 12 M. Barra and P. Gerhardy def and define Sn = {f ∈ S | ρS (f ) ≤ n }. So e.g. S0 = {0, 1, 2, . . .} and S1 = N[x]. def def Skolem next defined functions φ0 = 1, φn+1 = xφn , and it is immediate that ρS (φn ) = n and ρS (f ) < n ≤ ρS (g) ⇒ f ≺ φn g. When (A, ) is a well-order and f ∈ A, we let O (A, ) denote the ordinal/ order-type of the well-order and O (f ) denotes the ordinal of f w.r.t. (A, ). The main results from [Sko56] are summarised in a theorem below: Theorem A (Skolem [Sko56]) k 1. If f ∈ Sn+1 , then f can be uniquely written as i=1 ai xfi where ai ∈ N , fi ∈ Sn and fi fi+1 ; 2. f, g ∈ S ⇒ f · g ∈ S, i.e. S is closed under multiplication; def 3. φn+1 ∈ S n+1 = Sn+1 \ Sn ; 4. (S, ) is a well-order; ω ·· 5. O (Sn , ) = ω · n + 1 = O (φn+1 ); 6. O (S, ) = supn<ω O (φn ) = 0 = min {α ∈ ON | ω α = α }. def Above, ON is the class of all ordinals. In the final paragraphs of [Sko56] the following problems are suggested: Problem 1 (Skolem [Sko56]). Define S ∗ by 0, 1 ∈ S ∗ , and, if f, g ∈ S ∗ then f + g, f g ∈ S ∗ . Is (S ∗ , ) a well-order? If so, what is O (S ∗ , )? For the accurate formulation of the 2. problem, we define the number-theoretic x ·· 1 ,y = 0 def def · y . function t(x, y) of tetration by: t(x, y) = xy = =x x(xy−1 ) , y > 0 Problem 2 (Skolem [Sko56]). Define S∗ by 0, 1 ∈ S∗ , and, if f, g ∈ S∗ then f + g, f · g, xf , xf ∈ S∗ . Is (S∗ , ) a well-order? If so, what is O (S∗ , )? 
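The tetration recursion defining t(x, y) in Problem 2 transcribes directly into code; only tiny arguments are feasible, of course, since the values are towers:

```python
def tet(x, y):
    """Tetration t(x, y): a tower of y copies of x.
    t(x, 0) = 1 and t(x, y) = x ** t(x, y - 1)."""
    return 1 if y == 0 else x ** tet(x, y - 1)

# tet(2, 4) = 2^(2^(2^2)) = 65536
```

Python's arbitrary-precision integers keep this exact for as long as it fits in memory, which for tetration is not long.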
We remark here that with respect to asymptotic growth, S ∗ is a ‘horizontal extension’ of S, while S∗ is a ‘vertical extension’. More precisely, let E n be the n+1 def n+1 =E \ E n . Then: nth Grzegorczyk-class, and set E 4 S ∗ ⊆ E 3 while S∗ ⊆ E 4 and E ∩ S∗ = ∅ . Also 2x ∈ S ∗ \ S∗ , and xx ∈ S∗ \ S ∗ , so the classes are incomparable. Problem 1. has been subjected to extensive studies, leading to a ‘Yes!’ on the well-orderedness, and to estimates on the ordinal. We shall briefly review the relevant results below. Problem 2. – to our best knowledge – is solved for the first time here. On Problem 1. In [Ehr73], A. Ehrenfeucht provides a positive answer to Problem 1. by combining results by J. Kruskal [Kru60] and D. Richardson [Ric69]. Richardson uses analytical properties of certain inductively defined sets of functions A ⊂ RR to show (as a corollary) that (S ∗ , ) is a total order. Ehrenfeucht 178 Skolem + Tetration Is Well-Ordered 13 then gives a very basic well-partial-order ⊆ on S ∗ , and invokes a deep combinatorial result by Kruskal which ensures that the total extensions of ⊆ – which include (S ∗ , ) – are necessarily well-orders. Later, in a series of papers, H. Levitz has isolated sub-orders of (S ∗ , ) with ordinal 0 [Lev75, Lev77], and he has provided the upper bound τ0 on O (S ∗ , ) def [Lev78]. Here τ0 = inf {α ∈ ON | α = α } Hence 0 ≤ O (S ∗ , ) ≤ τ0 ; a rather large gap. On Problem 2. Since Skolem did not precisely formulate his second problem, below follows the last paragraphs of [Sko56] verbatim: It is natural to ask whether the theorems [of [Sko56]] could be extended to the set S ∗ of functions that may be constructed from 0 and x by addition, multiplication and the general power f (x)g(x) . However, I have not yet had the opportunity to investigate this. 
It seems probable that we will get a representation of a higher ordinal by taking the set of functions of one variable obtained by use of not only x + y, xy, and xy but also f (x, y) defined by the recursion f (0, y) = y, f (x + 1, y) = xf (x,y) The difficulty will be to show the general comparability and that the set is really well ordered by the relation []. - Skolem [Sko56] Exactly what he meant here is not clear, and at least two courses are suggested: One is to study the class obtained by general tetration – allowing from f and g the formation of fg – or to consider the class most analogous to S – allowing the formation of xf only. This paper is concerned with the second interpretation. Finally, we have included multiplication of functions as a basic rule of production for S∗ , since we feel this best preserves the analogy with S. Whereas in S multiplication is derivable, in S∗ it is not: e.g. xx · x is not definable without multiplication as a primitive. 2 Main Results and Proofs It is obvious that f g ⇔ xf xg ∧ xf xg . Secondly, an S∗ -function def f belongs to S iff no honest application of the rule f = xg has been used, where honest means that g ∈ N (identically a constant). For, if g ≡ c, then xg = xc which belongs in S, and in the sequel we will tacitly assume that in the expression ‘xg ’ the g is not a constant. It is straightforward to show that xx def majorises all functions of S, and that xx is -minimal in S∗ = S∗ \ S. 179 14 2.1 M. Barra and P. Gerhardy Pre-Normal- and Normal-Forms We next prove that all functions are represented by a unique normal-form (NF). Strictly speaking we need to distinguish between functions in S∗ and the terms which represent them, as different terms may define the same function. We will write f ≡ g to denote syntactical identity of terms generated from the definition of S∗ , write f = g to denote extensional identity (e.g. xx+1 ≡ xxx but xx+1 = xxx ), and blur this distinction when convenient. 
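Since the relation ⪯ of Definition 1 only constrains sufficiently large arguments, no finite computation can decide it; still, a numeric search for a plausible threshold is a handy sanity check when comparing concrete functions. A heuristic sketch (the function names are ours):

```python
def dominates_from(f, g, N, horizon=200):
    """Finite evidence that f(x) <= g(x) for all x in [N, horizon) --
    an illustration of Definition 1, not a proof of f <= g a.e."""
    return all(f(x) <= g(x) for x in range(N, horizon))

# x^x majorises x^10, although x^10 is the larger one below x = 10:
f = lambda x: x ** 10
g = lambda x: x ** x
N = next(n for n in range(1, 100) if dominates_from(f, g, n))  # N == 10 here
```

In this particular pair the finite check does witness genuine majorisation, since x^x ≥ x^10 for every x ≥ 10; in general a sampled threshold is only evidence.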
In this spirit, we will also refer to arithmetic manipulations of S∗ -functions as rewriting. Definition 2 (Pre-normal-form). For f ∈ S, we call the unique normal-form for f from [Sko56] the Skolem normal-form (SNF) of f . Let s, t range over S. We say that a function f ∈ S∗ is in (ΣΠ-) pre-normalform ((ΣΠ-) PNF) if f ∈ S and f is in SNF, or if f ∈ S∗ and f is of the form n ni fij where either f = i=1 j=1 fij ≡ xg fij ≡ xg fij ≡ s where g is in ΣΠ-PNF; ng where g is in PNF, g ∈ S, g ≡ ( i=1 gi ) ≡ xh , and gng ≡ (s + t); where s is in SNF, and j = ni . An fij on one of the above three forms is an S∗ -factor, a product Πfi of S∗ factors is an S∗ -product, also called a Π-PNF. We say that xh is a tetrationfactor, that xΠgj is an exponent-factor with exponent-product Πgj , and that s ∈ S is a Skolem-factor. The requirement on exponent-factors can be reformulated as: the exponentproduct is not a single tetration-factor, nor is its Skolem-factor a sum. Proposition 1. All f ∈ S∗ have a ΣΠ-PNF. Proof. By induction on the build-up of f . The induction start is obvious, and the cases f = g + h, f = gh and f = xg are straightforward. Let f = xg , and ni let g = ΣΠgij . Then f = xΣΠgij = Πi xΠ gij . By hypothesis, this is a PNF for f except when some exponent-product P = Π ni gij is either a single factor or when Π ni gij = P · (s + t). Such exponent factors can be rewritten as xh+1 in the first case, and as xP ·s · xP ·t in the second case. Definition 3 (Normal-form). Let f ∈ S∗ . We say that the PNF ΣΠfij is a normal-form (NF) for f if (NF1) fij fi(j+1) , and fij fi(j+1) ⇒ ∀∈N fij (fi(j+1) ) ; (NF2) ∀s∈S Πj fij (Πj f(i+1)j ) · s ; (NF3) If fij is on the form xh or xh , then h is in NF. Informally NF1–NF3 says that NF’s are inherently ordered PNF’s. Proving uniqueness is thus tantamount to showing that two terms in NF are syntactically identical, lest they define different functions. The property marked FPP below we call the finite power property. 
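The key arithmetic rewriting behind Proposition 1 – x^{f+g} = x^f · x^g – can be mimicked on small term trees. A sketch only (the tuple encoding is ours, and this performs just the sum-splitting step, not the reordering and regrouping that the full PNF requires):

```python
def split_sums(t):
    """Recursively rewrite x^(f+g) into x^f * x^g.
    Terms are tuples: ('var',) for x, ('const', n), ('plus', f, g),
    ('times', f, g) and ('xpow', f) for x^f."""
    op = t[0]
    if op == 'xpow':
        body = split_sums(t[1])
        if body[0] == 'plus':
            return ('times', split_sums(('xpow', body[1])),
                             split_sums(('xpow', body[2])))
        return ('xpow', body)
    if op in ('plus', 'times'):
        return (op, split_sums(t[1]), split_sums(t[2]))
    return t

# x^(x+2) becomes x^x * x^2:
t = ('xpow', ('plus', ('var',), ('const', 2)))
assert split_sums(t) == ('times', ('xpow', ('var',)), ('xpow', ('const', 2)))
```

Nested sums in exponents are split repeatedly, mirroring the proof's rewriting of an exponent-product P·(s+t) into x^{P·s} · x^{P·t}.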
180 Skolem + Tetration Is Well-Ordered 15 Lemma 1 (and definition of FPP). Let F ⊆ S∗ be a set of comparable S∗ factors in NF such that ∀f1 ,f2 ∈F ∀∈N f1 ∈ S∗ ∧ f1 f2 ⇒ f1 (f2 ) . (FPP) Then all NF’s f ≡ ΣΠfij , g ≡ ΣΠgij composed of factors from F are comparable. In particular f g ⇔ fi0 j0 gi0 j0 for the least index (i0 j0 ) such that fij ≡ gij . n n mi g ki f Proof. Let f = Σi=1 Πj=1 fij ≡ Σi=1 Πj=1 gij = g , i.e. f and g have distinct NF’s. Let (i0 j0 ) be as prescribed above, and assume w.l.o.g. that fi0 j0 gi0 j0 (comparable by hypothesis). Since ΣΠgij is a NF all summands majorise later summands. Hence, for = max {ki | i ≤ ng } and c = ng we have (gi0 j0 ) · FPP c gi0 j0 · · · giki + g(i+1)1 · · · g(i+1)ki+1 + · · · + gng 1 · · · gng kng . Clearly fi0 j0 (gi0 j0 )·c (gi0 j0 ) · c implies f g. Lemma 2. f ≺ g ⇒ 1 ∀∈N (xf ) ≺ xg . I.e. (honest) tetration-factors satisfy the FPP. We skip the proof. 2.2 The Well-Order (S∗ , ) In this section we prove our main theorem: Main Theorem 1. (S∗ , ) is a well-order. We establish this through the following lemmata: Definition 4 (Tetration rank). The tetration rank, denoted ρT (f ), of a function f ∈ S∗ is defined by induction as follows: ρT (0) = ρT (1) = 0 , ρT (f + g) = ρT (f g) = ρT (xf ) = max(ρT (f ), ρT (g)) , def def def def def ρT (xf ) = ρT (f ) + 1 (f not constant). def For all n ∈ N, define S∗,n = {f ∈ S∗ | ρT (f ) ≤ n }, and S∗,n+1 = S∗,n+1 \ S∗,n . def def Clearly S∗ = n∈N S∗,n , and f, g ∈ S∗,n implies f + g, f g, xf ∈ S∗,n . Calculating the tetration-rank of any f ∈ S∗ is straightforward, and terms with different tetration-rank cannot define the same function: Theorem 2. Let ψn ∈ S∗ be defined by ψ0 = xx , and ψn+1 = xψn . Then ψn is comparable to all functions in S∗ , and ρT (f ) < n ≤ ρT (g) ⇒ f ≺ ψn g . def 1 def Actually, the assertion remains true when s ∈ S is substituted for ∈ N, but we shall not need this here. 181 16 M. Barra and P. Gerhardy We omit the proof for lack of space. 
The theorem above states that ψn+1 is -minimal in S∗ \ S∗,n , and that ψn+1 majorises all g ∈ S∗,n . The next definition is rather technical, but we will depend upon some way of ‘dissecting’ exponent-factors in order to facilitate comparison with tetrationfactors. Definition 5 (Tower height). Let f = ΣΠfij and ρT (f ) = n. Define the def n-tower height τn (ΣΠfij ) inductively by τn (ΣΠfij ) = maxij (τn (fij )), where τn is defined for factors by: 0 , if ρT (fij ) < n or fij ≡ xh , def τn (fij ) = τn (Πgk ) + 1 , if ρT (fij ) = n and fij ≡ xΠgk . Lemma 3. Assume that each f ∈ S∗,n has a unique NF (satisfying NF1–NF3), and that S∗,n satisfies FPP (so that (S∗,n , ) is a well-order). Then: (1) Any two S∗,n+1 -factors are comparable, and S∗,n+1 has the FPP; (2) All S∗,n+1 -products are comparable and have a unique NF; (3) If Πgj is a S∗,n+1 -product, then (xd ) x ·· ∃h∈S∗,n ∃0<c,d∈N ∀0<∈N xh+c ≺ xΠgj (xΠgj ) ≺ x · h+c−1 . Proof. The proof is by induction on the maximum of the tower heights of the involved terms. More precisely, since all functions have a PNF, the number def minτn+1 (f ) = min m ∈ N ∃ΣΠfij (f = ΣΠfij ∧ τn+1 (ΣΠfij ) = m) is well-defined for any f ∈ S∗,n+1 . Given two factors, or a product of factors, when we want to prove one of the items (1)–(3), we proceed by induction on m = maxi minτn+1 (fi ) , where f1 , . . . fk are the involved factors. In light of Theorem 2 – which immediately yields the FPP for pairs of factors of different tetration rank – the proof need only deal with S∗,n+1 -products. We note first that such products may be written in the Π-PNF f1 · · · fm · P , where ρT (fi ) = n + 1 and ρT (P ) ≤ n. By assumption, the part P of the product has a unique NF, and is comparable to all other S∗,n -products and S∗,n+1 -factors. Induction start (τn+1 = 0): If τn+1 (f ) = 0, each fj is a tetration factor xhj for some hj ∈ S∗,n with hj in NF. 
Ordering the hj in a decreasing sequence yields a unique NF for f1 · · · fm ·P satisfying NF1–NF3 (by invoking Lemma 2). Since all factors compare, all products compare and so (1) and (2) hold. k k gj ) = 0 means that Πj=1 gj may be assumed With regard to (3), τn+1 (Πj=1 to be a NF by (1) and (2). Because ρT (g1 ) = n + 1 and τn+1 (g1 ) = 0, we must have g1 ≡ xg1 for some g1 ∈ S∗,n . We have 2 x(x ) · k k k+1 † · x (x ) (x ) (x ) g1 ≺ x g1 ≺x· . xg1 +1 = x g1 ≡ xg1 ≺ xΠgj = f x g1 x g1 † Above the ‘≺’ is justified by a generalisation of the inequalities (xy )k = xyk ≺ 2 xy , the proof of which we omit for lack of space. Thus, setting h = g1 , c = 1 and d = 2 we have the desired inequalities. 182 Skolem + Tetration Is Well-Ordered 17 Induction step (τn+1 = m + 1): As the induction start contains all the important ideas we only include a sketch. First we prove comparability of factors (1), and only the case xf vs. xΠgj is involved: here we rely on the ih(3) to obtain the FPP. Items (2) and (3) follow more or less as for the induction start. We can now prove Theorem 1: Proof (of Theorem 1). That S∗,0 = S is well-ordered and have unique NF’s is clear, and it vacuously satisfies the FPP. Lemma 3 furnishes S∗,n+1 with the FPP. If we can produce unique ΣΠ-NF’s for f ∈ S∗,n+1 we are – by Lemma 1 and induction on n – done, as an increasing union of well orders is itself well ordered. Now, let ΣΠfij = f ∈ S∗,n+1 . Since all products have a unique NF, by rewriting each of the PNF-products Πfij to their respective NF’s, and then rewriting P · s + P · t to P · (s + t) where necessary – clearly an easy task given that all products are unique – before reordering the summands in decreasing order produces the desired NF. It is interesting to note that the above proofs are completely constructive, and an algorithm for rewriting functions to their normal forms for comparison can be extracted easily from the proof of Lemma 1: Theorem 3. (S∗ , ) is computable. 
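The decision procedure implicit in Theorem 3 starts from syntactic invariants such as the tetration rank of Definition 4, which is immediate to compute on term trees. A sketch (the tuple encoding is ours; only honest tetrations x_f, with f non-constant, raise the rank):

```python
def tet_rank(t):
    """rho_T of Definition 4: constants and x have rank 0; +, * and x^f
    take the maximum over subterms; an honest tetration x_f adds 1."""
    op = t[0]
    if op in ('const', 'var'):
        return 0
    if op in ('plus', 'times'):
        return max(tet_rank(t[1]), tet_rank(t[2]))
    if op == 'xpow':
        return tet_rank(t[1])
    if op == 'xtet':  # x_f; a constant f degenerates to an S-term
        return tet_rank(t[1]) + (0 if t[1][0] == 'const' else 1)
    raise ValueError(op)

# psi_0 = x_x has rank 1, psi_1 = x_{psi_0} has rank 2, while x^{psi_0}
# stays at rank 1, since x^(.) never raises the rank:
psi0 = ('xtet', ('var',))
```

Terms with different tetration rank cannot define the same function, which is what makes this a useful first pass of the comparison algorithm.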
This stands in stark contrast to the Ehrenfeucht-Richardson-Kruskal proof(s) that S ∗ is well-ordered. Indeed, substituting S ∗ for S∗ in the above theorem turns it into an open problem. In the next section we will see that NF’s are a clear advantage when searching for the order type O (S∗ , ) of the well-order. 2.3 On the Ordinal of Normal-Forms In this section the reader is assumed to be familiar with ordinals and their arithmetic. See eg. Sierpiński [Sie65]; α, β range over ordinals, γ over limits. That O (xx ) = 0 follows from Theorems A&2. It is also obvious that O (f + 1) = O (f ) + 1 for all f ∈ S∗ , and that if O (f ) = α + 1, then f = f + 1 for some f ∈ S∗ . It follows that ΣΠfij correspond to a limit – except when fnnn ≡ c ∈ N. When f g, we let [f, g) denote the segment {h ∈ S∗ | f h ≺ g }, and we write O ([f, g)) for O ([f, g), ). In particular O (f ) = O ([0, f )). Also f g implies [0, g) = [0, f ) ∪ [f, g), viz. O (g) = O ([0, f )) + O ([f, g)). Lemma 4. Let f ≡ Πim fi . Then f h f · 2 ⇒ ∃f f (h = f + f ). We omit the proof, and remark that this lemma cannot be generalised to the case where f is a general function. Lemma 5. Let g f ≡ Πi fi . Then (1) O (f + g) = O (f ) + O (g). Moreover, (2) O (ΣΠfij ) = ΣO (Πfij ), and (3) O (f · n) = O (f ) · n. 183 18 M. Barra and P. Gerhardy Proof. Clearly (2) ⇒ (3). Since the majorising product of a NF succeeds the remaining sum, (2) follows by iterated applications of (1), which we prove by induction on O (g). The induction start is trivial. Case O (g) = α + 1: Then, for some g we have g + 1 = g, and so IH O (f + g) = O (f + g + 1) = O (f + g )+1 = O (f )+O (g )+1 = O (f )+O (g) . Case O (g) = γ: As noted above, we have O (f + g) = O (f ) + O ([f, f + g)). Since g f ⇒ f + g f · 2, by Lemma 4 [f, f + g) = {f + g | g ≺ g } whence the map Θ : [f, f + g) → γ, defined by Θ(f + g ) = O (g ), is an order-preserving bijection. Lemma 6. Let f ≡ Πkm fk . Then g fm ⇒ O (f · g) = O (f ) · O (g). Proof. 
By induction on O(g) for all Π_k^m f_k simultaneously, where g ≡ Σ_i^n Π_j^{n_i} g_{ij}. The induction start is trivial.

Case O(g) = α+1: Set α_i = O(Π_j^{n_i} g_{ij}) and obtain, by Lemma 5(2), that O(g) = Σ_i^n O(Π_j^{n_i} g_{ij}) = Σ_i^n α_i. Since g ≼ f_m, the NF for f·g has the form Σ_i^n f·(Π_j^{n_i} g_{ij}), since Π_j^{n_i} g_{ij} ≺ g for each i (g being a successor). Hence O(f·g) = Σ_i^n O(f·(Π_j^{n_i} g_{ij})) = Σ_i^n O(f)·O(Π_j^{n_i} g_{ij}) = Σ_i^n O(f)·α_i (using the IH for the middle equality), which – by right-distributivity of ordinal arithmetic – completes this case.

Case O(g) = γ: If n ≥ 2 the result follows from the ih by considering the NF of f·g. If n = 1 and n_1 ≥ 2, again we may invoke the ih w.r.t. the product f·Π_j^{n_1−1} g_{1j} and the function g_{1n_1}. In fact, the only remaining case is when g is a constant, which is resolved by Lemma 5(3).

Theorem 4. O(ΣΠf_{ij}) = ΣΠO(f_{ij}).

Lemma 7. Let g = Π_k^{m+1} g_k, m ≥ 1. Then {x^{(Π_i^m g_i)·f}}_{f ≺ g_{m+1}} is cofinal in x^g.

Proof. We must prove that ∀h ≺ x^g ∃f ≺ g_{m+1} (h ≼ x^{(Π_i^m g_i)·f}). Let h ≡ ΣΠh_{ij} ≺ x^g, and fix ℓ ∈ N such that h ≼ (h_{11})^ℓ ≺ x^g. If ρ_T(h_{11}) < ρ_T(x^g) the result is immediate. Indeed, unless x^{Π_i^m g_i} ≺ h_{11} we are done.

If h_{11} ≡ x^{h'} – either an exponent- or a Skolem-factor – then h'·ℓ ≺ g. Choosing f = 1 is sufficient unless Π_i^m g_i ≼ h'·ℓ, which may occur exactly when, for some h'' ≺ g_{m+1}, we have h' = (Π_i^m g_i)·h''. If g_{m+1} ≡ x, then h'' = c, and since h' is an exponent-product, c = 1. Hence f = ℓ+1 witnesses cofinality. Since g_{m+1} is a factor in an exponent-product, we have x ≺ g_{m+1} ⇒ x^2 ≼ g_{m+1}; now h'' ≺ g_{m+1} ⇒ h''·c ≺ g_{m+1} for all c ∈ N, so that f = h''·(ℓ+1) suffices. If h_{11} ≡ x_(h'), then comparing h to x_(h'+c), with c and d satisfying the condition of Lemma 3(3), completes the case, since those inequalities are easily seen to hold w.r.t. x^{g_1} as well.

Theorem 5. Let g ≡ Π_k^m g_k, and let h ≼ g_m. Then O(x^{g·h}) = O(x^g)^{O(h)}.

Proof. By induction on O(h); the induction start is trivial.

Case O(h) = α+1: Then, for h ≡ Σ_i^n Π_j^{n_i} h_{ij}, set h_i = Π_j^{n_i} h_{ij}. We obtain, by Theorem
4, O(x^{g·h}) = O(Π_i^n x^{g·h_i}) = Π_i^n O(x^{g·h_i}). Next, n ≥ 2 since h is a successor, whence h_i ≺ h. Hence Π_i^n O(x^{g·h_i}) = Π_i^n O(x^g)^{O(h_i)} = O(x^g)^{O(h)}, by the IH and basic ordinal arithmetic.

Case O(h) = γ: Let h be as above. If n ≥ 2, the proof is as above. For n = 1, if n_1 ≥ 2, the identity (μ^α)^β = μ^{α·β} combined with the ih is sufficient. The remaining possibility – h a single factor – is resolved, via Lemma 7 and the IH, by O(x^{g·h}) = O(sup_{h'≺h} x^{g·h'}) = sup_{h'≺h} O(x^g)^{O(h')} = O(x^g)^{O(h)}.

2.4 The Ordinal O(S_∗, ≼)

An epsilon number ε satisfies ω^ε = ε, and the epsilon numbers can alternatively be characterised as the ordinals above ω closed under exponentiation [Sie65, p. 330]. Naive tetration of ordinals – which we denote by α_(n) – is well-defined, even though α_(ω) = α_(β) for all β ≥ ω. Thus α_(ω) is an epsilon number for α ≥ ω, and (ε_α)_(ω) = ε_{α+1}. We follow the notation from [Lev78] and let τ_0 denote the smallest solution to the equation² ε_α = α. In this section we prove our second main result:

Main Theorem 6. τ_0 ≤ O(S_∗, ≼).

Lemma 8. O(x_(f+x)) ≥ O(x_(f))_(ω).

Proof. Set O(x_(f)) = γ. Then ∀k∈N (x_(f))^k ≺ x^{x_(f)} = x_(f+1). Hence O((x_(f))^k) = O(x_(f))^k, whose right-hand side is cofinal in γ^ω. Hence (†): γ^ω ≤ O(x_(f+1)). Since (x_(f+1))^{x_(f)} = x^{x_(f)·x_(f)} ≺ x_(f+2) we obtain, by Theorem 5, O(x_(f+2)) > O(x^{x_(f)·x_(f)}) = O(x^{x_(f)})^{O(x_(f))} ≥ (γ^ω)^γ ≥ γ^γ (using (†)), whence (‡): ∀f O(x_(f+2)) ≥ O(x_(f))^{O(x_(f))}.

This constitutes the induction start (n = 1) in a proof of O(x_(f+2n)) ≥ γ_(n+1). For the induction step, assuming O(x_(f+2n)) ≥ γ_(n+1) immediately yields, by (‡), O(x_(f+2(n+1))) = O(x_((f+2n)+2)) ≥ (γ_(n+1))^{γ_(n+1)} ≥ γ_(n+2). Finally, since x_(f+n) ≺ x_(f+x), we see that γ_(n+1) ≤ O(x_(f+x)) for arbitrary n, whence the conclusion is immediate.

Lemma 9. Let x ≼ h ∈ S_∗. Then O(x_(h·x)) ≥ ε_{O(h)}.

Proof. By induction on O(h). As O(x_(x)) = ε_0, Lemma 8 implies O(x_((n+1)·x)) ≥ ε_n for n < ω. Hence O(x_(x^2)) ≥ sup_{n<ω} ε_n = ε_ω. When h = x this is exactly the induction start. Both cases, α+1 and γ, follow immediately:
(α+1): O(x_((h+1)·x)) = O(x_(h·x+x)) ≥ O(x_(h·x))_(ω) ≥ (ε_{O(h)})_(ω) = ε_{O(h)+1}, by Lemma 8 and the IH.

(γ): O(x_(h·x)) ≥ sup_{h'≺h} O(x_(h'·x)) ≥ sup_{h'≺h} ε_{O(h')} = ε_{O(h)}, by the IH.

² A more graphic notation for τ_0 is the tower ε_{ε_{⋰_{ε_0}}} of height ω.

Proof (of Theorem 6). Define ε_{0,0} = ε_0 and ε_{n+1,0} = ε_{ε_{n,0}}. For the functions ψ_n (as defined in Theorem 2) we have ψ_{n+2} = x_(ψ_{n+1}) ≽ x_(ψ_n·x), so that, for all n, we have O(ψ_{n+2}) ≥ ε_{O(ψ_n)} by Lemma 9. But then (by induction on n) O(ψ_{2n+2}) ≥ ε_{O(ψ_{2n})} ≥ ε_{ε_{n,0}} = ε_{n+1,0}. Since clearly τ_0 = sup_{n<ω} ε_{n,0}, this concludes the proof.

Corollary 1. O(S^∗, ≼) ≤ O(S_∗, ≼).

3 Concluding Remarks

First observe that if

∀f∈S_∗ ∃n<ω O(x_(f+1)) ≤ O(x_(f))_(n) (†)

holds, then O(S_∗) = τ_0, and we hope to settle (†) in the near future. Secondly, this article represents just an 'initial segment' of our research on the majorisation relation: why stop at tetration? Pentation (iterated tetration), hexation (iterated pentation) etc. yield a hierarchy majorisation-wise cofinal in the primitive recursive functions. If A_n^− allows n-ation with base x, and A_n general n-ation (i.e. S = A_3^−, S^∗ = A_3, S_∗ = A_4^−), one may ask what kind of order (A_n^{[−]}, ≼) is. We plan to extend the work presented here to A^− = ∪_{n∈N} A_n^−, and to study related questions for A = ∪_{n∈N} A_n in a forthcoming paper. In particular, as ε_0 = φ(1,0) and – if (†) holds – τ_0 = φ(2,0) (where φ is the Veblen function), we would be pleased to discover that O(A_{n+2}^−) = φ(n,0).

References

[Ehr73] Ehrenfeucht, A.: Polynomial functions with exponentiation are well ordered. Algebra Universalis 3, 261–262 (1973)
[Kru60] Kruskal, J.: Well-quasi-ordering, the Tree Theorem, and Vazsonyi's conjecture. Trans. Amer. Math. Soc. 95, 210–225 (1960)
[Lev75] Levitz, H.: An ordered set of arithmetic functions representing the least ε-number. Z. Math. Logik Grundlag. Math. 21, 115–120 (1975)
[Lev77] Levitz, H.: An initial segment of the set of polynomial functions with exponentiation.
Algebra Universalis 7, 133–136 (1977)
[Lev78] Levitz, H.: An ordinal bound for the set of polynomial functions with exponentiation. Algebra Universalis 8, 233–243 (1978)
[Ric69] Richardson, D.: Solution of the identity problem for integral exponential functions. Z. Math. Logik Grundlag. Math. 15, 333–340 (1969)
[Sie65] Sierpiński, W.: Cardinal and Ordinal Numbers. PWN – Polish Scientific Publishers, Warszawa (1965)
[Sko56] Skolem, T.: An ordered set of arithmetic functions representing the least ε-number. Det Kongelige Norske Videnskabers Selskabs Forhandlinger 29(12), 54–59 (1956)

4.3. SKOLEM'S PROBLEM

Proofs completed.

Proof (of Lemma 2, p. 5): We want to prove that tetration factors satisfy the FPP, i.e. that f ≺ g ⇒ (x_(f))^ℓ ≺ x_(g) for arbitrary ℓ ∈ N. This assertion follows from the slightly more general∗

∀ℓ∈N (k ≥ 3 ⇒ E_k(E_{k+1}(x, f(x)), ℓ) ≺ E_{k+1}(x, f(x)+1)).

To see this, note that f ≺ g ⇒ f+1 ≼ g(x), whence

(x_(f))^ℓ = E_3(E_4(x, f(x)), ℓ) ≺ E_4(x, f(x)+1) ≼ E_4(x, g(x)) = x_(g).

Now, for any y(x) ≥ 1:

E_k(E_{k+1}(x, y(x)), ℓ) ≺ E_k(E_{k+1}(x, y(x)), x) = E_k(E_{k+1}(x, y(x)), E_{k+1}(x, 1)) ≤ E_{k+1}(x, y(x)+1),

the last inequality by T. 90(C. I). q.e.d.

Proof (of Lemma 3): There are two items in the proof-sketch of Lemma 3 in [B&G09] which were not rigidly proved. The first item is the '≺†' found in the last formula on p. 6; the second is the hinted-at induction step. We prove a general version of the first item (in the notation from the previous sections), while the missing induction step is re-introduced in its original form – viz. as it would have appeared in [B&G09] had space permitted.

Proof of '≺†': In order to prove that x^{(x_(a(x)))^ℓ} ≺ x^{x^{⋰^{x^2}}} (a tower of height a(x)+1 with topmost exponent 2), we first note that

x^{(x_(a(x)))^ℓ} = E_3(x, E_3(T_3(x, 1, a(x)), ℓ)) and x^{x^{⋰^{x^2}}} = T_3(x, 2, a(x)+1).

Since T_3(x, 2, a(x)+1) = E_3(x, T_3(x, 2, a(x))), we have reduced our assertion to

E_3(T_3(x, 1, a(x)), ℓ) ≺ T_3(x, 2, a(x)).

The assumption g_1' ∈ S_{∗,n} means that we can assume a(x) ≥ x.
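The reduced inequality can be sanity-checked numerically for small values. The sketch below assumes recursions matching the identities used in this proof – E_3(x,y) = x^y, E_{k+1}(x,1) = x, E_{k+1}(x,y+1) = E_k(x, E_{k+1}(x,y)), T_k(x,c,1) = E_k(x,c) and T_k(x,c,a+1) = E_k(x, T_k(x,c,a)) – rather than the official definitions of the E_n's and T_n's given earlier in the chapter:

```python
def E(k, x, y):
    # generalised exponential (assumed recursion): E_3(x, y) = x**y and,
    # for k > 3, E_k(x, 1) = x and E_k(x, y+1) = E_{k-1}(x, E_k(x, y)).
    if k == 3:
        return x ** y
    return x if y == 1 else E(k - 1, x, E(k, x, y - 1))

def T(k, x, c, a):
    # towers: T_k(x, c, 1) = E_k(x, c), T_k(x, c, a+1) = E_k(x, T_k(x, c, a)).
    return E(k, x, c) if a == 1 else E(k, x, T(k, x, c, a - 1))

# spot-check the reduced inequality E_3(T_3(x,1,a), l) <= T_3(x,2,a)
# in the regime x >= l >= 2, a >= 2 of the claim proved below:
for x in (2, 3):
    for l in range(2, x + 1):
        for a in (2, 3):
            assert E(3, T(3, x, 1, a), l) <= T(3, x, 2, a)
```

Note that the finite check uses the non-strict '≤' of the pointwise claim; the strict '≺' of the lemma is a statement about eventual behaviour as functions of x.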
Secondly, if ℓ = 1 the assertion is trivial. We now prove that:

x ≥ ℓ ≥ 2 ∧ k ≥ 3 ⇒ (∀a≥2 E_k(T_k(x, 1, a), ℓ) ≤ T_k(x, 2, a)).

The proof is by induction on a ≥ 2.

∗ Which is not the result hinted at in footnote 4 appearing in Lemma 2, which we will not prove here, and which for the time being must be considered a conjecture.

CHAPTER 4. MAJORISATION

Induction start (a = 2): Then

E_k(T_k(x, 1, 2), ℓ) = E_k(E_k(x, x), ℓ) ≤ E_k(x, E_{k−1}(x, ℓ)) ≤ E_k(x, E_{k−1}(x, x)) = E_k(x, E_k(x, 2)) = E_k(x, T_k(x, 2, 1)) = T_k(x, 2, 2),

where the first inequality is by T. 90, the second uses ℓ ≤ x, and the middle equality is P. 87(2).

Induction step (a+1): Then

E_k(T_k(x, 1, a+1), ℓ) = E_k(E_k(x, T_k(x, 1, a)), ℓ) ≤ E_k(x, E_k(T_k(x, 1, a), ℓ)) ≤ E_k(x, T_k(x, 2, a)) = T_k(x, 2, a+1),

by T. 90 and the IH respectively.

Since this result holds for any a ≥ 2 when x ≥ ℓ, we see that E_3(T_3(x, 1, a(x)), ℓ) ≺ T_3(x, 2, a(x)), as required. This proves '≺†'.

Proof of the induction step (τ_n = m+1): First we prove comparability of factors. The case x^f vs. x^g is already covered by the induction start [found in [B&G09]]. Remaining are the cases (i) x^f vs. x^{Πg_j} and (ii) x^{Πf_j} vs. x^{Πg_j}. For case (i), since τ_n(x^f) = 0, it is when τ_n(x^{Πg_j}) = m+1 that we do not already have our result. By definition τ_n(Πg_j) = m, so we can assume, by the ih, that Πg_j is a NF, and also that

x_(h+c) ≼ x^{Πg_j} ≼ (x^{Πg_j})^ℓ ≼ x · x^{x^{⋰^{x^d}}} (a tower of height h+c−1 with topmost exponent x^d)

for some h ∈ S_{∗,n} and c, d ∈ N. If f ≼ h+c, clearly ∀ℓ>0 (x^f)^ℓ ≼ (x_(h+c))^ℓ ≺ x^{Πg_j}. If h+c ≺ f, then

∀ℓ>0 (x^{Πg_j})^ℓ ≼ x · x^{x^{⋰^{x^d}}} (height h+c−1) ≺ x · x^{x^{⋰^x}} (height h+c) ≺ x^f,

which completes case (i). For case (ii), by the ih we are able to compare the two exponent-products; thus the greater majorises all finite powers of the lesser. This proves (1). Item (2) now follows as for the induction start. To finish the proof, let τ_n(Πg_j) = m+1. By (1) and (2), we can assume it is on its NF. Thus g_1 is the most significant factor. If it is a tetration-factor, or an exponent-factor of tower height strictly less than m, we are done (since then τ_n(x^{g_1}) ≤ m).
If not, τ_n(g_1) = m and g_1 is on the form x^{Πg_j'}. By the ih fix h+c_0 and d_0 as above, w.r.t. g_1. Now

x_(h+c_0+1) = x^{x_(h+c_0)} ≼ x^{g_1} ≺ x^{Πg_i},

and for all ℓ > 0

x^{Πg_i} ≼ (x^{Πg_i})^ℓ ≺ (x^{(g_1)^{k_0}})^ℓ ≺ x^{(g_1)^{k_0+1}} ≺ x · x^{x^{⋰^{x^{d_0+1}}}} (a tower of height h+c_0 with topmost exponent x^{d_0+1}),

which, for c = c_0+1 and d = d_0+1, concludes the proof. q.e.d.

Proof (of Lemma 4): We must prove that, given a product f ≡ Π_i^m f_i of factors f_i and a function h which is intermediate to f and f·2 in the ≼-order, h can be written as the sum f+f' for some f' ≼ f. Writing h on its ΣΠ-NF, we thus have Π_i^m f_i ≼ Σ_{i=1}^ℓ Πh_{ij} ≼ (Π_i^m f_i)·2. By the properties of normal forms, in order to satisfy the first inequality we see that Π_j h_{1j} must be (Π_i^m f_i)·h_{1j_{m+1}} ⋯ h_{1j_{m+k}} for some k, and to satisfy the second inequality h_{1j_{m+1}} ⋯ h_{1j_{m+k}} ≤ 2. Hence ΣΠh_{ij} is either f·2, in which case f' = f proves the lemma, or ΣΠh_{ij} is f + Σ_{i=2}^ℓ Πh_{ij}, where Σ_{i=2}^ℓ Πh_{ij} = f' ≺ f does the job. q.e.d.

4.4 Concluding Remarks

We feel that our new functions – the E_n's and T_n's – have proved themselves as natural and intuitive candidates for canonical backbone-functions in majorisation-based hierarchy-theory, for several reasons. For one, they are uniformly defined, and they provide a means for generalising polynomials, monomials and exponential towers relative to the classical E^n-classes of Grzegorczyk. As we saw in [B&G09], they also helped us to solve an open problem, perhaps because they allowed for an 'extrapolation of intuition' to the next level. We also saw how we could generalise the results of Levitz in a straightforward manner. As for directions for further research: as pointed out, the results of this chapter are often derived by first using intuition about the situation at level k for 'small k' (i.e. 3 or 4), and then generalising.
Our intuition is that at each level one can probably sharpen the results in this way. However, how one is to gain any intuition at all about what the probable difference between levels, say k = 10^6 and k = 10^6+1, should be is hard to imagine, and obtaining an induction start for such cases may only be theoretically possible. We have argued that e.g. the function M_13(m) behaves like a polynomial with respect to functions above E_14(2, x). A question which springs to mind is if, or when, a function from level say k = 7 'becomes a constant' w.r.t. higher-ranked functions (of rank ℓ), in the sense that the same bounds and/or properties are straightforwardly obtainable for E_ℓ(f, E_7(2, x)) and E_ℓ(f, c).

We are also considering investigating how one can define ordinal tetration in a sensible way. The main difficulty is not to get 'stuck' on ordinal fix-points such as ε_0 or τ_0. Some progress has been made based on McBeth's ideas from [McB80], which treats a special case of Skolem's problem considered and partially solved in [B&G09]. These questions, and many more, are still work-in-progress, but it is our hope that the results obtained so far and presented in this dissertation are interesting enough in their own right to merit being here.

Bibliography

[IBF] Barra, M.: Inherently bounded functions and complexity. (unpublished manuscript)
[B04] Barra, M.: Complexity bounds on ARSs characterizing complexity classes. M.Sc. thesis, University of Oslo (2004)
[BKV07] Barra, M., Kristiansen, L. and Voda, P. J.: Nondeterminism without Turing machines. in the local proceedings of CiE 2007, Technical Report no. 487, Università degli studi di Siena (2007) 71–78
[B08a] Barra, M.: A characterisation of the relations definable in Presburger Arithmetic. in Theory and Applications of Models of Computation (Proceedings of the 5th Int. Conf.
TAMC 2008, Xi'an, China, April 2008), LNCS 4978, Springer-Verlag Berlin Heidelberg (2008) 258–269
[B08b] Barra, M.: Pure iteration and periodicity. in Logic and Theory of Algorithms (Proceedings of CiE 2008), LNCS 5028, Springer-Verlag Berlin Heidelberg (2008) 248–259
[B09a] Barra, M.: Bounded Minimalisation and Bounded Counting in Argument-bounded idc.'s. (Accepted – invited paper for the TCS special issue for the TAMC 2008 conference)
[B&G09] Barra, M. and Gerhardy, P.: Skolem + Tetration is well-ordered. in Mathematical Theory and Computational Practice (Proceedings of CiE 2009), LNCS 5635, Springer-Verlag Berlin Heidelberg (2009) 11–20
[B&W96] Beckmann, A. and Weiermann, A.: A term rewriting characterization of the polytime functions and related complexity classes. in Arch. Math. Logic 36 (1996) 11–30
[Bel79] Beltyukov, A. P.: Small classes based on bounded recursion. (Russian) in Vychisl. Tehn. i Voprosy Kibernet. 16 (1979) 75–84
[Bel82] Beltyukov, A. P.: A machine description and the hierarchy of initial Grzegorczyk classes. in Journal of Soviet Mathematics 20(4) (1982) 2280–2289
[Ben62] Bennett, J. H.: On Spectra. Ph.D. thesis, Princeton University (1962)
[Cal87] Calude, C.: Super-exponentials nonprimitive recursive, but rudimentary. in Information Processing Letters 25 (1987) 311–315
[Clo96] Clote, P.: Computation models and function algebras. in Handbook of Computability Theory (ed. E. Griffor), Elsevier (1996)
[Ehr73] Ehrenfeucht, A.: Polynomial functions with exponentiation are well ordered. in Algebra Universalis 3 (1973) 261–262
[End72] Enderton, H. B.: A Mathematical Introduction to Logic. Academic Press, San Diego (1972)
[Esb94] Esbelin, H.-A.: Une classe minimale de fonctions récursives contenant les relations rudimentaires. (French) [A minimal class of recursive functions containing the rudimentary relations] in C. R. Acad. Sci. Paris Série I 319(5) (1994) 505–508
[E&M98] Esbelin, H.-A.
and More, M.: Rudimentary relations and primitive recursion: a toolbox. in Theoret. Comput. Sci. 193(1–2) (1998) 129–148
[Gan84] Gandy, R.: Some relations between classes of low computational complexity. in Bulletin of the London Mathematical Society (1984) 127–134
[Gla67] Gladstone, M. D.: A reduction of the recursion scheme. in Journal of Symbolic Logic 32(4) (1967) 653–665
[Gla71] Gladstone, M. D.: Simplifications of the recursion scheme. in Journal of Symbolic Logic 36(4) (1971) 505–508
[Grz53] Grzegorczyk, A.: Some classes of recursive functions. in Rozprawy Matematyczne No. IV, Warszawa (1953)
[Har73] Harrow, K.: Sub-elementary classes of functions and relations. Ph.D. thesis, New York University (1973)
[Har75] Harrow, K.: Small Grzegorczyk classes and limited minimum. in Zeitschr. f. math. Logik und Grundlagen d. Math. Bd. 21 (1975) 417–426
[Har78] Harrow, K.: The bounded arithmetic hierarchy. in Information and Control 36 (1978) 102–117
[Har79] Harrow, K.: Equivalence of some hierarchies of primitive recursive functions. in Zeitschr. f. math. Logik und Grundlagen d. Math. Bd. 25 (1979) 411–418
[Kri08] Kristiansen, L.: Complexity-theoretic hierarchies induced by fragments of Gödel's T. in Theory Comput. Syst. 43(3–4) (2008) 516–541
[Kri05] Kristiansen, L.: Neat function algebraic characterizations of logspace and linspace. in Computational Complexity 14(1) (2005) 72–88
[Kri01] Kristiansen, L.: Subrecursive degrees and fragments of Peano Arithmetic. in Arch. Math. Logic 40 (2001) 365–397
[Kri98] Kristiansen, L.: A jump operator on honest subrecursive degrees. in Arch. Math. Logic 37 (1998) 105–125
[K&B05] Kristiansen, L. and Barra, M.: The small Grzegorczyk classes and the typed λ-calculus. in New Computational Paradigms, LNCS 3526, Springer-Verlag Berlin Heidelberg (2005) 252–262
[K&V08] Kristiansen, L. and Voda, P. J.: The structure of detour degrees.
in Proceedings of TAMC'08, LNCS 4978, Springer-Verlag Berlin Heidelberg (2008) 148–159
[K&V08b] Kristiansen, L. and Voda, P. J.: Constant detours do not matter, and so P∗^− = E∗^0. available at http://www.ii.fmph.uniba.sk/voda/E0.ps (April 16th 2008)
[Kru60] Kruskal, J.: Well-quasi-ordering, the Tree Theorem, and Vazsonyi's conjecture. in Trans. Amer. Math. Soc. 95 (1960) 210–225
[Kun99] Kunen, K.: Set Theory – An Introduction to Independence Proofs. (7th ed.) Elsevier (1999)
[Kut87] Kutyłowski, M.: Small Grzegorczyk classes. in J. London Math. Soc. (2) 36 (1987) 193–210
[K&L87] Kutyłowski, M. and Loryś, K.: A note on the 'E∗^0 = E∗^2?' problem. in Z. Math. Logik Grundlag. Math. 33 (1987) 115–121
[Lea00] Leary, C. C.: A Friendly Introduction to Mathematical Logic. Prentice Hall, Upper Saddle River, New Jersey (2000)
[Lev75] Levitz, H.: An ordered set of arithmetic functions representing the least ε-number. in Z. Math. Logik Grundlag. Math. 21 (1975) 115–120
[Lev77] Levitz, H.: An initial segment of the set of polynomial functions with exponentiation. in Algebra Universalis 7 (1977) 133–136
[Lev78] Levitz, H.: An ordinal bound for the set of polynomial functions with exponentiation. in Algebra Universalis 8 (1978) 233–243
[Lip79] Lipton, R. J.: Model theoretic aspects of computational complexity. in Proc. 19th Annual Symp. on Foundations of Computer Science, IEEE Computer Society, Silver Spring MD (1978) 193–200
[McB80] McBeth, R.: Exponential polynomials of linear height. in Zeitschr. f. math. Logik und Grundlagen d. Math. Bd. 26 (1980) 399–404
[Odi92] Odifreddi, P.: Classical Recursion Theory – The Theory of Functions and Sets of Natural Numbers. (2nd ed.)
[Odi99] Odifreddi, P.: Classical Recursion Theory, Volume II – The Theory of Functions and Sets of Natural Numbers.
Studies in Logic and the Foundations of Mathematics 142, Elsevier, Amsterdam (1999)
[Oit02] Oitavem, I.: A term rewriting characterization of the functions computable in polynomial space. in Arch. Math. Logic 41 (2002) 35–47
[P&V85] Paris, J. and Wilkie, A.: Counting problems in bounded arithmetic. in Methods in Mathematical Logic (Proceedings, Caracas 1983), Lecture Notes in Mathematics 1130, Springer-Verlag (1985) 317–340
[Pet67] Péter, R.: Recursive Functions. (3rd ed.) Academic Press, New York and London (1967)
[Pre30] Presburger, M.: Über die Vollständigkeit eines gewissen Systems der Arithmetik ganzer Zahlen, in welchem die Addition als einzige Operation hervortritt. in Sprawozdanie z I Kongresu Matematyków Krajów Słowiańskich, Warszawa (1930) 92–101, 395
[Pre91] Presburger, M. and Jacquette, D. (translator): On the Completeness of a Certain System of Arithmetic of Whole Numbers in Which Addition Occurs as the Only Operation. in History and Philosophy of Logic 12 (1991) 225–233
[Ric69] Richardson, D.: Solution of the identity problem for integral exponential functions. in Z. Math. Logik Grundlag. Math. 15 (1969) 333–340
[Rit63] Ritchie, R. W.: Classes of predictably computable functions. in Trans. Am. Math. Soc. 106 (1963) 139–173
[JRo50] Robinson, J.: General recursive functions. in Proceedings of the American Mathematical Society 1(6) (1950) 703–718
[JRo55] Robinson, J.: A note on primitive recursive functions. in Proceedings of the American Mathematical Society 6(4) (1955) 667–670
[RRo47] Robinson, R. M.: Primitive recursive functions. in Bulletin of the American Mathematical Society 53(10) (1947) 925–942
[RRo55] Robinson, R. M.: Primitive recursive functions II. in Proceedings of the American Mathematical Society 6(4) (1955) 663–666
[Rog67] Rogers, H. Jr.: Theory of Recursive Functions and Effective Computability. McGraw-Hill Book Co. (1967)
[Ros84] Rose, H.
E.: Subrecursion – Functions and Hierarchies. Clarendon Press, Oxford (1984)
[Swk05] Schweikardt, N.: Arithmetic, first-order logic, and counting quantifiers. in ACM Trans. Comput. Log. 6(3) (2005) 634–671
[Sch69] Schwichtenberg, H.: Rekursionszahlen und die Grzegorczyk-Hierarchie. (German) in Arch. Math. Logik Grundlagenforsch. 12 (1969) 85–97
[Sev08] Severin, D. E.: Unary primitive recursive functions. in J. Symbolic Logic 73(4) (2008) 1122–1138
[Sho67] Shoenfield, J. R.: Mathematical Logic. Association for Symbolic Logic, A K Peters Ltd., Natick, Massachusetts (1967)
[Sie65] Sierpiński, W.: Cardinal and Ordinal Numbers. PWN – Polish Scientific Publishers, Warszawa (1965)
[Sko23] Skolem, T.: The foundations of elementary arithmetic established by means of the recursive mode of thought, without the use of apparent variables ranging over infinite domains (with foreword) (1923). Translation of 'Begründung der elementaren Arithmetik durch die rekurrierende Denkweise ohne Anwendung scheinbarer Veränderlichen mit unendlichem Ausdehnungsbereich'. in From Frege to Gödel (van Heijenoort, ed.), Harvard University Press (1967) 302–333
[Sko56] Skolem, T.: An ordered set of arithmetic functions representing the least ε-number. in Det Kongelige Norske Videnskabers Selskabs Forhandlinger 29(12) (1956) 54–59
[Smu61] Smullyan, R. M.: Theory of Formal Systems. (Revised edition) Princeton University Press, Princeton, New Jersey (1961)
[Soa] Soare, R. I.: Computability and recursion. available from http://people.cs.uchicago.edu/~soare/Publications/computability.ps (March 3rd 2009)
[Terese] Terese: Term Rewriting Systems. (Bezem, M. et al., eds.) Cambridge University Press (2003)
[Veb08] Veblen, O.: Continuous Increasing Functions of Finite and Transfinite Ordinals. in Trans. Am. Math. Soc. 9(3) (1908) 280–292
[Wai72] Wainer, S. S.: Ordinal recursion, and a refinement of the extended Grzegorczyk hierarchy.
in Journal of Symbolic Logic 37(2) (1972) 281–292
[Wra78] Wrathall, C.: Rudimentary predicates and relative computation. in SIAM J. Comput. 7(2) (1978) 194–209