Computer Algebra Systems: Are We There Yet?
Richard Fateman
Computer Science, Univ. of California, Berkeley, CA
Univ. of Arizona -- February, 2004

The Subject: “Symbolic Computation Systems”
• What are they?
• How good are they now?
• Where are they going?
• When will they be “there”?

What are they?
• An attempt to build a “mathematical intelligence” or at least a very skilled assistant.

What is their current state?
• We don’t know how to achieve the stated goals.
• We keep trying, anyway.
• New systems are produced every few years, but rarely push the state of the art, much less advance it.

What then?
• If we don’t have one now, and progress seems slow, what do we need to do, when will we do it, and what will it look like?

What does it take to build a Computer Algebra System?
• A. Software engineering.
• B. Language choice (Aldor, C++, Java, Lisp, …).
• C. Algorithms, data structures.
• D. Mathematical framework (often the weak spot).
• E. User interface design.
• F. Conformance to standards: TeX, MathML, COM, .NET, Beans.
• G. Community of users (IMPORTANT).
• H. Leadership / Marketing(?)

An Aside on your non-constructive education
In freshman calculus you learned to integrate rational functions. You could integrate 1/x and 1/(x-a) into logarithms, and you used partial fractions. Unless you’ve recently taken (or taught) this course, you’ve forgotten the details. That’s OK. Let’s review it fast.

Here’s an integration problem

You need to factor the denominator
You learned to do this by guesswork, and fortunately it works.

And then do the partial fraction expansion
You probably remember one way to do this, vaguely if at all.
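The easy case of the partial fraction step can be sketched in a few lines. This is a minimal illustration, not the method taught in any particular course: it assumes the denominator has already been factored into distinct linear factors with rational roots, and it uses the standard identity A_i = P(r_i)/Q'(r_i) for the coefficient of 1/(x - r_i). The function names and the sample integrand 1/((x-1)(x-2)) are hypothetical, chosen for illustration.

```python
from fractions import Fraction

def poly_eval(coeffs, x):
    """Evaluate a polynomial (coefficients in descending order) by Horner's rule."""
    acc = Fraction(0)
    for c in coeffs:
        acc = acc * x + c
    return acc

def poly_derivative(coeffs):
    """Differentiate a polynomial given by descending coefficients."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def partial_fractions(numer, roots):
    """Return [A_1, ..., A_n] with P(x)/Q(x) = sum A_i/(x - r_i).

    Assumes (hypothetically easy case): Q = prod (x - r_i) with distinct
    rational roots, and deg P < deg Q.
    """
    denom = [Fraction(1)]
    for r in roots:
        # Multiply the running product by (x - r).
        new = denom + [Fraction(0)]
        for i, c in enumerate(denom):
            new[i + 1] -= c * r
        denom = new
    dq = poly_derivative(denom)
    return [poly_eval(numer, Fraction(r)) / poly_eval(dq, Fraction(r))
            for r in roots]

# 1/((x-1)(x-2)) = -1/(x-1) + 1/(x-2)
print(partial_fractions([Fraction(1)], [1, 2]))
```

Each term A_i/(x - r_i) then integrates to A_i log(x - r_i). The sketch says nothing about how the roots were found, or about denominators with repeated or irrational roots.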

And then integrate each term…

Can we program this?
Note we can’t computerize “guessing the answer” generally. Do you really know an algorithm to factor the denominator into linear and quadratic factors?
• Can you do this one, say…
• And if the denominator does not factor (it need not, you know…) what do you do then?

If the denominator doesn’t factor
And it gets worse… there is no guarantee that you can even express the roots of irreducible higher degree polynomials in radicals like $3^{1/2}$ and $a^{2/3}$.

Moral of this story
• Freshmen are not taught how to integrate rational functions. Only some easy rational functions.
• A freshman could not write a program. Polynomial factoring or rational integration uses ideas you may never encounter.
• Much of the math you learned is nonconstructive and must be re-invented to write a general computer algebra program!
End of aside

Some History: Ancient
• Ada Augusta, 1844, foresaw the prospect of nonnumeric computation using Babbage’s machines. Just encode symbols as numbers, and operations as arithmetic.

Ada Augusta on Symbolic Computing, 1844
Many persons who are not conversant with mathematical studies imagine that because the business of [Babbage's Analytical Engine] is to give its results in numerical notation, the nature of its processes must consequently be arithmetical and numerical, rather than algebraical and analytical. This is an error. The engine can arrange and combine its numerical quantities exactly as if they were letters or any other general symbols; and in fact it might bring out its results in algebraic notation, were provisions made accordingly.
-- Ada Augusta, Countess of Lovelace (1844)

Some History: Slightly Less Ancient
• Arithmetization of Mathematics: Formalisms
• Philosophers/Mathematicians, e.g. Gottlob Frege, then Bertrand Russell, Alfred North Whitehead (Principia Mathematica 1910-1913)

The flip side: proofs that you can’t do all math
• Impossible.
• K. Gödel, A.M. Turing

New optimism. If people can, why not computers?
1958-60 first inklings: automatic differentiation, tree representations, Lisp
• Minsky -> Slagle (1961), Moses (1966); Is it AI? Pattern Matching?

Computer Algebra Systems: threads
• Three trends emerged in the 1960s:
  – AI / later… expert systems
  – Constructive Mathematics (Integration)
  – Algorithms on polynomials (GCD)

Some Early Ambitious Systems
• Early to mid 1960's: big growth period, considerable optimism in programming languages, as well as in computer algebra…
• Mathlab, Symbolic Mathematical Laboratory,
• Formac, Formula Algol, PM, ALPAK, Reduce, CAMAL; special purpose systems,
• Simple poorly-specified systems that did some useful computations coupled with uncritical optimism about what could be done next.

Some theory/algorithm breakthroughs
1967-68 algorithms:
• Polynomial GCD,
• Berlekamp’s polynomial factoring,
• Risch integration "near algorithm",
• Knuth’s Art of Computer Programming
• 1967 - Daniel Richardson: interesting zero-equivalence results.

Some old systems survive, new ones arrive
• General:
  – SAC-1, Altran, Macsyma, Scratchpad, Mathlab 68, MuSimp/MuMath, SMP, Automath, JACAL, others.
• Specialists:
  – Singular, GAP, Cocoa, Fermat, NTL, Macaulay
• Further development; new entrants since 1980's
  – Maple, Mathematica (1988), Derive, Axiom, Theorist, Milo, … (MuPad, Ginac, Pari)
• For a list, see: www.symbolicnet.org

The Marketing Blitz: aren’t they all the same?
• Mathematica + NeXT or Apple = graphics.
• Maple does the same.
[Figure: plot of exp(-(x^2+y^2)) over (-2,2) × (-2,2)]

More of the same…
• Mathematica + NeXT or Apple = graphics. Macsyma too.

The blitz…
• Mathematica. Endorsed by Steve Jobs and the New York Times?
• Maple changes its image, belatedly.
• Macsyma follows suit.
• Axiom (Scratchpad) sold by IBM to NAG.
• Mupad starts up.

The shakeout
• Axiom continues under NAG sponsorship, then is killed (2001).
• MuPad, once free, now sold.
• Macsyma goes into hiding, earlier version emerges free as Maxima.

Connections gain new prominence
• MathML puts “Math on the Web”.
• Connections
  – Links from Matlab or Excel to Maple; Macsyma to Matlab;
  – Scientific Workplace to Maple or Mathematica or Mupad.
• The arrival of network agents for problem solving.
  – Calc101, Tilu, TheIntegrator, Ganith, …
  – Java beans for symbolic computation
  – MP, distributed computing

Are there really differences in systems?
• What we see today in systems:
  – Mathematica essentially takes the view that mathematics is a collection of rules with a procedure for pattern matching; and that math can be reduced to what might be good for physicists, even if slightly wrong.
  – Axiom takes the view that a computer algebra system is an implementation of Modern Algebra, and the physicists better know algebra.
  – “Advanced” math is spotty.

A broad brush of commonality today:
  – Objects
  – Operations
  – Properties? Axioms?
  – Extensions to a base system (programming? Declarations?)
  – Underlying all of this: efficient representations
  – Common bugs (e.g. by violating “fundamental theorem of calculus” continuity requirements.)
  – A shell around the whole thing. Menus, notebooks, etc.

Moving to the future
• Computer math + WWW adds new prospects.
• Repository for everything that was previously published (paper -> digital form).
• Could include everything NEW (born digital).
  – What to do with repetitive garbage?
• Need methods to find appropriate information
  – Index/search :: vastly dependent on CONTEXT
  – Certify authenticity and correctness (referees?)
• Algorithms may not yet exist for some problems.
  – How to pay for development
  – Availability to (all?).
• Free “public library”, pay-per-view, subscription, … pop-up ads (This integral brought to you by XYZ bank)

Digital Library of Mathematical Functions (at NIST)
• Mostly aimed at traditional usage
• Intimations of support for new modes of interaction with WWW, CAS

Competition for DLMF
Mostly aimed at supporting CAS users.
• ESF: generate automatic symbolic data for Encyclopedia of Special Functions.
• Wolfram’s special functions project: collect material from humans in special forms, display in Mathematica oriented forms. Less CAS…
• CRC/Maple tables
• Dan Zwillinger, ODEs, Gradshteyn

Contrast: Non-digital tradition: to find out something we might do this
• Look in an individually owned reference work
• Visit a library
• Access to colleagues by letter, phone
• Paper and pencil exploration
• Numerical experimentation

Contrast: Digital tradition: to find out something we might do this
• Try Google
• Visit an on-line library database, e.g. INSPEC
• Download papers to local printer or view online
• CAS exploration
• Numerical experimentation
• Major Problem: How can you type a differential equation into Google???

Wolfram Research’s Special Functions site: 3 versions
• Huge posters
• Interactive web site / Mathematica notebooks
• Printed form (or the equivalent PDF)
• Now (2004) some 87,000 “formulas” and many “visualizations”.

The posters

The web site (here, the Arcsin page)

WRI’s Categories / Some Subcategories
• primary definition
• specific values
• general characteristics
• series representations
  – generalized power series at various points
  – q-series
  – exponential fourier series
  – dirichlet series
  – asymptotic series
  – other series
• integral representations
  – on the real axis
  – contour integrals
  – multiple integral representation
• analytic continuations
• product representations
• limit representations
• continued fractions
• generating functions
• group representations
• differential equations
• difference equations
• transformations
  – addition formulas, etc.
• operations
• integral transforms
• identities
• representations through more general functions
• relations with other functions
• zeros
• inequalities
• theorems
• other information
• history and applications
• references

Click on “Series Representations”…

The posters are not very useful
• These are pictures of out-of-context math formulas.
• The most plausible next step given the charts is to copy them down on paper and check by hand.
• There is a possibility of making typos or fresh algebra mistakes.
• The notation might be different from what you are using.
• Sparse (or no) info on singularities, regions of validity.
• To run some numbers through, you need to write a computer program (Fortran? Matlab? C++?)

On-line versions are more useful
• Less possibility of making new typos.
• The notation is unambiguous, presumably using a CAS or formal syntax.
• Still, sparse (or no) info on singularities, regions of validity.
• Automated visualizations and cut/paste programming to run some numbers through.

Notebook form (I): Input form
ArcSin[z] == z^3/6 + z + (3*z^5)/40 + \[Ellipsis] ==
  Sum[(Pochhammer[1/2, k]*z^(2*k + 1))/((2*k + 1)*k!), {k, 0, Infinity}] ==
  z*Hypergeometric2F1[1/2, 1/2, 3/2, z^2] /; Abs[z] < 1
Wolfram (and others) will claim that a “system independent” language such as proposed by the OpenMath consortium would replace this language. Note however that agreement on the semantics of \[Ellipsis] would be difficult.

Notebook form (II): Displayed form (one version)
In reality, Mathematica does not look quite as good as our typesetting here in the interactive mode.

Notebook form (III): TeX form
{Condition}(\arcsin (z) = {\frac{{{\Mfunction{z}}^3}}{6}} + z + {\frac{3\,{z^5}}{40}} + \ldots = \Mfunction{\sum}_{k = 0}^{\infty } {\frac{\Mfunction{Pochhammer}({\frac{1}{2}},k)\, {{\Mfunction{z}}^{2\,k + 1}}}{\left( 2\,k + 1 \right) \,k!}} = \Mfunction{z}\,\Mfunction{Hypergeometric2F1}( {\frac{1}{2}},{\frac{1}{2}},{\frac{3}{2}},{z^2}), \Mfunction{Abs}(z) < 1))
Useful in case you wanted to paste/edit this into a paper (or PowerPoint), but requires using Mathematica TeX macros.

Notebook form (IV): OpenMath form
{ too ugly to believe }
Useful in case you wanted to send this to an OpenMath aware program. If you can find one.

Computing Inside the Notebook
How good is the 3-term approximation at z = 1/2?
ArcSin[z] == z + z^3/6 + (3*z^5)/40 + ... /. z -> 1/2
    Pi/6 == 2009/3840 + ...
Surprised?
N[ Pi/6 == 2009/3840 + ...]
    0.523599 == 0.523177 + ...
N[ Pi/6 == 2009/3840 + ..., 30]
    0.52359877559829887307710723055 == 0.52317708333333333333333333333 + ...
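The arithmetic on the "Computing Inside the Notebook" slide can be checked outside a CAS. The sketch below sums the Pochhammer series from the input form using exact rational arithmetic; the function name `arcsin_partial_sum` is a hypothetical helper, not part of any system mentioned in the talk.

```python
from fractions import Fraction
import math

def arcsin_partial_sum(z, terms):
    """Sum the first `terms` terms of sum_k (1/2)_k z^(2k+1) / ((2k+1) k!)."""
    total = Fraction(0)
    poch = Fraction(1)   # Pochhammer symbol (1/2)_0
    fact = 1             # 0!
    for k in range(terms):
        total += poch * z ** (2 * k + 1) / ((2 * k + 1) * fact)
        poch *= Fraction(1, 2) + k   # (1/2)_(k+1) = (1/2)_k * (1/2 + k)
        fact *= k + 1
    return total

three_terms = arcsin_partial_sum(Fraction(1, 2), 3)
print(three_terms)                          # exact rational value
print(math.pi / 6 - float(three_terms))     # truncation error at z = 1/2
```

With three terms at z = 1/2 the exact sum is 2009/3840, matching the slide, and the gap to Pi/6 is about 4.2e-4. Unlike the notebook, this sketch carries no notion of the trailing "..." at all, which is one concrete instance of the semantics problem raised for \[Ellipsis].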

Simplification Inside the Notebook
In[30]:= z*Hypergeometric2F1[1/2, 1/2, 3/2, z^2]
Note: this is how Mathematica interactive output looks. This should be the same as ArcSin[z] for |z|<1. And yes, z/Sqrt[z^2] is not the same as 1.

Many computer algebra systems (CAS) have essentially the same notebook paradigm
• Macsyma
• Maple
• Mathematica
• Axiom
• MuPad
• Scientific Word / Maple
• Derive

This old “knowledge”? Can we convert from scanned text?
Example from integral table
In practice, we can do some parsing using OCR if we know about the domains. But in general, we cannot read “with understanding” without context.

What about using LaTeX as source and then converting to OpenMath / CAS?
Generally speaking: not automatically.
TeX does not distinguish semantically between 1*2*3 and 123. Or between x cos x and xfoox. It has no notion of precedence of operators.
Gradshteyn and Ryzhik, Table of Integrals and Series (Academic Press), was re-typeset completely in TeX TWICE, because the first version did not reflect semantics.
MathML, XML, and OpenMath are inadequate.

Using OpenMath as original human-written source is pretty much out of the question.
If your intent is to code: x cos x
You are supposed to write something like
<OMOBJ>
  <OMA>
    <OMS cd="arith1" name="times"/>
    <OMV name="x"/>
    <OMA>
      <OMS cd="transc1" name="cos"/>
      <OMV name="x"/>
    </OMA>
  </OMA>
</OMOBJ>

Using MathML as original source is pretty much out of the question, too.
<math>
  <msqrt>
    <mfrac>
      <mrow><mn>2</mn><mi>&pi;</mi></mrow>
      <mrow><mi>&kappa;</mi></mrow>
    </mfrac>
    <mfenced open="(" close=")">
      <mn>1</mn>
      <mi>&minus;</mi>
      <mi>&beta;</mi>
      <msup>
        <mrow><mn>2</mn></mrow>
      </msup>
      <mi>/</mi><mn>2</mn></mfenced></msqrt></math>
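The OpenMath encoding of x cos x shown earlier is painful to type by hand, but it is mechanical to produce once the expression already exists as a tree, which is the situation inside a CAS. The following is a minimal sketch under stated assumptions: expressions are nested tuples of the form (operator, arg, ...), variables are plain strings, and the `SYMBOLS` table maps just the two operators from the slide to their content dictionaries. All names here are hypothetical and the sketch covers none of the rest of OpenMath.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup: operator name -> OpenMath content dictionary (from the slide).
SYMBOLS = {"times": "arith1", "cos": "transc1"}

def to_openmath(expr):
    """Convert a nested-tuple expression into an OpenMath element tree."""
    if isinstance(expr, str):
        # A bare string is treated as a variable.
        return ET.Element("OMV", name=expr)
    head, *args = expr
    node = ET.Element("OMA")                      # function application
    ET.SubElement(node, "OMS", cd=SYMBOLS[head], name=head)
    for arg in args:
        node.append(to_openmath(arg))
    return node

def openmath_object(expr):
    """Wrap the converted expression in <OMOBJ> and serialize it as text."""
    root = ET.Element("OMOBJ")
    root.append(to_openmath(expr))
    return ET.tostring(root, encoding="unicode")

# x cos x, written as an explicit product so no disambiguation is needed.
print(openmath_object(("times", "x", ("cos", "x"))))
```

The tuple ("times", "x", ("cos", "x")) already commits to a product of x and cos(x); that commitment is exactly what the TeX source "x cos x" lacks.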

What about “Wikis”
• Volunteers inserting “information” into an informal structure on the internet. Anyone can edit anything.
• Unlikely to have the accuracy and scope of a funded activity.
• Replaces single bias with many biases.
• Unlikely to have the proprietary interest of a commercial enterprise.

How will a CAS fit into this vision of Math of the future?
• The semantics for most (not all) CAS is immediate.
• Input requires immediate syntactic disambiguation.
• Easy translation into MathML for display.
• Easy translation into OpenMath, if anyone else cares.
• Important Advantage: There is an immediate computational ontology. THE BEST CHANCE FOR A FOUNDATION TO GROW CONTEXT.

Context might be the role of some Server Side software.
• Pro:
  – arbitrarily powerful,
  – always up-to-date (contains yesterday’s new math),
  – controlled by reputable authority…
• Con:
  – Requires reliable communication.
  – Authoritarian.

A challenge: Input and Output of Math
• Handwriting on a tablet is an obvious choice on Tablet PCs, but on closer examination, a very weak method. (30 years of experience!)
  0Oo 1l| 5S vV Yy < l< K
• Speech, oddly enough, can help.
• The importance of context emerges again… enormous in math communication, digital storage, etc.

Finally: Are we there yet?
• No, we are not.
• Many efforts are re-working the easy parts.
• Many efforts are mostly marketing: “improving the user interface.”
• The importance of context is enormous. A “search engine for math facts and algorithms” seems our best bet to build a mathematical assistant.
• What can we do:…