Singular Value Analysis of Cryptograms Author(s): Cleve Moler and Donald Morrison Source: The American Mathematical Monthly, Vol. 90, No. 2 (Feb., 1983), pp. 78-87 Published by: Mathematical Association of America Stable URL: http://www.jstor.org/stable/2975804 . Accessed: 14/03/2011 13:52 Your use of the JSTOR archive indicates your acceptance of JSTOR's Terms and Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp. JSTOR's Terms and Conditions of Use provides, in part, that unless you have obtained prior permission, you may not download an entire issue of a journal or multiple copies of articles, and you may use content in the JSTOR archive only for your personal, non-commercial use. Please contact the publisher regarding any further use of this work. Publisher contact information may be obtained at . http://www.jstor.org/action/showPublisher?publisherCode=maa. . Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact support@jstor.org. Mathematical Association of America is collaborating with JSTOR to digitize, preserve and extend access to The American Mathematical Monthly. http://www.jstor.org 78 CLEVEMOLERAND DONALD MORRISON [February newsletter culminated in the creation of our very successful, very well-received newsletter, FOCUS. Its editor, MarciaP. Sward,calls Ed the "Father of FOCUS". Let me conclude by emphasizingother service to the Association. Ed was elected chairmanof the Texas Section,but he left for Michiganbefore he took office. He was a visiting lecturerfor the MAA, and served a five-yearterm as editor of the MathematicalNotes Section of the MONTHLY. As chairmanof the Committeeon Publicationssince 1971,he guided the unprecedentedgrowthof our journals and our series of books and monographswith skill, determination,and enthusiasm. In all his activities, Ed Beckenbachenlisted the cooperation of his colleagues by his skill at negotiation, his unfailing courtesy and considerationtoward others, and his common sense and good humor. But Ed's cooperativeand accommodatingspirit at the committee table completely disappeared in another of his roles. On the tennis court Ed was a hard contender, a tough adversarywho showed 'em no mercy. Captain of the tennis team at Rice University back in the twenties, and later the coach of the team, he spanned six decades with his favorite sport. Even recently Ed and his wife Alice competed in national tournamentsof "superseniors";in more peaceful moments they indulged their lifelong hobby, tending their hillside acre of plants ranging from apricots to orchids. Incidentally, Alice surely holds a record for faithful attendance at meetings of mathematicians,having started at the age of eleven as daughter of a one-time presidentof the Association. Ed Beckenbachclearly served the mathematicalcommunityvery well. We are all indebted to him for his preeminentleadership.Ed Beckenbachis indeed a most worthy recipientof the Award for DistinguishedServiceto Mathematics. * * * Edwin F. Beckenbachdied on September5, 1982. SINGULAR VALUE ANALYSIS OF CRYPTOGRAMS CLEVE MOLER AND DONALD MORRISON* Departmentof ComputerScience, Universityof New Mexico, Albuquerque,NM 87131 1. Singular Value Analysis and Cryptanalysis.The singular value decomposition is a matrix factorization which can produce approximationsto large arrays. Cryptanalysisis the task of breaking coded messages. In this paper, we present an unusual merger of the two in which the singularvalue decompositionmay aid the cryptanalystin discoveringvowels and consonants in messagescoded in certainvariationsof simple substitutionciphers. Texts in many languages, including English, have the property that vowels are frequently followed by consonants, and consonants are frequently followed by vowels. We say a text is a Cleve B. Moler received his Ph.D. at Stanford under George Forsythe. He has taught both mathematics and computer science at Stanford, the University of Michigan, and the University of New Mexico. He recently succeeded Morrison as Chairman of the Department of Computer Science at New Mexico. His academic and research interests include: numerical analysis, mathematical software, and scientific computing. Don Morrison earned his Ph.D. in mathematics at the University of Wisconsin in 1950, under C. C. MacDuffee. He taught mathematics at Tulane, 1950-55, and computer science at the University of New Mexico, 1967 to the present. He was a staff member, division supervisor and department manager at Sandia Corporation, 1955-71. He has one wife, three children and numerous grandchildren. His present research interests include cryptology, data structures, and design of algorithms. *This work was partially supported by a grant from the National Science Foundation. 1983] SINGULARVALUEANALYSISOF CRYPTOGRAMS 79 "vowel-follows-consonant"text, or more briefly, a "vfc" text if the proportion of vowels following vowels is less than the proportionof vowels following consonants. That is, if numberof vowel-vowelpairs < numberof consonant-vowel.pairs numberof vowels numberof consonants Some languagesproduce better vfc texts than others and some deviations by individualletters can be expected. In English text, the letter h is often preceded by other consonants to form a single sound, as in ch, gh, ph, sh, and, especiallyth. The letters 1,n, m, and r are often followed by consonants. Nevertheless,Englishis a predominantlyvfc language.In Hawaiian,every consonant is followed by a vowel, so there are no consonant-consonantpairs. In the romajitransliterationof Japanese,the only consonant-consonantpairs are ch, sh, ts, and n, followed by a consonant and several double consonants. Russian has many sounds which, in transliteration, are of the consonant-consonantor vowel-vowel form, such as sh, ch, ts, ya, and ye, but in the Cyrillic alphabet, these are represented as single letters. Thus, Russian is also a predominantly vfc language. 2. The Cryptanalyst'sProblem. A simple substitutioncryptogramis a coded message in which the individualletters have been replacedby a permutationof themselves.When faced with such a message, a cryptanalystmight first count the occurrencesof individualletters in the message and compare the frequencieswith the known frequenciesin typical uncoded text. However, a fairly long message is requiredbefore this techniquehas much chance of success. Another aid in breakingthe code is a partitioningof the alphabet of the cryptograminto two subsets, representingthe vowels and the consonants.In order that such a partitionbe plausible,it ought to satisfy the vfc rule (1). This is the main task which we wish to considerhere-and is what we call the cryptanalyst'sproblem. A trial and error solution, which tries all possible partitions until one satisfying the vfc rule is found, is clearly prohibitive. Let n be the numberof lettersin the alphabet.For any text, the digramfrequencymatrixis the n-by-n arrayA with aij = the number of occurrencesof the ith letter followed by the jth letter. Blanks and punctuation,if present, are ignored and the first letter of the text is assumedto follow the last letter. In general,the matrixis not symmetric,but for each i Eaij = Eaj1 =f I i wherefi = the numberof occurrencesof the i th letter. For any proposed partitioning of the alphabet into vowels and consonants, two column vectors, v and c, can be defined by vi = 1 if the ith letteris a vowel, 0 otherwise, Ci = 1 if the i th letter is a consonant, 0 otherwise. Note that v + c is a vector with all l's and that the inner product vTc is zero. Also note that the value of the quadraticform vTAvis the numberof vowel-vowelpairs in the text. Using A, v and c, the vfc rule (1) can be stated vTAv cTAv vTA(v + c) cTA(v + c) Cross-multiplyingand cancelling the common term, we obtain (2) (vTAv)(cTAc) - (vTAc)(cTAv) < 0. The cryptanalyst'sproblem, then, is: Given A, find a partitioningv and c so that (2) holds. 3. The SingularValue Decomposition. The singularvalue decomposition,or SVD, is a matrix factorizationwhich numericalanalystsuse in a wide varietyof ways. Althoughits primaryuses are 80 CLEVEMOLERAND DONALD MORRISON [February in the analysis of systems of simultaneouslinear equations and in the computationof pseudoinverses, we will use it here to obtain "simple"approximationsto the digramfrequencymatrix. For this purpose,we express the SVD as a sum of rank one matricesof the form (3) A = a1xXyT + a2x2yT + *- + anXnYnT where a1 xiTx= a22 * > an > 0 uij (the Kroneckerdelta) Yii ij The coefficients aj are known as the singularvalues and xj, and yj are the left and right singular vectors, respectively.They can also be characterizedin terms of the solutions to the symmetric eigenvalueproblem: tT o} y} It is not hard to see that the rank of a matrixis the numberof nonzero singularvalues. (In fact, this is a particularlyuseful way to define rank.) There is a fast, reliable algorithmfor computing the SVD [3], [6]. For a 26-by-26 matrix, the computation of the singular values and correspondingleft and right vectors takes only a few seconds on a medium speed modermcomputer. The normalizationof the vectors xj and yj insures that all of the rank one matricesxjyT have the same norm (that is, the sum of the squares of the elements). Consequently, the numerical importanceof each term in the sum (3) can be measuredby the size of the coefficient aJ..If the a decrease fairly rapidly as j increases, then an accurate approximationto A can be obtained by truncatingthe series after only a few terms.Truncatingthe series after k nonzero terms providesa rank k approximationto A. Such approximationsare related to those obtained by factor analysis and have a wide varietyof applications.One unusualapplicationis to digital image processing[1]. 4. Rank One Approximation.A rough approximationto a digram frequency matrix can be obtained by taking only the first term in its SVD: A ~A1 = a1xXlyT. Let e be the vector of all l's and let f be the vector withf letter in the text. Then Ae = ATe= f and so a1 (yTe)xl a-(-xaTe)x e = fy the numberof occurrencesof the i th . Consequently,if we were to assume that the digram frequency matrix were only rank one, we would conclude that it was symmetric and that each row and column was proportional to the frequencyvectorf. Of course, in practice,A is not rank one. Nevertheless,the first left and right singularvectors tend to be approximatelyequal and reflect the frequenciesof the letters in the text. 5. Rank Two Approximation.The rank two approximationobtained from the first two terms of the SVD is the simplest approximationwhich takes into account the correlationbetween pairs of letters in the text. The second left and right singularvectors contain the key to the solution of the ciyptanalyst'sproblem. Assume that A is approximatedby a matrix of rank two, so that (4) A A2 = a xTyf + a2x2yT. We propose to use the signs of the components of x2 and Y2 to partition the alphabet as follows: 1983] SINGULARVALUEANALYSISOF CRYPTOGRAMS { i I 0 0 ()c ( 1 0 81 ifx2 > O andy,2 < O otherwise, f Xi2< 0 andyi2 > 0 otherwise, if sign (Xi2) = sign (Yi2) = O otherwise. The third category-the "neuter"letters-are the ones that cannot be classified as either vowels or consonants. It turns out in practice that few letters fall into this category. In English text, for example, the letter h is usually neuter. It might be possible to consider a finer partitioninvolving "left vowel, right consonant" and so on, but we have not pursuedthis idea. Using these definitions,we obtain a solution to the cryptanalyst'sproblem as follows. THEOREM. Let A = A2 be a nonnegativerank 2 matrixwith the SVD expansionin (4). Let v and c be definedby (5). Thenthe vfc rule, (6) is satisfied. D = (vTAv)(cTAc) - (vTAc)(cTAv) < 0 Proof. Since A is a nonnegativematrix,it follows from the Perron-Frobeniustheorem [4] that xl and Yi have nonnegativecomponents. Placing (4) in (2) and expanding produces eight terms. The two terms involving only subscript 1 cancel, so do the two terms involving only subscript2: D (V"ZIY1v cTz2y2Cc + vTz2y2v cITzyIc =1F2 T -vZIy1C T T CTZ2Y22V - T T VTZ2Y2C CTZIYl V) Of all the different inner products appearing in this expression, only two-namely, y2Tvand CTZ2-are negative.Consequently,all four termsin the parenthesesare negative and D is negative. 6. Effects of Encipherment.When a message M is encoded by a simple substitutioncipher c, (wherec is a permutationof the integers 1 to n) each occurrenceof the ith letter u, is replacedby the c(i)th letter, uc(z). The resultingcryptogramis called MC.If A is the digramfrequencymatrix of M, and C is the permutationmatrix C = (8c(i)j) which has in its ith row, the c(i)th row of the identity matrix, then it is not difficult to show that the digram frequencymatrix of MCis CACT. This matrix has in row c(i), column c(j), the frequency of the digram uiu1 in M, which it represents.It follows that the singularvalues of CACT are the same as those of A, and thejth left and right singularvectors are, respectively,Cxj and Cyj. These have the same coefficients as xj and y1, but they are permutedby C; the coefficients appearingin row i in xJ ad yJ appearin row c(i) in Cxj and Cyj.If the scheme describedabove, applied to A, classifies the letter ui as a vowel in M, then applied to CACT, it will classify uc(,), the encoding of ui, as a vowel in Mc. Simple substitutionciphersare, of course,very simple ciphers,and no self-respectingcryptanalyst regards them as a challenge. A more sophisticatedcipher is the k-alphabeticcipher (k is a positive integer). In this cipher,k permutations,cl, ca.. ., Ckare used in cyclic fashion to encode the letters in M to produce the cryptogram M,. A letter is said to be in position p, in M (1 < p < k), if it is the mth letter of M and m is congruentp modulo k. If the letter ui occurs in positionp, it is encoded as uc(i). Thus, the encoding of a letter depends,not only on what letter it is, but also on its position in M. The digramfrequencymatrixof a cryptogramencoded by a k-alphabetcipheris of little use in decoding it, since the various occurrencesof a digramin MCdo not representthe same digramin M, if they do not occur in the same position. Cryptanalystshave some clever ways to deduce the probablevalue of the cycle length k. See, for example, Gaines [2] or Sinkov [5]. Armed with this information,a cryptanalystcan calculate k digram frequencymatrices,Ap = (aijp) where ai jp is CLEVE MOLER AND DONALD MORRISON 82 [February CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD CDCODDC CDl rO CD CD CD rO rO CDl -,O r.CCD - CO CDCD >- r CD CD CD 1r-C) OCO CD CD CD CD CD CD CD CD CD CD OCO O CD CD CD rO CD rO Dn CCD CD CD xooooooooooooooooooooooooo O CD r- z CM OCDLO CO CD CD O, 0 or:00000 CD CD CD CD z 4 D D q C CD o o CO O CD CD CO O O CD r-Cr D CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD 000000 00c\Jnj00000000000 LC o D D D D C)CCDCD\ D D D O CD o D o tDk -0 C CD nC C C C c0 oC xoceooooooooooo C\i. o. ~ o , lo L C rO CD CD CD CD CD CD - O CD CD CD C\ CD rO CD L" rO CD CD CD) On LIJ C LO I-,t I-, CD CD CD O CD CD CD CD rO CD CD CD C C\ O O CD CD CD CD CD CD CD CD CD r CY) CDJ r- CY) CLO CO CD m r CD CD CD O - m CD CDC) C\l O CD CDI, CD 0n CD) 0O CD O CD CD CD CD CD CO CD CDO CY) CD CD CD CD CD CD CD CD CD CD CD CD CD C\J C C O O ctCD I C D CD004 CD "\ -4 CDC.D 00O C Cl . 0 0 D O M LO -,t CD CD CD C\l LU bI_ OCD-T) Lr-00 O C D O - C )OCD CD C\ CDO 0D 0 D 00 CO CD CD D 0DO C\l C\ C\l CD rO CD CD CO md M 1 r= D 3 CD CD C\l CD >> 1983] SINGULARVALUEANALYSISOF CRYPTOGRAMS 83 the frequencywith which the digram uiuj occurs in positions p, p + 1 modulo k. All of these occurrencesare encoded into the same digramuc (i)uc +(i) in the cryptogramMc. By an argument similar to that given above for the simple substitution,if Ap is the digram frequencymatrix for is the corresponding digram digrams occurring in position p, p + 1 in M, then CpApCp/+I frequency matrix for the cryptogramMc. It has the same coefficients as Ap4,but its rows are +. Its singularvalues are the same as those of Ap, and its permutedby Cp,and its columns by Cp+ j th left and right singularvectors are, respectively,Cpxpj and Cp+ IYpj,where xpj and ypjare those of Ap. If the classification scheme described above, applied to xpj and y(p- 1)j classifies u, as a vowel, then it will also, applied to Cpxpjand Cpy(p+1)j, classify its encoding uc (j) as a vowel. If this is done for eachp, then the cryptanalysthas, for eachp, a classificationof the letters in position p of the cryptogram,into vowels and consonants. The classificationis, of course, only as reliable as it would have been if it had been applied to the original message M, or, more particularly,to those subsequences of M consisting of the digrams occurring in a particular position p, p + 1. If these are representativesamples of the digramsoccurringin the language of M, then the classificationis as reliableas it is when applied to equally representativeplaintexts, as we have done in sections 7 and 8. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 81.7256 53.5189 45.1604 31.1073 19.9258 18.8196 16.8777 12.1174 10.1647 9.4667 7.6957 6.2491 4.7882 2.7120 2.1048 1.7556 1.6395 0.9360 0.8129 0.4996 0.3232 0.2069 0.0148 0.0000 0.0 0.0 FIG. 2. The singular values. 7. ExperimentalResults. Fig. 1 is the digram frequency matrix for Lincoln's Gettysburg Address, a text of 1,148 characters.Notice, for example, that "th," with 47 occurrences,is the most frequentpair, that "q" occurs only once, and that "j," "x" and "z" do not occur at all. Fig. 2 gives the singularvalues of the matrixin Fig. 1. Since threeletters are missing, the matrix has rank 23 at most, and 3 of the singularvalues should be zero. The subroutinefinds two exact zero values and one value, the size of roundoff erroron the computer.(Exact zeros are printed as 0.0, while numbersless than 10' are printed as 0.00000.) 84 [February CLEVEMOLERAND DONALD MORRISON It certainlycannot be claimed that the singularvalues decrease rapidly. In fact, the rank two approximationonly vaguelyresemblesthe originalmatrix.Nevertheless,useful informationcan be obtained from the first two pairs of singularvectors. Fig. 3 shows the first singularvectors xl and Yi, togetherwith the frequencyvectorf. It can be seen that the components of the two singular vectors are roughly equal, and are roughly proportional to the components of the frequency vector. Thus, even though the matrix is not particularlywell approximatedby the first term of its SVD, the singularvectors still retain the propertiespredictedby the rank one theory. A B C D E F G H I J K L M N 0 P Q R S T U V W 0.3275 0.0470 0.1200 0.2011 0.4394 0.0875 0.0966 0.3481 0.1830 0.0000 0.0099 0.1203 0.0607 0.2165 0.2387 0.0522 0.0007 0.2954 0.1683 0.4453 0.0532 0.1167 0.1210 0.3221 0.0441 0.1135 0.2259 0.4517 0.1065 0.0799 0.3378 0.2339 0.0000 0.0097 0.1218 0.0468 0.2435 0.2563 0.0564 0.0054 0.2493 0.1391 0.4367 0.0597 0.0817 0.1044 0.0 X 0.0 Y 0.0339 0.0344 Z 0.0 0.0 102. 14. 31. 58. 165. 27. 28. 80. 68. 0. 3. 42. 13. 77. 93. 15. 1. 79. 44. 126. 21. 24. 28. 0. X 10. Y 0. Z FIG. 3. The first right and left singular vectors and the letter frequency vector. A B C D E F G H I J K L M N 0 P Q R S T U V W -0.5097 0.0412 0.0719 0.1436 -0.3306 -0.0006 0.0521 0.3877 -0.2070 0.0000 0.0033 0.0298 0.0634 -0.0649 -0.3891 0.0385 -0.0010 0.1692 0.0801 0.3785 -0.0860 0.1884 0.1559 0.0 -0.0045 0.0 0.1574 -0.0386 -0.0788 -0.2163 0.5305 -0.0198 -0.0623 0.3293 0.1791 -0.0000 -0.0049 -0.0853 -0.0613 -0.3983 0.1070 -0.0535 -0.0062 -0.3402 -0.1462 -0.3878 -0.0516 -0.1405 -0.0112 0.0 -0.0215 0.0 FIG. 4. The second right and left singular vectors. The sign patterns identify vowels and consonants. Fig. 4 shows the second singularvectors x2 and Y2. The alternatingsign patternspredictedby the rank two theory are clearlyevident. Figs. 5 and 6 give a graphicalsummaryof the quantitative informationin Fig. 4 by plotting each letter at a point in the two-dimensionalplane determined by its components in x2 and Y2. Thus, "a" is plotted at coordinates (-0.5097,0.1574), "b" at coordinates (0.0412, -0.0386), and so on. The more frequent letters are in Fig. 5 and the less frequentlettersin Fig. 6. The box in both figureshas cornersat (? 0.1, ? 0.1). Since they fall in the same quadrant(the second), the letters "a,""e," "i" and "o" should all be classified as either vowels or consonants, and we have, of course, chosen to call them vowels. The letters "h," " n," " u" and "y" must be called neuterbecause the correspondingsigns in the two vectors agree. The letter "q" occurred only once. The classification of q as neuter is neverthelessinteresting. Its one occurrenceis in the word "equal." One might expect it to be classified as a consonant, since it occurs between two vowels. Note, however, that the algorithm does not recognizeu as a vowel. It is classified as neuter.This is, no doubt, attributableto the very high frequencyof the digramou, which accounts for 7 of the 21 occurrencesof u. The letter "h" 85 SINGULARVALUEANALYSISOF CRYPTOGRAMS 1983] E H A 0 V ~~~ D R N T FIG. 5. The more frequent letters, plotted with coordinates from Fig. 4. clearly shows a tendency to be followed by a vowel, and to be precededby a consonant. The letter "n" shows a weak tendency to be followed by a consonant and a strong tendency to be preceded by a vowel. The other three neuterletters occur infrequently.The letter "ij,I"x," and "z" are seen not to occur at all. The remaining14 letters are classified as consonants. If a simple substitutioncryptogramwere made from text such as the GettysburgAddress, the digrammatrixA would be replacedby PAPT for some unknownpermutationmatrixP. As shown in Section 6, the singularvectors of the transformedmatrix would classify the encoding of each letter in the same way as x2 and Y2 classified the letters of the plaintext. 8. Other Languages.The same experiment was performed on texts of approximately one thousand characters each, written in five other languages selected for their diversity. The classificationsof letters resultingfrom these tests are tabulatedbelow. Language Vowels Consonants Hawaiian AEIOU HKLMNPW Japanese German Spanish Finnish AEIOU AEFOPU ABO AEIOUY BDGHKMNRSTWYZ BHIKLNRV CDFGLMNQRSXYZ DHJKLMNPRSTV Neuter Absent CDGJMSTWZ BHIJPTUV B BCDFGJQ RSTVXYZ CFJLPQVX QXY KW CFGQWXZ 86 CLEVEMOLERAND DONALD MORRISON [February QK GM L C FIG. 6. The less frequent letters. The scale is 5 times that of Fig. 5. None of the discrepanciesis very surprising.Severalof them are associatedwith letters which occurredso infrequentlythat no significancecan be attachedto their classification.In the Finnish example,B occurredonly once, at the beginningof a word of foreign origin. It followed a final N in the precedingword and preceded an E. In the German text, P occurredonly once, and that occurrencewas in the very GermanictrigramSPR. Occurringonly between two consonants, it is not surprisingthat the analysisclassifiedit as a vowel. The most common neighborof F, on both sides, in the selected sample, is R. The classification of I is confused because of the very high frequencyof the diagramsIE and El. More generally,the Germanlanguage does not share the aversionof the other languagesto consecutiveconsonants,and consequentlymany Germanletters fall into the neuter category. The classification of Y as a vowel in the Finnish sample is not surprising;Y has a value in Finnish that is hardly distinguishableto a non-Finnishear, from that of U. The neuterclassificationof I and U in the Spanishexampleis clearlyattributableto the high frequencyof vowel-voweldigramsin which U or I is the first letter, having a value equivalentto W or Y in English.The perfect performanceof the algorithmin Japaneseand Hawaiianis clearly the result of their rigorouslyobserved exclusion of consonant-consonantdigrams. Most of the JapaneseHiraganacharactersare transliteratedas a consonant-voweldigram. 9. Conclusions.The second singularvectors in the singularvalue decompositionof the digram frequencymatrixprovide the cryptanalystwith a helpful and surprisinglyreliableway to classify the letters in a cryptogram as vowels or consonants, if the encoding algorithm is simple WHAT IS THE GEOMETRYOF A SURFACE? 1983] 87 substitutionor k-alphabeticsubstitution,with k known, and if the text if vfc text. Texts writtenin many naturallanguages,with certainexceptions,tend to be vfc texts. Near-exceptionsare German (and, presumably,other germaniclanguages)in which the vfc characteris somewhat diminished by the frequencyof consonant-consonantdigrams,and Spanish(and, presumably,other romance languages)in which certainvowel-vowelpairs are frequent.Despite these deviations from the vfc rule, the second singularvectors classify correctlymost of the letterswhich occur with a frequency high enough to be statisticallysignificant. A computer program which uses the SVD as the starting point in an automated, heuristic approachto solving cryptogramsis describedby Schatz [7]. References 1. H. C. Andrews and C. L. Patterson, Outer product expansions and their uses in digital image processing, this 82 (1975) 1-12. 2. H. F. Gaines, Cryptanalysis, Dover, New York, 1956. 3. G. E. Forsythe, M. A. Malcolm, and C. B. Moler, Computer Methods for Mathematical Computations, Prentice-Hall, Englewood Cliffs, NJ, 1977. 4. Richard S. Varga, Matrix Iterative Analysis, Prentice-Hall, Englewood Cliffs, NJ, 1962. 5. A. Sinkov, Elementary Cryptanalysis, A Mathematical Approach, New Mathematical Library, vol. 22, Mathematical Association of America, Washington, D.C., 1968. 6. J. J. Dongarra, J. R. Bunch, C. B. Moler, and G. W. Stewart, LINPACK Users' Guide, Society for Industrial and Applied Mathematics, Philadelphia, PA, 1979. 7. Bruce R. Schatz, Automated analysis of cryptograms, Cryptologia, 1 (1977) 116-142. MONTHLY, WHAT IS THE GEOMETRY OF A SURFACE? ROGER FENN School of Mathematicsand Physical Sciences, Universityof Sussex, Falmer, BrightonBNJ 9QH, England 1. In this article we shall assume the definition of a geometry proposed by Klein in his Erlangenprogram, that is, a geometry is a group of transformationsof space together with the propositionsleft invariantby this group. The space in question will be the two-dimensionalplane. By a surfacewe shall mean a closed compact orientablesurfaceof genus y. That is a spherewith y handles attached.The fact that two such surfacesare homeomorphicif and only if their generaare equal was probably known to Riemann. A modern proof can be found in Massey's book [5, Chapter 1]. The exact result which will be proved is the following: THEOREM1. Let y > 1. Then there is a groupof hyperbolictranslationsof the hyperbolicplane such that the space of orbitsunderthese translationsis homeomorphicto a surfaceof genus -y. Moreover,thereis a compactpolygonin theplane whosetranslatesunderthe groupare distinctfor distinct group elementsand whichform a networkof nonoverlappingpolygons covering the whole hyperbolicplane. Our aim is to prove the above by means as elementaryas possible, in particularwithout using deep results from the theory of functions. In order to illustrate the general result, we consider firstly the case where -y= 1. Here the Roger Fenn is a Lecturer in Mathematics at the University of Sussex, England. He received his Ph.D. in 1968 under the direction of John Reeve and is presently writing a book on Geometric Topology. He believes people should be at the same time renaissance polymaths and twentieth century specialists insofar as this is possible. He is married, with three children, and enjoys playing chess, singing madrigals, and walking.