An Analysis of Hamptonese Using Hidden Markov Models
Ethan Le and Mark Stamp
Department of Computer Science
San Jose State University
McNair Scholars Program
Summer Research Project, 2003

Agenda
  James Hampton
  Purpose
  What is Hamptonese?
  Hidden Markov Models (HMMs)
  HMMs and Hamptonese
  Other approach
  Results
  Questions

James Hampton
  WWII veteran
  Janitor in Washington, D.C.
  Life of solitude
  Died in 1964
  Hamptonese discovered after his death
  “The Throne” also found
    “The Throne of the Third Heaven of the Nations’ Millennium General Assembly”

James Hampton alongside his Throne

Pieces of The Throne

Purpose
  Analyze Hamptonese using hidden Markov models to determine whether it was created by a simple substitution for English letters (or for the letters of another language).
  Consider alternative interpretations of Hamptonese

What is Hamptonese?

What is Hamptonese? (Cont’d)
  Unknown origin and content
  Spans 167 pages
  Could it be:
    A cipher system?
    Gibberish?
    Example of human randomness?
    Other?

Hidden Markov Models (HMMs)
  Markov process with hidden states
  HMMs provide probabilistic information about the underlying state of a model, given a set of observations of the system.

HMMs (Cont’d)
  Three solvable problems
    1. Find probability of observed sequence
    2. Find “optimal” state sequence
    3. Train the model to fit observations

HMMs examples
  Example 1: Music Information Retrieval System (MIR)
  Example 2: Deciphering English text (Cave and Neuwirth)
    27 symbols (alphabet letters plus space)
    Assume 2 hidden states
    Train HMM to best fit input data
    Results: separation of consonants and vowels

HMM Example (Cont’d) – English Text

  Character   Initial              Final
  A           0.03685   0.03793    0.0044447   0.1306242
  B           0.04007   0.03978    0.0241154   0.0000000
  C           0.03362   0.03423    0.0522168   0.0000000
  D           0.03777   0.03654    0.0714247   0.0003260
  E           0.03409   0.03608    0.0000000   0.2105809
  F           0.03685   0.03932    0.0374685   0.0000000
  G           0.03593   0.03839    0.0296958   0.0000000
  H           0.03961   0.03932    0.0670510   0.0085455
  I           0.04007   0.03377    0.0000000   0.1216511
  J           0.03501   0.03515    0.0065769   0.0000000
  K           0.03685   0.03700    0.0067762   0.0000000

HMMs and Hamptonese
  Transcribed 103 pages
  About 30,000 observations
  More than 40 distinct symbols
  Assume 2 hidden states
  3 different reading techniques

Other Approach
  Hamptonese vs. 246 different languages
  HMM for other languages
  Religious references & organizational patterns
  Analysis of first 40 pages
  Pattern recognition

Results
  Hamptonese is not a simple substitution for English letters
  Hamptonese is probably not a simple substitution for any other language
  First comprehensive study of Hamptonese
  Transcription of entire text
  Future research suggestions

Acknowledgments
  Dr. Mark Stamp
  SJSU McNair Scholars Program
  SCCUR UC Irvine

Questions?

References
  W. Chai and B. Vercoe, Folk Music Classification Using Hidden Markov Models, MIT Media Lab, n.d.
  M. Stamp, A revealing introduction to Hidden Markov Models, http://www.cs.sjsu.edu/faculty/stamp/Hampton/HMM.pdf
  M. Stamp and E. Le, Hamptonese website, http://www.cs.sjsu.edu/faculty/stamp/Hampton/hampton.html
  R.L. Cave and L.P. Neuwirth, Hidden Markov Models for English, in Hidden Markov Models for Speech, IDA-CRD, Princeton, NJ, 1980
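
Supplementary sketch: the first of the three HMM problems listed in the slides (finding the probability of an observed sequence) is commonly solved with the forward algorithm. The Python below is a minimal illustration of that textbook algorithm, not the code used in this project. The function name `forward_probability` and the toy values of `pi`, `A`, and `B` are assumptions chosen only for the example; the model has 2 hidden states, as in the slides, and 3 made-up observation symbols.

```python
def forward_probability(pi, A, B, observations):
    """Return P(observations | model) using the forward algorithm.

    pi[i]   : probability of starting in hidden state i
    A[i][j] : probability of moving from state i to state j
    B[i][k] : probability of emitting symbol k while in state i
    """
    n_states = len(pi)
    # alpha[i] = P(observations so far, current hidden state = i)
    alpha = [pi[i] * B[i][observations[0]] for i in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][obs]
            for j in range(n_states)
        ]
    # Summing over the final hidden state gives the sequence probability.
    return sum(alpha)


# Toy model (illustrative values only, not taken from the slides).
pi = [0.6, 0.4]
A = [[0.7, 0.3],
     [0.4, 0.6]]
B = [[0.1, 0.4, 0.5],
     [0.7, 0.2, 0.1]]

prob = forward_probability(pi, A, B, [0, 1, 0, 2])
print(prob)
```

This unscaled version is only suitable for short sequences. For a sequence on the order of the roughly 30,000 Hamptonese observations, the `alpha` values would underflow, so a practical implementation would rescale `alpha` at each step and accumulate log-probabilities instead.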