Character Recognition using Hidden Markov Models
Anthony DiPirro
Ji Mei
Sponsor: Prof. William Sverdlik

Our goal
Recognize handwritten Roman and Chinese characters
Ji
This is an example of the Noisy Channel Problem

Noisy Channel Problem
• Find the intended input, given the noisy input that was received
• Examples
  – iPhone 4S Siri speech recognition
  – Human handwriting

Markov Chain
We use a Hidden Markov Model (HMM) to solve the Noisy Channel Problem
An HMM is a Markov chain for which the state is only partially observable.

Markov Chain
Definition
Illustration

Hidden Markov Model

Our Project

How to solve our problem?
• Using an HMM, we can calculate the chain of hidden states from the chain of observations
• We used our collected samples to calculate the transition probability table and the emission probability table
• We use the Viterbi algorithm to find the most likely result

Pre-Processing
• Shrink
• Median filter
• Sharpen

Feature Extraction
• We count the regions in each area to represent the observation states

Compare
[Figure: the state sequence of the adjusted input (states such as S1, S2, S3) is compared with the state sequences of Canonical A and Canonical B]

Experimenting
How to split a character

Experimenting
How to represent states

Result

Conclusions
• Factors that will affect accuracy
  – Pre-processing
  – How to split a word
  – Number of states

In the future
• Spend more time on different features
  – Pixel density
  – Counting lines
• Use other algorithms, such as a neural network, to implement character recognition.