PatternDiviner: A Pattern Recognition Tool

Gregory S. Hill and Goran Trajkovski
Cognitive Agency and Robotics Laboratory
Towson University, 8000 York Road, Towson MD 21252-0001
E-mail: {ghill1, gtrajkovski}@towson.edu
Phone 410-704-6310, Fax 410-704-3868

Overview of Original Software

Activity is central to thought and cognition. Through interaction, autonomous agents build working representations of the environment they inhabit (Trajkovski 2007).

TamTam is a software demo based on interactionist principles (Bickhard 1980) and on unsupervised learning (Buisson 2006). This applet was developed to recognize and anticipate rhythmic patterns entered via a computer keyboard. TamTam always starts with a basic set of rhythms (e.g., four full-notes), and then uses a sophisticated algorithm for generating more and more complicated child patterns, based on the previously recognized patterns input by the user. Figure 1 gives an example of the TamTam output for a pattern. Patterns that are successfully recognized are the ‘winners’ and are used as the parents for the next generation of more sophisticated patterns, in an approach reminiscent of genetic algorithms.

Figure 1: TamTam Example

We viewed TamTam not only as a demonstrative simulation, but also as a possible learning paradigm in pattern recognition and anticipation. We studied the efficacy of converting this approach into a generalized pattern recognition tool for pattern mining in large datasets, including bioinformatics data such as gene sequences.

Methods

One of the issues when working with the code was that it was written as a Java applet (see [Buisson 2003] for details), with all of the classes residing in one file and a resulting reliance on global variables. Based on the TamTam code, we developed PatternDiviner, a software suite for pattern mining in gene sequences.
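
The generate-and-test cycle described in the overview, in which recognized ‘winner’ patterns become the parents of longer child patterns, can be sketched roughly as follows. This is a minimal illustration and not the TamTam or PatternDiviner code: the class GenerateAndTestSketch, its method names, the occurrence-count criterion for ‘winning’ (minCount), and the child scheme of appending one symbol to each winner are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; not part of TamTam or PatternDiviner.
class GenerateAndTestSketch {
    // Assumed symbol alphabet: the four nucleotide letters.
    static final char[] ALPHABET = {'A', 'C', 'G', 'T'};

    // Counts occurrences of pattern in sequence, allowing overlaps.
    static int countOccurrences(String sequence, String pattern) {
        if (pattern.isEmpty()) {
            return 0; // an empty pattern is never treated as a match
        }
        int count = 0;
        int from = sequence.indexOf(pattern);
        while (from >= 0) {
            count++;
            from = sequence.indexOf(pattern, from + 1);
        }
        return count;
    }

    // Assumed criterion: a candidate "wins" if it occurs at least minCount times.
    static List<String> selectWinners(String sequence, Set<String> candidates, int minCount) {
        List<String> winners = new ArrayList<>();
        for (String candidate : candidates) {
            if (countOccurrences(sequence, candidate) >= minCount) {
                winners.add(candidate);
            }
        }
        return winners;
    }

    // Assumed child scheme: append every alphabet symbol to every winner.
    static Set<String> makeChildren(List<String> winners) {
        Set<String> children = new LinkedHashSet<>();
        for (String winner : winners) {
            for (char symbol : ALPHABET) {
                children.add(winner + symbol);
            }
        }
        return children;
    }

    // Repeats select-then-breed until no candidate wins or maxLength is reached,
    // and returns the winners of the last successful generation.
    static List<String> longestWinners(String sequence, Set<String> seeds,
                                       int minCount, int maxLength) {
        List<String> best = new ArrayList<>();
        Set<String> candidates = seeds;
        while (!candidates.isEmpty()) {
            List<String> winners = selectWinners(sequence, candidates, minCount);
            if (winners.isEmpty()) {
                break; // no child was recognized; keep the previous generation
            }
            best = winners;
            if (winners.get(0).length() >= maxLength) {
                break; // all candidates in a generation share the same length
            }
            candidates = makeChildren(winners);
        }
        return best;
    }

    public static void main(String[] args) {
        Set<String> seeds = new LinkedHashSet<>(List.of("A", "C", "G", "T"));
        System.out.println(longestWinners("ATATATGC", seeds, 2, 22));
    }
}
```

With the toy sequence "ATATATGC", single-letter seeds, and a minCount of 2, the loop stops after the generation whose only winner is ATAT, because none of its five-symbol children occurs twice. A different child scheme (for example, inserting a symbol in the middle of a winner) would only require replacing makeChildren.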

The core of the product is the pattern recognition engine (PRE), which relies on a multitude of other independent classes and interfaces, including Note.java, PatternBuilder.java, PatternDisplay.java, Sequence.java, SequenceCanvas.java, Stroke.java, TamTamPanel.java, and TouchPanel.java. PatternDisplay is an interface declaring the method updatePatternDisplay(Sequence), which TamTamPanel implements. This method is used to explicitly display the results of the PRE. In the case of the original TamTam applet, TamTamPanel displays a graphical sequence of notes, based on the pattern being fed to it by the user, as output in the applet. The core pattern recognition code was placed in PatternBuilder, which is the PRE of this system. See Figure 2 for the UML diagram of the essential classes.

Figure 2: UML Diagram

A separate class, DNAPatterns.java, implements PatternDisplay; it loads the sample dataset and feeds it to the Sequence class. We used the following conversion between the note representation in TamTam and the base nucleotides: adenine (abbreviated A, equivalent to a full note), cytosine (C, half note), guanine (G, 1/4 note) and thymine (T, 1/8 note). We then iterated through the gene sequence in window sizes of 4 through 22 nucleotides, as is customary in bioinformatics data mining when studying nucleotide sequence interaction.

For example, given a partial gene sequence of ‘AGGGTGCGCA AATTGGCGCA …’, the first round of sequences would be ‘AGGG’, ‘GGGT’, ‘GGTG’, ‘GTGC’, etc. Any patterns recognized by the PRE are stored internally by the engine.

Work in Progress

The results of running the Salmonella gene sequence (NCBI GenBank 2006) through the engine were negative. The key sequence lengths we were interested in were between 18 and 22 nucleotides. The PRE was able to recognize patterns of up to five nucleotides in length, but nothing longer.

There are several possible reasons for the negative result. The first, and most obvious, is that there may simply be no patterns to be recognized. A further issue might be the particular algorithm for generating the child patterns from the winning parent patterns. A winning pattern of, for example, ‘atat’ might generate a child pattern of ‘atgat’, which may very well not occur as a larger pattern within the sequence. Finally, larger gene sequence datasets should be tested against the tool, as well as different types of data (meteorological, traffic-flow patterns in various urban areas, etc.).

This is a work in progress. More research into alternative pattern generation schemes might be worthwhile. It would also be interesting to further develop the PRE into a more abstract and extensible class. With a little work, a base class could be developed that manages the basic pattern recognition and lets sub-classes implement the specific pattern recognition algorithms needed by the developer. Using a basic Factory pattern, TamTam could then be used in a variety of environments, and would be easy to modify and test.

Acknowledgements

This work was assisted generously by Dr Jean-Christophe Buisson, and was partially funded by the Faculty Research and Development Committee of Towson University.

References

1) Bickhard, M. “Interactivist Manifesto”. Retrieved online on June 10, 2006 at http://www.lehigh.edu/~mhb0/InteractivismManifesto.pdf.
2) TamTam. Retrieved online on June 5, 2006 at http://diabeto.enseeiht.fr/tamtam/, Dr. Jean-Christophe Buisson of L’Ecole Nationale Supérieure d'Electrotechnique, d'Electronique, d'Informatique, d'Hydraulique et des Télécommunications (http://enseeiht.fr/).
3) Jean-Christophe Buisson: “A rhythm recognition computer program to advocate interactivist perception”, Cognitive Science, Volume 28, Issue 1, January-February 2004, Pages 75-87.
4) NCBI GenBank: http://www.ncbi.nlm.nih.gov/Genbank/GenBankFtp.html
5) Trajkovski, G., “An Imitation-Based Approach to Modeling Homogenous Agents Societies”, IDEA Publishing, 2007.
6) Collins, S, and Trajkovski, G (2006) “Attack of the Rainbow Bots: Generating Diversity through MultiAgent Systems”. In Trajkovski, G. (ed) Diversity in Information Technology Education. Hershey, PA: InfoSys Press, pp 196-241.
7) Trajkovski, G., Collins, S.: “Autochthony Through Self-Organization: Interactivism and Emergence in a Virtual Environment”, New Ideas, Elsevier, in press.
8) Stojanov, Bozinovski, S, Trajkovski, G, “Interactionist-Expectative View on Agency and Learning”, in: IMACS Journal of Mathematics and Computers in Simulation, North-Holland, Amsterdam, vol 44 (1997) 295-310.