Megaputer Intelligence

March 27, 2000
Artificial Intelligence Laboratory, Master's program, 2nd semester
Choi Yun-jeong (최윤정) (cris@ai.ewha.ac.kr)

Outline
- Overview
- Technology
- PolyAnalyst solution overview
- Customer cases
- Future developments

Overview
Megaputers…
- 1989: AI laboratory at Moscow State University
- 1994: PolyAnalyst 1.0 developed
- 2000: Based on knowledge discovery and on semantic information retrieval and analysis

Technology
- Subject-oriented analytical systems
- Statistical packages
- Neural Networks
- Evolutionary Programming
- Memory Based Reasoning (MBR)
- Decision Trees
- Genetic Algorithms

Product
Provides tools for data mining and knowledge discovery, and tools for semantic text analysis and information retrieval.
- PolyAnalyst 4.0
- PolyAnalyst COM
- TextAnalyst 1.5
- TextAnalyst COM
- MegaSearch™

PolyAnalyst overview
Features in more detail:
- Multi-strategy data mining suite utilizing the latest achievements in knowledge discovery, with a broad selection of exploration engines
- Powerful data manipulation and visualization tools
- Modeling, classifying, predicting, explaining, clustering

PolyAnalyst workplace
- Multiple machine learning algorithms can be accessed through pull-down and pop-up menus, or control buttons
- The project data, charts, discovered rules, and system reports are represented by icons held in separate containers

Learning algorithms
- Classify: assigns cases to two different classes utilizing Fuzzy Logic
- Find Laws: finds an explicit model of the relation predicting the target variable
- Cluster: separates groups of similar records and finds the best clustering variables
- PolyNet Predictor: new algorithm, a hybrid of GMDH and Neural Net algorithms
- Find Dependencies: identifies a set of the most influential predictors and determines outliers
- Linear Regression: stepwise linear regression that correctly treats categorical and yes/no variables
- Memory Based Reasoning (PolyAnalyst COM): predicts values of the target variable and robustly classifies records into multiple categories
- Discriminate: determines what characteristics of a specified data set distinguish it from the rest of the data

Find Dependencies
[Figure: Find Dependencies results. Labels: all considered variables; predicted target value for a cell; outliers; most influential variables determined]

Cluster
[Figure: Cluster results. Labels: variables providing the best clustering; number of points in a cluster; cluster sequential number; individual clusters]

PolyNet Predictor
- Similar to all other PolyAnalyst algorithms, the best PN model is found as an optimal solution in terms of predicted vs. actual target variable
- The following graphs display the accuracy of PN and LR models developed to predict relative performance of computers from different manufacturers:
  Linear Regression: R^2 = 0.86
  PolyNet Predictor: R^2 = 0.93

Classify
- PolyAnalyst Lift chart illustrates an increase in the response to a campaign based on the discovered model instead of random mailing
- PolyAnalyst Gain chart helps optimize the profit obtained in a direct marketing campaign
[Figure: Lift and Gain charts. Legend: Targeted mailing, Mass mailing]

Linear Regression
- Partial contributions of individual terms in the linear regression formula
- Yes/no variable taken into account correctly

Discriminate algorithm
- Determines what features of a selected data set distinguish it from the rest of the data
- Requires no preset target variable
- Can be powered by Find Laws, PolyNet Predictor, or Linear Regression

Memory-Based Reasoning
- Performs classification to multiple categories
- Is based on identifying similar cases in the previous history
- Implemented only in PolyAnalyst COM (available at the end of March 1999)

Data Access
- PolyAnalyst works with ODBC-compliant databases: Oracle, DB2, Informix, Sybase, MS SQL Server, etc.
- A customized version works with IBM Visual Warehouse Solution and Oracle Express
- Data and exploration results can be exchanged with MS Excel and with CSV or DBF format files
- New data can be added to the project when necessary

Visualization
Data can be displayed in various visual formats:
- Histograms
- Line and point plots with zoom and drill-through capabilities
- Colored charts for three dimensions
- Interactive rule-graphs with sliders help visualize and manipulate multi-variable relations
- Frequencies charts provide a quick and thorough visualization of the distribution of categorical, integer, or yes/no variables
- Lift and Gain charts are very useful in marketing applications

Histograms and Frequencies
- Histogram displays the distribution of numerical variables
- Frequencies chart displays the distribution of categorical and yes/no variables

2D charts and Rule-graphs
- Sliders help visualize the effects of other variables in models with more than two dimensions
- The Find Laws model (red line) for the dependence of a product's market share on its price predicts a dramatic change in the formula when the product goes on promotion

PolyAnalyst platforms
Standalone system:
- PolyAnalyst Power - Windows 95/98/NT
- PolyAnalyst Pro - Windows NT
- PolyAnalyst Lite - Windows 95/98/NT
- PolyAnalyst 2.1 - IBM OS/2
Client/Server system:
- PolyAnalyst Knowledge Server - Windows NT or OS/2
- Client - Windows NT, 95, 98, or OS/2

Sample customer cases

PolyAnalyst supports medical projects at 3M
Timothy Nagle, Consulting Scientist
3M Corporation, St. Paul, MN, USA
“Analytical engines do an excellent job of finding relations amongst many fields without overfitting. I found the user interface both intuitive and easy to use. Megaputer support is outstanding.
The inevitable problems one expects with a complex system are dealt with immediately.”

PolyAnalyst helps improve the flight control system at Boeing
James Farkas, Senior Navigation Engineer
The Boeing Company, Kent, WA, USA
“PolyAnalyst provides quick and easy access for inexperienced users to powerful modeling tools. The user interface is intuitive and new users come up to speed very quickly. Interfaces to spreadsheet tools provide flexibility needed to work solutions as a team.”

PolyAnalyst facilitates marketing research at Indiana University
Raymond Burke, E.W. Kelley Professor of BA
Kelley Business School, Indiana University, Bloomington, IN, USA
“PolyAnalyst provides a unique and powerful set of tools for data mining applications, including promotion response analysis, customer segmentation and profiling, and cross-selling analysis. Unlike neural network programs, PolyAnalyst displays a symbolic representation of the relationship between the independent and dependent variables - a critical advantage for business applications.”

PolyAnalyst helps medical research at the University of Wisconsin-Madison
Prof. Roger L. Brown, Director of RDSU
University of Wisconsin, Madison, WI, USA
“PolyAnalyst suite enabled our researchers to search their data for rules and structure while providing a symbolic knowledge of the structure, the detail they needed. The software has provided very interesting results for one of our projects, which had been presented at a major cardiology meeting.”

PolyAnalyst enjoys international success
Alexander Fomenko, Director, Analytical Dept
Killiney Investments Europe Rep., Moscow, Russia
“PolyAnalyst proves capable of providing models for building reliable trading strategies even for a difficult to predict FOREX market.
PolyAnalyst is a leader in reliability, accuracy, and diversity of automatically built models.”

David McIlroy, Analytical Department
Master Foods, Olen, Belgium
“PolyAnalyst scores extremely well by providing a complete environment in which almost any research worker could data mine his or her own data. It is a very useful product, potentially with a wide user base, and it appears to me to be unique.”

Product Price $$
Custom-build your own PolyAnalyst system!

Product Price $$ (continued)
- COM modules are well suited to building applications
- A Tool Kit can be purchased for each algorithm that is needed

Future developments
- New machine learning algorithms:
  - Memory Based Reasoning
  - Weighted variable Clustering and Classification
- PolyAnalyst COM built on the basis of the Component Object Model - an integrated kit for simple development of decision support applications utilizing advanced PolyAnalyst algorithms (see PCAI Magazine, March 99, p. 16-19)
- Enhanced graphics (Snake and Boxplot charts) and data import and manipulation

PolyAnalyst evaluation
www.megaputer.com
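
Appendix: Memory-Based Reasoning sketch
The slides describe Memory-Based Reasoning as classifying into multiple categories by identifying similar cases in the previous history. The sketch below illustrates that general idea with a plain k-nearest-neighbour vote. It is not PolyAnalyst's implementation: the function name `mbr_classify`, the Euclidean distance, the choice of k, and the sample data are all assumptions made for illustration.

```python
from collections import Counter
from math import dist


def mbr_classify(history, query, k=3):
    """Classify `query` by majority vote among its k most similar past cases.

    history: list of (features, label) pairs; features are numeric tuples.
    Similarity is plain Euclidean distance (an assumption of this sketch).
    """
    # Rank stored cases by distance to the query and keep the k closest.
    nearest = sorted(history, key=lambda case: dist(case[0], query))[:k]
    # Majority vote over the labels of those cases.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical history: (age, income in thousands) -> customer segment
history = [
    ((25, 30), "basic"), ((27, 32), "basic"), ((30, 35), "basic"),
    ((35, 55), "standard"), ((36, 58), "standard"), ((38, 60), "standard"),
    ((45, 80), "premium"), ((48, 85), "premium"), ((50, 90), "premium"),
]

print(mbr_classify(history, (37, 57)))  # prints "standard"
```

In practice the features would usually be rescaled before measuring distance, since a variable with a large numeric range would otherwise dominate the similarity.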