PolyAnalyst

Megaputer Intelligence
March 27, 2000
Artificial Intelligence Laboratory
Choi Yoon-jung, master's program, 2nd semester
(cris@ai.ewha.ac.kr)
Outline
Overview
Technology
PolyAnalyst solution overview
Customer cases
Future developments
Overview
Megaputers…
1989: AI laboratory at Moscow State University
1994: PolyAnalyst 1.0 developed
2000:
• Knowledge Discovery
• Based on semantic information retrieval and analysis
Technology
• Subject-Oriented analytical systems
• Statistical packages
• Neural Networks
• Evolutionary Programming
• Memory Based Reasoning (MBR)
• Decision Trees
• Genetic Algorithms
Product
Provides tools for data mining and knowledge discovery, and tools for
semantic text analysis and information retrieval
PolyAnalyst 4.0
PolyAnalyst COM
TextAnalyst 1.5
TextAnalyst COM
MegaSearch™
PolyAnalyst overview
Features in more detail
• Multi-strategy data mining suite
• Utilizes the latest achievements in knowledge discovery
• Offers a broad selection of exploration engines
• Powerful data manipulation and visualization tools
• Modeling
• Classifying
• Predicting
• Explaining
• Clustering
PolyAnalyst workplace
Multiple machine learning algorithms can be accessed through
pull-down and pop-up menus, or control buttons
The project data, charts, discovered rules, and system reports are
represented by icons held in separate containers
Learning algorithms
Classify: Assigns cases to two different classes by utilizing Fuzzy Logic
Find Laws: Finds an explicit model of the relation for predicting the target variable
Cluster: Separates groups of similar records and finds the best clustering variables
PolyNet Predictor: Predicts values of the target variable - a hybrid of GMDH and Neural Net algorithms
Find Dependencies: Identifies a set of the most influential predictors and determines outliers
Linear Regression: Stepwise linear regression - correctly treats categorical and yes/no variables
Memory Based Reasoning (PolyAnalyst COM): New algorithm; robustly classifies records into multiple categories
Discriminate: Determines what characteristics of a specified data set distinguish it from the rest of the data
Find Dependencies
[Screenshot labels: all considered variables; predicted target value for a cell; outliers; most influential variables determined]
Cluster
[Screenshot labels: variables providing the best clustering; number of points in a cluster; cluster sequential number; individual clusters]
PolyNet Predictor
Similar to all other PolyAnalyst algorithms, the best PolyNet Predictor (PN) model is
found as an optimal solution in terms of
Predicted vs. Actual target variable
The following graphs display the accuracy of PN and LR models developed to
predict relative performance of computers from different manufacturers:
Linear Regression
R^2 = 0.86
PolyNet Predictor
R^2 = 0.93
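The R^2 figures quoted above compare predicted and actual target values. A minimal sketch of that comparison follows, assuming the slide uses the usual coefficient of determination (1 minus the ratio of residual to total sum of squares). The function name and the sample numbers are hypothetical and are not taken from the computer-performance data set.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equally long")
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: how far the predictions are from the truth.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: how far the truth is from its own mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical values, for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
print(round(r_squared(actual, predicted), 2))
```

A value closer to 1 means the predicted points lie closer to the diagonal of a Predicted vs. Actual plot.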
Classify
The PolyAnalyst Lift chart illustrates the increase in response when a
campaign is targeted using the discovered model instead of random mailing
The PolyAnalyst Gain chart helps optimize the profit obtained in a
direct marketing campaign
Chart legend (both charts): Targeted mailing vs. Mass mailing
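The quantities behind lift and gain charts can be sketched in a few lines. This is only an illustration under stated assumptions: customers are ranked by a model score, lift is the targeted response rate divided by the mass-mailing response rate, and profit uses a flat revenue per response and cost per letter. All function names, scores, and prices are hypothetical and do not come from PolyAnalyst.

```python
def responders_in_top(scores, responded, n_mailed):
    """Count responders among the n_mailed highest-scored customers."""
    ranked = sorted(zip(scores, responded), key=lambda pair: pair[0], reverse=True)
    return sum(flag for _, flag in ranked[:n_mailed])


def lift(scores, responded, n_mailed):
    """Response rate of targeted mailing divided by that of mass mailing."""
    targeted_rate = responders_in_top(scores, responded, n_mailed) / n_mailed
    mass_rate = sum(responded) / len(responded)
    return targeted_rate / mass_rate


def profit(scores, responded, n_mailed, revenue_per_response, cost_per_letter):
    """Campaign profit when only the top n_mailed customers receive a letter."""
    hits = responders_in_top(scores, responded, n_mailed)
    return hits * revenue_per_response - n_mailed * cost_per_letter


# Hypothetical model scores and observed responses (1 = responded).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
responded = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0]

print(lift(scores, responded, 2))
# Pick the mailing size with the highest profit (hypothetical prices: 10 and 3).
best_n = max(range(1, len(scores) + 1),
             key=lambda n: profit(scores, responded, n, 10, 3))
print(best_n, profit(scores, responded, best_n, 10, 3))
```

Mailing everyone corresponds to the "Mass mailing" curve. Stopping at the most profitable mailing size is the kind of decision a gain chart is meant to support.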
Linear Regression
Partial contributions
of individual terms in
the linear regression
formula
Yes/no variable taken
into account correctly
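The idea of partial contributions can be sketched as follows, assuming a fitted linear formula in which a yes/no variable is coded as 1 or 0 and each term contributes its coefficient times its value. The coefficients, variable names, and record below are hypothetical. How PolyAnalyst itself encodes yes/no variables is not stated in the slides.

```python
def partial_contributions(intercept, coefficients, record):
    """Return the contribution of each term of a linear formula for one record."""
    terms = {"intercept": intercept}
    for name, coefficient in coefficients.items():
        value = record[name]
        # Assumption: a yes/no variable enters the formula as 1.0 (yes) or 0.0 (no).
        if isinstance(value, bool):
            value = 1.0 if value else 0.0
        terms[name] = coefficient * value
    return terms


# Hypothetical fitted formula: y = 2.0 - 0.5 * price + 3.0 * on_promotion
coefficients = {"price": -0.5, "on_promotion": 3.0}
record = {"price": 4.0, "on_promotion": True}

terms = partial_contributions(2.0, coefficients, record)
print(terms)
print(sum(terms.values()))  # the prediction is the sum of all contributions
```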
Discriminate algorithm
• Determines what features of a selected data set distinguish it
from the rest of the data
• Requires no preset target variable
• Can be powered by:
- Find Laws
- PolyNet Predictor
- Linear Regression
Memory-Based Reasoning
Performs classification into multiple categories
Is based on identifying similar cases in the previous history
Implemented only in PolyAnalyst COM (available at the end of
March 1999)
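Classifying by similar past cases can be illustrated with a plain k-nearest-neighbour vote. This is a generic sketch of that technique, not the PolyAnalyst COM implementation, whose distance measure and voting rule are not described in the slides. Euclidean distance, k = 3, and the sample history are assumptions.

```python
import math
from collections import Counter


def classify_by_similar_cases(history, query, k=3):
    """Label a query by majority vote among its k most similar past cases.

    history is a list of (features, label) pairs; features are numeric tuples.
    """
    # Rank past cases by Euclidean distance to the query (an assumed similarity).
    nearest = sorted(history, key=lambda case: math.dist(case[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical history with three categories.
history = [
    ((1.0, 1.0), "low"), ((1.2, 0.8), "low"),
    ((5.0, 5.0), "high"), ((5.5, 4.5), "high"),
    ((9.0, 1.0), "medium"), ((8.5, 1.5), "medium"),
]

print(classify_by_similar_cases(history, (5.2, 4.8)))
```

Because the vote is taken over labels, the same code handles any number of categories without a separate model per class.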
Data Access
• PolyAnalyst works with ODBC-compliant databases:
Oracle, DB2, Informix, Sybase, MS SQL Server, etc.
• A customized version works with IBM Visual
Warehouse Solution and Oracle Express
• Data and exploration results can be exchanged with:
- MS Excel
- CSV or DBF format files
• New data can be added to the project when
necessary
Visualization
Data can be displayed in various visual formats:
• Histograms
• Line and point plots with zoom and drill-through capabilities
• Colored charts for three dimensions
• Interactive rule-graphs with sliders help visualize and
manipulate multi-variable relations
• Frequencies charts provide a quick and thorough
visualization of the distribution of categorical, integer, or
yes/no variables
• Lift and Gain charts are very useful in marketing applications
Histograms and Frequencies
Histogram displays distribution
of numerical variables
Frequencies chart displays
distribution of categorical and
yes/no variables
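The difference between the two chart types can be sketched in code: a histogram counts numerical values per equal-width bin, while a frequencies chart counts occurrences of each category. The functions and sample data below are hypothetical, and the equal-width binning is an assumption rather than a description of how PolyAnalyst chooses its bins.

```python
from collections import Counter


def histogram(values, n_bins):
    """Count numerical values in n_bins equal-width bins spanning min..max."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("all values are equal; cannot form bins")
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for value in values:
        # The maximum value falls on the upper edge, so clamp it into the last bin.
        index = min(int((value - low) / width), n_bins - 1)
        counts[index] += 1
    return counts


def frequencies(values):
    """Count occurrences of each categorical or yes/no value."""
    return Counter(values)


# Hypothetical data, for illustration only.
print(histogram([1, 2, 2, 3, 5, 8, 9, 10], 3))
print(frequencies(["yes", "no", "yes", "yes"]))
```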
2D charts and Rule-graphs
Sliders help visualize effects of
other variables in more than
two-dimensional models
The Find Laws model (red line) of a
product's market share as a function
of price predicts a dramatic
change in the formula when the
product goes on promotion
PolyAnalyst platforms
Standalone system:
PolyAnalyst Power - Windows 95/98/NT
PolyAnalyst Pro - Windows NT
PolyAnalyst Lite - Windows 95/98/NT
PolyAnalyst 2.1 - IBM OS/2
Client/Server system:
PolyAnalyst Knowledge Server - Windows NT or OS/2
Client - Windows NT, 95, 98, or OS/2
Sample customer cases
PolyAnalyst supports medical
projects at 3M
Timothy Nagle
Consulting Scientist
3M Corporation
St. Paul, MN, USA
“Analytical engines do an
excellent job of finding
relations amongst many fields
without overfitting. I found the
user interface both intuitive and
easy to use.
Megaputer support is
outstanding. The inevitable
problems one expects with a
complex system are dealt with
immediately.”
PolyAnalyst helps improve the flight
control system at Boeing
James Farkas
Senior Navigation Engineer
The Boeing Company
Kent, WA, USA
“PolyAnalyst provides quick
and easy access for
inexperienced users to
powerful modeling tools.
The user interface is
intuitive and new users
come up to speed very
quickly. Interfaces to
spreadsheet tools provide
flexibility needed to work
solutions as a team.”
PolyAnalyst facilitates marketing
research at Indiana University
Raymond Burke
E.W. Kelley Professor of BA
Kelley Business School
Indiana University
Bloomington, IN, USA
“PolyAnalyst provides a unique
and powerful set of tools for data
mining applications, including
promotion response analysis,
customer segmentation and
profiling, and cross-selling
analysis.
Unlike neural network programs,
PolyAnalyst displays a symbolic
representation of the relationship
between the independent and
dependent variables - a critical
advantage for business
applications.”
PolyAnalyst helps medical research at
the University of Wisconsin-Madison
Prof. Roger L. Brown
Director of RDSU
University of Wisconsin
Madison, WI, USA
“PolyAnalyst suite enabled
our researchers to search
their data for rules and
structure while providing a
symbolic knowledge of the
structure, the detail they
needed.
The software has provided
very interesting results for
one of our projects, which
had been presented at a
major cardiology meeting.”
PolyAnalyst enjoys international success
Alexander Fomenko
Director Analytical Dept
Killiney Investments
Europe Rep.
Moscow, Russia
“PolyAnalyst proves capable of
providing models for building
reliable trading strategies even
for a difficult to predict FOREX
market. PolyAnalyst is a leader
in reliability, accuracy, and
diversity of automatically built
models.”
David McIlroy
Analytical Department
Master Foods
Olen, Belgium
“PolyAnalyst scores extremely
well by providing a complete
environment in which almost
any research worker could data
mine his or her own data. It is a
very useful product, potentially
with a wide user base, and it
appears to me to be unique.”
Product Price $$
Custom-build your own PolyAnalyst system!
- The COM module is suited to building applications
- A Tool Kit can be purchased for each algorithm that is needed
Future developments
New machine learning algorithms:
- Memory Based Reasoning
- Weighted variable Clustering and Classification
PolyAnalyst COM, built on the basis of the
Component Object Model: an integrated
kit for simple development of decision support
applications utilizing advanced PolyAnalyst
algorithms (see PCAI Magazine, March 99, p. 16-19)
Enhanced graphics (Snake and Boxplot charts)
and data import and manipulation
PolyAnalyst evaluation
www.megaputer.com
Download