
From: AAAI Technical Report WS-93-02. Compilation copyright © 1993, AAAI (www.aaai.org). All rights reserved.
Opportunity Explorer: Navigating Large Databases Using Knowledge Discovery Templates
Tej Anand & Gary Kahn
A. C. Nielsen
2201 Waukegan Road, Suite S-200
Bannockburn, IL 60015
tanand@nis.naitc.com, gkahn@nis.naitc.com
Introduction
In July 1991, A. C. Nielsen released Nielsen Spotlight™, the first product with an embedded knowledge-based system for explaining the data in large point-of-sale databases. Spotlight [1, 2] is a software system that extracts from very large databases meaningful information about exceptional events, explanations for these events, and correlations among events. It then presents the user with a series of formatted hard-copy reports. To generate these reports the system uses a pre-defined knowledge base, a context set up by the user, and a predefined format for the reports. Spotlight reports highlight products and markets with exceptional change in share, reasons for the share change, and trends across various product segments. These reports provide the user with a comprehensive answer to the question "What is happening in the marketplace and why?"
Spotlight filled a need in the marketplace for an efficient, batch, top-line report generation system. Earlier attempts to address this need by systems such as CoverStory [3] were not as widely accepted. Spotlight was innovative in its use of an expert system shell which allowed for deployment of knowledge bases on end-user computers, a distributed approach to knowledge discovery, and the notion of an abstract markup language [2].
Spotlight has been successfully used by sales representatives and brand managers of major consumer packaged goods manufacturers to obtain a top-level look at the state of their business. A large number of organizations have rolled the product out to their entire sales force, who use the system to prepare themselves before calling on buyers in retail organizations. Smaller organizations tend to use Spotlight as a headquarters tool. Sales analysts prepare reports using Spotlight and then send them out to sales representatives in the field.
There were three technical limitations with the Spotlight product:
1. It represented an instance of an application for knowledge discovery and did not put in place tools or an architecture that could be used to develop other knowledge discovery applications.
2. The reports created for the user were not interactive. Thus the user had to manually navigate through the information created, and the system did not lead the user towards seeking either related or more granular information.
3. All the parameters that could be used to customize the reports had to be known a priori. For example, it was not possible for the user to create a completely new table.
One of the reasons for the success of Spotlight was that it used simple tracking measures such as "Volume", "Price", and "Distribution" to explain, in a language easily understood by the sales representatives, the reasons for change in share. However, the use of simple tracking measures alone was also a limitation because it prevented the user from performing more sophisticated analyses such as identifying selling strategies.
In this paper we describe the Nielsen Opportunity Explorer™ product that was released to the marketplace in March 1993. Opportunity Explorer generates interactive reports using knowledge discovery templates, converting a large space of data into concise, inter-linked information frames. Each information frame addresses specific business issues, and leads the user to seek related information by means of dynamically created hyperlinks.
AAAI-93
Knowledge Discovery in Databases Workshop 1993
In terms of domain knowledge, Opportunity Explorer, in addition to the tracking measures used in Spotlight, also makes use of several modeled measures that help the user quantify the impact of promotions and thus identify selling strategies. Opportunity Explorer information frames provide the user with a comprehensive answer to the question "What is happening in the marketplace, why, and how can the share of a product be changed?"
Opportunity Explorer has led to the articulation of a coherent architecture for developing knowledge discovery applications, with a large number of reusable libraries with standardized functionalities and simple, well-understood application programming interfaces (APIs). Opportunity Explorer does not allow the user to create completely new reports, but it has made substantial progress towards that goal by creating high-level output objects and abstract queries to specify information frames. A developer defines a knowledge discovery template by specifying instances of these output objects and abstract queries. This has made it relatively easy for developers with programming experience to define new instances of knowledge discovery applications.

In this paper we first describe the notion of information frames and the components of the architecture for delivering information frames to the user. We then describe the mechanics of generating information frames from knowledge discovery templates. The paper ends with a discussion of work in progress.
Information Frames
With the availability of graphical tools and client-server databases it has become possible for users to browse and query data from large databases irrespective of where they are located, either on remote data servers, LANs or local workstations. However, the user needs to have in-depth knowledge of the semantic content of the database to relate the data to the task that is to be completed. An information frame, on the other hand, provides a user with direct access to the semantic contents of the database, relating the data to the user's task. The information frame is a concise packaging of analyzed data that directly provides an answer to a specific business issue. In this sense, a knowledge discovery application has the following properties:
• it is a series of information frames, each addressing a specific business issue
• it consists of navigational tools that link the information frames in a manner that represents a model or workflow of the task the user is involved in
• it allows the user to perform standard operations on an information frame which are independent of the information content
An information frame is an abstract concept consisting of a number of textual, tabular, and graphical views. The standard operations on an information frame include the following:
1. Display an alternate view, i.e. display either the tabular or graphical views from an active textual view
2. Print the current active view
3. Create or view related information frames using hyperlinks
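The three standard operations can be illustrated with a small data structure. The following is only a sketch under our own assumptions: the class name `InformationFrame`, its fields, and its method names are hypothetical and are not taken from the Opportunity Explorer libraries.

```python
from dataclasses import dataclass, field


@dataclass
class InformationFrame:
    """Hypothetical model of an information frame: named views plus hyperlinks."""
    title: str
    views: dict[str, str]                                  # view kind -> rendered content
    links: dict[str, str] = field(default_factory=dict)    # link label -> target frame title
    active: str = "text"                                   # the textual view is active first

    def show(self, kind: str) -> str:
        """Operation 1: switch to an alternate view and return its content."""
        if kind not in self.views:
            raise KeyError(f"frame {self.title!r} has no {kind!r} view")
        self.active = kind
        return self.views[kind]

    def print_active(self) -> str:
        """Operation 2: return the currently active view as printable text."""
        return f"{self.title} [{self.active}]\n{self.views[self.active]}"

    def follow(self, label: str, frames: dict[str, "InformationFrame"]) -> "InformationFrame":
        """Operation 3: resolve a hyperlink label to the related frame."""
        return frames[self.links[label]]
```

Because the operations depend only on the view kinds and link labels, the same three methods apply to any frame regardless of its information content, which is the property the text asks of standard operations.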
The Opportunity Explorer application consists of information frames that help the sales representative of a consumer packaged goods company sell more product, more profitably, to the retailer. This is accomplished by preparing a presentation that highlights the advantages for the retailer if the retailer stocks additional products or participates in additional promotions.
A sales representative sells to the retailer products belonging to various product categories such as Coffee, Carbonated Beverages, Detergents, etc. As a first step, he needs to determine how his products are performing in each product category. The "Cross Category Analysis" information frame in Opportunity Explorer highlights for the sales representative the contribution each category is making towards the total business in a particular retailer.
[Figure 1 (screenshot): the Nielsen Advisor Opportunity Explorer window "Category RoadMap: text1", with callouts marking the standard operations and a hyperlink. The text view reads as follows.]

Category Roadmap
Combined Orange Juice - Florida Morning - Schnuck's - St. Louis
Dollars, 13 Weeks, September 5, 1992

• Florida Morning share in Schnuck's - St. Louis was 3.7, down 2.6 points vs. last year. Florida Morning share in St. Louis $2MM also decreased, down 2.3 points to 5.1. Florida Morning volume in Schnuck's - St. Louis was $76,084, down 44.6% vs. last year. Combined Orange Juice declined also, down 6.5% to $2,039.7M.
  Volume Change Explanation: Incremental Volume was down $56,971 to $15,181 in the selected period. This decrease contributed 93.0% to the total Florida Morning volume loss.
  Incremental Volume Explanation: Total Merchandising Support for Florida Morning decreased (-3.2 weeks to 5.0). In addition, "quality" (non-TPR) support also decreased (-0.3 weeks to 0.0). Florida Morning Deal Price was $1.78 per unit. This corresponds to a 36.4% promotional price discount from its everyday price. This price discount was smaller than a year ago when Florida Morning received a 45.2% promotional price discount ($1.28 deal price).
• Sunshine Valley gained the most share, up 5.5 points to 33.2. Its volume increased 11.8% to $677.1M.
  Volume Change Explanation: Sunshine Valley volume increase was primarily due to an increase in incremental volume of $88,225 to $322.0M.

Figure 1: The Category Roadmap Information Frame
After examining this information frame the sales representative can decide to pursue further analysis in a particular category, for example a category in which the market share of his product is declining. The sales representative would like to know what factors are leading to this decrease in share. This information is presented to the user in the "Category Roadmap" information frame. This information frame provides the sales representative with a detailed look at the performance of his products and his competitors, including explanations in terms of merchandising, pricing, and distribution factors. Each factor that constitutes an explanation contains a hyperlink to an information frame that provides further details about that factor. The Category Roadmap also contains hyperlinks whereby the user can "drill down" to seek more granular information about a product. For example, a brand might consist of three sizes; instead of looking at aggregated data for a brand, the user might want to look at data for the individual sizes that constitute that brand. Taken together, these frames help the user in understanding what is happening in the marketplace.
This understanding helps the sales representative seek frames that recommend specific actions the user could suggest to the retailer in terms of alternative merchandising promotions or in terms of distributing new products. For example, the system might determine that if the retailer were to take a week of promotion away from product "A" and instead promote product "B" (which is the product the sales representative is trying to sell), additional profits could accrue to the retailer and the sales representative would increase the share of his product.
By printing selected views the sales representative can create a presentation for the retailer. In this manner the information frames within Opportunity Explorer map the data in large point-of-sale databases to the task a sales representative has.
Figure 1 depicts the text view of a Category Roadmap information frame for a fictitious product and its competitors in the Combined Orange Juice product category.
Generating Information Frames
Figure 2 provides a conceptual framework for the development of knowledge discovery applications such as Opportunity Explorer. There are three major components of this architecture:
1. Database Access Methodology: This component provides the application access to data in a database-independent manner, i.e. it hides from the application the location of the database (remote server, LAN or local workstation) as well as the type and structure of the database. The application should use the data access methodology to take advantage of any inherent structure in the database, for example product and market hierarchies, query optimization, and built-in functions for ranking, sorting, aggregating, and performing mathematical computations.
Opportunity Explorer used the Nielsen Workstation™ [4], a product developed at Nielsen for ad hoc data retrieval. The Nielsen Workstation is used as the database access methodology across all current and planned knowledge discovery applications at Nielsen. In addition to providing the features of a database access methodology mentioned above, the Nielsen Workstation automatically distributes the computation with the remote server if the remote server supports local computations. It also allows the application to query data from multiple databases.
[Figure 2 (diagram). Labels: Knowledge-Based Module, Abstract Query Processor, Expert System Analyses, Information Frame Generator, Knowledge Discovery Template, Database Access Methodology, Nielsen Workstation, User Interface, Selection, Presentation, Navigation.]
Figure 2: Architecture for generating information frames
2. The User Interface: This component consists of three modules. The selection module helps the user in defining a context for the analysis. The user defines domain-dependent terms that will guide the generation of information frames. For example, in order to generate the Category Roadmap shown in Figure 1, the user needs to define domain-dependent terms such as the target product (the product that the sales representative wants to analyze), the target market (the retailer in which the sales representative is interested), the share-basis (definition of the product category), competitors (a method for selecting the products that are considered competitors of the target product), etc.

The presentation module consists of a set of viewers for text, tables and graphs. These viewers support variable fonts, specialized scrolling, and all the standard operations [see Figure 1].

The navigation module provides a user the capability to view or create information frames that are related to the active information frame. This can be done either by means of hyperlinks on the active information frame or by means of a list maintained by the system of the information frames that have been created. The navigation module interprets the hyperlinks that are embedded in the contents of the frame.

The user interface has been developed as a set of generic class libraries that can be re-used by other applications such as Opportunity Explorer.
3. Knowledge-Based Module: This component has the functionality for creating information frames, based on the context supplied by the user (from the selection module) and the information frame definition provided by the knowledge discovery template. The knowledge discovery template consists of three sections: abstract data queries, analyses to be performed, and specifications of output objects to be displayed in the information frame. The abstract data queries in conjunction with the user selections drive the abstract query processor to extract data needed for generating the information frame. This module typically has application-specific functionality encoded in it and has to be developed for every application.

The information frame generator has generic knowledge about the processing of conditional objects, such as bullets, paragraphs, tables and graphs. We refer to these objects as conditional objects because the system checks for pre-conditions, such as the existence of certain types of data or the existence of certain types of analytic results, before generating the object. Conditions can be attached to output objects in the knowledge discovery template. This module includes a library of output objects, conditions, and a vocabulary which can be extended for other applications.
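The idea of conditional objects can be sketched in a few lines. This is an illustration under our own assumptions, not the generator's actual design: `ConditionalObject`, `has`, and `generate_frame` are hypothetical names, and facts are modeled as a plain dictionary.

```python
from dataclasses import dataclass
from typing import Callable

Facts = dict[str, object]  # retrieved data and analytic results, keyed by name


@dataclass
class ConditionalObject:
    kind: str                             # "bullet", "paragraph", "table", or "graph"
    condition: Callable[[Facts], bool]    # pre-condition checked before generation
    render: Callable[[Facts], str]        # produces the object's content


def has(*names: str) -> Callable[[Facts], bool]:
    """Library condition: every named fact exists."""
    return lambda facts: all(name in facts for name in names)


def generate_frame(objects: list[ConditionalObject], facts: Facts) -> list[str]:
    """Emit only the objects whose pre-conditions hold, in template order."""
    return [obj.render(facts) for obj in objects if obj.condition(facts)]
```

Keeping conditions as separate, reusable functions mirrors the described library of conditions: a new application can add its own without changing the generator loop.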
Knowledge Discovery Templates
A knowledge discovery template specifies the information content that will be present in the information frame. To generate this information the system needs to know what data is available, what analyses can be performed or need to be performed, and the format for presenting the information. The knowledge discovery template is a text file created by the developer. It is loaded at runtime and interpreted by the rule sets within the knowledge-based module. There are three sections within a knowledge discovery template. Figure 3 shows how a knowledge discovery template can be used to generate information frames. Rules interpret keywords in queries to extract data from the database. These are asserted as facts; the analysis rules forward-chain on these facts to create analytic objects which are application specific. A keyword-based vocabulary is used within the knowledge discovery template that specifies how the data objects and analytic objects are to be used within bullets, paragraphs, tables and graphs.
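Since the template is a text file with three sections that is loaded at runtime, a loader might look like the sketch below. The file syntax is not given in the paper, so the bracketed section headers, the entry wording, and the function name `parse_template` are all assumptions made for illustration.

```python
def parse_template(text: str) -> dict[str, list[str]]:
    """Split a template into sections; a '[name]' line starts a new section."""
    sections: dict[str, list[str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].lower()
            sections[current] = []
        elif current is None:
            raise ValueError(f"entry outside any section: {line!r}")
        else:
            sections[current].append(line)
    return sections


# Hypothetical template text; the real keyword vocabulary is not documented here.
TEMPLATE = """\
[queries]
fetch causal data for all competitors in the target market
[analyses]
causal volume model
[output]
bullet target volume explanation if explanation found
"""
```

Each section would then be handed to a different stage: queries to the abstract query processor, analyses to the analysis rules, and output specifications to the information frame generator.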
Abstract Queries
These are instances of queries that use domain concepts to extract data. For example, for the Category Roadmap information frame shown in Figure 1, we need to extract causal data relating to change in share for all competitors. This constitutes an abstract query. The system will use the definition of competitors and causal as specified by the user to extract the data. For example, competitors could be defined as products that make up 80% of the total product category volume in the market. Similarly, causal could be defined as in-store trade promotion activities, distribution, and pricing. Thus an abstract query such as "fetch causal data for all competitors in the target market" will be contained in the knowledge discovery template that specifies the Category Roadmap information frame. When the user requests this information frame the data will be automatically retrieved by the system.
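To make the expansion of an abstract query concrete, the sketch below resolves "competitors" with the 80% rule and "causal" with a user-supplied list of measures. It reflects one reading of that rule (take the largest products until their cumulative volume reaches the coverage threshold, then exclude the target product); the function names and the tuple form of a concrete request are hypothetical.

```python
def select_competitors(volumes: dict[str, float], target: str, coverage: float = 0.80) -> list[str]:
    """Largest products whose cumulative volume reaches `coverage` of the category, minus the target."""
    total = sum(volumes.values())
    chosen: list[str] = []
    running = 0.0
    for product, volume in sorted(volumes.items(), key=lambda kv: kv[1], reverse=True):
        if running >= coverage * total:
            break  # the products chosen so far already cover enough volume
        chosen.append(product)
        running += volume
    return [p for p in chosen if p != target]


def expand_query(volumes: dict[str, float], target: str, market: str,
                 causal: list[str]) -> list[tuple[str, str, str]]:
    """Turn 'fetch causal data for all competitors in the target market' into concrete requests."""
    return [(measure, product, market)
            for product in select_competitors(volumes, target)
            for measure in causal]
```

Changing the user's definition of competitors or of causal measures changes the concrete requests without touching the template, which is the point of keeping the query abstract.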
Types of Analyses
The knowledge discovery template gives the user the ability to specify types of analyses to be performed. For example, Opportunity Explorer includes a causal model for change in volume based on trade promotions, pricing and distribution factors. This model is interpreted using rules, but need not be used in all information frames. The knowledge discovery template specifies whether this analysis should be used or not. For the Category Roadmap shown in Figure 1, the knowledge discovery template will specify a causal model in terms of cause-and-effect relationships, for example, when distribution increases the volume of the product increases. After the data has been retrieved by the system, the rules interpret the causal relationships and determine explanations for the change in volume. Opportunity Explorer also includes rules for determining opportunities for a product in a market, and for determining the leading and declining products for various measures.
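A minimal sketch of forward chaining over cause-and-effect rules is shown below. Only the distribution rule comes from the text; the other rules, the `(measure, direction)` fact encoding, and the function name are assumptions for illustration and do not reproduce the product's rule base.

```python
Fact = tuple[str, str]  # (measure, direction), e.g. ("distribution", "up")

# Hypothetical cause -> effect rules for the volume model.
RULES: list[tuple[Fact, Fact]] = [
    (("distribution", "up"), ("volume", "up")),
    (("distribution", "down"), ("volume", "down")),
    (("promotion weeks", "down"), ("incremental volume", "down")),
    (("incremental volume", "down"), ("volume", "down")),
]


def forward_chain(facts: set[Fact]) -> dict[Fact, list[Fact]]:
    """Apply rules until nothing new is derived; record which causes support each effect."""
    known = set(facts)
    support: dict[Fact, list[Fact]] = {}
    changed = True
    while changed:
        changed = False
        for cause, effect in RULES:
            if cause in known and cause not in support.get(effect, []):
                support.setdefault(effect, []).append(cause)
                known.add(effect)
                changed = True
    return support
```

The causes recorded under a volume fact form a chain that can be read back as an explanation, for instance fewer promotion weeks leading to lower incremental volume and hence lower total volume.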
[Figure 3 (diagram). Labels: Knowledge Discovery Template, Queries, Database Access, Analysis Rules, Generation Rules, Information Frames.]
Figure 3: Using Knowledge Discovery Templates
Output Objects
Within a knowledge discovery template, output objects such as bullets, tables and graphs are specified depending on the nature of the information the frame is trying to present. A bullet is composed of sentences, a sentence is composed of phrases, and phrases are made up of individual words. Similarly, a table is made up of rows, rows are made up of columns, and columns are made up of individual words. Graphs consist of data groups, data groups consist of elements, and elements consist of words. Columns have additional attributes such as width, justification, etc. Graphs have graph types such as bar, pie, etc. associated with them. Depending on the graph type the viewer interprets the data groups. Output objects are conditional, i.e. depending on the result of a specific analysis a different graph type could be selected.
For example, the knowledge discovery template for the Category Roadmap shown in Figure 1 will contain the specification of a bullet for the target product volume change explanations. This bullet will only be generated if the system has been able to find an explanation for the change in volume of the target product. This bullet will be specified in terms of sentences that describe various factors that might constitute an explanation.
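The composition of a table output object (rows of columns of words, with per-column width and justification) can be sketched as follows. The class `Column`, its attribute names, and the plain-text rendering are assumptions for illustration; the actual viewers are not described at this level of detail.

```python
from dataclasses import dataclass


@dataclass
class Column:
    width: int
    justify: str = "left"   # "left" or "right"

    def cell(self, words: list[str]) -> str:
        """Join the words of one cell, truncate to the column width, and pad."""
        text = " ".join(words)[: self.width]
        return text.rjust(self.width) if self.justify == "right" else text.ljust(self.width)


def render_table(columns: list[Column], rows: list[list[list[str]]]) -> str:
    """Each row holds one list of words per column."""
    return "\n".join(
        " ".join(col.cell(words) for col, words in zip(columns, row)).rstrip()
        for row in rows
    )
```

For instance, a two-column table with a left-justified product name and a right-justified share value keeps the numbers aligned on their last digit, which is the kind of layout decision the column attributes are meant to carry.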
Discussion
This paper represents a statement of work in progress. While Spotlight was the first instance of ideas concerning knowledge discovery, Opportunity Explorer is the first realization of ideas concerning an architecture for knowledge discovery. This is an evolving architecture in that it will grow to incorporate new developments in the areas of object-oriented design, database management systems, and multimedia. It also needs to incorporate analysis models that are based on statistical regression, and others that reason from cases. The analysis models that are included depend exclusively on the task an application is assisting with.
Opportunity Explorer was released to the marketplace in March 1993, and early indications are that it will meet with tremendous acceptance from the customer base.
Acknowledgments
A project with a scope as wide as Opportunity Explorer could never have been completed without the creative participation of all members of the joint marketing and development team. We would like to specifically acknowledge the tremendous contributions of Mark Ahrens, Glenn Hicks, Tony Rochon, Paul Terry, Patrick A~Arello and Joe Cannon in the conception and development of all the ideas embodied in this paper.
References
1. Anand, T. and Kahn, G. 1992. SPOTLIGHT: A Data-Explanation System. In Proceedings of the Eighth IEEE Conference on AI for Applications, 2-8. Washington, D.C.: IEEE Computer Society.
2. Anand, T. and Kahn, G. 1992. Making Sense of Gigabytes: A System for Knowledge-Based Market Analysis. In Innovative Applications of Artificial Intelligence 4, 57-69. Menlo Park, Calif.: AAAI Press.
3. Schmitz, J., Armstrong, G. and Little, J. D. C. 1990. CoverStory - Automated News Finding in Marketing. In DSS Transactions, Linda Volino (Ed.), 46-54. Providence, R.I.: The Institute of Management Sciences.
4. Nielsen Inf*Act Workstation Reference Guide. 1993. A. C. Nielsen, Northbrook, Illinois.