Data Mining and Visualisation Infrastructure

• Aimed to address the overlap between visualisation
and data mining and the data, and the infrastructural
– found this difficult after we lost our data-mining expert
(Chris) … particularly given our first question (just after he
left) was:
• What is the difference between data mining and visualisation?
NESC Scientific Data Mining, Integration and
Visualisation Breakout Group Feedback (25 Oct 02)
What is the difference between data mining and visualisation?
• data mining is machine oriented, visualization is human oriented
• data mining is number crunching on the assumption that some
model can be used; visualization is more exploratory data
analysis. We also use visualization to validate the model
• visualization is presentation with added value?
– vis is useful for generating hypotheses, and for testing them
• Considerable overlap in functionality (from the data-flow point
of view), but what about overlap in implementation,
What standards exist?
• Why? To allow communication between processes.
– (How to plug data into data-mining -> visualisation chains?)
• Answer? None!
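To make the "plugging" question concrete, here is a minimal Python sketch of what an agreed exchange format between a mining step and a visualisation step could look like. Everything in it is hypothetical and for illustration only: `Table`, `mine_outliers`, `render_bars` and `run_chain` are invented names, not part of any existing standard or tool.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Table:
    """Hypothetical shared exchange format: named numeric columns plus metadata."""
    columns: Dict[str, List[float]]
    metadata: Dict[str, str] = field(default_factory=dict)


# A stage is any function that takes a Table and returns a Table.
Stage = Callable[[Table], Table]


def mine_outliers(table: Table) -> Table:
    """Toy mining stage: flag values more than 2 standard deviations from the mean."""
    values = table.columns["value"]
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    flags = [1.0 if std > 0 and abs(v - mean) > 2 * std else 0.0 for v in values]
    return Table(
        columns={**table.columns, "outlier": flags},
        metadata={**table.metadata, "mined_by": "mine_outliers"},
    )


def run_chain(table: Table, stages: List[Stage]) -> Table:
    """Pass the table through each stage in order."""
    for stage in stages:
        table = stage(table)
    return table


def render_bars(table: Table) -> str:
    """Toy visualisation stage: one text bar per value, outliers marked with '*'."""
    lines = []
    for value, flag in zip(table.columns["value"], table.columns["outlier"]):
        marker = " *" if flag else ""
        lines.append("#" * int(round(value)) + marker)
    return "\n".join(lines)


if __name__ == "__main__":
    data = Table({"value": [1.0] * 9 + [10.0]})
    print(render_bars(run_chain(data, [mine_outliers])))
```

The point of the sketch is only that both stages depend on the shared `Table` shape rather than on each other, which is the role a real standard would have to play.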
What are people currently doing?
• For data discovery: DAME and NDG are both using Z39.50
now and planning to use OGSA/DAI …
– Z39.50 provides a protocol for indexing databases for distributed
– still not obvious what OGSA/DAI will give us beyond Z39.50 and
that’s here now …
– … but Z39.50 doesn’t do more than use HTML hyperlinks to hand
over from discovery to data (it’s mostly used for indexing discovery
information, not the information/data itself!)
– The Open-Archive project has an alternative approach to either, which
involves harvesting metadata, but might share the same hand-off
problem? (In any case need to talk more with library community).
What next? The challenges!
• Standards, standards, standards!
• At the moment those who are not in the “in-crowd”
can ask the following question: “how do I use these
tools (which tools?) with my data?”
• Can we describe what a particular data mining tool
can do? Can we describe what inputs it requires? Can
we describe its outputs? Can any of these
descriptions become machine readable?
• The challenge is to address these points!!!
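As an illustration of what a machine-readable description might contain, the following Python sketch records a tool's purpose, inputs and outputs, serialises the description to JSON, and checks a dataset against it. The schema (`Port`, `ToolDescription`), the helper `missing_inputs` and the example tool are all assumptions made up for this sketch; no existing description standard is implied.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class Port:
    """One named input or output of a tool (hypothetical schema)."""
    name: str
    dtype: str          # e.g. "float", "int", "string"
    description: str


@dataclass
class ToolDescription:
    """What a tool does, what it requires, and what it produces."""
    name: str
    purpose: str
    inputs: List[Port]
    outputs: List[Port]

    def to_json(self) -> str:
        """Serialise the description so another program can read it."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


def missing_inputs(tool: ToolDescription, dataset_columns: Dict[str, str]) -> List[str]:
    """Return the tool inputs that the dataset lacks or supplies with the wrong type.

    dataset_columns maps column name to dtype string.
    """
    return [
        port.name
        for port in tool.inputs
        if dataset_columns.get(port.name) != port.dtype
    ]


# Invented example tool, used only to exercise the schema.
clusterer = ToolDescription(
    name="example-clusterer",
    purpose="Group 2-D points into clusters",
    inputs=[
        Port("x", "float", "horizontal coordinate"),
        Port("y", "float", "vertical coordinate"),
    ],
    outputs=[Port("cluster", "int", "cluster index assigned to each point")],
)

if __name__ == "__main__":
    my_data = {"x": "float", "label": "string"}
    print(clusterer.to_json())
    print("Missing inputs:", missing_inputs(clusterer, my_data))
```

With a description like this, the question "how do I use these tools with my data?" becomes a check a program can run: here the dataset lacks the `y` input, so the tool cannot be applied as it stands.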