TG 5 Data Management Workshop:

Best practice solutions for Grid data management
Technical Working Groups
• Part of the EU Grid concertation initiative
• Funded by EU-FP6-IST
• Forum for
  - Discussion
  - Exchange of information
  - Exchange of the state of the art
Technical Working Group 5
• TG5 is concerned with Grid data management
• Objectives
  - Transfer from FP5 to FP6
  - Developing a unified view on Grid data management
  - Collaboration between FP6 Grid projects
• TG5 results are published in a white paper
Transfer from FP5 to FP6
• FP5 projects produced results that could be taken up by FP6 projects
• Transfer from projects that produced
  - Grid applications, e.g.:
    - Grace -> DataMiningGrid
    - Grace -> Simdat
  - Basic data management facilities, e.g.:
    - DataGrid -> EGEE
    - OpenMolGrid -> DataMiningGrid
• OGSA-DAI is an important building block for many FP6 Grid projects
Developing a Unified View on Grid Data Management
• Classifying data management into three areas
• Data Sources
  - Previously: flat-file-based data access dominated
  - Currently: a variety of sources, ranging from XML and relational databases to flat files
• Data Discovery and Metadata
  - No standard solution for metadata management yet
• Data Access and Data Transfer
  - Convergence on OGSA-DAI for database access
Purpose of this Workshop
• Demonstration of best practice solutions
• Particular focus on three areas of data management:
  - Data Sources
  - Data Discovery and Metadata
  - Data Access and Data Transfer
• Developing a common understanding of the available options
• Getting closer to a unified view on Grid data management
• Encouraging collaboration between FP6 Grid projects
Agenda
• 10:00 – 10:15 Welcome & Introduction (Michael May, Fraunhofer AIS)
• 10:15 – 10:30 OGSA Data Architecture (Dave Berry, GGF)
• 10:30 – 12:00 Demonstrations by
  - OGSA-DAI (Neil Chue Hong, EPCC)
  - DataMiningGrid (Martin Swain, University of Ulster)
  - Simdat (Michael Krueger, Intel)
• 12:00 – 13:00 Lunch
• 13:00 – 14:30 Demonstrations by
  - CoreGRID (Domenico Talia, University of Calabria)
  - InteliGrid (Krzysztof Kurowski, PSNC)
  - NextGrid (Philipp Masche, BT)
• 14:30 – 14:45 Coffee break
• 14:45 – 16:45 Discussion
  - Lessons learned
  - Unsolved problems concerning data management
  - Visions & future challenges
  - Unified view on Grid data management
• 16:45 – 17:00 Conclusions & Next Steps
• 17:00 End of Workshop
Discussion
• Lessons learned
• Unsolved problems concerning data management
• Towards a unified view on Grid data management
• Claim to fame of FP6 Grid data management
• Visions and future challenges
• Next steps
Technology Matrix
DM Areas \ Projects | DataMiningGrid | EGEE | InteliGrid
Data Source | Relational, XML, flat files | Flat files, mass storage systems | Relational, XML, flat files, WebDAV, product model databases
Data Discovery / Metadata | OGSA-DAI, OD | OD | OGSA-DAI, OD (DEX)
Database Access | OGSA-DAI | - | OGSA-DAI
File Access / File Transfer | GridFTP, HTTP | GridFTP, OD | GridFTP, FTP, HTTP, HTTPS, WebDAV
Data Replication | - | OD | -
Applications | Data mining | Data-intensive simulations and analysis (e.g. LHC); secure data access (e.g. biomedical) | Engineering end-user applications (CAD, structural analysis, ...)
Technology Matrix (continued)
DM Areas \ Projects | NextGrid | OGSA-DAI | SIMDAT
Data Source | RDBMS, XML, files, semistructured, resource metadata | RDBMS, XML, files, semistructured, data services | Relational, flat file, semistructured, data services
Data Discovery / Metadata | OD (DEX), Service Group Registries | OD, Grimoires, Service Group Registries | OGSA-DAI, OD, Grimoires, gLite
Database Access | OGSA-DAI, OD | OD (OGSA-DAI) | OGSA-DAI, OD
File Access / File Transfer | OGSA-DAI, OD, GridFTP, Parallel//HTTP (OD) | OD, Lucene, GridFTP, FTP | GridFTP, OD, GRIA
Data Replication | - | OD | ? GridFTP
Applications | Digital media, financial applications, supply chain management, data mining | Various (medical, astronomy, meteorology, geology, biosciences) | Data mining, simulation, aerospace, bioinformatics, meteo, pharma, CAD, CAT, automotive
Technology Matrix (continued)
DM Areas \ Projects | CoreGRID (Virtual KDM)
Data Source | Relational DB, XML, flat files, unstructured data, HTML
Data Discovery / Metadata | XML, Globus Services
Database Access | OGSA-DAI, GDSE
File Access / File Transfer | GridFTP, RFT
Data Replication | Replica Catalogue
Applications | Data mining, distributed query processing, DBMS schema integration, bioinformatics