A Reference Model for Big Data

32N2386
Next Generation Analytics & Big Data
(A Reference Model for Big Data)
Jangwon Gim
Sungjoon Lim
Hanmin Jung
ISO/IEC JTC1 SC32 Ad-hoc meeting
May 29, 2013, Gyeongju, Korea
Contents
▪ Background
▪ Brief history of discussions
▪ Case study
▪ Procedure for developing standards for Big Data
▪ Reference model for Big Data
▪ Conclusions
Discussion of Big Data
▪ Data analytics
▪ Data analysis
➢ Baba: Vocabulary, use cases, and so on
➢ Stabilize Architecture
➢ Define Interfaces
➢ Standardization opportunities
Jim: One aspect of Big Data is that “there are many different forms”
Krishna: Refers to the Wikipedia definition
Keith Gorden: Volume, Complexity, Velocity
Keith W. Hare: Open Big Data
➢ Volume, Variety, Velocity, Value, Veracity
Any combination is OK.
Background
▪ Emerging Technologies for Big Data
➢ Gartner's hype cycle, 2012
➢ Technologies and services are defined in diverse ways, each with a different view of data
Background
▪ Big Data on the hype cycle
➢ A general and common reference model for Big Data is needed
Brief history of discussions
Issue Date / Summary
16 November 2011
[SC32N2181] ISO/IEC JTC 1/SC 32 N 2181, “Resolutions and topics from the recent JTC 1 meeting of particular interest to SC 32 participants”, SC32 Chair – Jim Melton
12 January 2012
[SC32N2198] ISO/IEC JTC 1/SC 32 N 2198, “Analysis of 2012 Gartner Technology Trends”, JTC1 SWG-P – Mario Wendt – Convener
➢ SC 6 Telecommunications and information exchange between systems
➢ SC 32 Data management and interchange
➢ SC 39 Sustainability for and by Information Technology
19 March 2012
[SC32N2199] ISO/IEC JTC 1/SC 32 N 2199, “Discussion: SC 32 Response to 2011 JTC 1 Resolution 33”, SC32 Chair – Jim Melton
6 June 2012
[SC32N2241] Ad-hoc on “Next gen analytics” – Keith Hare – Chair
SC32's view of Next-Generation Analytics
▪ Referenced from [SC32N2241]
[Figure labels: Architectural; Next-Generation Analytics; Social Analytics; From Baba; Mechanisms; Metadata; Raw Storage]
▪ A reference model for Big Data is needed to enhance interoperability
Case Study (1)
▪ Korea Institute of Science and Technology Information (KISTI)
➢ Dept. of Computer Intelligence Research
Case Study (2)
▪ Architecture of InSciTe Adaptive Service
Case Study (3)
▪ Semantic Analysis
➢ Text Data to Ontology
Case Study (4)
▪ Semantic Analysis
➢ Ontology Schema
Case Study (5)
▪ Semantic Analysis
➢ Example of Semantic Analysis
Case Study (6)
▪ InSciTe Service Functions (Hybrid Vehicle)
➢ Technology Navigation
➢ Technology Trend
➢ Core Element Technology
➢ Convergence Technology
➢ Agent Level
➢ Agent Partner
➢ Integrated Roadmap
➢ Report
Case Study (7)
▪ In 2013, about 10 billion triples will be extracted from diverse sites
Site                                              Number of triples
Freebase                                              1,015,762,951
Yago                                                    224,949,079
DBPedia                                                 449,383,705
DBLP                                                     81,986,947
baseKB                                                  147,549,529
Etc. (WhoisWho, NYTimes, LinkedObservedData, …)       2,296,838,760
Total                                                 4,216,470,971
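The listed total can be checked against the per-site counts with a few lines of Python. This is only a sanity check of the arithmetic on the slide; the dictionary keys are labels copied from the table and the variable names are illustrative.

```python
# Triple counts per site, copied from the table on the slide.
triple_counts = {
    "Freebase": 1_015_762_951,
    "Yago": 224_949_079,
    "DBPedia": 449_383_705,
    "DBLP": 81_986_947,
    "baseKB": 147_549_529,
    "Etc": 2_296_838_760,
}

# Total as printed on the slide.
listed_total = 4_216_470_971

total = sum(triple_counts.values())
print(f"{total:,}")  # 4,216,470,971

# Share of each site in the summed total, largest first.
for site, count in sorted(triple_counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{site:10s} {count / total:6.1%}")
```

The summed value agrees with the total printed on the slide.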
Case Study (8)
▪ System architecture of the InSciTe Adaptive Service in 2013
Procedure for developing a reference model for Big Data
1. Eliciting requirements and analyzing the environment of Big Data
2. Establishing visions and strategies for achieving the goal of Big Data
(We are here)
3. Defining a concept model / a reference model / a framework for Big Data
4. Deriving use cases for applying Big Data
A lifecycle of Big Data
[Figure: a four-stage cycle around “Big Data”]
1. Data
• Collection/Identification
• Repository/Registry
• Semantic Intellectualization
• Integration
2. Insight
• Analytics / Prediction
• Visualization
3. Action
• Data Curation
• Data Scientist
• Data Engineer
4. Decision
• Workflow
• Data Quality
Reference Model for Big Data
▪ A Reference Model for Big Data
[Figure: layered reference model; labels recovered from the diagram]
Layers (top to bottom): Service Layer, Service Support Layer, Platform Layer, Data Layer
Components: Analysis & Prediction, Data Visualization, Data Curation, Data Integration, Data Semantic Intellectualization, Data Identification (Data Mining & Metadata Extraction), Data Collection, Data Registry, Data Repository
Other labels: Big Data Management, Workflow Management, Data Quality Management, Security, Interface (three occurrences)
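The diagram names four layers and carries three “Interface” labels. One possible reading, sketched below in Python, is that each interface sits on the boundary between two neighbouring layers. This placement is an assumption, not something the slide text confirms, and the function name `adjacent_interfaces` is purely illustrative.

```python
# Layer names as they appear on the slide, in reading order (assumed top to bottom).
LAYERS = [
    "Service Layer",
    "Service Support Layer",
    "Platform Layer",
    "Data Layer",
]


def adjacent_interfaces(layers):
    """Return (upper, lower) pairs, one per boundary between neighbouring layers."""
    return list(zip(layers, layers[1:]))


# Assumption: each "Interface" label corresponds to one boundary between two layers.
for upper, lower in adjacent_interfaces(LAYERS):
    print(f"{upper} <-> Interface <-> {lower}")
```

Under this reading, four layers give three boundaries, which would be consistent with the three “Interface” labels in the diagram.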
Reference Model for Big Data
▪ A Reference Model for Big Data
[Figure: the same layered reference model as on the previous slide, annotated with the numbers 19763, 9075, 13249, and 11179 and an open “???” marker. The numbers match the SC 32 standards ISO/IEC 19763, 9075, 13249, and 11179.]
Conclusions
▪ Summary
➢ Analyzing the circumstances of Big Data
➢ Building a framework for Big Data
➢ Defining a detailed procedure to create the Big Data reference model
▪ Discussion
➢ Possible suggestions
• New Working Group for the reference model of Big Data
➢ New Work Items could be derived from the model
• New Study Group
▪ Future work
➢ Discussion of the concept of NWI
• November 2013: Interim meetings
➢ Propose the extended reference model of Big Data (NWI)
• May 2014: Plenary meeting