32N2386
Next Generation Analytics & Big Data
(A Reference Model for Big Data)

Jangwon Gim, Sungjoon Lim, Hanmin Jung
ISO/IEC JTC1 SC32 Ad-hoc meeting
May 29, 2013, Gyeongju, Korea

Contents
• Background
• Brief history of discussions
• Case study
• Procedure for developing standardizations for Big Data
• Reference model for Big Data
• Conclusions

Discussion of Big Data
• Data analytics
• Data analysis
• Baba: Vocabulary, Use-case, and so on
• Stabilize Architecture
• Define Interfaces
• Standardization opportunities
• Jim: The aspect of Big Data is "There are many different forms"
• Krishna: Refers to Wikipedia definition
• Keith Gordon: Volume, Complex, Velocity
• Keith W. Hare: Open Big Data; Volume, Variety, Velocity, Value, Veracity. Any combination is OK.

Background
Emerging Technologies for Big Data
• The Gartner hype cycle, 2012
• Diverse definitions of technologies and services, each with a different view of data

Background
Big Data on the hype cycle
• A general and common reference model for Big Data is needed

Brief history of discussions

Issue Date          Summary
16 November 2011    [SC32N2181] ISO/IEC JTC 1/SC 32 N2181, "Resolutions and topics from the recent JTC 1 meeting of particular interest to SC 32 participants", SC32 Chair - Jim Melton
12 January 2012     [SC32N2198] ISO/IEC JTC 1/SC 32 N 2198, "Analysis of 2012 Gartner Technology Trends", JTC1 SWG-P - Mario Wendt - Convener
                      SC 6 Telecommunications and information exchange between systems
                      SC 32 Data management and interchange
                      SC 39 Sustainability for and by Information Technology
19 March 2012       [SC32N2199] ISO/IEC JTC 1/SC 32 N 2199, "Discussion: SC 32 Response to 2011 JTC 1 Resolution 33", SC32 Chair - Jim Melton
6 June 2012         [SC32N2241] Ad-hoc on "Next gen analytics" - Keith Hare - Chair

The view of Next-Generation Analytics of SC32
• Referenced from [SC32N2241]
• [Figure labels: Architectural; Next-Generation Analytics; Social Analytics; From Baba; Mechanisms; Metadata; Raw Storage]
• Need a reference model for Big Data to enhance interoperability

Case Study (1)
Korea Institute of Science and Technology Information (KISTI)
Dept. of Computer Intelligence Research

Case Study (2)
Architecture of InSciTe Adaptive Service

Case Study (3)
Semantic Analysis: Text Data to Ontology

Case Study (4)
Semantic Analysis: Ontology Schema

Case Study (5)
Semantic Analysis: Example of Semantic Analysis

Case Study (6)
InSciTe Service Functions (Hybrid Vehicle)
• Technology Navigation
• Technology Trend
• Core Element Technology
• Convergence Technology
• Agent Level
• Agent Partner
• Integrated Roadmap
• Report

Case Study (7)
In 2013, about 10 billion triples will be extracted from diverse sites.

Sites                                          Number of triples
Freebase                                           1,015,762,951
Yago                                                 224,949,079
DBPedia                                              449,383,705
DBLP                                                  81,986,947
baseKB                                               147,549,529
Etc (WhoisWho, NYTimes, LinkedObervedData, …)      2,296,838,760
Total                                              4,216,470,971

Case Study (8)
In 2013, System Architecture of InSciTe Adaptive Service

Procedure for developing a reference model for Big Data
1. Eliciting requirements and analyzing the environment of Big Data
2. Establishing visions and strategies for achieving the goal of Big Data
   We are here
3. Defining a concept model / a reference model / a framework for Big Data
4. Deriving use-cases for applying the Big Data

A lifecycle of Big Data
[Figure: a cycle of four numbered stages around "Big Data"]
1. Data
   • Collection/Identification
   • Repository/Registry
   • Semantic Intellectualization
   • Integration
2. Insight
   • Analytics / Prediction
   • Visualization
3. Action
   • Data Curation
   • Data Scientist
   • Data Engineer
4. Decision
   • Workflow
   • Data Quality

Reference Model for Big Data
A Reference Model for Big Data
[Figure: layered diagram, with an Interface between adjacent layers]
• Service Layer: Analysis & Prediction, Data Visualization
• Service Support Layer: Data Curation, Data Integration
• Platform Layer: Data Semantic Intellectualization, Data Identification (Data Mining & Metadata Extraction), Data Collection
• Data Layer: Data Registry, Data Repository
• Big Data Management: Workflow Management, Data Quality Management
• Security

Reference Model for Big Data
A Reference Model for Big Data
[Figure: the same layered diagram, annotated with the labels ???, 19763, 9075, 13249 and 11179]

Conclusions
Summary
• Analyzing the circumstances of Big Data
• Building a framework for Big Data
• Defining a detailed procedure to create the Big Data reference model
Discussion
Possible suggestions
• New Working Group for the reference model of Big Data
  New Work Items could be derived from the model
• New Study Group
Future work
• Discussion of the concept of the NWI
  November 2013, interim meetings
• Propose an extended reference model of Big Data (NWI)
  May 2014, plenary meeting
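
The layered reference model presented in the slides above could be written down as a small data structure, which may help when discussing which component belongs to which layer. The following is only a minimal sketch under stated assumptions: the grouping of components into layers is one reading of the diagram, and every name in the code (REFERENCE_MODEL, CROSS_CUTTING, layer_of, layers_bottom_up) is hypothetical and not part of any SC32 document.

```python
# Illustrative sketch only. The layer/component grouping is an assumed reading
# of the reference-model diagram; all identifiers here are hypothetical.
from typing import Dict, List, Optional

# Layers listed top-down, as on the slide.
REFERENCE_MODEL: Dict[str, List[str]] = {
    "Service Layer": ["Analysis & Prediction", "Data Visualization"],
    "Service Support Layer": ["Data Curation", "Data Integration"],
    "Platform Layer": [
        "Data Semantic Intellectualization",
        "Data Identification",
        "Data Collection",
    ],
    "Data Layer": ["Data Registry", "Data Repository"],
}

# Concerns assumed to span all layers rather than sit in one of them.
CROSS_CUTTING: Dict[str, List[str]] = {
    "Big Data Management": ["Workflow Management", "Data Quality Management"],
    "Security": [],
}


def layer_of(component: str) -> Optional[str]:
    """Return the layer that contains the component, or None if it is not in a layer."""
    for layer, components in REFERENCE_MODEL.items():
        if component in components:
            return layer
    return None


def layers_bottom_up() -> List[str]:
    """Return the layer names from the Data Layer up to the Service Layer."""
    return list(reversed(list(REFERENCE_MODEL)))


if __name__ == "__main__":
    print(layer_of("Data Registry"))
    print(layers_bottom_up())
```

Keeping the cross-cutting concerns in a separate mapping is a design choice of this sketch: it lets a lookup such as layer_of("Security") return None instead of forcing Security into a single layer.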