Data Science Education
Prof Andy Koronios
Head, School of Information Technology & Mathematical Sciences

Big Data – ‘Virtual trail of physical reality’

The Internet of Things
Everything, Everywhere…
An Intelligent, Instrumented & Interconnected world!

2.2 Billion People use the Internet
60% of Australians used it today

Big Data… Everywhere!

All these are widely available & virtually free
‘Data Scientists’ are not widely available and certainly not ‘free’

“Data Scientists are better at statistics than software engineers and better at programming than statisticians”
“they make discoveries while swimming in data”

Data Science: A Multidisciplinary Activity

Data Sciences’ Value Chain
Data Capture, Data Mgt, Data Storage & Access, Analytics, Application, Evaluation

Data Capture
• Transactions
• Social Media
• Stream Data
  o Environmental
  o Industrial
  o GPS
  o Image/Video
• Exhaust Data
  o Network data
  o System logs
• High rate financial data

Data Sciences’ Process Model

Data Mgt
• Integration
• Security
• LCM
• MDM
• Data Quality

Data Storage & Access
• Hadoop HDFS
• Map Reduce
• DWH
• Federated Discovery & Navigation

Analytics
• Descriptive Analytics
  o Association Rules
  o Sequence Rules
  o Segmentation
• Predictive Analytics
  o Regression
  o Classification
• Decision Trees
• Neural Networks
• Text Analytics
• Real time Analytics

Application
• Discussion of Insights with domain experts;
• Running experiments at scale;
• Operationalising the Models;
• ROI calculations;
• Business Case Development;
• Implementation Issues;

Evaluation
• Monitoring;
• Model Optimisation;
• Evaluation of initiative

Attributes of a Data Scientist
1. Communication Skills are underrated;
2. The biggest challenge is not modelling, it is collecting and cleaning;
3. A Data Scientist is better at statistics than a SW engineer and better at SW engineering than a statistician;
4. A curiosity about working with data is a quality better than technical skills;
5. Good storytelling is a must;
6. The area is nascent and the role is freeform – good time to join.
https://s3.amazonaws.com/leada/handbook/Handbook_Pt1.pdf

• Ask the right Qs
• Analyse data
• Build statistical models
• Developing data apps

A Very Rare Creature Indeed!
“a hybrid of data hacker, analyst, communicator, and trusted adviser…”

Data Scientist Employment Growth
The U.S. could face a shortage by 2018 of 140,000 to 190,000 people with "deep analytical talent" and of 1.5 million people capable of analyzing data in ways that enable business decisions. (McKinsey & Co)

Big Data and its Impact

Circa late 2012…
… ‘there are no university programs offering degrees in data science’ …
HBR, 2012

Data Science degrees Today

US Universities
• North Carolina State
• Stanford
• UC Berkeley
• MIT
• Northwestern
• Washington
• George Mason
• NY
• Etc…

Australian Universities
• UniSA
• Deakin
• Macquarie
• UTS
• ……+++

Certification Programs
• EMC Data Science Associate (EMCDSA)
• Cloudera CCP-Data Scientist
• Insight Data Science Fellows Program
• SAS
• Institute for Data Science and Engineering

More than 250 universities worldwide now offer some courses in Data Science & Big Data
Late in 2013

UniSA MDSc - Key features
• Suite of nested programs developed in conjunction with the Institute of Analytics Professionals of Australia (IAPA) and SAS, industry leader in business analytics
• Available face-to-face or entirely online, part-time or full-time
• Emphasis on professional practice
• Technical skills in Data Science as well as project management, communications and visualisation

Entry pathways
Bachelor degree in any discipline (plus relevant work experience)
Graduate Certificate
Bachelor degree in Information Technology OR Mathematics
Graduate Diploma
Master of Data Science

Program structure

Partnership with SAS
• Benefits:
  – Licence to use SAS software in a number of courses.
  – SAS certification for graduates of the Master program.
  – Eligibility for placement in the final semester through SAS Work Placement Program.
• Approximately 20 placements a year across Australia.
• A good final year student in the Master of Data Science should have a good chance of obtaining a placement, but it cannot be guaranteed.
Data Science Professional Development Program

Student Demographics
• Variety of backgrounds, mainly technical
  – Engineers, Mathematicians, Computer Scientists, Finance specialists, Marketers
• Mostly part time/online;
• 2/3 ‘Out-of-State’;
• 2/3 Male;
• Median Age 39;
• Mostly employed in similar role (mainly BI);
• Highly motivated;
• Already in demand.