IS6145 Database Analysis and Design
Lecture 1: Introduction to IS6145 and the changing nature of data
Rob Gleasure
R.Gleasure@ucc.ie
www.robgleasure.com

Today’s session
  Change of time/place for next week
  Course outline
  Data a few years ago
  Data now
  The cloud
  Big data
  Business Intelligence
  The case of Spotify

Lecture times
  13.00-15.00, Wednesday (KANE B10A)
Contact me at
  Ext 2503
  Room 2.112
  R.Gleasure@ucc.ie
Website for this course
  http://girtab.ucc.ie/rgleasure/index.html

Module content
  Data modelling is studied in its practical dimensions, and enterprise relational database applications (e.g. Oracle, MS SQL Server) are used to demonstrate the key issues in database administration, with an introduction to SQL.
  Topics covered include
    Data modelling (ERDs and normalisation)
    Database technology and developing database systems

Learning objectives
  Analyse organisational activities to identify key data requirements
  Generate ERD models to identify data sources and their relationships
  Employ normalisation processes to assist in meeting data integrity requirements
  Identify and utilise various relational database management systems (RDBMS) to meet an organisation’s functional requirements
  Demonstrate proficiency in basic SQL scripting, including the ability to insert, delete and update database records

Course assessment
  Continuous assessment: 30 marks
    In-class exam: 20 marks
    Group report: 30 marks
  Exam: 50 marks

Some things to note
  Ask questions
  Help each other
  Make use of the Internet
    Use search engines (e.g. Google) to find information on things you want to know more about
    If you see cases or interesting stories you think we should talk about in lectures, email me

Some things to note
  This is essentially a skill-building course
  We’re going to have to discuss things and practise different modelling techniques to improve
  As a starting point: What is data?
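One small, concrete way into the question above, and a preview of the SQL skills named in the learning objectives: the sketch below stores a few records in a relational table, then inserts, updates and deletes rows. It is only an illustration under stated assumptions. It uses Python's built-in sqlite3 module rather than Oracle or MS SQL Server so that it runs anywhere, and the table name, column names and rows (student, name, module, "Student A" and so on) are made up for the example.

```python
import sqlite3


def run_demo():
    """Create a throwaway table, then insert, update and delete rows."""
    conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
    cur = conn.cursor()

    # Hypothetical table: one row per student, with the module they take.
    cur.execute(
        "CREATE TABLE student ("
        " student_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " module TEXT NOT NULL)"
    )

    # INSERT: add three made-up records.
    cur.executemany(
        "INSERT INTO student (student_id, name, module) VALUES (?, ?, ?)",
        [
            (1, "Student A", "IS6145"),
            (2, "Student B", "IS6145"),
            (3, "Student C", "TBC"),
        ],
    )

    # UPDATE: change the module recorded for student 3.
    cur.execute(
        "UPDATE student SET module = ? WHERE student_id = ?",
        ("IS6145", 3),
    )

    # DELETE: remove student 2.
    cur.execute("DELETE FROM student WHERE student_id = ?", (2,))

    conn.commit()
    rows = cur.execute(
        "SELECT student_id, name, module FROM student ORDER BY student_id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_demo():
        print(row)
```

The `?` placeholders pass values separately from the SQL text, which is the usual way to avoid building statements by string concatenation. The same INSERT, UPDATE and DELETE statements should carry over to other relational systems with little or no change, although the CREATE TABLE syntax and data types differ between products.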

Data a few years ago
  Image from http://www.hotcleaner.com/web_storage.html

The Cloud
  Web is overtaking/has overtaken desktop
  Mobile is replacing local
  Utility-based computing is replacing once-off purchase
  Makes resources seem endless
  Lowers risk in terms of usage (pay as you go)
  [Figure: resources against time, comparing capacity with demand for a static data center and for a data center in the cloud]
  Slide credits: Berkeley RAD Lab

The Cloud
  The ‘Internet of things’ was born in about 2009
  More devices connected to the Web than people…
  Image from http://computinged.com/edge/become-part-of-the-cloud-computing-revolution/

The Cloud
  This has meaningful implications for data in terms of
    Capacity
    Measurement
    Integration
    Security
    Privacy

Big data
  All of this interaction with one linked information system means vast quantities of data can be captured throughout user interaction, often in real time (‘big data’)
  The idea is that the vast amounts of interaction data allow for systems that are nuanced and responsive in ways that were not previously possible
  There is also a realisation that, if it can be analysed, this data is a huge commodity, meaning new business models are possible
  Firms like Google, eBay, LinkedIn, Facebook, etc. are based on the principles of big data

3 Vs of Big data
  Volume
    Facebook generates 10TB of new data daily, Twitter 7TB
    A Boeing 737 generates 240TB of flight data during a flight from one side of the US to the other
  Velocity
    Clickstreams and asynchronous data transfer can capture what millions of users are doing right now
  Variety
    Move from structured data to unstructured data, including image recognition, text mining, etc.
    Gathered from users, applications, systems, sensors

Big data
  All of this means huge changes for a number of sectors, e.g.
    Healthcare
    Trading
    Education
    Transport

Big data
  E.g. in healthcare
    Modernizing Medicine EMA dermatology system
    https://www.youtube.com/watch?v=jMGaGtK9nzU

Big data
  E.g. in entertainment
    Amazon
    What did Amazon do that bricks-and-mortar bookshops didn’t?

Assignment 1
  In groups, you are tasked with identifying and researching a business that uses data in an interesting and creative way.
  The report should be approximately 2000 words and describe the key values offered by the business to its consumers, how this differentiates it from competitors, and how its use of data at different points in the creation, delivery, and support of products/services enables this differentiation.
  You don’t need to go into deep technical detail about how data is handled, nor about the technologies used. However, you should discuss data-related processes at a high level, insofar as you understand them from the information you gathered.
  The report is due on the 23rd October, at which time a soft-bound report should be handed in to Ann O’Riordan in room 3.75.

Assignment 1
  The groups are as follows:
    Group 1: Hartigan, Stephen John; Ojo, Afolabi; Liu, Yang
    Group 2: Li, Xiaochen; Cofalik, Emilia Agnieszka; Curtin, Peter Laurence
    Group 3: Hayes, Brian James; Murphy, Charles Francis; Murphy, Laura
    Group 4: Kelleher, Shona; Wang, Pengcheng; Walsh, Bernard John
    Group 5: Carey, Caroline; Nolan, Ryan; Wu, Jiahua
    Group 6: Aslam, Usman; Quirke, David; Martin, James Michael
    Group 7: Wang, Meng; Cahill, Liam; Foley, Ciara Mary
    Group 8: Lee, David James; Lu, Zicheng; O Brien, Patrick Anthony

Want to read more?
  On Modernizing Medicine
    https://www.modmed.com/
  On Spotify
    http://www.bigdata-startups.com/BigData-startup/big-dataenabled-spotify-change-music-industry/#!prettyPhoto
  On the cloud and big data
    The Little Book of Cloud Computing, 2013 edition, Lars Nielsen