Lecture 2

IS6125
Database Analysis and Design
Lecture 2: The changing nature and role of data
Rob Gleasure
R.Gleasure@ucc.ie
www.robgleasure.com

Today’s session
 Change of time/place for next week
 Data a few years ago
 Data now
 The cloud
 Big data
 Business Intelligence
 The case of Spotify
Data a few years ago
Image from http://www.hotcleaner.com/web_storage.html
The Cloud

Web is overtaking/has overtaken desktop
Mobile is replacing local
Utility-based computing is replacing once-off purchase
 Makes resources seem endless
 Lowers risk in terms of usage (pay as you go)

[Figure: two plots of resources against time, each showing capacity and demand: a static data center and a data center in the cloud]
Slide Credits: Berkeley RAD Lab
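
The pay-as-you-go point can be made concrete with a small calculation. This is a minimal sketch only: the hourly demand figures and the unit price are invented for illustration and do not reflect any real provider's pricing.

```python
# Hypothetical demand (servers needed) for each hour of a short period.
demand = [2, 3, 2, 8, 15, 9, 4, 2]

PRICE_PER_SERVER_HOUR = 1.0  # assumed unit price, the same for both models


def static_cost(demand, price):
    """Static data center: capacity is sized for peak demand in every hour."""
    capacity = max(demand)
    return capacity * len(demand) * price


def pay_as_you_go_cost(demand, price):
    """Cloud: capacity tracks demand, so only used server-hours are paid for."""
    return sum(demand) * price


def utilisation(demand):
    """Share of a peak-sized static data center that is actually used."""
    return sum(demand) / (max(demand) * len(demand))


print(static_cost(demand, PRICE_PER_SERVER_HOUR))         # 120.0
print(pay_as_you_go_cost(demand, PRICE_PER_SERVER_HOUR))  # 45.0
print(round(utilisation(demand), 3))                      # 0.375
```

In practice the unit price of on-demand capacity is often higher than that of owned hardware, so the gap depends on how spiky demand is.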
The Cloud

The ‘Internet of Things’ was born in about 2009
 More devices connected to the Web than people…
Image from http://computinged.com/edge/become-part-of-the-cloud-computing-revolution/
The Cloud

This has meaningful implications for data in terms of
 Capacity
 Measurement
 Integration
 Security
 Privacy
Big data

The idea is that the vast amounts of interaction data allow for
systems that are nuanced and responsive in ways that were
previously not possible

Also a realisation that, if it can be analysed, this data is a huge
commodity, meaning new business models are possible

So when is data ‘big data’?
3 Vs of Big data

Volume
 Facebook generates 10 TB of new data daily, Twitter 7 TB
 A Boeing 737 generates 240 TB of flight data during a single flight
from one side of the US to the other

We can use all of this data to tell us something, if we know the right
questions to ask
3 Vs of Big data
[Figure: Traditional approach: analyze small subsets of all available information. Big data approach: analyze all available information.]
From http://www.slideshare.net/ibmcanada/big-dataturning-data-into-insights?qid=0b4c69bc-3db2-4e12-ae47-a362a25752eb&v=qf1&b=&from_search=3
3 Vs of Big data

Velocity
 Clickstreams and asynchronous data transfer can capture what
millions of users are doing right now

Make a change, then watch the response.

No guesswork is required up front as to what to gather; we can induce
the interesting patterns as we see them
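
"Make a change, then watch the response" can be sketched as a before/after comparison on a clickstream. The events, the event names, and the deployment time below are all hypothetical, and the sample is far too small to support a real conclusion.

```python
from collections import Counter

# Hypothetical clickstream: (seconds since start, event type).
events = [
    (1, "view"), (2, "view"), (3, "click"), (4, "view"), (5, "view"),
    (11, "view"), (12, "click"), (13, "view"), (14, "click"), (15, "view"),
]

CHANGE_DEPLOYED_AT = 10  # assumed moment the change went live


def click_through_rate(events):
    """Clicks per page view within the given events."""
    counts = Counter(kind for _, kind in events)
    return counts["click"] / counts["view"] if counts["view"] else 0.0


before = [e for e in events if e[0] < CHANGE_DEPLOYED_AT]
after = [e for e in events if e[0] >= CHANGE_DEPLOYED_AT]

print(f"before: {click_through_rate(before):.2f}")
print(f"after:  {click_through_rate(after):.2f}")
```

A real comparison would need many more events and a check that the difference is not down to chance.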
3 Vs of Big data
[Figure: Traditional approach (labels: Hypothesis, Question, Answer, Data): start with a hypothesis and test it against selected data. Big data approach (labels: Data, Exploration, Insight, Correlation): explore all data and identify correlations.]
From http://www.slideshare.net/ibmcanada/big-dataturning-data-into-insights?qid=0b4c69bc-3db2-4e12-ae47-a362a25752eb&v=qf1&b=&from_search=3
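
The "explore all data and identify correlations" approach can be illustrated by ranking every pair of columns by correlation strength, with no hypothesis chosen in advance. This is a minimal sketch: the column names and values are invented, and Pearson correlation is just one possible measure.

```python
from itertools import combinations
from math import sqrt

# Hypothetical daily measurements; column names are invented for illustration.
data = {
    "visits":    [100, 120, 130, 150, 170, 200],
    "purchases": [10, 12, 13, 15, 17, 20],
    "returns":   [5, 3, 6, 4, 5, 4],
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ranked_correlations(table):
    """All column pairs, strongest absolute correlation first."""
    pairs = [
        (a, b, pearson(table[a], table[b]))
        for a, b in combinations(table, 2)
    ]
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)


for a, b, r in ranked_correlations(data):
    print(f"{a} ~ {b}: {r:+.2f}")
```

The strongest pairs are candidates for a closer look, not conclusions in themselves.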
3 Vs of Big data

Variety
 Move from structured data to unstructured data, including image
recognition, text mining, etc.
 Gathered from users, applications, systems, sensors

Increasingly comprehensive data view of our ecosystem
 The Internet of Things
The Internet of Things
From http://www.pcworld.com/article/2039413/new-intel-ceo-creates-mysterious-new-devices-division.html
The Internet of Things

RFID sensors, Bluetooth, microprocessors and Wi-Fi are all becoming
easier to embed in ‘dumb’ devices

Move to mobile also means more data streaming from us at all
times, e.g. location, call activity, internet use
The Internet of Things

Smart homes/smart cities
 Temperature, lighting, food stocks, energy, security

Smart cars
 Diagnostics, traffic suggestions, sensors, self-driving

Smart healthcare
 Worn and intravenous computing detects issues early and
monitors care outcomes remotely

Smart factories, farms
 Machines coordinated efficiently, linked dynamically to
consumption models
Big data

Success stories
 Books
 Barnes and Noble: Discovered that readers often quit
nonfiction books less than halfway through. Introduced highly
successful new series of short books on topical themes
 Amazon: originally used a panel of expert reviewers for
books. Data surplus allowed them to create increasingly
predictive recommendations. Panel has since been disbanded
and 1/3 of sales are now driven by the recommender system
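
To give a feel for how purchase data alone can drive recommendations, here is a minimal sketch of a "bought together" recommender. It is not Amazon's actual system: the purchase histories, function names, and the simple co-purchase counting are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical purchase histories: one set of book titles per customer.
baskets = [
    {"Dune", "Neuromancer", "Foundation"},
    {"Dune", "Foundation"},
    {"Dune", "Neuromancer"},
    {"Emma", "Persuasion"},
    {"Dune", "Foundation", "Emma"},
]


def build_co_purchase_counts(baskets):
    """Count, for each book, how often every other book was bought with it."""
    counts = defaultdict(Counter)
    for basket in baskets:
        for book, other in permutations(basket, 2):
            counts[book][other] += 1
    return counts


def recommend(book, counts, k=2):
    """Top-k books most often bought together with `book` (ties by title)."""
    ranked = sorted(counts[book].items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked[:k]]


counts = build_co_purchase_counts(baskets)
print(recommend("Dune", counts))  # ['Foundation', 'Neuromancer']
```

The more purchase histories are available, the more reliable these counts become, which is the sense in which a data surplus can replace an expert panel.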
Big data and the Internet of Things

Success stories (continued)
 Transport
 Flyontime.us: used historical weather and flight delay
information to predict the likelihood of flights being delayed
 Farecast: looked at ticket prices for specific flights based on
historical data, then advised users to buy or wait according to
the predicted fare trajectory
 UPS: uses a range of traffic data to calculate the most time- and
fuel-efficient routes using a complex algorithm
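
Predicting delays from historical weather and delay records can be sketched, in its simplest form, as a per-condition delay rate. This does not describe how Flyontime.us actually worked; the records and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical historical records: (weather on the day, was the flight delayed?).
history = [
    ("clear", False), ("clear", False), ("clear", True), ("clear", False),
    ("snow", True), ("snow", True), ("snow", False),
    ("rain", False), ("rain", True),
]


def delay_rates(history):
    """Fraction of past flights delayed under each weather condition."""
    totals = defaultdict(int)
    delayed = defaultdict(int)
    for weather, was_delayed in history:
        totals[weather] += 1
        delayed[weather] += int(was_delayed)
    return {w: delayed[w] / totals[w] for w in totals}


rates = delay_rates(history)
for weather, rate in sorted(rates.items()):
    print(f"{weather}: {rate:.0%} of past flights delayed")
```

Given tomorrow's forecast, the matching rate serves as a rough estimate of the likelihood of delay.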
Big data and the Internet of Things

Famous success stories (continued)
 Healthcare
 Modernizing Medicine EMA dermatology system
 https://www.youtube.com/watch?v=jMGaGtK9nzU
Big data and the Internet of Things

Famous success stories (continued)
 Social media
 Google (data for information relevance)
 Twitter (cf. #RescuePH)
 Facebook (social data)
Issues with big data

Google Flu Trends
 Life imitating data, imitating life?

No one is really average height

Your Xbox knows you like that Katy Perry song

Also, Target called to say your teenage daughter is pregnant.

Ice cream sales and shark attacks…
Ice cream sales and shark attacks continued (correlation, not causation)
From http://xkcd.com/552/
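
The ice cream and shark attack example can be reproduced with a small simulation. This is a sketch under stated assumptions: temperature is taken to be the hidden common cause (the usual textbook explanation), and every number is invented.

```python
import random
from math import sqrt

rng = random.Random(42)  # fixed seed so the simulation is repeatable

# Simulated days: temperature drives BOTH series; neither affects the other.
temperature = [rng.uniform(10, 35) for _ in range(2000)]
ice_cream = [20 * t + rng.gauss(0, 30) for t in temperature]
shark_attacks = [0.3 * t + rng.gauss(0, 1) for t in temperature]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def residuals(ys, xs):
    """What is left of ys after removing a straight-line fit on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [y - mean_y - slope * (x - mean_x) for x, y in zip(xs, ys)]


raw = pearson(ice_cream, shark_attacks)
controlled = pearson(residuals(ice_cream, temperature),
                     residuals(shark_attacks, temperature))
print(f"raw correlation: {raw:.2f}")
print(f"after controlling for temperature: {controlled:.2f}")
```

The raw correlation is strong even though neither series influences the other; once temperature is accounted for, it largely disappears.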
Target’s family monitoring continued
Assignment 1

In groups, you are tasked with identifying and researching a
business that uses data in an interesting and creative way.
 The report should be approximately 2000 words and describe the
key values offered by the business to its consumers, how this
differentiates it from competitors, and how its use of data at
different points in the creation, delivery, and support of
products/services enables this differentiation.
 You don’t need to go into deep technical detail concerning how
data is handled, nor about the technologies used. However, you
should discuss data-related processes at a high level, insofar as
you understand them from the information you gathered

The report is due on 23rd October, at which time a soft-bound
report should be handed in to Ann O’Riordan in room 3.75
Assignment 1

The groups are as follows:

Group 1: Hennessy, John James; Gao, Yun; Kenny, Mark Paul;
Group 2: O'Driscoll, Nicole; Flood, Lee; Yang, Siyu;
Group 3: Duggan, Claire Bernadette; Nolan, Robert Cunningham; Power, Declan;
Group 4: Huang, Junqi; Kenneally, Alan Kieran; Murray, Jack Joseph;
Group 5: Lawton, Fiona Margaret; Hennessy, Darragh Ross; Chen, Qi;
Group 6: Xu, Chenjun; Kilcoyne, Shane Anthony; O'Donovan, MaryKate;
Group 7: O'Donovan, Paul Andrew; Guerin, Steven John; MolerRodriguez, Marta;
Group 8: O'Riordan, Christina Eilish; Anso, Gabriel; Mc Carthy, Patricia;
Group 9: O'Donovan, Eileen; Wang, Mengjian; Lowham, Joshua George;
Group 10: Kerrisk, Edward; Meaney, Brendan; Qin, Xiaolu;
Want to read more?

On Modernizing Medicine
 https://www.modmed.com/
On Spotify
 http://www.bigdata-startups.com/BigData-startup/big-dataenabled-spotify-change-music-industry/#!prettyPhoto
On the cloud and big data
 The Little Book of Cloud Computing 2013 edition, Lars Nielsen