Ch 3. A Cross-Section of Big Data Sources and the Value They Hold
Taming The Big Data Tidal Wave
17 May 2012
SNU IDB Lab.
Hye Chan, Bae
Outline
• Telematics
• Text
• Time and Location
• RFID (Radio Frequency Identification)
• Smart-Grid
• Sensor
• Telemetry
• Social Network
Telematics (1/2)
Telematics Data
• Receiving serious attention in the auto insurance industry
  – A sensor, or "black box," is placed in a car to capture information about what the car is doing
• Helps insurance companies better understand customer risk levels
• Has the potential to both
  – Lower insurance rates (for most drivers)
  – Increase profits (for insurers)
Telematics (2/2)
The Value of Telematics Data
• Makes it possible to answer questions, such as what causes traffic jams, without narrowly focused and expensive testing
• May improve our lives by reducing the stress and frustration of traffic
Text (1/4)
Text Data
• One of the biggest and most common sources of big data
  – E-mails
  – Text messages
  – Social media postings
  – Instant messages
  – Real-time chats
  – Etc.
• Natural language processing
  – Parsing text and assigning meaning to the words, phrases, and components
Text (2/4)
Text Data (cont.)
• The results of those processes can be analyzed
  – Sentiment mining
  – Customer comments about specific products
• Interpreting text data is quite difficult
  – Words change meaning based on
    • Emphasis (which is lost in written text)
    • Full context
Text (3/4)
Using Text Data
• Sentiment analysis
  – Looks at the general direction of opinion
  – Understanding the trends can be immensely valuable in planning what to do next
  – Whether opinion about a product is positive or negative is valuable information
    • Especially if the customer hasn't yet purchased that product
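As a rough illustration of "general direction of opinion," the sketch below counts positive and negative words in a set of comments. The word lists and the function names `sentiment_score` and `overall_direction` are assumptions made for this example; the slides do not prescribe a method, and real sentiment analysis has to deal with the context and emphasis problems noted earlier.

```python
# Toy word lists: illustrative assumptions, not a real sentiment lexicon.
POSITIVE = {"good", "great", "love", "excellent", "reliable"}
NEGATIVE = {"bad", "poor", "hate", "broken", "terrible"}


def sentiment_score(text):
    """Return (# positive words) - (# negative words) for one comment."""
    # Lowercase each token and strip surrounding punctuation.
    words = [w.strip(".,!?;:\"'").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return pos - neg


def overall_direction(comments):
    """Summarize many comments as 'positive', 'negative', or 'neutral'."""
    total = sum(sentiment_score(c) for c in comments)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    sample = [
        "I love this phone, the battery is great!",
        "Screen arrived broken. Terrible support.",
        "Good value and reliable.",
    ]
    print(overall_direction(sample))
```

Word counting ignores negation ("not good") and sarcasm, which is one reason interpreting text data is described above as quite difficult.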
Text (4/4)
Using Text Data (cont.)
• Pattern recognition
  – Complaints, repair notes, other comments
  – An organization can more quickly identify and fix problems
    • Before they become bigger issues
• Fraud detection
  – A major application for text data
  – Within health insurance
    • Text analysis can parse out the comments and justifications in claims
Time and Location (1/3)
Time and Location Data
• Time and location information is a growing source of data
  – GPS (Global Positioning System)
  – Cellular phones
• Tracking routes
  – You can track the exact routes you travel when you exercise
    • How long the routes are
    • How long it takes you to complete the routes
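The two route measures above (how long a route is, and how long it takes) can be derived from a sequence of timestamped GPS fixes. The following is a minimal sketch under assumptions: the `(timestamp, latitude, longitude)` tuple layout and the function names are hypothetical, and distance is approximated with the haversine great-circle formula on a spherical Earth.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def route_summary(points):
    """points: list of (datetime, lat, lon) in time order.

    Returns (total distance in km, elapsed time in seconds).
    """
    km = sum(haversine_km(p[1], p[2], q[1], q[2])
             for p, q in zip(points, points[1:]))
    seconds = (points[-1][0] - points[0][0]).total_seconds() if points else 0.0
    return km, seconds


if __name__ == "__main__":
    run = [
        (datetime(2012, 5, 17, 7, 0), 37.4600, 126.9500),
        (datetime(2012, 5, 17, 7, 5), 37.4650, 126.9520),
        (datetime(2012, 5, 17, 7, 10), 37.4700, 126.9500),
    ]
    km, seconds = route_summary(run)
    print(f"{km:.2f} km in {seconds / 60:.0f} minutes")
```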
Time and Location (2/3)
The Value of Time and Location Data
• Local police and fire agencies
  – Provide information on where they typically travel, with
    • Real-time information
    • Accurate location
• Time- and location-sensitive offers
  – Expected to be huge in the future of marketing
Time and Location (3/3)
The Value of Time and Location Data (cont.)
• Enhancing social network analysis
  – Time and location data allow identification of which people were at the same place at the same time
    • Who attended a given concert or movie?
    • Who was dining at a specific restaurant at the same time?
  – It is possible to identify people who may not know each other
    • But who have a lot of common interests
    • Imagine a dating service with this information!
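One simple way to find "who was at the same place at the same time" is to group visits by place and time bucket, then count how often each pair of people shares a group. This is only a sketch: the `(person, place, minute)` record layout, the one-hour bucket, and the name `co_located_pairs` are assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations


def co_located_pairs(visits, bucket_minutes=60):
    """Count how often each pair of people shared a place and time bucket.

    visits: iterable of (person, place, minute) where minute is an integer
    timestamp in minutes. Returns {(person_a, person_b): shared_count}.
    """
    groups = defaultdict(set)
    for person, place, minute in visits:
        groups[(place, minute // bucket_minutes)].add(person)

    pairs = defaultdict(int)
    for people in groups.values():
        # Sort so each pair has one canonical ordering.
        for a, b in combinations(sorted(people), 2):
            pairs[(a, b)] += 1
    return dict(pairs)


if __name__ == "__main__":
    visits = [
        ("ann", "concert", 1200),
        ("bob", "concert", 1230),
        ("cat", "cafe", 1210),
        ("ann", "cafe", 600),
        ("bob", "cafe", 610),
    ]
    print(co_located_pairs(visits))
```

Pairs with a high shared count are candidates for people with common interests, even if they do not know each other. Fixed buckets miss two visits that fall just either side of a bucket boundary; a sliding time window would avoid that at the cost of more code.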
RFID (1/4)
RFID Data
• Radio Frequency Identification (RFID) tag
  – A small tag placed on objects
  – Contains a unique serial number (unlike a UPC code, which identifies only the product type)
• An RFID reader can read the tag
  – By sending out a signal and receiving a response
RFID (2/4)
Using RFID Data
• Identifying "restocking" situations on the shelf
[Figure label: Storage room]
RFID (3/4)
Using RFID Data
• Identifying "restocking" situations on the shelf
[Figure label: Storage room]
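The restocking idea can be sketched as follows: a shelf reader reports the tag serial numbers it currently sees, each tag maps to a product, and any product whose on-shelf count falls below a minimum needs restocking from the storage room. The tag-to-product mapping, the thresholds, and the function name `items_to_restock` are hypothetical; the slides show the idea only as a figure.

```python
from collections import Counter


def items_to_restock(shelf_reads, tag_to_product, min_on_shelf):
    """Return the products whose on-shelf count is below the minimum.

    shelf_reads: tag serial numbers reported by the shelf reader
                 (the same tag may be read many times).
    tag_to_product: {tag serial: product name}.
    min_on_shelf: {product name: minimum units wanted on the shelf}.
    """
    # Each tag is unique, so de-duplicate repeated reads before counting.
    unique_tags = set(shelf_reads)
    on_shelf = Counter(tag_to_product[t] for t in unique_tags
                       if t in tag_to_product)
    return sorted(p for p, minimum in min_on_shelf.items()
                  if on_shelf[p] < minimum)


if __name__ == "__main__":
    tags = {"T1": "soap", "T2": "soap", "T3": "rice"}
    reads = ["T1", "T1", "T3"]  # T2 is no longer on the shelf
    print(items_to_restock(reads, tags, {"soap": 2, "rice": 1}))
```

Because every tag carries a unique serial number, repeated reads of one item are not mistaken for several items, which a UPC scan alone could not guarantee.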
RFID (4/4)
Using RFID Data (cont.)
• Product tracking system
Smart-Grid (1/2)
Smart-Grid Data
• The next generation of electrical power infrastructure
  – Much more advanced and robust
• Has highly sophisticated
  – Monitoring
  – Communications
  – Generation systems
• Enables more consistent service and better recovery from outages
Smart-Grid (2/2)
The Value of Smart-Grid Data
• Smart-grid data will benefit everyone
  – Consumers will have customized rate plans
  – Utilities will have much better forecasts of demand
  – Households and businesses will gain the power to better manage their energy use
Sensor (1/2)
Sensor Data
• There are many complex machines and engines
  – Aircraft, trains, military vehicles, etc.
  – Keeping them running smoothly is critical given how much they cost
  – Embedded sensors have begun to be used in everything
• Monitoring
  – It's worth capturing as much detailed data as possible
  – It's quite expensive to replace a flawed component once an engine is released
Sensor (2/2)
The Value of Sensor Data
• It's possible to pinpoint specific patterns
  – Patterns that precede imminent failures
• Strategies for minimizing downtime
  – Holding spare parts or engines
  – Creating diagnostics to quickly identify the parts that must be replaced
  – Investing in more reliable versions
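The slides do not say how failure patterns are found. As one very simple stand-in, the sketch below flags a reading that deviates sharply from the recent baseline (a rolling z-score). The window size, threshold, and function name `flag_anomalies` are assumptions for illustration; real failure prediction would combine many sensors and historical failure records.

```python
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings far from the preceding window's mean.

    A reading is flagged when it differs from the mean of the previous
    `window` readings by more than `threshold` standard deviations.
    """
    flagged = []
    for i in range(window, len(readings)):
        recent = readings[i - window:i]
        mu, sigma = mean(recent), stdev(recent)
        # Skip perfectly flat windows, where the z-score is undefined.
        if sigma > 0 and abs(readings[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Hypothetical engine temperature samples with one spike at index 6.
    temps = [70.0, 70.5, 69.8, 70.2, 70.1, 70.3, 85.0, 70.0]
    print(flag_anomalies(temps))
```

Flagged readings could feed the diagnostics mentioned above, helping identify which part to inspect before it fails.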
Telemetry (1/2)
Telemetry Data
• Telemetry is the term used in the video game industry
  – To describe the capture of in-game activities
• In a hockey game
  – Where a player was when a shot on goal was taken
  – What type of shot it was
  – What speed the shot had
• Telemetry data lets game producers learn intimate details about
  – How customers actually play and interact with the games
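The hockey example suggests what a single telemetry record might hold: the player's position, the shot type, and the shot speed. The record layout and field names below are hypothetical, chosen only to mirror those three items, and the aggregation shows one kind of question such data could answer.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ShotEvent:
    """One captured shot on goal (field names are illustrative)."""
    player_id: str
    x: float          # player position on the rink when shooting
    y: float
    shot_type: str    # e.g. "slap" or "wrist"
    speed_kmh: float


def average_speed_by_type(events):
    """Return {shot_type: mean shot speed} across all captured events."""
    speeds = defaultdict(list)
    for event in events:
        speeds[event.shot_type].append(event.speed_kmh)
    return {kind: sum(vals) / len(vals) for kind, vals in speeds.items()}


if __name__ == "__main__":
    events = [
        ShotEvent("p1", 10.0, 5.0, "slap", 100.0),
        ShotEvent("p2", 12.0, 4.0, "slap", 120.0),
        ShotEvent("p1", 8.0, 6.0, "wrist", 80.0),
    ]
    print(average_speed_by_type(events))
```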
Telemetry (2/2)
Using Telemetry Data
• In video games
  – Maintaining renewal rates is absolutely critical
    • Many games make money through subscriptions
  – Customer satisfaction is a big issue
    • A game that is too hard frustrates casual gamers
    • A game that is too easy bores hardcore gamers
Social Network (1/3)
Social Network Data
• Social network data qualifies as a big data source
  – Social network analysis looks into several degrees of association
• Example: a phone company
[Figure: Company and Person nodes, shown under "Traditional data" and under "Social network analysis"]
Social Network (2/3)
The Value of Social Network Data
• A traditional wireless carrier
  – Sees a low-value customer
  – The customer isn't worth the cost of saving
  – Decision: let him go
• A modern wireless carrier (using social network data)
  – Sees a heavy user who has a wide network of friends
  – Decision: do not let him go
Social Network (3/3)
Using Social Network Data
• Identifying highly connected customers
  – They can influence brand image
  – They can be offered perks, advance trials, and other goodies
• Advertising on social networking sites
  – Social network analysis can yield insights into what advertisements might appeal to given users
    • Based on knowing what their circle of friends is interested in
[Figure: a friend says "I like to ride a bike!"; which advertisement should the user see?]
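To make "highly connected customers" and "several degrees of association" concrete, the sketch below builds an undirected graph from call records, ranks customers by number of direct contacts, and counts how many people a customer reaches within two hops. The call-record format and the function names are assumptions for illustration, not a carrier's actual method.

```python
from collections import defaultdict


def build_graph(calls):
    """calls: iterable of (caller, callee). Returns {person: set of contacts}."""
    graph = defaultdict(set)
    for a, b in calls:
        if a != b:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def most_connected(graph, top_n=3):
    """People with the most direct contacts (ties broken by name)."""
    return sorted(graph, key=lambda p: (-len(graph[p]), p))[:top_n]


def reach(graph, person, degrees=2):
    """Number of distinct other people within `degrees` hops of `person`."""
    seen = {person}
    frontier = {person}
    for _ in range(degrees):
        # Expand one hop, keeping only people not reached before.
        frontier = {n for p in frontier for n in graph.get(p, ())} - seen
        seen |= frontier
    return len(seen) - 1


if __name__ == "__main__":
    calls = [("a", "b"), ("a", "c"), ("a", "d"), ("d", "e")]
    graph = build_graph(calls)
    print(most_connected(graph, top_n=2))
    print(reach(graph, "e"))
```

A customer with modest individual spending but a large reach is the kind of customer a modern carrier would not want to lose.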
Thank you