
Go Stream
Matvey Arye, Princeton/Cloudflare
Albert Strasheim, Cloudflare
Advertisement
Awesome CDN service for websites big & small
Millions of requests per second at peak
24 data centers across the globe
Data Analysis
– Customer-facing analytics
– System health monitoring
– Security monitoring
=> Need global view
Functionality
• Calculate aggregate functions on fast, big data
• Aggregate across nodes (across datacenters)
• Data stored at different time granularities
Basic Design Requirements
1. Reliability – Exactly-once semantics
2. High Data Volumes
Our Environment
[Diagram: many sources feed a stream processing layer, which writes to storage]
Basic Programming Model
[Diagram: data flows from storage through chains of operators (Op) and back into storage]
Existing Systems
S4
– The reliability model is not consistent
Storm
– Exactly-once semantics requires batching
– Reliability only inside the stream processing system
– What if a source goes down? The DB?
The Need For End-to-End Reliability
[Diagram: Source → Stream Processing → Storage]
When a source comes back up, where does it start sending data from?
If using something like Storm, you need additional reliability mechanisms.
The Takeaway
Need end-to-end reliability
– or multiple reliability mechanisms
Reliability of the stream processing system alone is not enough
Design of Reliability
• Avoid queueing when a destination has failed
– Rely on storage at the edges
– Minimize replication
• Minimize edge cases
• No specialized hardware
Big Design Decisions
End-to-end reliability
Only transient operator state
Recovering From Failure
Source → Storage: "I am starting a stream with you. What have you already seen from me?"
Storage → Source: "I've seen <X>."
Source → Storage: "Okie dokie. Here is all the new stuff."
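A minimal sketch of this handshake, assuming the storage side keeps one high-water mark per stream. The Storage type and its method names are illustrative, not go-stream's actual API:

```go
// A sketch of the recovery handshake. Names are illustrative.
package main

import "fmt"

// Storage tracks the highest ID it has seen, per stream.
type Storage struct {
	highest map[string]uint64
}

// WhatHaveYouSeen answers the source's opening question.
func (s *Storage) WhatHaveYouSeen(stream string) uint64 {
	return s.highest[stream] // zero means "nothing yet"
}

// Receive stores an item and advances the high-water mark.
func (s *Storage) Receive(stream string, id uint64) {
	if id > s.highest[stream] {
		s.highest[stream] = id
	}
}

func main() {
	st := &Storage{highest: map[string]uint64{}}
	st.Receive("web-requests", 1)
	st.Receive("web-requests", 2)

	// The source restarts and asks where to resume.
	seen := st.WhatHaveYouSeen("web-requests")
	fmt.Println("resending everything after id", seen) // prints 2
}
```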
Tracking what you have seen
Option 1: Store an identifier for every item
– The answer to "what have I seen?" is huge
– Requires lots of storage for IDs
Option 2: Store one identifier for the highest number seen
[Diagram: items 1, 2, 3, 4 arriving in order]
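A sketch of why the second option works: with ordered, monotonically increasing IDs, one comparison filters replayed duplicates and a single uint64 answers the question. The deliver helper is hypothetical:

```go
// Why a single high-water mark suffices once delivery is ordered.
package main

import "fmt"

func main() {
	var highest uint64 // the only state that must be tracked

	deliver := func(id uint64) {
		if id <= highest {
			fmt.Println("duplicate, dropping", id)
			return
		}
		fmt.Println("processing", id)
		highest = id
	}

	// ID 2 is replayed after a retry; the filter drops it.
	for _, id := range []uint64{1, 2, 2, 3} {
		deliver(id)
	}
}
```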
Parallel processing of ordered data is tricky
Tension between:
• Ordering
• Reliability
• Parallelization
• High data volume
Go Makes This Easier
A language from Google designed for concurrency
[Diagram: two goroutines ("I run code") connected by channels]
Channels send data between goroutines
Most synchronization is done by passing data
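A minimal example of the model: two goroutines synchronize purely by passing data over a channel, with no locks:

```go
// Two goroutines coordinating by message passing.
package main

import "fmt"

func main() {
	ch := make(chan int) // channel between goroutines

	// One goroutine produces values...
	go func() {
		for i := 1; i <= 3; i++ {
			ch <- i // send blocks until the receiver is ready
		}
		close(ch)
	}()

	// ...and the main goroutine consumes them in order.
	for v := range ch {
		fmt.Println("received", v)
	}
}
```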
Goroutine Scheduling
Channels are FIFO queues with a maximum capacity,
so a goroutine can be in one of four states:
1. Executing code
2. Waiting for a thread to execute code
3. Blocking to receive data from a channel
4. Blocking to send data to a channel
The scheduler optimizes the assignment of goroutines to threads.
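A small demonstration of states 3 and 4, using a channel with maximum capacity 2: the sender blocks once the buffer is full, and each receive unblocks it:

```go
// Blocking on a bounded channel as a scheduling signal.
package main

import (
	"fmt"
	"time"
)

func main() {
	ch := make(chan int, 2) // FIFO queue, capacity 2

	go func() {
		for i := 1; i <= 4; i++ {
			ch <- i // blocks while the buffer is full (state 4)
			fmt.Println("sent", i)
		}
		close(ch)
	}()

	time.Sleep(100 * time.Millisecond) // let the sender fill the buffer
	for v := range ch {
		fmt.Println("received", v) // each receive frees a slot for the sender
	}
}
```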
Efficient Ordering Under The Hood
[Diagram: the source fans input tuples (1, 2, 3, 4) out to workers; each worker exposes a count channel (output tuples per input) and a channel of the actual result tuples]
The source distributes items to workers in a specific order.
Reading from each worker, in that same order:
1. Read one tuple off the count channel; assign the count to X.
2. Read X tuples off the result channel.
Intuition behind design
Multiple output channels allow each worker to write independently.
The count channel tells the reader how many tuples to expect. It does not block except when a result is needed to satisfy ordering.
Judicious blocking lets the scheduler use blocking as a signal for which worker to schedule.
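A sketch of the count-channel scheme, assuming round-robin distribution to two workers. The toy operator, which emits t results per input tuple t, and all names are illustrative:

```go
// Ordered reads from parallel workers via count channels.
package main

import "fmt"

type worker struct {
	in     chan int    // input tuples
	counts chan int    // output-tuple count per input
	out    chan string // the actual result tuples
}

func newWorker(id int) *worker {
	w := &worker{
		in:     make(chan int),
		counts: make(chan int, 16),
		out:    make(chan string, 16),
	}
	go func() {
		for t := range w.in {
			w.counts <- t // announce how many results follow
			for i := 0; i < t; i++ {
				w.out <- fmt.Sprintf("worker %d: result %d of input %d", id, i+1, t)
			}
		}
	}()
	return w
}

func main() {
	workers := []*worker{newWorker(0), newWorker(1)}

	// The source distributes items to workers in a fixed order.
	inputs := []int{1, 2, 3, 4}
	go func() {
		for i, t := range inputs {
			workers[i%len(workers)].in <- t
		}
		for _, w := range workers {
			close(w.in)
		}
	}()

	// The reader visits workers in the same order: one count, then
	// exactly that many results. Output order matches input order,
	// yet workers run ahead independently in their buffered channels.
	for i := range inputs {
		w := workers[i%len(workers)]
		x := <-w.counts
		for j := 0; j < x; j++ {
			fmt.Println(<-w.out)
		}
	}
}
```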
Throughput does not suffer
[Chart: tuples per second (0 to 10,000) vs. floating point operations per tuple (2,000 to 32,000); the Ordered and Unordered pipelines achieve comparable throughput]
The Big Picture - Reliability
• Sources provide monotonically increasing IDs
– per stream
• The stream processor preserves ordering
– per source-stream
• A central DB maintains a mapping of:
source-stream => highest ID processed
Functionality of Stream Processor
• Compression, serialization
• Partitioning for distributed sinks
• Bucketing
– Take individual records and construct aggregates
• Across source nodes
• Across time – adjustable granularity
• Batching
– Submitting many records at once to the DB
• Bucketing and batching are done entirely with transient state (see the sketch below)
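A sketch of bucketing and batching with only transient state. The record fields, key layout, and flushToDB stub are illustrative; on a crash this state is simply rebuilt by replaying from the source:

```go
// Bucketing records into (URL, time) aggregates, then batching one write.
package main

import (
	"fmt"
	"time"
)

type record struct {
	url       string
	ts        time.Time
	latencyMs int
}

type key struct {
	url    string
	bucket time.Time // timestamp truncated to the reporting granularity
}

type agg struct {
	count   int
	totalMs int
}

// flushToDB stands in for one batched write of many rows.
func flushToDB(buckets map[key]*agg) {
	fmt.Println("submitting", len(buckets), "rows in one batch")
	for k, a := range buckets {
		fmt.Printf("  %s %s: count=%d avg=%dms\n",
			k.url, k.bucket.Format("15:04"), a.count, a.totalMs/a.count)
	}
}

func main() {
	const granularity = time.Minute
	buckets := map[key]*agg{} // transient operator state

	recs := []record{
		{"bar.com/m", time.Unix(3660, 0).UTC(), 90},
		{"bar.com/m", time.Unix(3661, 0).UTC(), 50},
		{"foo.com/r", time.Unix(3662, 0).UTC(), 10},
	}
	// Bucketing: fold each record into its (URL, time-bucket) aggregate.
	for _, r := range recs {
		k := key{r.url, r.ts.Truncate(granularity)}
		if buckets[k] == nil {
			buckets[k] = &agg{}
		}
		buckets[k].count++
		buckets[k].totalMs += r.latencyMs
	}
	// Batching: submit all buckets to the DB at once.
	flushToDB(buckets)
}
```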
Where to get the code
Stable
https://github.com/cloudflare/go-stream
Bleeding Edge
https://github.com/cevian/go-stream
arye@cs.princeton.edu
Data Model
Streaming OLAP-like cubes
Useful summaries of high-volume data
Cube Dimensions
[Diagram: a grid with URL (bar.com/m, bar.com/n, foo.com/q, foo.com/r) on one axis and Time on the other]
Cube Aggregates
[Diagram: each cell, e.g. (bar.com/m, 01:01:01), holds the aggregate values]
Updating A Cube
[Diagram: Request #1 arrives: bar.com/m, 01:01:00, latency 90 ms]
Map Request To Cell
[Diagram: Request #1 is mapped to the cell (bar.com/m, 01:01:00)]
Update The Aggregates
[Diagram: the cell's aggregates are updated with the 90 ms request]
Update In-Place
[Diagram: Request #2 (bar.com/m, 01:01:00, latency 50 ms) updates the same cell in place]
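A sketch of this sequence in code: map each request to its cube cell, then fold the request into that cell's aggregates in place. Assuming count, total, and max latency as the aggregates; names are illustrative:

```go
// Mapping requests to cube cells and updating aggregates in place.
package main

import "fmt"

type cellKey struct {
	url  string
	time string // already truncated to the cube's granularity
}

type cellAggs struct {
	requests int
	totalMs  int
	maxMs    int
}

type cube map[cellKey]*cellAggs

func (c cube) update(url, t string, latencyMs int) {
	k := cellKey{url, t} // map the request to its cell
	a := c[k]
	if a == nil {
		a = &cellAggs{}
		c[k] = a
	}
	a.requests++           // update the aggregates...
	a.totalMs += latencyMs // ...in place, keeping no per-request state
	if latencyMs > a.maxMs {
		a.maxMs = latencyMs
	}
}

func main() {
	c := cube{}
	c.update("bar.com/m", "01:01:00", 90) // Request #1
	c.update("bar.com/m", "01:01:00", 50) // Request #2, same cell

	a := c[cellKey{"bar.com/m", "01:01:00"}]
	fmt.Printf("requests=%d avg=%dms max=%dms\n",
		a.requests, a.totalMs/a.requests, a.maxMs)
}
```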
Cube Slice
[Diagram: a slice of the cube, e.g. one URL's cells across Time]
Cube Rollup
[Diagram: cells rolled up across the URL dimension, e.g. (URL: foo.com/*, Time: 01:01:01) and (URL: bar.com/*, Time: 01:01:01)]
Rich Structure
Cell  URL        Time
A     bar.com/*  01:01:01
B     *          01:01:01
C     foo.com/*  01:01:01
D     foo.com/r  01:01:*
E     foo.com/*  01:01:*
Key Property
2 types of rollups:
1. Across dimensions
2. Across sources
We use the same aggregation function for both.
This is a powerful conceptual constraint: semantic properties are preserved when changing the granularity of reporting.
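A sketch of this property, assuming associative, commutative aggregates (count and sum here): one merge function serves both kinds of rollup, so merging cells across URLs looks exactly like merging partial cells from different source nodes:

```go
// One merge function for both dimension and source rollups.
package main

import "fmt"

type agg struct {
	count int
	sum   int
}

// merge combines two aggregates; it is the only function needed.
func merge(a, b agg) agg {
	return agg{count: a.count + b.count, sum: a.sum + b.sum}
}

func main() {
	// Rollup across dimensions: foo.com/r + foo.com/q => foo.com/*
	fooR := agg{count: 2, sum: 140}
	fooQ := agg{count: 1, sum: 60}
	fmt.Println("foo.com/* :", merge(fooR, fooQ))

	// Rollup across sources: the same cell reported by two datacenters.
	dc1 := agg{count: 3, sum: 200}
	dc2 := agg{count: 5, sum: 410}
	fmt.Println("merged cell:", merge(dc1, dc2))
}
```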