On Managing Continuous Media Data
  Edward Chang
  Hector Garcia-Molina
  Stanford University

Challenges
  Large Volume of Data
    MPEG2 100 Minute Movie: 3-4 GBytes
  Large Data Transfer Rate
    MPEG2: 4 to 6 Mbps
    HDTV: 19.2 Mbps
  Just-in-Time Data Requirement
  Simultaneous Users

...Challenges
  Traditional Optimization Objectives:
    Maximizing Throughput!
    Maximizing Throughput!!
    Maximizing Throughput!!!
  How about Cost?
  How about Initial Latency?

Related Work
  IBM T.J. Watson Labs (P. Yu)
  USC (S. Ghandeharizadeh)
  UCLA (R. Muntz)
  UBC (Raymond Ng)
  Bell Labs (B. Ozden)
  etc.

Outline
  Server (Single Disk)
    Revisiting Conventional Wisdom
    Minimizing Cost
    Minimizing Initial Latency
  Server (Parallel Disks)
    Balancing Workload
    Minimizing Cost & Initial Latency
  Client
    Handling VBR
    Supporting VCR-like Functions

Conventional Wisdom (for Single Disk)
  Reducing Disk Latency leads to Better Disk Utilization
  Reducing Disk Latency leads to Higher Throughput
  Increasing Disk Utilization leads to Improved Cost Effectiveness

Is Conventional Wisdom Right?
  Does Reducing Disk Latency lead to Better Disk Utilization?
  Does Reducing Disk Latency lead to Higher Throughput?
  Does Increasing Disk Utilization lead to Improved Cost Effectiveness?

Tseek: Disk Latency
TR: Disk Transfer Rate
DR: Display Rate
S: Segment Size (Peak Memory Use per Request)
T: Service Cycle Time

S = DR × T
T = N × (Tseek + S/TR)
  (N: number of users served in each service cycle)

Disk Utilization
  S = (N × TR × DR × Tseek) / (TR - N × DR)
    S is directly proportional to Tseek
  Dutil = (S/TR) / (S/TR + Tseek)
    Dutil is Constant! (substituting S gives Dutil = N × DR / TR, independent of Tseek)

Is Conventional Wisdom Right?
  Does Reducing Disk Latency lead to Better Disk Utilization? NO!
  Does Reducing Disk Latency lead to Higher Throughput?
  Does Increasing Disk Utilization lead to Improved Cost Effectiveness?

What Affects Throughput?
  × Disk Utilization
  Disk Latency
  Throughput ?
  Memory Utilization

Memory Requirement
  We Examine the Memory Requirements of Two Disk Scheduling Policies
    Sweep (Elevator Policy): Enjoys the Minimum Seek Overhead
    Fixed-Stretch: Suffers from High Seek Overhead

Per User Peak Memory Use
  S = (N × TR × DR × Tseek) / (TR - N × DR)

Sweep (Elevator)
  Disk Latency: Minimum
  IO Time Variability: Very High

Sweep (Elevator)
  Memory Sharing: Poor
  Total Memory Requirement: 2 * N * Ssweep

Fixed-Stretch
  Disk Latency: High (because of Stretch)
  IO Time Variability: None (because of Fixed)

Fixed-Stretch
  Memory Sharing: Good
  Total Memory Requirement: 1/2 * N * Sfs

Throughput
  Policy          Total Memory      Available Memory   N
  Sweep           2 * N * Ssweep    40 MBytes          40
  Fixed-Stretch   1/2 * N * Sfs     40 MBytes          42 (Higher Throughput)
  * Based on a Realistic Case Study Using Seagate Disks

What Affects Throughput?
  [Diagram relating Disk Latency, Disk Utilization, Memory Utilization, and Throughput; one link is crossed out (×) and one is marked with a question mark (?)]

Is Conventional Wisdom Right?
  Does Reducing Disk Latency lead to Better Disk Utilization? NO!
  Does Reducing Disk Latency lead to Higher Throughput? NO!
  Does Increasing Disk Utilization lead to Improved Cost Effectiveness?

Per Stream Cost

Per-Stream Memory Cost
  Cm × S = (Cm × N × TR × DR × Tseek) / (TR - N × DR)

Example
  Disk Cost: $200 per unit
  Memory Cost: $5 per MByte
  Supporting N = 40 Requires 60 MBytes of Memory
    $200 + $300 = $500
  Supporting N = 50 Requires 160 MBytes of Memory
    $200 + $800 = $1,000
  For the same cost ($1,000), it's better to buy 2 Disks and 120 MBytes to support N = 80 Users!
  Memory Use is Critical

Is Conventional Wisdom Right?
  Does Reducing Disk Latency lead to Better Disk Utilization? NO!
  Does Reducing Disk Latency lead to Higher Throughput? NO!
  Does Increasing Disk Utilization lead to Improved Cost Effectiveness? NO!

So What?
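
Note: the single-disk analysis can be tried out numerically. The sketch below is only an illustration, not the authors' code. It implements the segment-size and disk-utilization formulas from the slides and the price arithmetic from the Example slide. The function names and the drive parameters (a 10 MB/s disk, 1.6 Mbps streams, 10 ms and 20 ms latencies) are assumptions chosen for illustration.

```python
# Sketch of the single-disk model: S = DR * T and T = N * (Tseek + S / TR).
# All drive parameters used below are hypothetical, not taken from the talk.


def segment_size(n, tr, dr, t_seek):
    """Per-stream segment size S = N*TR*DR*Tseek / (TR - N*DR).

    tr and dr are in bytes per second, t_seek in seconds; result in bytes.
    """
    if n * dr >= tr:
        raise ValueError("infeasible: need N * DR < TR")
    return n * tr * dr * t_seek / (tr - n * dr)


def disk_utilization(n, tr, dr, t_seek):
    """Fraction of each IO slot spent transferring data: (S/TR) / (S/TR + Tseek)."""
    transfer_time = segment_size(n, tr, dr, t_seek) / tr
    return transfer_time / (transfer_time + t_seek)


def system_cost(disks, memory_mbytes, disk_cost=200, memory_cost_per_mbyte=5):
    """Hardware cost in dollars, using the prices from the Example slide."""
    return disks * disk_cost + memory_mbytes * memory_cost_per_mbyte


if __name__ == "__main__":
    TR, DR, N = 10e6, 0.2e6, 40  # hypothetical: 10 MB/s disk, 1.6 Mbps streams
    for t_seek in (0.010, 0.020):
        s = segment_size(N, TR, DR, t_seek)
        u = disk_utilization(N, TR, DR, t_seek)
        print(f"Tseek={t_seek * 1000:.0f} ms  S={s / 1e3:.0f} KB  Dutil={u:.2f}")
    # The three configurations from the Example slide.
    print(system_cost(1, 60), system_cost(1, 160), system_cost(2, 120))
```

Doubling Tseek doubles S, while the utilization stays at N × DR / TR. This is why lower disk latency shows up as lower memory use rather than better disk utilization.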

Outline
  Server (Single Disk)
    Revisiting Conventional Wisdom
    Minimizing Cost
    Minimizing Initial Latency
  Server (Parallel Disks)
    Balancing Workload
    Minimizing Cost & Initial Latency
  Client
    Handling VBR
    Supporting VCR-like Functions

Initial Latency
  What is it?
    The time from when a request arrives at the server until its data is available in the server's main memory
  Where is it important?
    Interactive applications (e.g., video games)
    Interactive features (e.g., fast-scan)

Sweep (Elevator)

Fixed-Stretch
  Space Out IOs

Fixed-Stretch

Our Contribution: BubbleUp
  Fixed-Stretch Enjoys Fine Throughput
  BubbleUp Remedies Fixed-Stretch to Minimize Initial Latency

Schedule Office Work
  8am:   Host a Visitor
  9am:   Do Email
  10am:  Write Paper
  11am:  Write Paper
  Noon:  Lunch

BubbleUp

BubbleUp
  Empty Slots are Always Next in Time
  No Additional Memory Required
    Fill the Buffer up to the Segment Size
  No Additional Disk Bandwidth Required
    The Disk Is Idle Otherwise

Evaluation

Fast-Scan

Data Placement Policies
  Please refer to our publications

Chunk Allocation
  Allocate Memory in Chunks
    A Chunk = k * S
  Replicate the Last Segment of a Chunk at the Beginning of the Next Chunk
  Example
    Chunk 1: s1, s2, s3, s4, s5
    Chunk 2: s5, s6, s7, s8, s9

Chunk Allocation
  Largest-Fit First
  Best Fit (Last Chunk)

18 Segment Placement

Largest-Fit First

Best Fit

Outline
  Server (Single Disk)
    Revisiting Conventional Wisdom
    Minimizing Cost
    Minimizing Initial Latency
  Server (Parallel Disks)
    Balancing Workload
    Minimizing Cost & Initial Latency
  Client
    Handling VBR
    Supporting VCR-like Functions

Unbalanced Workload

Balanced Workload

Per Stream Memory Use (Use M Disks Independently)
  S = (N × TR × DR × Tseek) / (TR - N × DR)
  Total streams: M × N

Per Stream Memory Use (Use M Disks As One Disk)
  Total streams: M × N

…Continue
  S = (N × TR × DR × Tseek) / (TR - N × DR)
  S' = (N × M × TR × M × DR × Tseek) / (TR × M - N × M × DR)
  S' = (M × N × TR × DR × Tseek) / (TR - N × DR) = M × S
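
Note: the M × S result can be checked numerically. The sketch below is only an illustration under stated assumptions. It treats the striped array as one virtual disk with transfer rate M × TR serving N × M streams at an unchanged Tseek, which is how the S' formula on the slide is set up. The function names and drive parameters are hypothetical.

```python
# Compare per-stream memory for M independent disks and one striped virtual disk.
# Drive parameters are hypothetical, chosen only to exercise the formulas.


def segment_size(n, tr, dr, t_seek):
    """Per-stream segment size S = N*TR*DR*Tseek / (TR - N*DR), in bytes."""
    if n * dr >= tr:
        raise ValueError("infeasible: need N * DR < TR")
    return n * tr * dr * t_seek / (tr - n * dr)


def striped_segment_size(n, m, tr, dr, t_seek):
    """S' for M disks used as one virtual disk.

    The virtual disk transfers at M*TR and serves N*M streams per cycle;
    Tseek is assumed to be the same as for a single disk.
    """
    return segment_size(n * m, m * tr, dr, t_seek)


if __name__ == "__main__":
    TR, DR, T_SEEK, N = 10e6, 0.2e6, 0.010, 40  # hypothetical values
    s = segment_size(N, TR, DR, T_SEEK)
    for m in (2, 4, 8):
        s_striped = striped_segment_size(N, m, TR, DR, T_SEEK)
        print(f"M={m}: S'/S = {s_striped / s:.2f}")
```

Each of the M × N streams needs M times the buffer under striping, so total memory grows with M squared, while independent disks keep the per-stream segment at S.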

Challenges
  Using M Disks Independently:
    Unbalanced Workload
    Low Per-Stream Memory Cost
  Using M Disks As One Virtual Disk (i.e., Employing Fine-Grained Striping):
    Balanced Workload
    High Per-Stream Memory Cost

Our Approach (2DB)
  Use Disks Independently
    To Minimize Cost
  Replicate Hot Movies (20% of Movies)
    To Balance Workload
  Use BubbleUp
    To Minimize Initial Latency

2D BubbleUp (2DB)
  Intelligent Data Placement
  Efficient Request Scheduling
  FODO, 1998

2DB Data Placement: Chunk Allocation

2DB Scheduling
  Formally, This is a Bipartite Weighted Matching Problem
    Can be solved using the Hungarian method in O(V^3), where V = N × M
  We use a Greedy Method to reduce the problem to a Bipartite Unweighted Matching Problem
    Can be solved in O(M^2)

Why Does 2DB Work?
  Maximum load of any urn
  n balls, n urns, finite n:
    One destination per ball: (ln n / ln ln n) × (1 + o(1))
    Two possible destinations per ball: ln ln n / ln 2 + O(1)
  m balls, n urns, m > n, infinite m and n:
    d: number of possible destinations
    (ln ln n / ln d) × (1 + o(1)) + O(m/n)

What Does 2DB Cost?
  Storage Cost
    Additional disk cost = % of hot movies
    Typically, 20% of the movies are requested 80% of the time
  Throughput
    Throughput is scaled back by a fraction to achieve a balanced workload

Evaluation
  2DB Achieves Balanced Workload with High Throughput
    Compared to, e.g., some dynamic load balancing schemes
  2DB Incurs Low Additional Storage Cost
  2DB Enjoys Minimum Initial Latency

Outline
  Server (Single Disk)
    Revisiting Conventional Wisdom
    Minimizing Cost
    Minimizing Initial Latency
  Server (Parallel Disks)
    Balancing Workload
    Minimizing Cost & Initial Latency
  Client
    Handling VBR
    Supporting VCR-like Functions

Media Client
  Most Studies Assume Dumb Clients
  We Propose Smart Clients for
    Handling VBR
    Supporting VCR-like Functions

Handling VBR
  Server Can Handle VBR
    Frame rate fluctuates, but the moving average does not fluctuate as much
    Rates even out when N is large, which is typically the case

...VBR
  But, the Server Cannot Eliminate Bitrate Mismatch
    Packetization and Channel Delay can change the bitrate
  The Solution Must
  Be at the Client Side!

Supporting VCR-like Functions
  Pause
    Phone call interruptions
    Biological needs
  Fast Forward
    Catching up with the program after a pause
  Instant Replay

How to Pause a Movie?
  Broadcast TV Cannot Be Paused
  Pausing Via a Point-to-point Link Affects the Server's Scheduling
  Caching!!!
    Main Memory Caching? Too expensive! (19.2 Mbps × 20 min ≈ 2.9 GBytes)

Buffer Management

Challenges
  Must Ensure Arriving Bits Do Not Overflow the Network Buffer
  Must Ensure the Decoder Buffer Does Not Underflow
  Must Work with Any Off-the-shelf Disk and CPU Box

Our Contribution: MEDIC
  MEDIC: MEmory & Disk Integrated Cache
  MEDIC Manages IOs Between Memory and Disk Efficiently
    Only 4 MBytes of main memory needed!!!
    Makes a set-top box affordable
  MEDIC Adapts to Hardware Configuration

Demo
  Regular Playback
  Pause
  Resume Regular Playback
  Fast Forward
  Instant Replay (not shown)

Visualize MEDIC

Conclusions (Contributions in Blue)
  Server (Single Disk)
    Revisiting Conventional Wisdom
    Minimizing Cost
    Minimizing Initial Latency
  Server (Parallel Disks)
    Balancing Workload
    Minimizing Cost & Initial Latency
  Client
    Handling VBR
    Supporting VCR-like Functions

…Conclusions
  Our Server Supports Low Latency Playback and Fast Forward
  Our Client Supports Pause and Low Latency Instant Replay
  Together, We Propose a Complete End-to-end Solution for Continuous Media Data Delivery!

Future Work
  Enhancing MEDIC for Managing Heterogeneous Data, from Both Broadcast & Internet Channels
  Video Panoramas
  Interactive TV
  Indexing Videos for Replay
  Video/Image Databases