On Managing Continuous Media Data Edward Chang Hector Garcia-Molina

On Managing Continuous Media Data
Edward Chang, Hector Garcia-Molina
Stanford University

Challenges
- Large Volume of Data
  - MPEG2 100-Minute Movie: 3-4 GBytes
- Large Data Transfer Rate
  - MPEG2: 4 to 6 Mbps
  - HDTV: 19.2 Mbps
- Just-in-Time Data Requirement
- Simultaneous Users

...Challenges
- Traditional Optimization Objectives:
  - Maximizing Throughput!
  - Maximizing Throughput!!
  - Maximizing Throughput!!!
- How about Cost?
- How about Initial Latency?

Related Work
- IBM T.J. Watson Labs (P. Yu)
- USC (S. Ghandeharizadeh)
- UCLA (R. Muntz)
- UBC (Raymond Ng)
- Bell Labs (B. Ozden)
- etc.

Outline
- Server (Single Disk)
  - Revisiting Conventional Wisdom
  - Minimizing Cost
  - Minimizing Initial Latency
- Server (Parallel Disks)
  - Balancing Workload
  - Minimizing Cost & Initial Latency
- Client
  - Handling VBR
  - Supporting VCR-like Functions

Conventional Wisdom (for Single Disk)
- Reducing Disk Latency leads to Better Disk Utilization
- Reducing Disk Latency leads to Higher Throughput
- Increasing Disk Utilization leads to Improved Cost Effectiveness

Is Conventional Wisdom Right?
- Does Reducing Disk Latency lead to Better Disk Utilization?
- Does Reducing Disk Latency lead to Higher Throughput?
- Does Increasing Disk Utilization lead to Improved Cost Effectiveness?

- Tseek: Disk Latency
- TR: Disk Transfer Rate
- DR: Display Rate
- S: Segment Size (Peak Memory Use per Request)
- T: Service Cycle Time
- N: Number of Simultaneous Users

S = DR × T
T = N × (Tseek + S/TR)

Disk Utilization
S = (N × TR × DR × Tseek) / (TR - N × DR)
- S is directly proportional to Tseek
Dutil = (S/TR) / (S/TR + Tseek)
- Dutil is Constant!

Is Conventional Wisdom Right?
- Does Reducing Disk Latency lead to Better Disk Utilization? NO!
- Does Reducing Disk Latency lead to Higher Throughput?
- Does Increasing Disk Utilization lead to Improved Cost Effectiveness?

What Affects Throughput?
[Diagram relating Disk Latency, Disk Utilization, Memory Utilization, and Throughput; one link is crossed out (×) and one is marked "?"]

Memory Requirement
- We Examine the Memory Requirement of Two Disk Scheduling Policies
  - Sweep (Elevator Policy): Enjoys the Minimum Seek Overhead
  - Fixed-Stretch: Suffers from High Seek Overhead

Per User Peak Memory Use
S = (N × TR × DR × Tseek) / (TR - N × DR)

Sweep (Elevator)
- Disk Latency: Minimum
- IO Time Variability: Very High
- Memory Sharing: Poor
- Total Memory Requirement: 2 × N × Ssweep

Fixed-Stretch
- Disk Latency: High (because of Stretch)
- IO Time Variability: None (because of Fixed)
- Memory Sharing: Good
- Total Memory Requirement: 1/2 × N × Sfs

Throughput
- Sweep
  - Total Memory Requirement: 2 × N × Ssweep
  - Available Memory = 40 MBytes
  - N = 40
- Fixed-Stretch
  - Total Memory Requirement: 1/2 × N × Sfs
  - Available Memory = 40 MBytes
  - N = 42 (Higher Throughput)
* Based on a Realistic Case Study Using Seagate Disks

What Affects Throughput?
[Diagram relating Disk Latency, Disk Utilization, Memory Utilization, and Throughput; one link is crossed out (×) and one is marked "?"]

Is Conventional Wisdom Right?
- Does Reducing Disk Latency lead to Better Disk Utilization? NO!
- Does Reducing Disk Latency lead to Higher Throughput? NO!
- Does Increasing Disk Utilization lead to Improved Cost Effectiveness?

Per Stream Cost

Per-Stream Memory Cost
Cm × S = (Cm × N × TR × DR × Tseek) / (TR - N × DR)

Example
- Disk Cost: $200 per unit
- Memory Cost: $5 per MByte
- Supporting N = 40 Requires 60 MBytes of Memory
  - $200 + $300 = $500
- Supporting N = 50 Requires 160 MBytes of Memory
  - $200 + $800 = $1,000
- For the same cost of $1,000, it's better to buy 2 Disks and 120 MBytes to support N = 80 Users!
- Memory Use is Critical

Is Conventional Wisdom Right?
- Does Reducing Disk Latency lead to Better Disk Utilization? NO!
- Does Reducing Disk Latency lead to Higher Throughput? NO!
- Does Increasing Disk Utilization lead to Improved Cost Effectiveness? NO!

So What?
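
The single-disk relations above (S = DR × T, T = N × (Tseek + S/TR)) and the cost example can be checked with a short Python sketch. The function names and the disk and stream parameters (TR = 64 Mbit/s, DR = 1.5 Mbit/s, N = 20) are hypothetical choices for illustration, not values from the talk; only the prices and the three (disks, memory, users) configurations come from the Example slide.

```python
def segment_size(n, tr, dr, tseek):
    """Solve S = DR*T, T = N*(Tseek + S/TR) for S (units: Mbit, Mbit/s, s)."""
    if n * dr >= tr:
        raise ValueError("infeasible: N * DR must be smaller than TR")
    return n * tr * dr * tseek / (tr - n * dr)


def disk_utilization(n, tr, dr, tseek):
    """Dutil = (S/TR) / (S/TR + Tseek): fraction of a cycle spent transferring."""
    transfer_time = segment_size(n, tr, dr, tseek) / tr
    return transfer_time / (transfer_time + tseek)


def system_cost(disks, memory_mbytes, disk_cost=200.0, cost_per_mbyte=5.0):
    """Total cost using the slide's example prices ($200 per disk, $5 per MByte)."""
    return disks * disk_cost + memory_mbytes * cost_per_mbyte


# Hypothetical disk and stream parameters (assumptions, not from the talk).
N, TR, DR = 20, 64.0, 1.5  # streams, Mbit/s, Mbit/s

# Halving Tseek halves S but leaves Dutil unchanged.
for tseek in (0.01, 0.02):
    s = segment_size(N, TR, DR, tseek)
    u = disk_utilization(N, TR, DR, tseek)
    print(f"Tseek={tseek:.2f}s  S={s:.3f} Mbit  Dutil={u:.4f}")

# Configurations from the slide's example: (disks, memory in MBytes, users).
for disks, mem, users in ((1, 60, 40), (1, 160, 50), (2, 120, 80)):
    total = system_cost(disks, mem)
    print(f"{disks} disk(s), {mem} MBytes, N={users}: "
          f"${total:.0f} total, ${total / users:.2f} per stream")
```

Under these assumptions Dutil reduces to N × DR / TR whatever the value of Tseek, so a lower disk latency shows up as a smaller S (less memory) rather than as higher utilization. With the slide's prices, the per-stream cost is $12.50 at N = 40, $20.00 at N = 50 on one disk, and $12.50 again at N = 80 on two disks.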
Outline
- Server (Single Disk)
  - Revisiting Conventional Wisdom
  - Minimizing Cost
  - Minimizing Initial Latency
- Server (Parallel Disks)
  - Balancing Workload
  - Minimizing Cost & Initial Latency
- Client
  - Handling VBR
  - Supporting VCR-like Functions

Initial Latency
- What is it?
  - The time from when a request arrives at the server until its data is available in the server's main memory
- Where is it important?
  - Interactive applications (e.g., video games)
  - Interactive features (e.g., fast-scan)

Sweep (Elevator)

Fixed-Stretch
- Space Out IOs

Our Contribution: BubbleUp
- Fixed-Stretch Enjoys Fine Throughput
- BubbleUp Remedies Fixed-Stretch to Minimize Initial Latency

Schedule Office Work
  8am:   Host a Visitor
  9am:   Do Email
  10am:  Write Paper
  11am:  Write Paper
  Noon:  Lunch

BubbleUp
- Empty Slots are Always Next in Time
- No Additional Memory Required
  - Fill the Buffer up to the Segment Size
- No Additional Disk Bandwidth Required
  - The Disk Is Idle Otherwise

Evaluation

Fast-Scan

Data Placement Policies
- Please refer to our publications

Chunk Allocation
- Allocate Memory in Chunks
- A Chunk = k × S
- Replicate the Last Segment of a Chunk at the Beginning of the Next Chunk
- Example
  - Chunk 1: s1, s2, s3, s4, s5
  - Chunk 2: s5, s6, s7, s8, s9

Chunk Allocation
- Largest-Fit First
- Best Fit (Last Chunk)

18 Segment Placement

Largest-Fit First

Best Fit

Outline
- Server (Single Disk)
  - Revisiting Conventional Wisdom
  - Minimizing Cost
  - Minimizing Initial Latency
- Server (Parallel Disks)
  - Balancing Workload
  - Minimizing Cost & Initial Latency
- Client
  - Handling VBR
  - Supporting VCR-like Functions

Unbalanced Workload

Balanced Workload

Per Stream Memory Use (Use M Disks Independently)
S = (N × TR × DR × Tseek) / (TR - N × DR)
- M × N users in total

Per Stream Memory Use (Use M Disks As One Disk)
- M × N users in total

...Continue
S  = (N × TR × DR × Tseek) / (TR - N × DR)
S' = ((N × M) × (TR × M) × DR × Tseek) / (TR × M - N × M × DR)
S' = (M × N × TR × DR × Tseek) / (TR - N × DR) = M × S

Challenges
- Using M Disks Independently:
  - Unbalanced Workload
  - Low Per-Stream Memory Cost
- Using M Disks As One Virtual Disk (i.e., Employing Fine-Grained Striping):
  - Balanced Workload
  - High Per-Stream Memory Cost

Our Approach (2DB)
- Use Disks Independently To Minimize Cost
- Replicate Hot Movies (20% of Movies) To Balance Workload
- Use BubbleUp To Minimize Initial Latency

2D BubbleUp (2DB)
- Intelligent Data Placement
- Efficient Request Scheduling
- FODO, 1998

2DB Data Placement: Chunk Allocation

2DB Scheduling
- Formally, This is a Bipartite Weighted Matching Problem
  - Can be solved using the Hungarian method in O(V^3), where V = NM
- We use a Greedy Method to reduce the problem to a Bipartite Unweighted Matching Problem
  - Can be solved in O(M^2)

Why Does 2DB Work?
- n balls, n urns, finite n (maximum urn load):
  - One possible destination per ball: (ln n / ln ln n) × (1 + o(1))
  - Two possible destinations per ball: ln ln n / ln 2 + O(1)
- m balls, n urns, m > n, infinite m and n:
  - d: number of possible destinations
  - (ln ln n / ln d) × (1 + o(1)) + O(m/n)

What Does 2DB Cost?
Storage Cost Addition disk cost = % hot movies Typically 20% of movies subscribed 80% of time Throughput Throughput is scaled back by a fraction to achieve balanced work 62 Evaluation 2DB Achieves Balanced Workload with High Throughput Compared to e.g., some dynamic load balancing schemes 2DB Incurs Low Additional Storage Cost 2DB Enjoys Minimum Initial Latency 63 Outline Server (Single Disk) Revisiting Conventional Wisdom Minimizing Cost Minimizing Initial Latency Server (Parallel Disks) Balancing Workload Minimizing Cost & Initial Latency Client Handling VBR Supporting VCR-like Functions 64 Media Client Most Studies Assume Dumb Clients We Propose Smart Clients for Handling VBR Supporting VCR-like Functions 65 Handling VBR Server Can Handle VBR Frame rate fluctuates but the moving average does not fluctuate as much Rates are even out when N is large, which is typically the case 66 ...VBR But, the Server Cannot Eliminate Bitrate Mismatch Packetization and Channel Delay can change the bitrate The Solution Must Be at the Client Side! 67 Supporting VCR-like Functions Pause Phone call interruptions Biological needs Fast Forward Catching up the program after a pause Instant Replay 68 How to Pause A Movie? Broadcast TV Cannot Be Paused Pausing Via a Point-to-point Link Affects the Server’s Scheduling Caching!!! Main Memory Caching? Too expensive! (19.2 mbps * 20 min = 2 GBytes) 69 Buffer Management 70 Challenges Must Ensure Arriving Bits Do Not Overflow the Network Buffer Must Ensure Decoder Buffer Does Not Underflow Must Work for Any Off-the-shelf Disks, CPU Box 71 Our Contribution: MEDIC MEDIC: MEmory & Disk Integrated Cache MEDIC Manages IOs Between Memory and Disk Efficiently Only 4 Mbytes main memory needed!!! 
Make a set-top box affordable MEDIC Adapts to Hardware Configuration 72 Demo Regular Playback Pause Resume Regular Playback Fast Forward Instant Replay (not shown) 73 Visualize MEDIC 74 Conclusions (Contributions in Blue) Server (Single Disk) Revisiting Conventional Wisdom Minimizing Cost Minimizing Initial Latency Server (Parallel Disks) Balancing Workload Minimizing Cost & Initial Latency Client Handling VBR Supporting VCR-like Functions 75 …Conclusions Our Server Supports Low Latency Playback and Fast Forward Our Client Supports Pause and Low Latency Instance Replay Together, We Propose A Complete Endto-end Solution for Continuous Media Data Delivery! 76 Future Work Enhancing MEDIC for Managing Heterogeneous Data, from Both Broadcast & Internet Channels Video Panoramas Interactive TV Indexing Videos for Replay Video/Image databases 77
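
The parallel-disk comparison in the deck (M disks used independently versus as one virtual disk) can also be checked numerically. The Python sketch below assumes, as the slide's formula does, that striping over M disks serves N × M streams at an aggregate transfer rate of TR × M with the same Tseek. The function names and parameter values are hypothetical, not taken from the talk.

```python
def segment_size(n, tr, dr, tseek):
    """Single-disk segment size S = N*TR*DR*Tseek / (TR - N*DR)."""
    if n * dr >= tr:
        raise ValueError("infeasible: N * DR must be smaller than TR")
    return n * tr * dr * tseek / (tr - n * dr)


def striped_segment_size(n, m, tr, dr, tseek):
    """M disks as one virtual disk: N*M streams share an aggregate rate TR*M."""
    return segment_size(n * m, tr * m, dr, tseek)


# Hypothetical parameters (assumptions, not from the talk).
N, TR, DR, TSEEK = 20, 64.0, 1.5, 0.02  # streams per disk, Mbit/s, Mbit/s, s

s = segment_size(N, TR, DR, TSEEK)
for m in (1, 2, 4, 8):
    s_prime = striped_segment_size(N, m, TR, DR, TSEEK)
    # The ratio S'/S grows linearly with the number of disks M.
    print(f"M={m}: S'={s_prime:.3f} Mbit, S'/S={s_prime / s:.2f}")
```

The ratio S'/S equals M in every case, which is the reason the deck prefers independent disks with replication of hot movies over fine-grained striping: per-stream memory stays at S instead of growing to M × S.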
