Tradeoffs in CDN Designs for Throughput-Oriented Traffic
Minlan Yu, University of Southern California
Joint work with Wenjie Jiang, Haoyuan Li, and Ion Stoica

Throughput-Oriented Traffic
• Throughput-oriented traffic is growing on the Internet
– A Cisco report predicts that 90% of consumer traffic will be video by 2013 (e.g., Netflix, YouTube)
– Software, game, and movie downloads
– Most of it is delivered by content distribution networks (CDNs)
Revisit CDN design choices for throughput-oriented traffic

Where Is the Throughput Bottleneck?
• Client: computer or access link too slow
• Network: congestion at peering and upstream links
• Server: not enough resources (CPU, power, bandwidth)

Understanding Throughput Bottleneck
• Network bottlenecks are common
– Netflix sees reduced video rates due to low ISP capacity
– Akamai reported bottlenecks at peering links
[Figure: Degraded video performance caused by network congestion. x-axis: concurrent views (K); y-axis: buffering ratio.]

Nature of Bottleneck Is Changing
• More throughput-oriented applications
– Video traffic lasts longer and has higher volume
• More elephant flows will step on each other in the future
– Decreases the benefits of statistical multiplexing
– Introduces more challenges in bandwidth provisioning

Improving Network Throughput
• ISP-CDNs: multiple paths and better path selection
– ISPs move up the revenue chain to deliver content
• ISP-CDNs such as AT&T and Verizon
– Control both the servers and the network
– Better traffic engineering for CDN traffic
• Existing CDNs: deploy servers at more locations and set up more peering points
[Figure: peering points]
Question 1: What is the throughput benefit of more paths over more peering points?

Improving CDN Throughput
• Highly distributed approach (e.g., Akamai)
– Many server locations, more high-throughput paths
– Higher management, replication, and bandwidth costs
• More centralized approach (e.g., Limelight)
– A few large data centers with more peering points
– Lower cost due to economies of scale
[Figure: more centralized vs. highly distributed CDN]
Question 2: How do more centralized and more distributed CDNs compare on throughput and cost?

Modeling CDN Design Choices
• CDNs: increase peering points at the edge
• ISPs: improve path selection at the core

Increase Peering Points
• Modeling peering points (PPs)
– Increase the number of PPs to study the effect on throughput
– Pick PP locations from synthetic and real topologies
• Peering point selection
– Maximize aggregate throughput
– By assigning client locations to PPs and splitting traffic across different PPs

Improve Path Selection
• Today: no cooperation (1path)
– ISPs: shortest-path routing (e.g., OSPF)
– CDNs: select peering points to maximize throughput
• Better contracts between ISPs and CDNs (n paths)
– ISPs: expose multiple shortest paths to CDNs (e.g., MPLS)
– CDNs: select peering points and paths

Improving Path Selection
• ISP-CDNs: optimal throughput (mcf)
– Joint traffic engineering and server selection
– Reduces to a multi-commodity flow problem
• Optimization formulation
– Objective: maximize total throughput
– Subject to: client demand and link capacity constraints
– Variables: peering point selection and traffic splitting on each path (Flow_{path, pp, client})

An Example
• Min-cut size
– Improving path selection can at best approach the min-cut size
– Increasing the number of peering points increases the min-cut size itself
[Figure: example topology with link capacities 2, 1, and 2]
• With PP2 and PP3, the maximum throughput of multiple paths is 4 (min-cut size 4)
• With 4 PPs, the min-cut size becomes 8

Question 1: What is the benefit of path selection over peering point selection?
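
The min-cut argument in the example can be checked with a small max-flow computation. The sketch below is only an illustration: the topology built by `toy_network` is hypothetical (it is not the topology from the slide's figure), and the node names, the helper names, and the choice of Edmonds-Karp are assumptions made for this example. Each peering point has a peering link of capacity 2 followed by two internal paths of capacity 1; the "1path" case may use only one of them, while the multipath case may use both.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp max flow on a dict-of-dicts capacity map."""
    # Build the residual graph: forward capacities plus zero-capacity reverse edges.
    residual = {}
    for u, neighbors in capacity.items():
        for v, c in neighbors.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual[v].setdefault(u, 0)
    if source not in residual or sink not in residual:
        return 0

    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path left: total equals the min-cut size

        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push the bottleneck amount along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


def toy_network(num_pps, multipath):
    """Hypothetical topology: cdn -> pp_i -> r_i -> {a_i, b_i} -> clients."""
    cap = {"cdn": {}}
    for i in range(1, num_pps + 1):
        pp, router = f"pp{i}", f"r{i}"
        cap["cdn"][pp] = float("inf")        # the CDN can feed its own PP freely
        cap[pp] = {router: 2}                # peering link, capacity 2
        cap[router] = {f"a{i}": 1}           # default (shortest) internal path
        cap[f"a{i}"] = {"clients": 1}
        if multipath:
            cap[router][f"b{i}"] = 1         # second internal path exposed by the ISP
            cap[f"b{i}"] = {"clients": 1}
    return cap


for num_pps in (2, 4):
    for multipath in (False, True):
        flow = max_flow(toy_network(num_pps, multipath), "cdn", "clients")
        print(f"{num_pps} PPs, multipath={multipath}: throughput {flow}")
```

In this toy network, multipath with 2 PPs reaches the min-cut of 4 but cannot exceed it, whereas going to 4 PPs raises the min-cut itself to 8. Real topologies share internal links among peering points, so the gap between the two approaches depends on the topology and capacity distribution.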

Quantify the Benefits under Various Scenarios
• Network
– Topologies: power-law, random, hierarchical, different link densities, router-level ISP topology, AS-level Internet topology
– Link capacity distributions: uniform, exponential, Pareto, higher inter-AS bandwidth
• CDN peering points
– Map Akamai and Limelight server IP addresses to ASes (collected from PlanetLab measurements in Nov. 2010)
– Randomly pick peering points for synthetic topologies
• Client demands
– Session-level traces from Conviva collected between Dec. 2011 and April 2012

Multipath Is Not Better than Multiple Locations
– Power-law graph (500 nodes, 997 links)
– Uniform link capacity distribution
– 200 clients at random locations
Multiple paths bring little improvement compared with increasing peering points

Effect of Network Topology
– Increasing peering points is better than multipath in most topologies
– Except star-like topologies with uniform link capacity
• The throughput from 1path to mcf increases by 110% - 584%
• The throughput from 10 PPs to 20 PPs increases by 337%
[Figure: network topology graph; node labels omitted]

Path Selection Not Useful under Flash Crowd
– Conviva traces during normal and flash crowd periods
– Path selection has little benefit under normal traffic
– Path selection is worse than peering point selection alone under flash crowds
[Figure: relative scaling ratio, Thpt (path + peering point selection) over Thpt (peering point selection), vs. path selection interval (5 min, 10 min, 30 min, 1 hour, 2 hours), for flash crowd and normal periods]

More Peering Points Are Better than More Paths with Long-Tail Distribution of Contents
– Long-tail content distribution trace from Conviva
– With less replication, the throughput benefit of multipath increases
• Without replication, content delivery is closer to single-source traffic
[Figure: normalized throughput vs. duplication threshold (%) for 100PP,1path; 10PP,mcf; and 10PP,1path]

Takeaway 1: CDNs only need to control the edge of the Internet to improve throughput. ISP-CDNs do not gain significantly over CDNs from controlling the network.

Question 2: How do throughput and cost compare between more centralized and more distributed CDNs?

Throughput Comparison of CDNs
– Assume a fixed aggregate peering bandwidth per CDN
– A more distributed CDN achieves better throughput than a more centralized one
[Figure: throughput (K) vs. number of peering links for distributed and centralized CDNs; legend (peering bw): 2-3, 4-5, 6-10, >10]

CDN Operation Cost
• Management cost
– At each location: electricity, cooling, equipment maintenance, and human resources
• Content replication cost
– Storage cost to replicate popular content
– Bandwidth cost to redirect traffic for rare content
• Bandwidth cost
– CDNs often pay ISPs for the bandwidth they use at peering points, based on a mutually agreed billing model

Different Cost Functions
• Cost as a function of bandwidth at a location
– Different functions: polynomial, linear, log, exponential
– Model how fast the unit cost drops with throughput
– In practice: a linear combination of different functions
[Figure: unit price per bandwidth vs. throughput for polynomial, linear, log, and exponential cost functions]

Polynomial Cost
• A distributed CDN is more expensive than a centralized one
– Limelight has larger throughput at each location and thus larger economy-of-scale gains
– The same observation holds across various operational cost functions and their combinations
[Figure: unit price per bandwidth vs. throughput (K) for distributed and centralized CDNs; legend: 2-3, 4-5, 6-10, >10]

Takeaway 2: More distributed CDNs achieve higher throughput than more centralized CDNs, but they are more expensive for the same throughput.

Conclusion
• A simple model to quantify CDN design choices
– Increasing the number of peering points
– Improving path selection
– More distributed vs. more centralized design
• Optimization at the edge is enough for CDNs
– Multipath has little benefit over increasing the number of locations and choosing different peering links
– There is a tradeoff between throughput and cost among CDNs

Thanks! Questions?