Back-Office Web Traffic on The Internet
E. Pujol, P. Richter, B. Chandrasekaran, G. Smaragdakis, A. Feldmann, B. Maggs, and K-C. Ng
Presentation by: James Newman

Introduction
● Front-Office Traffic
  ○ End User
● Back-Office Traffic
  ○ ex. CDNs
● Vantage Points
  ○ 2 major IXPs
  ○ Major CDN
  ○ Major ISP

Main Contributions
● Identify and classify types of back-office traffic
  ○ Front-Office vs Back-Office
● Identify implications on:
  ○ network protocol design
  ○ co-location strategies
● Confirm that
  ○ CDNs have a sophisticated back-office
  ○ Back-Office Web traffic is significant

Data Sets
● Active Measurements
  ○ IP addresses
  ○ DNS reverse lookups

Identifying Back-Office Traffic
● Classify all IPs based on involvement in Web activity
● Classify activities as
  ○ Client
    ■ Auctioneers
    ■ Crawlers
  ○ Server
    ■ Bidders
  ○ Both
    ■ Proxies

Back-Office Communication

Identifying Back-Office Traffic: Web Servers
● Two Caveats:
  ○ Dynamically assigned IP addresses
  ○ Complex Dual behavior

Identifying Back-Office Traffic: IP Classification
● Most IPs are clients
● 11% of Web servers also act as clients

Closer Look
● Auctioneers
  ○ > 300 IPs
  ○ 2 search engines, an online social network, and a Web portal
● Bidders
  ○ 282 IPs
  ○ Many co-located with Auctioneers
● Crawlers
  ○ > 3,000 IPs from 120 ASes
  ○ 2 ASes host 72%
● Content Delivery Proxies
  ○ > 30,000 IPs

Estimating Back-Office Traffic
● Combine different data sets
● Backbone links more variable than IXPs

Back-Office Traffic Characteristics
● Temporal Behavior
● Spatial Behavior

Back-Office Traffic Communication Patterns
● L-IXP trace
● Auctioneers are most active
  ○ 232 million bid requests/hour on average

CDN Perspective
● Front-office vs. Back-office CDN Traffic
● At least 25% of traffic is back-office

CDN Perspective cont.
● Intra-CDN/Public traffic
● Substantial fraction of traffic travels a short distance

End User Perspective
● Improving Experience leads to increases in
  ○ revenues
  ○ user engagement
● Leads to an increase of back-office traffic

Implications
● Researchers
  ○ Must differentiate between front and back-office traffic
  ○ Back-office infrastructure could lead to new protocols
    ■ different paradigms: Software Defined Networking
● Operators
  ○ Back-office links/traffic can influence front-office operations
  ○ May have different requirements than front-office traffic
    ■ customized services
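
Illustration: IP classification
The "Identifying Back-Office Traffic" slides classify every IP as a client, a server, or both. The slides do not say how this is done in practice, so the following is only a minimal sketch under stated assumptions: the input is a list of hypothetical (initiator IP, responder IP, responder port) tuples, and a connection counts as Web activity if the responder port is 80 or 443. The function name `classify_ips` and the constant `WEB_PORTS` are invented for this illustration and do not come from the paper.

```python
from collections import defaultdict

# Assumption: Web activity is approximated by the well-known HTTP/HTTPS ports.
WEB_PORTS = {80, 443}


def classify_ips(flows):
    """Label each IP seen in Web flows as 'client', 'server', or 'both'.

    flows: iterable of (initiator_ip, responder_ip, responder_port) tuples
           (a hypothetical record format chosen for this sketch).
    """
    roles = defaultdict(set)
    for initiator, responder, port in flows:
        if port not in WEB_PORTS:
            continue  # ignore non-Web connections
        roles[initiator].add("client")
        roles[responder].add("server")

    labels = {}
    for ip, seen in roles.items():
        # An IP observed in both roles shows the dual behavior the slides mention.
        labels[ip] = "both" if len(seen) == 2 else next(iter(seen))
    return labels


if __name__ == "__main__":
    sample = [
        ("10.0.0.1", "10.0.0.2", 443),  # end user -> server
        ("10.0.0.2", "10.0.0.3", 80),   # that server then acts as a client
        ("10.0.0.4", "10.0.0.5", 22),   # non-Web flow, ignored
    ]
    for ip, label in sorted(classify_ips(sample).items()):
        print(ip, label)
```

IPs labeled "both" are the ones showing dual behavior (for example, proxies). A real analysis would also need to handle the two caveats listed on the Web Servers slide, such as dynamically assigned IP addresses, which this sketch ignores.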