CPSC 641: Project Brainstorming Session
Carey Williamson
Department of Computer Science
University of Calgary

PROJECT OVERVIEW
A “typical” course project might involve:
– design/build/obtain appropriate testbed, environment, or platform for your project
– extend/customize as needed
– obtain relevant data/measurements needed
– design suitable experiment: clear goal, identify factors, levels, performance metrics
– obtain and present (new/interesting) results

Some Data Sets and Traces
Web server access logs (1996)
Web proxy access logs (1998)
MPEG video traces (20 x 40,000 frames)
ISP measurements (4 traces, 1-2 minutes)
FrameRelay/ATM traces (5 traces)
Bellcore Ethernet LAN trace (1989)
TCP/IP packet traces (LBL, 24 hours, 1.8M)
See also the “Internet Traffic Archive”

Some Available Simulators
ATM-TN simulator (ATM cell-level)
Clustered Web server simulator (cws)
Web proxy caching hierarchies (Muda)
Distributed Web proxy simulator (dws)
IP-TN simulator (U of C)
IP-TNE network emulator (U of C)
LBL’s ns-2 simulator (TCP packet level)

Some Useful Tools
Synthetic Web proxy workload generation
Web client traffic model (mosaic, 1995)
LRD traffic analysis (R/S, V-T, AC, etc)
GUI for traffic modeling/analysis (synTraff)
Wavelet-based traffic model (Ram)
Synthetic MPEG video trace generation
Wireless Sniffer (network analyzer)

Issues and Ideas
Improving/extending WebTraff tool
Web client traffic modeling
Web proxy caching hierarchies
Hierarchical vs distributed caching
Web response time modeling
Improving network TCP flow model (dws)
Wavelet-based traffic forecasting
Wavelength assignment in WDM networks

1. ATM-TN System Overview (1998)
[Figure: ATM-TN system overview. Labels: Input Data Set; ATM-T; ATM-N; TMF; SimKit; Output Data Set; workstation; Report Generation Scripts; Report; ATM MF; WarpKit; ESS; SMTW; UNIX Hardware (SPARC, KSR, SGI)]
[Figure: ATM-TN software layers. Labels: CBR, Poisson, Ethernet, JPEG/MPEG, Web, TCP/IP/AAL5, ABR; Traffic Models; Switch and Network Models; ATM MF; TMF; SimKit; WarpKit; SMTW; X; WaiKit; ESS; UNIX Operating System]
Sequential: UNIX Workstations (SGI, SPARC, DEC, HP)
Parallel: SGI Power Challenge, SPARC 1000

2. Clustered Web Server Model
[Figure: clustered Web server. Labels: Web Clients; Dispatcher (Front End); Server Nodes 1, 2, 3, ..., N; Cache Manager; File Server; Object Store]

Server Parameters
Num server nodes
Mem cache size
Disk cache size
Cache replacement policy for each (LRU, LFU, SIZE, DUAL)
Comm. latency
Cache consistency
Dispatch policy (DNS, RR, Redirect, Load)
Request distribution policy (requests, bw, conns, affinity, ...)
Server bandwidth
Per-request bandwidth
BW scaling model

Performance Metrics
Load balancing
– requests
– bytes
– bandwidth
– connections
– clients
Cache performance
– document hit rate
– byte hit rate
Relative improvement versus RR, Rand, etc
Comm. overhead
Avg response time
Avg inflation factor
Others...

3. Web Proxy Caching Model
[Figure: single proxy. Labels: Web Servers; Proxy server; Aggregate Workload; Web Clients]

Hierarchical Proxy Caching Simulation Model
[Figure: two-level proxy hierarchy. Labels: Web Servers; Upper Level (Parent) Proxy server; Lower Level (Children) Proxy servers; Web Clients; Complete Overlap; Partial Overlap (50%); No Overlap]

Factors and Levels
Cache size
Cache Replacement Policy
– Recency-based LRU
– Frequency-based LFU-Aging
– Size-based GD-Size
Workload Characteristics
– One-timers, Zipf slope, tail index, correlation, temporal locality model

WebTraff Conceptual View
[Figure: WebTraff conceptual view. Labels: Input Parameters; ProWGen Software; Synthetic Workload; plots labeled LLCD, Zipf, and Correlation (axis from -1 through 0 to +1)]

Key Workload Characteristics
“One-timers” (60-70% useless!!!)
Zipf-like document referencing popularity
Heavy-tailed file size distribution (i.e., most files small, but most bytes are in big files)
Correlations (if any) between document size and document popularity (debate!)
Temporal locality (temporal correlation between recent past and near future references)
[Mahanti et al. 2000]
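
As a starting point for experimenting with the workload characteristics listed above, here is a minimal Python sketch of a synthetic request stream with Zipf-like popularity and heavy-tailed (Pareto) file sizes. It is not ProWGen or WebTraff. The function names, parameter names, and default values (`slope`, `tail_index`, `min_size`) are illustrative assumptions, sizes are drawn independently of popularity (zero size/popularity correlation), and no temporal locality is modeled.

```python
import random
from collections import Counter


def zipf_weights(num_docs, slope):
    """Unnormalized Zipf-like weights: the document of rank r gets 1 / r**slope."""
    return [1.0 / (rank ** slope) for rank in range(1, num_docs + 1)]


def generate_workload(num_docs, num_requests, slope=0.8, tail_index=1.2,
                      min_size=1024, seed=0):
    """Return (requests, sizes).

    requests: list of document ids (0 = most popular rank).
    sizes:    list with one size in bytes per document.
    All parameters and defaults are hypothetical, chosen for illustration only.
    """
    rng = random.Random(seed)
    # Heavy-tailed sizes: Pareto variates are >= 1, so every size is >= min_size.
    sizes = [int(min_size * rng.paretovariate(tail_index))
             for _ in range(num_docs)]
    # Zipf-like popularity: sample document ids in proportion to 1 / rank**slope.
    weights = zipf_weights(num_docs, slope)
    requests = rng.choices(range(num_docs), weights=weights, k=num_requests)
    return requests, sizes


def one_timer_fraction(requests):
    """Fraction of distinct referenced documents that are requested exactly once."""
    counts = Counter(requests)
    one_timers = sum(1 for c in counts.values() if c == 1)
    return one_timers / len(counts)
```

A call such as `generate_workload(1000, 5000)` followed by `one_timer_fraction(requests)` gives a quick check of how the Zipf slope changes the share of one-timers. A project could extend the sketch with a size/popularity correlation or a temporal locality model.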
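
The two cache performance metrics named under Performance Metrics, document hit rate and byte hit rate, can be illustrated with a small trace-driven LRU cache in Python. This is a sketch under stated assumptions, not the cws, dws, or Muda simulator: the function name `simulate_lru`, its arguments, and the rule that objects larger than the cache are never stored are all hypothetical choices made for this example.

```python
from collections import OrderedDict


def simulate_lru(requests, sizes, capacity_bytes):
    """Replay a request trace through an LRU cache.

    requests:       iterable of document ids, in arrival order.
    sizes:          mapping (dict or list) from document id to size in bytes.
    capacity_bytes: cache capacity in bytes (illustrative parameter).
    Returns (document_hit_rate, byte_hit_rate).
    """
    cache = OrderedDict()  # doc id -> size, least recently used first
    used = 0
    num_requests = hits = hit_bytes = total_bytes = 0
    for doc in requests:
        size = sizes[doc]
        num_requests += 1
        total_bytes += size
        if doc in cache:
            hits += 1
            hit_bytes += size
            cache.move_to_end(doc)  # mark as most recently used
            continue
        if size > capacity_bytes:
            continue  # assumption: objects larger than the cache are not stored
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)  # evict the LRU entry
            used -= evicted_size
        cache[doc] = size
        used += size
    doc_hit_rate = hits / num_requests if num_requests else 0.0
    byte_hit_rate = hit_bytes / total_bytes if total_bytes else 0.0
    return doc_hit_rate, byte_hit_rate
```

With the trace `["a", "b", "a", "c", "b", "a"]`, sizes `{"a": 10, "b": 20, "c": 30}`, and a 40-byte cache, only the second reference to `a` hits. The document hit rate is 1/6 while the byte hit rate is 10/100, which shows how the two metrics diverge when small documents account for the hits.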