Global Application Performance Testing at NYU

TESTING FRAMEWORK

Testing Types
1. Component, etc. (do the software parts work alone and together as specified)
2. User/Functional (is it usable, does it perform the business operations correctly)
3. Performance/Latency (how does it work at a distance)
4. Load (how does it work under load)
5. User Experience (how does it work for individual users or groups of users, especially at different locations, with different browsers, etc.)

Service Lifecycle
• Development
• Pre-Rollout
• Operations

Testing Infrastructure
• Testing tools, standards, and techniques

We wanted to know what was:
• Acceptable
• Tolerable
• Frustrating

Common Performance Issues
• Network
– Network bandwidth
– Network latency
– Jitter
• Shared infrastructure
– Response times are highly variable
– Performance depends on the load on other applications and network traffic
• Web page/application issues
– Web and other applications not tuned for latency; lack of TCP/IP optimization
– Poor or no server/client caching

[Chart: Latency at some NYU sites (ms), scale 0–350 ms: Sydney, Australia; Singapore, Singapore; Shanghai, China; Prague, Czech Republic; London, England; Florence, Italy; Berlin, Germany; Abu Dhabi, UAE]

[Chart: Maximum transfer rates (WAN bandwidth), scale 0–100 Mbps, for the same eight sites]

Two Forms of Testing/Monitoring
• Latency simulator and actual testing from different locations
• End-user experience performance

Methodology
• Service manager defines a typical usage scenario with target response times
• Understand response time targets
– What is “acceptable”, “tolerable”, and “frustrating” for users
– When targets are not known, assume 120% of the New York response time to be “acceptable” (a minimal timing sketch follows the tools list below)
• Measure response times of the scenario in New York
• Repeat the same test from each global location and compare
– Remotely log in (from NY) to a test machine at the global location
– Engage test resources at the global location when needed
• Where applicable, test combinations of OSes and browsers
• Analyze the report and provide recommendations
• Repeat the test, if necessary

Testing Tools Used
• RTI – RootCause Transaction Instrumentation for Internet Explorer (IE)
– Provides a non-intrusive way to record transactions initiated with IE, then analyzes end-user response times to quickly identify the root cause of slowness
• Firebug – web development tool (plugin) for Firefox
– For editing, debugging, and monitoring CSS, HTML, and JavaScript
– Also for monitoring network activity and analyzing response times
• Web Inspector – web development tool for Safari
• Wireshark – network protocol analyzer
– Allows capturing network packets and analyzing their timings
• ySlow – Firefox add-on integrated with Firebug
– Web page analyzer
– Offers suggestions for improving a page’s performance based on predefined or user-defined rule sets
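To make the methodology above concrete, the sketch below times one HTTPS transaction phase by phase (DNS lookup, TCP connect, SSL handshake, time to first byte) and applies the 120%-of-New-York rule. It is a minimal illustration using only Python's standard library, not the tooling listed above; the URL, the hypothetical NY baseline figure, and the threshold logic are assumptions for the example.

```python
import socket
import ssl
import time
from urllib.parse import urlsplit

def time_request(url, timeout=10):
    """Time the phases of one HTTPS GET: DNS lookup, TCP connect,
    SSL handshake, and time to first response byte (all in seconds)."""
    parts = urlsplit(url)
    host, port = parts.hostname, parts.port or 443
    path = parts.path or "/"

    t0 = time.perf_counter()
    info = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)[0]
    ip = info[4][0]
    t_dns = time.perf_counter()

    sock = socket.create_connection((ip, port), timeout=timeout)
    t_tcp = time.perf_counter()

    tls = ssl.create_default_context().wrap_socket(sock, server_hostname=host)
    t_ssl = time.perf_counter()

    tls.sendall(f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n".encode())
    tls.recv(1)  # block until the first byte of the response arrives
    t_first = time.perf_counter()
    tls.close()

    return {
        "dns": t_dns - t0,
        "tcp_connect": t_tcp - t_dns,
        "ssl_handshake": t_ssl - t_tcp,
        "first_byte": t_first - t_ssl,
        "total": t_first - t0,
    }

if __name__ == "__main__":
    # The baseline is measured from New York; the script is then rerun from a
    # test machine at the global site. The URL and baseline are placeholders.
    ny_baseline_total = 1.40                        # seconds, hypothetical NY run
    remote = time_request("https://www.nyu.edu/")   # rerun from the global site
    for phase, seconds in remote.items():
        print(f"{phase:13s} {seconds * 1000:8.1f} ms")
    threshold = 1.2 * ny_baseline_total             # 120% of NY = "acceptable"
    print("acceptable" if remote["total"] <= threshold else "investigate")
```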
Long Distance Performance Simulator
• Anue Systems
• Profiles created for global sites on the NYU network

[Chart: Application performance at a glance (last year). Services are plotted by demand on data transfer (high/medium/low) against user experience (acceptable/tolerable/frustrating); several services are flagged “retest after performance enhancements”.
• High demand: Videoconferencing, JPEG 2000 (Afghan Digital Library), Xythos (Webspace), Xythos (Files 2.0), Kaltura, Library Streaming Services (HIDVL), E-mail
• Medium demand: Emergency Notification (MIR3), Alex, StarRez, People Admin, Blackboard, Echo360, JPEG 2000 (Afghan Digital Library), Studio Abroad, Library Streaming Services (HIDVL), Faculty Digital Archive, AP Workflow, Hyperion, Sakai / iTunes, Remedy, NYU eVita
• Low demand: BobCat, Audio Streaming Services, Knowledgebase, Student Health, I5-FTP, NYU Wiki Services, NYU Blog, InfoEd, ProjTrak, FAME, BRIO, NYU Lists, PASS, HRIS, HR Reporting, Albert, One Card, SIS, NYU Home
• Also plotted, near the frustrating end of the axis: SpecID Card, Advance Fundraising]

WebApp Performance Tool
What is it?
• A hardware device (moving to software)
• Connected to the F5 span port
• Captures a copy of the traffic from the F5 switch, according to the watch points
• Provides dashboards, reports, and queries/analysis
[Diagram: user traffic flows from the Internet/network through the F5 switch to the web server; the TrueSight device taps a copy of the traffic via the span port]

What information does it capture?
• Performance metrics of HTTP (and HTTPS) web traffic
• Specific information about the request
• “Watch points” can be put on specific transactions or pages
• Per-request timings: host time, network time, start/end time, SSL handshake time

[Screenshot: dashboard for one service]

What can we do with this information?
• Analyze performance issues
• Troubleshoot client and network issues
• View dashlets showing real-time traffic performance
• View highly customized reports
• Automatically email alerts based on defined thresholds (a toy threshold check is sketched at the end of this section)
This has been a game changer for us in identifying specific issues.

Remediation
• Optimizing web pages and applications
• Tuning the network to ensure we are getting full bandwidth
• WAN acceleration, so we don’t have to do TCP tuning for all clients and servers (the throughput arithmetic below shows why latency makes this matter)
• QoS to ensure appropriate bandwidth use for some applications

Learning and New Practices
• Learnings
– As we always knew, performance testing is hard and complicated, involving all parts of IT
– Most app builders assume the user and the system are on a LAN, or at least on a short-distance WAN
– We now understand our apps much better than we did
• New Practices
– We test everything this way before going live
– We set watch points on our end-user app tool to watch how performance is doing
– We work with cloud vendors on how they test our instances before we select, before we go live, etc.
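The emphasis on TCP tuning and WAN acceleration in the remediation list follows from the bandwidth-delay product: a single untuned TCP connection cannot move data faster than window size divided by round-trip time. The sketch below works that arithmetic with the classic 64 KB default window; the RTT figures are illustrative values in the range shown on the latency chart, not measured NYU numbers.

```python
# Throughput ceiling of one TCP connection: window / RTT.
WINDOW_BYTES = 64 * 1024  # classic default receive window, no window scaling

# Hypothetical RTTs in the range of the latency chart above.
for site, rtt_ms in [("Abu Dhabi", 200), ("Shanghai", 300), ("Sydney", 320)]:
    mbps = WINDOW_BYTES * 8 / (rtt_ms / 1000) / 1_000_000
    print(f"{site:10s} RTT {rtt_ms:3d} ms -> at most ~{mbps:.1f} Mbps per connection")
```

At a 300 ms RTT this caps a connection near 1.7 Mbps no matter how large the WAN link is, which is why either window tuning on every client and server or a WAN accelerator in the path is needed to use the available bandwidth.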
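Finally, the practice of setting watch points with automatic email alerts can be pictured as a threshold check like the one below. TrueSight does this internally; this standalone sketch is only conceptual, and every name in it (the paths, thresholds, and mail addresses) is hypothetical.

```python
import smtplib
from email.message import EmailMessage

# Hypothetical watch points: URL path -> acceptable total response time (seconds).
WATCH_POINTS = {"/login": 2.0, "/search": 3.0}

def check_watch_point(path, measured_seconds, mail_to="noc@example.edu"):
    """Email an alert when a measured response time exceeds its watch-point threshold."""
    threshold = WATCH_POINTS.get(path)
    if threshold is None or measured_seconds <= threshold:
        return  # no watch point on this page, or performance is within target
    msg = EmailMessage()
    msg["Subject"] = f"Watch point exceeded: {path}"
    msg["From"] = "perf-monitor@example.edu"
    msg["To"] = mail_to
    msg.set_content(f"{path} took {measured_seconds:.2f}s against a {threshold:.2f}s threshold.")
    with smtplib.SMTP("localhost") as smtp:  # assumes a local mail relay
        smtp.send_message(msg)

# Example: a measurement fed in from the monitoring tap trips the /login alert.
check_watch_point("/login", 4.7)
```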