Wi-Fi / WLAN Performance Management and Optimization
Veli-Pekka Ketonen
CTO, 7signal Solutions
Copyright © 2014 7signal Solutions, Inc.

Topics
1. The Wi-Fi Performance Challenge
2. Factors Impacting Performance
3. The Wi-Fi Performance Cycle
4. 10 step performance optimization flow
5. Selected example data
6. Summary / Questions

Wi-Fi Networks are Everywhere!
But they are transitioning from “nice to have” to “must have”
Challenges with Mission Critical Wi-Fi Networks:
• Connection issues with new devices & machines
• Bottlenecks from increasing data traffic
• Dropped or noisy voice calls
• Challenging physical environments
• Changes hourly, daily and weekly

Dependable Wi-Fi is Costly and Complex
Figure: cost curve.
– Vertical axis: $ cost needed to achieve reliability
– Horizontal axis: complexity of network (number of access points, clients, applications)
– Complexity drivers: BYOD, Video Apps, Virtual Desktop, Location Svcs, Mobile Computing, Guest Networks, Voice over Wi-Fi
– Curve: reactive focus based on complaints

2. Factors impacting the performance

Improper Antenna Selection / Placement
• Antenna gain pattern
• Antenna gain direction
• Behind metal grid?
• Near to conductive or “dense” surface?
• In common ceiling-mounted APs, a sideways, down-tilted pattern is most useful
Figure labels: max gain sideways, down-tilted pattern, attenuation upwards

RF power level is not that simple
• RF power isn’t always what your datasheet and settings tell you
• Impact of:
– AP/device model
– Rate/MCS
– HT 20/40/80
– Assumed MIMO gain
– Assumed diversity/STBC gain
– Antenna gain
– Channel #, regulation
– Passing the Type Approval
– Back annotation reliability
• Lower output power and use antenna gain to reach further with higher rates
Figure: power level stack, top to bottom
+20 dBm
    MIMO/TX div. gain, +3 dB
+17 dBm
    No high MCS/rates, +3 dB
+14 dBm
    HT40 -> HT20, +2 dB
+11 dBm
    Antenna gain, +3 dB
+8 dBm  Radio output (no antenna), HT40, highest MCS
Rate labels in figure: 180 Mbit/s, 300 Mbit/s, 300 Mbit/s

WLAN Transmit Power Control (TPC) can create issues
• A common implementation measures neighbor AP levels and keeps them below a fixed value
• Power levels may drift to the end of the allowed range
• Clients commonly transmit at +10 to +15 dBm; running APs at much lower levels unbalances the link budget. Both uplink and downlink coverage are needed!
Figure (floor plan of rooms): a high received neighbor AP level may drive AP power down and cause lack of coverage elsewhere on the floor.

Channel & Utilization Issues
• Channel overlap
• APs outside channel grid
• HT conflicts
• Amount of APs/SSIDs
• Empty AP vs. loaded AP

Allocate channels properly
• Use all spectrum you have
• The most important way to increase capacity: avoid interference and lower utilization!
• Some devices do not support all 5 GHz channels, but try really hard to use all available channels
• Channel automation parameters may help it converge towards a better channel plan
• If not, use a manual channel plan
Figure (AP channel assignments): 1, 1, 1, 6, 1, 1, 11, 1, 6, 1, 1, 6
Without a very good reason this should never happen

Sometimes channel automation is not working well and needs help
Figure panels: continuous channel switching; more stable operation

Too high rates cause high retries
• WLAN AP rate control often uses rates that are too high
• This causes a high number of retries, which have a negative impact on performance
Figure label: optimal rate
* Lakshmanan et al.: On link rate adaptation in 802.11n WLANs
* Haratcherev et al.: Automatic IEEE 802.11 Rate Control for Streaming Applications

What can rates and retries tell you?
Figure: quadrant chart, retries (HIGH to LOW) against data rates/MCS (HIGH to LOW)
• Retries HIGH, rates HIGH (typical in WLAN): unstable, high jitter, packet loss, limited capacity
• Retries LOW, rates HIGH (target): good coverage, reliable operation, high speed and capacity
• Retries HIGH, rates LOW: very slow, at the coverage boundary
• Retries LOW, rates LOW: speed limited, working ok

Non Wi-Fi Interference
• Bluetooth
• Microwave
• Video cameras
• Medical devices

Legacy mode drives speed down
• The largest impact is from 802.11b protection
• When an AP detects an associated 802.11b client, the AP turns on protection mode (in beacons and probe responses). The AP may also turn this on when it detects another AP using protection mode.
• When protection mode is on, all clients need to start using either RTS/CTS or CTS-to-Self protection to avoid collisions
• This introduces significant overhead that usually limits throughput and capacity considerably
• If -b support is off, it is still useful to try to remove 802.11b devices completely. Otherwise they keep probing with -b rates

TCP does not like lost packets or delay
• TCP uses a mechanism called slow start
• If a packet loss occurs, TCP assumes that it is due to network congestion and takes steps to rapidly reduce the offered load to the network
• With slow start, TCP starts increasing the rate again when consecutive acknowledgements are received properly
• Slow start may perform poorly with wireless networks that are losing packets

Retries at different layers using TCP
• User (user data): may lose patience in 4-10 s
• Application (Layer 5-7): varies. Desktop virtualization is sometimes used to help with layer 1-4 problems
• TCP (Layer 4): Not ACK’d within 2x RTT? -> Resend w/ slow start
• WLAN (Layer 1-2): Not ACK’d? -> Resend, 7-25 times
Figure legend: each box is a data packet, for illustration purposes only

Retries at different layers using UDP
• User: VoIP call, etc.
• Application (Layer 5-7)
• UDP (Layer 4): UDP does not retransmit; a lost packet is permanently lost
• WLAN (Layer 1-2): Not ACK’d? -> Resend, 7-25 times
• Result: jitter, packet loss
Figure legend: each box is a data packet, for illustration purposes only

Layer 2 packet fragmentation makes radio more robust
Figure: three frame sequences.
– Two 1500 B frames (#1, #2), each ACKed: if all goes well, good efficiency
– One 1500 B frame (#1) with no ACK (lost or any error), followed by retry 1 of the same 1500 B frame, then ACK
– Four 750 B fragments (#1 to #4), each ACKed
• If an error is detected, the content of the whole 1500 B packet is lost and needs to be retransmitted
• The probability of errors in a smaller packet is lower, and transmitting it has taken less time in the first place
• Fragmenting packets increases robustness, but increases overhead
• Aggregating (e.g. Block ACK) reduces robustness, but increases efficiency
• Fragmentation threshold default value is usually 2346 B (>1500 B, no fragmenting)

Higher QoS helps prioritize data
• Voice (VO), Video (VI), Best Effort (BE) and Background (BK) classes
* Source: IEEE 802.11-08/1214-02-00aa 802.11 QoS Tutorial

3. The Wi-Fi Performance Cycle

Answering the Wi-Fi Challenge
Problem -> Solution
• Wait for complaints -> Proactive measurements
• Limited view of network -> Check end-to-end performance
• Little historical data -> Analyze historical trends
• Guess at service levels -> Use metrics-based reporting
• Remote issues costly to resolve -> Centralize diagnosis of problems

Bending the Cost Curve
Figure: cost curve.
– Vertical axis: $ cost needed to achieve reliability
– Horizontal axis: complexity of network (number of access points, clients, applications)
– Complexity drivers: BYOD, Video Apps, Virtual Desktop, Location Svcs, Mobile Computing, Guest Networks, Voice over Wi-Fi
– Upper curve: reactive focus based on complaints
– Lower curve: proactive focus based on continuous measurements
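A minimal sketch of what "proactive focus based on continuous measurements" could look like in practice: compare new active-test samples against a historical baseline and flag the hours that fall well below it, before anyone complains. The function name, the sample values and the 20% tolerance are all hypothetical and are not taken from the presentation or from any 7signal product.

```python
from statistics import mean


def find_degraded_hours(samples, baseline_mbps, tolerance=0.2):
    """Return the hours whose throughput is more than `tolerance`
    (as a fraction) below the baseline. All thresholds are assumptions."""
    limit = baseline_mbps * (1 - tolerance)
    return [hour for hour, mbps in samples if mbps < limit]


# Hypothetical history: (hour of day, downlink throughput in Mbit/s)
history = [(hour, 15.0) for hour in range(24)]
baseline = mean(mbps for _, mbps in history)

# Hypothetical new measurements from active tests
today = [(9, 14.5), (10, 11.0), (11, 15.2)]

print(find_degraded_hours(today, baseline))  # [10]
```

The same comparison could be applied to packet loss, latency or jitter; the point is only that a stored baseline turns a complaint-driven process into a measurable one.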
Performance Management with a Systematic Approach
Figure: a sensor and a management station alongside the access point(s).
– Simulate client traffic (active tests)
– Listen to AP / client traffic (passive tests)

The Eye’s Capabilities
Synthetic Tests
• End-to-end view at the application layer
• Data and voice quality measurements (throughput, packet loss, latency, jitter)
Traffic Analysis
• Radio frame header analysis for traffic flow between clients and APs
• KPIs for each client, SSID, AP, band and antenna beam
RF Analysis
• AP settings, capabilities, signal levels, channels and noise levels
• KPIs for each AP, channel and antenna beam
Spectrum Analysis
• High resolution (280 kHz) for ISM band
• Interference source analysis with compass directional data on beams
Full Packet Capture
• Capture remotely
• Easy export to Wireshark or other tool

The Wi-Fi Performance Cycle
“If you can’t measure it, you can’t manage it!” - Peter Drucker
Figure: cycle of Measure, Analyze, Optimize, Verify, Assure

4. Optimization flow, 10 step process

The most important KPIs
End user metrics (active tests)
• Connection success
• Throughput
• Packet loss
• Latency
• Jitter
• Voice quality (MOS)
Layer 2 / Layer 1 metrics (passive tests)
• Data rates
• Channels
• Retry rates
• Signal level
• Utilization
• Spectrum data
• Traffic volume
Figure labels: Optimize, Assess

Optimization flow at a glance
1. Preparations and baseline
• Ensure that APs and antennas are positioned correctly
• Collect baseline data for a few days, check WLAN SW release, upgrade
2. Channel plan
• Maximize available spectrum, organize channels for max capacity potential
• Use manual channel plan in dense areas
3. Minimize utilization
• Minimize utilization due to unnecessary 802.11 traffic
• # of SSIDs, standards, beaconing, probing, data rates, protection, etc.
4. Adjust power levels
• Adjust AP power levels & TPC settings for improved SNR at both ends
5. Reduce non-WLAN interference
• Remove non-WLAN interference as much as possible
• There is always interference; understand whether it has significant impact
6. Improve radio robustness
• Make radio more robust towards remaining interference/noise
• Increased power, dropping max MCS, fragmentation, directional antennas
7. Prioritize and balance traffic
• QoS categories, AP power levels, load balancing, SSID strategy, roaming
8. LAN/WAN capabilities
• Ensure sufficient LAN/WAN capacity and performance are present
9. Improve client operation
• Drivers, location, models, settings
10. Physical network changes
• If performance is not sufficient, consider HW changes
• Directional antennas, add/move APs, replace equipment, end user devices

#1. Understand the baseline
• Collect and review all radio parameter settings
• Verify AP type, antenna performance and placement
• Collect baseline performance data for 3-5 days
– Understand peaks and valleys in performance
– Nighttime data is extremely useful: if an empty network can’t provide good throughput, it won’t do that under load either!
• Analyze and find likely bottlenecks
• Draft a plan for optimization steps
– Make small changes and verify each step

#2. Plan the channels carefully
• Understand # of APs per channel in the whole area
• Use the maximum amount of radio spectrum & channels
• Align all APs to a common channel grid (1, 6, 11, etc.)
• Fix HT bonding side, HT40+ or HT40-
• Do not overlap bonded with main channel
• If automation does not provide a balanced plan, assign channels manually
• Rotate channels evenly within a floor
• Rotate with offset between floors
• Remove out-of-grid devices if possible

#3. Minimize utilization
• Reduce number of SSIDs/AP to max. 3-4
– Note: Every SSID sends its own beacon, day and night
– It’s common that networks run high utilization w/o clients!
• Remove 802.11b rates (1, 2, 5.5, 11) and their support
• Remove low MCS and SS multiples
• Increase beacon interval from 100 ms to 300 ms
– Note: Some devices do not allow this, e.g. Vocera badges, older VoIP phones and in general older equipment
• Increase CCA threshold
• Remove printers and other devices that keep the air busy

#4. Adjust power levels
• Define a limited range for TPC algorithms instead of the default
• Observe power level changes also from metrics. Do they correlate with settings?
• Assign 3-5 dB higher power range for 5 GHz vs. 2.4 GHz
• Use manual power levels if TPC does not yield good results
• If possible, do not exceed the power level that still supports all data rates/MCSs. Consider compensating with higher gain antennas if needed

#5. Reduce non-Wi-Fi interference
• Interference is present, always! Understand the level of impact
– How are end user metrics impacted?
– Correlate spectrum data with metrics
• Analyze spectrum: where does the noise come from?
• Bluetooth is the most common non-WLAN source
– Keyboard, mouse, headset, handheld readers
– Many other potential sources, especially in the 2.4 GHz band
• Remove sources when possible
• Observe impact on throughput and other end user metrics when changes are made
• If changes are helping, it’s visible in active data

#6. Improve WLAN robustness
• Remove highest rates/MCS (most sensitive)
• Run voice SSIDs in -g/-a mode only, without -n
• Use radio packet fragmentation
• Enable interference resistant mode if supported

#7. Prioritize and balance traffic
• Separate SSIDs (but keep quantity to a minimum)
• Assign QoS classes with WMM (Wi-Fi Multimedia)
• Adjust relative AP power levels to move clients
• Consider use of load balancing, band steering/select and admission control features
• Different features are offered depending on vendor

#8. Ensure sufficient LAN/WAN capacity
• Observe utilization at the switch/router interfaces
• Observe packet loss metrics
• Internet connection speed may be a bottleneck at remote sites
• Always routing data packets to the controller may impact performance
• Understand what is sufficient throughput for the end user and dimension connections accordingly

#9. Improve client operation
• Review all client devices and understand where their antennas are
• Ensure that antennas are not hidden within metal enclosures and have space to operate properly
• Upgrade WLAN drivers
• Turn roaming aggressiveness to medium or low
• Adjust client power level
• CTS-to-Self may be more efficient than RTS/CTS

#10. Physical changes to network
• Move APs
• Add APs
• Upgrade APs
• Use good quality and the right type of external antennas
Every network can be made to perform well!

5. Examples

Akron Children’s Medical Center

Uplink throughput
• Average improved from ~11 to ~14 Mbit/s (27%)
• The worst APs improved from ~4 to ~13 Mbit/s (225%)
Chart markers: antenna change ready, channel change, power level change, codec changes, core LAN upgrade

Downlink Throughput
• Average improved from 13 to 17 Mbit/s (30%)
• The worst APs improved from 7 to 15 Mbit/s (110%)
Chart markers: antenna change ready, channel change, power level change, codec changes, core LAN upgrade
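The percentages on the two throughput slides above are relative improvements over the starting value. A small sketch of that arithmetic follows; the helper name is hypothetical, and the inputs are the approximate before/after figures quoted on the slides, so the results only roughly match the rounded slide percentages.

```python
def improvement_pct(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# (label, before Mbit/s, after Mbit/s) as quoted on the slides
cases = [
    ("Uplink average", 11, 14),
    ("Uplink worst APs", 4, 13),
    ("Downlink average", 13, 17),
    ("Downlink worst APs", 7, 15),
]

for label, before, after in cases:
    print(f"{label}: {round(improvement_pct(before, after))}%")
# Uplink average: 27%
# Uplink worst APs: 225%
# Downlink average: 31%
# Downlink worst APs: 114%
```

The downlink results (31% and 114%) differ slightly from the slide values (30% and 110%), presumably because the slide figures are rounded.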
Packet loss
• From ~2.5% to ~0.5%
Chart markers: antenna change ready, channel change, power level change, codec changes, core LAN upgrade

University, Iowa

Downlink throughput (daily)
• Downlink throughput daily averages have improved 50%
Chart markers:
1st) Disabling power saving
2nd) Disabling b-data rates, area 1
3rd) Disabling b-data rates in other locations
4th) New channel plan, areas 1 & 2
5th) New TxPwr settings in XXX and channel plan in YYY
6th) Beacon interval change
7th) Channel re-plan, area 3, 2.4 GHz

Downlink throughput (hour)
• Minimum values increase up to ~10x
Chart markers: same 1st to 7th changes as on the daily chart

Avans University of Applied Sciences

TCP downlink throughput
• 900% improvement in 1st floor
• 100% improvement in ground floor
Chart markers 1 to 5; labeled changes: HT40, more channels, AP power levels, beacon 300 ms

HTTP downlink throughput
• 90%/50% improvements
Chart markers 1 to 5

Voice Quality (MOS), downlink, hourly
• +0.25 MOS in ground floor
• +0.25 MOS in 1st floor
Chart markers 1 to 5

Network latency (RTT)
• 50% improvement in 1st floor
Chart markers 1 to 5

Performance Dashboard
Figure panels: before analysis and optimization; after analysis and optimization

6. Summary
Summary
• Wi-Fi is very sensitive to the surroundings and network parameters, even though it somehow works almost no matter where you put it
• Performance can often be improved significantly by adjusting the network parameters
• Need relevant continuous data to validate changes
• Need knowledge of WLAN/RF to decide the actions
• Optimization requires a pragmatic approach

Thank You!
Email: veli-pekka.ketonen@7signal.com
Presentation: http://go.7signal.com/surfwlpc
www.7signal.com
@7signal