Self-Managed Networks: Dream or Reality?
Jawad Khaki
Corporate Vice President
Windows Networking & Device Technologies

Current Situation
- Management is expensive
- Devices only understand low-level settings
- Diagnostics/monitoring is primitive
- Need a comprehensive network solution
[Figure labels: ISP, hotspot; Enterprise; Home; IT Budgets]

Pain Points
- Complexity due to inconsistency
  - Heterogeneous world
  - Different configuration models
  - Variety of monitoring techniques
  - Version/vendor specific repair procedures
  - Hard to understand dependencies
  - Networking problems are a significant cause of overall service failure (Oppenheimer, USITS’03)
  - Network causes 15% of all problems resulting in downtime (Forrester survey of IT pros)
- Not humanly solvable
  - Operator error is largest cause of service failures in some environments (Oppenheimer, USITS’03)
  - 40% of downtime is due to human operators (Candea, ’03)
  - In many environments, operator may not be tech savvy (e.g., home) or even immediately available (e.g., space, sensor nets)
  - Consumer networking support calls are time consuming, e.g., power cycle router/modem = avg 53 min (MS PSS)

End-to-End Approach Essential
- Apps/users understand behavior desired
- Network admins understand high-level design goals/constraints
- The dream is to integrate end-user knowledge and administrative goals

Big Dreams
- Self-managing networks
  - Self-deploying and self-cleaning
  - Self-configuring and self-adapting
  - Self-optimizing
  - Self-protecting
  - Self-monitoring
  - Self-diagnosing
  - Self-healing
- Prevention more than cure
- A self-* system requires knowledge of itself and its environment, it is self-aware

Some Real Examples Today
- Policy distribution systems allow autodeployment of configuration across a network
- Routing protocols auto-adapt to topology changes and failures
- TCP auto-adapts to congestion

Demos

Product Engineering Challenge
- Design for experience
  - End user: Focus on the task not technology
  - Network manager: Design, deploy, operate
- Must get the fundamentals right
- Essential to think through scenarios
  - Work flow
  - Intelligence
  - Environment
- Always keeping the customer in mind

Hard issues
- Multiple administrative organizations
  - Different relationships
    - Peers
    - Customer-provider
    - Arbitrary
  - Lack of trust motivates privacy constraints
  - Unaligned goals means configuration is a challenge
- Possibility of catastrophic failure
  - Defect in automation can have disastrous results
  - “Rogue equipment can create a monster headache. It can easily waste a million dollars of resources.” -IT admin, large LA corporation
  - Broadcast storms due to protocol or software bugs (Spurgeon, 1989)
  - One router vendor tried offering automated config repair features, but found that customers were afraid to deploy it
  - Possibility of exploitation by malware
- Tension between control and automation
  - Flexibility of business models and preferred treatments
  - Compliance requirements
  - Job security for operators
  - Natural aversion to loss of control
  - Change to unfamiliar technology
- Need to find the right balance
  - Policy to express high-level constraints
  - Self-management within those constraints
[Figure: BALANCE between Control (static routes, static addresses, etc.) and Automation (dynamic routing, dynamic addresses, etc.)]

Summary
- Innovation in fundamentals just as important as new scenarios
- Make secure, effortless, reliable, efficient operation the forethought
- Let humans succeed at what they’re good at
- Let’s solve the hard issues

Dealing with heterogeneity of device types and vendors
- Hard to visualize existing state and dependencies
- Expensive to maintain multiple configuration/monitoring systems
- Need for common solutions
[Figure labels: Simplicity; Heterogeneity]

Dealing with poorly written applications
- “Some applications need to know what machine a person is on...we found that giving the docking stations a static IP address and the laptop a static IP address makes it easier for us.” (IT Admin, Medium Org, New York)
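
The "right balance" idea from the Hard issues slides (policy expresses high-level constraints, self-management operates within them) can be illustrated with the static-address case in the quote above. What follows is a minimal Python sketch and not anything shown in the talk: the POLICY layout, the subnet, the host names, and the assign_addresses helper are all hypothetical, chosen only to show an operator pinning a few machines while the rest are addressed automatically.

```python
import ipaddress

# Hypothetical policy: the operator pins a few hosts to fixed addresses
# (control); every other host is addressed automatically (automation).
POLICY = {
    "subnet": "192.168.10.0/28",
    "static": {"dock-01": "192.168.10.2", "laptop-01": "192.168.10.3"},
}


def assign_addresses(hosts, policy):
    """Return {host: address}, honoring pinned hosts and auto-filling the rest."""
    subnet = ipaddress.ip_network(policy["subnet"])
    pinned = {
        name: ipaddress.ip_address(addr)
        for name, addr in policy["static"].items()
    }

    # Reject a policy that pins a host outside the managed subnet.
    for name, addr in pinned.items():
        if addr not in subnet:
            raise ValueError(f"{name}: {addr} is outside {subnet}")

    # Automation may only draw from addresses the policy has not reserved.
    reserved = set(pinned.values())
    free = (addr for addr in subnet.hosts() if addr not in reserved)

    assignments = {}
    for host in hosts:
        if host in pinned:
            assignments[host] = pinned[host]
        else:
            try:
                assignments[host] = next(free)
            except StopIteration:
                raise RuntimeError("address pool exhausted") from None
    return assignments


if __name__ == "__main__":
    result = assign_addresses(["dock-01", "printer", "laptop-01", "phone"], POLICY)
    for host, addr in result.items():
        print(host, addr)
```

The point of the sketch is the split of responsibility: the operator states only the constraints that matter (which machines must keep a known address), and the automated part is free to make every remaining decision as long as it never violates them.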