Managing IT Data Center Infrastructure
Gaining Visibility to Reduce Risk and Cost
Sean Nicholson, VP Data Center Solutions Technology, Emerson Network Power
Session: NOA-5784

Outline
• Emerson DCM/DCIM
• IBM Predictive Insights
• Where the Data Center fits - ITSM, DCIM, BMS
• Dynamic Monitoring, warnings of failure, reactive systems - beginnings of automation
• Lab Results
• Cycle of improvement of operations
• Next Steps

Emerson At-A-Glance
• 2013 US $24.4 Billion in Sales
• Headquarters in St. Louis, Missouri USA
• NYSE: EMR
• Diversified global manufacturer and technology provider
• Approximately 135,000 employees worldwide
• Manufacturing and/or sales presence in more than 150 countries
• 235 manufacturing locations around the world
• No. 120 on 2012 FORTUNE 500 list of America’s largest corporations
• Founded in 1890

Emerson Network Power
Enabling the Future of Technology

Data Centers are Complex Systems
Monitoring and Control: Utility, Facility, Alternative Power, UPS, Battery, Cooling, Network, IT, Security…
[Diagram labels: H2O; IT and Networks; Ice; Pumps; Utility; Chiller; Precision Cooling; Utility, Rates, Incentives; Substation; Communicating Revenue Meter; Cooling Tower; $; Generator; CHP; Fuel Cell, MicroTurbine or Turbine; Parallel or Transfer Eqpt; Medium Voltage >600VAC Eqpt; Power; Central UPS; Power Distribution Units; Raised Floor; Low Voltage 600VAC Eqpt; DC Power; Service Level Agreements]
• Compute
  – Main Frames
  – Volume Servers
  – Blade Servers
• Storage
  – SATA Disk
  – Tape
  – Blended
• Network
  – Corporate Networks
  – VoIP
  – Integrated Blade/Switch
  – Network closets
• In-Row Power
  – Modular UPS
  – Rack Mount PDUs
• In-Row Cooling
  – Rear Door Heat Exchanger
  – Liquid Cooling Racks
  – Overhead Cooling

Data center infrastructure includes:
• IT equipment and networks
• Power systems, cooling components, generators, the associated switching equipment, patch panels, cables
• The actual space associated with all these assets, including the data center floor-plan

Emerson Data Center Management - Trellis™
• Data Center Monitoring
  – Events & Alarms
• Capacity Planning
  – Device placement
  – Future planning
• Asset Management
  – Location, assignment
• Energy Management
  – Power consumption
  – Thermal management
• Console Management
  – Systems Administration

IBM Systems Middleware
Solution Area - IT Service Management (ITSM)
[Diagram labels: Capability; Solution Area; Build; Run; Manage; Optimize; Innovate; Application Platform; IT Service Management; Integration; Digital Experience; Smarter Process; Stakeholder; IT Operations; Development; Line of Business; Portfolio Offering Areas; Dashboards; Operations; Performance; Automation; Analytics]
• Provide end-to-end insight for smarter business decisions
• Ensure right availability of your critical business solutions
• More agility, with lower cost and risk
• Proactive outage avoidance and faster problem resolution
on-premise <----- Hybrid -----> cloud

Where Data Center Management Fits

Predict
Challenge: Reacting to performance thresholds is not enough. To ensure your mission-critical applications are always available 24x7, you must prevent outages by proactively avoiding problems before they become service impacting.
Watson DNA inside: Uses fully automated behaviour learning algorithms to establish “normal”, then applies real-time assessment of current conditions to detect anomalies as they are emerging and before they become service impacting.

Integrated IT Management
• IT Service Management
• Data Center Management
Infrastructure Portfolio
• Private Cloud
• Public Cloud
• Specialized Technology Centers
• CoLocation
• Primary DCs
• Secondary DCs
• Regional DCs
• Distributed Sites
Dynamic Decisions about Best Cost/Service for IT Workloads - “Hybrid Cloud”

IBM Operations Analytics - Predictive Insights
Predict
• Reduce operational costs by minimizing resources required to manage complex thresholds.
• Gain the benefits of early problem detection to avoid application, middleware or infrastructure problems before they impact service.
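Illustration (not from the original slides): the "establish normal, then assess current conditions" idea can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the algorithm used by IBM Operations Analytics - Predictive Insights: "normal" is taken to be the mean and standard deviation of past samples, and the function names, the `k` multiplier, and the inlet-temperature readings are all hypothetical.

```python
from statistics import mean, stdev

def learn_baseline(history):
    """Learn 'normal' as the mean and sample standard deviation of past readings."""
    return mean(history), stdev(history)

def is_anomalous(value, baseline, k=3.0):
    """Flag a reading that lies more than k standard deviations from the learned mean."""
    mu, sigma = baseline
    return abs(value - mu) > k * sigma

# Hypothetical rack inlet temperatures (deg C) collected during a healthy period.
history = [22.1, 22.4, 21.9, 22.0, 22.3, 22.2, 21.8, 22.1]
baseline = learn_baseline(history)

# Assess new readings against the learned baseline instead of a hand-set limit.
for reading in (22.2, 24.0):
    status = "anomaly" if is_anomalous(reading, baseline) else "normal"
    print(f"{reading:.1f} C -> {status}")
```

The limit here comes from the data rather than from an operator, which is the sense in which threshold maintenance is "automated"; a production system would relearn the baseline continuously and account for time-of-day patterns.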
Product Value Statements
• Automated threshold maintenance: behavioural learning solution for quick time-to-value
• Understands how your IT & network infrastructure is inter-related from a holistic viewpoint
• Reduce event volumes utilizing real-time, streaming analytics to provide early warning alerts for abnormal issues
• Leverages existing performance & monitoring management solutions
  – Consolidates and unifies performance data
  – Works with IBM & non-IBM management solutions

Early Warning: Learn relationships between metrics, and alarm without static thresholds
[Chart: Response Time and User Requests plotted against Time; regions marked "Good response time" and "Bad response time"; markers for "Early Warning", "Anomaly Event" and "Business Impacted"]
• Normal behaviour learned: Response Time is longer as the number of User Requests increases
• If this healthy relationship breaks down, say due to a memory leak, an anomaly is raised immediately
• The problem is detected even while response time is “good”

DCIM Delivers Capacity Management
Intelligent sensors provide insight to optimize efficiency and capacity
Capacity Stack
  – Electrical power consumption
  – Airflow
  – Gross cooling capacity (ex CW)
  – Delivered Net Sensible capacity
  – Remaining Capacity
  – Predictive Diagnostics

Thermal Visualization
• 3D model of the data center showing the thermal conditions
• Produce 3D rotational view of data center floor and inventory
• Thermal visualization view of front/rear of IT racks

Why aren’t operations teams proactive today?
• Too much data to analyze manually
• Existing analytic techniques, such as standard thresholds, are not up to the task
  – They cannot detect problems while they are emerging (before business impact)
  – Set the performance threshold too high: insufficient warning before total failure
  – Set the performance threshold too low: too much noise, so everything is ignored
• If there is no ‘early detection’ before the outage, operations teams can only react while the outage is already in effect and already losing money...
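Illustration (not from the original slides): the Early Warning slide describes learning the relationship between Response Time and User Requests and alarming when that relationship breaks, rather than waiting for a static threshold. The sketch below shows one simple way to do that, assuming a straight-line relationship fitted by least squares. The function names, the sample data, the `k` multiplier and the 2.0 s static threshold are hypothetical, and the product's actual algorithms are not described in the source.

```python
from statistics import mean, stdev

def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = intercept + slope * xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def learn_relationship(requests, response_times):
    """Learn the healthy request/response-time relationship and its typical scatter."""
    intercept, slope = fit_line(requests, response_times)
    residuals = [rt - (intercept + slope * rq)
                 for rq, rt in zip(requests, response_times)]
    return intercept, slope, stdev(residuals)

def relationship_broken(model, rq, rt, k=4.0):
    """True when the observed response time is far from what the load predicts."""
    intercept, slope, sigma = model
    return abs(rt - (intercept + slope * rq)) > k * sigma

# Hypothetical healthy history: response time (s) grows with user requests.
requests = [100, 200, 300, 400, 500, 600]
response_times = [0.52, 0.61, 0.69, 0.81, 0.90, 1.01]
model = learn_relationship(requests, response_times)

STATIC_THRESHOLD_S = 2.0  # hypothetical static alarm limit

# A sample taken during, say, a memory leak: light load, but 1.2 s response time.
rq, rt = 200, 1.2
print("static threshold fires:", rt > STATIC_THRESHOLD_S)
print("relationship anomaly:  ", relationship_broken(model, rq, rt))
```

With these numbers the static threshold stays silent because 1.2 s still counts as "good", while the learned relationship flags the sample because 200 requests should take roughly 0.6 s. That is the gap between static limits set too high and early detection.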
Datacenter Maturity Model
Source: Next Generation Datacenter Management, 451 Research (A. Lawrence & R. Ascierto), Jan 2014

IT Analytics Makes Sense of Big ∧ Data

Next Steps
IBM Operations Analytics Feedback System:
• Set dynamic thresholds of behavior
• Determine acceptable operational limits
• Establish mitigating actions
Define Data Point Groupings to Measure:
• For Specific Operational Functions
• For Subsystem Operation
• For Business Service Impact and Recovery

IBM Products and Emerson
• IBM Change Management
  – Generate a RFC, follow change approval process
• IBM Asset Management
  – Provision assets
  – View asset maintenance history
  – Update data in asset repository and Trellis
  – Issue work orders
• IBM Business Service Management
  – Identify potential business services impacted by MACs or data center events
• IBM Service Catalog
  – Service request kicks off a space planning initiative
• Trellis Platform
  – Dynamic infrastructure optimization
  – IT infrastructure: Leverage firmware “hooks” in servers, storage, networking and telecom equipment
  – Physical infrastructure: Leverage Liebert product and monitoring technology for power, cooling, and environmental data
• IBM Monitoring
• IBM Event Management
  – Event correlation, management, and automation
• IBM Analytics
  – Intelligent data processing/analytics
  – Determine optimal placement of workload
  – Identify consolidation opportunities

Thank You
Your Feedback is Important!
Access the InterConnect 2016 Conference Attendee Portal to complete your session surveys from your smartphone, laptop or conference kiosk.