Managing IT Data Center Infrastructure

Gaining Visibility to Reduce Risk and Cost
Sean Nicholson
VP Data Center Solutions Technology, Emerson Network Power
Session: NOA-5784
Outline
• Emerson DCM/DCIM
• IBM Predictive Insights
• Where the Data Center fits - ITSM, DCIM, BMS
• Dynamic Monitoring, warnings of failure, reactive systems – beginnings of automation
• Lab Results
• Cycle of improvement of operations
• Next Steps
Emerson At-A-Glance 2013
US $24.4 Billion in Sales
Headquarters in St. Louis, Missouri USA
NYSE: EMR
Diversified global manufacturer and technology provider
Approximately 135,000 employees worldwide
• Manufacturing and/or sales presence in more than 150 countries
• 235 manufacturing locations around the world
• No. 120 on 2012 FORTUNE 500 list of America’s largest corporations
• Founded in 1890
Emerson Network Power
Enabling the Future of Technology
Data Centers are Complex Systems
[Diagram: Monitoring and Control spans Utility, Facility Alternative Power, UPS, Battery, Cooling, Network, IT, Security… Cooling elements shown: Utility H2O, Ice, Pumps, Chiller, Cooling Tower, Precision Cooling. Power elements shown: Utility (Rates, Incentives, $), Substation, Communicating Revenue Meter, Generator, CHP Fuel Cell, MicroTurbine or Turbine, Parallel or Transfer Eqpt, Medium Voltage >600VAC Eqpt, Central UPS, Power Distribution Units, Low Voltage 600VAC Eqpt, DC Power. IT and Networks sit on the Raised Floor.]
Service Level Agreements
Compute
• Main Frames
• Volume Servers
• Blade Servers
Storage
• SATA Disk
• Tape
• Blended
Network
• Corporate Networks
• VoIP
• Integrated Blade/Switch
• Network closets
In-Row Power
• Modular UPS
• Rack Mount PDUs
In Row Cooling
• Rear Door Heat Exchanger
• Liquid Cooling Racks
• Overhead Cooling
Data center infrastructure includes:
• IT equipment and networks
• Power systems, cooling components, generators, the associated switching equipment, patch panels, cables
• The actual space associated with all these assets, including the data center floor-plan
Emerson Data Center Management - Trellis™
• Data Center Monitoring
  • Events & Alarms
• Capacity Planning
  • Device placement
  • Future planning
• Asset Management
  • Location, assignment
• Energy Management
  • Power consumption
  • Thermal management
• Console Management
  • Systems Administration
IBM Systems Middleware
Solution Area - IT Service Management (ITSM)
[Diagram labels: Capability; Solution Area (Application Platform, IT Service Management, Integration); Build, Run, Manage, Optimize, Innovate; Stakeholder Dashboards; Portfolio Offering Areas (Performance, Automation, Analytics); Digital Experience; Smarter Process; IT Operations; Development; Line of Business Operations; on-premise <----- Hybrid -----> cloud]
• Provide end-to-end insight for smarter business decisions
• Ensure right availability of your critical business solutions
• More agility, with lower cost and risk
• Proactive outage avoidance and faster problem resolution
Where Data Center Management Fits
Predict
Challenge: Reacting to performance thresholds is not enough. To keep mission-critical applications available 24x7, you must prevent outages by proactively avoiding problems before they become service impacting.
Watson DNA inside
Uses fully automated behaviour learning algorithms to establish “normal”, then applies real-time assessment of current conditions to detect anomalies as they emerge and before they become service impacting.
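As a rough illustration of establishing “normal” from history and then assessing new readings against it, the sketch below uses a plain mean and standard deviation as a stand-in for behaviour learning. It is not the algorithm used in the IBM product; the sample values, the function names, and the factor `k` are assumptions made for this example.

```python
from statistics import mean, stdev


def learn_baseline(history):
    """Return (mean, standard deviation) of samples taken during normal operation."""
    return mean(history), stdev(history)


def is_anomaly(value, baseline, k=3.0):
    """Flag a reading that lies more than k standard deviations from the learned mean."""
    mu, sigma = baseline
    return abs(value - mu) > k * sigma


# Hypothetical CPU utilisation samples (percent) from a normal period.
normal_cpu = [41, 39, 40, 42, 38, 40, 41, 39, 40, 40]
baseline = learn_baseline(normal_cpu)

# 47% would pass a typical static threshold such as 80%,
# yet it is far outside what this system normally does.
for sample in (41, 47, 55):
    print(sample, is_anomaly(sample, baseline))
```

The point of the design is that the alert limit comes from the system's own history instead of a hand-maintained number.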
Integrated IT Management
IT Service Management
Data Center Management
Infrastructure Portfolio
Private Cloud
Public Cloud
Specialized Technology Centers
CoLocation
Primary DCs
Secondary DCs
Regional DCs
Distributed Sites
Dynamic Decisions about Best Cost/Service for IT Workloads – “Hybrid Cloud”
IBM Operations Analytics – Predictive Insights
Predict
• Reduce operational costs by minimizing resources required to manage complex thresholds.
• Gain the benefits of early problem detection to avoid application, middleware or infrastructure problems before they impact service.
Product Value Statements
• Automated threshold maintenance: behavioural learning solution for quick time-to-value
• Understands how your IT & network infrastructure is inter-related from a holistic viewpoint
• Reduce event volumes utilizing real-time, streaming analytics to provide early warning alerts for abnormal issues
• Leverages existing performance & monitoring management solutions
• Consolidates and unifies performance data
• Works with IBM & non-IBM management solutions
Early Warning: Learn relationships between metrics, and alarm without static thresholds
[Chart: Response Time and User Requests plotted against Time, with “Good response time” and “Bad response time” levels marked; the Anomaly Event is raised before the point where Business is Impacted.]
Normal behaviour learned: Response Time is longer as the number of User Requests increases.
Early Warning: If this healthy relationship breaks down, say due to a memory leak, an anomaly is raised immediately. The problem is detected even while response time is “good”.
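One way to picture learning a relationship between two metrics is a straight-line fit of response time against user requests, with an alarm when a new observation strays too far from the fitted line. The sketch below is a simplified stand-in under that assumption, not the product's method; the data points, the tolerance factor `k`, and all names are hypothetical.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def residual_spread(xs, ys, slope, intercept):
    """Root-mean-square distance of the training points from the fitted line."""
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    return sqrt(sum(r * r for r in residuals) / len(residuals))


def breaks_relationship(requests, response_ms, model, k=3.0):
    """True when the observation is further than k spreads from the learned line."""
    slope, intercept, spread = model
    expected = slope * requests + intercept
    return abs(response_ms - expected) > k * spread


# Hypothetical training data: response time (ms) rises with user requests.
requests = [100, 200, 300, 400, 500]
response = [52, 61, 69, 80, 88]
slope, intercept = fit_line(requests, response)
model = (slope, intercept, residual_spread(requests, response, slope, intercept))

# 84 ms at 450 requests fits the learned pattern.
print(breaks_relationship(450, 84, model))
# 75 ms at only 200 requests is still a "good" absolute value, but it no
# longer matches the learned relationship, so it is flagged early.
print(breaks_relationship(200, 75, model))
```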
DCIM Delivers Capacity Management
Intelligent sensors provide insight to optimize efficiency and capacity
Capacity Stack
– Electrical power consumption
– Airflow
– Gross cooling capacity (ex CW)
– Delivered Net Sensible capacity
– Remaining Capacity
– Predictive Diagnostics
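The capacity stack can be read as a simple subtraction. The sketch below assumes that remaining cooling capacity is the delivered net sensible capacity minus the heat load, and that the heat load equals the electrical power drawn by the IT equipment. Both are simplifying assumptions for illustration, not a description of how Trellis computes it; the figures and function names are hypothetical.

```python
def remaining_cooling_kw(net_sensible_kw, it_load_kw):
    """Cooling headroom: delivered net sensible capacity minus IT heat load."""
    return net_sensible_kw - it_load_kw


def cooling_utilisation(net_sensible_kw, it_load_kw):
    """Fraction of delivered net sensible capacity already consumed."""
    return it_load_kw / net_sensible_kw


def racks_that_fit(headroom_kw, kw_per_rack):
    """Whole racks of a given power draw that the headroom can absorb."""
    return int(headroom_kw // kw_per_rack)


# Hypothetical room: 120 kW delivered net sensible capacity, 90 kW IT load.
net_sensible_kw = 120.0
it_load_kw = 90.0

headroom = remaining_cooling_kw(net_sensible_kw, it_load_kw)
print(headroom)                                              # 30.0
print(cooling_utilisation(net_sensible_kw, it_load_kw))      # 0.75
print(racks_that_fit(headroom, kw_per_rack=5.0))             # 6
```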
Thermal Visualization
• 3D model of the data center showing the thermal conditions
• Produce 3D rotational view of data center floor and inventory
• Thermal visualization view of front/rear of IT racks
Why aren’t operations teams proactive today?
• Too much data to analyze manually
• Existing analytic techniques, such as standard thresholds, are not up to the task
• They cannot detect problems while they are emerging (before business impact)
• Set the performance threshold too high and there is insufficient warning before total failure
• Set the performance threshold too low and there is too much noise, so everything is ignored
• If there is no ‘early detection’ before the outage, operations teams can only react while the outage is already in effect and already losing money
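The trade-off between a high and a low static threshold can be shown with a few made-up numbers. In the sketch below the memory-usage samples, the two thresholds, and the function names are all hypothetical. The series has routine spikes during normal operation followed by a steady climb to failure at 100%.

```python
def first_alert(samples, threshold):
    """Index of the first sample above the threshold, or None if never exceeded."""
    for i, value in enumerate(samples):
        if value > threshold:
            return i
    return None


def count_alerts(samples, threshold):
    """Number of samples that would raise an alert."""
    return sum(value > threshold for value in samples)


# Hypothetical memory usage (percent): normal operation with routine spikes...
normal = [55, 58, 71, 54, 57, 70, 56, 72, 55, 58]
# ...followed by a steady climb that ends in failure at 100%.
failing = [60, 66, 72, 78, 84, 90, 96, 100]
samples = normal + failing
failure_index = len(samples) - 1

# High threshold: quiet during normal operation, but only one sample of warning.
print(failure_index - first_alert(samples, 95))
# Low threshold: alerts fire on routine spikes, which teaches operators to ignore them.
print(count_alerts(normal, 65))
```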
Datacenter Maturity Model – from Next Generation Datacenter Management, 451 Research (A. Lawrence & R. Ascierto), Jan 2014
IT Analytics Makes Sense of Big ∧ Data
Next Steps
IBM Operations Analytics
Feedback System:
• Set dynamic thresholds of behavior
• Determine acceptable operational limits
• Establish mitigating actions
Define Data Point Groupings to Measure:
• For Specific Operational Functions
• For Subsystem Operation
• For Business Service Impact and Recovery
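The sketch below shows one possible shape for data point groupings with operational limits and mitigating actions. It covers only the limits and actions; dynamic thresholds would have to come from a learning step that is not shown. Every grouping name, metric name, limit, and action here is hypothetical and does not come from Trellis or IBM Operations Analytics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    low: float
    high: float
    action: str  # mitigating action to take when the reading is out of range


# Hypothetical groupings by subsystem, each with its own data points.
GROUPINGS = {
    "cooling subsystem": {
        "supply_air_temp_c": Limit(18.0, 27.0, "increase cooling unit fan speed"),
        "chilled_water_flow_lps": Limit(4.0, 12.0, "start standby pump"),
    },
    "power subsystem": {
        "ups_load_pct": Limit(0.0, 80.0, "shift workload to another power path"),
    },
}


def evaluate(grouping, readings):
    """Return (metric, action) pairs for readings outside their acceptable limits."""
    actions = []
    for metric, value in readings.items():
        limit = GROUPINGS[grouping].get(metric)
        if limit is not None and not (limit.low <= value <= limit.high):
            actions.append((metric, limit.action))
    return actions


readings = {"supply_air_temp_c": 29.5, "chilled_water_flow_lps": 6.0}
print(evaluate("cooling subsystem", readings))
```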
IBM Products and Emerson
IBM Change Management
IBM Asset Management
• Provision assets
• Generate an RFC, follow change approval process
• View asset maintenance history
• Update data in asset repository and Trellis
• Issue work orders
IBM Business Service Management
• Identify potential business services impacted by MACs or data center events
IBM Service Catalog
• Service request kicks off a space planning initiative
IT infrastructure: Leverage firmware “hooks” in servers, storage, networking and telecom equipment
Trellis Platform
Dynamic infrastructure optimization
Physical Infrastructure: Leverage Liebert product and monitoring technology for power, cooling, and environmental data
IBM Monitoring
IBM Event Management
• Event correlation, management, and automation
IBM Analytics
• Intelligent data processing/analytics
• Determine optimal placement of workload
• Identify consolidation opportunities
Thank You
Your Feedback is Important!
Access the InterConnect 2016 Conference Attendee Portal to complete your session surveys from your smartphone, laptop or conference kiosk.