(GPFS™).

advertisement
Intelligent Storage and Data
Management with IBM General
Parallel File System (GPFS)
Maciej REMISZEWSKI
IBM Forum 2012 – Estonia
Tallinn, October 9, 2012
RedBull STRATOS
redbullstratos.com/live/
Critical IT Trends for Technical Computing Users
Explosion of
data
How to spot trends, predict
outcomes and take meaningful
actions?
Inflexible IT
infrastructures
How to manage inflexible, siloed
systems and business processes to
improve business agility?
Escalating IT
complexity
How to manage IT costs and
complexity while speeding time-tomarket for new services?
Introducing the new IBM Technical Computing Portfolio
Powerful. Comprehensive. Intuitive.
Solutions
Integrated
Solutions
Industry
Solutions
Parallel Environment
Runtime
Software
Systems
&
Storage
Platform
MPI
Intelligent
Cluster
Platform Cluster
Manager
BG/Q
iDataPlex
Intelligent Cluster
Big
Data
Platform HPC
Parallel Environment
Platform LSF
Developer
Platform Symphony
Engineering and
Scientific Libraries
PureFlex
HPC
Cloud
System x &
BladeCenter
LTO Tape 3592
Automation
P7-775
GPFS
Platform
Application
Center
SoNAS
DS5000
DS3000 DCS3700
NEW HPC Cloud Solutions
Overview
• Innovative solutions for dynamic, flexible HPC cloud
environments
What’s New
• New LSF add-on: IBM Platform Dynamic Cluster V9.1
o Workload driven dynamic node re-provisioning
o Dynamically switch nodes between physical & virtual
machines
o Automated job checkpoints and migration
o Smart, flexible policy and performance controls
• Enhanced Platform Cluster Manager – Advanced capabilities
• New complete, end to end solutions
Use Case 1:
HPC Infrastructure
Management
• Self-service cluster provisioning &
management
• Consolidate resources into a HPC
cloud
• Cluster flexing
Use Case 2:
Self-service HPC
• Self-service job submission &
management
• Dynamic provisioning
• Job migration and/or checkpointrestart
• 2D/3D remote visualization
Use Case 3:
Cloud Bursting
• ‘Burst’ internally to available
resources
• Burst externally to cloud providers
NEW Financial Risk and Crimes Solution
Overview
•High-performance, low-latency integrated risk solution
stack with Platform Symphony - Advanced Edition and
partner products including:
o BigInsights and IBM Algorithmics
o 3rd party Partner products: Murex and Calypso
What’s New
•New solution stacks to manage and process big data with
speed and scale
•Sales tools that highlight value of IBM Platform Symphony
o Financial Risk: Customer testimonial videos; inclusion in
SWG risk frameworks, and S&D blueprints
o Financial Crime: with BigInsight for credit card fraud
analytics
o TCO tool and benchmarks
Use Case 1:
Financial Risk including
Credit Value
Adjustment (CVA)
analytics
• Accelerates compute intensive
workloads up to 4X e.g. Monte
Carlo simulations, Algorithmic
Riskwatch “cube” simulations
• Integrated with IBM
Algorithmics, Murex and
Calypso
• High throughput: 17K tasks/sec
Use Case 2:
Big Data for Financial
Crimes
• Accelerates analyses of data for
fraud and irregularities
• Supports BigInsight
• Faster than Apache Hadoop
distribution
8
NEW Technical Computing for Big Data Solutions
Overview
• High-performance, low-latency “Big Data”
solution stack featuring Platform Symphony,
GPFS, DCS3700, Intelligent Cluster – proven
across many industries
•Low Latency Hadoop stack with Platform
Symphony, Advanced Edition and InfoSphere
BigInsights
What’s New
•New solution stacks to manage and process
big data with speed and scale
IBM General Parallel File System (GPFS)
IBM DCS3700 /
IBM Intelligent Cluster
9
IBM is delivering a Smarter Computing foundation for
Technical Computing
Designed for data
Tuned to the task
Increase throughput,
utilization, and lower
operating costs with
workload optimized
systems and intelligent
resource management
Smarter
Computing
Achieve faster time to insight with
scalable, low latency data access
and control for Big Data analytics
Managed with
Cloud Technologies
Optimize agility with on-demand
and workload-driven dynamic
cluster, grid, and HPC clouds
Technical Computing Software
Simplified management, optimized performance
The backbone of Technical Computing
IBM acquired Platform Computing, a leader in cluster,
grid, and HPC cloud management software
•
20 year history delivering leading management software for
technical computing and analytics distributed computing
environments
•
Enable the usage and management of 1000s of systems as 1
- powering the evolution from clusters to grids to HPC clouds
•
2000+ global customers including 23 of 30 largest enterprises
•
Market leading scheduling engine with high performance,
mission-critical reliability and extreme scalability
•
Comprehensive capability footprint from ready-to-deploy
complete cluster systems to large global grids
•
Heterogeneous systems support
•
Large ISV and global partner ecosystem
•
Global services and support coverage
De facto
Standard for
Commercial HPC
60% of top
Financial Services
Over 5 MM CPUs
under
management
x
From June 2012, IBM Platform Computing™ portfolio is ready to deploy!
12
IBM Platform Computing
can help accelerate your application results
For technical computing and analytics distributed computing environments
Optimizes
Workload
Management
•
•
•
•
Batch and highly parallelized
Policy & resource-aware scheduling
Service level agreements
Automation / workflow
Aggregates
Resource
Pools
•
•
•
•
Compute & Data intensive apps
Heterogeneous resources
Physical, virtual, cloud
Easy user access
Delivers
Shared
Services
•
•
•
•
Multiple user groups, sites
Multiple applications and workloads
Governance
Administration/ Reporting / Analytics
Transforms Static
Infrastructure to
Dynamic
•
•
•
•
Workload-driven dynamic clusters
Bursting and “in the cloud”
Enhanced self-service / on-demand
Multi-hypervisor and multi-boot
13
Clients span many industries
Platform LSF
x
“Platform Computing came to us as a true
innovation partner, not a supplier of technology,
but a partner who was able to understand our
problems and provide appropriate solutions to us,
and work with us to continuously improve
the performance of our system”
- Steve Nevey, Business Development Manager
Red Bull Technology
Watch Red Bull video
Platform Symphony
European Bank "Platform Computing
and its enterprise grid solution enable us
to share a formerly heterogeneous and
distributed hardware infrastructure across
applications regardless of their location,
operating system and application logic, …
helping us to achieve our internal efficiency
targets while at the same time improving
our performance and service quality“
-Lorenzo Cervellin, Head of Global Markets
and Treasury Infrastructure
UniCredit Global Information Services
Platform HPC
“Platform’s software was a
clear leader from the
beginning of the process”
-Chris Collins, Head of Research &
Specialist Computing
University of East Anglia
IBM also offers the most widely used, commercially available,
technical computing data management software
Cluster file system, all nodes access data.
GPFS™
Virtualized Access to Data
Virtualized, centrally deployed, managed,
backed up and grown
Seamless capacity and performance scaling
IBM General Parallel File System - Scalable, highlyavailable, high performance file system optimized for
multi-petabyte storage management
GPFS pioneered Big Data management
Extreme Scalability
Proven Reliability
Performance
File system

263 files
per file system
 Maximum file system size:
bytes
299
 Maximum file size equals file
system size
 Production 5.4 PB file system
Number of nodes
 1 to 8192
No Special Nodes
High Performance Metadata
Add/remove on the fly
Striped Data
Nodes
Equal access to data
Storage
Integrated Tiered storage
Rolling Upgrades
Administer from any node
Data replication
IBM innovation continues with GPFS Active File Management
(AFM) for global namespace
GPFS
GPFS
GPFS
GPFS
GPFS
x
GPFS
GPFS introduced
concurrent file
system access
from multiple
nodes.
Multi-cluster expanded the
global namespace by
connecting multiple sites
AFM takes global namespace
truly global by automatically
managing asynchronous
replication of data
1993
2005
2011
Storage Rich
Servers
NSD/GNRx
Legacy HPC
Storage
Native Raid
x
GPFS™
AFM
Eliminating data islands
SNC
GNRx
MapReduce
Farm
Legacy NFS
Storage
How can GPFS deliver value to your business?
Speed time-tomarket thru
faster analytics
Knowledge
Management
and efficiency
thru file
sharing
x
Reduce storage
costs thru
Life-cycle
Management
GPFS
Maintain business
continuity
thru Disaster
Recovery
Business
flexibility with
Cloud Storage
Innovate with
Map-Reduce/
Hadoop
Speed time-to-market with faster analytics
Faster time-tomarket thru
faster analytics
Knowledge
Management
and efficiency
thru file
sharing
Reduce storage
costs thru
Life-cycle
Management
GPFS
Maintain business
continuity
thru Disaster
Recovery
Business
flexibility with
Cloud Storage
Innovate with
Map-Reduce/
Hadoop
• Issue:
– We are in the era of “Smarter Analytics”
• Data explosion makes I/O a major hurdle.
• Deep analytics result in longer running workloads
• Demand for lower-latency analytics to beat the competition
x
• GPFS was designed for complex and/or large workloads
accessing lots of data:
– Real time disk scheduling and load balancing ensure all relevant
information and data can be ingested for analysis
– Built-in replication ensures that deep analytics workloads can
continue running should a hardware or low level software failure
occur.
– Distributed design means it can scale as needed
Reduce storage costs thru Life-cycle Management
Faster time-tomarket thru
faster analytics
Knowledge
Management
and efficiency
thru file
sharing
Reduce storage
costs thru
Life-cycle
Management
GPFS
Maintain business
continuity
thru Disaster
Recovery
Business
flexibility with
Cloud Storage
Innovate with
Map-Reduce/
Hadoop
• Issue:
– Increasing storage costs as dormant files sit on spinning disks
– Redundant files stored across the enterprise to ease access
– Aligning user file requirements with cost of storage
x
• GPFS has policy-driven, automated tiered storage
management for optimizing file location.
– ILM tools manage sets of files across pools of storage based
upon user requirements
– Tiering across different economic classes of storage: SSD,
spinning disk, tape – regardless of physical location.
– Interface with external storage sub-systems such as TSM and
HPSS to exploit ILM capability enterprise-wide.
Maintain business continuity thru disaster recovery
Faster time-tomarket thru
faster analytics
Knowledge
Management
and efficiency
thru file
sharing
Reduce storage
costs thru
Life-cycle
Management
GPFS
Maintain business
continuity
thru Disaster
Recovery
Business
flexibility with
Cloud Storage
Innovate with
Map-Reduce/
Hadoop
• Issue:
– Need for real-time or low latency file access
– File data contained in geographic areas susceptible to downtime
– Fragmented file based information across a wide geographic
area
x
• GPFS has inherent features that are designed to ensure high
availability of file-based data
– Remote file replication with built-in failover
– Multi-site clustering enables risk reduction of stored data via
WAN
– Space efficient point-in-time snapshot view of the file system
enabling quick recovery
Innovate with Big Data or Map-Reduce/Hadoop
Faster time-tomarket thru
faster analytics
Knowledge
Management
and efficiency
thru file
sharing
Reduce storage
costs thru
Life-cycle
Management
GPFS
Maintain business
continuity
thru Disaster
Recovery
Business
flexibility with
Cloud Storage
Innovate with
Map-Reduce/
Hadoop
• Issue:
– Unlocking value in large volumes of unstructured data
– Mission critical applications requiring enterprise-tested
reliability
– Looking for alternatives to the Hadoop File System (HDFS) for
map-reduce applications
x
• As part of a Research project, there is an active
development project called GPFS-SNC to provide a robust
alternative to HDFS
– HDFS is a centralized file system with a single point of failure,
unlike the distributed design of GPFS
– GPFS Posix compliance expands the range of application that can
access files (read, write, append) vs HDFS which cannot append
or overwrite.
– GPFS contains all of the rich ILM features for high availability and
storage management, HDFS does not.
Knowledge management and efficiency thru file sharing
Faster time-tomarket thru
faster analytics
Knowledge
Management
and efficiency
thru file
sharing
Reduce storage
costs thru
Life-cycle
Management
GPFS
Maintain business
continuity
thru Disaster
Recovery
Business
flexibility with
Cloud Storage
Innovate with
Map-Reduce/
Hadoop
• Issue:
– Geographically dispersed employees need access to same set of
file based information
– Supporting “follow-the-sun” product engineering and
development processes (CAD, CAE, etc)
– Managing and integrating the workflow of highly fragmented
and geographically dispersed
file
data
generated
by
employees
x
• GPFS global name space support and Active File
Management provide core capabilities for file sharing
– Global namespace enables a common view of files, file location
no matter where the file requestor, or file resides.
– Active File Management handles file version control to ensure
integrity.
– Parallel data access allows for large number of files and people
to collaborate without performance impact.
Intelligent Cluster
System x iDataPlex
Optimized platforms to right-size
your Technical Computing operations
IBM leadership for a new generation of Technical Computing
Technical Computing is no longer just
the domain of large problems
– Businesses of all sizes need to harness the
explosion of data for business advantage
– Workgroups and departments are increasingly
using clustering at a smaller scale to drive new
insights and better business outcomes
– Smaller groups lack the skills and resources to
deploy and manage the system effectively
x
IBM Technical Computing expertise
IBM brings experience in supercomputing to
smaller workgroup and department
clusters with IBM Intelligent Cluster™
– Reference solutions for simple deployment across
a range of applications
– Simplified end-to-end deployment and resource
management with Platform HPC software
– Factory integrated and installed by IBM
– Supported as an integrated solution
– Now even easier with IBM Platform Computing
IBM intelligence for clusters of all sizes!
IBM Intelligent Cluster™ – it’s about faster time-to-solution
Take the time and risk out Technical Computing deployment
Building Blocks: Industry-leading IBM and 3rd Party components
Cluster
Manageme
nt
IBM Intelligent Cluster
OS
x
Manageme
nt Servers
Compute
Nodes
Networkin
g
Storage
Factory-integrated, interoperability-tested
system with compute, storage, networking
and cluster management tailored to your
requirements and supported as a solution!
Design
Build
Test
Install
Support
Allows clients to focus on their business
not their IT – that is backed by IBM
IBM Intelligent Cluster simplifies large and small deployments
Large
Small
Research
LRZ SuperMUC
Europe-wide research cluster
9,587 servers, direct-water cooled
x
University of Chile
Earthquake prediction and astronomy
56 servers, air-cooled
Media
Illumination Entertainment
3D Feature-length movies
800 iDataPlex servers
Rear-Door Heat eXchanger cooled
Kantana Animation Studios
Thailand television production
36 iDataPlex servers, air-cooled
Technical Computing Storage
Complete, scaleable, dense solutions
from a single vendor
IBM System Storage® for Technical Computing
Middleware/Tools
Storage
DCS 3700
LTO Tape
GPFS
SONAS
x
DS5000
Services
Complete, scaleable, integrated solutions from single vendor
Scaling to the multi-petabyte and hundreds gigabyte/sec
Industry leading data management software and services
Big Green features lower overall costs
Worldwide support and service
IBM’s Densest Storage Solution Just Got Better…
IBM System Storage DCS3700 Performance Module
6Gb/s x4 SAS-based storage system
Expandable performance, scalability and
density starting at entry-level prices
• Powerful hardware platform
•
•
•
•
2.13GHz quad core processor
12, 24 or 48GB cache / controller
pair
x
8x base 8Gb FC ports / controller pair
Additional host port options via Host Interface Cards
• Drastically Improved Performance
• Supports up to 360 drives
• Fully supports features in recent and upcoming releases
• 10.83 feature set
•
DDP, Enhanced FlashCopy, FlashCopy Consistency Groups, Thin
Provisioning, ALUA, VAAI
Expanded Capabilities of IBM’s Densest Storage Solution…
IBM System Storage DCS3700 now with
Performance Module Option
6Gb/s x4 SAS-based storage system
Expandable performance, scalability and
density starting at entry-level prices
• New DCS3700 Performance Controller
• High density storage system designed xfor General Purpose Computing and High
Performance Technical Computing applications
• IBM’s densest disk system: 60 drives and dual controllers in 4U now scales to over 1PB
per system with 3TB drives
• New Dynamic Disk Pooling feature enables easy to configure Worry-Free storage
reducing maintenance requirements and delivering consistent performance
• New Thin Provisioning, ALUA, VAAI, Enhanced FlashCopy features deliver increased
utilization, higher efficiency, and performance
• Superior serviceability and easy installation with front load drawers
• Bullet-proof reliability and availability designed to ensure continuous
high-speed data delivery
The DCS3700 Can Scale In clusters…with IBM GPFS™
• Combining IBM’s GPFS clustered file management software and DCS3700, creates
an extremely scalable and dense file-based management system
• Using a flexible architecture, “building blocks” of DCS3700+GPFS can be organized
Configuration
Single Building
Block
Two Building
Blocks
2 GPFS x3650
Servers
3 DCS3700
4 GPFS x3650
Servers
6 DCS3700
360TB
262TB
720TB
524TB
Up to 4.8 GB/s
Up to 5.5 GB/s
Up to 9.6 GB/s
Up to 11.0 GB/s
3,600 IOP/s
6,000 IOP/s
7,200 IOP/s
12,000 IOP/s
x
Capacity:
Raw
Usable
Streaming Rate:
Write
Read
IOP Rate (4K trans.)
Write
Read
Customer Success Stories
Applying IBM technology and experience
to solve real-world issues and deliver value
National Digital Archive (NDA) – Poland
Heritage and cultural preservation for society.
The Need:
NDA needed a cost-effective IT solution that it could use to significantly increase
internal efficiencies and archiving capacity. NDA currently holds almost 15 million
photographs, 30,000 sound recordings and 2,500 films. It provides free access to
these materials.
The Solution:
The client implement a solution based on IBM Power Systems servers, IBM System
Storage devices, and IBM GPFS. To provide scalability for the ongoing work,
CompFort Meridian helped NDA implement an IBM Power 750
Express server. Using
x
this system, the client will be able to rapidly expand the digital archive without
impeding the performance of the ZoSIA service. NDA also uses IBM General
Parallel File System to conduct online storage management and integrated
information lifecycle management and to scale accessibility to the expanding
volume of archived material. Using this solution, the client can maintain the
performance of the ZoSIA service even when numerous users access the same
resource at the same time.
The Benefit
NDA saved its nearly 290,000 users an estimated $35million by enabling them to
check archives online rather than spending the time and money required to visit
NDA in person. The client also gained a high-performance, stable and secure
solution to support the ZoSIA online archive system. In addition, with this solution in
place, NDA can consolidate national and cultural remembrance - and extend a sense
of national origin and heritage into the future.
Solution components:
 IBM Power
 IBM General Parallel File
System
DEISA
Enabling 15 European supercomputing centers to collaborate
The need:
“Our work with IBM GPFS
DEISA wanted to advance European computational science through
close collaboration between Europe’s most important
supercomputing centers by supporting challenging computing tasks
and sharing data across a wide-area network.
demonstrates the viability and
usefulness of a global file system for
collaboration and data-sharing
between supercomputing centers—
even when the individual
supercomputing clusters are based
on very different technical
architectures. The flexibility of
GPFS and its ability to support all
the different DEISA supercomputers
is highly impressive.”
The solution:
x architectures to
To allow different IBM® and non-IBM supercomputer
access data from across a wide-area network, DEISA worked closely
with the IBM Deep Computing team to create a global multicluster file
system based on IBM General Parallel File System (GPFS™).
The benefit:
 Allows scientists in different countries to share supercomputing
resources and collaborate on large-scale projects
 Enables allocation of specific computing tasks to the most
suitable supercomputing resources, boosting performance.
 Provides a rapid, reliable and secure shared file system for a
variety of supercomputing architectures—both IBM and non-IBM.
—Dr. Stefan Heinzel,
Director of the Rechenzentrum Garching
at the Max Planck Society
Solution components:
 IBM Power Systems™
 IBM BlueGene®/P
 IBM PowerPC®
 Several non-IBM supercomputing
architectures including Cray XT5,
NEC SX8 and SGI-Altix
 IBM® General Parallel File System
(GPFS™)
Snecma
HPC to achieve regulatory objectives
The Need:
To reduce its impact on the environment, Snecma adopts a number of key factors
such as reducing fuel consumption and therefore greenhouse gas emissions,
reducing noise and choosing environment-friendly materials for manufacturing
and maintenance of aviation engines. The company was required to meet the
'Vision 2020' plan set by the European community. The plan defines the European
aviation industry’s objectives for 2020, with ambitious environmental objectives,
including:
- 50% reduction in perceived noise and CO2 releases per passenger-kilometer
x to the year 2000.
- 80% reduction in nitrogen oxide (NOx) emissions compared
To meet these objectives, Snecma needed heavy investment in research and
development, powered with supercomputers.
The Solution:
Snecma implemented a powerful high performance computing (HPC) environment
with optimal energy efficiency. The core architectural components were based
upon highly dense, low power server cluster packaging, low latency
interconnect, and a high performance parallel file system.
Thanks to IBM technologies, Snecma gains a powerful and reliable high performance
computing solution. The new supercomputer will be used by leading-edge researchers
to make highly complex computations in the aviation field. The simulations carried
on iDataPlex supercomputer allow Snecma to reduce fuel consumption and
therefore greenhouse gas emissions, reduce noise, while addressing data
center energy crisis.
Solution components:
 IBM System x iDataPlex
 IBM General Parallel File
SystemTM
Vestas Wind Systems
Maximize power generation and durability in its wind turbines with HPC
The Opportunity
What Makes it Smarter
This wind technology company relied on the
World Research and Forecasting modeling
system to run its turbine location algorithms,
in a process generally requiring weeks and
posing inherent data capacity limitations.
Poised to begin the development of its own
forecasts and adding actual historical data from
existing customers to the mix of factors used in
the model, Vestas needed a solution to its Big
Data challenge that would be faster, more
accurate, and better suited to the its expanding
data set.
Precise placement of a wind turbine can make a significant difference in the turbine's
performance–and its useful life. In the competitive new arena of sustainable energy, winning the
business can depend on both value demonstrated in the proposal and the speed of RFP response.
Vestas broke free of its dependency on the World Research and Forecasting model with a powerful
solution that sliced weeks from the processing time and more than doubled the capacity needed to
include all the factors it considers essential for accurately predicting turbine success. Using a
supercomputer that is one of the world's largest to-date and a modeling solution designed to
harvest insights from both structured and unstructured data, the company can factor in
temperature, barometric pressure, humidity, precipitation, wind direction and wind velocity at
the ground level up to 300 feet, along with its own recorded data from customer turbine
placements. Other sources to be considered include global deforestation metrics, satellite images,
geospatial data and data on phases of the moon and tides. The solution raises the bar for due
diligence in determining effective turbine placement.
x
Real Business Results
– Reduces from weeks to hours the response time for business user requests
– Provides the capability to analyze ALL modeling and related data to improve the accuracy of
turbine placement
– Reduces cost to customers per kilowatt hour produced and increases the precision of
customer ROI estimates
Solution Components
• IBM Technical Computing – General Parallel File System
• IBM InfoSphere® BigInsights Enterprise Edition
• IBM System x ®, iDataPlex ®
“Today, more and more sites are in complex terrain. Turbulence is a big
factor at these sites, as the components in a turbine operating in
turbulence are under more strain and consequently more likely to fail.
Avoiding these pockets of turbulence means improved cost of energy for
the customer."
- Anders Rhod Gregersen,
Senior Specialist, Plant Siting & Forecasting
…best to hear from a Client themselves
Please join me in welcoming
Ivar Koppel
deputy director of research
Download