Reap the Benefits of the Evolving HPC Cloud

advertisement
IBM Platform Computing
By John Russell, Contributing Editor, Bio•IT World
Produced by Cambridge Healthtech Media Group
Harnessing the necessary high performance compute
power to drive modern biomedical research is a formidable
and familiar challenge throughout the life sciences. Modern
research-enabling technologies – Next Generation Sequencing (NGS), for example – generate huge datasets that must be
processed. Key applications such as genome assembly, genome annotation and molecular modeling can be data-intensive, compute-intensive, or both. Underlying high performance
computing (HPC) infrastructures must evolve rapidly to keep
pace with innovation. Not least, cost pressures constrain
large and small organizations alike.
In such a demanding and dynamic HPC environment, Cloud
Computing(i) technologies, whether deployed as a private
cloud or in conjunction with a public cloud, represent a powerful approach to managing technical computing resources. By
breaking down internal compute silos, by masking underlying HPC complexity from the scientist-clinician researcher user
community, and by providing transparency and control to IT
managers, cloud computing strategies and tools help organizations of all sizes effectively manage their HPC assets and
the growing compute workloads that consume them.
The IBM® Platform Computing™(ii) portfolio has been driving
the evolution of distributed computing and the HPC Cloud
for over 20 years. Ground-breaking products such as IBM®
Platform™ LSF® were among the first to enable companies to
manage distributed environments from modest clusters to
massive compute farms with tens of thousands of processors
handling thousands of jobs. Most recently, the introduction of
IBM® Platform™ Dynamic Cluster, together with IBM® Platform™
Cluster Manager - Advanced Edition, makes it possible to turn LSF environments into a dynamic HPC cloud.
At the heart of all shared technical computing is robust
middleware that is positioned between the collection of applications and the diverse IT resources, to handle workload
scheduling and resource orchestration. IBM Platform Computing products fulfill this critical role, providing powerful solutions
for batch-mode computing, service-oriented architectures
(SOA), and innovative MapReduce approaches now being
widely adopted in life science.
Sequencing Centers are Early HPC Cloud Adopters
Perhaps not surprisingly, large sequencing centers have
been early adopters of high performance cloud computing. Their HPC demands are immense. Worldwide, annual
sequencing capacity was estimated at 13 quadrillion bases
at the end of 2011(iii). It’s also worth noting a single base pair
typically represents about 100 bytes of data (raw, analyzed,
and interpreted). Here’s one example of a leading sequencing
center employing HPC cloud concepts to cope with its data
avalanche:
The Wellcome Trust Sanger Institute, a pioneer in the international
consortium to sequence the human genome(iv), has relied on IBM
Platform LSF for more than a decade. In 2010, the Sanger Institute generated more than 9 petabytes of genomics data(v), and today it produces roughly 120 TB of raw data per week that has to be processed
for analysis. IBM Platform LSF helps the Institute run up to half a
million sequence matching jobs a day, resulting in improved data
processing capability, researchers’ efficiency and time-to-results.
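The figures quoted above can be turned into rough annual volumes. The short calculation below is only a back-of-the-envelope sketch: it uses the numbers stated in the text and assumes decimal units (1 TB = 10^12 bytes) and a 52-week year, neither of which the source specifies.

```python
# Back-of-the-envelope data volumes, using only figures quoted in the text.
# Assumptions (not from the source): decimal units and a 52-week year.

BASES_PER_YEAR = 13e15          # 13 quadrillion bases, worldwide, end of 2011
BYTES_PER_BASE = 100            # raw, analyzed, and interpreted data per base pair
SANGER_TB_PER_WEEK = 120        # raw data produced per week
SANGER_JOBS_PER_DAY = 500_000   # "up to half a million" jobs a day

TB = 1e12
PB = 1e15
EB = 1e18

worldwide_eb_per_year = BASES_PER_YEAR * BYTES_PER_BASE / EB
sanger_pb_per_year = SANGER_TB_PER_WEEK * TB * 52 / PB
sanger_jobs_per_second = SANGER_JOBS_PER_DAY / (24 * 60 * 60)

print(f"Worldwide: about {worldwide_eb_per_year:.1f} EB per year")
print(f"Sanger raw data: about {sanger_pb_per_year:.2f} PB per year")
print(f"Sanger job rate: about {sanger_jobs_per_second:.1f} jobs per second")
```

Under these assumptions, worldwide capacity corresponds to roughly 1.3 exabytes a year, and the Institute's weekly raw output adds up to a little over 6 petabytes a year.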
Indeed, HPC cloud computing can be particularly useful in
the life sciences industry where a long legacy of departmental
independence often results in islands of costly but under-used
and under-supported HPC resources. Moreover, the intense
merger and acquisition activity of the past two decades has
exacerbated the industry’s challenge of managing disparate
systems.
Integrating these IT islands into a well-managed shared
environment not only improves efficiency and widens user access but also allows effective rightsizing – up or down – of the
company’s total HPC infrastructure. And flexibility is improved
by the ability to match individual jobs with their best computational resource fit. This matters more and more as heterogeneous computing (GPGPU-based, FPGA-based, MIC(vi), etc.)
is used to accelerate specific life science applications.
More broadly, HPC Clouds provide a data center transformation that rationalizes use of IT resources, shortens R&D’s
time-to-results, and ultimately delivers benefits to the business
bottom line. IBM Platform Computing technologies can drive
this transformation.
Accelerating Batch, SOA, and MapReduce Jobs
There is no one-size-fits-all HPC infrastructure. Company
size, HPC workload (e.g. batch or SOA), and user community
requirements all influence the nature of underlying IT architectures. IBM Platform Computing offers a comprehensive range
of systems management solutions for distributed HPC environments. Here’s a brief snapshot of the core offerings:
IBM Platform LSF. Ideal for accelerating batch-oriented computing, Platform LSF is a powerful workload management platform that
scales from small clusters to massive compute farms and grids. It
provides comprehensive, policy-driven scheduling features that enable utilization of all your compute infrastructure resources to ensure optimal application performance.
IBM® Platform™ HPC. Intended to speed and simplify cluster deployment and use, Platform HPC is a complete solution in a single
product. It includes a range of out-of-the-box features designed to
help reduce the complexity of your HPC environment and improve
your time-to-solution. Platform HPC is only available bundled with
hardware.
IBM® Platform™ Symphony. When low-latency SOA or MapReduce
functionality is required, Platform Symphony is the choice. It delivers
powerful enterprise-class management for running both compute
and data intensive distributed applications on a scalable, shared
grid. It accelerates dozens of parallel applications, for faster results
and better utilization of all available resources.
All IBM Platform Computing solutions offer highly flexible
policy-based scheduling models to ensure the right job prioritization and resource allocations are executed on a continuously updated basis. Charge-back and guarantees also help ensure that groups get their share of resources to meet business
requirements, making it faster and simpler to deploy heterogeneous applications on the same shared cluster, grid or cloud.
Diverse resources are shared fluidly, bringing utilization closer
to 100 percent, which can translate to reduced time to results,
higher service levels and less labor required to manage IT –
and reduced infrastructure costs for your organization.
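To make the idea of share-based scheduling concrete, here is a minimal sketch of one common policy: each free slot goes to the group that is currently furthest below its configured share. This is an illustration only, not the algorithm used by any IBM Platform Computing product; the `Group` class, the share weights, and the job names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    name: str
    share: int                      # relative weight (hypothetical values)
    used_slots: int = 0
    pending: list = field(default_factory=list)


def pick_next_group(groups):
    """Return the group with pending work that is furthest below its share."""
    candidates = [g for g in groups if g.pending]
    if not candidates:
        return None
    return min(candidates, key=lambda g: g.used_slots / g.share)


def dispatch(groups, free_slots):
    """Hand out free slots one at a time and return the dispatch order."""
    order = []
    for _ in range(free_slots):
        group = pick_next_group(groups)
        if group is None:
            break
        job = group.pending.pop(0)
        group.used_slots += 1
        order.append((group.name, job))
    return order


genomics = Group("genomics", share=6, pending=[f"align-{i}" for i in range(20)])
chemistry = Group("chemistry", share=4, pending=[f"dock-{i}" for i in range(20)])
order = dispatch([genomics, chemistry], free_slots=10)
print(genomics.used_slots, chemistry.used_slots)
```

With ten free slots and weights of 6 and 4, the two groups end up with six and four slots respectively, even though both have far more work queued than the cluster can run at once.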
Turning on the HPC Cloud
The newest member of the IBM Platform Computing lineup
– IBM Platform Dynamic Cluster – is available as an add-on to
IBM Platform LSF. It turns static Platform LSF clusters into dynamic, shared cloud infrastructure. By automatically changing
the composition of clusters to meet ever-changing workload
demands, it improves service levels and lets organizations
do more work with less infrastructure.
Unlike many other solutions, Platform Dynamic Cluster provides the flexibility to automatically provision mixed physical
and virtual environments and leverages existing investments
in hypervisors, management tools, and virtual machine templates to create a dynamic private HPC cloud environment.
With intelligent policies and features such as live job migration
and automated checkpoint-restart, Platform Dynamic Cluster
helps you increase utilization, while reducing administrator
workload.
Another key member of the IBM Platform LSF family is IBM
Platform Application Center, a powerful and comprehensive
web-based portal for HPC job submission and management.
This is a critical capability for life sciences, particularly in genomics where data analysis pipelines are often complex and
involve several HPC applications being run by the researcher
community. IBM Platform Application Center provides job submission
with customizable application submission forms and built-in templates,
job monitoring, web services APIs, and integration with
industry-leading ISVs.
By reducing training and support needs, by improving
resource security, and by minimizing user errors, IBM Platform Application Center boosts HPC cloud productivity and
increases user-community satisfaction. Standardizing access
to applications also makes policy enforcement easier.
It bears repeating that cluster management – in addition to
workflow and application management – is a fundamental enabler of HPC cloud and shared computing. Two offerings, IBM
Platform Cluster Manager – Standard Edition and IBM Platform
Cluster Manager - Advanced Edition, are able to deliver rapid,
robust deployment of HPC clusters, which can then be turned
into true HPC clouds.
IBM Platform Cluster Manager – Standard Edition is well suited for
clients with a single HPC cluster and more static application requirements. Platform Cluster Manager – Standard Edition delivers the capability to quickly provision, run, manage and monitor HPC clusters
with unprecedented ease.
IBM Platform Cluster Manager – Advanced Edition automates assembly of multiple high-performance technical computing clusters
on a shared compute infrastructure for use by multiple teams. IBM
Platform Cluster Manager – Advanced Edition includes support for
multi-tenant HPC cloud and multiple workload managers.(vii) It creates an agile environment for running both technical computing and
big data analytics workloads to consolidate disparate cluster infrastructure, resulting in increased hardware utilization and the ability
to meet or exceed service level agreements while lowering costs.
IBM Platform Cluster Manager – Advanced Edition is
designed to create a dynamic and flexible multi-tenant HPC
environment where individual clusters can be deployed on demand. By consolidating silos of cluster infrastructure, multiple
groups can share a common HPC resource pool. Through rapid physical and virtual machine provisioning, entire clusters can
be deployed and sized to meet the ever-changing demands
of HPC workloads. Administrative overhead is greatly reduced
through self-service and a comprehensive administrative
console.
Choosing the Best IBM Platform Computing Option
Despite the range of IBM Platform Computing offerings,
choosing the most appropriate option for your organization is
generally straightforward.
Platform LSF, for example, is best suited to batch workloads,
whether many single-threaded jobs that each run for a short time or
a single parallelized application run on thousands of processors for hours. Platform LSF optimizes the running of the
jobs on the HPC resources and ensures the compute facility
runs as close to 100 percent as is practical. LSF is available
in three editions – Express, Standard, and Advanced – with
best fit based upon the scale of requirements. For many life
science organizations whose multiple user constituencies run
time-consuming, compute-intensive informatics workloads
(e.g. QSAR analysis(viii)) and vie for valuable HPC resources,
batch processing is often the best approach to maximizing the use of IT
resources.
For those building modest size clusters and also buying the
hardware, IBM Platform HPC is an excellent choice to speed
and simplify the process. Based on Platform LSF, Platform
HPC also has a number of other capabilities built in (provisioning, cluster management, IBM Platform MPI, and reporting,
for example). It features a web-based dashboard for systems
monitoring and troubleshooting, and simplified job submission and management.
In contrast to Platform LSF, Platform Symphony is designed
to handle huge volumes of extremely low-latency tasks for
both compute- and data-intensive workloads. Historically, the
challenge was this: if a job took a tenth of a second
to run but the scheduler dispatched jobs only once every ten
seconds, the processor sat idle for 99% of the time
on these jobs. Platform Symphony can schedule jobs on the
order of milliseconds, and the latency between tasks is on the
order of four milliseconds.
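The arithmetic behind the 99% figure, and the effect of a four-millisecond gap between tasks, can be checked with a short calculation. This is a simplified model and an assumption on our part: it treats a processor as receiving one task per dispatch cycle and ignores everything else that happens on a real cluster. The function names are illustrative only.

```python
def utilization_fixed_interval(task_seconds, dispatch_interval_seconds):
    """Busy fraction when one task is dispatched per fixed interval."""
    return min(1.0, task_seconds / dispatch_interval_seconds)


def utilization_with_gap(task_seconds, gap_seconds):
    """Busy fraction when each task is followed by a fixed scheduling gap."""
    return task_seconds / (task_seconds + gap_seconds)


TASK_SECONDS = 0.1                      # a tenth of a second per job

slow = utilization_fixed_interval(TASK_SECONDS, 10.0)   # one dispatch every 10 s
fast = utilization_with_gap(TASK_SECONDS, 0.004)        # 4 ms between tasks

print(f"10 s dispatch interval: {slow:.0%} busy, {1 - slow:.0%} idle")
print(f"4 ms inter-task latency: {fast:.1%} busy")
```

In this model the processor is busy 1% of the time with a ten-second dispatch interval, and about 96% of the time when the gap between tasks shrinks to four milliseconds.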
Platform Symphony also accommodates the life sciences’
recent embrace of MapReduce computing (proximity computing in which Big Data is divided into smaller chunks that are colocated on the node on which they are run(ix)). Data-intensive
informatics applications such as genome assembly, variant
calling and other annotation can be significantly accelerated
using MapReduce approaches. A recent performance benchmark by STAC Research showed that Platform Symphony
scheduling is 63 times faster than the open source Hadoop
implementation. The overall result can be dramatic productivity gains.
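For readers new to the model, the following single-process sketch shows the map, shuffle, and reduce steps applied to a typical genomics task: counting k-mers across chunks of sequence reads. It does not use the Hadoop or Platform Symphony APIs. The function names and the toy reads are hypothetical, and in a real deployment each chunk would be processed on the node that stores it.

```python
from collections import defaultdict


def map_kmers(read, k):
    """Map step: emit a (k-mer, 1) pair for every k-length window in a read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: sum the values collected for each k-mer."""
    return {key: sum(values) for key, values in grouped.items()}


def count_kmers(chunks, k):
    """Run map over every read in every chunk, then shuffle and reduce."""
    pairs = []
    for chunk in chunks:            # each chunk stands in for data on one node
        for read in chunk:
            pairs.extend(map_kmers(read, k))
    return reduce_counts(shuffle(pairs))


chunks = [["ACGTAC", "GTACGT"], ["TACGTA"]]   # toy reads, two "nodes"
counts = count_kmers(chunks, 3)
print(counts)
```

Because the map step only looks at one read at a time, chunks can be processed independently and in parallel; only the much smaller (k-mer, count) pairs need to move between nodes.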
As a practical reality, most life science organizations cannot
afford (cost, space, staff) to acquire and maintain all the HPC
infrastructure required to meet every need. The IBM Platform
Computing portfolio readily supports “Cloud Bursting” –
reaching beyond your internal cloud to a public cloud as your
capacity requirements fluctuate or as specialized hardware
needs arise.
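The bursting decision itself can be sketched in a few lines. The example below is a hypothetical placement rule, not the behavior of any IBM Platform Computing product: a job runs internally if the private cloud has the right hardware and enough free slots, and is sent to a public cloud otherwise. Job names, slot counts, and hardware labels are invented for illustration.

```python
def place_jobs(jobs, internal_slots, internal_hardware):
    """Split jobs into those run internally and those burst to a public cloud.

    jobs is a list of (name, slots_needed, hardware) tuples, in queue order.
    """
    internal, burst = [], []
    free = internal_slots
    for name, slots, hardware in jobs:
        if hardware in internal_hardware and slots <= free:
            internal.append(name)
            free -= slots
        else:
            burst.append(name)      # no capacity, or hardware not owned
    return internal, burst


jobs = [
    ("assembly", 64, "cpu"),
    ("docking", 16, "gpu"),
    ("variant-calling", 48, "cpu"),
    ("annotation", 32, "cpu"),
]
internal, burst = place_jobs(jobs, internal_slots=100, internal_hardware={"cpu"})
print("internal:", internal)
print("burst:", burst)
```

Here the GPU job bursts because the internal cloud has no GPUs, and the variant-calling job bursts because too few slots remain after the assembly job is placed.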
Conclusion
Technical computing has undergone a steady evolution
encompassing servers, clusters, complicated grids, and now
clouds. In the world of scientific computing, the Cloud is the
next evolutionary step and it promises to help companies
achieve major operational and strategic objectives.
Workload optimization across resources, sophisticated business policies, greater access to IT resources, and low latency
processing all combine to shorten the time to results – a
pressing need in life sciences. The result is faster, better science and enhanced productivity in drug R&D and healthcare
delivery. Just as important, the HPC Cloud and shared computing bring stronger control over IT infrastructure cost growth
by consolidating resources, deploying improved monitoring
and reporting, and enabling ‘bursting’ to external resources as
required. Overall, the HPC Cloud improves competitiveness
and drives business benefit.
The IBM Platform Computing portfolio of HPC cloud-enabling products provides innovative tools to reach these goals.
For more information about IBM Platform Computing cloud
solutions, visit the IBM Platform Computing website at
ibm.com/platformcomputing
Endnotes
i NIST cloud computing definition, http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf
ii IBM acquired Platform Computing in Jan 2012, http://www-03.ibm.com/press/us/en/pressrelease/36372.wss
iii “DNA Sequencing Caught in Deluge of Data”, New York Times, Nov. 30, 2011, http://www.nytimes.com/2011/12/01/business/dna-sequencing-caught-in-delugeof-data.html?_r=1&ref=science
iv International Human Genome Sequencing Consortium, http://www.genome.gov/11006939
v WTSI, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3228552/
vi Intel Many Integrated Core Architecture, http://www.intel.com/content/www/us/en/architecture-and-technology/many-integrated-core/intel-many-integratedcore-architecture.html
vii Current workload managers supported include Grid Engine and TIBCO GridServer (formerly DataSynapse GridServer)
viii Quantitative Structure-Activity Relationship analysis
ix For a detailed perspective, see Ronald Taylor’s “An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics”, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3040523/