Going Big (Data) with MapR Hadoop and Cisco UCS

Liaison Technologies’ big data management
cloud platform requires a high-performance,
flexible, secure, and robust infrastructure.
“We chose the Cisco Unified Computing System because of
its dense form factor, scalability, and high performance. Our
computational power is now about 80 times greater than our
legacy server infrastructure.”
- Brad Anderson, Vice President, Big Data Informatics, Liaison Technologies
Liaison Technologies provides cloud-based solutions to help organizations integrate, manage, and secure data across the enterprise. A few years ago Liaison
noticed the increasing market trend of blending data integration and data management needs. In response, the company decided to add data management
capabilities to its strategic cloud platform to enable big data applications.
Challenges
• Ensure high availability for data management services.
• Maintain compliance with healthcare regulations.
• Provide sufficient computing power to execute complex algorithms quickly.
Case Study | Liaison Technologies
Size: 500 employees
Location: Atlanta, Georgia
Industry: Integration and data management
© 2015 Cisco and/or its affiliates. All rights reserved.
Big data management involves organizing, administering, and governing large
volumes of structured and unstructured data across the entire information
lifecycle. As Liaison’s data architects began to design the company’s next-generation platform, they realized that the existing platform lacked the
performance and flexibility to meet the needs of big data.
One of Liaison’s first applications involved clinical trials. This research work on
human populations, required for new treatments, produces a significant volume
of data including patient demographics, doctor visits, doctors’ institution
affiliations, lab results, and patient outcomes. A typical trial can comprise 5000
patients, hundreds of testing laboratories, and thousands of providers.
Because these services are vital and cannot go offline for any length of
time, high availability is paramount. In addition, offering services in the
healthcare market subjects Liaison to a number of regulations, such as the
Health Insurance Portability and Accountability Act (HIPAA) and the Health
Information Technology for Economic and Clinical Health Act (HITECH), so the
platform has to support audits and other compliance activities.
Cisco UCS platform enables
high-performance computing.
Solutions
• Cisco UCS platform provides the computing
power to process complex analysis algorithms in
a dense, scalable form factor.
• MapR Distribution offers a robust, secure,
enterprise-grade framework for cloud-based
data management services.
• MapR Distribution flexibility allows Liaison to plug
in the right software tools to deliver customized,
high-value solutions.
MapR performed better than other Hadoop distributions
Liaison conducted a thorough evaluation of the available solutions, focusing
on two specific technology needs: the software framework to store and
process big data in a distributed system, and the computing hardware to
support the software. Early in the evaluation process, Liaison’s technical team
investigated Apache Hadoop, an open source software project that enables
distributed processing of large data sets across clusters of servers.
But which distribution of Hadoop?
Liaison evaluated the top commercial distributions to determine which one
best met its requirements. This included a proof of concept (POC) to measure
the performance and reliability of Apache HBase versus MapR Distribution
including Apache Hadoop. The MapR solution excelled in both performance
and availability, with enterprise-grade features such as hardening and no single
point of failure. Given these and other advantages, the final decision to go with
MapR Distribution was easy.
For the hardware, Liaison chose the Cisco® Unified Computing System™
(Cisco UCS®) platform because the company needed high performance. Liaison’s
data management services are more computing intensive than its data
integration services. Take, for example, a clinical trial with patients named
Jon Smith, John Smith, and Jonathan Smith. For the clinical trial results to
be meaningful, researchers need to know if these entries refer to the same
person or different people.
MapR provides iron-clad security for
multi-tenant services.
Brad Anderson, Liaison’s vice president of big data informatics, explains how
Liaison addresses this issue: “Our machine-learning algorithms determine
whether similar entries represent the same person with a high degree of
confidence by accessing and comparing information from multiple sources
including diagnoses, doctor visits, hospital records, and laboratory test results.
Getting the answer from the large data set in a 5000-person clinical trial
requires a lot of computing power, but it gives us an edge over competitors
who require a number of manual steps to perform
this kind of analysis.
“We chose the Cisco Unified Computing System
because of its dense form factor, scalability, and
high performance. Our computational power
is now about 80 times greater than our legacy
server infrastructure, an improvement of nearly
two orders of magnitude.”
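The record-matching problem described above (deciding whether Jon Smith, John Smith, and Jonathan Smith are the same patient) can be illustrated with a minimal sketch. This is not Liaison's machine-learning algorithm, which the case study does not describe in detail; it is a simple weighted score that assumes each record carries a name plus a few corroborating fields. The PatientRecord fields, the weights, and the 0.8 threshold are all hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class PatientRecord:
    # Hypothetical fields; a real trial record would carry many more.
    name: str
    birth_date: str  # ISO date string, e.g. "1970-03-14"
    doctor: str
    lab_id: str


def name_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(r1: PatientRecord, r2: PatientRecord) -> float:
    """Combine name similarity with agreement on corroborating fields.

    The weights are illustrative assumptions, not tuned values.
    """
    score = 0.4 * name_similarity(r1.name, r2.name)
    score += 0.3 * (r1.birth_date == r2.birth_date)
    score += 0.2 * (r1.doctor == r2.doctor)
    score += 0.1 * (r1.lab_id == r2.lab_id)
    return score


def same_person(r1: PatientRecord, r2: PatientRecord, threshold: float = 0.8) -> bool:
    """Treat two records as one patient when the score clears the threshold."""
    return match_score(r1, r2) >= threshold


# Made-up example records for illustration only.
jon = PatientRecord("Jon Smith", "1970-03-14", "Dr. Lee", "LAB-7")
jonathan = PatientRecord("Jonathan Smith", "1970-03-14", "Dr. Lee", "LAB-7")
john = PatientRecord("John Smith", "1985-11-02", "Dr. Patel", "LAB-2")

print(same_person(jon, jonathan))  # True: names differ, other fields agree
print(same_person(jon, john))      # False: names are close, nothing else agrees
```

The example shows why name similarity alone is not enough: "Jon Smith" is textually closer to "John Smith" than to "Jonathan Smith", yet the corroborating fields point the other way. Comparing every pair of records in a 5000-patient trial across many sources is what makes the workload computing intensive.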
80X INCREASE
Security is second to none
From the beginning, Liaison realized that a multitenant solution—several
customers sharing the same computing cluster—was the optimum
architecture, but only if Liaison could ensure the security of each customer’s
information. MapR support for multitenancy was crucial to Liaison and was
not available from the other distributions. “Our Volumes feature creates a
logical separation of data, which allows us to securely host several customers
on the same cluster,” says Anderson. “MapR gives us iron-clad security for
our multitenant cloud-based services.”
Results
• Cisco UCS system offers an 80-fold
improvement in performance over the legacy
server infrastructure.
• Cisco and MapR deliver 99.999% availability.
• Compliance activities are simplified thanks to
enterprise-grade features of MapR Distribution.
• Liaison complies with EU data movement
restrictions using MapR fine-grained control of
disaster recovery operations.
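Two of the figures above lend themselves to a quick arithmetic check: what 99.999% availability allows in downtime per year, and how an 80-fold improvement relates to "nearly two orders of magnitude". The short sketch below works through both; the function names are illustrative and a 365-day year is assumed.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_minutes_per_year(availability: float) -> float:
    """Maximum downtime per year implied by an availability fraction."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def orders_of_magnitude(factor: float) -> float:
    """Express an improvement factor as a power of ten."""
    return math.log10(factor)


five_nines = downtime_minutes_per_year(0.99999)
print(f"99.999% availability allows about {five_nines:.2f} minutes of downtime per year")
print(f"An 80x improvement is about {orders_of_magnitude(80):.2f} orders of magnitude")
```

In other words, "five nines" leaves a budget of roughly five minutes of downtime per year, and 80x is about 1.9 orders of magnitude.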
Success is on the horizon
As a cloud services provider, Liaison requires extremely high availability. Any
significant amount of downtime can hamper the company’s ability to deliver
results to its customers on time, which can negatively impact revenue, customer
loyalty, and brand value. “In traditional environments, if you’re using Hadoop for
data analysis, and it’s down a day, that’s not a disaster,” says Anderson. “For us,
our Hadoop environment is mission critical. We simply can’t have downtime.”
The flexibility of the Cisco and MapR solution meets the unique needs of
individual customers, particularly in the area of database software. Clinical
trial researchers want answers to queries such as, “Find all the doctors
whose patients had glucose levels within a certain range and had good
outcomes in the last six months.” A relational database might take hours or
days to process that query, while a graph database can return the answer in
a minute or two. No matter what the requirement, Liaison is confident about
its ability to deliver. “The MapR and Cisco platform allows us to choose the
best tool for our customers’ use case,” says Anderson.
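To make the shape of the researchers' query concrete, here is a small sketch that walks doctor-to-patient relationships the way a graph traversal would. It uses plain Python dictionaries rather than any particular graph database, and the data, field names, and the 183-day approximation of "six months" are all made up for illustration.

```python
from datetime import date, timedelta

# Hypothetical toy graph: doctor -> patients treated (edges),
# and patient -> latest observation (node properties).
TREATS = {
    "dr_lee": ["p1", "p2"],
    "dr_patel": ["p3"],
}
PATIENTS = {
    "p1": {"glucose": 95, "outcome": "good", "seen": date(2015, 5, 1)},
    "p2": {"glucose": 180, "outcome": "good", "seen": date(2015, 4, 12)},
    "p3": {"glucose": 100, "outcome": "poor", "seen": date(2015, 5, 20)},
}


def doctors_with_good_outcomes(lo, hi, today, window_days=183):
    """Find doctors with at least one patient whose glucose level is in
    [lo, hi], whose outcome was good, and who was seen within the window."""
    cutoff = today - timedelta(days=window_days)
    result = []
    for doctor, patient_ids in TREATS.items():
        for pid in patient_ids:
            p = PATIENTS[pid]
            in_range = lo <= p["glucose"] <= hi
            if in_range and p["outcome"] == "good" and p["seen"] >= cutoff:
                result.append(doctor)
                break  # one qualifying patient is enough for this doctor
    return result


print(doctors_with_good_outcomes(70, 110, today=date(2015, 6, 1)))  # ['dr_lee']
```

The query follows relationships outward from each doctor instead of joining large tables, which is the access pattern a graph database is built to serve.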
And because Liaison clients are subject to a host of regulations, such as HIPAA
and HITECH for healthcare and the Payment Card Industry Data Security Standard (PCI DSS) and
the Gramm-Leach-Bliley Act for financial institutions, “MapR makes it easier to
demonstrate streamlined compliance with enterprise-grade features than if we
had to build the same functionality into another distribution,” says Anderson.
Disaster recovery is another area where the Cisco and MapR platform shines.
“As a cloud provider, the ability to safeguard our customers’ data is vital to
our business,” explains Anderson. “With MapR we just take a snapshot and
mirror it to another data center. MapR gives us fine-grained control of the
movement of the backup data, which allows us to comply with regulations
that restrict the movement of certain kinds of information out of the European
Union. You don’t get that with other Hadoop distributions.”
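The idea of restricting where backup data may travel can be sketched as a simple policy check. This is not MapR's interface or configuration; it is a hypothetical illustration in which each volume is flagged as EU-restricted or not, each data center has a region label, and the function lists the sites a snapshot could be mirrored to. All names and region labels are assumptions.

```python
# Hypothetical region labels treated as inside the European Union.
EU_REGIONS = {"eu-west", "eu-central"}


def allowed_mirror_targets(volume, data_centers):
    """Return names of data centers that may receive a mirror of the volume.

    The volume's home site is always excluded. EU-restricted volumes may
    only be mirrored to data centers located in an EU region.
    """
    targets = []
    for dc in data_centers:
        if dc["name"] == volume["home"]:
            continue  # a mirror must live in a different data center
        if volume["eu_restricted"] and dc["region"] not in EU_REGIONS:
            continue  # restricted data must not leave the EU
        targets.append(dc["name"])
    return targets


# Made-up sites and volumes for illustration only.
sites = [
    {"name": "frankfurt", "region": "eu-central"},
    {"name": "dublin", "region": "eu-west"},
    {"name": "atlanta", "region": "us-east"},
]
eu_volume = {"name": "trial-eu", "home": "frankfurt", "eu_restricted": True}
us_volume = {"name": "trial-us", "home": "atlanta", "eu_restricted": False}

print(allowed_mirror_targets(eu_volume, sites))  # ['dublin']
print(allowed_mirror_targets(us_volume, sites))  # ['frankfurt', 'dublin']
```

Making the decision per volume mirrors the fine-grained control Anderson describes: restricted data stays inside the European Union while unrestricted data can be mirrored more freely.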
Liaison is optimistic about its prospects in the data management market. “We
have a superior architecture, which makes it difficult for our competitors to
keep up feature- and functionwise,” says Anderson. “UCS gives us plenty of
computing power, plus the ability to tweak configurations for each customer
use case. MapR lets us customize solutions by plugging in whatever analysis
tools are needed. I can’t overstate the importance of our enterprise-grade
platform as we enter this new and challenging marketplace.”
More information
• To learn more about Cisco Unified Computing System,
visit www.cisco.com/go/ucs.
• To learn more about MapR Distribution including Apache Hadoop,
visit www.mapr.com.
Products and services
• Cisco UCS
• MapR Distribution including Apache Hadoop Enterprise Database Edition
Americas Headquarters
Cisco Systems, Inc.
San Jose, CA
Asia Pacific Headquarters
Cisco Systems (USA) Pte. Ltd.
Singapore
Europe Headquarters
Cisco Systems International BV
Amsterdam, The Netherlands
Cisco has more than 200 offices worldwide. Addresses, phone numbers, and fax numbers are listed on the Cisco Website at www.cisco.com/go/offices.
Cisco and the Cisco logo are trademarks or registered trademarks of Cisco and/or its affiliates in the U.S. and other countries. To view a list of Cisco trademarks, go to this URL:
www.cisco.com/go/trademarks. Third-party trademarks mentioned are the property of their respective owners. The use of the word partner does not imply a partnership relationship
between Cisco and any other company. (1110R)
© 2015 Cisco and/or its affiliates. All rights reserved. This document is Cisco Public Information.