University College London Success Story

SUCCESS STORY
ACCELERATE:
ACADEMIC RESEARCH
University College London Transforms
Research Collaboration and Data
Preservation with Scalable Cloud Object
Storage Appliance from DDN
CHALLENGES
• Improve the sharing & access of
project-based research
• Transition the IT responsibilities
from research teams back to IT
• Address the rapid growth of
data, in both volume and velocity
• Comply with funding agencies’
data preservation regulations
• Source an end-to-end solution
with flexible, highly scalable
architecture
SOLUTION
• From 21 RFP respondents, UCL
selected a DDN solution that
includes GRIDScaler®, WOS® &
iRODS
• This end-to-end File Storage
& Object Storage solution
enabled IT to consolidate &
take-back responsibility for
infrastructure
• GRIDScaler File Storage
provided the high performance
and low latency required for
their HPC applications
RESULTS
• WOS Object Storage provided
virtually unlimited scalability
compared to NAS & SAN, incl.
out-of-the-box global and local
collaboration.
• Increased performance AND
cost savings, while simplifying
UCL’s environment with a
collaborative platform to
accelerate research workflows
and time-to-discovery
DDN.COM
| 1.800.837.2298
UNIVERSITY COLLEGE LONDON (UCL), RANKED CONSISTENTLY AS ONE OF THE TOP FIVE UNIVERSITIES
in the world, is London’s leading multidisciplinary university with more than 10,000 staff, over 26,000
students as well as more than 100 departments, institutes and research centers. With 25 Nobel Prize
winners and three Fields medalists among UCL’s alumni and staff, the university has attained a world-class reputation for the quality of its teaching and research across the academic spectrum.
As London’s premier research institution, UCL has 5,000 researchers committed to applying their collective
strengths, insights and creativity to overcome problems of global significance. The university’s innovative,
cross-disciplinary research agenda is designed to deliver immediate, medium and long-term benefits
to humanity. UCL Grand Challenges, which encompass Global Health, Sustainable Cities, Intercultural
Interaction and Human Wellbeing, are a central feature of the university’s research strategy.
According to Dr. J. Max Wilkinson, Head of Research Data Services for the UCL Information Services
Division, sharing and preserving project-based research results is essential to the scientific method. “I
was brought in to provide researchers with a safe and resilient solution for storing, sharing, reusing and
preserving project-based data,” he explains. “Our goal is to remove the burden of managing project data
from individual researchers while making it more available over longer periods of time.”
THE CHALLENGE
The opportunity to improve the sharing and access of project-based research presented several unique
technical and cultural challenges. On the technical side, the team had to accommodate a variety of different
types of data, growing in volume and velocity. In some cases, a small amount of data is so valuable to a
research team that six discrete copies were retained on separate USB drives or removable hard drives kept
in different locations. In other instances, UCL researchers produce copious amounts of very well-defined
data that pass between compute algorithms under which research sits.
In addition to solving technical problems, the research data services team was faced with the
opportunity to support researchers in a new ‘data-intensive’ world by making it safe and easy to
follow best practices in data management and use best-of-class storage solutions. “We discovered
the valuable data underpinning most research projects were stuck on hard drives or disks, never
to be seen again,” adds Wilkinson. “If we could provide a framework over which people could share
and preserve data confidently, we could minimize this behavior and improve research by making the
scholarly record more complete.”
To accomplish this, UCL needed to provide an enterprise-class foundation for data manipulation that
met the needs of its diverse user community. While some researchers thought 100GB was a large
amount of data, others clamored for more than 100TB to support a particular project. There was
also an expectation that up to 3,000 individuals from UCL’s total base of 5,000 active researchers and
collaborators would require services within the next 18-to-24 months.
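To put these figures in perspective, the short calculation below brackets aggregate demand using only the numbers quoted above. It is a back-of-envelope sketch, not UCL's sizing method: it assumes decimal units and the two extreme cases in which every expected user asks for either the small or the large amount.

```python
# Back-of-envelope bounds on aggregate storage demand.
# Assumptions (not from the source): decimal units (1 PB = 1,000,000 GB) and
# every expected user sitting at one end of the quoted per-project range.

GB_PER_TB = 1_000
GB_PER_PB = 1_000_000

expected_users = 3_000          # individuals expected to require services
total_researchers = 5_000       # active researchers and collaborators
small_request_gb = 100          # "100GB was a large amount of data"
large_request_gb = 100 * GB_PER_TB  # "more than 100TB" for one project

adoption_share = expected_users / total_researchers
spread = large_request_gb // small_request_gb
low_bound_pb = expected_users * small_request_gb / GB_PER_PB
high_bound_pb = expected_users * large_request_gb / GB_PER_PB

print(f"Share of researchers expected to use the service: {adoption_share:.0%}")
print(f"Spread between small and large requests: {spread:,}x")
print(f"Aggregate demand if every request is small: {low_bound_pb} PB")
print(f"Aggregate demand if every request is large: {high_bound_pb} PB")
```

Under these assumptions the two extremes are 0.3PB and 300PB, a thousandfold spread, which suggests why the service needed to start small and scale by orders of magnitude.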
“We had a simple services proposition that would eliminate the need for research teams to manage
racks of servers and data storage devices,” says Wilkinson. “Of course, this meant we’d need a highly
scalable storage infrastructure that could grow to 100PB without creating a large storage footprint or
excessive administrative overhead.”
Additionally, they had to address long-term data retention needs that extended well beyond the realm of
research projects. UCL, along with many other UK research intensive institutes, is faced with increasingly
stringent requirements for the management of project data outputs by UK Research Councils and other
funding bodies in the United Kingdom. As grant funding in the UK supports best practice, it was critical
to have a proven data management plan that documented how UCL would preserve data for sometimes
decades while ensuring maximum appropriate access and reuse by third parties.
BENEFITS
• Increases performance AND
cost savings, while simplifying
their environment
• Deployed a collaborative
platform to accelerate research
workflows and time-to-discovery
• Highest storage density enables
future scalability within their
expensive, limited downtown
London floorspace
• Extends data durability and
protection of research to
comply with regulatory
requirements
• Seamlessly expands to handle
the demands of Big Data from
500TB to 100PB
THE SOLUTION
In seeking a scalable, resilient storage foundation, UCL issued an RFP to solicit insight on different
approaches for consolidating the university’s research data storage infrastructure. Each of the 21
RFP respondents was asked to provide examples of large-scale deployments, which produced far-ranging answers, including how providers addressed sheer data volume, reduced increasingly complex
environments or delivered overarching data management frameworks.
UCL’s RFP covered a diverse set of requirements to determine each potential solution provider’s
respective strengths and limitations. “We asked for more than we thought possible from a single
vendor—from a synchronous file sharing to a high performance parallel file system, highly scalable,
resilient storage that would be simple to manage,” notes Daniel Hanlon, Storage Architect for Research
Data Services at University College London. “We wanted to cover our bases while determining what was
practical and doable for researchers.”
Recommendations encompassed a broad storage spectrum, including NAS, SAN, HSM, object storage,
asset management solutions and small amounts of spinning disks with lots of back-end tape. “Because we
had such broad requirements, we omitted any vendor that was bound to a particular hardware platform,”
explains Wilkinson. “It was important to be both data and storage agnostic so we would have the flexibility
to support all data and media types without being locked into any particular hardware platform.”
[Figure: UCL Solution Architecture. Legend: client access methods; GRIDScaler data movement; WOS data movement; iRODS data movement. Clients (Legion, CS, TeraGrid) use multi-protocol access: high-performance NFS or CIFS to GRIDScaler policy-driven filesets; the object store via NFS or CIFS; S3/WebDAV/sync clients; direct WOSLib or REST; and access to cloud storage through iRODS, iPUT, iDROP and web, backed by iCAT servers and the iCAT database. DM-API-driven simple policies archive GRIDScaler data into the object store, archival published data is placed into WOS, iRODS policies drive tape archival, and WOS policies cover replication and remote replica retrieval with a WOS cloud remote site.]
With its ability to support virtually unlimited scalability, object storage appealed to UCL, especially
since it also would be much easier to manage than alternatives. Still, object storage was seen as a
relatively new technology and UCL lacked hands-on experience with large-scale deployments within
the university’s ecosystem. In addition to evaluating the different technologies, UCL also assessed each
provider’s understanding of their environment, as it was critically important to accommodate UCL’s
researcher requirements in order to drive acceptance. “Some of the RFP respondents didn’t understand
the difference between the corporate and academic worlds, and the fact that universities by nature
generally have to avoid being tied into particular closed technologies,” adds Hanlon. “Many of the RFP
respondents were eliminated, not because of their technical response, but because they didn’t really get
what we were trying to do.”
As a result, the universe of prospective solutions was reduced to a half-dozen recommendations. As the
team took a closer look at the finalists, they considered each vendor’s academic track record, ability to scale
without overburdening administrators and experience with open-source technology. “We wanted to work
with a storage solutions provider that took advantage of open-source solutions,” Hanlon notes. “This would
enable us to partner with them and also with other academic institutions trying to do similar things.”
In the final analysis, UCL wanted a partner with equal enthusiasm for freeing researchers from the burden
of data storage so they could maximize the impact of their projects. “We were very interested in building
a relationship with a strong storage partner to fill our technology gap,” says Wilkinson. “After a thorough
assessment, DataDirect™ Networks (DDN) met our technical requirements and shared our data storage
vision. In evaluating DDN, we agreed that their solution had a simple proposition, high performance and low
administration overhead.”
The proposed solution, which included the GRIDScaler massively scalable parallel file system and Web
Object Scaler (WOS), also provided the desired scalability and management simplicity. Another plus for WOS
storage was its tight integration with the Integrated Rule-Oriented Data Management Solution (iRODS). This
open-source solution is ideally suited for research collaboration by making it easier to organize, share and
find collections of data stored in local and remote repositories.
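The architecture diagram's labels refer to policy-driven archival into the object store and to tape. As a rough illustration of what rule-oriented data management means in general, the toy sketch below picks a storage tier from an object's metadata. It does not call iRODS, WOS or GRIDScaler, and the tier names, metadata fields and one-year idle threshold are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Toy model of rule-oriented data placement. Nothing here uses a real
# iRODS, WOS or GRIDScaler interface; all names and thresholds are made up.


@dataclass
class DataObject:
    name: str
    published: bool
    last_accessed: date


def choose_tier(obj: DataObject, today: date, idle_days: int = 365) -> str:
    """Return the tier a simple metadata-driven policy would select."""
    if obj.published:
        return "object-store"  # published data goes to the archive
    if today - obj.last_accessed > timedelta(days=idle_days):
        return "tape"          # long-idle working data moves to tape
    return "file-system"       # active data stays on the parallel file system


today = date(2015, 5, 1)
catalog = [
    DataObject("survey-results.csv", True, date(2015, 4, 1)),
    DataObject("raw-instrument-run.dat", False, date(2013, 1, 15)),
    DataObject("working-model.h5", False, date(2015, 4, 28)),
]
placement = {obj.name: choose_tier(obj, today) for obj in catalog}
for name, tier in placement.items():
    print(f"{name} -> {tier}")
```

The point of the design is that researchers describe their data once and the service applies the rules, rather than each team deciding by hand where every file should live.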
“It was important that DDN’s solution gave us multiple ways to access the same storage, so we could be
compatible with existing application codes,” says Hanlon. “The tendency with other solutions was to give us
bits of technology that had been developed in different spaces and that didn’t really fit our problem.”
For 2013, IDC anticipates
object storage growing faster
than any other segment in
the file-and-object storage
market. A driver in the
growth of private cloud
adoption is the control over
data security and resiliency
users get compared to public
clouds. As more companies
look at object storage for
collaborative file sharing,
archive and backup, IDC
is seeing an acceleration
in cloud adoption. UCL is
a great example of this
mainstream private cloud
adoption of object storage
for collaborative file sharing.
Ashish Nadkarni
Research Director, IDC
THE BENEFITS
During a successful pilot implementation involving a half-petabyte of storage, UCL gained first-hand insight
into the advantage of DDN’s turnkey distributed storage and collaboration solution. “The main attraction of
DDN WOS is the combination of an efficient object store with edge appliances to ease integration with other
storage infrastructure,” says Hanlon. Another big plus for UCL is DDN’s high-density storage capacity, which
will enable fitting a lot more disks into existing storage racks. This is crucial to growing while maintaining a
small footprint in UCL’s highly congested, expensive downtown London location.
As researchers are often reluctant to give up control of their data storage solutions, the team also has been
pleased to discover early adopters who see the value of using the new service to protect and preserve
current data assets. In fact, the new research data service already is getting high marks for performance
reliability, data durability, data backup and disaster recovery capabilities.
UCL predicts that as traction for the new service increases, there will be greater interest in leveraging it to
further extend how current research is reused and exploited to drive more impactful outcomes. By taking
this innovative approach, the UCL Research Data Services team is embracing the open data movement
while enlisting leading-edge technologies to deliver reliable, flexible data access that maximizes appropriate
sharing and re-use of research data.
Additionally, UCL is taking researchers’ worry about meeting increasingly strong expectations from funding
organizations out of the storage equation with its plans to add a scalable archive to its dynamic storage
service offering. “We’ll be able to tell researchers that if they use our services, they’ll be compliant with UCL,
UK Research Council and other UK and international funding bodies’ policies and requirements,” Wilkinson
says. “They won’t have to worry about it because we will.”
By providing a framework over which UCL researchers can store and share data confidently, UCL expects to
achieve significant bottom-line cost savings. Early projected savings from the initial phase of the infrastructure
build-out are upwards of hundreds of thousands of UK pounds, simply by eliminating the need for
thousands of researchers to obtain and maintain their own storage hardware. “DDN is empowering us to
deliver performance and cost savings through a dramatically simplified approach; in doing so we support
UCL researchers, their collaborators and partners to maintain first class research at London’s global
university,” concludes Wilkinson. “Add in the fact that DDN’s resilient, extensible storage solution provided
evidence of seamless expansion from a half-petabyte to 100PBs, and we found exactly the foundation we
were looking for.”
ABOUT DDN®
DataDirect Networks (DDN) is the world’s leading big data storage supplier to data-intensive, global
organizations. For more than 15 years, DDN has designed, developed, deployed and optimized
systems, software and solutions that enable enterprises, service providers, universities and
government agencies to generate more value and to accelerate time to insight from their data and
information, on premise and in the cloud. Organizations leverage the power of DDN technology
and the deep technical expertise of its team to capture, store, process, analyze, collaborate and
distribute data, information and content at the largest scale in the most efficient, reliable and cost-effective
manner. DDN customers include many of the world’s leading financial services firms and
banks, healthcare and life science organizations, manufacturing and energy companies, government
and research facilities, and web and cloud service providers. For more information, visit our website
www.ddn.com or call 1-800-837-2298.
SALES@DDN.COM +1.800.837.2298
©2015 DataDirect Networks. All Rights Reserved. DataDirect Networks, the DataDirect
Networks logo, DDN, GRIDScaler, Web Object Scaler, & WOS are trademarks of DataDirect Networks. Other
Names and Brands May Be Claimed as the Property of Others.
v3 (5/15)