Technology and Collaboration:
Researching the development and use of Grid infrastructure for the CERN particle accelerator laboratory.
Dr Will Venters & Dr Yingqin Zheng
www.pegasusresearch.org.uk
The Pegasus Team

• Three-year project funded by the EPSRC programme “Usability challenges from e-science” (EP/D049954/1)
• Research in progress!

Members:
• Dr Will Venters (Lecturer & PI – LSE)
• Dr Tony Cornford (Senior Lecturer – LSE)
• Dr Mark Lancaster (Senior Lecturer in PP – UCL)
• Dr Yingqin Zheng (Research Officer – LSE)
• Avgousta Kyriakidou (PhD student – LSE)

Advisory Group:
• Prof. Tony Doyle, Prof. Steve Lloyd, Dr Elaine Ferneley, Prof. Wanda Orlikowski, Dr Susan Scott
Overview

• Introduction to the context of the study
  – Grids
  – Experimental particle physics
  – Computing in experimental particle physics
• Our interests
  – Methodology
  – Theoretical point of departure
  – Findings
Grids: Hype or the next big thing?

• “Overturn strategic and operating assumptions, alter industrial economics, upset markets (…) pose daunting challenges for every user and vendor” (Carr, 2005)
• “Provide the electronic foundation for a global society in business, government, research, science and entertainment” (Berman, 2003)
• “Potentially the same social impact as railroads” (Smarr, 2004)
• “Nothing New” and “plenty of confusion” (Gentzsch, 2002)
Grids: Technology

• Emerging platform for coordinated resource sharing and problem solving on a global scale for data-intensive and compute-intensive applications (Foster, 2001)
• As Internet protocols enable the sharing and integration of information on the Web, so Grid protocols aim to allow the integration of … sensors, applications, data-storage, computer processors and most other IT resources (Wladawsky-Berger, 2004)
• Centred around standard protocols and middleware.

1: No central control.
2: Standard open protocols.
3: Non-trivial level of service.

[Figure: layer diagram: Experiment layer; Application Middleware; Grid Middleware; Facilities and Fabrics]
Grids and Collaboration

• “Coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations” (Foster, 2001)
• … which “enable disparate groups of organisations and/or individuals to share resources in a controlled fashion, so that members may collaborate to achieve a shared goal” (Foster, 2001)
• “E-science is about global collaboration in key areas… will change the dynamic of the way science is undertaken” (John Taylor)
• It is politics rather than technology which will inhibit grids (Orzech 2003)
Advanced Users: Particle Physicists

• Currently constructing the world’s most powerful particle accelerator… the Large Hadron Collider (LHC)
• ~100,000,000 electronic channels
• 800,000,000 proton-proton interactions per second.
• Searching for the Higgs Boson – “1 person in 1000 worlds, or 1 needle in 20 million haystacks”
• Unprecedented amount of data from the LHC (12-14 million gigabytes) (1% of all info!)
[Figure: height comparison: CD stack with 1 year LHC data (~20 km); (Ex-)Concorde (15 km); “We are here” (1 km)]
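
The CD-stack comparison on this slide can be checked with a few lines of arithmetic. The sketch below is only a rough sanity check: it assumes 700 MB per CD, 1.2 mm thickness per disc and decimal gigabytes, none of which appear on the slide, and the function name is illustrative.

```python
# Rough check of the "CD stack" comparison.
# Assumptions (not from the slides): 0.7 GB per CD, 1.2 mm per disc.
CD_CAPACITY_GB = 0.7
CD_THICKNESS_MM = 1.2
MM_PER_KM = 1_000_000


def cd_stack_height_km(data_gb: float) -> float:
    """Return the height in km of a stack of CDs holding data_gb gigabytes."""
    discs = data_gb / CD_CAPACITY_GB
    return discs * CD_THICKNESS_MM / MM_PER_KM


# The slide quotes 12-14 million gigabytes of LHC data.
low = cd_stack_height_km(12e6)
high = cd_stack_height_km(14e6)
print(f"{low:.1f} km to {high:.1f} km")
```

Under these assumptions the quoted data volume gives a stack of roughly 21 to 24 km, the same order of magnitude as the ~20 km shown in the figure.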
Who are they?

• Particle Physics sees itself as an elite.
• “Particle physics is the unbelievable in pursuit of the unimaginable.” (Guardian)
• “All science is either physics or stamp collecting” (Rutherford 1962)
• “Promethean heroes of the search for the truth… They bring news of another world.. the extraordinary scale and costliness of much physics research if anything reinforces its cultural value.” (Traweek 1988)
• “The culture is built on beliefs in individual genius and outstanding performance that are not (and, in the physicists’ view should not be) in reach of every physicist” (Traweek 1988)
PP and Computing

• Envisage requiring a Grid of 100,000 machines (processors) by 2008.
• Historically successful at pragmatic use of new technology (Web, Cray, Open-source, farms).
• “Particle physics has always pushed the bounds of computing. I mean I’m the guy who sort of pushed the first networks which was really; the first use of the Internet.”
• “Particle physics has never failed because of computing”
• Highly collaborative working practices (Knorr-Cetina 1999) with few formal lines of authority.
Critical Views of Grids?

• “So I think if Ian hadn’t created the concept of the grid it would have been invented here anyway. We may not have tried to match it to a paradigm and called it the same, but it would have had to have been invented because we have to use all these machines.”
• “it’s nothing special…it’s just an intelligent batch system… it’s just that you’ve distributed the resources in a bigger way”
• Grid is used “as a pseudonym for cluster computing”.
• “… what we’re essentially building is a data management system…Nobody’s ever built a Grid in the original sense”
• “Security is a joke, and the whole Grid concept is predicated on a strong security model”.
The LHC grid and GridPP

• 19 UK institutes.
• £33m (2001-7)
• GridPP runs around 10,000 nodes.
• 3000 ‘users’
• Tier Architecture
The Pegasus Project

We study a particular grid (GridPP) as a means to “do” science.

We aim to study Grid infrastructure development, deployment and use, as an interaction of technology, practices, knowledge, people, cultures, institutions, and politics… within a specific context – experimental particle physics as it prepares for new experiments.

We aim to extract experience and lessons for other e-science projects, as well as for other efforts on large information infrastructure.
Research focus

Explore “actions to do science” alongside “actions of doing science”

• How do the specific needs of the LHC become translated into GridPP, in both the technical and the organizational sense?
• How are the working practices of particle physics inscribed into the technology, and how do they dictate how the grid is developed?
• How is the Grid used by particle physicists to do their scientific work at the LHC…
• How does the Grid (actual and potential) come to influence the work of particle physicists for the LHC…
Methodological Approach

• Qualitative longitudinal research through studies of the work practices of particle physicists preparing for the LHC, and of those involved in the design and implementation of GridPP and its associated middleware.
• 30+ interviews, transcribed.
• We are just beginning our analysis of this first round.
• Data analysis using Atlas.ti

[Figure: (potential) Grid Users (UCL etc.); Grid Developers (middleware); Grid Deployment (GridPP)]
Research Findings

GridPP as embedded in PP practices
• Common goal
• Long term vision
• Collaboration
• Pragmatism, bricolage
• Loose Management and Plenty of Freedom
• Trust
• Competence

“It will work because it’s got to work”
Common Goal & Long Term Vision

• “I said I was proud of being a particle physicist, this is ‘cause particle physicists always get the job done; by and large because they are driven by one fundamental thing. They want their experiment to work when the beam gets into the accelerator, okay? And that transcends everything else they do.”
• “…but we are one community, we have one goal, which is to deliver the CMS experiment and win a Nobel Prize, that’s the goal. And we are all working towards that.”
• “…there’s this Grid paradigm, this vision of this, and this way of working and what’s happened is everybody’s had to try and run and catch up with that and make things work so it meets that vision, … rather than the other way round. … sometimes… the vision doesn’t come until later you know.”
• “… (the industry), their horizon for getting some return is incredibly short compares with anything we are interested in.”
The other side of the coin…

Competing visions between a PP grid and a generic grid

• “…actually we are completely tied into this European Union project structure. So the amount of detailed planning that is now done, where everyone participating in the detailed planning knows that these plans are more or less not what we will do in the end, is extremely atypical for this environment.”
• “…the focus has to change cause in the past it’s been developing the middleware, building up resources; but now the whole point of investing all this money is to do science. And if you are going to do science, then the whole thing’s a failure from our perspective anyway.”
Collaboration in HEP

• Epistemic culture of Particle Physics: “post-traditional communitarian structures” (Knorr-Cetina 1999).
• “Distributed collaboration” (Merz 2006) and Distributed Cognition (Hutchins 1995), “In which distribution has not only a physical dimension…but also a social dimension (distribution of cognitive processes…)” (Merz 2006).
• “There’s no strict line management on top of it; it’s a collaborative project.”
• Physics has globalised working practices, mediated by a travelling culture (Merz 2006)
Collaboration in Grid development

• The “achievement is managing the sites globally and working together”
• “Grid computing is actually linking computing resources that are actually staying under local control and being in the administrative domain of different independent entities and then building something that makes all this look, and behave, from the users perspective as one thing.”
• “The development effort is very, very distributed, even inside a single component… any change you do here is reflected here, and the teams are in other places.”
• “So it is not so much a software development, the story we have to tell, it is building this community around the grid computing, and also that, for the first time, we closely interact with other disciplines.”
The other side of the coin…

Competition between experiments

• “So basically we’ve got roughly equal performance detectors. We’ve got roughly equal size collaborations. We all know what we’re doing more or less. So the person who’s going to get the computing analysis right is going to win.”
• “ATLAS is far far bigger than CMS in the UK. The reason for making [a decision between experiments] is based upon a combination of who you’ve worked with before, who you liked, what your prejudices are, were you on the same… experiments as these other people and so on and so forth… Once that decision is made, then there is an irrevocable split for the next 20 years... We don’t talk to each other collectively in any real substantial way.

  And then of course there’s also the relationship with GridPP which where we do end up talking to ATLAS people and LHCb people and people do take on roles independent of the experiments.. And you know, we don’t sit there with experiment hats on all the time. .. So in other words, it’s bloody complicated.”
Pragmatism, bricolage

• “The approach inside experiments has always been extremely pragmatic. So we were aware of a kind of high level concept and a vision of what it should be looking like, but they worked always bottom up, so they always started with very primitive prototypes, leaving things out that are not necessary for achieving something, and tried to get the users involved as quickly as possible.”
• “…having said that we are tied in this European project with a very formal structure, a lot of the work is still done, actually the successful work is done mostly in informal ways. So through the experiments you have links to sites and to individuals here, and a lot of the ultimate decision making is done by communicating with these people that you know from former experience.”
The other side of the coin…

Tension between Computer Scientists and PP

• “Computer scientists will put together the most elegant things in the universe but it will never work…physicists will come up with the most hacked solution in the world…but it will work.”
• Of software engineers… “want to do things very formally. They want to design things, they want the project very well defined, but (…) by definition physicists normally don’t know what they want. There’s lots of prototyping and there’s a slight difference in attitude” “there has certainly been some friction along those lines”
• There is a belief that it is possible to “get a bright graduate student to write something that will work for me in three weeks”.
Loose Management & Freedom

• “The group leader doesn’t get to say what to do”, “Socialist”, “federation”, “club”, “meritocracy”.
• “This environment is based on, if you want, charismatic leadership and people doing things relatively independent but also having the freedom to do them, and not having to report every two minutes on what they are doing.”
• “Why was the web invented here? Because Tim had the freedom from this hierarchy, to spend a bit of time investigating something which was of interest to him and nobody else here [thought]– oh it’s a waste of time, never mind. He was working on remote procedure calls. And out of it popped the Web... One guy, sitting in his office, who had a dream.”
The Other Side of the Coin:

Difficulties in distributed management

• “We replaced this conflict management system to just a bunch of configuration scripts, based on Bash, which every sys admin knows and feels comfortable with. And also making sure they don’t feel that the software controls them, but they control the software. That was very important for us.”
• “herding cats”
• “I think that they’ve (GridPP) a little bit lost their way in terms of the organisation of the Tier 2s. And part of this is based on the idea that we can’t tell them what to do. So they use different management software.”
Trust

• “everyone trusts each other to be doing the best they can.. That fundamental trust drives our particle physics group”
• “you have to trust that people will step up… and do the dirty work as well as doing the glamorous work”
• “actually the trust between the different high energy physics computing centres is much larger than what, in most of our member countries, are the legal constraints.”
• “I cannot imagine that a huge car maker would like to crash test, literally, their upcoming models, on another car maker’s machine. No matter what security you put in they may not feel they should do that, for maybe good reasons. Not because the system is inherently not safe, but the trust you need to do this is just not there. But for the scientific community, where this is not such a big issue …”
The Other Side of the Coin:

Tension with a common Grid

• “Would you entrust, I mean the Tier 1 centre is critical for UK physics analysis from LHC, right, which is what we’ve had, 200 and something million pounds to do… Would you trust that to somebody else in a different country who didn’t have your interest at heart? No, of course, you wouldn’t.”
Next stage of research

Users
• Impact of GridPP on their working practices