Technology and Collaboration: Researching the development and use of Grid infrastructure for the CERN particle accelerator laboratory.
Dr Will Venters & Dr Yingqin Zheng
www.pegasusresearch.org.uk

The Pegasus Team
Three-year project funded by the EPSRC programme: “Usability challenges from e-science” (EP/D049954/1)
Research in progress!
Members:
Dr Will Venters (Lecturer & PI – LSE)
Dr Tony Cornford (Senior Lecturer – LSE)
Dr Mark Lancaster (Senior Lecturer in PP – UCL)
Dr Yingqin Zheng (Research Officer – LSE)
Avgousta Kyriakidou (PhD student – LSE)
Advisory Group: Prof. Tony Doyle, Prof. Steve Lloyd, Dr Elaine Ferneley, Prof. Wanda Orlikowski, Dr Susan Scott
[Team photos: Will, Yingqin, Tony, Mark, Avgousta]

Overview
Introduction to the context of the study
  Grids
  Experimental particle physics
  Computing in experimental particle physics
Our interests
Methodology
Theoretical point of departure
Findings

Grids: Hype or the next big thing?
“Overturn strategic and operating assumptions, alter industrial economics, upset markets (…) pose daunting challenges for every user and vendor” (Carr, 2005)
“Provide the electronic foundation for a global society in business, government, research, science and entertainment” (Berman, 2003)
“Potentially the same social impact as railroads” (Smarr 2004)
“Nothing New” and “plenty of confusion” (Gentzsch, 2002)

Grids: Technology
Emerging platform for coordinated resource sharing and problem solving on a global scale for data-intensive and compute-intensive applications (Foster, 2001)
As Internet protocols enable the sharing and integration of information on the Web, so Grid protocols aim to allow the integration of … sensors, applications, data-storage, computer processors and most other IT resources (Wladawsky-Berger, 2004)
Centred around standard protocols and middleware.
1: No central control.
2: Standard open protocols.
3: Non-trivial level of service.
Experiment layer
Application Middleware
Grid Middleware
Facilities and Fabrics

Grids and Collaboration
“Coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations” (Foster, 2001)
… which “enable disparate groups of organisations and/or individuals to share resources in a controlled fashion, so that members may collaborate to achieve a shared goal” (Foster, 2001)
“E-science is about global collaboration in key areas… will change the dynamic of the way science is undertaken” (John Taylor)
It is politics rather than technology which will inhibit grids (Orzech 2003)

Advanced Users: Particle Physicists
Currently constructing the world’s most powerful particle accelerator… the Large Hadron Collider (LHC)
~100,000,000 electronic channels
800,000,000 proton-proton interactions per second
Searching for the Higgs boson – “1 person in 1000 worlds, or 1 needle in 20 million haystacks”
Unprecedented amount of data from the LHC (12-14 million gigabytes) (1% of all info!)
[Figure: height comparison. CD stack with 1 year LHC data (~20 km); (Ex-)Concorde (15 km); We are here (1 km)]

Who are they?
Particle Physics sees itself as an elite.
“Particle physics is the unbelievable in pursuit of the unimaginable.” (Guardian)
“All science is either physics or stamp collecting” (Rutherford 1962)
“Promethean heroes of the search for the truth… They bring news of another world.. the extraordinary scale and costliness of much physics research if anything reinforces its cultural value.” (Traweek 1988)
“The culture is built on beliefs in individual genius and outstanding performance that are not (and, in the physicists’ view should not be) in reach of every physicist” (Traweek 1988)

PP and Computing
Envisage requiring a Grid of 100,000 machines (processors) by 2008.
Historically successful at pragmatic use of new technology (Web, Cray, Open-source, farms).
“Particle physics has always pushed the bounds of computing.
I mean I’m the guy who sort of pushed the first networks which was really; the first use of the Internet.”
“Particle physics has never failed because of computing”
Highly collaborative working practices (Knorr-Cetina 1999) with few formal lines of authority.

Critical Views of Grids?
“So I think if Ian hadn’t created the concept of the grid it would have been invented here anyway. We may not have tried to match it to a paradigm and called it the same, but it would have had to have been invented because we have to use all these machines.”
“it’s nothing special…it’s just an intelligent batch system… it’s just that you’ve distributed the resources in a bigger way”
Grid is used “as a pseudonym for cluster computing”.
“… what we’re essentially building is a data management system…Nobody’s ever built a Grid in the original sense”
“Security is a joke, and the whole Grid concept is predicated on a strong security model”.

The LHC grid and GridPP
19 UK institutes.
£33m (2001-7)
GridPP runs around 10,000 nodes.
3000 ‘users’

Tier Architecture

The Pegasus Project
We study a particular grid (GridPP) as a means to “do” science.
We aim to study Grid infrastructure development, deployment and use, as an interaction of technology, practices, knowledge, people, cultures, institutions, and politics… within a specific context – experimental particle physics as it prepares for new experiments.
To extract experience and lessons for other e-science projects, as well as other efforts on large information infrastructure.

Research focus
Explore “actions to do science” alongside “actions of doing science”
How the specific needs of the LHC become translated into GridPP, in both the technical and the organizational sense;
How the working practices of particle physics are inscribed into the technology and dictate how the grid is developed.
How is the Grid used by particle physicists to do their scientific work at the LHC…
How does the Grid (actual and potential) come to influence the work of particle physicists for the LHC…

Methodological Approach
Qualitative longitudinal research through studies of the work practices of particle physicists preparing for the LHC, and of those involved in the design and implementation of GridPP and its associated middleware.
30+ interviews, transcribed. We are just beginning our analysis of this first round.
Data analysis using Atlas.ti
[Diagram: (potential) Grid Users (UCL etc.); Grid Developers (middleware); Grid Deployment (GridPP)]

Research Findings
GridPP as embedded in PP practices
Common goal
Long term vision
Collaboration
Pragmatism, bricolage
Loose Management and Plenty of Freedom
Trust
Competence
“It will work because it’s got to work”

Common Goal & Long Term Vision
“I said I was proud of being a particle physicist, this is ’cause particle physicists always get the job done; by and large because they are driven by one fundamental thing. They want their experiment to work when the beam gets into the accelerator, okay? And that transcends everything else they do.”
“…but we are one community, we have one goal, which is to deliver the CMS experiment and win a Nobel Prize, that’s the goal. And we are all working towards that.”
“…there’s this Grid paradigm, this vision of this, and this way of working and what’s happened is everybody’s had to try and run and catch up with that and make things work so it meets that vision, … rather than the other way round. … sometimes… the vision doesn’t come until later you know.”
“… (the industry), their horizon for getting some return is incredibly short compared with anything we are interested in.”

The other side of the coin… Competing visions between a PP grid and a generic grid
“…actually we are completely tied into this European Union project structure.
So the amount of detailed planning that is now done, where everyone participating in the detailed planning knows that these plans are more or less not what we will do in the end, is extremely atypical for this environment.”
“…the focus has to change cause in the past it’s been developing the middleware, building up resources; but now the whole point of investing all this money is to do science. And if you are going to do science, then the whole thing’s a failure from our perspective anyway.”

Collaboration in HEP
Epistemic culture of Particle Physics: “post-traditional communitarian structures” (Knorr-Cetina 1999).
“Distributed collaboration” (Merz 2006) and Distributed Cognition (Hutchins 1995), “In which distribution has not only a physical dimension…but also a social dimension (distribution of cognitive processes…)” (Merz 2006).
“There’s no strict line management on top of it; it’s a collaborative project.”
Physics has globalised working practices, mediated by a travelling culture (Merz 2006)

Collaboration in Grid development
The “achievement is managing the sites globally and working together”
“Grid computing is actually linking computing resources that are actually staying under local control and being in the administrative domain of different independent entities and then building something that makes all this look, and behave, from the users perspective as one thing.”
“The development effort is very, very distributed, even inside a single component… any change you do here is reflected here, and the teams are in other places.”
“So it is not so much a software development, the story we have to tell, it is building this community around the grid computing, and also that, for the first time, we closely interact with other disciplines.”

The other side of the coin… Competition between experiments
“So basically we’ve got roughly equal performance detectors. We’ve got roughly equal size collaborations. We all know what we’re doing more or less.
So the person who’s going to get the computing analysis right is going to win.”
“ATLAS is far far bigger than CMS in the UK. The reason for making [a decision between experiments] is based upon a combination of who you’ve worked with before, who you liked, what your prejudices are, were you on the same… experiments as these other people and so on and so forth… Once that decision is made, then there is an irrevocable split for the next 20 years... We don’t talk to each other collectively in any real substantial way. And then of course there’s also the relationship with GridPP which where we do end up talking to ATLAS people and LHCb people and people do take on roles independent of the experiments.. And you know, we don’t sit there with experiment hats on all the time. .. So in other words, it’s bloody complicated.”

Pragmatism, bricolage
“The approach inside experiments has always been extremely pragmatic. So we were aware of a kind of high level concept and a vision of what it should be looking like, but they worked always bottom up, so they always started with very primitive prototypes, leaving things out that are not necessary for achieving something, and tried to get the users involved as quickly as possible.”
“…having said that we are tied in this European project with a very formal structure, a lot of the work is still done, actually the successful work is done mostly in informal ways. So through the experiments you have links to sites and to individuals here, and a lot of the ultimate decision making is done by communicating with these people that you know from former experience.”

The other side of the coin… Tension between Computer Scientists and PP
“Computer scientists will put together the most elegant things in the universe but it will never work…physicists will come up with the most hacked solution in the world…but it will work.”
Of software engineers… “want to do things very formally.
They want to design things, they want the project very well defined, but (…) by definition physicists normally don’t know what they want. There’s lots of prototyping and there’s a slight difference in attitude”
“there has certainly been some friction along those lines”
There is a belief that it is possible to “get a bright graduate student to write something that will work for me in three weeks”.

Loose Management & Freedom
“The group leader doesn’t get to say what to do”, “Socialist”, “federation”, “club”, “meritocracy”.
“This environment is based on, if you want, charismatic leadership and people doing things relatively independent but also having the freedom to do them, and not having to report every two minutes on what they are doing.”
“Why was the web invented here? Because Tim had the freedom from this hierarchy, to spend a bit of time investigating something which was of interest to him and nobody else here [thought] – oh it’s a waste of time, never mind. He was working on remote procedure calls. And out of it popped the Web... One guy, sitting in his office, who had a dream.”

The Other Side of the Coin: Difficulties in distributed management
“We replaced this conflict management system to just a bunch of configuration scripts, based on Bash, which every sys admin knows and feels comfortable with. And also making sure they don’t feel that the software controls them, but they control the software. That was very important for us.”
“herding cats”
“I think that they’ve (GridPP) a little bit lost their way in terms of the organisation of the Tier 2s. And part of this is based on the idea that we can’t tell them what to do. So they use different management software.”

Trust
“everyone trusts each other to be doing the best they can..
That fundamental trust drives our particle physics group”
“you have to trust that people will step up… and do the dirty work as well as doing the glamorous work”
“actually the trust between the different high energy physics computing centres is much larger than what, in most of our member countries, are the legal constraints.”
“I cannot imagine that a huge car maker would like to crash test, literally, their upcoming models, on another car maker’s machine. No matter what security you put in they may not feel they should do that, for maybe good reasons. Not because the system is inherently not safe, but the trust you need to do this is just not there. But for the scientific community, where this is not such a big issue …”

The Other Side of the Coin: Tension with a common Grid
“Would you entrust, I mean the Tier 1 centre is critical for UK physics analysis from LHC, right, which is what we’ve had, 200 and something million pounds to do… Would you trust that to somebody else in a different country who didn’t have your interest at heart? No, of course, you wouldn’t.”

Next stage of research
Users
Impact of GridPP on their working practices