The mechanics of EPCC
How EPCC learnt to do technology transfer and software engineering the hard way
Dr Mark Parsons
Commercial Director, EPCC
m.parsons@epcc.ed.ac.uk
+44 131 650 5022

Structure of talk
• What is EPCC today
• Learning to deliver on time
• The structure of a commercial project
• Software development and OGSA-DAI
• Project management
• Questions and discussion

EPCC Activities
• Europe’s largest, most successful supercomputing centre
  – 15 years old
• Vital statistics:
  – 65 staff
  – £3.2M turnover, (almost) all from external sources
• Multidisciplinary and multi-funded
  – with a large spectrum of activities
  – and a critical mass of expertise
• Strong engagement with industry
  – from local SMEs to large multinationals
  – project based consultancy services
• Supports research at University of Edinburgh via
  – access to facilities
  – training and support
  – TRACS visitor programme
• NeSC
  – founding partner of National e-Science Centre
• Wide variety of leading-edge systems
  – 1,600 processor HPCx system
  – 2,000 processor IBM Blue Gene/L
  – 12,000 processor QCDOC
• New investment in Advanced Computing Facility
[Diagram labels: Facilities, HPC Research, Technology Transfer, Training, European Coordination, Visitor Programme]
EPCC has a unique breadth of expertise in high performance computing

Commercial activities today
• Bespoke software development and software project management for business
  – network, cluster and high-performance computing
  – novel application areas – from mushrooms to internet packets
• Start-to-finish projects
  – full software development lifecycle, 3 - 12+ months
  – most commercial projects are < 6 months
• Operate like a business
  – Commercial Group brings in business, Software Development Group delivers
  – charge at commercial rates ($1,000 per day)
  – very delivery focused
  – all commercial contracts are fixed cost
  – funded by cash contracts, public funds and European Commission, EU
  – many of the smaller projects are supported by SE

Clients, 2000 - 2004
USA:
  o Cisco Systems Inc
  o Sun Microsystems Inc
  o IBM Corporation
  o Oracle Corporation
  o Hewlett Packard
  o Microsoft
  o Xilinx Corporation
UK:
  o Almond Engineering Ltd
  o Altamira Ltd
  o Arran Aromatics Ltd
  o Callanders Sawmills Ltd
  o Calman Ltd
  o CB Technology Ltd
  o Centre for Customer Awareness Ltd
  o CERN
  o Cheltenham & Gloucester plc
  o DTI
  o Digital Bridges Ltd
  o Elektrobit Ltd
  o First Group plc
  o Golden Crumb Ltd
  o High Speed Productions Ltd
  o Integriti Solutions Ltd
  o IP Technology Ltd
  o Ironside Farrar Ltd
  o Jardine Technology Ltd
  o Pepper’s Ghost Productions Ltd
  o Radar World Ltd
  o Red Lemon Ltd
  o Rosti (Scotland) Ltd
  o Quadstone Ltd
  o SCI Ltd
  o Scottish Enterprise
  o The Crown Office
  o TSB Bank Scotland Ltd
  o UK Meteorological Office
  o Upstream Systems Ltd
  o Alpha Data Parallel Systems Ltd
  o Nallatech Ltd
Europe:
  o European Commission
  + many EU project partners
Japan:
  o Hitachi
  o NEC Europe
  o Fujitsu Labs Europe

Business Strategy
• to solve business problems NOT sell technology
• … individual solutions for clients
[Figure: example projects arranged by project size (£ X,000,000 / £ X00,000 / £ X,000). Labels: Technology Push Down, Academic Research. Projects shown: OGSA-DAI, SunDCG, PGPGrid, First Group, CCA, IPO, Autoscreen, Microsoft, C&G]

How do we work?
No.

How do we work?
• Take pride in a professional approach
  – Work in small project teams
  – Project leader, 1-6 developers, technical reviewers
• Use documented engineering & management processes
  – Project management based on PRINCE2
  – Engineering using agile methods
• Built from experience and industry best practice
  – Iterative/staged development techniques
  – Requirements triage
  – Test-driven development
  – Tuned to the leading edge of innovative software development

Who does the work?
• Currently around 4 business development staff
  – 2 focus on business development
  – 2 focus on marketing and publicity
• Currently around 20 engineering staff
  – Three full-time project managers, two software architects
  – c. 15 consultants and principal consultants
  – Staff backgrounds – maths, physics, computer & life sciences
  – Over 100 staff-years of experience, over 1/3 from industry
• Typical skills
  – Java, C/C++, Visual Basic/C#, Perl, Fortran
  – Distributed computing, web services, XML, J2EE, MPI, OpenMP
  – Databases, SQL, JDBC, XML-DB
  – Software engineering, OO design, UML

EPCC’s early history
• Established in 1990
  – focus for interest in parallel computing within Physics and CS
• Early years largely supported by UK Government “Parallel Applications Programme”
  – made lots of money working with large UK corporations to optimise/parallelise their codes
• How did our funding model come about?
  – from a belief in the self-funding of University research
  – we’ve shown it can be done but it’s very difficult
  – it did mean we had to work with industry from the beginning

EPCC history (continued)
• 1990-1994
  – funded by UK Government Parallel Applications Programme
  – grew to 65 staff
  – many parallelisation projects with UK industry – aerospace, nuclear, oil & gas etc.
  – spun out a company – Quadstone
• 1995-1996
  – as Government money dried up so did projects
  – had to move from long term projects (18 months) to much shorter projects (3-6 months)
  – major problem – project / cost overruns
  – nearly had to make many staff redundant

EPCC history (continued)
• 1997-2000
  – successfully moved markets from large-scale industry to SMEs
  – opportunities focussed around successful EU TTN project
  – projects 3-6 months in duration
  – embarked on having a repeatable process
• 2000-now
  – over the past few years moved into Grid computing
  – continued to work with industry
  – wide variety of projects:
    – OGSA-DAI – data access & integration for the Grid
    – Intersim – packet level modelling of differentiated services
    – Golden Crumb – automatic mushroom selection in factory
    – Cheltenham & Gloucester – data mining for mortgage industry

How does EPCC work today?
• We have well developed project processes
• Two linked processes
  – software development process
  – project management process
• Will illustrate software development process using OGSA-DAI as example
• Recently moved to PRINCE2 project management methodology

The project lifecycle
• Commercial Group (CG) identifies clients and initiates discussions
• Following initial discussions the Commercial Director (CD) and technical staff visit the company to discuss requirements
• High level design written
  – timings / costs agreed
  – may involve free code survey at this point
• Contract negotiated
  – fixed price
  – includes detailed workplan based on design
• Project handed to IS
  – staff scheduled according to skills
• All projects have
  – Project Leader, Applications Consultant, Technical Reviewer
  – Regular meetings between IS and CG
  – CG act as account manager to company / funder

OGSA-DAI
• Data Access and Integration for database resources on the Grid
• Aim to deliver application mechanisms that:
  – Meet the data requirements of Grid applications
    – Functionality, performance and reliability
  – Reduce development cost of data centric Grid applications
  – Provide consistent interfaces to data resources
  – Are acceptable to and supportable by database providers
    – Trustable, imposed demand is acceptable, etc.
  – Provide a standard framework that satisfies standard requirements
• A base for developing higher-level services
  – Data federation / Distributed query processing
  – Data mining
  – Data visualisation

OGSA-DAI team
• EPCC Team, Edinburgh
• NeSC, Edinburgh
• NEReSC, Newcastle
• ESNW, Manchester
• IBM Development Team, Hursley
• IBM Dissemination Team

Software Process and Teams
[Diagram: a continual process linking USERS and DEVELOPERS. Review labels: Programme Board, Technical Review Board, Users’ Group, Technical Reviewer, Peer Review and Inspection. Process labels: Reqs., Design, Implement, QA, Ingest, Release, Dissem., Support, Training. Input labels: Requests, Contribs, Use Cases, Test Cases. Testing labels: System tests based on reqs, Nightly unit + system tests, Additional test cases, Fix Bugs, Prioritisation. Other labels: Deep track features, Prototype]

Working together
• No more heroes any more
  – the lone researcher can get into trouble – so don’t do it!
  – use teams even for small projects
  – a task leader to keep the bigger picture in mind
  – a “reviewer” as a technical foil for the developer
  – distributed extreme programming doesn’t work – be sensible!
• Code needs owners
  – and joint ownership doesn’t work
  – Java packages and CVS modules provide useful boundaries
  – “buddy” system worked well for a team of 10-12, not as well for 5
  – we now have 80,000 lines of Java code + 30,000 lines of documentation

An agile approach to development
• Agility is all
  – Grid/HPC environments and problems = complex systems
  – complex systems = big, complex projects
  – big, complex projects = high risk of failure
  – adopting incremental approaches to requirements, design, and implementation helps minimise risk
  – delivering small increments regularly is good
  – good for quality, for visibility, for morale
• Keep your eyes on the road
  – keep an active eye on project risks
  – think about what happens if this goes wrong
  – just thinking about it reduces the likelihood it’ll happen!

Releasing software
• No release schedule = no releases
  – don’t timebox research, but do timebox development
  – HPC is fun and exciting - beware feature creep!
  – “how’s the project?” – “oh, we’re 95% there” (and always will be…)
  – frequent release milestones focus developers
  – but don’t overspecify what will be released
• OGSA-DAI had the opposite problem
  – three months too short
  – six months about right
  – major/minor/patch/“special brew”
  – set your testing timetable in stone

Know your requirements
• Requirements, requirements, requirements
  – write ‘em down! Give ‘em numbers!
  – remember, requirements aren’t just functional!
  – whatever they are, they are always testable
  – tests on HPC systems may be tricky, but that makes it fun!
  – MoSCoW notation is good – Must, Should, Could, Won’t
  – “how important are Priority 3 requirements again..?”
• OGSA-DAI had lots of requirements
  – but make sure you can understand their worth
  – real users are often better than good ideas
  – a user group helps to focus development as software matures

Return, recycle, reuse
• Throwaway prototypes never are
  – “once I’ve proved this, I’ll junk the code”
  – no, you won’t (or your grad student won’t)
  – apply some basic process even to trivial codes
  – even reuse of “good” code is sometimes wrong
• OGSA-DAI started with high ideals
  – beware the big ball of mud – patterns in architecture
  – “Shantytown” – enables quick exploration of feature territory
  – must be built on a strong central foundation
  – must include council legislation aka testing

OGSA-DAI Dashboard

Can I see your documents please?
• Document! Document! Document!
  – Imagine trying to program without a language reference
  – structure and stability is good
  – Get people who like writing documents to do them
  – but get everyone to doc their code
  – a single editor can provide guidance
  – Good code documentation can be used by the tooling
  – Good human documentation will win your users’ support
• Make sure you don’t underestimate the cost
  – code maintenance and documentation takes longer than code development
  – make it part of the process

People power
• Social engineering is the key
  – Push decisions down to the developers
  – “Too many chiefs” – make sure you know what are the key battles to win
  – Have a process for change – or one person will become very unpopular
  – developers and managers both think they know what’s best
  – Understand your teams
  – different people like working in different ways
  – no one style for management in OGSA-DAI
  – Competition is good – go one better

The big picture
• Balance the hype
  – software engineering is about vision vs effort vs requests
  – expectation management is important
  – researchers, developers, users and funders are all different – and all want different things
  – the larger the project, the harder it falls
• Listen to your users
  – usability is good
  – it has to install easily
  – don’t change your interface
  – client tooling helps support
  – helpdesk is better
  – user groups are interesting

Software development summary
• Agile methods are very sympathetic
  – the Agile founders disliked Rigid Inflexible Processes too!
• Adopt a simple process and toolset
  – even lightweight process really pays off
  – scoping, requirements and risk analysis up front
  – incremental approach to design, develop, test
  – learn some basic tools (they’re even free!)
  – distributed teams are hard to manage
  – strictly distributed management is even harder
• Listen to your customers – they always know best

Project Management
• All technical staff have a line manager and at least one project leader
• Procedures are well documented and have grown up over time
• Recently we have moved to PRINCE2 project management methodology for commercial projects
  – seems to work well but is a bit of a culture shock
• We employ staff specifically for project management
• All staff time is logged – planned and actual
• A working day has two blocks of 3 hours
• Staff can bid for time to do research / proposal writing

What is PRINCE2?
• PRojects IN Controlled Environments version 2
• A project management standard produced by UK’s Office of Government Commerce (part of HM Treasury)
  – “PRINCE2 is a process-based approach for project management providing an easily tailored, and scalable method for the management of all types of projects”
• PRINCE2 is a de facto UK PM standard
  – becoming mandatory in the public sector (Gov, NHS, Police)
  – becoming PM method of choice in business – Unilever, GlaxoWellcome, Tesco, BT, Sun, TSB, NatWest, Norwich Union, Centrica, Cable & Wireless…
  – becoming widespread in Europe too
• PRINCE2 is internationally recognised and respected

What is PRINCE2 not?
• PRINCE2 is not a software engineering method
  – but it grew out of an IT environment
  – and it fits well with traditional or agile development methods alike
• PRINCE2 will not help you code better
  – but it will help you deliver better quality products, on time
  – and will stop you falling out with your boss/staff
• PRINCE2 will not tell you how to write software
  – but it will leave you alone to write software your way
• PRINCE2 is not a silver bullet
  – but it’s general, flexible and tailorable and – most importantly – it’s based on common sense

PRINCE2 in a nutshell
• Projects have a clear Business Case or they don’t happen
  – “remind me again why we’re doing this project?”
• Projects have a beginning, a middle and an end – clearly defined
  – they start and they stop – they don’t weeble on forever
• Projects run in stages with clearly defined boundaries
  – get a clear picture of how we’re all doing
• Product-based planning focuses on deliverables not tasks
  – think “what do we have to make?”
• Layered management: corporate, board, project, team
  – each level has a clearly defined interface with the others
• Management by exception
  – if there are no problems, just carry on – management don’t meddle
• Change is fundamental: change management is intrinsic
  – assume things will change and plan accordingly

The PRINCE2 process diagram
[Diagram labels: Corporate or Programme Management, Project Mandate, Directing a Project, Starting up a Project, Initiating a Project, Controlling a Stage, Managing Stage Boundaries, Closing a Project, Managing Product Delivery, Planning]

PRINCE2 Components
• As well as the processes there are several complementary components…
• The Business Case
  – a key driver – the Why? for the project
  – either a genuine (commercial) business case or at least a set of compelling reasons
  – owned by the Executive
  – monitored throughout the project – if the BC goes away, the project should be stopped
• The Project Organisation
  – describes the four management layers – corporate, board, project, team
  – everyone should have a job description – make roles and responsibilities clear

PRINCE2 Components (2)
• Plans
  – product-based, as discussed above
  – write product descriptions for key products
• Controls
  – divide the project into Management Stages
  – a Stage is “as far ahead as you can plan in reasonable detail” – typically a few months
  – define reports, meetings etc.
• Tolerances
  – allowed variations in time, budget, scope before escalation triggered
  – “you have six months, +/- 1 month”
  – “you must satisfy these requirements; those are optional this stage”

PRINCE2 Components (3)
• Quality
  – the project must define methods for QC and test
  – quality checks should be built in to the MP process
• Risk
  – think about it, monitor it
  – one of the best management tools is to ask “what might go wrong?”
  – and create plans to handle it if it does
• Configuration Management
  – keep track of product versions and histories
  – software version control tools are a good way of implementing this

PRINCE2 Summary
• PRINCE2 is a powerful, flexible, scalable PM approach
• It’s based on industry best practice
  – rooted in software development projects
• Provides good, intelligent layers of management control
• Formalises, in a positive way, customer relations
• Can fit easily with agile software development
• It’s the only PM approach with internationally recognised qualifications

Final comments on working with industry
• Wear a tie!
• Remember that the person you’re meeting is just as nervous of meeting a mad academic as you are of meeting a rapacious capitalist
• The managing director of the company may be drunk
• Always apply Denis Healey’s law of holes: “When in one, stop digging”
• If you’re going to deliver late – tell the customer straightaway
• Listen listen listen!!! It’s the only way to get business

Questions / discussion?
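
A minimal sketch, not EPCC's actual tooling, of the tolerance rule from the PRINCE2 Components (2) slide ("you have six months, +/- 1 month") combined with management by exception: a stage carries on while its forecast stays within tolerance and is escalated only when it falls outside. The names `Tolerance` and `needs_escalation` are hypothetical, chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tolerance:
    """A planned value plus the variation allowed before escalation (hypothetical type)."""
    planned: float
    allowed_variation: float

    def within(self, forecast: float) -> bool:
        # In tolerance when the forecast deviates by no more than the allowance.
        return abs(forecast - self.planned) <= self.allowed_variation


def needs_escalation(tolerance: Tolerance, forecast: float) -> bool:
    """Management by exception: escalate only when the forecast is out of tolerance."""
    return not tolerance.within(forecast)


# "You have six months, +/- 1 month."
stage_time = Tolerance(planned=6.0, allowed_variation=1.0)
print(needs_escalation(stage_time, 6.5))  # forecast inside tolerance: carry on
print(needs_escalation(stage_time, 7.5))  # forecast outside tolerance: escalate
```

The same check could be applied to budget or scope by creating one `Tolerance` per dimension; how the escalation itself is handled is left out of the sketch.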