The myGrid Developer Perspective
The myGrid Case Study

Design
• Approach to design
  – Emergent over several years of interaction between CS and bioinformaticians
  – Previous projects: TAMBIS, GIMS, MicroVBase, et al.
  – Questionnaire in TAMBIS = suggestion of workflow
  – Understanding how bioinformaticians work: models a product of myGrid and after
  – User days
  – Email lists
  – M.Sc. students and other student projects

Who is myGrid for?
  – Users, developers, maintainers
  – Biologists
  – Bioinformaticians, resource providers
  – Tool builders, system administrators

myGrid users
[Diagram. Labels, in source order: biologists; infrequent; IS specialists; tool builders; problem specific; bioinformaticians; bioinformatics tool builders; systems administrators; service provider]

Taverna - Architecture
[Diagram. Components, in source order:]
• Taverna Workbench: Semantic Discovery, Service Panel, WF Diagram, Enactor UI, Model Explorer, ...
• Freefluo Enactor: Enactor Web Service Interface (WSDL), Event Observer, Event Generation Logic, Enactor Core
• Interfaces: GUI plug-in interface, Event observer interface, User Interaction interface, ...
• Processors: Processor Event Observer, Soaplab Processor, REST (HTTP GET PUT) Processor, Local Java Object or API consumer Processor, ...
• Points of extension: Processor, ...
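
The architecture slide names two extension points: processors plugged into the enactor core, and event observers notified as the workflow runs. A minimal sketch of that shape follows. Every name in it (`Processor`, `Enactor`, `add_observer`, `run`) is hypothetical and chosen for illustration; none of it is the Taverna or Freefluo API, and the linear pipeline is a simplification of a real data flow.

```python
# Illustrative sketch only: all class and method names are assumptions,
# not Taverna or Freefluo interfaces.

class Processor:
    """Wraps one kind of service (Soaplab, REST, local object, ...) behind a uniform call."""

    def __init__(self, name, func):
        self.name = name
        self.func = func

    def invoke(self, value):
        return self.func(value)


class Enactor:
    """Runs processors in order and reports progress to registered observers."""

    def __init__(self):
        self.observers = []

    def add_observer(self, callback):
        # callback(event, processor_name) is called for every event
        self.observers.append(callback)

    def _emit(self, event, processor_name):
        for callback in self.observers:
            callback(event, processor_name)

    def run(self, processors, value):
        # Feed the output of each processor into the next one
        for processor in processors:
            self._emit("started", processor.name)
            value = processor.invoke(value)
            self._emit("completed", processor.name)
        return value


# Usage: two stand-in processors and one observer that records events
events = []
enactor = Enactor()
enactor.add_observer(lambda event, name: events.append((event, name)))

pipeline = [
    Processor("uppercase", str.upper),
    Processor("reverse", lambda s: s[::-1]),
]
result = enactor.run(pipeline, "acgt")
```

The point of the design is that a new service type only needs a new processor, and a new GUI view or log only needs a new observer; the enactor core stays unchanged.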
Functional Requirements
• Functional: joining tools and data resources in data flows; support for the scientific process; data gathering; provenance
• Non-functional: high-level; flexible; reusable; replicable; auditable; open
• Functional intimately drives non-functional: tried writing workflows in WSFL, but too low-level -> SCUFL, a high-level data flow language
• No usability without utility
• Starting at such a low base

Usability
  – Effectiveness: getting the job done at all
  – Efficiency: getting the job done in minutes rather than days
  – Satisfaction: getting the job done

Technical Strategy
• Prototypes: more than one thrown away
• Bioinformatician application builders
• No standards chasing, e.g., workflow languages and security
• Standard where possible
• Few biological standards
• Openness has implications

Development Issues
• Users will not speak to you until you've got something
• We didn't own any users
• The e-Science programme? Publish or develop?
• Computer Science or application domain
• Project dynamics: we knew what we wanted to do…

Evaluation
• We got biologists to use it: we did some biology with myGrid
• Embedded Ph.D. students
• Openness and response
• Early users and developers lists
• User days for tutorials and clandestine evaluation
• No formal evaluation

Lessons Learnt
• It works and it is used
• We didn't do enough about results visualisation
• Bioinformatics is the biggest barrier
• Biologists don't care about infrastructure until it doesn't work; they want to do biology
• Tension between CS and domain needs
• Every RC-sponsored e-Science project must be given Ph.D. studentships
• These must be given to their users to use their tools
• These students can reject the tools
• The project must explain why!

Future Plans
• OMII
• Thinner client; higher data volumes
• Workbench for super-users
• Need portal for general users
• Moving to cheminformatics; biological simulation; adjacent domains