The myGrid Developer Perspective: The myGrid Case Study

Design
• Approach to design
– Emerged over several years of interaction between CS and bioinformaticians
– Previous projects: TAMBIS, GIMS, MicroVBase, et al.
– Questionnaire in TAMBIS suggested workflow
– Understanding how bioinformaticians work: models are a product of myGrid and after
– User days
– Email lists
– M.Sc. students and other student projects
Who is myGrid for?
– Users, developers, maintainers.
– Biologists.
– Bioinformaticians, resource providers.
– Tool builders, system administrators.
myGrid users
[Figure: myGrid users. Labels: biologists; infrequent; IS specialists; tool builders; problem-specific bioinformaticians; bioinformatics tool builders; systems administrators; service provider]
Taverna - Architecture
[Figure: Taverna architecture.
Taverna Workbench: Semantic Discovery, Service Panel, WF Diagram, Enactor UI, Model Explorer, ...
Freefluo Enactor: Enactor Web Service Interface (WSDL), Event Observer, Event Generation Logic, Enactor Core
Processors: Soaplab Processor; REST (HTTP GET, PUT) Processor; Local Java Object or API consumer Processor; User Interaction Processor; ...
Points of extension: GUI plug-in interface; event observer interface; processor interface]
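The architecture labels name processors and event observers as points of extension around an enactor core. A minimal Java sketch of that plug-in shape follows. All type and method names here (`Processor`, `EventObserver`, `EnactorCore`, `LocalJavaProcessor`) are hypothetical illustrations, not the actual Taverna or Freefluo API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical extension point: one implementation per kind of service
// (e.g. Soaplab, REST, local Java object). Not the real Taverna interface.
interface Processor {
    String name();
    Map<String, String> invoke(Map<String, String> inputs) throws Exception;
}

// Hypothetical observer that is notified as the enactor runs processors.
interface EventObserver {
    void onEvent(String processorName, String status);
}

// Stand-in for a "local Java object" processor: upper-cases its input.
final class LocalJavaProcessor implements Processor {
    public String name() { return "toUpper"; }

    public Map<String, String> invoke(Map<String, String> inputs) {
        Map<String, String> out = new LinkedHashMap<>();
        out.put("result", inputs.get("sequence").toUpperCase());
        return out;
    }
}

// Minimal enactor core: runs one processor and reports events to observers.
final class EnactorCore {
    private final List<EventObserver> observers = new ArrayList<>();

    void addObserver(EventObserver o) { observers.add(o); }

    Map<String, String> run(Processor p, Map<String, String> inputs) {
        fire(p.name(), "STARTED");
        try {
            Map<String, String> out = p.invoke(inputs);
            fire(p.name(), "COMPLETED");
            return out;
        } catch (Exception e) {
            fire(p.name(), "FAILED");
            throw new IllegalStateException("processor failed: " + p.name(), e);
        }
    }

    private void fire(String name, String status) {
        for (EventObserver o : observers) {
            o.onEvent(name, status);
        }
    }
}

final class ArchitectureSketch {
    public static void main(String[] args) {
        EnactorCore core = new EnactorCore();
        // A GUI plug-in would register an observer like this one.
        core.addObserver((name, status) -> System.out.println(name + ": " + status));
        Map<String, String> in = new LinkedHashMap<>();
        in.put("sequence", "acgt");
        System.out.println(core.run(new LocalJavaProcessor(), in));
    }
}
```

Under these assumptions, supporting a new kind of service means adding one `Processor` implementation, and a new user interface component only needs to register an `EventObserver`; the core itself stays unchanged.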
Functional Requirements
• Functional: Joining tools and data resources in data
flows; support for the scientific process; data
gathering; provenance
• Non-functional: High-level; flexible; reusable;
replicable; auditable; open
• Functional intimately drives non-functional: We tried
writing workflows in WSFL, but it was too low-level, which led to
SCUFL, a high-level data flow language
• No usability without utility
• Starting at such a low base
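The idea of joining tools in a data flow while recording provenance can be sketched in a few lines of Java. This is an illustration only, not SCUFL syntax or any myGrid API: the `DataFlow` and `Step` classes and the string-processing steps are hypothetical stand-ins for real services.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// A named step in a linear data flow (hypothetical; not SCUFL).
final class Step {
    final String name;
    final Function<String, String> fn;

    Step(String name, Function<String, String> fn) {
        this.name = name;
        this.fn = fn;
    }
}

// Runs steps in order, feeding each output into the next step and
// recording a provenance entry (step name, input, output) for each one.
final class DataFlow {
    private final List<Step> steps = new ArrayList<>();
    private final List<String> provenance = new ArrayList<>();

    DataFlow then(String name, Function<String, String> fn) {
        steps.add(new Step(name, fn));
        return this;
    }

    String run(String input) {
        String value = input;
        for (Step s : steps) {
            String next = s.fn.apply(value);
            provenance.add(s.name + ": " + value + " -> " + next);
            value = next;
        }
        return value;
    }

    List<String> provenance() { return provenance; }
}

final class DataFlowSketch {
    public static void main(String[] args) {
        // Each lambda stands in for a remote tool or data resource.
        DataFlow flow = new DataFlow()
            .then("trim", String::trim)
            .then("toUpper", String::toUpperCase)
            .then("reverse", s -> new StringBuilder(s).reverse().toString());
        System.out.println(flow.run("  acgt "));
        flow.provenance().forEach(System.out::println);
    }
}
```

The user states only which steps are chained; invocation order, data passing, and the audit trail are handled by the runner, which is the sense in which a data flow description is higher-level than hand-written service orchestration.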
Usability
– Effectiveness: Getting the job done at all
– Efficiency: Getting the job done in minutes rather
than days
– Satisfaction: Getting the job done
Technical Strategy
• Prototypes: more than one thrown away
• Bioinformatician application builders
• No standards chasing, e.g., workflow
languages and security
• Standards where possible
• Few biological standards
• Openness has implications
Development Issues
• Users will not speak to you until you’ve got
something
• We didn’t own any users
• The e-Science programme? Publish or
develop?
• Computer science or application domain?
• Project dynamics: We knew what we wanted
to do…
Evaluation
• We got biologists to use it: We did some biology with
myGrid
• Embedded Ph.D. students
• Openness and response
• Early users and developers lists
• User days for tutorials and clandestine evaluation
• No formal evaluation
Lessons Learnt
• It works and it is used
• We didn’t do enough about results visualisation
• Bioinformatics is the biggest barrier
• Biologists don’t care about infrastructure until it doesn’t
work; they want to do biology
• Tension between CS and domain needs
• Every RC-sponsored e-Science project must be given
Ph.D. studentships
• These must be given to their users to use their tools
• These students can reject the tools
• The project must explain why!
Future Plans
• OMII
• Thinner client; higher data volumes
• Workbench for super-users
• Need portal for general users
• Moving to cheminformatics; biological
simulation; adjacent domains