Requirements & Challenges: Status and Open Questions
Hilmar Lapp
National Evolutionary Synthesis Center
DRIADE Workshop, Dec 5, 2006

The Four Virtues

Virtue            | Use case examples
Data Sharing      | Reproducing, supplementing experiments; special databases
Data Discovery    | Locating data for re-use, synthesis; competitive situation
Data Preservation | Ensuring accessibility against business and career changes; protecting data from loss of integrity
Synthesis         | Facilitating meta-analysis; machine-readable data semantics

Challenges vs Virtues

Challenge                                  | Sharing | Discovery | Preservation | Synthesis
Federated or grid or central storage       |         |           |              |
Interoperability, data services            |         |           |              |
Data & metadata lifecycle management       |         |           |              |
Structured or unstructured or raw data     |         |           |              |
Repository trust-level, data integrity     |         |           |              |
Long-term funding & operation              |         |           |              |
Incentive for user                         |         |           |              |
Data provenance, IP, open access           |         |           |              |
Metadata capturing and standard            |         |           |              |

Requirements

“[…] a singular documented need of what a particular product or service should be or do.” (quoted from Wikipedia)
- Requirements gathering is only one step
  - e.g., preceded by conceptual analysis
  - e.g., followed by requirements analysis
- Requirements are divided into types
  - Functional - things a system must do
  - Non-functional - properties a system must have
  - Constraints - implementation limits

Good requirements are …

- Necessary – must be included, or an important feature or property will be missing.
- Unambiguous – only one interpretation.
- Concise – brief and easy to read, yet conveys the essence of what is required.
- Consistent – does not contradict other requirements and uses language consistent with other requirements.
- Complete – stated entirely in one place.
- Reachable – implementation is feasible.
- Verifiable – must be able to determine that the requirement has been met.
(based on Wikipedia)

Requirements are central

- Ensure an engineered system meets the expectations of users and stakeholders.
- Determine the features a system will have.
- Determine which challenges must be overcome, and to what extent.
- Can be prioritized to determine the order in which features are developed and challenges are addressed.

Gathering Requirements

- Costs a significant amount of time for both the engineer and the user or stakeholder.
- This time can be wasted if the requirements are later ignored, or change before the implementation is complete.
- Adaptive gathering depth
  - Gather details on recognizably invariant (‘hard’) requirements
  - Use an agile approach (get the bottom line, then iterate) for those anticipated to change

Expressing Requirements

- Stating ‘good’ requirements is an engineering art, not a common trait
- ‘Use cases’ and ‘user stories’ provide the bridge
  - Expressed in common language
  - Need not be concise, unambiguous, complete etc.
  - Reflect expectations of the user or stakeholder, or a hypothetical scenario from a user perspective
- Engineers will transform a set of use cases into requirements (which may require iteration)

Use case example

Use case: Submitting a paper with supplementary material.
“Author will be taken to the repository to upload her supplementary data in status ‘pending publication’, one file at a time. After uploading, the repository issues a receipt token confirming the deposition. The paper submission site will accept the receipt token for the user to complete the submission.”

User story:
“I go to the repository to upload the data for a paper I’m submitting, consisting of a file of sequences, and a series of morphological images for N taxa. The results are summarized in a phylogenetic tree with bootstrap values and a table. The repository lets me upload every file, asks pertinent questions about what the data in the file is, who created it, and how it was obtained. It then goes ahead and deposits the sequences in GenBank, the images in MorphBank, and the tree in TreeBASE. It returns to me the accession numbers for each of those submissions, and the accession number for the data table.”

Role for Journals and Soc.

- Scope of the repository is published data, hence journals are the major stakeholders
- Authors will be the major users
- Need to gather requirements of journals for
  - Submission process
  - Data preservation
  - Data access
- Need help in gathering requirements for creating an incentive for authors to comply
  - Often opportunities can be better gleaned from behavior than elicited through interviewing
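
Sketch: receipt token handshake

The use case example (upload files one at a time, receive a receipt token, hand the token to the paper submission site) could be sketched roughly as follows. This is only an illustration under stated assumptions, not a proposed design: the class and method names (`Repository`, `SubmissionSite`, `start_deposit`, `finish`, `verify`, `complete_submission`) are hypothetical, storage is in memory, and the submission site is assumed to check tokens by asking the repository directly.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class Deposit:
    """Supplementary data for one paper (hypothetical record layout)."""
    author: str
    files: list = field(default_factory=list)
    status: str = "pending publication"


class Repository:
    """Hypothetical in-memory stand-in for the data repository."""

    def __init__(self):
        self._deposits = {}  # receipt token -> Deposit

    def start_deposit(self, author):
        return Deposit(author=author)

    def upload(self, deposit, filename, content):
        # Files arrive one at a time, as in the use case.
        deposit.files.append((filename, content))

    def finish(self, deposit):
        # Issue a random receipt token confirming the deposition.
        if not deposit.files:
            raise ValueError("nothing was uploaded")
        token = secrets.token_urlsafe(16)
        self._deposits[token] = deposit
        return token

    def verify(self, token):
        # Assumption: the submission site may ask whether a token is known.
        return token in self._deposits


class SubmissionSite:
    """Hypothetical paper submission site that accepts receipt tokens."""

    def __init__(self, repository):
        self._repository = repository

    def complete_submission(self, manuscript, token):
        # The submission only completes if the repository recognizes the token.
        return self._repository.verify(token)


if __name__ == "__main__":
    repo = Repository()
    deposit = repo.start_deposit("A. Author")
    repo.upload(deposit, "sequences.fasta", b">taxon1\nACGT\n")
    repo.upload(deposit, "tree.nex", b"#NEXUS\n")
    token = repo.finish(deposit)

    site = SubmissionSite(repo)
    print(site.complete_submission("manuscript.pdf", token))
    print(site.complete_submission("manuscript.pdf", "not-a-token"))
```

Writing the use case down this way tends to surface requirements questions the prose leaves open, for example whether a token may be reused for several submissions, and what happens to a ‘pending publication’ deposit if the paper is rejected.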