A Workflow Manager for NMR: objectives, design, and implementation

● Introduction
    ● CCPN and WeNMR
    ● NMR data and software
● WMS Workflow Management System
    ● Goals
    ● Implementation
    ● Status

CCPN
● Collaborative Computing Project for NMR (Nuclear Magnetic Resonance)
● Funded by BBSRC since 1999
● WeNMR project partner
● Goals:
    ● Unifying platform for NMR software
    ● Community-based, open-source software development
    ● Outreach: meetings and courses

CCPN Credits
■ University of Cambridge
    ● John Ionides (WMS lead)
    ● Rasmus Fogh (WMS)
    ● Wayne Boucher
    ● Tim Stevens
    ● Magnus Helgstrand
    ● Prof. Ernest Laue
■ University of Leicester
    ● Prof. Geerten Vuister
■ EBI (PDBe), Hinxton
    ● Pieter Hendrickx
    ● Aleksandras Gutmanas
    ● Gerard Kleywegt
Funding:
● EU FP7 (WeNMR partner)
● BBSRC (CCPN grant)
● AstraZeneca, Genentech, Janssen Pharmaceutical, Lead Pharma, Medivir, Novartis, NovoNordisk, Syngenta, Vernalis

WeNMR
NMR Calculation web portals

Macromolecular NMR pipeline
[Figure: pipeline stages. Labels: NMR processing; Analysis; Assignment; Structure generation, Dynamics, Interactions; Validation]
● NMR experiments give atomic-level, qualitatively diverse information
● No direct mathematical relationship between final model and original measurements
● Peak-atom mapping (‘assignment’) is ‘puzzle solving’
    ● Redone for each sample group, semi-ambiguous
● Data heterogeneous and extremely complex
● Workflow often branched or recursive

NMR software
● Many programs, limited resources
● Programs often written by a single person,
    ● who has since left or become professor
    ● and is thus unavailable for maintenance
● Each program allows a different subset of the possible data
● Different data formats, with different dialects
● Different atom naming systems, with local variants
[Figure: "Native Disorganisation" (Task1, Task2, Task3 linked by Convert steps) versus "CCPN Data Standard" (Task1, Task2, Task3 linked through the data standard)]

CCPN data standard
● Program-independent data storage and exchange
● Final and intermediate data for all relevant programs
● Keyword-value for program-specific parameters
● Precise, detailed, comprehensive, coherent
● Object-oriented, normalized, for changing data
● UML model as source for automatic code generation
● Data access libraries for Python, Java, C; XML and SQL storage. 800 000 lines/implementation
● BIG!
● 531 classes, 135 data type definitions, 2044 attributes, 1790 roles

Molecule and NMR data (20% of classes)
[Figure: UML class diagram covering the ccp.molecule packages (ChemComp, Molecule, MolSystem, MolStructure) and the ccp.nmr.Nmr package]

Data Interoperability
● CCPN has standard representation (our first task!)
● Writing is straightforward
● Reading requires disambiguation
    ● User interaction
    ● Heuristics
    ● Known starting point (WMS)
■ Not all programs can deal with all problems
■ Programs are generally large(ish) loosely coupled batch jobs

Workflow Management Goals
● Standardized interface to WeNMR portals
    ● Application-independent data selection
    ● Standardized task submission and result gathering
    ● Submit to multiple programs from single task
    ● High-quality modern interface
    ● Easy to modify and customize calculation protocols
● Seamless, invisible data flow
    ● Automatic conversion to/from program-specific formats
    ● Start and end on precisely defined CCPN data
● Combine tasks into workflows

Data Management Goals
● Central data store, with access control
● Track jobs and data flow
● NMR analysis is rarely linear
    ● Alternative jobs from single starting point
    ● Run – modify – re-run
● Identified as desirable also for management of non-Grid data
    ● Group/department data store

WMS – End User Local Platform
● WMS is a web-based end user platform for accessing web-based services and executing workflows
● Development of the Extend-NMR project
● Accesses services through adaptor modules
● Also direct access from desktop CcpNmr Analysis

WMS – Architecture
[Figure: architecture diagram. Labels: Web Client, Standardised interface (Java / GWT); Server (Java / Hibernate); Database (Postgres); Bioinformatics web services; Taverna Remote Execution Server; WeNMR web portals; Local installation; Web Service Wrapper (Java / CGI); WSDL; Program files; Data Adapter (Python); CCPN data; CcpNmr Analysis Desktop (Python)]

Data handling
● Data sets are large (XML 100’s of Mb)
    ● Thin summary sent to client for selection
    ● Data kept on server
● Standard CCPN data storage
    ● Task data in separate package (NmrCalc)
● Transferred as tarred, zipped CCPN data sets
    ● Hidden ports.
    ● Transfer one zipped data bundle, one JSON parameter file

Data Flow
[Figure: data flow diagram. Labels: Generation; Protocol specification; UI generator; User Interface Specification; Parameter definitions; NmrCalc : Program map; Data Flow; Client UI; Data selection; Generated from selection; Parameters; Data; TASK; Input Converter; NmrCalc; Launch calculation (GRID / LOCAL); Output Converter]

Protocol and interface specification
● Multiple interfaces, different programs and protocols
    ● Group and user specific
● User and program interfaces abstracted out
    ● Driven by specification file
● Custom-made widget for each data type
    ● structures, peak lists, shift lists …
● Layout specification as part of protocol specification
● Layout editor planned
    ● Scaled back due to developer illness

WMS Protocol Edit
[Screenshot: View / edit protocol specification]

WMS Constraints select
[Screenshot]

WMS Peaklist Select
[Screenshot]

Interface Implementation
● Requirements analysis emphasized modern toolset, drag-and-drop, …
    ● Hence choice of GWT
● Fast, customizable interface generation proved more important
● Browser interface is a single, complex HTML page with varying content.
● GWT less useful in practice
● Simpler system might have been preferable

Taverna
● Intention was full integration with Taverna
    ● Generate workflow specifications with SCUFL2 API
    ● But SCUFL2 API not ready and available in practice
● Hand-built workflow manager used for normal tasks
● Taverna used for workflows
    ● Workflows written as t2flow files using Taverna Workbench
● Problems passing complex objects and large binary blocks in Taverna protocol
    ● Keep data bundles on server and pass only a handle

Status
● Prototype for alpha testing
    ● End October project deliverable
● Submit four different structure generation tasks to WeNMR portals from a single input selection
● Validation task
● Separate facility for Taverna Workflows
● Lost most of a man-year (out of three) due to lead developer illness.

WMS Home page
[Screenshot]

WMS Submit page
[Screenshot: Submit task or workflow for project snapshot]

END

NmrCalc
[Figure: UML class diagram of the NmrCalc package. Classes: NmrCalcStore, Run, RunIo, RunParameter, ParameterGroup, Data, MeasurementListData, StructureEnsembleData. Linked classes from other packages: ccp.general.Template.MultiTypeValue, ccp.nmr.Nmr.AbstractMeasurementList, ccp.nmr.Nmr.AbstractMeasurement, ccp.molecule.MolStructure.StructureEnsemble, ccp.molecule.MolStructure.Model. Attributes shown include textValue, intValue, floatValue, booleanValue, name, code, ioRole (default input), serial, details, measurementListSerial, molSystemCode, ensembleId, modelSerials; Run has inputs and outputs links.]
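
The NmrCalc class names listed above (Run, RunParameter, MeasurementListData, StructureEnsembleData, with an ioRole that defaults to input) suggest that one calculation is recorded as a run holding typed inputs and outputs that point back into the CCPN data. The following is a minimal Python sketch of that idea, not the generated CCPN API: the dataclasses are simplified stand-ins, the assignment of attributes to classes is an assumption read off the diagram labels, and the helper `parameters_as_json`, the parameter name `numModels`, and the JSON layout are hypothetical.

```python
import json
from dataclasses import dataclass, field
from typing import List, Union

# Illustrative stand-ins only; these are NOT the CCPN-generated classes.


@dataclass
class RunParameter:
    """A named value attached to a run (ioRole defaults to input)."""
    name: str
    value: Union[str, int, float, bool]
    io_role: str = "input"


@dataclass
class MeasurementListData:
    """Pointer to a measurement list (e.g. a shift list) by its serial."""
    name: str
    measurement_list_serial: int
    io_role: str = "input"


@dataclass
class StructureEnsembleData:
    """Pointer to a structure ensemble and a selection of its models."""
    name: str
    mol_system_code: str
    ensemble_id: int
    model_serials: List[int] = field(default_factory=list)
    io_role: str = "output"


RunItem = Union[RunParameter, MeasurementListData, StructureEnsembleData]


@dataclass
class Run:
    """One calculation: a serial, free-text details, and its data items."""
    serial: int
    details: str = ""
    items: List[RunItem] = field(default_factory=list)

    @property
    def inputs(self) -> List[RunItem]:
        return [item for item in self.items if item.io_role == "input"]

    @property
    def outputs(self) -> List[RunItem]:
        return [item for item in self.items if item.io_role == "output"]


def parameters_as_json(run: Run) -> str:
    """Serialize the plain input parameters of a run (hypothetical layout)."""
    params = {
        item.name: item.value
        for item in run.inputs
        if isinstance(item, RunParameter)
    }
    return json.dumps({"run": run.serial, "parameters": params}, sort_keys=True)


# Usage: one run with a shift list and a parameter as input,
# and a structure ensemble as output.
run = Run(
    serial=1,
    details="structure generation",
    items=[
        MeasurementListData("shifts", measurement_list_serial=2),
        RunParameter("numModels", 20),  # hypothetical parameter name
        StructureEnsembleData("result", "MS1", ensemble_id=1, model_serials=[1, 2]),
    ],
)
print(parameters_as_json(run))  # {"parameters": {"numModels": 20}, "run": 1}
```

Keeping only serials and codes in the run record, rather than the data itself, matches the slides' point that large data sets stay on the server and only a bundle handle plus a small parameter file travel with a task.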