2012SepEgiFogh

A Workflow Manager for NMR:
objectives, design, and implementation
● Introduction
● CCPN and WeNMR
● NMR data and software
● WMS Workflow Management System
● Goals
● Implementation
● Status
CCPN
● Collaborative Computing Project for NMR
(Nuclear Magnetic Resonance)
● Funded by BBSRC since 1999
● WeNMR project partner
● Goals:
● Unifying platform for NMR software
● Community-based, open-source
software development
● Outreach: meetings and courses
CCPN Credits
■ University of Cambridge
● John Ionides (WMS lead)
● Rasmus Fogh (WMS)
● Wayne Boucher
● Tim Stevens
● Magnus Helgstrand
● Prof. Ernest Laue
■ University of Leicester
● Prof. Geerten Vuister
■ EBI (PDBe), Hinxton
● Pieter Hendrickx
● Aleksandras Gutmanas
● Gerard Kleywegt
Funding:
● EU FP7 (WeNMR partner)
● BBSRC (CCPN grant)
● AstraZeneca, Genentech, Janssen
Pharmaceutical, Lead Pharma, Medivir,
Novartis, NovoNordisk, Syngenta, Vernalis
WeNMR
NMR Calculation
web portals
Macromolecular NMR pipeline
[Figure: pipeline stages: NMR processing; Analysis; Assignment; Structure generation, Dynamics, Interactions; Validation]
● NMR experiments give atomic-level, qualitatively diverse
information
● No direct mathematical relationship between final model
and original measurements
● Peak-atom mapping (‘assignment’) is ‘puzzle solving’
● Redone for each sample group, semi-ambiguous
● Data heterogeneous and extremely complex
● Workflow often branched or recursive
NMR software
● Many programs - limited resources
● Programs often written by a single person, who has since left or become a professor and is thus unavailable for maintenance
● Each program allows different subset of possible data
● Different data formats, with different dialects
● Different atom naming systems, with local variants
[Figure: two panels. "Native Disorganisation": Task1, Task2 and Task3 connected through separate Convert steps. "CCPN Data Standard": Task1, Task2 and Task3 all connected to a single Data Standard.]
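The contrast between the two layouts can be made concrete by counting converters. The following is a minimal sketch, assuming that every program must exchange data with every other program and that each direction needs its own converter. The function names are illustrative and not part of CCPN.

```python
def pairwise_converters(n_programs: int) -> int:
    """Converters needed when each program converts directly to every other one."""
    return n_programs * (n_programs - 1)


def hub_converters(n_programs: int) -> int:
    """Converters needed when each program only reads from and writes to one standard."""
    return 2 * n_programs


# Compare the two layouts for a growing number of programs.
for n in (3, 10, 30):
    print(n, pairwise_converters(n), hub_converters(n))
```

Under these assumptions the two layouts break even at three programs, and the shared standard needs far fewer converters as the number of programs grows.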
CCPN data standard
● Program-independent data storage and exchange
● Final and intermediate data for all relevant programs
● Keyword-value for program-specific parameters
● Precise, detailed, comprehensive, coherent
● Object-oriented, normalized, for changing data
● UML model as source for automatic code generation
● Data access libraries for Python, Java, C; XML and SQL storage. 800 000 lines/implementation
● BIG!
● 531 classes, 135 data type definitions,
2044 attributes, 1790 roles
Molecule and NMR data (20% of classes)
[Figure: UML class diagram covering the packages ccp.molecule.ChemComp, ccp.molecule.Molecule, ccp.molecule.MolSystem, ccp.molecule.MolStructure and ccp.nmr.Nmr, with attributes, accessor methods and association multiplicities for each class. Classes shown include ChemComp, ChemCompVar, ChemAtom, ChemBond, Molecule, MolResidue, MolSystem, Chain, Residue, Atom, StructureEnsemble, Model, Coord, NmrProject, Experiment, ExpDim, DataSource, FreqDataDim, Peak, PeakDim, PeakContrib, Resonance, ResonanceSet, ResonanceGroup, Shift, ShiftList, ShiftReference, and the measurement and derived-data lists (T1List, T2List, T1RhoList, NoeList, RdcList, JCouplingList, HExchRateList, PkaList, IsotropicS2List, SpectralDensityList).]
Data Interoperability
● CCPN has standard representation (our first task!)
● Writing is straightforward
● Reading requires disambiguation
● User interaction
● Heuristics
● Known starting point (WMS)
■ Not all programs can deal with all problems
■ Programs are generally large(ish) loosely coupled batch jobs
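The reading problem above can be illustrated with atom names. This is a minimal sketch, assuming two hypothetical naming systems that differ only in the name of the backbone amide proton. The table, the system names and the function are illustrative and are not taken from the CCPN libraries.

```python
# Illustrative table only: a few backbone atom names in two hypothetical naming systems.
NAMING_SYSTEMS = {
    "systemA": {"H", "N", "CA", "HA", "C", "O"},
    "systemB": {"HN", "N", "CA", "HA", "C", "O"},
}


def candidate_systems(atom_names, known_start=None):
    """Return the naming systems consistent with all observed atom names.

    If the starting point is known (for example because the same system wrote
    the input files), no guessing is needed.
    """
    if known_start is not None:
        return [known_start]
    names = set(atom_names)
    return sorted(s for s, valid in NAMING_SYSTEMS.items() if names <= valid)


print(candidate_systems(["HN", "CA"]))   # one candidate: a heuristic suffices
print(candidate_systems(["N", "CA"]))    # two candidates: ask the user
```

One candidate means a heuristic can decide, several candidates call for user interaction, and a known starting point removes the question altogether.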
Workflow Management Goals
● Standardized interface to WeNMR portals
● Application-independent data selection
● Standardized task submission and result gathering
● Submit to multiple programs from single task
● High-quality modern interface
● Easy to modify and customize calculation protocols
● Seamless, invisible data flow
● Automatic conversion to/from program-specific formats
● Start and end on precisely defined CCPN data
● Combine tasks into workflows
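The goals of submitting one selection to several programs, with conversion hidden from the user, can be sketched as an adapter pattern. This is a toy illustration under stated assumptions: the `Adapter` type, the program names and the string formats are all hypothetical and do not describe the actual WMS code.

```python
from typing import Callable, Dict, NamedTuple


class Adapter(NamedTuple):
    """Hypothetical per-program adapter: convert in, run, convert back."""
    to_native: Callable[[dict], str]
    run: Callable[[str], str]
    from_native: Callable[[str], dict]


def submit(selection: dict, adapters: Dict[str, Adapter]) -> Dict[str, dict]:
    """Send one standard-format selection to every program and gather standard-format results."""
    results = {}
    for program, adapter in adapters.items():
        native_in = adapter.to_native(selection)           # standard -> program format
        native_out = adapter.run(native_in)                # program-specific calculation
        results[program] = adapter.from_native(native_out)  # program format -> standard
    return results


# Two toy "programs" with different native formats.
ADAPTERS = {
    "programX": Adapter(
        to_native=lambda sel: ";".join(sel["peakLists"]),
        run=lambda text: text.upper(),
        from_native=lambda text: {"structures": text.split(";")},
    ),
    "programY": Adapter(
        to_native=lambda sel: "\n".join(sel["peakLists"]),
        run=lambda text: text[::-1],
        from_native=lambda text: {"structures": text.split("\n")},
    ),
}

print(submit({"peakLists": ["a", "b"]}, ADAPTERS))
```

The caller only ever sees the standard representation; each program's format stays inside its adapter.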
Data Management Goals
● Central data store, with access control
● Track jobs and data flow
● NMR analysis is rarely linear
● Alternative jobs from single starting point
● Run – modify – re-run
● Identified as desirable also for management of
non-Grid data
● Group/department data store
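Tracking non-linear analysis amounts to recording each job together with the job or snapshot it started from. A minimal sketch, assuming a simple in-memory tree; the `Job` class and its method names are illustrative, not the WMS data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Job:
    """One node in the job history; several jobs may branch from the same parent."""
    name: str
    parent: Optional["Job"] = field(default=None, repr=False)
    children: List["Job"] = field(default_factory=list)

    def branch(self, name: str) -> "Job":
        """Start an alternative job from this starting point."""
        child = Job(name, parent=self)
        self.children.append(child)
        return child

    def lineage(self) -> List[str]:
        """Names from the original starting point down to this job."""
        node, path = self, []
        while node is not None:
            path.append(node.name)
            node = node.parent
        return path[::-1]


snapshot = Job("snapshot")
run1 = snapshot.branch("run1")
run2 = snapshot.branch("run2")       # alternative job from the same starting point
rerun = run1.branch("run1-modified")  # run, modify, re-run
print(rerun.lineage())
```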
WMS – End User Local Platform
● WMS is a web-based end-user platform for accessing web-based services and executing workflows
● Development of the Extend-NMR project
● Accesses services through adaptor modules
● Also direct access from desktop CcpNmr Analysis
WMS – Architecture
[Figure: architecture diagram. Components shown: Web Client with standardised interface (Java / GWT); Server (Java / Hibernate); Database (Postgres); Bioinformatics web services; Taverna; Remote Execution Server; WeNMR web portals; Local installation; Web Service Wrapper (Java / CGI) with WSDL; Program files; Data Adapter (Python); CCPN data; CcpNmr Analysis Desktop (Python).]
Data handling
● Data sets are large (XML, hundreds of MB)
● Thin summary sent to client for selection
● Data kept on server
● Standard CCPN data storage
● Task data in separate package (NmrCalc)
● Transferred as tarred, zipped CCPN data sets
● Hidden ports.
● Transfer one zipped data bundle, one JSON parameter file
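The transfer unit described above, one zipped data bundle plus one JSON parameter file, can be sketched with the Python standard library. The file names `data.tgz` and `parameters.json` and both function names are assumptions made for illustration; the slides do not specify them.

```python
import json
import tarfile
from pathlib import Path


def pack_task(project_dir: Path, parameters: dict, out_dir: Path):
    """Write one gzipped tar bundle of the project directory and one JSON parameter file."""
    bundle = out_dir / "data.tgz"            # hypothetical file name
    with tarfile.open(str(bundle), "w:gz") as tar:
        tar.add(str(project_dir), arcname=project_dir.name)
    params = out_dir / "parameters.json"     # hypothetical file name
    params.write_text(json.dumps(parameters, indent=2))
    return bundle, params


def unpack_task(bundle: Path, params: Path, work_dir: Path) -> dict:
    """Unpack the bundle into work_dir and return the parameters as a dict."""
    with tarfile.open(str(bundle), "r:gz") as tar:
        tar.extractall(str(work_dir))
    return json.loads(params.read_text())
```

Keeping the parameters outside the archive means a receiving service can inspect them without unpacking a large bundle.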
Data Flow
[Figure: data flow diagram. Generation: Protocol specification, UI generator, User Interface Specification, Parameter definitions, NmrCalc : Program map. Data flow: Client UI, Data selection, Input Converter, NmrCalc, Launch, GRID / LOCAL calculation, Output Converter. A TASK combines Parameters and Data generated from selection.]
Protocol and interface specification
● Multiple interfaces, different programs and protocols
● Group and user specific
● User and program interfaces abstracted out
● Driven by specification file
● Custom-made widget for each data type
● structures, peak lists, shift lists …
● Layout specification as part of protocol specification
● Layout editor planned
● Scaled back due to developer illness
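A specification-driven interface of the kind described above can be sketched as a lookup from data type to widget. The specification layout, the widget names and `build_form` are hypothetical; the actual WMS specification format is not shown in these slides.

```python
# Hypothetical protocol specification, standing in for a specification file.
PROTOCOL_SPEC = {
    "name": "example-structure-calculation",
    "inputs": [
        {"code": "shifts", "dataType": "shiftList"},
        {"code": "peaks", "dataType": "peakList"},
        {"code": "nModels", "dataType": "int", "default": 20},
    ],
}

# One widget per data type (widget names are illustrative).
WIDGETS = {
    "shiftList": "ShiftListSelector",
    "peakList": "PeakListSelector",
    "structure": "StructureSelector",
    "int": "IntegerField",
}


def build_form(spec):
    """Return (input code, widget name, default) rows in the order given by the specification."""
    rows = []
    for item in spec["inputs"]:
        widget = WIDGETS[item["dataType"]]  # a KeyError flags an unsupported data type
        rows.append((item["code"], widget, item.get("default")))
    return rows


for row in build_form(PROTOCOL_SPEC):
    print(row)
```

Adding a protocol then means writing a new specification rather than new interface code, as long as its data types already have widgets.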
WMS Protocol Edit
View / edit protocol specification
WMS Constraints select
WMS Peaklist Select
Interface Implementation
● Requirements analysis emphasized modern toolset,
drag-and-drop, …
● Hence choice of GWT
● Fast, customizable interface generation
proved more important
● Browser interface is a single, complex HTML page
with varying content.
● GWT less useful in practice
● Simpler system might have been preferable
Taverna
● Intention was full integration with Taverna
● Generate workflow specifications with SCUFL2 API
● But SCUFL2 API not ready and available in practice
● Hand-built workflow manager used for normal tasks
● Taverna used for workflows
● Workflows written as t2flow files using Taverna Workbench
● Problems passing complex objects and large binary
blocks in Taverna protocol
● Keep data bundles on server and pass only handle
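Passing a handle instead of the data itself can be sketched as follows. This is an in-memory illustration only; `BundleStore` and `workflow_step` are hypothetical names and the real server-side storage is not described in the slides.

```python
import uuid


class BundleStore:
    """Server-side store: bundles stay here, workflows carry only short string handles."""

    def __init__(self):
        self._bundles = {}

    def put(self, data: bytes) -> str:
        """Store a bundle and return an opaque handle for it."""
        handle = uuid.uuid4().hex
        self._bundles[handle] = data
        return handle

    def get(self, handle: str) -> bytes:
        """Look up the bundle behind a handle."""
        return self._bundles[handle]


def workflow_step(store: BundleStore, handle: str) -> str:
    """A step receives a handle, works on the bundle server-side, and returns a new handle."""
    data = store.get(handle)
    return store.put(data + b" processed")


store = BundleStore()
first = store.put(b"large binary bundle")
second = workflow_step(store, first)
print(len(second), store.get(second))
```

Only the 32-character handle crosses the workflow engine, so the size and structure of the bundle no longer matter to the workflow protocol.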
Status
● Prototype for alpha testing
● End October project deliverable
● Submit four different structure generation tasks to
WeNMR portals from a single input selection
● Validation task
● Separate facility for Taverna Workflows
● Lost most of a man-year (out of three)
due to lead developer illness.
WMS Home page
WMS Submit page
Submit task or workflow for project snapshot
END
[Figure: UML class diagram of the NmrCalc package. Classes shown: NmrCalcStore, Run (with inputs and outputs), RunIo, RunParameter, ParameterGroup, Data, MeasurementListData, StructureEnsembleData and ccp.general.Template.MultiTypeValue, with links to ccp.nmr.Nmr.AbstractMeasurementList, ccp.nmr.Nmr.AbstractMeasurement, ccp.molecule.MolStructure.StructureEnsemble and ccp.molecule.MolStructure.Model.]