HAD Spreadsheet Data Format

● Column A: Primary GI. The primary GI found in the SFLD for a given unique sequence.
● Column B: All GIs. All the GIs with sequences identical to the Primary GI.
● Column C: EFD ID. Internal SFLD identifier.
● Column D: Seq Length. The length of the sequence.
● Column E: URL to EFD ID. A clickable link that references the SFLD page for the appropriate enzyme.
● Column F: hasStructure. A binary list stating whether there are PDB structures associated with a node (useful for coloring, etc. in Cytoscape).
● Column G: PDBs. A list of all PDBs associated with a given sequence.
● Column H: isCharacterized. A binary list (Yes/No) stating whether this sequence has been functionally characterized. Sequences are identified as characterized from (a) SFLD evidence codes (CFM, IES) and (b) the presence of an entry in Swiss-Prot.
● Column I: FamilyAssignment. The function (SFLD family) this sequence has been assigned to. This column will not have the same number of entries as column H (isCharacterized), because it also includes sequences that human annotators have assigned to functional families in the SFLD and excludes Swiss-Prot entries not assigned to SFLD families.
● Column J: Swiss-Prot ID. The Swiss-Prot ID for the sequence, if it exists.
● Column K: isTarget. A binary list (Yes/No) indicating whether the sequence is or was a target of Lilly, P01 or EFI (LabDB).
● Column L: targetStatus. The current status of the Lilly/P01/EFI target.
● Column M: LillyTarget ID. The identifiers for the target(s) from Lilly.
● Column N: P01Target ID. The identifiers for the P01 targets.
● Column O: EFITarget ID. The identifiers for the target(s) from EFI.
  Note: Some targets didn't have exact matches in the SFLD HAD sequence set. In these cases, the percent ID for the match of the target sequence to the SFLD sequence is indicated in parentheses.
● Column P: LabDB Url. If the sequence is in LabDB, the link to that entry.
● Column Q: Type of life. As defined by the division record of the organisms from which the enzymes are derived.
● Column R: Species. A list of all species that correspond to the node.
● Column S: Taxon ID. If it exists, the taxonomy ID for the species listed in column R.
● Column T: Genome GI. The GI for the genomic DNA for this sequence.
● Column U: DNAAvailable. A binary list (Yes/No) that indicates whether any of the species in column R are in either MTDF or ATCC.
● Column V: inMTDF. A binary list that indicates whether any of the species in column R are in the MTDF list.
● Column W: inATCC. A binary list that indicates whether any of the species in column R are in the ATCC list.
● Column X: Subgroup. The SFLD subgroup that this sequence is assigned to.
● Column Y: Cytoscape ID. The Cytoscape identifier this sequence is associated with in the representative network.
● Column Z: Potential Targets. Yes indicates that all of the following are true: there are no structures for this sequence, the sequence is not experimentally characterized (as specified in column H), the sequence is not a target (as specified in column K), and the species is in MTDF and/or ATCC.
● Column AA: spAnnot. Annotation from Swiss-Prot.
● Column AB: nonHADPfamDom. The non-HAD-clan Pfam domain with the most significant match (within the gathering cutoff specified by Pfam) to the sequence. (Potentially useful for identifying multidomain proteins.)
● Column AC: goldStd. The gold standard protein represented by the sequence (each gold standard protein was mapped to the single closest match in the HAD SFLD superfamily set) and the percent ID for the match.
● Column AD: enzyme ID. Internal SFLD identifier.
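The Column Z rule combines several of the other columns, so it may help to see it written out. The following is a minimal sketch, not the code used to build the spreadsheet. It assumes each row has been loaded as a dictionary keyed by the column names above (for example via Python's `csv.DictReader` on an exported copy), and that every binary column, including hasStructure, holds the strings "Yes" or "No". The helper names `_yes` and `is_potential_target` are hypothetical.

```python
def _yes(value):
    """Return True when a cell holds 'Yes' (case and padding ignored).

    Assumption: binary columns are stored as the strings 'Yes'/'No'.
    """
    return str(value).strip().lower() == "yes"


def is_potential_target(row):
    """Apply the Column Z rule to one spreadsheet row (a dict).

    A sequence is a potential target when it has no structure, is not
    characterized, is not already a target, and its species is in
    MTDF and/or ATCC.
    """
    return (
        not _yes(row.get("hasStructure", ""))
        and not _yes(row.get("isCharacterized", ""))
        and not _yes(row.get("isTarget", ""))
        and (_yes(row.get("inMTDF", "")) or _yes(row.get("inATCC", "")))
    )


# Made-up example rows, for illustration only.
example_rows = [
    {"hasStructure": "No", "isCharacterized": "No", "isTarget": "No",
     "inMTDF": "Yes", "inATCC": "No"},
    {"hasStructure": "Yes", "isCharacterized": "No", "isTarget": "No",
     "inMTDF": "Yes", "inATCC": "Yes"},
]
flags = [is_potential_target(r) for r in example_rows]
```

Recomputing the flag this way and comparing it against the stored Potential Targets value is one way to check that an edited copy of the spreadsheet is still internally consistent.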
