Supplementary Material
MAGIC-SPP: a database-driven DNA sequence processing package with
associated management tools
Chun Liang (1), Feng Sun (1), Haiming Wang (1), Junfeng Qu (2), Robert M. Freeman, Jr. (3),
Lee H. Pratt (1) and Marie-Michèle Cordonnier-Pratt (1)
(1) Laboratory for Genomics and Bioinformatics, Department of Plant Biology,
University of Georgia, Athens, GA 30602, USA
(2) Department of Computer Science, University of Georgia, Athens, GA 30602, USA
(3) Department of Systems Biology, Harvard Medical School, Boston, MA 02115
14 February 2006
[Figure 1 diagram labels: Trace Files; MAGIC-SPP; MAGIC POLYMORPHISM; MAGIC-CLUSTER;
MAGIC ANNOTATION; MAGIC MICROARRAY]
Figure 1: Relationship between MAGIC-SPP and other components of the
complete MAGIC system. Information about sequence reads obtained from
trace files processed through MAGIC-SPP is stored in MAGIC DB. While
MAGIC-SPP can be used as a stand-alone package, it also provides input for
four other major subsystems: annotation, EST clustering, detection of
sequence polymorphisms, and MIAME-compliant microarrays. These
downstream subsystems have been designed to benefit from features unique
to MAGIC-SPP, for example the inclusion of genotype information in
sequence names.
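To illustrate how genotype information embedded in a sequence name could be recovered by a downstream tool, the following minimal sketch splits a name into fixed-width parts. The part layout, the field names, and the example name are all hypothetical; the actual MAGIC-SPP naming format is defined per library and is not reproduced here.

```python
from typing import NamedTuple


class NamePart(NamedTuple):
    """One fixed-width field of a sequence name (hypothetical layout)."""
    name: str
    length: int


# Hypothetical layout, for illustration only: library, genotype code,
# plate block number, well, and vector orientation.
EXAMPLE_LAYOUT = [
    NamePart("library", 4),
    NamePart("genotype", 2),
    NamePart("block", 3),
    NamePart("well", 3),
    NamePart("orientation", 1),
]


def parse_sequence_name(seq_name: str, layout: list[NamePart]) -> dict[str, str]:
    """Split seq_name into named fields according to a fixed-width layout."""
    expected = sum(part.length for part in layout)
    if len(seq_name) != expected:
        raise ValueError(f"expected {expected} characters, got {len(seq_name)}")
    fields = {}
    pos = 0
    for part in layout:
        fields[part.name] = seq_name[pos:pos + part.length]
        pos += part.length
    return fields


# "LIB1G1001A01f" is an invented name that matches the hypothetical layout.
parsed = parse_sequence_name("LIB1G1001A01f", EXAMPLE_LAYOUT)
```

With a layout of this kind, a downstream subsystem could read the genotype code directly from `parsed["genotype"]` without a separate database lookup.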
[Figure 2 diagram. Project: MAGIC_SPP (Figure 041216); Author: Lee Pratt; Company: University of
Georgia; Version: 1.0; Modified: 2004/12/17; Copyright (c) 2004 University of Georgia.]
[Tables shown in the diagram: V2_CONTACT, V2_LAB_PI, V2_LAB, V2_USER, V2_LIB_SEQ_NAME_FORMAT_PART,
V2_FIELD_VALUE_DEFINITION, V2_TABLE_LIST, V2_ORACLE_SEQ_LIST, V2_TEMPLATE_PLATE,
V2_FIELD_VALUE_ASSOCIATION, V2_ALIAS_ENTRY_NAME, V2_FIELD_DEFINITION, V2_LIBRARY_USER_PERMISSION,
V2_ALIAS_PLATE, V2_TEMPLATE_WELL, V2_CONFIG_PLATE, V2_BACTERIA_PLATE_CONFIG, V2_BACTERIA_PLATE384,
V2_BUFFER_COMPOSITION, V2_LIMS_TO_UNIT_NUM, V2_TH_INSTRUMENT, V2_CONFIG_WELL,
V2_LIBRARY_CATEGORY_PERMISSION, V2_SEQ_PROJECT_LAB, V2_SEQ_PROJECT, V2_VECTOR, V2_PRIMER_VECTOR,
V2_MASTER_MIX_PREP, V2_PRIMER_INSERT, V2_SEQUENCING_PLATE, V2_SEQUENCING_UNIT_PLATE96,
V2_SEQUENCER_RUN_PROFILE_ASSO, V2_SEQ_RRNA_SSAHA, V2_PUBLICATION, V2_GENOTYPE,
V2_LIBRARY_GENOTYPE_RATIOS, V2_SAME_GT_CLONE, V2_PLATE96_SEQUENCING_NAME, V2_UPLOAD,
V2_WELLPLATE96_SEQUENCING_NAME, V2_SEQ_QVS_ADD_INFO, V2_CLONE_SEQ_ASSOCIATION, V2_GENUS_SCREEN_DEF,
V2_ECOLI_SCREEN_PROCESS, V2_MITO_SCREEN_PROCESS, V2_QUAL_PROCESS_DEF, V2_QUAL_PROCESS_UPDATE,
V2_SEQ_POLYT_SEGMENT, V2_MAGIC_SEQUENCE, V2_COMMENTS, V2_SEQ_BASE_QUALITY, V2_GENOTYPE_GEN,
V2_SEQ_QUAL_STATS, V2_GENOTYPE_ASSOCIATION, V2_SEQUENCE_GENBANK, V2_CHROMAT, V2_SEQ_VECTOR_SCREEN,
V2_SEQ_QVS_MULTI_ATGC, V2_ALIAS_PRIMER_MAPPING, V2_CLONE, V2_SEQUENCER_RUN_PROFILE,
V2_SEQ_ADAPTOR_SEGMENT, V2_SPECIES, V2_PROCESS_PLATE96_DIR_CONTROL, V2_GT_COMBINE_CODE, V2_PRIMER,
V2_LIBRARY_PRIMER, V2_PROCESS_PLATE96_PROFILE_GB, V2_SEQUENCER_INSTRUMENT, V2_LIBRARY,
V2_TH_CLEAN_UNIT_PLATE96, V2_GENUS, V2_ALIAS_LIBRARY, V2_TH_CLEAN_PLATE, V2_LIBRARY_VECTOR,
V2_SAME_GT_CLONE_IN_LIB, V2_TH_UNIT_PLATE96, V2_TH_PLATE, V2_SEQ_PROJECT_LIB, V2_UNIT_NUM_SERIES,
V2_UNIT_NUM_LOG, V2_CATEGORY, V2_BACTERIA_PLATE96, V2_DNA_PREP_PLATE96, V2_LIB_SEQ_NAME_FORMAT,
V2_RRNA_SCREEN_PROCESS, V2_CHLORO_SCREEN_PROCESS, V2_SEQ_GAP_10PLUS_SCREEN, V2_SEQ_GAP_5_9_SCREEN,
V2_SEQ_MITO_SSAHA, V2_SEQ_POLYT_GAP, V2_SEQ_ADAPTOR_SCREEN, V2_SEQ_ECOLI_SSAHA,
V2_SEQ_CHLORO_SSAHA, V2_SEQ_RRNA_SCREEN, V2_SEQ_CHLORO_SCREEN, V2_SEQ_MITO_SCREEN,
V2_SEQ_ECOLI_SCREEN, V2_PROGRAM, V2_SEQ_POLYT_SCREEN]
Figure 2: Entity-Relationship diagram of the MAGIC-SPP database. To see
detail, use the ‘Zoom To…’ function in the Adobe Acrobat View menu.
[Figure 2 diagram, continued: columns PROGRAM_ID, PROGRAM_GRIC, PROGRAM_NAME, PROGRAM_VERSION,
PROGRAM_FILENAME, PROGRAM_DESCRIPTION, DATE_START, COMMENTS, RELEASE_DATE, PROGRAM_PARAMETERS
(presumably of the V2_PROGRAM table)]
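In the Figure 2 diagram, the screening tables mark SEQ_NAME and QUAL_PROCESS_ID as foreign keys. The following minimal sketch shows how such a composite key could tie a screening result to a sequence record. It assumes that the pair references V2_MAGIC_SEQUENCE, which cannot be confirmed from the flattened diagram. The column types are guesses, only a few columns are included, and SQLite is used purely as a stand-in for the production database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# Reduced, assumed versions of two tables named in the diagram.
conn.executescript("""
CREATE TABLE V2_MAGIC_SEQUENCE (
    SEQ_NAME        TEXT    NOT NULL,
    QUAL_PROCESS_ID INTEGER NOT NULL,
    PRIMARY KEY (SEQ_NAME, QUAL_PROCESS_ID)
);
CREATE TABLE V2_SEQ_POLYT_SCREEN (
    SEQ_NAME        TEXT    NOT NULL,
    QUAL_PROCESS_ID INTEGER NOT NULL,
    POLYT_START     INTEGER,
    POLYT_STOP      INTEGER,
    POLYT_LEN       INTEGER,
    FOREIGN KEY (SEQ_NAME, QUAL_PROCESS_ID)
        REFERENCES V2_MAGIC_SEQUENCE (SEQ_NAME, QUAL_PROCESS_ID)
);
""")

# Invented example rows: one read and its poly(T) screening result.
conn.execute("INSERT INTO V2_MAGIC_SEQUENCE VALUES (?, ?)", ("example_read_1", 1))
conn.execute(
    "INSERT INTO V2_SEQ_POLYT_SCREEN VALUES (?, ?, ?, ?, ?)",
    ("example_read_1", 1, 10, 34, 25),
)


def insert_orphan(connection: sqlite3.Connection) -> bool:
    """Try to store a screening row for an unknown read; return True if accepted."""
    try:
        connection.execute(
            "INSERT INTO V2_SEQ_POLYT_SCREEN VALUES (?, ?, ?, ?, ?)",
            ("unknown_read", 1, 1, 20, 20),
        )
    except sqlite3.IntegrityError:
        return False
    return True


orphan_accepted = insert_orphan(conn)

# Join a read to its screening result through the composite key.
rows = conn.execute("""
    SELECT s.SEQ_NAME, p.POLYT_LEN
    FROM V2_MAGIC_SEQUENCE AS s
    JOIN V2_SEQ_POLYT_SCREEN AS p
      ON p.SEQ_NAME = s.SEQ_NAME AND p.QUAL_PROCESS_ID = s.QUAL_PROCESS_ID
""").fetchall()
```

Keying results on both the sequence name and the quality-process identifier would allow the same read to be reprocessed under a new quality procedure without overwriting earlier results. This is an interpretation of the diagram, not a documented design statement.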
Figure 3: A screen shot of MAGIC Admin. One of the library definition screens is illustrated here.
Note the many pages available for entry not only of contact and publication information, but also
of primer, vector and genotype definitions. The sequencing primers and vector used with this
library, together with all required information as described in the text, are entered using pages
selected from the options along the left-hand border. These pages also permit input of target
information for each sequencing project and specify user permissions with respect to both entry and
viewing of data. Among other functions, the Contact Information page facilitates creation of user
accounts.
Figure 4: A screen shot of MAGIC SEQ-LIMS. This interface tracks all steps in DNA sequencing
from picking of bacterial colonies to upload of trace files into the analysis server.
Figure 5: A screen shot of the MAGIC-SPP Sequence Statistics interface. This interface provides
sequencing performance statistics for one or more plates in a selected library. In this view, all plates
were selected and are shown both collectively (above red line) and individually (below red line).
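The collective and per-plate views described above can be sketched as a simple aggregation. This is only an illustration. The plate names and read lengths are invented, and the success criterion (a trimmed read length of at least 100 bases) is an assumption, not the criterion used by MAGIC-SPP.

```python
def percent_success(read_lengths: list[int], min_length: int = 100) -> float:
    """Percentage of reads whose trimmed length reaches min_length (assumed cutoff)."""
    if not read_lengths:
        return 0.0
    good = sum(1 for length in read_lengths if length >= min_length)
    return 100.0 * good / len(read_lengths)


# Invented trimmed read lengths per plate; 0 stands for a failed read.
plates = {
    "plate_001": [450, 0, 620, 98],
    "plate_002": [510, 530, 0, 0, 300, 120],
}

# Individual view: one value per plate.
per_plate = {name: percent_success(lengths) for name, lengths in plates.items()}

# Collective view: all reads from the selected plates pooled together.
all_reads = [length for lengths in plates.values() for length in lengths]
collective = percent_success(all_reads)
```

Pooling the reads before computing the collective figure weights each plate by its number of reads, which differs from averaging the per-plate percentages when plates contain different numbers of reads.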
Figure 6: A screen shot of the MAGIC-SPP Status Report interface. Target information, including number
of colonies to be picked and number of sequences to be obtained, is entered through MAGIC Admin
(columns 1, 2, 5 and 6 from the left). Other columns report progress. The query page for this interface also
permits retrieving and viewing only data entered between any two dates.
Figure 7: A screen shot of MAGIC-SPP Plate Viewer. This
interface assists with troubleshooting by showing sequences from
both vector directions for the same plate. For example, in this
illustration wells A1, A10, A12, C11, and E10 have all failed in both
forward (top) and reverse (bottom) sequencing directions, indicating
that these failures are most likely due to missing or poor-quality
template DNA. Numbers within wells represent Q16VS read lengths.
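The troubleshooting logic in this caption can be expressed as a small sketch: a well that fails in both directions points to a template problem, whereas a failure in only one direction points elsewhere. The read lengths below are invented, and the failure cutoff of 100 bases is an assumption rather than a documented MAGIC-SPP setting.

```python
def classify_failures(
    forward: dict[str, int],
    reverse: dict[str, int],
    min_length: int = 100,
) -> dict[str, list[str]]:
    """Group wells by the sequencing direction(s) in which the read failed.

    A read counts as failed when its length is below min_length (assumed
    cutoff); a well missing from one mapping is treated as length 0.
    """
    result = {"both": [], "forward_only": [], "reverse_only": []}
    for well in sorted(set(forward) | set(reverse)):
        f_failed = forward.get(well, 0) < min_length
        r_failed = reverse.get(well, 0) < min_length
        if f_failed and r_failed:
            result["both"].append(well)
        elif f_failed:
            result["forward_only"].append(well)
        elif r_failed:
            result["reverse_only"].append(well)
    return result


# Invented read lengths for a handful of wells on one plate.
forward_reads = {"A1": 0, "A10": 12, "A12": 0, "B5": 40, "C11": 0, "E10": 5, "F7": 610}
reverse_reads = {"A1": 0, "A10": 0, "A12": 30, "B5": 580, "C11": 0, "E10": 0, "F7": 590}

failures = classify_failures(forward_reads, reverse_reads)
```

In this invented data set, the wells in `failures["both"]` would be candidates for missing or poor-quality template DNA, while well B5 failed only in the forward direction and so suggests a problem specific to that sequencing reaction.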