A Field Guide, part 2
February 14, 2006
UT-Health Science Center
NCBI FieldGuide
National Center for Biotechnology Information

GenBank Records
The Flatfile Format
• Header
• Feature Table
• Sequence

A Typical GenBank Record
LOCUS       NM_019570  4279 bp  mRNA  linear  INV  28-OCT-2004
DEFINITION  Mus musculus REV1-like (S. cerevisiae) (Rev1l), mRNA  (= Title)
ACCESSION   NM_019570
VERSION     NM_019570.3  GI:50811869
KEYWORDS    .

GenBank Record: Feature Table
GenPept identifier

GenBank Record: Feature Table, con’t.
(skip)

GenBank Record: sequence

Indexing for Nucleotide
UID 59958365
Field                  Indexed Terms
[primary accession]    NM_001012399 [accn]
[title]                Bos taurus hemochromatosis (hfe), mRNA.
[organism]             Bos taurus [orgn]
[sequence length]      1168
[modification date]    2005/02/19 [mdat]
[properties]           biomol mrna [prop]; gbdiv mam; srcdb refseq

Global Entrez Search: HFE

Entrez Nucleotide: HFE
137 records
Annotations: [Title]; Not HFE

Smarter Query
hfe[title] AND human[orgn]
42 records

hfe[title] AND human[orgn]
• Curated HFE splice variants (11 total)
• (con’t) Primary data

Preview/Index
Gateway to Advanced Searches

Preview/Index: Properties, srcdb
• …AND srcdb refseq[Properties]
• …AND srcdb ddbj/embl/genbank[Properties]

Database Queries
Query                                        Records
#1  hfe                                      137
#2  hfe[title] AND human[orgn]               42
#3  #2 AND srcdb refseq[prop]                11
#4  #2 AND srcdb ddbj/embl/genbank[prop]     31
#5  #4 AND gbdiv pri[prop]                   29   (Primate division)
#6  #4 AND gbdiv est[prop]                   2    (EST division)

Molecule Queries
Query                                        Records
#1  hfe                                      116
#2  hfe[title] AND human[orgn]               42
#3  #2 AND biomol mrna[prop]                 29   (cDNA)
#4  #2 AND biomol genomic[prop]              13   (Genomic DNA)
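The fielded queries on the slides above are plain strings joined with AND, so they are easy to assemble programmatically. The following is a minimal sketch, not part of the original slides: the helper names (`tag`, `all_of`, `esearch_url`) are hypothetical, and the use of the E-utilities ESearch endpoint with `db=nucleotide` is an assumption about how one might submit such a query outside the web interface. Nothing is sent over the network; the code only builds strings.

```python
from urllib.parse import urlencode

# Assumed endpoint: NCBI E-utilities ESearch (not mentioned on the slides).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def tag(term, field):
    """Attach an Entrez field tag, e.g. tag('hfe', 'title') -> 'hfe[title]'."""
    return f"{term}[{field}]"


def all_of(*clauses):
    """Join clauses with AND, as in the Preview/Index history queries."""
    return " AND ".join(clauses)


# Query #2 from the slides, then the refinements built on top of it.
base = all_of(tag("hfe", "title"), tag("human", "orgn"))
refseq_only = all_of(f"({base})", tag("srcdb refseq", "prop"))
genbank_only = all_of(f"({base})", tag("srcdb ddbj/embl/genbank", "prop"))
mrna_only = all_of(f"({base})", tag("biomol mrna", "prop"))


def esearch_url(db, term):
    """Build (but do not fetch) an ESearch URL for the given query."""
    return ESEARCH + "?" + urlencode({"db": db, "term": term})


if __name__ == "__main__":
    for query in (base, refseq_only, genbank_only, mrna_only):
        print(query)
    print(esearch_url("nucleotide", base))
```

Wrapping the base query in parentheses plays the same role as referring to `#2` in the history table: the refinement is applied to the whole earlier result.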
More Queries… (Fields are database-specific)

Entrez Nucleotide
• Reviewed RefSeqs with transcript variants:
  srcdb refseq reviewed[prop] AND transcript[title] AND variant[title]

Entrez Gene
• Topoisomerase genes from Archaea:
  topoisomerase[gene name] AND archaea[organism]
• Genes on human chromosome 2 with OMIM links:
  2[chromosome] AND human[organism] AND “gene omim”[filter]
• Membrane proteins linked to cancer:
  “integral to plasma membrane”[gene ontology] AND cancer[dis]

Genome Resources
Genomic Biology: UniGene, E-PCR, Map Viewer, Trace Archive

Genomic Biology

Gen Biol: Gen Resources

Map Viewer – Genome Annotation Updates

Genome Projects: microb

13 Eukaryotic Genome Sequencing Projects Selected: Complete – 0, Assembly – 2, In Progress – 11

Genome Resources
Genomic Biology: UniGene, E-PCR, Map Viewer, Trace Archive

UniGene
Gene-oriented clusters of expressed sequences
• Automatic clustering using MegaBlast
• Each cluster represents a unique gene
• Informed by genome hits
• Information on tissue types and map locations
• Useful for gene discovery and selection of mapping reagents

A Cluster of ESTs
query; 5’ EST hits; 3’ EST hits

UniGene Collections
Species; UniGene

UniGene Hs build 188

UniGene Cluster Hs.95351
Lipase, hormone-sensitive (LIPE)
UniGene Cluster Hs.95351: expression

UniGene Cluster Hs.95351: seqs
Get Sequences: web page, or ftp://ftp.ncbi.nih.gov/repository/UniGene/Homo_sapiens/

Genome Resources
Genomic Biology: UniGene, E-PCR, Map Viewer, Trace Archive

E-PCR
Genomic sequence here

E-PCR: Options

E-PCR: Results

reverse e-pcr
Gene STS
LY6G6D: lymphocyte antigen 6 complex, locus G6D

Genome Resources
Genomic Biology: UniGene, E-PCR, Map Viewer, Trace Archive

Human MapViewer
List View; search term: adar

MapViewer: Human ADAR
MV Hs ADAR: 5’ UTR, 3’ UTR

Maps & Options
-- Sequence maps --
Ab initio, Assembly, Repeats, BES_Clone, Clone, NCI_Clone, Contig, Component, CpG island, dbSNP haplotype, Fosmid, GenBank_DNA, Gene, Phenotype, SAGE_Tag, STS, TCAG_RNA, Transcript (RNA), Hs_UniGene, Hs_EST, Mm_UniGene, Mm_EST, Rn_UniGene, Rn_EST, Ssc_UniGene, Ssc_EST, Bt_UniGene, Bt_EST, Gga_UniGene, Gga_EST, Variation (= SNP)
-- Cytogenetic maps --
Ideogram, FISH Clone, Gene_Cytogenetic, Mitelman Breakpoint, Morbid/Disease
-- Genetic maps --
deCODE, Genethon, Marshfield
-- RH maps --
GeneMap99-G3, GeneMap99-GB4, NCBI RH, Stanford-G3, TNG, Whitehead-RH, Whitehead-YAC

MapViewer
Maps shown: UniGene, Repeats, Gene; Component, Phenotype; Gene, Variation

Maps & Options
Mouse ADAR, Human ADAR, Chimp ADAR

Genome Resources
Genomic Biology: UniGene, E-PCR, Map Viewer, Trace Archive

Trace Archive Page

Ciona savignyi Traces
Potential access to sequences NOT yet in GenBank

Trace Archive BLAST Page

Basic Local Alignment Search Tool

BLAST Web Searches, 2005
[chart; axis value shown: 200,000]

Precomputed BLAST Services
• Nucleotide or protein: Related Sequences
• BLAST link: BLink
• Transcript clusters: UniGene
• Protein homologs: HomoloGene

Link to Related Sequences

Related Sequences
Most similar (top) to least similar (bottom)

BLink (BLAST Link)

BLink Output
Best hits; 3D structures; CDD-Search

Why Is BLAST So Popular?
• Fast: heuristic approach based on Smith Waterman
• Local alignments
• Statistical significance: Expect value
• Versatile:
  – blastn, blastp, blastx, tblastn, tblastx, rps-blast, psi-blast
  – www, standalone, and network clients

Global vs Local Alignment
Global alignment: Seq 1 and Seq 2 aligned end to end
Local alignment: only the best-matching regions of Seq 1 and Seq 2 aligned

Global vs Local Alignment
Seq1: WHEREISWALTERNOW (16aa)
Seq2: HEWASHEREBUTNOWISHERE (21aa)

Global
Seq1:  1 W--HEREISWALTERNOW 16
         W  HERE
Seq2:  1 HEWASHEREBUTNOWISHERE 21

Local
Seq1:  1 W--HERE 5
         W  HERE
Seq2:  3 WASHERE 9

Seq1:  1 W--HERE 5
         W  HERE
Seq2: 15 WISHERE 21

How BLAST Works
1. Make lookup table of “words” for query
2. Scan database for hits
3. Extend alignment both directions
   – Ungapped extensions of hits (initial HSPs)
   – Gapped extensions (no traceback)
   – Gapped extensions (traceback - alignment details)

Query: GTQITVEDLFYNIATRRKALKN
Make a lookup table of words: GTQ, TQI, QIT, ITV, TVE, VED, EDL, DLF, ...
Word size can only be 2 or 3
Neighborhood words (e.g. for ITV): VTV, LTV, VSV, etc.
  VTV 12
  LTV 11
  (neighborhood score threshold)
  VSV 8
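The word lookup and neighborhood scoring described above can be sketched in a few lines. This is an illustrative sketch, not BLAST's actual implementation: the function names are hypothetical, only the handful of BLOSUM62 entries needed for the ITV example are included, and the threshold of 11 is assumed from the position of the "neighborhood score threshold" marker between LTV (11) and VSV (8).

```python
# Partial BLOSUM62: ONLY the residue pairs needed for the ITV example.
# A real implementation would load the full matrix.
BLOSUM62_SUBSET = {
    ("I", "I"): 4, ("T", "T"): 5, ("V", "V"): 4,
    ("I", "V"): 3, ("I", "L"): 2, ("T", "S"): 1,
}


def pair_score(a, b):
    """Look up a substitution score; the matrix is symmetric."""
    if (a, b) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(a, b)]
    return BLOSUM62_SUBSET[(b, a)]  # raises KeyError for pairs not in the subset


def query_words(query, w=3):
    """Step 1 of 'How BLAST Works': all overlapping words of length w."""
    return [query[i:i + w] for i in range(len(query) - w + 1)]


def word_score(word, candidate):
    """Score a candidate word against a query word, position by position."""
    return sum(pair_score(a, b) for a, b in zip(word, candidate))


def neighborhood(word, candidates, threshold):
    """Keep candidates scoring at or above the neighborhood threshold."""
    return [c for c in candidates if word_score(word, c) >= threshold]


if __name__ == "__main__":
    words = query_words("GTQITVEDLFYNIATRRKALKN")
    print(words[:8])
    for cand in ("VTV", "LTV", "VSV"):
        print(cand, word_score("ITV", cand))
    print(neighborhood("ITV", ["VTV", "LTV", "VSV"], threshold=11))
```

With these scores, VTV (12) and LTV (11) stay in the neighborhood of ITV while VSV (8) falls below the threshold, matching the values listed on the slide.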
Protein Words: word size = 3 (default)

BLASTP Summary
Example query words
Query: IETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESILEV…

Neighborhood words
  HFL 18      YLS 15
  HFV 15      YLT 12
  HFS 14      YVS 12
  HWL 13      YIT 10
  NFL 13      etc …
  DFL 12
  ---- neighborhood score threshold T (-f) = 11 ----
  HWV 10
  etc …

High-scoring pair (HSP)
Query   1  IETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESI  47
           +E  YA YL K    F+YLSL +SP+ +DVNVHP+K  VHFL+++ I
Sbjct 287  LEETYAKYLHKGASYFVYLSLNMSPEQLDVNVHPSKRIVHFLYDQEI  333

Drop-off score = Highest score – current score
-X: X dropoff value for gapped alignment (in bits)
blastn 30, megablast 20, tblastx 0, all others 15

Gapped extension with trace back: Final HSP
Query   1  IETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESI-LEV…  50
           +E  YA YL K    F+YLSL +SP+ +DVNVHP+K  VHFL+++ I + +
Sbjct 287  LEETYAKYLHKGASYFVYLSLNMSPEQLDVNVHPSKRIVHFLYDQEIATSI…  337

Scoring Systems - Nucleotides
Identity matrix
      A    G    C    T
A    +1   –3   –3   –3
G    –3   +1   –3   –3
C    –3   –3   +1   –3
T    –3   –3   –3   +1
[ -r 1 -q -3 ]

CAGGTAGCAAGCTTGCATGTCA
|| ||||||||||||  |||||
CACGTAGCAAGCTTG-GTGTCA
raw score = 19-9 = 10

Scoring Systems - Proteins
Position Independent Matrices
PAM Matrices (Percent Accepted Mutation)
• Derived from observation; small dataset of alignments
• Implicit model of evolution
• All calculated from PAM1
• PAM250 widely used
BLOSUM Matrices (BLOck SUbstitution Matrices)
• Derived from observation; large dataset of highly conserved blocks
• Each matrix derived separately from blocks with a defined percent identity cutoff
• BLOSUM62 - default matrix for BLAST
Position Specific Score Matrices
(PSSMs)
• PSI- and RPS-BLAST

BLOSUM62
Positive for more likely substitutions; negative for less likely substitutions
    A   R   N   D   C   Q   E   G   H   I   L   K   M   F   P   S   T   W   Y   V   X
A   4
R  -1   5
N  -2   0   6
D  -2  -2   1   6
C   0  -3  -3  -3   9
Q  -1   1   0   0  -3   5
E  -1   0   0   2  -4   2   5
G   0  -2   0  -1  -3  -2  -2   6
H  -2   0   1  -1  -3   0   0  -2   8
I  -1  -3  -3  -3  -1  -3  -3  -4  -3   4
L  -1  -2  -3  -4  -1  -2  -3  -4  -3   2   4
K  -1   2   0  -1  -3   1   1  -2  -1  -3  -2   5
M  -1  -1  -2  -3  -1   0  -2  -3  -2   1   2  -1   5
F  -2  -3  -3  -3  -2  -3  -3  -3  -1   0   0  -3   0   6
P  -1  -2  -2  -1  -3  -1  -1  -2  -2  -3  -3  -1  -2  -4   7
S   1  -1   1   0  -1   0   0   0  -1  -2  -2   0  -1  -2  -1   4
T   0  -1   0  -1  -1  -1  -1  -2  -2  -1  -1  -1  -1  -2  -1   1   5
W  -3  -3  -4  -4  -2  -2  -3  -2  -2  -3  -2  -3  -1   1  -4  -3  -2  11
Y  -2  -2  -2  -3  -2  -1  -2  -3   2  -1  -1  -2  -1   3  -3  -2  -2   2   7
V   0  -3  -3  -3  -1  -2  -2  -3  -3   3   1  -2   1  -1  -2  -2   0  -3  -1   4
X   0  -1  -1  -1  -2  -1  -1  -1  -1  -1  -1  -1  -1  -1  -2   0   0  -2  -1  -1  -1

Position-Specific Score Matrix
Serine/Threonine protein kinases: catalytic loop
PSSM scores; DAF-1; 1 5 7 4 4

Position-Specific Score Matrix
Catalytic loop, positions 435-458 (columns = position and query residue, rows = amino acid scored)
Pos 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458
Res   K   E   S   N   K   P   A   M   A   H   R   D   I   K   S   K   N   I   M   V   K   N   D   L
A    -1   0   0  -1  -2  -2   3  -3   4  -4  -4  -4  -4   0   0   0  -4  -3  -4  -3  -2   1  -3  -3
R     0   1   0   0   1  -2  -2  -4  -4  -2   8  -4  -5   0  -3   3  -3  -5  -4  -3   1   1  -2  -1
N     0   0  -1  -1   1  -2   1  -4  -4  -1  -3  -1  -6   1  -2   0   8  -5  -6  -5   1   3   5   0
D    -1   2   0  -1  -1  -2  -2  -4  -4  -3  -4   8  -6  -3  -3   1  -1  -6  -6  -6   4   0   5  -3
C    -2  -1   1   1  -2  -3   0  -3   0  -5   0  -6  -3  -5   0  -5  -5   0  -3  -3  -5  -4  -1   0
Q     3   0   1   0   0  -2  -1  -4  -4  -2  -1  -2  -4  -1  -2   0  -2  -5  -4  -4   0  -1  -1  -3
E     0   2   0  -1  -1  -2   0  -4  -4  -2  -2   0  -5  -1  -2   0  -2  -5  -5  -5  -1   1   1  -2
G     3  -1   1   3  -2  -2   1  -5  -3  -4  -3  -3  -6  -3  -3  -4  -3  -6  -6  -6  -2   0  -1   3
H     0   0   1   3  -2  -2  -2  -4  -4  10  -2  -3  -5  -3  -3  -1  -1  -5  -5  -5   1  -3   0  -4
I    -2  -1   0  -1  -1  -1  -2   7   4  -6  -5  -5   3  -5  -4  -4  -6   6   0   3  -4  -4  -5  -2
L    -2  -1  -1  -1  -2  -2  -2   0  -1  -5  -4  -6   5  -5  -4  -3  -6   2   6   3  -2  -4  -4   3
K     1   0   0   1   5  -1   0  -4  -4  -3   0  -3  -5   7  -2   4  -2  -5  -5  -4   4   3   0   0
M    -1   0   0  -1   1   0  -1   1  -2  -4  -3  -5   1  -4  -4  -3  -4   2   1   2  -3  -2  -2   1
F    -1   0   0   0  -2  -3  -2   0  -3  -3  -2  -6   1  -5  -5  -2  -5  -2   0  -2  -2  -5  -5   1
P    -1  -1   2   0  -2   7   3  -4  -4  -2  -4  -4  -5  -3   2   2  -4  -5  -5  -5  -3  -2  -1  -2
S    -1   0   0  -1  -1  -1   1  -4  -1  -3  -3  -2  -5  -1   6   1  -1  -4  -4  -4   0   2   0  -2
T    -1   0  -1  -1  -1  -2   0  -2  -2  -4  -3  -3  -3  -2   2  -1  -2  -3  -3  -3  -1  -2  -2  -3
W    -1  -1  -1   1  -2  -3  -3  -4  -4  -5   0  -7  -4  -5  -5  -5  -6  -5  -4  -5  -5  -5  -6   5
Y    -1  -1   0   1  -2  -1  -3  -1  -3   0  -4  -5  -3  -4  -4  -4  -4  -3  -3  -3  -2  -4  -4  -1
V    -2  -1  -1  -1  -1  -1   0   2   4  -5  -5  -5   1  -4  -4  -4  -5   3   0   5  -3  -4  -5  -3

Local Alignment Statistics
Expect Value
E = number of database hits you expect to find by chance with score ≥ S
E = K·m·n·e^(-λS)   or   E = m·n·2^(-S’)
K = scale for search space
λ = scale for scoring system
S’ = bit score = (λS - ln K)/ln 2
(applies to ungapped alignments)
More info: The Statistics of Sequence Similarity Scores

  1 GAATATATGAAGACCAAGATTGCAGTCCTGCTGGCCTGAACCACGCTATTCTTGCTGTTG
    || | || || || |  || || ||   ||  |  ||| |||||| | | || | ||| |
  1 GAGTGTACGATGAGCCCGAGTGTAGCAGTGAAGATCTGGACCACGGTGTACTCGTTGTCG

 61 GTTACGGAACCGAGAATGGTAAAGACTACTGGATCATTAAGAACTCCTGGGGAGCCAGTT
    | || ||     ||  ||| ||  | |||||| || | ||||||  |||||  |     |
 61 GCTATGGTGTTAAGGGTGGGAAGAAGTACTGGCTCGTCAAGAACAGCTGGGCTGAATCCT

121 GGGGTGAACAAGGTTATTTCAGGCTTGCTCGTGGTAAAAAC
    |||| || ||||| ||  ||    |  | ||||  || |||
121 GGGGAGACCAAGGCTACATCCTTATGTCCCGTGACAACAAC

Reason: no contiguous exact match of 7 bp.
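The stated reason (no contiguous exact match of 7 bp) can be checked directly from the two sequences in the alignment above. The sketch below is illustrative only: the function name is hypothetical, and it assumes the alignment is ungapped, so that position i of one sequence pairs with position i of the other.

```python
# The two 161-base sequences from the alignment, joined across the three rows.
QUERY = (
    "GAATATATGAAGACCAAGATTGCAGTCCTGCTGGCCTGAACCACGCTATTCTTGCTGTTG"
    "GTTACGGAACCGAGAATGGTAAAGACTACTGGATCATTAAGAACTCCTGGGGAGCCAGTT"
    "GGGGTGAACAAGGTTATTTCAGGCTTGCTCGTGGTAAAAAC"
)
SBJCT = (
    "GAGTGTACGATGAGCCCGAGTGTAGCAGTGAAGATCTGGACCACGGTGTACTCGTTGTCG"
    "GCTATGGTGTTAAGGGTGGGAAGAAGTACTGGCTCGTCAAGAACAGCTGGGCTGAATCCT"
    "GGGGAGACCAAGGCTACATCCTTATGTCCCGTGACAACAAC"
)


def longest_exact_run(a, b):
    """Length of the longest stretch of consecutive identical positions."""
    best = run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        best = max(best, run)
    return best


if __name__ == "__main__":
    run = longest_exact_run(QUERY, SBJCT)
    # A word-based nucleotide search needs at least one exact word hit to
    # seed an extension; a run shorter than the word size yields no seed.
    print("longest exact run:", run)
    print("seedable with word size 7:", run >= 7)
```

Since the longest run of identical bases is shorter than 7, a search that requires an exact 7-base word hit never finds a seed, even though the sequences are clearly related along their whole length.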
An Alignment BLAST Cannot Make

An Alignment BLAST Can Make
Solution: compare protein sequences; BLASTX
BLAST 2 Sequences (blastx) output:
Score = 290 bits (741), Expect = 7e-77
Identities = 147/331 (44%), Positives = 206/331 (61%), Gaps = 8/331 (2%)
Frame = +3

Other BLAST Algorithms
• Megablast
• Discontiguous Megablast
• PSI-BLAST
• PHI-BLAST

Megablast: NCBI’s Genome Annotator
• Long alignments of similar DNA sequences
• Greedy algorithm
• Concatenation of query sequences
• Faster than blastn; less sensitive

MegaBLAST & Word Size
Trade-off: sensitivity vs speed
WORD SIZE    default   minimum
blastn       11        7
megablast    28        8
blastp       3         2

Discontiguous Megablast
• Uses discontiguous word matches
• Better for cross-species comparisons

Templates for Discontiguous Words
W = word size; # matches in template
t = template length
W = 11, t = 16, coding:      1101101101101101
W = 11, t = 16, non-coding:  1110010110110111
W = 12, t = 16, coding:      1111101101101101
W = 12, t = 16, non-coding:  1110110110110111
W = 11, t = 18, coding:      101101100101101101
W = 11, t = 18, non-coding:  111010010110010111
W = 12, t = 18, coding:      101101101101101101
W = 12, t = 18, non-coding:  111010110010110111
W = 11, t = 21, coding:      100101100101100101101
W = 11, t = 21, non-coding:  111010010100010010111
W = 12, t = 21, coding:      100101101101100101101
W = 12, t = 21, non-coding:  111010010110010010111
Reference: Ma, B, Tromp, J, Li, M. PatternHunter: faster and more sensitive homology search. Bioinformatics March, 2002; 18(3):440-5

Discontiguous (Cross-species) MegaBLAST

Discontiguous Word Options

Query: NM_078651 Drosophila melanogaster CG18582-PA (mbt) mRNA, (3244 bp)
/note= mushroom bodies tiny; synonyms: Pak2, STE20, dPAK2
Database: nr (nt), Mammalia[orgn]
MegaBLAST = “No significant similarity found.”
Discontiguous megaBLAST = numerous hits . . .
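The discontiguous word templates listed under "Templates for Discontiguous Words" can be read as masks: a 1 marks a position that must match, a 0 a position that is ignored. The sketch below illustrates that reading only. It is not MegaBLAST's implementation, the function name and toy sequences are hypothetical, and the interpretation of the 0 in coding templates as a tolerated third codon position is an assumption.

```python
# (W, t, kind, template) for the twelve templates listed on the slide.
TEMPLATES = [
    (11, 16, "coding",     "1101101101101101"),
    (11, 16, "non-coding", "1110010110110111"),
    (12, 16, "coding",     "1111101101101101"),
    (12, 16, "non-coding", "1110110110110111"),
    (11, 18, "coding",     "101101100101101101"),
    (11, 18, "non-coding", "111010010110010111"),
    (12, 18, "coding",     "101101101101101101"),
    (12, 18, "non-coding", "111010110010110111"),
    (11, 21, "coding",     "100101100101100101101"),
    (11, 21, "non-coding", "111010010100010010111"),
    (12, 21, "coding",     "100101101101100101101"),
    (12, 21, "non-coding", "111010010110010010111"),
]


def matches_under_template(template, a, b):
    """True if a and b agree at every position where the template has a 1."""
    return all(x == y for mask, x, y in zip(template, a, b) if mask == "1")


if __name__ == "__main__":
    # Each template has exactly W ones spread over t positions.
    for w, t, kind, template in TEMPLATES:
        print(w, t, kind, template.count("1") == w, len(template) == t)

    coding = TEMPLATES[0][3]            # 1101101101101101
    window = "ACGTACGTACGTACGT"         # hypothetical 16-base window
    silent = "ACCTACGTACGTACGT"         # differs only at index 2 (a 0 position)
    changed = "CCGTACGTACGTACGT"        # differs at index 0 (a 1 position)
    print(matches_under_template(coding, window, silent))   # tolerated
    print(matches_under_template(coding, window, changed))  # rejected
```

A contiguous word of 11 would reject the `silent` window because of its single mismatch, while the discontiguous template still reports a hit, which is one way to see why the approach suits cross-species comparisons.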
Disco. Megablast Example

Ex: Discontiguous MegaBLAST

Ex: BLASTN

PSI-BLAST
Position-specific Iterated BLAST
Example: Confirming relationships of purine nucleotide metabolism proteins

PSI-BLAST
>gi|113340|sp|P03958|ADA_MOUSE ADENOSINE DEAMINASE (ADENOSINE
MAQTPAFNKPKVELHVHLDGAIKPETILYFGKKRGIALPADTVEELRNIIGMDKPLSLPGF
VIAGCREAIKRIAYEFVEMKAKEGVVYVEVRYSPHLLANSKVDPMPWNQTEGDVTPDDVVD
EQAFGIKVRSILCCMRHQPSWSLEVLELCKKYNQKTVVAMDLAGDETIEGSSLFPGHVEAY
RTVHAGEVGSPEVVREAVDILKTERVGHGYHTIEDEALYNRLLKENMHFEVCPWSSYLTGA
VRFKNDKANYSLNTDDPLIFKSTLDTDYQMTKKDMGFTEEEFKRLNINAAKSSFLPEEEKK
0.005: E value cutoff for PSSM

RESULTS: Initial BLASTP
Same results as protein-protein BLAST; different format

Results of First PSSM Search
Other purine nucleotide metabolizing enzymes not found by ordinary BLAST

Tenth PSSM Search: Convergence
Just below threshold, another nucleotide metabolism enzyme
Check to add to PSSM

PHI-BLAST
>gi|231729|sp|P30429|CED4_CAEEL CELL DEATH PROTEIN 4
MLCEIECRALSTAHTRLIHDFEPRDALTYLEGKNIFTEDHSELISKMSTRLERIANFLRIYRRQASE
LIDFFNYNNQSHLADFLEDYIDFAINEPDLLRPVVIAPQFSRQMLDRKLLLGNVPKQMTCYIREYHV
IKKLDEMCDLDSFFLFLHGRAGSGKSVIASQALSKSDQLIGINYDSIVWLKDSGTAPKSTFDLFTDI
LKSEDDLLNFPSVEHVTSVVLKRMICNALIDRPNTLFVFDDVVQEETIRWAQELRLRCLVTTRDVEI
ASQTCEFIEVTSLEIDECYDFLEAYGMPMPVGEKEEDVLNKTIELSSGNPATLMMFFKSCEPKTFEK
Pattern: [GA]xxxxGK[ST]

What’s New?
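The PHI-BLAST pattern shown with the CED4 sequence, [GA]xxxxGK[ST], can be read much like a regular expression in which x stands for any residue. The sketch below only locates the pattern in the sequence excerpt from the slide; it does not perform the PHI-BLAST search itself. The function name is hypothetical, and converting x to a regex wildcard is an assumption that covers this simple pattern rather than the full pattern syntax.

```python
import re

# CED4_CAEEL excerpt as shown on the PHI-BLAST slide (rows joined).
CED4 = (
    "MLCEIECRALSTAHTRLIHDFEPRDALTYLEGKNIFTEDHSELISKMSTRLERIANFLRIYRRQASE"
    "LIDFFNYNNQSHLADFLEDYIDFAINEPDLLRPVVIAPQFSRQMLDRKLLLGNVPKQMTCYIREYHV"
    "IKKLDEMCDLDSFFLFLHGRAGSGKSVIASQALSKSDQLIGINYDSIVWLKDSGTAPKSTFDLFTDI"
    "LKSEDDLLNFPSVEHVTSVVLKRMICNALIDRPNTLFVFDDVVQEETIRWAQELRLRCLVTTRDVEI"
    "ASQTCEFIEVTSLEIDECYDFLEAYGMPMPVGEKEEDVLNKTIELSSGNPATLMMFFKSCEPKTFEK"
)


def pattern_to_regex(pattern):
    """Translate a simple pattern: lowercase x means 'any residue'."""
    return re.compile(pattern.replace("x", "."))


if __name__ == "__main__":
    regex = pattern_to_regex("[GA]xxxxGK[ST]")
    for hit in regex.finditer(CED4):
        # Report 1-based coordinates, as sequence databases usually do.
        print(hit.group(), hit.start() + 1, hit.end())
```

The pattern occurs once in this excerpt, which is the kind of anchored site PHI-BLAST uses to restrict its hits to sequences sharing the motif.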
BLAST Databases
Nucleotide
• refseq_rna = NM_*, XM_*
• refseq_genomic = NC_*, NG_*
• env_nt – environmental sample[filter], e.g., 16S rRNA
Protein
• refseq = NP_*, XP_*
• env_nr
nr = nr

New Formatter
Select lower case; select red

BLAST Output: Alignments & Filter
low complexity sequence filtered

BLAST Output: CDS Feature

Advanced Options
Limit to Organism: all[filter] NOT ma
Example Entrez Queries
• all[Filter] NOT mammalia[Organism]
• ray finned fishes[Organism]
• srcdb refseq[Properties]
Nucleotide only:
• biomol mrna[Properties]
• biomol genomic[Properties]
Other Advanced: -e 10000 -v 2000 -b 2000
• -e 10000: expect value
• -v 2000: descriptions
• -b 2000: alignments

Genome BLAST

Genome BLAST via Map Viewer

Example: Human Genome BLAST
Human EST
TGCCTCCTTTGGTGAAGGTGACACATCATGTGACCTCTTCAGTGAC
CACTCTACGGTGTCGGGCCTTGAACTACTACCCCCAGAAC
ATCACCATGAAGTGGCTGAAGGATAAGCAGCCAATGGATGCCAAG
GAGTTCGAACCTAAAGACGTATTGCCCAATGGGGATGGGAC
CTACCAGGGCTGGATAACCTTGGCTGTACCCCCTGGGGAAGAGC

Human Genome BLAST: Results
Entrez Gene

Human Genome BLAST: MapViewer

Example: Mapping Oligos Onto a Genome
>forward
CCATGGCGACCCTGGAAAAGC
>reverse
CAGCAGCGGCTGTGCCTGCGG
? ? ?

Map Oligos Onto Genome
>CCATGGCGACCCTGGAAAAGCNNNNNNNNNNCAGCAGCGGCTGTGCCTGCGG
(forward primer, N spacer, reverse primer)
-W 7 -e 1000

Genome BLAST Results

Primer Alignments
reverse primer; forward primer

MapViewer
forward; reverse

Sequence View (sv)

Service Addresses
• BLAST: blast-help@ncbi.nlm.nih.gov
• General Help: info@ncbi.nlm.nih.gov
• Wayne Matten: matten@ncbi.nlm.nih.gov