BN 231 BLAST Exercise

Exercise 1: Protein-Protein BLAST

Step 1: Copy the sequence below:

MSKRKAPQETLNGGITDMLTELANFEKNVSQAIHKYNAYRKAASVIAKYPHKIKSGAEAKKLPGVGTKIAEKIDEFLATG
KLRKLEKIRQDDTSSSINFLTRVSGIGPSAARKFVDEGIKTLEDLRKNEDKLNHHQRIGLKYFGDFEKRIPREEMLQMQD
IVLNEVKKVDSEYIATVCGSFRRGAESSGDMDVLLTHPSFTSESTKQPKLLHQVVEQLQKVHFITDTLSKGETKFMGVCQ
LPSKNDEKEYPHRRIDIRLIPKDQYYCGVLYFTGSDIFNKNMRAHALEKGFTINEYTIRPLGVTGVAGEPLPVDSEKDIF
DYIQWKYREPKDRSE

Step 2: Open BLAST: https://blast.ncbi.nlm.nih.gov/Blast.cgi

Step 3: Select Protein BLAST.

Step 4: Paste the sequence into the query box, add a job name, choose blastp as the program, and click BLAST.

Step 5: Interpret the results. Explore the results page:
a. Top section with input sequence details
b. Explore these tabs: Descriptions, Graphic Summary, Alignments, Taxonomy
c. Interpreting results. The columns of the results table are:

Scientific name: the organism that the subject (hit) sequence comes from.

Max Score (Maximum score): the highest alignment score among the aligned segments of a subject sequence, calculated from the sum of the rewards for matched residues (nucleotides or amino acids) and the penalties for mismatches and gaps.

Total Score: the sum of the alignment scores of all segments from the same subject sequence.

Query Cover (Query coverage): the percentage of the query length that is included in the aligned segments.

E-value (Expect value): the number of alignments expected by chance with the calculated score or better. The expect value is the default sorting metric; for significant alignments the E-value should be very close to zero.

Per. Ident (Percentage identity): the highest percent identity for a set of aligned segments to the same subject sequence.

Acc. Len. (Accession length): the number of nucleotides or amino acids in the result sequence identified by the accession number.

Accession: a unique identifier assigned to records in the NCBI databases.

About score matrices

BLAST uses score matrices to compare sequences by aligning them and assigning a value to each alignment. There are several kinds of scoring matrices for different comparisons, such as the BLOSUM and PAM matrices.
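To make the idea of "assigning a value to an alignment" concrete, the short Python sketch below scores one already-aligned pair of protein fragments. It is an illustration under stated assumptions, not BLAST's own code: the two fragments are made up, the small score table is a hand-copied subset intended to match BLOSUM62 for these residue pairs (verify against the full matrix before relying on it), and the flat gap penalty of -4 is hypothetical (BLAST uses separate gap opening and extension costs). The score BLAST shows in the Max Score column is a normalized bit score derived from a raw sum of this kind.

```python
# Illustrative subset of substitution scores. The values are intended to match
# BLOSUM62 for these residue pairs; check them against the full matrix.
SCORES = {
    ("A", "A"): 4, ("L", "L"): 4, ("K", "K"): 5, ("W", "W"): 11,
    ("I", "L"): 2, ("K", "R"): 2, ("D", "E"): 2, ("G", "W"): -2,
}

GAP_PENALTY = -4  # hypothetical flat cost for every gap position


def pair_score(a, b):
    """Look up a residue pair in either order (the matrix is symmetric)."""
    if (a, b) in SCORES:
        return SCORES[(a, b)]
    return SCORES[(b, a)]


def alignment_score(query, subject):
    """Sum matrix scores and gap penalties over two aligned strings."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    total = 0
    for a, b in zip(query, subject):
        if a == "-" or b == "-":
            total += GAP_PENALTY
        else:
            total += pair_score(a, b)
    return total


def percent_identity(query, subject):
    """Identical positions as a percentage of the alignment length."""
    identical = sum(1 for a, b in zip(query, subject) if a == b and a != "-")
    return 100.0 * identical / len(query)


# Made-up aligned fragments; "-" marks a gap.
query = "AKWL-"
subject = "ARWIA"

print(alignment_score(query, subject))   # 4 + 2 + 11 + 2 - 4 = 15
print(percent_identity(query, subject))  # 2 identical columns out of 5 = 40.0
```

Note how the rare residue W contributes far more to the score than A or L, and how the conservative substitutions K/R and L/I still earn positive scores even though they do not count towards percent identity.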
The BLOSUM62 matrix is among the best for detecting most weak protein similarities. BLOSUM45 is suited to long, weak alignments between distantly related sequences, and BLOSUM80 to closely related sequences. The PAM matrices (PAM1, PAM250, etc.) are another family. The higher the score value, the better the alignment/comparison.

Examples (images not reproduced here):
Image rights: https://www.ncbi.nlm.nih.gov/Class/MLACourse/Modules/BLAST/scoring_nucleotides.html
Image rights: https://studylib.net/doc/18217730/score--bit-score--p-value--e

Learn more on BLAST scoring matrices:
1. Use of scoring matrices: https://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/650/Use_scoring_matrices.html
2. BLAST substitution matrices: https://www.ncbi.nlm.nih.gov/blast/html/sub_matrix.html

Exercise 2: DNA BLAST

1. Copy the sequence:

TCGAAATAACGCGTGTTCTCAACGCGGTCGCGCAGATGCCTTTGCTCATCAGATGCGACCGCAACCACGTCCGC
CGCCTTGTTCGCCGTCCCCGTGCCTCAACCACCACCACGGTGTCGTCTTCCCCGAACGCGTCCCGGTCAGCCAG
CCTCCACGCGCCGCGCGCGCGGAGTGCCCATTCGGGCCGCAGCTGCGACGGTGCCGCTCAGATTCTGTGTGGCA
GGCGCGTGTTGGAGTCTAAA

2. Open BLAST: https://blast.ncbi.nlm.nih.gov/Blast.cgi
3. Choose Nucleotide BLAST and paste the sequence.
4. Click BLAST.
5. Explore and interpret the results as in Exercise 1.

Other exercises:
1. Interactive BLAST tutorial: https://digitalworldbiology.com/blast
2. https://digitalworldbiology.com/tutorial/blast-for-beginners
3. Detailed tutorial: https://www.ncbi.nlm.nih.gov/books/NBK1734/
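Optional: the sequences in Exercises 1 and 2 are wrapped over several lines, so a copied sequence may pick up spaces or line breaks. The Python sketch below is one possible way to tidy a copied sequence and to check which BLAST program matches it before pasting. The function names are hypothetical, and the nucleotide check is only a rough heuristic that assumes a sequence made solely of the letters A, C, G, T, U and N is DNA or RNA; a short peptide built from those letters alone would be misclassified.

```python
NUCLEOTIDE_LETTERS = set("ACGTUN")


def clean_sequence(raw):
    """Keep only letters (drops spaces, line breaks, digits) and upper-case them."""
    return "".join(ch for ch in raw if ch.isalpha()).upper()


def guess_blast_program(sequence):
    """Heuristic: nucleotide letters only -> blastn, anything else -> blastp."""
    if not sequence:
        raise ValueError("empty sequence")
    if set(sequence) <= NUCLEOTIDE_LETTERS:
        return "blastn"
    return "blastp"


# Short fragments typed with stray spaces and line breaks, as after copy and paste.
dna_fragment = clean_sequence("TCGAAATAACG CGTGTTC\nTCAACG")
protein_fragment = clean_sequence("MSKRKAPQET LNGGITDM")

print(dna_fragment, guess_blast_program(dna_fragment))
print(protein_fragment, guess_blast_program(protein_fragment))
```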