Assiut University
Faculty of Computers and Information
Course Code: BNF424
Course Name: Biological Databases
4th Level
Question Bank

Choose the correct answer:

1) All of the following are protein sequence databases except ______
a) PIR
b) Swiss-Prot
c) EMBL

2) ______ is a database that provides all the information about human inheritance.
a) PDB
b) STAG
c) OMIM

3) GenBank, the nucleic acid sequence database, is maintained by ______
a) DDBJ
b) EMBL
c) NCBI

4) FlyBase is a ______
a) Biodiversity database
b) Model organism database
c) Literature database
d) Biomolecular database

5) To parse a Swiss-Prot file, you write in the code
>>> from Bio import SwissProt
>>> handle = open("myfile.dat")
>>> records = SwissProt.read(handle)
a) True
b) False

6) The CDS field in GenBank files contains ______
a) Information about the boundaries of the sequence that can be translated into amino acids.
b) Accession number for the sequence

7) The global sequence alignment of the two sequences V = TGGTG and W = ATCGT, when the scoring scheme is 1 for a match, -1 for a mismatch, and the gap penalty is -2, is:
- T G G T G
A T C G T -
a) True
b) False

8) The score of the previous alignment is equal to ______
a) -2
b) -3
c) -1

9) Progenitor sequences represented by the ______ branches of the tree are derived by alignment of the ______ sequences.
a) outer, outermost
b) inner, outermost
c) inner, innermost
d) outer, innermost

10) To parse a Prosite file, you write in the code
>>> from Bio import SwissProt
>>> handle = open("myprositefile.dat")
>>> records = Prosite.parse(handle)
a) True
b) False

11) In the Smith–Waterman algorithm, in the initialization step, the ______ row and ______ column are subject to gap penalty.
a) first, first
b) first, second
c) second, first
d) first, last

12) In the SW algorithm, to align two sequences of lengths m and n, ______ time is required.
a) O(mn)
b) O(m^2 n)
c) O(m^2 n^3)
d) O(mn^2)

13) TrEMBL consists of entries in the same format as SWISS-PROT.
a) True
b) False

14) Many studies have demonstrated that the distribution of similarity scores assumes a peculiar shape that resembles a highly skewed normal distribution with a long tail on one side. The distribution matches the ______
a) Gumbel elective value distribution
b) Gumbel extreme void distribution
c) Gumbel end value distribution
d) Gumbel extreme value distribution

15) ExPASy is a comprehensive proteomics web server with a suite of programs for searching peptide information from the SWISS-PROT and TrEMBL databases.
a) True
b) False

16) The rigorous dynamic programming method is normally not used for database searching because it is slow and computationally expensive.
a) True
b) False

17) Which of the following does not describe the local alignment algorithm?
a) Score can be negative
b) Negative score is set to 0
c) First row and first column are set to 0 in the initialization step
d) In the traceback step, the beginning is at the highest score, and it ends when 0 is encountered

18) A Prosite file can contain more than one Prosite record.
a) True
b) False

19) Local alignments are more useful when ______
a) There are totally similar and equal length sequences
b) Dissimilar sequences are suspected to contain regions of similarity
c) Similar sequence motif with larger sequence context
d) Partially similar, different length and conserved region containing sequences

20) Which of the following does not describe BLOSUM matrices?
a) It stands for BLOcks SUbstitution Matrix
b) It was developed by Henikoff and Henikoff
c) The year it was developed was 1992
d) These matrices are logarithmic identity values

21) In a dot matrix, the two sequences to be compared are written in the ______ of the matrix.
a) horizontal and vertical axes
b) 2 parallel horizontal axes
c) 2 parallel vertical axes
d) horizontal axis (one preceding another)

22) Among the following, which one is not an approach to local alignment?
a) Smith-Waterman algorithm
b) K-tuple method
c) Words method
d) Needleman-Wunsch algorithm

23) Which of the following does not describe BLAST?
a) It stands for Basic Local Alignment Search Tool
b) It uses word matching like FASTA
c) It is one of the tools of the NCBI
d) Even if no words are similar, there is an alignment to be considered

24) Which of the following is incorrect regarding pairwise sequence alignment?
a) The most fundamental process in this type of comparison is sequence alignment
b) It is an important first step toward structural and functional analysis of newly determined sequences
c) This is the process by which sequences are compared by searching for common character patterns and establishing residue-residue correspondence among related sequences
d) It is the process of aligning multiple sequences

25) Which of the following is incorrect about ENTREZ?
a) It is a resource prepared only by the staff of the National Center for Biotechnology Information
b) It provides a series of forms that can be filled out to retrieve a Medline reference related to the molecular biology sequence databases
c) One straightforward way to access the sequence databases is through ENTREZ
d) It provides a series of forms that can be filled out to retrieve a DNA or protein sequence

26) SWISS-PROT is a repository for ______
a) Nucleotide sequences
b) Protein sequences
c) Vectors
d) Genome arrays

27) Prosite is a database containing protein domains, protein families, and functional sites, as well as the patterns and profiles used to recognize them.
a) True
b) False

28) Which of the following is incorrect about evolution?
a) The macromolecules can be considered molecular fossils that encode the history of millions of years of evolution
b) The building blocks of these biological macromolecules, nucleotide bases and amino acids, form linear sequences that determine the primary structure of the molecules
c) DNA and proteins are products of evolution
d) The molecular sequences barely undergo changes

29) If the two sequences share significant similarity, it is extremely ______ that the extensive similarity between the two sequences has been acquired randomly, meaning that the two sequences must have derived from a common evolutionary origin.
a) unlikely
b) possible
c) likely
d) relevant

30) Which of the following is incorrect regarding sequence homology?
a) Two sequences can have a homologous relationship even if they do not have a common origin
b) It is an important concept in sequence analysis
c) When two sequences are descended from a common evolutionary origin, they are said to have a homologous relationship
d) When two sequences are descended from a common evolutionary origin, they are said to share homology

31) When the two sequences have substantial regions of similarity, many dots line up to form contiguous ______ lines.
a) crossings on
b) horizontal
c) diagonal
d) vertical

32) A problem exists when comparing ______ sequences using the dot matrix method, namely, the ______
a) small, amplification
b) large, amplification
c) small, high noise level
d) large, high noise level

33) In local alignment, the two sequences to be aligned cannot be of unequal lengths.
a) True
b) False

34) Alignment algorithms, both global and local, are fundamentally similar and only differ in the optimization strategy used in aligning similar residues.
a) True
b) False

35) Self-complementarity of DNA sequences cannot be identified using a dot plot.
a) True
b) False

36) A sequence can be aligned with itself to identify internal repeat elements.
a) True
b) False

37) A truly statistically significant sequence alignment will be able to provide evidence of homology between the sequences involved.
a) True
b) False

38) Score can be negative in the Smith–Waterman algorithm.
a) True
b) False

39) The function of the scoring matrix is to conduct one-to-one comparisons between all components in two sequences and record the optimal alignment results.
a) True
b) False

40) KEGG is a database resource for understanding high-level functions and utilities of the biological system, such as the cell, the organism and the ecosystem.
a) True
b) False
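
Study aid for the alignment questions (7, 8, 12, 17 and 38): the sketch below is a minimal illustration, not an official answer key. It assumes the scoring scheme stated in question 7 (match = 1, mismatch = -1, linear gap penalty = -2). The function names score_alignment and local_alignment_score and the example sequences are hypothetical and chosen only for illustration. The first function sums the column scores of an alignment that is already written out with '-' for gaps; the second fills a Smith–Waterman style table and returns its highest cell.

```python
def score_alignment(top, bottom, match=1, mismatch=-1, gap=-2):
    """Sum the column scores of two equal-length gapped strings ('-' is a gap)."""
    if len(top) != len(bottom):
        raise ValueError("aligned strings must have equal length")
    total = 0
    for x, y in zip(top, bottom):
        if x == "-" or y == "-":
            total += gap        # any column containing a gap pays the gap penalty
        elif x == y:
            total += match
        else:
            total += mismatch
    return total


def local_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Highest cell of a Smith-Waterman table of size (len(a)+1) x (len(b)+1)."""
    rows, cols = len(a) + 1, len(b) + 1
    # First row and first column are initialised to 0 and never updated.
    table = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = table[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = table[i - 1][j] + gap
            left = table[i][j - 1] + gap
            # Clamping at 0 lets a local alignment restart anywhere.
            table[i][j] = max(0, diag, up, left)
            best = max(best, table[i][j])
    return best


if __name__ == "__main__":
    # Hypothetical example sequences, not taken from the question bank.
    print(score_alignment("AC-GT", "ACTGA"))
    print(local_alignment_score("ACGTT", "TACGA"))
```

The nested loop visits each of the m x n inner cells once, which is where the running time discussed in question 12 comes from. To check a hand-computed alignment, pass its two gapped rows to score_alignment.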