1. (a). Use blastn.
ACCESSION NM_143133
(b). SOURCE Drosophila melanogaster (fruit fly)
ORGANISM Drosophila melanogaster
Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; Ephydroidea; Drosophilidae; Drosophila.
(c). Celera Genomics, 45 West Gude Drive, Rockville, MD 20850, USA.
The fly Drosophila melanogaster is one of the most intensively studied organisms in biology and serves as a model system for the investigation of many developmental and cellular processes common to higher eukaryotes, including humans. We have determined the nucleotide sequence of nearly all of the approximately 120-megabase euchromatic portion of the Drosophila genome using a whole-genome shotgun sequencing strategy supported by extensive clone-based sequence and a high-quality bacterial artificial chromosome physical map. Efforts are under way to close the remaining gaps; however, the sequence is of sufficient accuracy and contiguity to be declared substantially complete and to support an initial analysis of genome structure and preliminary gene annotation and interpretation. The genome encodes approximately 13,600 genes, somewhat fewer than the smaller Caenorhabditis elegans genome, but with comparable functional diversity.
MeSH Terms: Animal; Biological Transport/genetics; Chromatin/genetics; Cloning, Molecular; Computational Biology; Contig Mapping; Cytochrome P-450 Enzyme System/genetics; DNA Repair/genetics; DNA Replication/genetics; Drosophila melanogaster/metabolism; Drosophila melanogaster/genetics*; Euchromatin; Gene Library; Genes, Insect; Genome*; Heterochromatin/genetics; Insect Proteins/physiology; Insect Proteins/genetics; Insect Proteins/chemistry; Nuclear Proteins/genetics; Sequence Analysis, DNA*; Support, Non-U.S. Gov't; Support, U.S. Gov't, P.H.S.;
Transcription, Genetic; Translation, Genetic
Substances: Cytochrome P-450 Enzyme System; Nuclear Proteins; Insect Proteins; Heterochromatin; Euchromatin; Chromatin
Grant support: P50-HG00750/HG/NHGRI
PMID: 10731132 [PubMed - indexed for MEDLINE]

2. (a). LocusID: 6035, Cytogenetic: 14q11.1
(b). 14201796-14211796 RefSeq

3. (a). >lcl|Sequence 1 ORF:243..713 Frame -3, Length=471
(b). MALEKSLVRLLLLVLILLVLGWVQPSLGKESRAKKFQRQHMDSDSSPSSSSTYCNQMMRRRNMTQGRCKPVNTFVHEPLVDVQNVCFQEKVTCKNGQGNCYKSNSSMHITDCRLTNGSRYPNCAYRTSPKERHIIVACEGSPYVPVHFDASVEDST
(c). LOCUS RNASE1 156 aa linear PRI 07-APR-2003
DEFINITION ribonuclease, RNase A family, 1 (pancreatic) [Homo sapiens].
ACCESSION NP_002924

4. (a). LOCUS Rnase1 149 aa linear ROD 07-APR-2003
DEFINITION ribonuclease, RNase A family, 1 (pancreatic); ribonuclease 1, pancreatic [Mus musculus].
ACCESSION NP_035401
(b). Protein in mouse:
MGLEKSLILFPLFFLLLGWVQPSLGRESAAQKFQRQHMDPDGSSINSPTYCNQMMKRRDMTNGSCKPVNTFVHEPLADVQAVCSQENVTCKNRKSNCYKSSSALHITDCHLKGNSKYPNCDYKTTQYQKHIIVACEGNPYVPVHFDATV
Use BLAST2:
Score = 226 bits (576), Expect = 7e-59
Identities = 105/152 (69%), Positives = 123/152 (80%), Gaps = 3/152 (1%)
(c). BLAST 2 SEQUENCES RESULTS VERSION BLASTP 2.2.5 [Nov-16-2002]
Sequence 1 lcl|seq_1 Length 149 (1 .. 149)
Sequence 2 lcl|seq_2 Length 156 (1 ..
156)
NOTE: The statistics (bitscore and expect value) are calculated based on the size of the nr database.
Score = 226 bits (576), Expect = 7e-59
Identities = 105/152 (69%), Positives = 123/152 (80%), Gaps = 3/152 (1%)

Query: 1   MGLEKSLI---LFPLFFLLLGWVQPSLGRESAAQKFQRQHMDPDGSSINSPTYCNQMMKR 57
           M LEKSL+   L  L  L+LGWVQPSLG+ES A+KFQRQHMD D S  +S TYCNQMM+R
Sbjct: 1   MALEKSLVRLLLLVLILLVLGWVQPSLGKESRAKKFQRQHMDSDSSPSSSSTYCNQMMRR 60

Query: 58  RDMTNGSCKPVNTFVHEPLADVQAVCSQENVTCKNRKSNCYKSSSALHITDCHLKGNSKY 117
           R+MT G CKPVNTFVHEPL DVQ VC QE VTCKN + NCYKS+S++HITDC L   S+Y
Sbjct: 61  RNMTQGRCKPVNTFVHEPLVDVQNVCFQEKVTCKNGQGNCYKSNSSMHITDCRLTNGSRY 120

Query: 118 PNCDYKTTQYQKHIIVACEGNPYVPVHFDATV 149
           PNC Y+T+  ++HIIVACEG+PYVPVHFDA+V
Sbjct: 121 PNCAYRTSPKERHIIVACEGSPYVPVHFDASV 152

CPU time: 0.04 user secs. 0.03 sys. secs 0.07 total secs.
Lambda K H: 0.320 0.133 0.419
Gapped Lambda K H: 0.267 0.0410 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 275
Number of Sequences: 0
Number of extensions: 11
Number of successful extensions: 1
Number of sequences better than 10.0: 1
Number of HSP's better than 10.0 without gapping: 1
Number of HSP's successfully gapped in prelim test: 0
Number of HSP's that attempted gapping in prelim test: 0
Number of HSP's gapped (non-prelim): 1
length of query: 149
length of database: 456,953,620
effective HSP length: 125
effective length of query: 24
effective length of database: 456,953,495
effective search space: 10966883880
effective search space used: 10966883880
T: 9
A: 40
X1: 16 ( 7.4 bits)
X2: 129 (49.7 bits)
X3: 129 (49.7 bits)
S1: 41 (21.8 bits)
S2: 66 (30.0 bits)

(d). RPS-BLAST 2.2.6 [Apr-09-2003]
Query= local sequence: (149 letters)
Database: cdd.v1.62 11,088 PSSMs; 2,717,223 total columns
PSSMs producing significant alignments:                                            Score (bits)  E value
gnl|CDD|14893  smart00092, RNAse_Pc, Pancreatic ribonuclease ;                          189      9e-50
gnl|CDD|5327   cd00163, RNAse_Pc, Pancreatic ribonucleases (RNAse) are pyrimi...        179      1e-46
gnl|CDD|7438   pfam00074, rnaseA, Pancreatic ribonuclease. Ribonucleases. Mem...        173      8e-45
--------------------------------------------------------------------------------
gnl|CDD|14893, smart00092, RNAse_Pc, Pancreatic ribonuclease ;
CD-Length = 123 residues, 100.0% aligned
Score = 189 bits (482), Expect = 9e-50

Query: 26  RESAAQKFQRQHMDPDGSSINSPTYCNQMMKRRDMTNGSCKPVNTFVHEPLADVQAVCSQ 85
Sbjct: 1   QETRAQKFLRQHIDSTPSS-ASSNYCNQMMKRRNMTQGRCKPVNTFIHESLANVKAVCSN 59

Query: 86  ENVTCKNRKSNCYKSSSALHITDCHLKGNSKYPNCDYKTTQYQKHIIVACEGNPYVPVHF 145
Sbjct: 60  KNVTCKNGRTNCHQSNSRFQLTDCRLTGGSKYPNCRYKTTQANKFIIVACEGNPYVPVHF 119

Query: 146 DATV 149
Sbjct: 120 DGSV 123
--------------------------------------------------------------------------------
gnl|CDD|5327, cd00163, RNAse_Pc, Pancreatic ribonucleases (RNAse) are pyrimidine-specific endonucleases found in high quantity in the pancreas of certain mammals and of some reptiles. Involved in endonucleolytic cleavage of 3'-phosphomononucleotides and 3'-phosphooligonucleotides ending in C-P or U-P with 2',3'-cyclic phosphate intermediates. Catalytic mechanism is a transphosphorylation of P-O 5' bonds on the 3' side of pyrimidines and subsequent hydrolysis to generate 3' phosphate groups.
Other family members include: bovine seminal vesicle and brain ribonucleases; kidney non-secretory ribonucleases; liver-type ribonucleases; angiogenin, which induces vascularization of normal and malignant tissues; eosinophil cationic protein A cytotoxin and helminthotoxin with ribonuclease activity; and frog liver ribonuclease and frog sialic acid-binding lectin
CD-Length = 119 residues, 100.0% aligned
Score = 179 bits (456), Expect = 1e-46

Query: 28  SAAQKFQRQHMDPDGSSINSPTYCNQMMKRRDMTNGSCKPVNTFVHEPLADVQAVCSQEN 87
Sbjct: 1   TRAQKFLRQHIDSTPSG-SSSNYCNQMMKRRNMTQGRCKPVNTFVHESLADVKAVCSQKN 59

Query: 88  VTCKNRKSNCYKSSSALHITDCHLKGNSKYPNCDYKTTQYQKHIIVACEGNPYVPVHFDA 147
Sbjct: 60  VTCKNGRNNCHQSNSSFQITDCRLTGGSKYPNCRYRTTQSNKHIIVACEGNPGVPVHFDG 119
--------------------------------------------------------------------------------
gnl|CDD|7438, pfam00074, rnaseA, Pancreatic ribonuclease. Ribonucleases. Members include pancreatic RNAase A and angiogenins. Structure is an alpha+beta fold -- long curved beta sheet and three helices.
CD-Length = 121 residues, 99.2% aligned
Score = 173 bits (440), Expect = 8e-45

Query: 27  ESAAQKFQRQHMDPDGSSINSPTYCNQMMKRRDMTNGSCKPVNTFVHEPLADVQAVCSQE 86
Sbjct: 2   ETRAQKFQRQHIDP-NTSSSSPNYCNQMMKRRNMTQGRCKPVNTFVHESLADVKAVCSQK 60

Query: 87  NVTCKNRKSNCYKSSSALHITDCHLKGNSKYPNCDYKTTQYQKHIIVACEGNPYVPVHFD 146
Sbjct: 61  NVTCKNGQTNCYLSTSSFQLTDCRLTGGSKYPNCRYRTTPSTKRIIVACEGN--VPVHFD 118

Query: 147 ATV 149
Sbjct: 119 GSV 121

pfam00074

5. (a). Primary accession number P07998
(b). DE Ribonuclease pancreatic precursor (EC 3.1.27.5) (RNase 1) (RNase A)
DE (RNase UpI-1) (RIB-1).
OS Homo sapiens (Human).
The computation has been carried out on the complete sequence.
Molecular weight: 17644.24
Theoretical pI: 9.10
(c). Weights for window positions 1,..,7, using linear weight variation model:
Position:  1     2     3     4       5     6     7
Weight:    0.80  0.87  0.93  1.00    0.93  0.87  0.80
           edge              center              edge
MIN: 0.192
MAX: 0.947
(d).
FindPept tool
The entered sequence is:
MALEKSLVRL LLLVLILLVL GWVQPSLGKE SRAKKFQRQH MDSDSSPSSS STYCNQMMRR
RNMTQGRCKP VNTFVHEPLV DVQNVCFQEK VTCKNGQGNC YKSNSSMHIT DCRLTNGSRY
PNCAYRTSPK ERHIIVACEG SPYVPVHFDA SVEDST
156 amino acids. Theoretical pI/Mw: 9.10 / 17644.24
Entered peptide masses: 1000.000
Tolerance: ±0.5 daltons
Using monoisotopic masses of the occurring amino acid residues and interpreting your peptide masses as [M+H]+.
Enzyme: Chymotrypsin (C-term to F/Y/W/M/L, not before P) (P00766).
Cysteine in reduced form.

Matching peptides for unspecific cleavage:
User mass  DB mass   Δmass (daltons)  peptide          position  modifications  missed cleavages
1000.000   1000.415  0.415            (G)QGNCYKSNS(S)  97-105                   0
1000.000   1000.473  0.473            (V)PVHFDASVE(D)  145-153                  0

(e). (i). ScanProsite: Search a sequence against PROSITE
Sequence:
MALEKSLVRL LLLVLILLVL GWVQPSLGKE SRAKKFQRQH MDSDSSPSSS STYCNQMMRR
RNMTQGRCKP VNTFVHEPLV DVQNVCFQEK VTCKNGQGNC YKSNSSMHIT DCRLTNGSRY
PNCAYRTSPK ERHIIVACEG SPYVPVHFDA SVEDST
PROSITE Release 17.44, of 26-Apr-2003

>PDOC00001 PS00001 ASN_GLYCOSYLATION N-glycosylation site [pattern] [Warning: pattern with a high probability of occurrence].
    62 - 65    NMTQ
    104 - 107  NSSM
    116 - 119  NGSR
>PDOC00005 PS00005 PKC_PHOSPHO_SITE Protein kinase C phosphorylation site [pattern] [Warning: pattern with a high probability of occurrence].
    92 - 94    TcK
    128 - 130  SpK
>PDOC00006 PS00006 CK2_PHOSPHO_SITE Casein kinase II phosphorylation site [pattern] [Warning: pattern with a high probability of occurrence].
    128 - 131  SpkE
    151 - 154  SveD
>PDOC00008 PS00008 MYRISTYL N-myristoylation site [pattern] [Warning: pattern with a high probability of occurrence].
    96 - 101   GQgnCY
>PDOC00118 PS00127 RNASE_PANCREATIC Pancreatic ribonuclease family signature [pattern].
    68 - 74    CKpvNTF
9 hits with 5 PROSITE entries
(ii). 4
(iii). ScanProsite search of the same sequence against PROSITE Release 17.44, of 26-Apr-2003: the output is identical to that listed under (i), with the same 9 hits across 5 PROSITE entries (PS00001, PS00005, PS00006, PS00008, PS00127).
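
The ScanProsite hits in 5(e) can be cross-checked by hand, since each PROSITE pattern is a short motif that translates directly into a regular expression. The sketch below is only an illustration under stated assumptions: the five pattern definitions are written from the standard PROSITE entries as I recall them (they do not appear in the answer sheet itself), and the names `SEQ`, `PATTERNS`, `scan`, and `HITS` are hypothetical helpers, not part of any ExPASy tool.

```python
import re

# Human RNASE1 precursor (156 aa), copied from the ScanProsite input in 5(e).
SEQ = (
    "MALEKSLVRL" "LLLVLILLVL" "GWVQPSLGKE" "SRAKKFQRQH" "MDSDSSPSSS"
    "STYCNQMMRR" "RNMTQGRCKP" "VNTFVHEPLV" "DVQNVCFQEK" "VTCKNGQGNC"
    "YKSNSSMHIT" "DCRLTNGSRY" "PNCAYRTSPK" "ERHIIVACEG" "SPYVPVHFDA"
    "SVEDST"
)

# Assumption: PROSITE patterns hand-translated to regular expressions.
# The original PROSITE syntax is shown in the trailing comment of each entry.
PATTERNS = {
    "PS00001 ASN_GLYCOSYLATION": r"N[^P][ST][^P]",         # N-{P}-[ST]-{P}
    "PS00005 PKC_PHOSPHO_SITE": r"[ST].[RK]",              # [ST]-x-[RK]
    "PS00006 CK2_PHOSPHO_SITE": r"[ST].{2}[DE]",           # [ST]-x(2)-[DE]
    "PS00008 MYRISTYL": r"G[^EDRKHPFYW].{2}[STAGCN][^P]",  # G-{EDRKHPFYW}-x(2)-[STAGCN]-{P}
    "PS00127 RNASE_PANCREATIC": r"CK.{2}NTF",              # C-K-x(2)-N-T-F
}


def scan(seq, patterns):
    """Return {name: [(start, end, text), ...]} using 1-based inclusive positions."""
    hits = {}
    for name, regex in patterns.items():
        # Wrapping the pattern in a zero-width lookahead reports every start
        # position, so overlapping occurrences are not skipped.
        hits[name] = [
            (m.start() + 1, m.start() + len(m.group(1)), m.group(1))
            for m in re.finditer(f"(?=({regex}))", seq)
        ]
    return hits


HITS = scan(SEQ, PATTERNS)
for name, found in HITS.items():
    for start, end, text in found:
        print(f"{name}: {start} - {end} {text}")
```

Under these assumed patterns the positions agree with the ScanProsite listing; the only cosmetic difference is that ScanProsite prints wildcard positions in lowercase (for example `CKpvNTF`), while this sketch prints the matched residues as they appear in the sequence.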