LOCUS       SIVMM251               10277 bp ss-RNA             VRL       29-NOV-2000
DEFINITION  Simian (macaque) immunodeficiency virus, complete genome; isolate
            Mm251, proviral DNA.
ACCESSION   M19499 M15897 M16125 M24614 Y00294
VERSION     M19499.1  GI:334657
KEYWORDS    env protein; envelope-associated protein; gag polyprotein; nef
            protein; pol polyprotein; rev protein; tat protein; vif protein;
            vpr protein; vpx protein.
SOURCE      Simian immunodeficiency virus.
  ORGANISM  Simian immunodeficiency virus
            Viruses; Retroid viruses; Retroviridae; Lentivirus; Primate
            lentivirus group.
REFERENCE   1  (bases 5731 to 10277)
  AUTHORS   Hirsch,V.M.
  JOURNAL   Unpublished
REFERENCE   2  (bases 809 to 1677; 5731 to 10249)
  AUTHORS   Hahn,B.H., Kumar,P., Taylor,M.E., Arya,S.K. and Shaw,G.M.
  TITLE     Human retroviruses, cancer and AIDS: approach to prevention and
            therapy
  JOURNAL   Unpublished
REFERENCE   3  (bases 6078 to 10249)
  AUTHORS   Hirsch,V., Riedel,N. and Mullins,J.I.
  TITLE     The genome organization of STLV-3 is similar to that of the AIDS
            virus except for a truncated transmembrane protein
  JOURNAL   Cell 49 (3), 307-319 (1987)
  MEDLINE   87187627
REFERENCE   4  (bases 506 to 5975; 6319 to 10125)
  AUTHORS   Franchini,G., Gurgo,C., Guo,H.G., Gallo,R.C., Collalti,E.,
            Fargnoli,K.A., Hall,L.F., Wong-Staal,F. and Reitz,M.S. Jr.
  TITLE     Sequence of simian immunodeficiency virus and its relationship to
            the human immunodeficiency viruses
  JOURNAL   Nature 328 (6130), 539-543 (1987)
  MEDLINE   87287229
REFERENCE   5  (bases 7757 to 8090; 8108 to 8225; 8237 to 8625; 8706 to 8898)
  AUTHORS   Kestler,H.W.
  JOURNAL   Unpublished
REFERENCE   6  (bases 1 to 10249)
  AUTHORS   Donahue,P.R., Kornfeld,H., Gallo,M.V. and Mullins,J.I.
  JOURNAL   Unpublished
REFERENCE   7  (sites)
  AUTHORS   Kestler,H.W. III., Li,Y., Naidu,Y.M., Butler,C.V., Ochs,M.F.,
            Jaenel,G., King,N.W., Daniel,M.D. and Desrosiers,R.C.
  TITLE     Comparison of simian immunodeficiency virus isolates
  JOURNAL   Nature 331 (6157), 619-622 (1988)
  MEDLINE   88122665
COMMENT     [7] sites. Kestler et al.
            [7], (Nature 331, 619-622 (1988)) present strong evidence that the
            isolates previously referred to as STLV-III AGM ([4], [3], [1]) and
            HTLV-IV ([2]) are not authentic, but were derived from cell
            cultures infected with SIV MAC-251.
            The reference sequence for this entry is from unpublished (1988),
            with the exception of bases 10250-10277, which are from [1].
            The vpr coding region has been annotated to end near where the
            gene ends in the SIVMM142 sequence. A base change from 'a' at
            position 6443 in SIVMM142 to 't' at the corresponding position 6418
            in the SIVMM251 sequence results in the coding sequence ending 4
            amino acids earlier than it does in SIVMM142. Also, the variation
            marked at position 6377 ('aa' in K6W, 'aga' in K78) implies
            different amino acid sequences for those isolates from that point
            onward in vpr.
            The tat start is annotated in agreement with the site in [2]
            (which also agrees with the HIV2ROD sequence) rather than at the
            site marked in [3] for the K78 sequence.
            The start codon for rev is marked at position 6504 in accordance
            with reference [3] for the K78 sequence. This position also agrees
            with the rev start in the SIVMM142 sequence. However, [2]
            annotates the rev start at position 6354. Both of these start
            sites are in the same frame.
            The sequence annotation shows two alternate 3' splice junction
            sites at 8743 and 8785. See reference [3].
            Clean copy of sequence unpublished (1988) kindly provided by H.W.
            Kestler to the EMBL data library (25-FEB-1988). Computer-readable
            copy of sequence unpublished (1988) kindly provided by J.Mullins
            (12-SEP-1988).
            An internal stop codon is present in the env cds at positions
            8782-8784 in unpublished (1988) and 8785-8787 in [4],[3],[1],
            [2],unpublished (1988).
            Unpublished (1988) has shown proviral clone BK28 to be
            biologically active, causing persistent lymphadenopathy and death
            in one out of four rhesus monkeys 17 months after infection.
            BK28 was originally called HTLV-4 (Nature 326, 610-613 (1987)),
            but it has since been found to correspond to SIVMM251.
            The K6W sequence [4], which was derived from isolate Mm251,
            contains about 25 frame shifts in gag-pol relative to this
            sequence. A single base insertion at position 9678 results in a
            frame-shift in the nef coding region. The nef coding region in [4]
            would produce a protein about 36 residues longer than that
            produced by the other isolates.
            Author [7] address Kestler H.W., Harvard Medical School, New
            England Regional Primate Research Center, Department of
            Microbiology, One Pine Hill Drive, Southborough, Mass. 01772, USA.
            [7] submitted (25-FEB-1988) to the EMBL data library by.
            Complete source information:
            Simian (macaque) immunodeficiency virus proviral DNA, SIV mac
            isolate K6W, clones lambda-16, 27, 35, and 11 [4]; and SIV mac
            isolate K78, clones K2-10 and K2-5 [3], [1]; HTLV4/PK82 and
            HTLV4/PK190 proviral DNA clones [2]; SIV mac integrated proviral
            DNA isolate 251, clone lambda 251 unpublished (1988); proviral
            DNA, isolate Mm251, clone BK28 unpublished (1988).
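The frame statements in the COMMENT above come down to simple coordinate arithmetic, which can be sketched in Python. This is an illustrative check only and not part of the database entry: the helper names `same_frame` and `spliced_length` are hypothetical, and the coordinates are copied from this entry's COMMENT and FEATURES text (1-based, inclusive).

```python
"""Frame and length checks on coordinates quoted in this entry (illustrative only)."""

# 1-based, inclusive coordinates copied from the COMMENT / FEATURES text.
ENV_CDS_START = 6580                          # env CDS 6580..9225
REV_START_REF3 = 6504                         # rev start as annotated per [3]
REV_START_REF2 = 6354                         # rev start as annotated per [2]
TAT_SEGMENTS = [(6278, 6573), (8785, 8884)]   # join(6278..6573,8785..8884)
REV_SEGMENTS = [(6504, 6573), (8785, 9041)]   # join(6504..6573,8785..9041)


def same_frame(pos_a: int, pos_b: int) -> bool:
    """Two codon start positions share a frame if they differ by a multiple of 3."""
    return (pos_a - pos_b) % 3 == 0


def spliced_length(segments: list[tuple[int, int]]) -> int:
    """Total nucleotide length of a joined location with inclusive ends."""
    return sum(end - start + 1 for start, end in segments)


if __name__ == "__main__":
    print("rev starts share a frame:", same_frame(REV_START_REF3, REV_START_REF2))
    for stop_start in (8782, 8785):
        in_frame = same_frame(stop_start, ENV_CDS_START)
        print(f"stop codon starting at {stop_start} is in the env frame:", in_frame)
    for name, segments in (("tat", TAT_SEGMENTS), ("rev", REV_SEGMENTS)):
        length = spliced_length(segments)
        print(f"{name}: {length} nt, remainder mod 3 = {length % 3}")
```

With the coordinates listed in this entry, the two rev start candidates differ by 150 bases, both annotated env stop positions fall on codon boundaries relative to 6580, and the tat and rev joined locations total 396 and 327 bases, each a multiple of three.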
FEATURES             Location/Qualifiers
     source          1..10277
                     /organism="Simian immunodeficiency virus"
                     /proviral
                     /isolate="Mm251"
                     /db_xref="taxon:11723"
     LTR             1..806
                     /note="5' LTR"
     misc_binding    428..437
                     /note="putative"
                     /bound_moiety="Sp1 III"
     misc_binding    439..448
                     /note="putative"
                     /bound_moiety="Sp1 II"
     misc_binding    450..459
                     /note="putative"
                     /bound_moiety="Sp1"
     repeat_region   506..682
                     /note="R repeat 5' copy"
     prim_transcript 507..10125
                     /note="genomic mRNA"
     prim_transcript 507..10125
                     /note="tat, rev, nef subgenomic mRNA"
     variation       517..519
                     /note="cgg in unpublished (1988); gc in [4]"
                     /replace="gc"
     variation       617..618
                     /note="tg in unpublished (1988); t in [4]"
                     /replace="t"
     variation       668
                     /note="g in unpublished (1988); gc in [4]"
                     /replace="gc"
     variation       694
                     /note="g in unpublished (1988); c in [4]"
                     /replace="c"
     variation       779
                     /note="a in unpublished (1988); g in [4]"
                     /replace="g"
     variation       806
                     /note="a in unpublished (1988); agt in [4]"
                     /replace="agt"
     misc_binding    810..827
                     /bound_moiety="Lys-tRNA primer"
     variation       825..826
                     /note="ac in [4],unpublished (1988); gac in [2]"
                     /replace="gac"
     variation       843..844
                     /note="ga in [2],unpublished (1988); agaa in [4]"
                     /replace="agaa"
     variation       858..859
                     /note="ct in [2],unpublished (1988); cct in [4]"
                     /replace="cct"
     variation       870
                     /note="a in [4],unpublished (1988); g in [Unpublished
                     (1987) UCLA Symposia on Molecular and Cellular Biology]"
                     /replace="g"
     variation       895..897
                     /note="gac in [4],unpublished (1988); gc in [2]"
                     /replace="gc"
     variation       919
                     /note="g in [2],unpublished (1988); a in [4]"
                     /replace="a"
     variation       966..968
                     /note="gtt in [4],unpublished (1988); gt in [2]"
                     /replace="gt"
     variation       1009..1011
                     /note="atc in [2],unpublished (1988); ac in [4]"
                     /replace="ac"
     variation       1014
                     /note="g in [4],unpublished (1988); a in [2]"
                     /replace="a"
     CDS             1041..2561
                     /note="gag polyprotein"
                     /codon_start=1
                     /protein_id="AAB59905.1"
                     /db_xref="GI:334660"
                     /translation="MGARNSVLSGKKADELEKIRLRPGGKKKYMLKHVVWAANELDRF
                     GLAESLLENKEGCQKILSVLAPLVPTGSENLKSLYNTVCVIWCIHAEEKVKHTEEAKQ
                     IVQRHLVVETGTAETMPKTSRPTAPSSGRGGNYPVQQIGGNYVHLPLSPRTLNAWVKL
                     IEEKKFGAEVVPGFQALSEGCTPYDINQMLNCVGDHQAAMQIIRDIINEEAADWDLQH
                     PQPAPQQGQLREPSGSDIAGTTSSVDEQIQWMYRQQNPIPVGNIYRRWIQLGLQKCVR
                     MYNPTNILDVKQGPKEPFQSYVDRFYKSLRAEQTDAAVKNWMTQTLLIQNANPDCKLV
                     LKGLGVNPTLEEMLTACQGVGGPGQKARLMAEALKEALAPVPIPFAAAQKRGPRKPIK
                     CWNCGKEGHSARQCRAPRRQGCWKCGKMDHVMAKCPDRQAGFLGLGPWGKKPRNFPMA
                     QVHQGLTPTAPPEDPAVDLLKNYMQLGKQQRESREKPYKEVTEDLLHLNSLFGGDQ"
     variation       1056
                     /note="t in [2],unpublished (1988); g in [4]"
                     /replace="g"
     variation       1083
                     /note="g in [4],unpublished (1988); a in [2]"
                     /replace="a"
     variation       1190
                     /note="g in [2],unpublished (1988); c in [4]"
                     /replace="c"
     variation       1365
                     /note="g in [4],unpublished (1988); a in [2]"
                     /replace="a"
     variation       1396
                     /note="c in [4],unpublished (1988); t in [2]"
                     /replace="t"
     variation       1435
                     /note="g in [4],unpublished (1988); a in [2]"
                     /replace="a"
     variation       1619
                     /note="g in [2],unpublished (1988); a in [4]"
                     /replace="a"
     variation       1848
                     /note="g in unpublished (1988); a in [4]"
                     /replace="a"
     variation       1882
                     /note="c in unpublished (1988); t in [4]"
                     /replace="t"
     variation       1905
                     /note="g in unpublished (1988); a in [4]"
                     /replace="a"
     variation       2156..2158
                     /note="cgc in unpublished (1988); gcg in [4]"
                     /replace="gcg"
     variation       2170
                     /note="t in unpublished (1988); c in [4]"
                     /replace="c"
     variation       2189..2190
                     /note="ga in unpublished (1988); ac in [4]"
                     /replace="ac"
     CDS             <2216..5386
                     /note="NH2-terminus uncertain"
                     /codon_start=1
                     /product="pol polyprotein"
                     /protein_id="AAB59906.1"
                     /db_xref="GI:334661"
                     /translation="VLELWEGGTLCKAMQSPKKTGMLEMWKNGPCYGQMPRQTGGFFR
                     PWSMGKEAPQFPHGSSASGADANCSPRGPSCGSAKELHAVGQAAERKQREALQGGDRG
                     FAAPQFSLWRRPVVTAHIEGQPVEVLLDTGADDSIVTGIELGPHYTPKIVGGIGGFIN
                     TKEYKNVKIEVLGKRIKGTIMTGDTPINIFGRNLLTALGMSLNLPIAKVEPVKVTLKP
                     GKVGPKLKQWPLSKEKIVALREICEKMEKDGQLEEAPPTNPYNTPTFAIKKKDKNKWR
                     MLIDFRELNRVTQDFTEVQLGIPHPAGLAKRKRITVLDIGDAYFSIPLDEEFRQYTAF
                     TLPSVNNAEPGKRYIYKVLPQGWKGSPAIFQYTMRHVLEPFRKANPDVTLVQYMDDIL
                     IASDRTDLEHDRVVLQLKELLNSIGFSTPEEKFQKDPPFQWMGYELWPTKWKLQKIEL
                     PQRETWTVNDIQKLVGVLNWAAQIYPGIKTKHLCRLIRGKMTLTEEVQWTEMAEAEYE
                     ENKIILSQEQEGCYYQEGKPLEATVIKSQDNQWSYKIHQEDKILKVGKFAKIKNTHTN
                     GVRLLAHVIQKIGKEAIVIWGQVPKFHLPVERDVWEQWWTDYWQVTWIPEWDFISTPP
                     LVRLVFNLVKDPIEGEETYYTDGSCNKQSKEGKAGYITDRGKDKVKVLEQTTNQQAEL
                     EAFLMALTDSGPKTNIIVDSQYVMGIITGCPTESESRLVNQIIEEMIKKSEIYVAWVP
                     AHKGIGGNQEIDHLVSQGIRQVLFLEKIEPAQEEHDKYHSNVKELVFKFGLPRIVARQ
                     IVDTCDKCHQKGEAIHGQVNSDLGTWQMDCTHLEGKIVIVAVHVASGFIEAEVIPQET
                     GRQTALFLLKLAGRWPITHLHTDNGANFASQEVKMVAWWAGIEHTFGVPYNPQSQGVV
                     EAMNHHLKNQIDRIREQANSVETIVLMAVHCMNFKRRGGIGDMTPAERLINMITTEQE
                     IQFQQSKNSKFKNFRVYYREGRDQLWKGPGELLWKGEGAVILKVGTDIKVVPRRKAKI
                     IKDYGGGKEVDSSSHMEDTGEAREVA"
     variation       2277
                     /note="g in unpublished (1988); a in [4]"
                     /replace="a"
     variation       2336
                     /note="g in unpublished (1988); a in [4]"
                     /replace="a"
     variation       2717
                     /note="a in unpublished (1988); g in [4]"
                     /replace="g"
     variation       2747
                     /note="g in unpublished (1988); a in [4]"
                     /replace="a"
     variation       2858
                     /note="a in unpublished (1988); g in [4]"
                     /replace="g"
     variation       3053
                     /note="g in unpublished (1988); c in [4]"
                     /replace="c"
     variation       3083
                     /note="g in unpublished (1988); ga in [4]"
                     /replace="ga"
     variation       3103..3104
                     /note="ag in unpublished (1988); a in [4]"
                     /replace="a"
     variation       3648
                     /note="g in unpublished (1988); t in [4]"
                     /replace="t"
     variation       3702
                     /note="a in unpublished (1988); c in [4]"
                     /replace="c"
     variation       3839
                     /note="t in unpublished (1988); a in [4]"
                     /replace="a"
     variation       3878
                     /note="g in unpublished (1988); a in [4]"
                     /replace="a"
     variation       3971..3973
                     /note="tgg in unpublished (1988); gta in [4]"
                     /replace="gta"
     variation       4131
                     /note="g in unpublished (1988); t in [4]"
                     /replace="t"
     variation       4218
                     /note="a in unpublished (1988); t in [4]"
                     /replace="t"
     variation       4231..4237
                     /note="agaacag in unpublished (1988); gaca in [4]"
                     /replace="gaca"
     variation       4261
                     /note="a in unpublished (1988); g in [4]"
                     /replace="g"
     variation       4268
                     /note="t in unpublished (1988); a in [4]"
                     /replace="a"
     variation       4271
                     /note="c in unpublished (1988); ta in [4]"
                     /replace="ta"
     variation       4283..4284
                     /note="ac in unpublished (1988); a in [4]"
                     /replace="a"
     variation       4302
                     /note="c in unpublished (1988); g in [4]"
                     /replace="g"
     variation       4318..4320
                     /note="ttc in unpublished (1988); att in [4]"
                     /replace="att"
     variation       4324..4325
                     /note="at in unpublished (1988); agt in [4]"
                     /replace="agt"
     variation       4348..4350
                     /note="atg in unpublished (1988); tt in [4]"
                     /replace="tt"
     variation       4405..4407
                     /note="aaa in unpublished (1988); a in [4]"
                     /replace="a"
     variation       4413..4415
                     /note="aaa in unpublished (1988); gg in [4]"
                     /replace="gg"
     variation       4439..4442
                     /note="caca in unpublished (1988); ttag in [4]"
                     /replace="ttag"
     variation       4470..4474
                     /note="accac in unpublished (1988); gtcca in [4]"
                     /replace="gtcca"
     variation       4490
                     /note="a in unpublished (1988); t in [4]"
                     /replace="t"
     variation       4786..4790
                     /note="aagac in unpublished (1988); a in [4]"
                     /replace="a"
     variation       4804
                     /note="t in unpublished (1988); tc in [4]"
                     /replace="tc"
     variation       4832
                     /note="a in unpublished (1988); tat in [4]"
                     /replace="tat"
     variation       4850..4851
                     /note="ga in unpublished (1988); cag in [4]"
                     /replace="cag"
     variation       4870
                     /note="c in unpublished (1988); a in [4]"
                     /replace="a"
     variation       4892
                     /note="g in unpublished (1988); a in [4]"
                     /replace="a"
     variation       4911
                     /note="a in unpublished (1988); ag in [4]"
                     /replace="ag"
     variation       4921..4922
                     /note="tg in unpublished (1988); t in [4]"
                     /replace="t"
     CDS             5316..5960
                     /note="vif protein"
                     /codon_start=1
                     /protein_id="AAB59907.1"
                     /db_xref="GI:334662"
                     /translation="MEEEKRWIAVPTWRIPERLERWHSLIKYLKYKTKDLQKVCYVPH
                     FKVGWAWWTCSRVIFPLQEGSHLEVQGYWHLTPERGWLSTYAVRITWYSRNFWTDVTP
                     DYADILLHSTYFPCFTAGEVRRAIRGEQLLSCCKFPRAHRYQVPSLQYLALKVVSDVR
                     SQGENPTWKQWRRDNRRGLRMAKQNSRGDKQRGSKPPTKGADFPGLAKVLGILA"
     variation       5560
                     /note="t in unpublished (1988); c in [4]"
                     /replace="c"
     variation       5598..5599
                     /note="aa in unpublished (1988); gac in [4]"
                     /replace="gac"
     variation       5632
                     /note="a in unpublished (1988); g in [4]"
                     /replace="g"
     variation       5639
                     /note="a in unpublished (1988); tc in [4]"
                     /replace="tc"
     variation       5660..5662
                     /note="ttg in unpublished (1988); t in [4]"
                     /replace="t"
     variation       5737
                     /note="a in [4],[2],unpublished (1988); g in [1]"
                     /replace="g"
     CDS             5788..6126
                     /note="vpx protein"
                     /codon_start=1
                     /protein_id="AAB59908.1"
                     /db_xref="GI:334663"
                     /translation="MSDPRERIPPGNSGEETIGEAFEWLNRTVEEINREAVNHLPREL
                     IFQVWQRSWEYWHDEQGMSQSYVKYRYLCLMQKALFMHCKKGCRCLGEGHGAGGWRPG
                     PPPPPPPGLA"
     variation       5817..5818
                     /note="tg in [4],[2],unpublished (1988); aa in [1]"
                     /replace="aa"
     variation       5877
                     /note="g in [4],[2],unpublished (1988); a in [1]"
                     /replace="a"
     variation       5895
                     /note="a in [2],unpublished (1988); g in [4],[1]"
                     /replace="g"
     variation       5968
                     /note="g in [4],[2],unpublished (1988); a in [1]"
                     /replace="a"
     variation       6070
                     /note="g in [2],unpublished (1988); a in [1]"
                     /replace="a"
     variation       6078
                     /note="a in [3],[1],unpublished (1988); g in [2]"
                     /replace="g"
     CDS             6127..6420
                     /note="vpr protein"
                     /codon_start=1
                     /protein_id="AAB59909.1"
                     /db_xref="GI:334664"
                     /translation="MEERPPENEGPQREPWDEWVVEVLEELKEEALKHFDPRLLTALG
                     NHIYNRHGDTLEGAGELIRILQRALFMHFRGGCNHSRIGQPGGGNPLSTIPPS"
     variation       6143
                     /note="c in [2],unpublished (1988); t in [3],[1]"
                     /replace="t"
     variation       6250
                     /note="g in [3],[1],unpublished (1988); a in [2]"
                     /replace="a"
     CDS             join(6278..6573,8785..8884)
                     /note="tat protein"
                     /codon_start=1
                     /protein_id="AAB59910.1"
                     /db_xref="GI:334658"
                     /translation="METPLREQENSLESSNERSSCILEADATTPESANLGEEILSQLY
                     RPLEACYNTCYCKKCCYHCQFCFLKKGLGICYEQSRKRRRTPKKAKANTSSASNNRLI
                     PNRTRHCQPEKAKKETVEKAVATAPGLGR"
     variation       6377..6378
                     /note="aa in [4],unpublished (1988); aga in [3],[1], [2]"
                     /replace="aga"
     variation       6493
                     /note="g in [4],[2],unpublished (1988); a in [3],[1]"
                     /replace="a"
     variation       6499
                     /note="a in [3],[1],[2],unpublished (1988); t in [4]"
                     /replace="t"
     CDS             join(6504..6573,8785..9041)
                     /note="rev protein"
                     /codon_start=1
                     /protein_id="AAB59911.1"
                     /db_xref="GI:334659"
                     /translation="MSSHEREEELRKRLRLIHLLHQTIDSYPTGPGTANQRRQRRRRW
                     RRRWQQLLALADRIYSFPDPPTDTPLDLAIQQLQNLAIESIPDPPTNTPEALCDPTKG
                     SRSPQD"
     intron          6574..8784
                     /note="rev and tat,[3]"
                     /number=2
     intron          6574..8742
                     /note="tat, rev, nef subgenomic mRNA intron 2 (alternate),
                     [3]"
     intron          6574..8742
                     /note="rev and tat (alternate),[3]"
                     /number=2
     intron          6574..8784
                     /note="tat, rev, nef subgenomic mRNA intron 2,[3]"
     CDS             6580..9225
                     /note="envelope polyprotein (in-frame stop at 8785, [4],
                     [3],[1],[2],["
                     /codon_start=1
                     /pseudo
     variation       6592..6593
                     /note="gg in [4],[2],unpublished (1988); aa in [3],[1]"
                     /replace="aa"
     variation       7009
                     /note="a in [4],[2],unpublished (1988); t in [3],[1]"
                     /replace="t"
     variation       7097
                     /note="c in [4],[2],unpublished (1988); a in [3],[1]"
                     /replace="a"
     variation       7174
                     /note="g in [4],[2],unpublished (1988); a in [3],[1]"
                     /replace="a"
     variation       7209
                     /note="g in [4],[3],[1],unpublished (1988); a in [2]"
                     /replace="a"
     variation       7239
                     /note="a in unpublished (1988); g in [4],[3],[1],[2]"
                     /replace="g"
     variation       7565
                     /note="g in [4],[2],unpublished (1988); a in [3],[1]"
                     /replace="a"
     variation       7569
                     /note="g in [4],[2],unpublished (1988); a in [3],[1]"
                     /replace="a"
     variation       7588
                     /note="a in [3],[1],[2],unpublished (1988); c in [4]"
                     /replace="c"
     variation       7592
                     /note="a in [3],[1],[2],unpublished (1988); c in [4]"
                     /replace="c"
     variation       7638
                     /note="g in [4],[2],unpublished (1988); a in [3],[1]"
                     /replace="a"
     variation       7743
                     /note="g in [4],[3],[1],unpublished (1988); a in [2]"
                     /replace="a"
     variation       7826
                     /note="g in [4],[2],unpublished (1988),unpublished (1988);
                     a in [3],[Unpublished (1987) Cancer Biol Dept, Harvard
                     School of Pub."
                     /replace="a"
     variation       7865
                     /note="g in [4],[2],unpublished (1988),unpublished (1988);
                     a in [3],[Unpublished (1987) Cancer Biol Dept, Harvard
                     School of Pub."
                     /replace="a"
     variation       7894
                     /note="a in [4],[3],[1],unpublished (1988),unpublished
                     (1988); g in [Unpublished (1987) UCLA Symposia on
                     Molecular and Cellular"
                     /replace="g"
     variation       8021
                     /note="g in [4],[3],[1],unpublished (1988); a in [2],
                     [Unpublished"
                     /replace="a"
     variation       8061
                     /note="g in [3],[1],[2],unpublished (1988),unpublished
                     (1988); a in [Nature 328, 539-543"
                     /replace="a"
     variation       8100
                     /note="t in [4],[3],[1],unpublished (1988); c in [2]"
                     /replace="c"
     variation       8184
                     /note="g in [4],[2],unpublished (1988),unpublished (1988);
                     a in [3],[Unpublished (1987) Cancer Biol Dept, Harvard
                     School of Pub."
                     /replace="a"
     variation       8214
                     /note="g in [4],[2],unpublished (1988),unpublished (1988);
                     a in [3],[Unpublished (1987) Cancer Biol Dept, Harvard
                     School of Pub."
                     /replace="a"
     variation       8223
                     /note="g in [3],[1],[2],unpublished (1988),unpublished
                     (1988); c in [Nature 328, 539-543"
                     /replace="c"
     variation       8229..8231
                     /note="gac in [3],[1],[2],unpublished (1988); cag in [4]"
                     /replace="cag"
     variation       8294
                     /note="a in [3],[1],[2],unpublished (1988),unpublished
                     (1988); g in [Nature 328, 539-543"
                     /replace="g"
     variation       8326
                     /note="c in [3],[1],[2],unpublished (1988),unpublished
                     (1988); t in [Nature 328, 539-543"
                     /replace="t"
     variation       8360
                     /note="g in [4],[2],unpublished (1988),unpublished (1988);
                     a in [3],[Unpublished (1987) Cancer Biol Dept, Harvard
                     School of Pub."
                     /replace="a"
     variation       8386
                     /note="a in [3],[1],[2],unpublished (1988),unpublished
                     (1988); g in [Nature 328, 539-543"
                     /replace="g"
     variation       8403
                     /note="a in unpublished (1988); g in [4],[3],[1],[2],
                     [Unpublished"
                     /replace="g"
     variation       8407
                     /note="g in [4],[3],[1],unpublished (1988),unpublished
                     (1988); a in [Unpublished (1987) UCLA Symposia on
                     Molecular and Cellular"
                     /replace="a"
     variation       8587
                     /note="g in [4],[2],unpublished (1988),unpublished (1988);
                     a in [3],[Unpublished (1987) Cancer Biol Dept, Harvard
                     School of Pub."
                     /replace="a"
     variation       8601
                     /note="g in [4],[3],[1],unpublished (1988),unpublished
                     (1988); a in [Unpublished (1987) UCLA Symposia on
                     Molecular and Cellular"
                     /replace="a"
     variation       8782
                     /note="c in [4],[3],[1],[2],unpublished (1988); t in
                     [Unpublished"
                     /replace="t"
     misc_feature    8782..8784
                     /note="in-frame stop in env cds unpublished (1988)"
     variation       8785
                     /note="t in [4],[3],[1],[2],unpublished (1988); c in
                     [Unpublished"
                     /replace="c"
     misc_feature    8785..8787
                     /note="in-frame stop in env cds [4],[3],[1],[2],
                     unpublished (1988)"
     variation       9019
                     /note="g in unpublished (1988); a in [4],[3],[1],[2]"
                     /replace="a"
     CDS             9059..9802
                     /note="nef protein"
                     /codon_start=1
                     /protein_id="AAB59912.1"
                     /db_xref="GI:334665"
                     /translation="MGGAISMRRSKPAGDLRQKLLRARGETYGRLLGEVEDGSSQSLG
                     GLGKGLSSRSCEGQKYNQGQYMNTPWRNPAEEKEKLAYRKQNMDDIDEEDDDLVGVSV
                     RPKVPLRAMTYKLAIDMSHFIKEKGGLEGIYYSARRHRILDMYLEKEEGIIPDWQDYT
                     SGPGIRYPKTFGWLWKLVPVNVSDEAQEDERHYLMQPAQTSKWDDPWGEVLAWKFDPT
                     LAYTYEAYARYPEELEASQACQRKRLEEG"
     variation       9133
                     /note="a in unpublished (1988); g in [4] [3],[1],[2]"
                     /replace="g"
     variation       9204
                     /note="g in [3],[1],[2],unpublished (1988); a in [4]"
                     /replace="a"
     variation       9224
                     /note="g in [3],[1],[2],unpublished (1988); a in [4]"
                     /replace="a"
     variation       9289
                     /note="a in [3],[1],unpublished (1988); g in [4],[2]"
                     /replace="g"
     variation       9352
                     /note="a in [4],[2],unpublished (1988); c in [3],[1]"
                     /replace="c"
     LTR             9444..10249
                     /note="3' LTR"
     variation       9474
                     /note="a in [4],[3],[1],unpublished (1988); g in [2]"
                     /replace="g"
     variation       9498..9508
                     /note="aaaaggaagaa in [3],[1],[2],unpublished (1988); a in
                     [4]"
                     /replace="a"
     variation       9677..9678
                     /note="tg in [3],[1],[2],unpublished (1988); t in [4]"
                     /replace="t"
     variation       9691..9692
                     /note="ag in unpublished (1988),[3],[1],[2]; ga in [4]"
                     /replace="ga"
     variation       9702
                     /note="t in [4],[3],[1],unpublished (1988); c in [2]"
                     /replace="c"
     variation       9714
                     /note="t in [4],[2],unpublished (1988); c in [3],[1]"
                     /replace="c"
     variation       9738
                     /note="c in unpublished (1988); t in [4],[3],[1],[2]"
                     /replace="t"
     variation       9754
                     /note="g in unpublished (1988); gt in [4],[3],[1],[2]"
                     /replace="gt"
     variation       9836
                     /note="g in [4],[2],unpublished (1988); a in [3],[1]"
                     /replace="a"
     variation       9852..9853
                     /note="ta in unpublished (1988); ca in [4],[3],[1]; cg in
                     [2]"
                     /replace="cg"
     repeat_region   9949..10125
                     /note="R repeat 3' copy"
     variation       9959..9962
                     /note="gcgg in [3],[1],[2],unpublished (1988); ggc in [4]"
                     /replace="ggc"
     variation       9972
                     /note="c in [4],[3],[1],unpublished (1988); t in [2]"
                     /replace="t"
     variation       10019
                     /note="g in [4],[2],unpublished (1988); a in [3],[1]"
                     /replace="a"
     variation       10061..10063
                     /note="ggg in [3],[1],[2],unpublished (1988); gg in [4]"
                     /replace="gg"
     polyA_signal    10102..10107
                     /note="mRNA polyadenylation signal"
     variation       10111
                     /note="g in [3],[1],[2],unpublished (1988); gc in [4]"
                     /replace="gc"
     variation       10188..10189
                     /note="aa in [3],[1],unpublished (1988); gg in [2]"
                     /replace="gg"
     variation       10192..10193
                     /note="cc in [3],[1],unpublished (1988); ccc in [2]"
                     /replace="ccc"
     variation       10222
                     /note="a in [3],[1],unpublished (1988); g in [2]"
                     /replace="g"
     variation       10227
                     /note="c in [2],unpublished (1988); t in [3],[1]"
                     /replace="t"
     variation       10231
                     /note="a in [3],[1],unpublished (1988); g in [2]"
                     /replace="g"
     variation       10239
                     /note="a in [3],[1],unpublished (1988); g in [2]"
                     /replace="g"
BASE COUNT     3451 a   1941 c   2581 g   2304 t
ORIGIN
1 tggaagggat 61 aagaaggcat 121 agacatttgg 181 atgagaggca 241 aggttctagc 301 acccagaaga 361 cgcaagaggc 421 tccacaaggg 481 atatcactgc 541 tgggaggttc 601 agcacttggc 661 taaagctgcc 721 gtcaactcgg 781 aaaccgaagc 841 gagactcctg 901 gtgctcctat 961 ctccggttgc 1021 ataataagat 1081 atgaattaga 1141 tagtatgggc 1201 aagaaggatg 1261 atttaaaaag 1321 tgaaacacac 1381 cagcagaaac 1441 attacccagt 1501 taaatgcctg 1561 ttcaggcact 1621 gagaccatca 1681 gggacttgca 1741 cagatattgc 1801 agaaccccat 1861 gtgtcagaat 1921 ttcagagcta ttattacagt cataccagat ctggctatgg ttatttaatg gtggaagttt gttggaagca cttcttaaca gatgttatgg atttcgctct tctccagcac cagtgctggg attttagaag tactcggtaa aggaaaatcc agtacggctg aaaggcgcgg 
aggtaagtgc agagtgggag aaaaattagg agcaaatgaa tcaaaaaata cctttataat tgaggaagca tatgccaaaa acaacaaata ggtaaaattg gtcagaaggc agcggctatg gcacccacaa aggaacaact accagtaggc gtataaccca tgtagacagg gcaagaagac tggcaggatt aaattagtcc cagccagctc gatccaactc agtcaggcct tggctgacaa ggaggagccg gtattcagtc tagcaggtag cagagtggct taagccagtg taagaagacc ctagcagatt agtgaaggca gtcggtacca aacacaaaaa atgggcgcga ctacgacccg ttagatagat ctttcggtct actgtctgcg aaacagatag acaagtagac ggtggtaact atagaggaaa tgcaccccct cagattatca ccagctccac agttcagtag aacatttaca acaaacattc ttctacaaaa atagaatctt acacctcagg ctgtaaatgt aaacttccaa tagcctacac gtcagaggaa gagggaaact gtcgggaaca gctctgcgga agcctgggtg ccacgcttgc tgtgttccca ctggtctgtt ggcgcccgaa gtaagggcgg gacggcgtga agaaatagct gaaactccgt gcggaaagaa ttggattagc tagctccatt tcatctggtg tgcagagaca caacagcacc atgtccacct agaaatttgg atgacattaa gagatattat aacaaggaca atgaacaaat ggagatggat tagatgtaaa gcttaagagc agacatgtac accaggaatt atcagatgag gtgggatgac ttatgaggca gaggttagaa cgctgagata cccactttct gaggctggca ttccctgcta ttgcttaaag tctctcctag aggacccttt caggacttga caggaaccaa ggagcgggag gtcttgttat cttgtcaggg aaagtacatg agaaagcctg agtgccaaca cattcacgca cctagtggtg atctagcggc gccattaagc agcagaagta tcagatgtta aaatgaggag gcttagggag ccagtggatg ccaactgggg acaagggcca agaacaaaca ttagaaaagg agatacccaa gcacaggagg ccttggggag tatgctagat gaaggctaac gcagggactt tgatgtataa gattgagccc gactctcacc acctcttcaa tcgccgcctg ctgctttgag aggagagtga ccacgacgga aggaggaggc ccaggaaggg aagaaagcag ttgaagcatg ttggagaaca ggctcagaaa gaagagaaag gaaacaggaa agaggaggaa ccgagaacat gtgccaggat aattgtgtgg gctgcagatt ccgtcaggat tacagacaac ttgcaaaaat aaagagccat gatgcagcag 1981 2041 2101 2161 2221 2281 2341 2401 2461 2521 2581 2641 2701 2761 2821 2881 2941 3001 3061 3121 3181 3241 3301 3361 3421 3481 3541 3601 3661 3721 3781 3841 3901 3961 4021 4081 4141 4201 4261 4321 4381 4441 4501 4561 4621 4681 4741 4801 4861 4921 4981 5041 5101 5161 5221 5281 5341 taaagaattg tgctgaaggg tagggggacc cagtgccaat ggaattgtgg gctggaaatg 
ttttaggcct aggggctgac tgcagttggg atttgctgca acagcctgta gttaggtcca agaatacaaa aggggacact aaatcttccc accaaaattg tgaaaagatg ccccacattt ggaactaaat aggactagca acctctagat agagccagga catcttccaa gaccttagtc tgacagggta agagaaattc atggaagttg gaagttagta tctctgtagg ggcagaagca ttaccaagaa ttataaaatt tacacatacc aatagtgatc acagtggtgg gccaccacta ctattataca agataggggc agaagcattt acaatatgtt ccaaataata caaaggtata tctcttcttg aaaagaattg ctgtgataaa gacttggcaa agctagtgga atttctgtta taactttgcc tggggtacca aaatcaaata agttcattgc attaattaac atttaaaaat tgagctattg agtacccaga cagttcccac gatgactcaa gctgggtgtg aggacagaag cccttttgca gaaggaggga tggaaaaatg tggtccatgg gccaactgct caagcagcag cctcaattct gaagtattat cattataccc aatgtaaaaa ccgattaaca atagctaagg aagcagtggc gaaaaggatg gccataaaga agggtcactc aaaaggaaaa gaagaattta aaacgataca tacactatga cagtatatgg gttttacagc caaaaagatc caaaagatag ggagtattaa ttaattagag gaatatgagg ggcaagccat caccaagaag aatggagtta tggggacagg acagactatt gtaagattag gatggatcat aaagacaaag ctcatggcat atgggaataa gaagaaatga ggaggaaacc gaaaagatag gtattcaaat tgtcatcaga atggactgta ttcatagaag aaattggcag tcgcaagaag tacaatccac gatagaatca atgaatttta atgatcacta tttcgggtct tggaaagggg agaaaggcta atggaggata acactgctga aatcccaccc gctagattaa gcagcccaga cactctgcaa gaccatgtta ggaaagaagc cccccagagg agagaaagca ctctttggag tggatacagg caaaaatagt tagaagtttt tttttggtag tagagcctgt cattatcaaa gtcagttgga aaaaagataa aggactttac ggattacagt ggcagtacac tttataaggt gacatgtgct atgacatctt taaaggaact ccccatttca agttgccaca attgggcagc gaaaaatgac aaaataagat tagaagccac acaaaatact gactattagc tcccaaaatt ggcaggtaac tcttcaatct gtaataaaca taaaagtgtt tgacagactc taacaggatg ttaaaaagtc aagaaataga agccagcaca ttggattacc aaggagaagc cccatctaga cagaagtaat gcagatggcc taaagatggt agagtcaggg gggaacaagc aaagaagggg cagaacaaga attacagaga aaggagcagt aaattatcaa ccggagaggc ttcaaaatgc tagaagaaat tggcagaagc agaggggacc ggcaatgcag tggccaaatg cccgcaattt acccagctgt gagagaagcc gagaccagta ggctgatgat aggaggaata aggcaaaagg gaatttgcta aaaagtcacc agaaaagata 
ggaagctccc gaacaaatgg agaagtccaa actggatata tgcctttact tctgcctcag agaacccttc aatagctagt cttaaatagc atggatgggg aagagagacc tcaaatttat tctaacagag aattctcagt ggtaataaag gaaagtagga acatgtaata ccacttacca ctggataccg agtgaaggac gtcaaaagaa agaacagact agggccaaag ccctacagaa agaaatttat ccacctagtt agaagaacat cagaatagtg tatacatggg aggaaaaata tccacaagag tattacacat tgcatggtgg agtagtggaa aaattcagta aggaataggg aatacaattt aggcagagat catcttaaag agattatgga tagagaggtg taacccagat gctgacggct cctgaaagag aagaaagcca agccccaaga cccagacaga ccccatggct ggatctgcta ttacaaggag gtcactgctc tctattgtaa ggaggtttta attaaaggga acagctctgg ttaaagccag gttgcattaa ccgaccaatc agaatgctga ttaggaatac ggtgatgcat ttaccatcag ggatggaagg aggaaggcaa gacaggacag atagggttct tacgaattgt tggacagtga ccaggtataa gaagttcagt caggaacaag agtcaggaca aaatttgcaa cagaaaatag gttgagaggg gagtgggatt cctatagagg gggaaagcag actaatcaac acaaatatta tcagagagca gtagcatggg agtcagggga gataaatacc gccagacaga caggtaaatt gtcatagttg acaggaagac ctacacacag gcagggatag gcaatgaatc gaaaccatag gatatgactc caacaatcaa caactgtgga gtagggacag ggaggaaaag gcatagcctc tgcaagctag tgtcaaggag gccctcgcac attaagtgtt agacagggat caggcgggtt caagtgcatc aagaactaca gtgacagagg atattgaagg caggaataga ttaatactaa caatcatgac ggatgtctct gaaaggttgg gagaaatctg catacaacac tagattttag cacaccctgc atttctccat taaataatgc ggtcaccagc atccagatgt acctggaaca ctaccccaga ggccgacaaa atgatataca aaaccaaaca ggactgagat aaggatgtta atcagtggtc agataaagaa gaaaggaagc atgtatggga ttatctcaac gagaagaaac gatatatcac aagcagaatt tagtagattc ggctagttaa taccagcaca ttagacaagt atagtaatgt tagtagacac cagatctagg cagtacatgt agacagcact ataatggtgc agcacacctt accacctgaa tattaatggc cagcagaaag aaaactcaaa agggacccgg acattaaggt aggtggatag ataaaatatc 5401 5461 5521 5581 5641 5701 5761 5821 5881 5941 6001 6061 6121 6181 6241 6301 6361 6421 6481 6541 6601 6661 6721 6781 6841 6901 6961 7021 7081 7141 7201 7261 7321 7381 7441 7501 7561 7621 7681 7741 7801 7861 7921 7981 8041 8101 8161 8221 8281 8341 8401 8461 8521 8581 8641 8701 8761 
tgaaatataa gggcatggtg tacaagggta taacctggta tgcatagcac aacaactgct agtacttagc aacagtggag ataaacagag tcttgggaat ttgtgtttaa gaaggacacg gcataaatgg tgggtagtgg ttgctaactg ggagaactca cactccagaa ggcgtgctat taaaaaggga ggctaaggct ctgcttatcg gtcttttatg aatagggata gcccttaatg gaggacgtat tgcattacta acaacaataa aatgagacta ataagctgta acttggtact tgctacatga gatactatta acaaattatt atgatggaga acttatattt aatctaacaa tctggattgg tttggaggaa aggtatactg ccggaagtta tggtttctaa agaaggaatt aaaaatgttt ctcatagcaa gtggcagaac ggcttggccc ggggtctttg gcgtcgttga caacagctgt acaaagaacc ctaaatgctt gcaagtctaa ttcttggagg atgtatgaat tcttggataa atagtgatct tcttccccac aactaaagat gacctgcagc ttggcatttg ctcaaggaac ttatttccct gtcttgctgc actaaaagta aagagacaat aggcagtaaa actggcatga tgcaaaaggc gggcaggagg aagaaagacc aggttctgga cacttggtaa ttagaatcct tcggccaacc aacacatgct ttggggatat aatacatctt ccatcttgct gtgtaccagc cttggggaac ttacagaaag ggcaactctt tgagatgcaa caacagcagc gttcttgtat aattcaccat ctacagattt atcactgtaa gatttaggta caggctttat cacagacttc actggcatgg tgaaatgtag ttttccactc aatggaagga gaactaacaa ccttcatgtg attgggtaga acgtgccgtg atttgcctcc acatagattg tgtatcgatt ccacagatgt tgctagggtt cgctgaccgc tggacgtggt tccagactag ggggatgtgc caccagactg aaaatataac tacaaaagtt agtatataca atatagtaca cctcttattt ctacaaaagg agagtaatct acaccagaaa ttttggacag tgctttacag aagttcccga gtaagcgatg aggagaggcc ccacctacca tgaacaaggg tttatttatg atggagacca tccagaaaat agaattgaaa tcatatctat ccaacgagcg tgggggagga attgtaaaaa gttatgagca ctgcatcaaa tttaagtgtc ttggaggaat aactcagtgc ctttgatgct tgagacctca taaaagtgag accaacatca agctcagaat gacagggtta ggtttgtgaa cacttctgtt ttgtgcacct gcctaaatgt tacttggttt tagggataat aagaccagga acaaccaatc tgcaataaaa tactgataaa gacaaattgc ggatagggat tcatattaga aagagaggga gactgatgga ggagttggga gaagaggtac cttgggtttt tcagtcccgg caagagacaa ggtcactgcc gtttagacaa gaacaatgat agccctccta gaatagctgg atatggaatt aatgctagct ccagtagact tttgctatgt tccccctaca gagggtggct atgtaacacc cgggagaagt gagctcatag tcagatccca ttcgaatggc agggagctga atgtcacaaa 
cattgcaaga ggacctcctc gaaggcccac gaagaagctt aatagacatg ctcttcatgc aatcctctct gtgttgctac gtcacgaaag caagtaagta tatgggatct gcgacaattc ctaccagata tgggagaata ataaagcctt acagatagat gcaccagtat aattgcacag aaaagagaca caagggaata atccaagaat ccaggttatg tctaaggtgg ggctttaatg aggactataa aataagacag aatgataggc gaggtgaaac atcaatttaa agaggagagt gtaactaccc caaataatca gacctcacgt aaccaaacta gattataaat actactggtg ctcgcaacgg actttattgg caagaattgt atcgagaagt gtctgccaca acttggcaag gaagaggcac gatgtgtttg tatgtagttg aagttaaggc catacccaac gccccatttt ggaaggaagc cagtacttat agactatgca gagaagggcc gtaccaggta gggagagaat taaacagaac ttttccaggt gctatgtaaa aaggctgtag ctcctccccc aaagggaacc taaaacattt gagacaccct attttagagg caactatacc cattgccagt agaagaagaa tgggatgtct attgtactca ccctcttctg atggtgatta cagtcacaga gtgtaaaatt ggggattgac cagaaaaaat gcttggaaca agacaaagga gcactgataa cttgtgacaa ctttgcttag tggtctcttc gaactagagc ttagtttaaa ttttaccagt caaagcaggc agaccattgt cggctcctgg tcctctactg agaggccaaa acacttggca gtaactccac gtatcaccat tagtagagat gcacctcaag caggttctgc ctgggatagt tgcgactgac acttaaagga ctactgtacc agtgggagcg aaattcaaca gcaattggtt taggagtaat aggggtatag aggacccggc aaggtcggat catttagaag gcagtgagga gacattttac atcaggggag ccaagcctac cccacctgga agtagaggag ttggcaaagg atacagatac atgtctaggg tccaggacta atgggatgaa tgatcctcgc tgagggagca cggatgcaac gccctcttga tttgttttct ctccgaaaaa tgggaatcag atatgtcaca tgcaaccaag ttcagaattg acaggcaata atccccatta aaaatcatca agacatggtc agagcaaatg gtacaatgaa tgaaagcaga acattattgg atgtaatgac atgcacaagg agaaaataga taagtattat caccattatg atggtgttgg caaacatccc aggaggagat taaaatgaat ggaacggcat taaagtaggc agtgaccagt gagtgcagag cactccgatt aaataaaaga aatgggcgcg gcagcaacag cgtctgggga ccaggcgcag atggccaaat aaaggttgac agagaagaac tgaccttgct actgttaaga gccagtgttc actgccaacc 8821 8881 8941 9001 9061 9121 9181 9241 9301 9361 9421 9481 9541 9601 9661 9721 9781 9841 9901 9961 10021 10081 10141 10201 10261 agagaaggca atagaatata aactgcagaa tctgcgaccc gggtggagct gcgggcgcgt atccctagga tcaggggcag atacagaaaa 
agtgaggcca ttttataaaa cttagacatg aggaccagga tgtatcagat caagtgggat cacttatgag gaagaggtta actcgctgag acacccactt ggagaggctg gtgttccctg tgcttgctta ccatctctcc gttaggaccc taatatagga aagaaggaga ttcatttcct ccttgctatc tacgaagggt atttccatga ggagagactt ggattaggca tatatgaata caaaatatgg aaagttcccc gaaaaggggg tacttagaaa attagatacc gaggcacagg gacccttggg gcatatgcta gaagaaggct atagcaggga tcttgatgta gcagattgag ctagactctc aagacctctt tagtcgccgc tttctgcttt gagacct cggtggagaa gatccgccaa gagagcatac tcgagaagtc ggcggtccaa atgggagact agggcttgag ctccatggag atgatataga taagagcaat gactggaagg aggaagaagg caaagacatt aggatgagag gagaggttct gatacccaga aaccgcaaga ctttccacaa taaatatcac ccctgggagg accagcactt caataaagct ctggtcaact gagaaaccga ggcggtggca ctgatacgcc cagatcctcc ctcaggactg gccggctgga cttaggagag ctcacgctct aaacccagct tgaggaagat gacttacaaa gatttattac catcatacca tggctggcta gcattattta agcgtggaag agagttggaa ggccttctta ggggatgtta tgcatttcgc ttctctccag ggccagtgct gccattttag cggtactcgg agcaggaaaa acagctcctg tcttgacttg aaccaatact aactgaccta gatctgcgac gtggaagatg tgtgagggac gaagaaaaag gatgacttgg ttggcaatag agtgcaagaa gattggcagg tggaaattag atgcagccag tttgatccaa gcaagtcagg acatggctga tggggaggag tctgtattca cactagcagg gggcagagtg aagtaagcca taataagaag tccctagcat gccttggcag gctattcagc ccagaggctc cctacaatat agaaactctt gatcctcgca agaaatacaa aaaaattagc taggggtatc atatgtctca gacatagaat attacacctc tccctgtaaa ctcaaacttc ctctagccta cctgtcagag caagagggaa ccggtcggga gtcgctctgc tagagcctgg gctccacgct gtgtgtgttc accctggtct gaagatggac // 6 n above report in format g 1 334657 Save the