LOCUS       HPV39        7833 bp ds-DNA             VRL       06-MAR-1991
DEFINITION  Human papillomavirus type 39 (HPV-39), complete genome.
ACCESSION   M62849 M38185
SOURCE      Human papillomavirus type 39 DNA isolated from a penile Bowenoid
            papule biopsy.
  ORGANISM  Human papillomavirus type 39
            Viridae; ds-DNA nonenveloped viruses; Papovaviridae;
            Papillomavirus.
REFERENCE   1  (bases 1 to 7833)
  AUTHORS   Volpers,C. and Streeck,R.E.
  TITLE     Genome organization and nucleotide sequence of human
            papillomavirus type 39
  JOURNAL   Virology 181, 419-423 (1991)
  STANDARD  full staff_review
COMMENT     In developing countries, cancer of the cervix is responsible for
            24% of all cancers in women. In these areas, it is the most
            frequent female malignancy. In developed countries, it ranks
            behind cancers of the breast, lung, uterus, and ovaries and
            accounts for 7% of all female cancers. HPV-39 is most often found
            in lesions of the genital mucosa that may carry a risk of
            malignant progression. Estimates indicate that HPV-39 and other
            less-studied HPV types (-31, -33, -35, -45, -51, -52, and -56)
            have been recovered from about 15% of all invasive cervical
            cancers. In a recent study of 365 female HPV-positive patients,
            HPV-39 DNA was detected in 3.9% of the tissue samples. The 7833
            bp genome of HPV-39 was first recovered and cloned from biopsy
            samples of penile Bowenoid papules, which contained the viral DNA
            in episomal form. Its genome contains an E7 ORF located
            immediately upstream of E1, an arrangement common to all genital
            papillomaviruses. The genome also lacks an initiation codon for
            E4, a feature characteristic of types 16, 31, and 33. Unusually,
            a large ORF of 1.3 kb has been found on the complementary strand
            of DNA. This ORF contains an initiation codon, a potential splice
            acceptor site close to the 5' end, and a polyadenylation signal
            at the 3' end. Further upstream of this large ORF is a smaller
            ORF preceded by a TATA box and an NF-1 binding site.
            The noncoding region of HPV-39 contains several features common
            to other papillomavirus types. It contains three complete and two
            degenerate versions of a PV-specific palindrome, which is
            speculated to be an E2 activator and/or repressor binding site.
            Possible promoter elements that have been identified include two
            TATA boxes, a conserved AAAGGGAGTA promoter element upstream of a
            12 bp palindrome tandem repeat, and an enhancer core sequence.
            Various transcription factor binding sites are also present.
            These include four possible sites for nuclear factor 1 (NF-1),
            two possible sites for activator protein 1 (AP-1), and a motif
            for a recently postulated papillomavirus enhancer associated
            factor (PVF). A glucocorticoid response element (GRE) resembling
            those of other types is present. In addition, a GRE with no
            equivalent in other types is found in the L1 ORF. The E6 and E7
            ORFs of HPV-39 contain four copies and one copy, respectively, of
            the well-conserved cysteine doublet (Cys-X-X-Cys) motif. These
            motifs may be involved in the formation of zinc-finger-like
            structures. The authors point out that mutational analysis of the
            HPV-16 ORF has shown that one copy of this motif is sufficient
            for transformation. In addition, the E7 ORF of HPV-39 contains a
            putative cell division motif found in genital HPVs associated
            with malignancy, SV40 large T antigen, adenovirus E1A, and the
            myc protein.
FEATURES             Location/Qualifiers
     CDS             <3393..3677
                     /note="E4 ORF from bp 3393 to 3677"
                     /gene="E4"
                     /note="putative"
                     /codon_start=1
                     /translation="FIVLTLCAVPVTDRYPLLNLLPNYQTPPRPIPPQQPHAPKKQSR
                     RRLESDLDSVQSQSPLSPTECPWTILTTHSTVTVQATTQDGTSVVVTLRL"
     5'UTR           join(7161..7833,1..106)
                     /standard_name="LCR"
     promoter        33..42
                     /note="putative"
     protein_bind    43..54
                     /function="gene transcription"
                     /bound_moiety="E2"
                     /note="putative"
     protein_bind    59..70
                     /function="gene transcription"
                     /bound_moiety="E2"
                     /note="putative"
     TATA_signal     74..80
                     /note="putative"
     CDS             107..583
                     /note="E6 ORF from bp 44 to 583"
                     /product="transforming protein"
                     /gene="E6"
                     /note="putative"
                     /codon_start=1
                     /translation="MARFHNPAERPYKLPDLCTTLDTTLQDITIACVYCRRPLQQTEV
                     YEFAFSDLYVVYRDGEPLAACQSCIKFYAKIRELRYYSDSVYATTLENITNTKLYNLL
                     IRCMCCLKPLCPAEKLRHLNSKRRFHKIAGSYTGQCRRCWTTKREDRRLTRRETQV"
     CDS             592..921
                     /note="E7 ORF from bp 493 to 921"
                     /product="transforming protein"
                     /gene="E7"
                     /note="putative"
                     /codon_start=1
                     /translation="MRGPKPTLQEIVLDLCPYNEIQPVDLVCHEQLGESEDEIDEPDH
                     AVNHQHQLLARRDEPQRHTIQCSCCKCNNTLQLVVEASRDTLRQLQQLFMDSLGFVCP
                     WCATANQ"
     CDS             928..2871
                     /note="E1 ORF from bp 922 to 2871"
                     /product="replication protein"
                     /gene="E1"
                     /note="putative"
                     /codon_start=1
                     /translation="MANREGTDGDGSGCNGWFLVQAIVDKQTGDTVSEDEDENATDTG
                     SDLADFIDDSTDICVQAERETAQVLLHMQEAQRDAQAVRALKRKYTDSSGDTRPYGKK
                     VGRNTRGTLQEISLNVSSTQATQTVYSVPDSGYGNMEVETAEVEEVTVATNTNGDAEG
                     EHGGSVREECSSVDSAIDSENQDPKSPTAQIKLLLQSNNKKAAMLTQFKETYGLSFTD
                     LVRTFKSDKTTCTDWVAAIFGVHPTIAEGFKTLINKYALYTHIQSLDTKQGVLILMLI
                     RYTCGKNRVTVGKGLSTLLHVPESCMLLEPPKLRSPVAALYWYRTGISNISVVTGDTP
                     EWIQRLTVIQHGIDDSVFDLSDMVQWAFDNEYTDESDIAFNYAMLADCNSNAAAFLKS
                     NCQAKYVKDCATMCKHYKRAQKRQMSMSQWIKFRCSKCDEGGDWRPIVQFLRYQGIEF
                     ISFLCALKEFLKGTPKKNCIVIYGPANTGKSHFCMSLMHFLQGTVISYVNSTSHFWLE
                     PLADAKLAMLDDATGTCWSYFDNYMRNALDGYAISLDRKYKSLLQMKCPPLLITSNTN
                     PVEDDRWPYLRSRLTVFKFPNAFPFDQNRNPVYTINDKNWKCFFEKTWCRLDLQQDED
                     EGDNDENTFTTFKCVTGQNTRIL"
     CDS             2798..3910
                     /note="E2 ORF from bp 2780 to 3910"
                     /product="regulatory protein"
                     /gene="E2"
                     /note="putative"
                     /codon_start=1
                     /translation="MKETMMKTLSQRLNVLQDKILEYYEQDSKSIYDQINYWKCVRME
                     NAIFYAARERGMHTIDHQVVPTINISKCKAYQAIELQMALESVAQTEYNTEEWTLKDT
                     SNELWHTQPKQCFKKQGTTVEVWYDGDKCNAMNYVLWGAIYYKNNIDIWCKTEGCVDY
                     WGIYYMNEHLKVYYEVFIQDAERYGTSGKWEVHYNGNIIHCPDSMCSTSDGSVPTTEL
                     TTELSNTTATHSTATTPCTQKTIPPPSRKRPRQCAVTEPTEPDGVSLDHLNNPLHSNS
                     TGHNTRRYLSCGNTTPIIHLKGDKNGLKCLRYRLQKYDTLFENISCTWHWIRGKGTKN
                     AGILTVTYATESQRQKFLDTVKIPSSVHVSLGYMTL"
     CDS             3958..4176
                     /note="E5 ORF from bp 3958 to 4176"
                     /gene="E5"
                     /note="putative"
                     /codon_start=1
                     /translation="MILLVFLVWFGVCIYICCNVPLLPSVHVCAYVWIIVFVFILIRT
                     TPLEVFFVYLLFFVLPMWLLHRLAMDMI"
     polyA_signal    4243..4248
     CDS             4250..5662
                     /note="L2 ORF from bp 4172 to 5662"
                     /product="minor capsid protein"
                     /gene="L2"
                     /note="putative"
                     /codon_start=1
                     /translation="MVSHRAARRKRASATDLYRTCKQSGTCPPDVVDKVEGTTLADKI
                     LQWTSLGIFLGGLGIGTGTGTGGRTGYIPLGGRPNTVVDVSPARPPVVIEPVGPSEPS
                     IVQLVEDSSVITSGTPVPTFTGTSGFEITSSSTTTPAVLDITPSSGSVQITSTSYTNP
                     AFTDPSLIEVPQTGETSGNIFVSTPTSGTHGYEEIPMEVFATHGTGTEPISSTPTPGI
                     SRVAGPRLYSRAHQQVRVSNFDFVTHPSSFVTFDNPAFEPVDTTLTYEAADIAPDPDF
                     LDIVRLHRPALTSRKGTVRFSRLGKKATMVTRRGTQIGAQVHYYHDISSIAPAESIEL
                     QPLVHAEPSDASDALFDIYADVDNNTYLDTAFNNTRDSGTTYNTGSLPSVASSASTKY
                     ANTTIPFSTSWNMPVNTGPDIALPSTTPQLPLVPSGPIDTTYAITIQGSNYYLLPLLY
                     FFLKKRKRIPYFFSDGYVAV"
     CDS             5643..7160
                     /note="L1 ORF from bp 5610 to 7160"
                     /product="major capsid protein"
                     /gene="L1"
                     /note="putative"
                     /codon_start=1
                     /translation="MAMWRSSDSMVYLPPPSVAKVVNTDDYVTRTGIYYYAGSSRLLT
                     VGHPYFKVGMNGGRKQDIPKVSAYQYRVFRVTLPDPNKFSIPDASLYNPETQRLVWAC
                     VGVEVGRGQPLGVGISGHPLYNRQDDTENSPFSSTTNKDSRDNVSVDYKQTQLCIIGC
                     VPAIGEHWGKGKACKPNNVSTGDCPPLELVNTPIEDGDMIDTGYGAMDFGALQETKSE
                     VPLDICQSICKYPDYLQMSADVYGDSMFFCLRREQLFARHFWNRGGMVGDAIPAQLYI
                     KGTDIRANPGSSVYCPSPSGSMVTSDSQLFNKPYWLHKAQGHNNGICWHNQLFLTVVD
                     TTRSTNFTLSTSIESSIPSTYDPSKFKEYTRHVEEYDLQFIFQLCTVTLTTDVMSYIH
                     TMNSSILDNWNFAVAPPPSASLVDTYRYLQSAAITCQKDAPAPEKKDPYDGLKFWNVD
                     LREKFSLELDQFPLGRKFLLQARVRRRPTIGPRKRPAASTSSSSATKHKRKRVSK"
     protein_bind    6367..6381
                     /bound_moiety="hormone receptor"
                     /standard_name="glucocorticoid responsive element"
                     /note="putative"
     polyA_signal    7261..7266
     protein_bind    7425..7439
                     /bound_moiety="hormone receptor"
                     /standard_name="glucocorticoid responsive element"
                     /note="putative"
     protein_bind    7456..7467
                     /function="gene transcription"
                     /bound_moiety="E2"
                     /note="putative"
     protein_bind    7798..7809
                     /function="gene transcription"
                     /bound_moiety="E2"
                     /note="putative"
     source          1..7833
                     /organism="Human papillomavirus type 39"
                     /sequenced_mol="DNA"
BASE COUNT     2426 a   1485 c   1660 g   2262 t
ORIGIN
        1 cttataacat tttataagta tcttgtttaa aaaaagggag taaccgaaaa cggtcaggac
       61 cgaaatcggt ggatataaaa cgcagtcaca gtttctgtcc ataccgatgg cgcgatttca
      121 caatcctgca gaacggccat acaaattgcc agacctgtgc acaacgctgg acaccacctt
      181 gcaggacatt acaatagcct gtgtctattg cagacgacca ctacagcaaa ccgaggtata
      241 tgaatttgca tttagtgatt tatatgtagt atatagggac ggggaaccac tagctgcatg
      301 ccaatcatgt ataaaatttt atgctaaaat acgggagcta cgatattact cggactcggt
      361 gtatgcaact acattagaaa atataactaa tacaaagtta tataatttat taataaggtg
      421 catgtgttgt ctgaaaccgc tgtgtccagc agaaaaatta agacacctaa atagcaaacg
      481 aagatttcat aaaatagcag gaagctatac aggacagtgt cgacggtgct ggaccacaaa
      541 acgggaggac cgcagactaa cacgaagaga aacccaagta taacatcaga tatgcgtgga
      601 ccaaagccca ccttgcagga aattgtatta gatttatgtc cttacaatga aatacagccg
      661 gttgaccttg tatgtcacga gcaattagga gagtcagagg atgaaataga tgaacccgac
      721 catgcagtta atcaccaaca tcaactacta gccagacggg atgaaccaca gcgtcacaca
      781 atacagtgtt cgtgttgtaa gtgtaacaac acactgcagc tggtagtaga agcctcacgg
      841 gatactctgc gacaactaca gcagctgttt atggactcac taggatttgt gtgtccgtgg
      901 tgtgcaactg caaaccagta acctgctatg gccaatcgtg aaggtacaga cggggatggg
      961 tcgggatgta acggatggtt tctagtacag gcaatagtag ataaacaaac aggcgacaca
     1021 gtgtcggagg atgaggatga aaatgcaaca gatacaggtt cagacctggc agactttatt
     1081 gatgattcca cagatatttg tgtacaggca gagcgtgaga cagcacaggt acttttacat
     1141 atgcaagagg cccaaaggga tgcacaagca gtgcgtgcct taaaacgaaa gtatacagac
     1201 agcagtggcg acactagacc gtatggaaaa aaagtaggca ggaataccag gggaacacta
     1261 caggaaattt cattaaatgt aagcagtacg caggcaacac aaacggtgta ttccgtgcca
1321 gacagcggat atggcaatat ggaagtggaa acagctgaag tggaggaggt aactgtagca 1381 actaatacaa atggggatgc tgaaggggaa catggcggca gtgtacggga ggagtgcagt 1441 agtgtggata gtgctataga tagtgaaaac caggatccca aatctccaac tgcacaaatt 1501 aaattattgt tacaatccaa taacaaaaag gctgcaatgc taacacaatt taaagaaaca 1561 tatggactat cctttactga cctggtacgt acgtttaaaa gtgataaaac aacatgtaca 1621 gactgggtgg cagccatatt tggagtacat ccaactattg cagaaggatt taaaacatta 1681 atcaacaaat atgccttata tacacatata caaagcttag acacaaaaca aggagtacta 1741 attttaatgc taataagata tacatgtgga aaaaataggg ttactgtagg aaagggatta 1801 agtacattgt tacatgttcc agaaagttgt atgcttctgg agcctcctaa actgcgcagc 1861 cctgtagcag cactatattg gtatcgcaca ggtatatcca atattagtgt ggtaacaggg 1921 gatacgccag aatggataca acgattaact gttatacaac atggaataga tgatagtgta 1981 tttgacctat cggacatggt acaatgggca tttgacaatg aatatactga tgaaagtgac 2041 atagcattta attatgcaat gttagcagat tgtaacagta atgctgcagc ctttttaaaa 2101 agtaactgcc aggcaaaata tgtaaaagat tgtgcaacaa tgtgtaaaca ttacaagcga 2161 gcacaaaaaa ggcaaatgtc catgtctcaa tggataaaat ttaggtgtag taaatgtgat 2221 gaaggcgggg actggagacc catagtacaa ttcttaagat atcaaggaat agaatttata 2281 tcctttttat gtgcattaaa ggaattttta aagggtactc ccaaaaaaaa ctgtatagtt 2341 atatatggac ctgcgaatac aggaaagtca catttttgta tgagccttat gcatttttta 2401 cagggcacag ttatttcata tgtaaactcc accagccact tttggctaga accacttgca 2461 gatgcaaaac tagcaatgtt agatgatgca accggtacct gctggtcata tttcgataat 2521 tatatgagaa atgcattaga tgggtatgca ataagtttag ataggaaata taaaagttta 2581 ctacaaatga aatgtccacc attattaata acctccaata ccaatcctgt ggaagacgat 2641 aggtggccat atttacgtag taggctaaca gtgtttaaat ttcctaatgc atttccattt 2701 gaccaaaaca ggaatccagt gtacacaatc aatgataaaa actggaaatg tttttttgaa 2761 aagacttggt gcagattaga cttgcagcag gacgaggatg aaggagacaa tgatgaaaac 2821 actttcacaa cgtttaaatg tgttacagga caaaatacta gaatactatg aacaagacag 2881 taaatcaata tatgatcaaa ttaattattg gaaatgtgtg cgaatggaaa atgcaatatt 2941 ttatgcagca cgagaacgtg gcatgcatac tattgaccac caggtggtgc caaccataaa 3001 
catttcaaaa tgtaaagcat atcaagctat tgaactgcag atggcactag aaagtgttgc 3061 acaaactgaa tacaatacag aggagtggac attaaaagac actagtaatg aactgtggca 3121 tacacagcca aaacaatgtt ttaaaaaaca aggaactaca gtggaggtgt ggtatgatgg 3181 ggacaaatgt aatgctatga actatgtatt atggggtgct atatattata aaaataatat 3241 agacatatgg tgtaaaacag aagggtgtgt ggactattgg ggtatatatt atatgaacga 3301 gcacctaaaa gtatactatg aagtgtttat tcaagatgcg gaaaggtatg ggactagtgg 3361 caaatgggaa gtgcattata atggcaacat aattcattgt cctgactcta tgtgcagtac 3421 cagtgacgga tcggtaccca ctactgaact tactaccgaa ttatcaaaca ccaccgcgac 3481 ccattccacc gcaacaaccc catgcaccca aaaaacaatc ccgccgccgt ctcgaaagcg 3541 acctcgacag tgtgcagtca cagagcccac tgagcccgac ggagtgtccc tggaccatct 3601 taacaaccca ctccacagta acagtacagg ccacaacaca agacggtacc tcagttgtgg 3661 taacactacg cctataatac atttaaaagg tgacaaaaat ggtttaaaat gtttaagata 3721 tagactacaa aaatatgaca cattgtttga aaatatttca tgtacctggc attggatacg 3781 gggtaaggga accaaaaacg ctggcatatt aactgttaca tatgccacag agtcacaacg 3841 ccaaaaattt ttggacactg ttaaaatacc ttctagtgta catgtttcat tgggttacat 3901 gacattgtaa agtatactat ggatattgtg tatgtatatt gtatacatac tacatagatg 3961 atattattgg tatttttggt gtggtttggt gtgtgtatat atatatgttg caatgtcccg 4021 cttttgccgt ctgtgcatgt gtgtgcgtat gtgtggataa ttgtgtttgt gtttattctt 4081 atacgtacca caccattgga ggtgtttttt gtatatttac tattttttgt attgcccatg 4141 tggttgttgc atagactggc aatggatatg atatagtact gtatatgtat gtgcattgtg 4201 cataactact gtacatagct ttttatattt ttttttgtta ctaataaaca tggtttccca 4261 ccgtgctgcc aggcgtaagc gtgcatctgc aactgaccta tatagaacct gtaaacaatc 4321 gggtacctgt ccaccagacg ttgttgataa agttgagggt actacacttg ctgacaaaat 4381 tttacagtgg actagtttag gtatattttt gggtgggtta ggcataggca caggtactgg 4441 tactggggga cgcacaggat atatacccct ggggggtagg cctaatactg ttgtagatgt 4501 gtctcctgca cgtccacctg tagttattga acctgttggt ccttctgagc catctattgt 4561 gcaattggtg gaggactcaa gtgttataac ctctggaaca ccagtaccaa catttacagg 4621 cacctctgga tttgaaatta cttcttcttc tactactacg cctgcggtat tggatattac 4681 accctcctct 
gggtctgtac aaataacctc tactagttat actaaccctg cctttacgga 4741 tccttcctta attgaggttc cccaaacagg tgaaacctcg ggtaatatat ttgtcagtac 4801 ccctacatca ggtacacatg gctatgagga aatacctatg gaagtgtttg ccacacatgg 4861 cacaggtacc gaacctatta gcagcacacc tacacctgga atcagtcgtg tggcaggacc 4921 acgtttatat agtagagcac atcagcaggt tcgtgttagt aattttgatt ttgtaactca 4981 cccttcatca tttgtaacat ttgataatcc tgcttttgag cctgttgata ctacattaac 5041 atatgaagct gctgacatag ctccagatcc ggattttctg gacattgttc gtttacatag 5101 gcctgcctta acctcgcgta aaggaacagt aaggtttagt aggcttggca aaaaggctac 5161 catggttacc cggcgtggca cacaaattgg agcgcaagta cattattacc atgacattag 5221 tagtattgct cctgctgaaa gcattgaatt acagccccta gttcacgctg agccctctga 5281 tgcttcagat gcattatttg atatatatgc tgatgtggac aataacacat atttagatac 5341 tgcatttaat aatacaaggg attcgggcac tacatataac acaggctcac taccttctgt 5401 ggcttcttca gcatctacta aatatgccaa tacaactatt ccttttagta cctcatggaa 5461 tatgcctgta aatactggtc ctgatattgc tttaccaagt actactccac agttgccatt 5521 ggtgccttct ggaccaatag acacaacata tgcaataacc attcagggtt ccaattatta 5581 tttgttgcca ttattgtatt ttttcctaaa aaaacgtaaa cgtattccct attttttttc 5641 agatggctat gtggcggtct agtgacagca tggtgtattt gcctccacct tctgtggcga 5701 aggttgtcaa tactgatgat tatgttacac gcacaggcat atattattat gctggcagct 5761 ctagattatt aacagtagga catccatatt ttaaagtggg tatgaatggt ggtcgcaagc 5821 aggacattcc aaaggtgtct gcatatcaat atagggtatt tcgcgtgaca ttgcccgatc 5881 ctaataaatt cagtattcca gatgcatcct tatataatcc agaaacacaa cgtttagtat 5941 gggcttgtgt aggggtggag gtgggcaggg gccagccatt gggtgttggt attagtggac 6001 acccattata taatagacag gatgatactg aaaactcacc attttcatca accaccaata 6061 aggacagtag ggataatgtg tctgtggatt ataaacagac acagttgtgc attataggct 6121 gtgttcccgc cattggggag cactggggta agggaaaggc atgcaagccc aataatgtat 6181 ctacggggga ctgtcctcct ttggaactag taaacacccc tattgaggat ggtgatatga 6241 ttgatactgg ctatggagct atggactttg gtgcattgca ggaaaccaaa agtgaggtgc 6301 ctttagatat ttgtcaatcc atttgtaaat atcctgatta tttgcaaatg tctgcagatg 6361 tgtatgggga cagtatgttc 
ttctgtttac gtagggaaca actgtttgca agacattttt 6421 ggaatcgtgg tggtatggtg ggtgacgcca ttcctgccca attgtatatt aagggcacag 6481 atatacgtgc aaaccccggt agttctgtat actgcccctc tcccagcggt tccatggtaa 6541 cctctgattc ccagttattt aataagcctt attggctaca taaggcccag ggccacaaca 6601 atggtatatg ttggcataat caattatttc ttactgttgt ggacactacc cgtagtacca 6661 actttacatt atctacctct atagagtctt ccataccttc tacatatgat ccttctaagt 6721 ttaaggaata taccaggcac gtggaggagt atgatttaca atttatattt caactgtgta 6781 ctgtcacatt aacaactgat gttatgtctt atattcacac tatgaattcc tctatattgg 6841 acaattggaa ttttgctgta gctcctccac catctgccag tttggtagac acttacagat 6901 acctacagtc tgcagccatt acatgtcaaa aggatgctcc agcacctgaa aagaaagatc 6961 catatgacgg tctaaagttt tggaatgttg acttaaggga aaagtttagt ttggaacttg 7021 atcaattccc tttgggacgt aaatttttgt tgcaggccag ggtccgcagg cgccctacta 7081 taggtccccg aaagcggcct gctgcatcca cttcctcgtc ctcagctact aaacacaaac 7141 gtaaacgtgt gtctaaataa tgcatgtgta tgccttgtta tgtgtgtgta tgttgtttgt 7201 ttccttatgt gttgagtgta tatgtgtatg tttgtaggta tgtgtgtata tgtttttgtt 7261 aataaagtat gtatgacagt ttcatgtgtg attgcacacc ctgtgactaa cagtgtattt 7321 gttttacata taataggtct gcaacatttc atacataatc tatatgccct accctaaggt 7381 gtgtttacta cctaatatgt aatttttaca ttgttgtatg cgtttctaca ttttatactt 7441 cgccattttg tggcgaccga agtcggtcgt gggttgagca ttttttttaa actagtggaa 7501 accacctttc tcagcaaaaa catgtcttta ccttaggttc accctgcata gttggcactg 7561 gtaacagttt tactggcgcg ccttattact catcatcctg tccaggtgca ctgcaacaat 7621 actttggcaa catccatatc tccaccctat gtaataaaac tgcttttagg catatatttt 7681 agctgttttt acttgcttaa ttaaatagtt ggcctgtata actacttttt gattcaggaa 7741 tgtgtcttac agtataagtt atacaagtga ctaatgtagc acacaatagt ttatgcaacc 7801 gaaataggtt gggcatacat acctatactt tta