LOCUS       HPV72        7988 bp    DNA             VRL       14-AUG-1996
DEFINITION  Human papillomavirus type 72 E6, E7, E1A, E1B, E2, E4, L2, and L1
            genes.
ACCESSION   X94164
NID         g1491683
KEYWORDS    E1A gene; E1B gene; E2 gene; E4 gene; E6 gene; E7 gene; early gene;
            L1 gene; L2 gene; late gene.
SOURCE      Human papillomavirus type 72.
  ORGANISM  Human papillomavirus type 72
            Viridae; ds-DNA nonenveloped viruses; Papovaviridae;
            Papillomavirus.
REFERENCE   1  (bases 1 to 7988)
  AUTHORS   Volter,C., He,Y., Delius,H., Roy-Burman,A., Greenspan,J.S.,
            Greenspan,D. and de Villiers,E.M.
  TITLE     Novel HPV types present in oral papillomatous lesions from
            patients with HIV infection
  JOURNAL   Int. J. Cancer 66 (4), 453-456 (1996)
  MEDLINE   96213783
REFERENCE   2  (bases 1 to 7988)
  AUTHORS   Delius,H.
  TITLE     Direct Submission
  JOURNAL   Submitted (08-DEC-1995) H. Delius, Deutsches
            Krebsforschungszentrum, Abt. ATV - 0686, Im Neuenheimer Feld 506,
            69120 Heidelberg, FRG
COMMENT     The complete genome of HPV72 was isolated from a wart with atypia
            taken from an HIV positive patient in a study of oral lesions [1].
            The sequence of the E1 region below appears to contain a
            frame-shift mutation, leading to the identification of two ORFs,
            designated E1a and E1b by the authors [1]. The isolates CP4173
            (Peyton and Wheeler, J Inf Dis 170(5):1089-92) and LVX100 (Ong,
            Bernard and Villa, J Inf Dis 170(5):1086-8), obtained from
            cervical samples, are variants of HPV72, as indicated by their
            MY09-MY11 sequences.
FEATURES Location/Qualifiers source 1..7988 /organism="Human papillomavirus type 72" CDS 102..548 /gene="E6" /note="early gene, putative" /codon_start=1 /db_xref="PID:e224127" /db_xref="PID:g1491684" /translation="MPMGLHNPTNIWLLCKEIEVDLEDLRITCIFCKNELTTEELLAI AIKELQIVWRDNWPFGVCAPCLARATKVRELRYWTYSGYGPTVEQETGKSLAELYIRC HACCKPLSCQEKEYQVQTGIHFHKISGLWTGRCCQCRGACTARWQP" CDS 524..826 /gene="E7" /note="putative" /codon_start=1 /db_xref="PID:e224128" /db_xref="PID:g1491685" /translation="MHGQVATIKDIVLQELPDVVDLHCNEQLLDSSESESEDERDGVG VQEQLVEQAQQAYGVVTTCGRCYRPVRLVVECRDADVKALQQLLLDNLSIVCPRCA" CDS join(832..2142,2142..2783) /note="putative" /codon_start=1 /db_xref="PID:e224129" /db_xref="PID:g1491686" /translation="MANCEGTERGDGDEDANRAGGWFLVEAIVEQTTGYQESSDEDEN SEDRGEDLVDFIDTRSLGDGQEVPLDLFVQQNARDDAATVQALKRKYTCSPASSSCVS LVDSELSPRLDAISINRGHDRARRRLFDQDSGYGHTQVDIGAPESQVSGGTQHTKGGG GAVQEAEEERVGGDGEAQCSAQTQQTPERAADVLEIFKVSNLRVTLLHKFKELFGLAY GDLVRQFKSDKSICGDWVVCAFGVYHAVAEAVKTLIQPICLYAHIQIQTCQWGMVILM LVRYKCGKSRETVAHSMSTLLNIPEKQMLIEPPKIRSGPCALYWYRTAMGNGSEVYGE TPEWIVRQTVVGHAMQETQFSLSTLVQWAYDNDITDESELAYDYAMLGNEDPNAAAFL ASNCQAKYIKDAITMCKHYKRAEQARMSMTQWIAHRGRKVADSGDLREIVKYLRYQRV EFVTFMGALKLFLKGVPKKSCMVFYGPSDTGKSLFCMSLLKYLGGAVISYVNSGSHFW LSPLVDAKVGLLDDATYQCWQYIDTYLRTVLDGNAISIDRKHRNLTQLKCPPLMITTN INPLEDQAFKYLHSRIVLFKFMHKCPLKSNGDPVYTLNNENWKSFFQRSWARIEGPDE QEEEEDEDGSTSRPFRCVPGEIARPL" exon 832..2142 /gene="E1A" /number=1 mRNA 832..2142 /partial /gene="E1A" /note="putative" mRNA 2142..2783 /partial /gene="E1B" /note="putative" exon 2142..2783 /gene="E1B" /number=2 CDS 2719..3873 /gene="E2" /note="putative" /codon_start=1 /db_xref="PID:e224130" /db_xref="PID:g1491687" /translation="MRMEALADRLDACQEKLLDLYEKDSDKLEDQILHWHYVRLEHAM LFKARQAGLTHVGHQVVPTLSVTKGKAHQAIEVHLSLQGLQNSAYAQEPWTLQNTSLE MWNAHPQRCWKKKGRTITVKFDCEDLKAVEYVSWGCIYVQSTEDEQWYKVQGHVSYHG LYYEFQGQKQYYVTFGHEARKYGDTNTWEVHVGSTVIYEPCASVSSTQDTVREVPTVE TVGRLPDATKSTATATCVGPAQTSSSVQTPPCKRQRLHRDGLQQQPDSTERDICRQRR 
DSADQWVNRDSDCTQQARDICNSHGAPIIHLKGEPNKLKCFRYRLQQSVPNLFLKASS TWHWACGGDTTKCAFVTLWYVDTDQRTQFLSRVNIPKGIQATAGYMSMCI" CDS 3311..3631 /gene="E4" /note="putative" /codon_start=1 /db_xref="PID:e224131" /db_xref="PID:g1491688" /translation="MNPAPLYLAPRTPCEKYPLLKLLGDCQTPPNPPPPPRAWAPPRH PPQCRRRLVSDSDSTETDCSSSPTLRKETSAGNGVTVLTSGSTVTVTAHNKQGTSVTV TVHL" CDS 4409..5809 /gene="L2" /note="late gene, putative" /codon_start=1 /db_xref="PID:e224132" /db_xref="PID:g1491689" /translation="MTQAVRRRKRASATDLYRTCKQAGTCPPDVIPKVEGDTLADRFL KWASLGVFFGGLGIGTGSGTGGRTGYVPIGTRPPTVVDIGPTTRPPVVIEPVGAADPS IVTLVEESSVVEAGATVPTFTGSGGFEVTTSSTTTPAVLDITPSGASVQVSSSSFTNP LFTEPSIIEPPQAGDLSGHVFTSTPTSGSHSFEEIPMHTFATHSSTSTDPFSSTPLPG VRRLAQPRLGLYSKANQQVRVTNPAFLSRPQSLVTYDNPVYDPEETIIFEHPSIYTPP DPDFLDIISLHRPALTARQGTVRVSRLGQRATLRTRSGKRIGARVHFYQDISPISSDT IEMQSLASSTQPDITYDIYADPDLGEPPPRASVSSTSLHSPSLSAASAVSAKYDNVTV PLSLGPHIPASSGPDIDLSFAPAPVPTMPLVPSTHPHSIYVEGFDFYLLPAYIFFPKR RKRVPYSFADGFVAAW" CDS 5703..7307 /gene="L1" /note="1st ATG, putative" /codon_start=1 /db_xref="PID:e224133" /db_xref="PID:g1491690" /translation="MLRALIFICCLHISFFLNVVNVCPILLQMALWRPGDGKVYLPPN PVSKVLSTDRYVQRTNLYYYGGSSRLLTVGHPYCAIPLNGQGKKNTIPKVSGYQYRVF RVKLPDPNKFALPDGTLYNPDTERLVWACRGIEVGRGQPLGVGTSGHPLYNRLDDTEN TSLLVADNSDSRDNVSVDYKQTQLLIIGCKPPIGEHWTKGTPCAGSNSQPTDCPPLEF TNSTIQDGDMVETGYGAIDFATLQENKSEVPLDICTTTCKYPDYLQMAAEPYGDCMFF CLRREQMFARHFFNRQGTMGEALPASLYLKGASGSDRVTPGSYIYSPTPSGSMVSSDA QLFNKPYWLQRAQGHNNGICWFNELFVTVVDTTRSTNVTICTATASSVSEYTASNFRE YLRHTEEFDLQFIFQLCKIHLTPEIMAYLHNMNKALLDDWNFGVVPPPSTSLDDTYRF LQSRAITCQKGAATPPPKEDPYANLSFWTVDLKDKFSTDLDQFPLGRKFLLQVGSRAV SVSRKRAAPPSSTSTPAPTKRKKRKK" CDS 5787..7307 /gene="L1" /note="2nd ATG, putative" /codon_start=1 /db_xref="PID:e224134" /db_xref="PID:g1491691" /translation="MALWRPGDGKVYLPPNPVSKVLSTDRYVQRTNLYYYGGSSRLLT VGHPYCAIPLNGQGKKNTIPKVSGYQYRVFRVKLPDPNKFALPDGTLYNPDTERLVWA CRGIEVGRGQPLGVGTSGHPLYNRLDDTENTSLLVADNSDSRDNVSVDYKQTQLLIIG CKPPIGEHWTKGTPCAGSNSQPTDCPPLEFTNSTIQDGDMVETGYGAIDFATLQENKS 
EVPLDICTTTCKYPDYLQMAAEPYGDCMFFCLRREQMFARHFFNRQGTMGEALPASLY LKGASGSDRVTPGSYIYSPTPSGSMVSSDAQLFNKPYWLQRAQGHNNGICWFNELFVT VVDTTRSTNVTICTATASSVSEYTASNFREYLRHTEEFDLQFIFQLCKIHLTPEIMAY LHNMNKALLDDWNFGVVPPPSTSLDDTYRFLQSRAITCQKGAATPPPKEDPYANLSFW TVDLKDKFSTDLDQFPLGRKFLLQVGSRAVSVSRKRAAPPSSTSTPAPTKRKKRKK" misc_feature 832..2783 /gene="E1A and E1B" BASE COUNT 2136 a 1705 c 1942 g 2205 t ORIGIN 1 attactaaca ataatacatg taaaaaagta agacaagacc gaaaacggtc cgaccgacat 61 aggtacatat ataagggaac tgtgaactca gcaaatcagc aatgcctatg ggactgcaca 121 atccaactaa tatttggttg ctgtgcaagg aaattgaggt ggacctagaa gatttacgga 181 ttacctgcat attttgcaaa aatgaattaa caacagaaga attgctggcg attgcaataa 241 aggagctgca gattgtgtgg cgggacaact ggccatttgg agtctgcgca ccatgccttg 301 caagagcaac taaagtgagg gagctacgat actggacgta ttcgggctac ggacccactg 361 tggaacagga aacaggcaaa tcattagcag aactatatat aaggtgccat gcatgctgca 421 aacccctaag ctgtcaggaa aaggaatatc aggtgcagac aggaatccac ttccacaaga 481 taagcggact gtggacggga aggtgctgcc agtgtagagg ggcatgcacg gccaggtggc 541 aaccataaag gacattgtcc ttcaggaact tcctgatgtg gttgacctac actgcaatga 601 gcagttacta gacagctcag agtcagagtc agaggatgag agggacggtg ttggtgtgca 661 ggagcaactt gtagaacaag cacagcaggc ctacggggtg gttactacct gtggcaggtg 721 ctaccgtcca gttaggctgg tggtggagtg cagagacgca gacgtgaagg cgctacaaca 781 actactgctg gacaatttgt ccatagtgtg tcctcgctgc gcataaggga catggccaac 841 tgcgaaggta ctgaacgggg ggatggggac gaggatgcga atcgcgcggg cggatggttt 901 ttggttgagg ccatagtgga gcaaaccaca gggtaccaag agtccagtga tgaggacgaa 961 aacagtgagg acaggggaga agatctggta gactttatag acacaagatc cttaggggat 1021 gggcaggaag tgccgttaga tttgttcgtg caacaaaatg cacgggatga cgctgcaacc 1081 gtgcaggccc taaaacgaaa gtatacatgt agcccagcaa gcagctcgtg tgtgtctttg 1141 gtggacagtg agttaagtcc ccgactggac gccataagca taaaccgggg acacgacagg 1201 gctagaagaa ggctgtttga ccaagacagt ggctatggcc atacgcaggt ggatattgga 1261 gcaccagaaa gccaggtatc ggggggtaca cagcatacaa aggggggagg cggcgccgtt 1321 caggaagcgg aagaggagcg tgtggggggg gatggtgagg cgcagtgtag tgcacagaca 
1381 cagcaaacgc cagagagagc agcagacgta ctagaaatat ttaaggttag taatttgcgt 1441 gtcacattac tgcataaatt taaagagcta tttggactag catatgggga tctggtaaga 1501 caatttaaaa gcgataaatc aatatgtggg gattgggtag tatgtgcatt tggggtatat 1561 catgcagtgg cagaggcagt aaagacgtta atacaaccca tatgtctgta tgcacatata 1621 caaatacaga cgtgtcaatg ggggatggta attttaatgc tggtgcggta taaatgtggc 1681 aagagtaggg agacagtggc acacagcatg agcacgctgc taaatatacc tgaaaagcaa 1741 atgcttattg aaccaccaaa aattagaagt ggaccatgtg ccctatactg gtatagaaca 1801 gcaatgggaa atggcagcga ggtgtacggg gaaaccccag aatggatagt aagacaaaca 1861 gtagtggggc atgcaatgca agagacacag tttagccttt ctaccttagt acagtgggca 1921 tatgacaatg acataacaga tgagagcgag ctagcatatg actacgcaat gctaggtaat 1981 gaggacccaa atgcagcagc atttttagca agcaactgcc aggcaaagta tattaaggat 2041 gcaattacaa tgtgcaaaca ttataaacgt gcagaacagg cacgaatgtc tatgacacag 2101 tggatagcac atagggggcg caaggtggca gattcaggtg actgagagaa atagtaaaat 2161 atttaagata tcaaagggtt gaatttgtaa catttatggg agcattaaag ctatttttaa 2221 aaggggtacc aaaaaaaagc tgtatggtat tctatgggcc aagtgacacc ggaaagtcat 2281 tgttttgtat gagtttactt aagtatttag ggggagcagt aatttcatat gtaaattcag 2341 gaagccattt ttggttatca ccactggtag acgccaaagt agggttgtta gatgatgcaa 2401 cataccagtg ctggcaatat atagatacat acctacgaac agtgttagat ggaaatgcta 2461 taagcataga tagaaaacat agaaatttaa cacagttgaa gtgtccacca cttatgataa 2521 caacaaatat aaatccattg gaagaccagg catttaaata tttgcacagt agaatagtgt 2581 tgtttaaatt tatgcataag tgcccattaa aaagcaacgg tgatcccgta tataccctaa 2641 ataatgaaaa ttggaaatcg tttttccaaa ggtcctgggc acgtatagag ggacctgacg 2701 aacaggagga ggaggaggat gaggatggaa gcactagccg accgtttaga tgcgtgccag 2761 gagaaattgc tagaccttta tgaaaaagat agcgacaagc ttgaggacca aatattgcat 2821 tggcactatg tgcgtctgga acatgcaatg ttatttaagg cacgacaagc aggacttacc 2881 catgtaggcc accaggtggt accaacactt agtgttacaa aaggcaaagc acatcaggca 2941 attgaagtgc acctgtcact gcaagggttg caaaacagtg cgtatgcgca agaaccatgg 3001 acattacaga acacctcact ggaaatgtgg aatgcacacc cacaacggtg ttggaagaaa 3061 
aaaggacgca caataacagt taaatttgat tgcgaggacc taaaagcagt ggagtatgtg 3121 agctgggggt gtatttatgt gcaaagtaca gaggacgaac agtggtataa agtacaagga 3181 catgtgtcat atcatgggct atattatgaa tttcagggtc agaaacagta ctatgtaaca 3241 tttggacacg aagccagaaa atatggggac acaaacacat gggaggtaca tgtgggaagt 3301 acagtgattt atgaaccctg cgcctctgta tctagcaccc aggacaccgt gcgagaagta 3361 cccactgttg aaactgttgg gcgactgcca gacgccacca aatccaccgc caccgccacg 3421 tgcgtgggcc ccgcccagac atcctcctca gtgcagacgc cgccttgtaa gcgacagcga 3481 ctccacagag acggattgca gcagcagccc gactctacgg aaagagacat ctgcaggcaa 3541 cggcgtgaca gtgctgacca gtgggtcaac cgtgacagtg actgcacaca acaagcaagg 3601 gacatctgta acagtcacgg tgcacctata atacatttaa aaggtgaacc aaataagtta 3661 aagtgttttc ggtataggct tcagcagtca gtgcctaact tgtttttaaa agcatcctct 3721 acatggcatt gggcctgtgg gggtgacaca acaaaatgtg catttgtaac actgtggtat 3781 gtggatactg accaacggac acaattttta agtcgtgtga acattccaaa ggggatacaa 3841 gccactgctg gctatatgtc aatgtgtata taatgtttgt tgcgatggca accagtgtat 3901 agaaccacac ctgcaacatt atgtaaggca gaagcaatcc tggatatact tgtgtgtttg 3961 atatctgggt ggtgtactgt gctgttgctg cttattattt tctggctttc ctatctttct 4021 gcactaagtg cttttttggt gtttgtgtgt gttatatatc taggattgtt ttgtatatat 4081 atgcaggtga tgtggtacat aggtgactta taatccaccc agccattaca tgctgctatt 4141 gtgtaaatag tgttccttgt gtatcttgta tgtaatatgt atcctgttgt agtgggcaat 4201 acggatgggg gtgcattaat tgtactacga gacgataatt gtggattgtg gttcttcttg 4261 tgtatgttaa taatcattgt agtgttgcta tataggttgc tacactgatc ccttcctttt 4321 gtgtattccc acctcctttt tatttttgtt ttgttttgtt tttgtttttt atttttttgc 4381 atttttataa taaacattat ctgccaaaat gacccaagct gtaaggcgtc gcaaacgtgc 4441 ctctgcaacg gacctgtatc gcacatgcaa acaggcgggt acctgccctc ctgatgttat 4501 accaaaggtg gagggtgaca cccttgctga taggttcctg aagtgggcca gtttaggggt 4561 gttctttggt gggttaggca taggcacggg ttcaggcacc ggtgggcgca ctggctatgt 4621 gcctataggt actcgccctc ccactgttgt ggatataggc cctacaacac gcccgcctgt 4681 tgttattgag cccgtggggg ccgcagaccc ttccatagtc acccttgtgg aagaatccag 4741 cgttgtggaa 
gccggtgcca ccgttcccac ttttactggg tctggtggct ttgaggttac 4801 cacgtcctca actactaccc ctgctgtttt agacattaca ccctctggag cgtctgtcca 4861 agttagcagt agtagcttta caaatccctt atttactgaa ccgtccatta ttgaacctcc 4921 acaggccggg gacctttcag gccatgtatt cactagcaca cccacatctg ggtcgcatag 4981 ctttgaggaa atacccatgc acacatttgc aactcatagc agtactagca cagacccctt 5041 tagtagtacc cctttgcctg gtgttcgtcg ccttgcacag ccccgcttag gattgtatag 5101 caaggctaat caacaggtta gggttactaa ccctgccttt ttgtctcgac cccagtctct 5161 tgttacttat gacaaccctg tgtatgatcc agaggaaact attatttttg agcatcctag 5221 tatatatacc cctcctgatc ctgacttttt ggatattatt tccttacata ggcctgccct 5281 tacagcccgc cagggtacag tacgggtcag ccggttgggt caacgtgcta ccttgcgtac 5341 acgtagtggc aaacgcattg gcgctcgggt acacttttat caggacatta gccccatttc 5401 atctgatact attgaaatgc aatccttggc ctcctctacg cagccagaca taacatatga 5461 catttatgct gaccctgatt taggggaacc cccgccgcgt gcttctgtgt cttctacatc 5521 attgcacagc ccgtccctgt ctgcagcgtc tgctgtttct gccaagtatg acaatgtaac 5581 agttcccttg tccttagggc cacacatccc tgcctcctct ggccctgaca ttgatttgtc 5641 ctttgctcct gcccctgtac ctacaatgcc tcttgtaccc tctacgcatc cacattctat 5701 ttatgttgag ggctttgatt tttatttgtt gcctgcatat atcttttttc ctaaacgtcg 5761 taaacgtgtg ccctattctt ttgcagatgg ctttgtggcg gcctggtgac ggcaaggtat 5821 acctgcctcc caatcctgtt tctaaggttc tcagtactga tcgctatgtc caacgcacca 5881 acctctatta ttatggtggc agttctcgtc tactaactgt aggacatcct tactgtgcca 5941 tacctctcaa cggacagggc aaaaaaaaca ccattcctaa ggtttcgggg tatcaataca 6001 gggtgtttag agtaaaactt cctgatccca ataaatttgc tttgcctgat ggcacacttt 6061 acaatccaga tactgaacgg ctggtatggg cctgtcgtgg cattgaggtt ggtaggggcc 6121 agccccttgg tgttggcact agcggtcacc ccttgtataa tcgcttggat gacactgaaa 6181 acacttcctt acttgtggct gacaattctg acagtcggga caatgtatct gttgactaca 6241 aacagaccca attgcttatt atagggtgca agcctcccat tggtgagcat tggaccaagg 6301 gcactccttg tgcaggctct aattctcagc caactgactg ccccccttta gaatttacaa 6361 attccactat acaggatggt gacatggtgg aaacaggcta tggtgccata gattttgcta 6421 cccttcagga aaataaatca 
gaagtgcctt tggatatttg caccaccacc tgcaaatatc 6481 ctgactattt gcaaatggct gctgaaccat atggtgattg tatgtttttt tgtcttcgca 6541 gggaacaaat gtttgcacgc cattttttta ataggcaggg cacaatgggt gaggcactac 6601 cagccagttt atatcttaaa ggtgcctcgg gtagcgacag ggtgacacct ggtagttata 6661 tttattctcc cacccccagc ggctctatgg tgtcctctga tgcacaatta tttaataagc 6721 cctattggct acagcgcgcc cagggtcaca acaatggcat ctgttggttt aatgagcttt 6781 ttgtgacagt tgtagatact actcgcagta ctaatgtaac tatttgtact gccacagcgt 6841 cctctgtatc agaatataca gcttctaatt ttcgtgagta tcttcgccac actgaggaat 6901 ttgatttgca gtttatattt caactgtgta aaattcactt aactcctgaa attatggcct 6961 acttgcacaa tatgaataag gccttattgg atgactggaa ttttggtgtg gtgcctcctc 7021 cttctaccag tttggatgat acctataggt ttttgcagtc tcgtgccatt acctgtcaaa 7081 agggggctgc cacccctcct cctaaagaag atccatatgc taacttatcc ttttggactg 7141 tggatttaaa ggacaaattt tccactgact tggatcagtt tcctttaggg cgcaagtttc 7201 tgttacaggt tggttctagg gctgtttcag tgtcacgcaa acgtgctgcc ccaccaagct 7261 ctacctcgac ccccgcccct actaaacgta aaaagcgcaa aaagtaacat gtcatactgt 7321 ttgtgtggtg tatgtgtgta tgtgtgcaat gcatgcatgt gtgtttctgt tgttgtttgt 7381 gtgtacatgt ttgtactatg ttatgttgtt gtatgttttt tgtatggccc ctgcccccgt 7441 gttgtgtatg tatgtggaat gtgtgttatg tgttgtgcat taataaagcg tgtgtcatgt 7501 gtgtgtgtgt gtccggtgaa ccctgtgagt aagtgtgtgt ttgcacgttg tcctacttcc 7561 tacactttgt tttgtgtcac ctttgtatgc cctttactgt actccatttt atattttctc 7621 cattttgtat tcgcgaccgt tttcggtctc ccgccttttc ggtcgtggcg ccgtgccact 7681 gtacatagaa actatgcatt gtgctttcct cccacatcct gtttcaacaa accttatcca 7741 catctgggtg tgcctgacag gtttctggca catacatttt ccatagttat gtgtttcctg 7801 actcatttta caatagatat gcttttaggc acatatttta tgctgactac tttctcctaa 7861 ttgctgtttt ggctaccttt ctaggtgttg tagccaagta tgtgtcttgc aactatgggc 7921 aagcccttta caaacgtgtt aaaacattct actccggtcg ctcccctatg tctcatggtt 7981 ttatagtt