LOCUS       HPV44        7833 bp ds-DNA             VRL       04-JUL-1995
DEFINITION  Human papillomavirus type 44 (HPV44), complete genome.
ACCESSION   U31788
SOURCE      Human papillomavirus type 44 DNA.
  ORGANISM  Human papillomavirus type 44
            Viridae; ds-DNA nonenveloped viruses; Papovaviridae;
            Papillomavirus.
REFERENCE   1  (bases 1 to 7833)
  AUTHORS   Delius,H.
  TITLE     Direct Submission
  JOURNAL   Unpublished
COMMENT     HPV44 is a mucosatropic HPV which to date has not been detected
            in cervical cancer. Prevalence studies indicate that HPV44 and
            HPV43 have been found in 4% of cervical intraepithelial
            neoplasms, but in none of the 56 cervical cancers tested
            (Lorincz et al., J. Virol 63, 2829-2834). During the analysis
            of approximately 1000 anogenital tissue samples, two new HPV
            types, HPV43 and HPV44, were identified. The complete genome of
            HPV44 was recovered from a vulvar condyloma and cloned into
            bacteriophage lambda. The biopsy was taken from a woman from
            the Detroit, Michigan, area. The DNA recovered was a single
            7.8 kb BamHI fragment. Cloned HPV44 DNA was obtained from the
            Papillomavirus Reference Center, Heidelberg, and subsequently
            sequenced by Dr. H. Delius. A possible feature of HPV types
            associated with malignant lesions is the potential to produce a
            different E6 protein by alternative splicing. This potential
            has been found in types HPV16, HPV18, and HPV31. HPV44 has a
            potential E6 splice donor at nt 229, but does not contain a
            potential splice acceptor. Phylogenetic analysis indicates that
            HPV44 is most closely related to HPV55, HPV6, HPV11 and HPV13.
FEATURES             Location/Qualifiers
     CDS             105..557
                     /note="ORF E6 from bp 72 to 557"
                     /product="transforming protein"
                     /gene="E6"
                     /note="putative"
                     /codon_start=1
                     /translation="MESANASTSAQSIDQLCKECNIPMHNLQILCVFCRKTLSTAEVY
                     SFAYKQLYVVYRGNFPFAACAICLELQGKVNQFRHFNYAGYAVTVEEETNKSILDVLI
                     RCYLCHKPLCHVEKVRHILDKARFIKLQDTWKGRCFHCWTSCMETILP"
     CDS             533..826
                     /note="ORF E7 from bp 488 to 826"
                     /product="transforming protein"
                     /gene="E7"
                     /note="putative"
                     /codon_start=1
                     /translation="MHGNYTTLKEIVLQLEPPDPVGLHCNEQLDSSEDEVDELATQAT
                     QDVTQPYQIVTTCGTCSRKVRLVVQCTGTDIHHLHTLLLGSLDILCPVCAPKT"
     CDS             832..2763
                     /note="ORF E1 from bp 715 to 2763"
                     /product="replication protein"
                     /gene="E1"
                     /note="putative"
                     /codon_start=1
                     /translation="MADNTGTEGTGCSGWFLVEAIVENTTGQQISEDEDEAVEDSGLD
                     MVDFIDDRPITHNSMEAQALLNEQEADAHYAAVQDLKRKYLGSPYVSPLSNIEQAVEC
                     DISPRLDAITLSRQPKKVKRRLFDRPELTDSGYGNTEVEAETQVERNGEPEDCGGGGQ
                     GRDTEGVEQVETEVQTHSNTQQHTGTTRVLELLKCKNIRATLLGKFKDCYGLSYTDLI
                     RQFKSDKTTCGDWVIAAFGVHHSVSEAFQNLIQPVTTYSHIQWLTNAWGMVLLALVRF
                     KVNKNRCTVARMMATRLNIPEDHMLIEPPKIQSGVAALYWFRSGISNASIVTGETPEW
                     ITRQTIVEHGLADNQFKLADMVQWAYDNDFCEESEIAFEYAQRADIDANARAFLNSNC
                     QAKYVKDCATMCKHYKTAEMKKMNMKQWIKFRSSKFEDTGNWKPIVQFLRHQNIEFIP
                     FLTKLKMWLHGTPKKNCIAIVGPPDTGKSCFCMSLIKFLGGTVISYVNSSSHFWLQPL
                     CNAKVALLDDVTQSCWVYMDTYMRNLLDGNPMTIDRKHKSLALIKCPPLIVTSNIDIT
                     KEEKYKYLCSRVTLFTFPNPFPFDRNGNALYDLCETNWKCFFARLSSSLDIQTSEDED
                     DGDNSQAFRCVPGTVVRTV"
     CDS             2705..3838
                     /note="ORF E2 from bp 2678 to 3838"
                     /product="regulatory protein"
                     /gene="E2"
                     /note="putative"
                     /codon_start=1
                     /translation="METIAKHLDVCQEQLLELYEENSNKLTKHIQHWKCIRYECVLLH
                     KAKQMGLNHIGMQVVPALAVSQTKGHQAIEMQMTLETLLNSDYGTEPWTLQETSREMW
                     LTPPKYCFKKQGQTVEVKFDCNADNAMEYVWWKVIYVFDTDKWVKVTGHIDYKGLYYV
                     HGGHKTYYTNFEKEAEKYGNSLQWEVCIGSSIICSPASISSTVQDVSIAGPASHSSSS
                     TTTTLAQASSTLPIGTAEDCVDAPPCKRPRGPPTNTNNARNTVCVRNSDSVDSTNNNI
                     LPNSYNSNKGRDNNYCTATPVVQLQGDANCLKCLRYRLHAKYKTLFVAASSTWRWTCS
                     DTSSNALVTLTYVDEQQRQQFLNTVKLPPKVTYKVGYMSLQLL"
     CDS             <3162..3596
                     /note="ORF E4 from bp 3162 to 3596"
                     /gene="E4"
                     /note="putative"
                     /codon_start=1
                     /translation="TIKGCIMYMVGIKPIIQILKRRPKNMGTLYNGRYVLAAVSYVLL
                     HLYLVLCKTYPLLGLLHTPPPPPPPPLHRPHPHCPLAPPRTAWTRRHVNDPEDPPQTP
                     TTPETPSVSETATPWTVQTTTSSLTVTTVTKDGTTIIVQLRL"
     CDS             3874..4152
                     /note="ORF E5 from bp 3859 to 4152"
                     /gene="E5"
                     /note="putative"
                     /codon_start=1
                     /translation="MEHIPIDATIGATSTSLLPVVIALFVCFVSIVLIICISDFIVYT
                     SILVLTLLLYLLLWLLLTSALQFYLLTLCVCFFPAWYIHFHIVHTQQE"
     CDS             4325..5707
                     /note="ORF L2 from bp 4289 to 5707"
                     /product="minor capsid protein"
                     /gene="L2"
                     /note="putative"
                     /codon_start=1
                     /translation="MAHSRARRRKRASATQLYQTCKAAGTCPSDIIPKVEHNTIADQI
                     LKWGSLGVFFGGLGIGTGSGTGGRTGYIPLQSTPRPDIPSVPTARPPILVDTVAPGDP
                     SIVSLVEESAIINSGAPELVPPSHAGFEITTSESTTPAILDVSVTTHTTSTSVFKNPS
                     FADPSVVQSQPAVEAGGHILISTSSISSHPVEEIPLDTFIVSSSDSNPASSTPIPASG
                     ARPRIGLYSKALHQVQVTDPAFLSSPQRLITFDNPAYEGEDVTLHFAHNTIHEPPDDA
                     FMDIIRLHRPAIQSRRGRVRFSRIGQRGSMYTRSGKHIGGRIHFYQDISPISAAAEEI
                     ELHPLVATAQDSGLFDIYAEPDPDVTEEPVSLSFSTSTPFQRSSVSATPWGNTTVPLS
                     LPADMFVQPGPDIIFPTASTTTPYSPVTPALPTGPVFISGAAFYLYPTWYFARKRRKR
                     VSLFFADVAA"
     CDS             5694..7196
                     /note="ORF L1 from bp 5616 to 7196"
                     /product="major capsid protein"
                     /gene="L1"
                     /note="putative"
                     /codon_start=1
                     /translation="MWRPSENQVYVPPPAPVSKVIPTDAYVKRTNIYYHASSSRLLAV
                     GNPYFAIRPANKTLVPKVSGFQYRVFKMVLPDPNKFALPDTSIYDPTTQRLVWACIGL
                     EVGRGQPLGVGISGHPLLNKLDDVENSASYAAGPGQDNRVNVAMDYKQTQLCLVGCAP
                     PLGEHWGKGKQCNNVSVKDGDCPPLELITSVIEDGDMVDTGFGAMNFAELQPNKSDVP
                     LDICTATCKYPDYLQMAADPYGDRLFFYLRKEQMFARHFFNRAGTVGEDVSQDLVIKS
                     ASKNTVPNAIYFNTPSGSLVSSETQLFNKPFWLQKAQGHNNGICWGNQLFVTVVDTTR
                     STNMTICAATTQSPPSTYTSEQYKQYMRHVEEFDLQFMFQLCSITLTAEVMAYLHTMN
                     AGILEQWNFGLSPPPNGTLEDKYRYVQSQAITCQKPPPEKAKQDPYAKLSFWEVDLRE
                     KFSSELDQYPLGRKFLLQTGVQARSSVRVGRKRPASAATSSSKQKRSRKK"
BASE COUNT     2383 a   1545 c   1678 g   2227 t
ORIGIN
        1 ttaataataa tctaaccttt acaaaaaaga ggaggaaccg aattcggttc caaccgaaaa
       61 cggttatata aaaaccagcc caaaaattaa gcaagcgggg cataatggaa agtgcaaatg
      121 cctccacgtc tgcacaaagt atagaccagt tgtgcaagga gtgcaacatt cctatgcaca
      181 atctgcaaat tttatgcgtg ttttgcagaa aaacgttaag tactgcagag gtttattcat
      241 tcgcatataa acagttatat gtagtgtacc gaggaaactt tccatttgca gcctgtgcca
      301 tttgtttaga actacaaggt aaggtcaatc aatttaggca ttttaactac gcgggatatg
      361 cagtaacagt ggaagaagaa acaaataagt caattctgga cgtgctgata cgctgctatt
      421 tgtgccacaa accattgtgc cacgtggaaa aggtgcgcca catattggac aaggcgcgat
      481 tcattaaatt acaagatacc tggaagggtc gctgcttcca ttgttggaca tcatgcatgg
      541 aaactatact accttaaagg aaattgtttt acagctggaa cctcctgacc ctgtaggcct
      601 acattgcaat gagcaattag acagctcaga agatgaggtg gatgaactag ccacgcaagc
      661 cacgcaagac gttacacagc cttaccaaat agtaaccacc tgtggtacat gtagtcggaa
      721 ggttcggctg gttgtgcagt gcacaggaac agacatccat cacctacata cgcttctgct
      781 gggttcactg gatatattgt gtcctgtgtg tgcgcccaaa acctaacaac gatggctgac
      841 aatacaggta cagagggaac gggatgctca ggatggtttc tagtagaggc tatagtggag
      901 aacacaaccg ggcaacaaat atcagaggat gaggatgagg cagtggagga tagtgggttg
      961 gatatggtgg actttataga tgacaggcct attacacaca attccatgga agcacaggca
     1021 ttgttaaacg agcaggaggc ggatgctcat tatgcggctg tgcaggacct aaaacgaaag
     1081 tatttaggta gtccatatgt tagtccttta agtaatattg agcaggcagt ggagtgtgac
     1141 attagcccac ggctggacgc tataacatta agtagacaac caaaaaaagt aaagcgacgg
     1201 ctgtttgaca gaccagaatt aacggacagt ggatatggca atactgaagt ggaagctgaa
     1261 acgcaggtag agagaaatgg cgaaccggaa gattgtgggg gaggtggaca aggaagggac
     1321 acagaggggg tggaacaggt ggaaacggaa gtgcagacac atagcaacac acaacagcac
     1381 accgggacca cgcgggtact agaactattg aaatgtaaga atataagggc tacactgctt
     1441 ggtaagttta aggattgcta tgggttatca tatacagatt taattagaca atttaaaagt
     1501 gacaagacaa catgtgggga ctgggtaatt gcagcctttg gggtgcacca tagtgtgtca
     1561 gaggcgtttc aaaatttaat acagccagta acaacatata gccacataca atggcttaca
     1621 aatgcatggg gaatggtcct actggcatta gtaaggttta aggtaaataa aaacagatgt
     1681 acagtggcac gtatgatggc aacccgttta aatatacctg aggaccacat gttaattgaa
     1741 cctcctaaaa tacaaagcgg tgttgcagcg ttatattggt ttagaagtgg tatatccaat
     1801 gccagtatag taactggaga aacaccggaa tggataacaa ggcaaaccat tgtagaacat
     1861 gggcttgcag acaaccaatt taaattagca gacatggttc aatgggcata tgataatgac
     1921 ttttgtgagg aaagtgaaat tgcatttgaa tatgcacaac gtgcagatat agatgccaat
     1981 gccagagcat tcctaaatag taattgtcag gcaaaatatg taaaagactg tgccacaatg
     2041 tgcaagcact ataaaactgc agaaatgaaa aaaatgaata tgaaacagtg gataaaattt
     2101 aggagcagta aatttgaaga cacaggaaat tggaaaccaa tagtgcaatt tttaagacac
     2161 caaaacatag aatttattcc gtttttaact aaattaaaga tgtggctgca tggtacacca
     2221 aaaaaaaact gtattgcaat agtgggccca ccagacacag gtaaatcgtg tttttgtatg
     2281 agtttaatta aattcttagg aggcactgta attagttatg taaactccag cagtcacttt
     2341 tggctacagc ccttatgcaa tgcaaaagta gcattattag atgatgtaac ccaatcctgc
     2401 tgggtatata tggatacata tatgagaaac ctattagatg gaaaccctat gaccattgac
     2461 agaaaacaca aatcattagc attaataaaa tgtccgcctt taatagtaac atcaaacata
     2521 gacattacta aagaagagaa atacaaatat ttatgtagca gggtaacatt atttacattt
     2581 ccaaatccat tcccctttga cagaaatggg aatgcactat atgacctgtg tgaaacaaac
     2641 tggaaatgtt tctttgcaag attatcatca agtctagata tacaaacatc agaggacgag
     2701 gacgatggag acaatagcca agcatttaga tgtgtgccag gaacagttgt tagaactgta
     2761 tgaagaaaat agtaataaac ttacaaaaca tatacaacat tggaaatgta tacgatatga
     2821 atgtgtgtta ctacacaaag ctaagcaaat gggcctgaac cacattggaa tgcaagtggt
     2881 gccagcatta gcagtgtcac agacaaaggg acaccaggca attgaaatgc aaatgacatt
     2941 agaaacatta ctaaactctg actatggtac ggaaccatgg acattgcaag agacaagtcg
     3001 ggaaatgtgg ttaacaccac ccaaatattg ctttaaaaag cagggacaaa ctgtggaagt
     3061 aaaatttgac tgcaatgcag acaatgcaat ggagtatgta tggtggaaag tcatttatgt
     3121 atttgacaca gacaaatggg taaaagtgac aggacacata gactataaag ggttgtatta
     3181 tgtacatggt gggcataaaa cctattatac aaattttgaa aaggaggccg aaaaatatgg
     3241 gaactcttta caatgggagg tatgtattgg cagcagtatc atatgttctc ctgcatctat
     3301 atctagtact gtgcaagacg tatccattgc tgggcctgct tcacactcct cctcctccac
     3361 caccaccacc cttgcacagg cctcatccac actgcccatt ggcaccgccg aggactgcgt
     3421 ggacgcgccg ccatgtaaac gaccccgagg accccccaca aacaccaaca acgccagaaa
     3481 caccgtctgt gtcagaaaca gcgactccgt ggacagtaca aacaacaaca tcctccctaa
     3541 cagttacaac agtaacaaag gacgggacaa caattattgt acagctacgc ctgtagttca
     3601 attacaaggt gatgctaatt gtttaaagtg tttaagatat agattacatg caaagtataa
     3661 aacattgttt gtagcagcat cgtccacatg gcgctggaca tgttcagata catccagtaa
     3721 tgcactggta acattaacat atgttgatga acagcaacgc cagcagtttt taaacactgt
     3781 aaagttacca ccaaaagtta catataaagt tggatatatg tctttacaat tgttataatg
     3841 tgtgttgtat atatctaatt gtatatattg tacatggaac acatacctat agatgctact
     3901 ataggggcaa ccagcacatc attactgcca gttgtaattg ccctgtttgt atgctttgtt
     3961 agcattgtat taattatttg tatttctgat tttatagtgt acacatctat attggtacta
     4021 accttactgc tatatctgtt actttggctt ttactaacct ctgccctgca attttattta
     4081 ctaacactgt gtgtctgctt ttttcctgcg tggtatatac atttccatat tgtacataca
     4141 caacaagaat aactattaca atgctaacat gtacgtttga tgatggtgat acatggctgt
     4201 tattgtggtt gttattaaca ttaattgtta ccattatagc attgttatta atgcatttaa
     4261 aaactgtaca atgcgttaca tgcagtaaat aagtatttgt atatttggtg tgtattgtat
     4321 aaatatggca cacagtaggg cacgtagacg taaacgtgca tctgctaccc aattatatca
     4381 aacatgtaag gctgcaggca cctgtccctc tgatattatt cctaaggtgg aacataacac
     4441 tattgcagat cagatattaa agtggggcag tttgggggtt ttttttgggg gactggggat
     4501 tggtacaggc tctggcacag gcggtagaac agggtatata cctttacaat ccaccccgcg
     4561 tcctgacatt ccctctgtac ctaccgcaag gccacctata cttgttgata ctgttgcacc
     4621 tggggacccg tccattgtat ccttggttga agaatctgct attataaatt cgggggcccc
     4681 ggaattggtc cctccttccc atgcaggatt tgaaatcact acatctgaat ctaccacacc
     4741 agctatatta gatgtgtctg tcaccacaca tactacctct acaagtgtat ttaaaaaccc
     4801 tagctttgct gacccatctg ttgtacagtc gcagcctgct gttgaagctg gtggccacat
     4861 acttatctct acctcatcta tatcgtccca ccctgtagaa gaaatacctt tggatacatt
     4921 tatagtatct tcctctgata gtaatcctgc atctagcact cccattccag catctggtgc
     4981 acggccgcgt attggcctat acagtaaggc tttgcaccag gtacaggtaa cggatcctgc
     5041 ctttttgtcc tctccccagc gcctaataac atttgataat cctgcatatg aaggggagga
     5101 tgttacttta cactttgcac acaatactat acatgaacct ccagatgatg cgtttatgga
     5161 tattatacga ttgcacagac cggctataca gtccaggcgt ggtcgtgtgc ggtttagtag
     5221 aattggacaa cgagggtcta tgtacacacg tagtggcaaa catattggtg gcaggataca
     5281 tttctatcaa gacatttctc ctatatctgc tgctgcagaa gaaatagaac tgcaccccct
     5341 tgtggccact gcacaggata gtggcctgtt tgatatttat gcagaacctg accctgatgt
     5401 tacagaagaa cctgtttcat tgtctttttc tacctccaca ccctttcagc ggtcttctgt
     5461 gtcagccacc ccatggggca atactactgt ccctctttca ttacctgctg acatgtttgt
     5521 acagcctggt cctgacataa tctttcctac tgcatccact acaactccct atagtcctgt
     5581 cactcctgct ttacctacag gtcctgtttt tataagtggt gctgcatttt atttatatcc
     5641 tacatggtat tttgcacgca aacgccgtaa acgtgtttcc ttgttttttg cagatgtggc
     5701 ggcctagtga aaaccaggta tatgtgcctc ctcccgcccc agtatccaaa gtaataccta
     5761 cggatgccta tgtcaaacgc accaacatat attaccatgc tagcagttct agacttcttg
     5821 ctgtgggcaa cccttatttt gccatacgac cagcaaacaa gacacttgtg cctaaggttt
     5881 cgggatttca atatagggtt tttaagatgg tattgccaga ccctaataaa tttgccttac
     5941 ctgacacatc tatatatgac cccactacgc aacgcctggt atgggcctgc atcgggctgg
     6001 aggtaggtag aggacagccc ttaggtgttg gtattagtgg gcatccatta ttaaataaat
     6061 tggatgatgt agaaaattca gctagttatg cagccggtcc gggtcaggat aacagggtaa
     6121 atgtggccat ggactataaa caaacacaat tatgtttggt tggctgtgca cccccgttag
     6181 gtgagcattg gggtaaaggc aagcagtgta ataatgttag tgttaaggat ggggactgcc
     6241 ctcccttgga attaattact agtgtaattg aggatggtga tatggtggac actggttttg
     6301 gagccatgaa ttttgctgaa ttgcagccaa ataaatctga tgttccatta gatatatgca
     6361 ctgctacatg taaatatcct gactatttac aaatggctgc agatccatat ggggacagat
     6421 tgttttttta cttacgaaag gaacagatgt ttgccagaca tttttttaat agggctggaa
     6481 cagttggtga ggacgtttcc caggatctgg ttattaaaag tgctagtaaa aatactgttc
     6541 ctaatgctat atactttaat acacccagtg gttctcttgt atcttctgaa acccaattat
     6601 ttaataagcc tttttggttg caaaaggcgc agggccacaa taatggtatt tgttggggaa
     6661 atcagttatt tgttactgtt gtagatacta cccgtagtac aaacatgaca atatgtgctg
     6721 ccactacaca gtcccctccg tctacatata ctagtgaaca atataagcaa tacatgcgac
     6781 atgttgagga gtttgactta caatttatgt ttcaattatg tagtattacc ttaacggcgg
     6841 aggtaatggc ctatcttcat actatgaatg ctggtatttt agaacagtgg aactttgggt
     6901 tgtcgccgcc cccaaatggt accttagagg acaaatacag atatgtgcag tcccaggcca
     6961 ttacatgtca aaagccaccc cctgaaaagg caaagcagga cccctatgca aaattaagtt
     7021 tttgggaggt ggatcttaga gaaaagtttt ctagtgagtt ggatcaatat ccccttggta
     7081 gaaaattttt attacaaacg ggtgtgcagg cccgttcctc tgttcgtgtg ggtaggaaac
     7141 gtcctgcgtc tgcagccact tcctccagta aacaaaaacg gtctaggaag aagtagtatg
     7201 tgttattgtt ttgtttgtat gtgtgtcata tgttattgtg ttatatatgt gttgtgttgt
     7261 atatatgttg tatgtgtatg ttgtgtaatg ttgtctgtaa tggaatgcat gtgtgtgttg
     7321 tacataataa acttaatctg tgtgtcctgt tccaccccat gagtaagtgt tgtagtgttg
     7381 tgttctatgt ttggtatata taatatataa catatgtaca gccatgttag tttttaaaca
     7441 tattcctcca ttttgggtgc aaccgttttc ggttgttcat tttgggtgca accgttttcg
     7501 gttgttactc attacccaca tcctgtaccc aatttgttat agcaagcaaa atatttaatc
     7561 atctctgcca gaactttatt atgttactaa gtacacacct ggcgcacagc taggcgcggt
     7621 ttggcaacta cacaatacat tcctaatctc tatactactg ctgtctcgtt tgtgaacaat
     7681 agtgcgctgg tagccaactt tttaaaagca tttttggcta ctagcactgc atttttgtac
     7741 agttactgtt ggttttataa aatgagtaac ctaaggtcac acacctgcga ccggtatcgg
     7801 ttgacacaca ccctgtacac ttccttatca tag