LOCUS       HPV35h       7879 bp ds-DNA             VRL       04-OCT-1993
DEFINITION  Human papillomavirus type 35h (HPV-35h), complete genome.
ACCESSION   X74477
SOURCE      Human papillomavirus type 35h DNA.
  ORGANISM  Human papillomavirus type 35h
            Viridae; ds-DNA nonenveloped viruses; Papovaviridae;
            Papillomavirus.
REFERENCE   1  (bases 1 to 7879)
  AUTHORS   Delius,H. and Hofmann,B.
  TITLE     Primer-directed sequencing of human papillomavirus types
  JOURNAL   Curr. Top. Microbiol. Immunol. 186, 13-31 (1994)
  STANDARD  full staff_review
REFERENCE   2  (bases 1 to 7879)
  AUTHORS   Delius,H.
  TITLE     Direct Submission
  JOURNAL   Submitted (06-AUG-1993) to the EMBL/GenBank/DDBJ databases.
            H. Delius, Deutsches Krebsforschungszentrum, Abteilung ATV,
            Im Neuenheimer Feld 506, W 6900 Heidelberg, FRG
  STANDARD  full staff_review
COMMENT     HPV-35h is considered to be a member of the "intermediate
            risk" anogenital group. All of the viruses in this group are
            most prevalent in high-grade intraepithelial lesions and less
            prevalent in invasive cancers. In a study that used 15 common
            anogenital probes to screen 2627 subjects, hybridization to
            the "intermediate risk" probes occurred in 26.4% of all
            high-grade intraepithelial neoplasias, 23.3% of all low-grade
            intraepithelial neoplasias, 13.7% of all invasive cancers, and
            3.3% of all normal samples. The relatively high percentage of
            high-grade lesions with respect to invasive cancers justifies
            the placement of type 35h in the "intermediate risk" category.
            HPV-35h is considered to be of moderate prevalence. Of all
            invasive cancer biopsies tested, 11% were positive for DNA in
            the "intermediate risk" group. HPV-35h predominantly infects
            tissues of the cervix and other lower-anogenital tract sites:
            the vulva, the vagina, the penis, the perineum and the anus.
            It has also been detected in a bowenoid lesion of the finger.
            The 7879 bp genome of HPV-35h was cloned in vector pBR322. As
            is common in other papillomaviruses, the E4 ORF lacks an
            initiating methionine codon.
            Of note, the E6 ORF exhibits putative splice donor and
            acceptor sites similar to those seen in other oncogenic types
            (16, 18, 31, and 33). The long control region (LCR) of HPV-35h
            contains features that are conserved either among all or many
            papillomaviruses or only among those associated with
            anogenital lesions. A feature that appears to be common among
            the mucosal types is the glucocorticoid response element
            (GRE). These elements have been shown to mediate the hormonal
            response in the presence of glucocorticoids. In addition to
            HPV-35h, potential GREs have been identified in types 6, 11,
            16, 18, 31, 33, and 39. In HPV-35h, five putative NF-1 regions
            are present. Also found in the LCR of many HPV types are
            tandem direct repeats, direct repeats, and inverted repeats.
            Like those other HPVs, HPV-35h contains tandem direct repeats
            of 8 bp and direct repeats of 11 bp. Of particular interest is
            the presence of a 20 bp sequence that is conserved among the
            oncogenic mucosal types (16, 18, 31, 33, 35h, 35, 39, and 51)
            and has not been found in the non-oncogenic types (1a, 2a, 5,
            6, 8, 9, 11, 17, 19, 20, 25, 36, 47 or 57). This 20 bp region
            is located approximately 30 bp 5' to the keratinocyte-specific
            octamer.
FEATURES Location/Qualifiers CDS 110..559 /note="putative" /note="ORF E6 from bp 59 to 559" /product="transforming protein" /gene="E6" /note="putative" /codon_start=1 /translation="MFQDPAERPYKLHDLCNEVEESIHEICLNCVYCKQELQRSEVYD FACYDLCIVYREGQPYGVCMKCLKFYSKISEYRWYRYSVYGETLEKQCNKQLCHLLIR CITCQKPLCPVEKQRHLEEKKRFHNIGGRWTGRCMSCWKPTRRETEV" CDS 562..861 /note="putative" /note="ORF E7 from bp 544 to 861" /product="transforming protein" /gene="E7" /note="putative" /codon_start=1 /translation="MHGEITTLQDYVLDLEPEATDLYCYEQLCDSSEEEEDTIDGPAG QAKPDTSNYNIVTSCCKCEATLRLCVQSTHIDIRKLEDLLMGTFGIVCPGCSQRA" CDS 868..2781 /note="putative" /note="ORF E1 from bp 862 to 2781" /product="replication protein" /gene="E1" /note="putative" /codon_start=1 /translation="MADPAGTDEGEGTGCNGWFFVEAVVSRRTGDPVSEDENEDDCDR GEDMVDFINDTDILNIQAETETAQALFHAQEEQTHKEAVQVLKRKYASSPLSSVSLCV NNNISPRLKAICIENKNTAAKRRLFELPDSGYGNSEVEIQQIQQVEGHDTVEQCSMGS GDSITSSSDERHDETPTRDIIQILKCSNANAAMLAKFKELFGISFTELIRPFKSDKST CTDWCVAAFGIAPSVAESLKTLIKPYCLYIHIQCLSCSWGMVILALLRFKCAKNRTTI EKLLSKLLCISAASMLIQPPKLRSTPAALYWFKTAMSNISEVDGETPEWIQRQTVLQH SFNDAIFDLSEMVQWAYDNDFIDDSDIAYKYAQLAETNSNACAFLKSNSQAKIVKDCA TMCRHYKRAEKREMTMSQWIKRRCEKVDDDGDWRDIVRFLRYQQVDFVAFLSALKNFL HGVPKKNCILIYGAPNTGKSLFGMSLMHFLQGAIISYVNSKSHFWLQPLYDAKIAMLD DATSPCWAYIDQYLRNALDGNPISLDVKHKALVQLKCPPLLITSNINAGKDDRWPYLH SRVVVFTFHNEFPFDKNGNPVYGLNDKNWKSFFSRTWCRLNLHEEEDKENDGDAFPAF KCVSGQNTRTLRD" CDS 2714..3817 /note="putative" /note="ORF E2 from bp 2687 to 3817" /product="regulatory protein" /gene="E2" /note="putative" /codon_start=1 /translation="MMETLSQRLSVCQDKILEHYETDSTCLSDHIQYWKLIRLECAVF YKAREMGIKTLNHQVVPTQAISKAKAMQAIELQLMLETLNTTEYSTETWTLQETSIEL YTTVPQGCFKKHGVTVEVQFDGDKQNTMHYTNWTHIYILEDSICTVVKGLVNYKGIYY VHQGVETYYVTFREEAKKYGKKNIWEVHVGGQVIVCPESVFSSTELSTAEIATQLHAY NTTETHTKACSVGTTETQKTNHKRLRGGTELPYNPTKRVRLSAVDSVDRGVYSTSDCT NKDRCGSCSTTTPIVHLKGDANTLKCLRYRLGKYKALYQDASSTWRWTCTNDKKQIAI VTLTYTTEYQRDKFLTTVKIPNTVTVSKGYMSI" TATA_signal 2846..2851 /note="putative" CDS <3294..3584 /note="putative" 
/gene="E4" /note="putative" /codon_start=1 /translation="LFVLNLYLAAQNYPLLKLLHSYTPTTPPRPIPKPAPWAPQKPRR QITNDFEGVPSSPTTPPSECDSVPWTVLTEGSTLHLTAQTKTGVVVVVQLHL" CDS 3814..4065 /note="putative" /note="ORF E5 from bp 3799 to 4065" /gene="E5" /note="putative" /codon_start=1 /translation="MIDLTASSTVLLCFLLCFCVLLCLCLLVRSLLLSVSLYSALILL VLILWVTVATPLRCFCCFLCFLYIPMGMINAHAQYLAVQ" CDS 4211..5620 /note="putative" /note="ORF L2 from bp 4196 to 5620" /product="minor capsid protein" /gene="L2" /note="putative" /codon_start=1 /translation="MRHKRSTKRVKRASATQLYRTCKAAGTCPPDVIPKVEGNTVADQ ILKYGSMAVFFGGLGIGSGSGTGGRSGYVPLGTTPPTAATNIPIRPPVTVESIPLDTI GPLDSSIVSLVEETSFIESGAPVVTPRVPPTTGFTITTSTDTTPAILDVTSISTHDNP TFTDPSVLHPPTPAETSGHFVLSSSSISTHNYEEIPMDTFIVSTDSNNITNSTPIPGS RPTTRLGLYSKGTQQVKVVDPAFMTSPAKLITYDNPAYEGLNPDTTLQFEHEDISLAP DPDFMDIIALHRPALTSRKGTIRYSRVGNKRTMHTRSGKAIGARVHYYQDLSSITEDI ELQPLQHVPSSLPHTTVSTSLNDGMFDIYAPIDTEEDIIFSASSNNTLYTTSNTAYVP SNTTIPLSSGYDIPITAGPDIVFNSNTITNTVLPVPTGPIYSIIADGGDFYLHPSYYL LKRRRKRIPYFFADVSVAV" CDS 5601..7109 /note="putative" /note="ORF L1 from bp 5565 to 7109" /product="major capsid protein" /gene="L1" /note="putative" /codon_start=1 /translation="MSLWRSNEATVYLPPVSVSKVVSTDEYVTRTNIYYHAGSSRLLA VGHPYYAIKKQDSNKIAVPKVSGLQYRVFRVKLPDPNKFGFPDTSFYDPASQRLVWAC TGVEVGRGQPLGVGISGHPLLNKLDDTENSNKYVGNSGTDNRECISMDYKQTQLCLIG CRPPIGEHWGKGTPCNANQVKAGECPPLELLNTVLQDGDMVDTGFGAMDFTTLQANKS DVPLDICSSICKYPDYLKMVSEPYGDMLFFYLRREQMFVRHLFNRAGTVGETVPADLY IKGTTGTLPSTSYFPTPSGSMVTSDAQIFNKPYWLQRAQGHNNGICWSNQLFVTVVDT TRSTNMSVCSAVSSSDSTYKNDNFKEYLRHGEEYDLQFIFQLCKITLTADVMTYIHSM NPSILEDWNFGLTPPPSGTLEDTYRYVTSQAVTCQKPSAPKPKDDPLKNYTFWEVDLK EKFSADLDQFPLGRKFLLQAGLKARPNFRLGKRAAPASTSKKSSTKRRKVKS" TATA_signal 6657..6663 /note="putative" source 1..7879 /organism="Human papillomavirus type 35h" /clone="insert in BamHI site of pBR322" /sequenced_mol="DNA" BASE COUNT 2570 a 1339 c 1570 g 2400 t ORIGIN 109 bp upstream from beginning of E6 cds 1 ccctataaaa aaaacaggga gtgaccgaaa acggtcgtac cgaaaacggt tgccataaaa 61 gcagaagtgc 
acaaaaaagc agaagtggac agacattgta aggtgcggta tgtttcagga 121 cccagctgaa cgaccttaca aactgcatga tttgtgcaac gaggtagaag aaagcatcca 181 tgaaatttgt ttgaattgtg tatactgcaa acaagaatta cagcggagtg aggtatatga 241 ctttgcatgc tatgatttgt gtatagtata tagagaaggc cagccatatg gagtatgcat 301 gaaatgttta aaattttatt caaaaataag tgaatataga tggtatagat atagtgtgta 361 tggagaaacg ttagaaaaac aatgcaacaa acagttatgt catttattaa ttaggtgtat 421 tacatgtcaa aaaccgctgt gtccagttga aaagcaaaga catttagaag aaaaaaaacg 481 attccataac atcggtggac ggtggacagg tcggtgtatg tcctgttgga aaccaacacg 541 tagagaaacc gaggtgtaat catgcatgga gaaataacta cattgcaaga ctatgtttta 601 gatttggaac ccgaggcaac tgacctatac tgttatgagc aattgtgtga cagctcagag 661 gaggaggaag atactattga cggtccagct ggacaagcaa aaccagacac ctccaattat 721 aatattgtaa cgtcctgttg taaatgtgag gcgacactac gtctgtgtgt acagagcaca 781 cacattgaca tacgtaaatt ggaagattta ttaatgggca catttggaat agtgtgcccc 841 ggctgttcac agagagcata atctacaatg gctgatcctg caggtacaga tgaaggggag 901 gggacgggat gtaatggatg gttttttgta gaagcagtag ttagtagacg tacgggggat 961 ccagtgtcag aggacgaaaa tgaagatgac tgtgacaggg gggaggatat ggtggacttt 1021 ataaatgata cagatatatt aaacatacag gcagaaacag agacagcaca agcattattt 1081 catgcacagg aggagcaaac acacaaagag gctgtacagg tcctaaaacg aaagtatgct 1141 agtagtccac ttagcagcgt gagcttatgt gttaataata acataagtcc acgtttaaaa 1201 gctatttgca ttgaaaataa aaatacagca gcaaagcgac gattatttga actaccagac 1261 agcggttatg gcaattctga agtggaaata cagcagatac aacaggtaga ggggcatgat 1321 acagttgaac aatgtagtat gggcagtggg gatagtataa cctctagtag cgatgaaaga 1381 catgatgaga ctccaacgcg agacataata caaatactaa aatgtagtaa tgcaaacgca 1441 gctatgttgg ctaaatttaa agaactattt ggtattagtt ttacagaact tattagacca 1501 tttaagagtg ataaatccac atgtacagat tggtgtgtgg ccgcatttgg aatagcccca 1561 agtgtggcgg aaagtttaaa aacattaatt aaaccatatt gtttatatat acatatacaa 1621 tgtttatcgt gttcatgggg tatggtaatt ctagcattat tacgatttaa atgtgcaaaa 1681 aacagaacaa caattgaaaa actattatca aaattgctat gtatttcagc tgcaagtatg 1741 ctaatacaac caccaaaatt acgtagtacc 
ccagctgcgt tatattggtt taaaacagca 1801 atgtcaaata ttagtgaggt tgatggagaa acaccagaat ggattcaaag acaaacagta 1861 ttacagcata gttttaatga tgcaatattt gacctatctg aaatggtaca atgggcatat 1921 gacaatgatt ttatagatga tagtgatata gcatataaat atgcacaatt ggcagaaact 1981 aatagtaatg catgtgcttt tttaaaaagt aattcgcaag ctaaaattgt aaaagattgt 2041 gcaacaatgt gtagacatta taaacgagct gaaaaaagag aaatgacaat gtcacagtgg 2101 attaaaaggc gatgtgaaaa ggtggacgat gacggtgact ggagggacat agtacgattt 2161 ttaagatatc aacaagtaga ttttgtggca tttttatctg cactaaaaaa ttttttacat 2221 ggtgtgccta aaaaaaattg catacttata tatggagcac caaacacagg taaatcatta 2281 tttggaatga gtctaatgca tttcttacaa ggagctatta tatcctatgt aaattctaaa 2341 agccattttt ggttgcagcc attatatgat gccaaaatag ctatgttaga tgatgctaca 2401 tcgccatgtt gggcatatat agaccaatat ttaagaaatg cactagatgg aaatcctatt 2461 tcattagatg taaagcataa agcattagtg caattaaaat gcccaccttt acttattaca 2521 tcaaatataa atgcaggcaa agatgacagg tggccatact tacatagcag ggtagtggtc 2581 tttacatttc acaatgaatt cccatttgat aaaaatggaa acccagtgta tgggcttaat 2641 gataaaaact ggaaatcctt tttctcaagg acgtggtgca gattaaattt gcacgaggaa 2701 gaggacaaag aaaatgatgg agacgctttc ccagcgttta agtgtgtgtc aggacaaaat 2761 actagaacat tacgagactg atagcacatg tttgtctgat cacatacagt attggaaact 2821 gattcgtctt gaatgtgcag tattttataa agcaagagaa atgggaatta aaactcttaa 2881 ccaccaagtg gttccaacgc aggccatttc aaaagccaaa gcaatgcaag caattgaact 2941 gcaattaatg ttagagacat taaatacaac tgagtatagc acagaaacat ggacactgca 3001 agaaacaagt attgaattat atacaacagt tccacaagga tgttttaaaa aacatggggt 3061 tacagtggaa gtacaatttg atggtgataa acaaaatact atgcattata ctaattggac 3121 acatatatat atattagagg acagtatatg tactgttgta aagggactgg taaattataa 3181 aggtatttat tatgtgcatc agggtgtaga aacatattat gttactttta gggaagaggc 3241 taaaaagtat ggaaaaaaaa atatatggga agtgcatgtg ggtggtcagg taattgtttg 3301 tcctgaatct gtatttagca gcacagaact atccactgct gaaattgcta cacagctaca 3361 cgcctacaac accaccgaga cccataccaa agcctgctcc gtgggcacca cagaaaccca 3421 gaagacaaat cacaaacgac ttcgaggggg taccgagctc 
ccctacaacc ccaccaagcg 3481 agtgcgactc agtgccgtgg acagtgttga cagaggggtc tactctacat ctgactgcac 3541 aaacaaagac cggtgtggta gttgtagtac aactacacct atagtacatt taaaaggtga 3601 tgcaaataca ttaaagtgtt taagatatag attgggtaaa tataaagcat tgtatcaaga 3661 tgcttcatct acatggagat ggacatgtac aaacgataaa aaacaaatag caattgtaac 3721 attaacttac acaacagaat atcaaaggga taaattttta actacagtaa aaatacctaa 3781 cacagttaca gtgtctaaag gatatatgtc tatatgatag accttacagc ttccagtact 3841 gtgttgctgt gctttttgtt gtgcttttgt gtgcttttgt gcttgtgtct gcttgtacgt 3901 tcgctattgc tatctgtgtc attatactca gcattaatat tactggtttt aatactgtgg 3961 gttactgtag caacaccact acgttgcttt tgttgttttc tttgcttttt gtatatacct 4021 atgggaatga ttaacgctca tgcacaatat ttggcagtac agtaattgta tacaaacatt 4081 gtgtttggta ctgtgtaaca tgtgtgtatg gtggttttat tttttgttgt tcattgtata 4141 ttttgttttt ttactgtttt taaacatttt tatttctgtg tttttaataa attgatcaca 4201 tggtataacc atgcgacaca aaaggtctac aaaacgtgtt aaacgtgcat ctgcaacaca 4261 actatatcgt acttgcaaag ctgcaggaac ttgtccacca gatgttatac ctaaggttga 4321 gggtaatact gttgctgatc aaattttaaa atatggcagc atggctgtgt tttttggggg 4381 gttaggaatt ggttctggat ctggcacagg tggaagatct ggatatgttc cactgggtac 4441 aacacctcca acggctgcca caaacattcc tatacgaccc cctgtaactg tggaaagtat 4501 accattagac acaattggcc ctttagattc ttctatagtg tcattagtag aggaaactag 4561 ttttattgag tctggtgccc ctgttgttac accaagggtc ccacctacaa caggttttac 4621 aataaccaca tctacagata ccacacctgc tattttagat gtgacatcca taagtacaca 4681 tgataatcct actttcactg atccttctgt tttacaccca cccacgcctg cagaaacttc 4741 aggtcatttt gtactttcat catcttctat tagtacacat aattatgaag aaatccctat 4801 ggatactttt attgtttcca cagacagcaa taatataact aatagcacgc ctattccagg 4861 gtctcgccct acgacacgcc taggattata tagtaaaggt acccagcagg ttaaggttgt 4921 tgaccctgcc tttatgactt ctcctgcaaa acttattaca tatgataatc ctgcatatga 4981 aggccttaac cctgatacaa ccttacaatt tgagcatgag gatattagct tagctccgga 5041 tcctgacttt atggacatta tagctttaca taggcctgca ctaacatcta ggaaaggcac 5101 tattagatat agtagagtag gtaataaacg tactatgcat acacgaagtg 
gaaaagctat 5161 aggggcacgg gtacattatt atcaggattt aagtagtatt actgaagata tagaattaca 5221 acccttacaa catgtaccat cctctttacc acataccact gtttcaacat cattaaatga 5281 tggtatgttt gatatttatg ctcctataga tactgaggaa gatattatat tttcagcatc 5341 ttctaacaat actttatata ctacatctaa cactgcatat gttcctagca atactactat 5401 accattaagt agtggctatg atattcctat aacagcaggg ccagacattg tatttaactc 5461 taatactatt actaacactg tactaccggt acccacaggt cctatatatt ctattattgc 5521 agatgggggt gacttttatt tacaccctag ttattattta ttaaaacgac gtcgtaaacg 5581 tatcccatat ttttttgcag atgtctctgt ggcggtctaa cgaagccact gtctacctgc 5641 ctccagtgtc agtgtctaag gttgttagca ctgatgaata tgtaacacgc acaaacatct 5701 actatcatgc aggcagttct aggctattag ctgtgggtca cccatactat gctattaaaa 5761 aacaagattc taataaaata gcagtaccca aggtatctgg tttgcaatac agagtattta 5821 gagtaaaatt accagatcct aataagtttg gatttccaga cacatcattt tatgatcctg 5881 cctcccagcg tttggtttgg gcctgtacag gagttgaagt aggtcgtggt cagccattgg 5941 gtgtaggtat tagtggtcat cctttattaa ataaattgga tgatactgaa aattctaata 6001 aatatgttgg taactctggt acagataaca gggaatgcat ttctatggat tataaacaaa 6061 cacaattgtg tttaataggt tgtaggcctc ctataggtga acattgggga aaaggcacac 6121 cttgtaatgc taaccaggta aaagcaggag aatgtcctcc tttggagtta ctaaacactg 6181 tactacaaga cggggacatg gtagacacag gatttggtgc aatggatttt actacattac 6241 aagctaataa aagtgatgtt cccctagata tatgcagttc catttgcaaa tatcctgatt 6301 atctaaaaat ggtttctgag ccatatggag atatgttatt tttttattta cgtagggagc 6361 aaatgtttgt tagacattta tttaataggg ctggaactgt aggtgaaaca gtacctgcag 6421 acctatatat taagggtacc actggcacat tgcctagtac tagttatttt cctactccta 6481 gtggctctat ggtaacctcc gatgcacaaa tatttaataa accatattgg ttgcaacgtg 6541 cacaaggcca taataatggt atttgttgga gtaaccaatt gtttgttact gtagttgata 6601 caacccgtag tacaaatatg tctgtgtgtt ctgctgtgtc ttctagtgac agtacatata 6661 aaaatgacaa ttttaaggaa tatttaaggc atggtgaaga atatgattta cagtttattt 6721 ttcagttatg taaaataaca ctaacagcag atgttatgac atatattcat agtatgaacc 6781 cgtccatttt agaggattgg aattttggcc ttacaccacc gccttctggt accttagagg 
6841 acacatatcg ctatgtaaca tcacaggctg taacttgtca aaaacccagt gcaccaaaac 6901 ctaaagatga tccattaaaa aattatactt tttgggaggt tgatttaaag gaaaagtttt 6961 ctgcagactt agatcaattt ccgttgggcc gtaaattttt gttacaagca ggactaaagg 7021 ccaggcctaa ttttagatta ggcaagcgtg cagctccagc atctacatct aaaaaatctt 7081 ctactaaacg tagaaaagta aaaagttaat gtgtaaatgt gtatgcatgt atactgtgtg 7141 ttatgtgttg tagtgcttgt atatatatta tgtgttgtgg tgcctgtttg tgttgtacat 7201 ggcgtgtaaa tgtgtgtata atattgtgca atgtgttgta cgtgggtgtt ttttgtatgt 7261 atgttgttgt atgtatgtca gtacgcaata aaagtgatgt gtgtgtttat aattaacact 7321 gtattgttgt atgactatgg gtgcacccat atgacttaca taattacagt acacgctata 7381 tgttgtatat aacaattcta cctccatttt gtgtgttagt gtcctttaca ttacctttca 7441 accgatttcg gttgctgttg gcaagcttta tatgtttttt acaaaaacat tcctacctca 7501 gcagaacact taatccttgt gttcctgata tatattgttt gccaacttta tattggcttt 7561 tgccaatctt taaacttgat tcatcttgca gtattagtca tttttcatac ttgtggtcca 7621 cccacacttg taacacttgt aacagtgctt ttaggcacat attttttgca tttctaaagg 7681 gctttaattg cacacttggc tttacatatt atgtgtgttt gccaacacca ccctacacat 7741 cctgccaact ttaagttaaa acatgcatgt aaaacattac tcactgtatt acacattgtt 7801 atatgcacac aggtgtgtcc aaccgatttg gattacagtt ttataagcat ttctttttat 7861 tatagttagt aacaattat