
THE JOURNAL OF BIOLOGICAL CHEMISTRY, Vol. 263, No. 30, Issue of October 25, pp. 15785-15790, 1988

© 1988 by The American Society for Biochemistry and Molecular Biology, Inc. Printed in U.S.A.

Isolation and Characterization of the Human 2,3-Bisphosphoglycerate Mutase Gene*

(Received for publication, April 12, 1988)

Virginie Joulin, Marie-Claude Garel, Philippe Le Boulch, Colette Valentin, Raymonde Rosa, Jean Rosa, and Michel Cohen-Solal‡

From the Institut National de la Santé et de la Recherche Médicale U.91, Hôpital Henri Mondor, 94010 Créteil, France

The human 2,3-bisphosphoglycerate mutase gene was isolated from genomic libraries and analyzed by Southern blots and DNA sequencing. The transcription initiation site was localized by primer extension as well as by S1 protection of the mRNA. The gene extends over 22 kilobase pairs; it is composed of two introns (8.8 and 11.5 kilobase pairs long) and three exons (84, 662, and 965 base pairs long). The second exon correlates with a functional subdomain of the protein, as shown by comparison with the yeast phosphoglycerate mutase structure. The sequence TAGAAAA was found 30 bases upstream from the transcription initiation site and could be analogous to the TATA box. A sequence homologous to the CCAAT box was found twice, at positions -75 and -178. There is no GC-rich sequence or GC box in the 5'-flanking region of the gene. Northern blot analysis indicates that the 2,3-bisphosphoglycerate mutase mRNA is detected mainly in erythroid tissues and cell lines, although it is also present in low amounts in a nonerythroid tissue. A comparison of the 5'-upstream sequences with other promoters active only in erythroid cells did not reveal any common signal that could be responsible for the "erythroid promoter."

2,3-Bisphosphoglyceromutase (BPGM)¹ (EC 2.7.5.4) is a multifunctional enzyme that catalyzes both the synthesis of 2,3-diphosphoglycerate (2,3-DPG) by mutase activity and its degradation by phosphatase activity (1-3). 2,3-DPG is the main allosteric effector of hemoglobin and plays a major role in the modulation of its affinity for oxygen (for review see Ref. 4). In humans, BPGM activity is found only in red blood cells, and 2,3-DPG is detected at a high concentration in these cells. The BPGM activity and the levels of 2,3-DPG increase during the maturation of erythroid precursor cells in rabbit marrow (5) and after induction of murine Friend cells (6), indicating that the enzyme is synthesized during the late stages of differentiation. Furthermore, it was shown by immunoprecipitation following in vitro translation that the BPGM mRNA is present in human reticulocytes and fetal liver but is not detectable in adult liver (7).

* This work was supported by grants from the Institut National de la Santé et de la Recherche Médicale, the Centre National de la Recherche Scientifique, the Université Paris-Val-de-Marne, and the Fondation pour la Recherche Médicale Française. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) J04059.

‡ To whom correspondence should be addressed.

¹ The abbreviations used are: BPGM, 2,3-bisphosphoglyceromutase; kb, kilobase pairs; 2,3-DPG, 2,3-diphosphoglycerate; bp, base pair(s).

BPGM thus appears to be a key enzyme in red blood cell metabolism and in the regulation of hemoglobin function as well as a specific marker in the differentiation of the erythroid cell line. As a first approach to an understanding of the expression and regulation of this tissue-specific gene, we have recently cloned the BPGM cDNA (8) and described the chromosomal location of the gene (9). This paper now deals with the isolation and characterization of the complete gene for the human BPGM and the level of BPGM mRNA in several tissues and cell lines. It describes the gene organization, which consists of three exons distributed over 22 kb, and analyzes its putative regulatory upstream elements and RNA processing signals. A comparison of these elements with those found in other genes expressed only in erythroid tissues is presented.

MATERIALS AND METHODS

Preparation and Analysis of RNA-Total RNA was extracted by the LiCl method (10) from the following human tissues: spleen from a homozygous β-thalassemic patient, normal fetal and adult liver, and adult spleen. For the cultured cell lines (K562 and HEL), total RNA was prepared according to Ares and Howell (11).

Genomic Southern Blot Analysis-Genomic DNA was isolated from human leukocytes of a normal individual (12) and digested with several restriction endonucleases. After electrophoresis on a 0.7% agarose gel, the DNA was transferred to a nitrocellulose filter (13) and hybridized with specific probes that were labeled by the random primer method (14) (specific activity, 10⁹ cpm/µg).

Isolation and Characterization of Human Genomic Clones-Two different human genomic libraries were screened using a BPGM cDNA subclone that covers most of the coding sequence (clone λgt11-B1) (8). The first library was prepared by insertion of MboI partially digested DNA fragments at the BamHI site of cosmid pCV 105 (generously provided by Dr. M. Goossens, Créteil, France and Dr. C. Y. Lau, San Francisco, CA). Two cosmid recombinants were isolated and prepared as described by Godson and Vapnek (15). The second human genomic library was constructed by insertion of Sau3A partially digested DNA fragments at the BamHI site of phage λ EMBL4 (kindly provided by Dr. A. Kahn, Paris). Its screening with a 5' part of the cDNA by the method of Benton and Davis (16) yielded only one positive clone, which was identified and purified as described by Maniatis et al. (13).

Positively selected restriction fragments of the three positive genomic clones were subcloned into phage M13 mp19 and sequenced (17). Sequence determination of long fragments was performed after partial deletion of the insert using the Cyclone system (IBI, New Haven, CT) with slight modifications (18). Sequences were stored in a DPS 8 computer (CITI 2, Paris) using the DB system (19) and analyzed by programs derived from those of Staden (20).

The size of the first intron was estimated by partial HindIII digestion of the recombinant phage. After blotting of the digests, filters were hybridized with two synthetic, 5'-end-labeled oligonucleotide probes located in the cDNA sequence near the 5' (primer A, Figs. 4 and 5) and 3' ends (primer B, Figs. 4 and 5) of adjacent exons. This method also gives an estimate of the relative position of the HindIII restriction sites. The length of the second exon was determined by Southern blot analysis of genomic DNA hybridized with an intronic probe.

Primer Extension-In order to determine the initiation site for the transcription of the BPGM gene, a 30-mer oligonucleotide (5'-GGACATACTGATGGCTGAACTTCCCAGGCG-3') complementary to nucleotides 122-151 of the BPGM mRNA was synthesized (primer A, Figs. 4 and 5). An excess of 5'-³²P-end-labeled oligonucleotide was hybridized with 2 µg of human reticulocyte poly(A)+ RNA. The extension reaction used 10 units of avian myeloblastosis virus reverse transcriptase in the presence of 1 mM of all four deoxynucleotide triphosphates, 10 units of RNasin, and 50 µg of actinomycin D per ml. The resulting transcript was analyzed on a 6% polyacrylamide sequencing gel, followed by autoradiography.

Nuclease S1 Mapping-The probe used for nuclease S1 analysis was constructed from an M13 clone whose primer-extended product is complementary to the BPGM mRNA. A 20-mer synthetic oligonucleotide (5'-GCAAAGGGGCCACCAGCAGC-3') complementary to nucleotides 63-82 in the cDNA sequence (primer B, Figs. 4 and 5) was 5'-end-labeled. It was used to generate a radioactive extended product by the action of Escherichia coli DNA polymerase I (Klenow fragment) in the presence of 50 µM dNTP. After linearization with AvaII and purification on a 6% polyacrylamide denaturing gel, this single-stranded probe was hybridized at 56 °C with 2 µg of human reticulocyte poly(A)+ RNA for 16 h in 80% formamide and digested with 80 units of S1 nuclease; the protected fragment was analyzed on a 6% denaturing polyacrylamide sequencing gel and autoradiographed.
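The stated lengths and coordinates of the two oligonucleotides can be cross-checked with a few lines of code. The sketch below is only an illustration, not part of the original methods: it takes the primer sequences and coordinate ranges exactly as printed above, and the helper names (`reverse_complement`, `span_length`) are hypothetical.

```python
# Oligonucleotide sequences as printed in the text, written 5' to 3'.
PRIMER_A = "GGACATACTGATGGCTGAACTTCCCAGGCG"  # stated: complementary to mRNA nt 122-151
PRIMER_B = "GCAAAGGGGCCACCAGCAGC"            # stated: complementary to cDNA nt 63-82

# Translation table for Watson-Crick complements.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def span_length(first: int, last: int) -> int:
    """Number of nucleotides in an inclusive coordinate range."""
    return last - first + 1


# Sense-strand (mRNA-like) sequences that each primer should anneal to.
target_a = reverse_complement(PRIMER_A)
target_b = reverse_complement(PRIMER_B)

print(len(PRIMER_A), span_length(122, 151), target_a)
print(len(PRIMER_B), span_length(63, 82), target_b)
```

Both primers have the length implied by their coordinate ranges (30 and 20 nucleotides). The sense-strand target of primer A ends in ATGTCC, which would be consistent with the primer overlapping the initiation codon placed 145 bases downstream of position +1 in the Results.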

RESULTS

BPGM mRNA Distribution in Tissues and Cell Lines-The level of BPGM mRNA was estimated in different human tissues and erythroid cell lines from Northern blot analysis (Fig. 1). Results indicated a high concentration of BPGM mRNA in erythropoietic tissues, such as fetal liver, spleen with erythropoietic islands, as well as in reticulocytes (7). In contrast, BPGM mRNA was observed at very low levels in normal adult liver as well as in HEL and noninduced K562 cells.

Isolation of the Human BPGM Gene-Southern blot analysis of normal human genomic DNA probed with a human BPGM cDNA from the coding region revealed that the corresponding gene was distributed over large fragments and contained large introns (Fig. 2).

Screening of a cosmid genomic library yielded two clones containing only the 3' half of the BPGM gene. In order to detect clones that cover the other half of the BPGM gene and the 5'-flanking region, a second human genomic library made in phage λEMBL4 was screened with a cDNA probe corresponding to its 5' area. The single clone detected during the screening contained only the 5' end of the gene.

The three independent clones isolated were characterized by restriction mapping, Southern blot comparisons with genomic DNA, and extensive nucleotide sequencing. Overlaps between the clones were determined by Southern blot analysis. Fig. 3 summarizes the results obtained.

[Figure 1 image not reproduced: RNA blot, lanes 1-6, with a band labeled 1880.]

FIG. 1. Determination of BPGM mRNA levels by RNA blot analysis. Five µg of human poly(A)+ RNA was electrophoresed on a 1.2% agarose-formaldehyde gel, transferred to nitrocellulose, and hybridized with a BPGM probe. Lane 1, HEL mRNA; lane 2, noninduced K562 mRNA; lane 3, fetal liver mRNA; lane 4, normal adult liver mRNA; lane 5, thalassemic spleen mRNA; and lane 6, normal adult spleen mRNA.

[Figure 2 image not reproduced: Southern blot, lanes 1-5, with size markers at 23.1, 9.4, 6.5, 4.4, 2.3, and 2.0 kb.]

FIG. 2. Southern blot analysis of human leukocyte DNA. Each lane contains 10 µg of genomic DNA digested with PvuII (lane 1), PstI (lane 2), HindIII (lane 3), EcoRI (lane 4), or BglII (lane 5). The probe used was the λgt11-B1 cDNA insert, which covers most of the coding region of BPGM. The size marker (on the left side of the panel) is HindIII-digested λ phage DNA, and fragment sizes are given in kilobase pairs.

Structure of the Human BPGM Gene-The human BPGM gene extends over 22 kb. It is composed of two introns and three exons. As seen in Fig. 4, exon 1 extends for 84 bp and is located entirely within the 5'-untranslated region of the largest cDNA clone previously isolated (8). It contains the transcription initiation site of the BPGM mRNA (see below). Exon 2 extends for 662 bp and contains 61 bp of the 5'-untranslated region followed by 601 bp of the coding region. Exon 3 contains the last 179 bp of the coding region as well as the complete 786 bp of the 3'-untranslated region. The sequence of the three exons is in complete agreement with the previously reported sequence of the BPGM cDNA (8). The size of the first intron is ~8.8 kb, and that of the second intron, ~11.5 kb. The junction sequences of the two introns conform well to the consensus for splice donors/acceptors (21).
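The sizes quoted in this paragraph are mutually consistent, which can be verified by simple arithmetic. The following sketch uses only the numbers given in the text; the variable names are illustrative, and the intron sizes are the approximate values reported.

```python
# Exon and intron sizes as reported in the text (intron sizes are approximate).
EXONS_BP = {"exon 1": 84, "exon 2": 662, "exon 3": 965}
INTRONS_BP = {"intron 1": 8_800, "intron 2": 11_500}

# Sub-divisions of exons 2 and 3 as reported in the text.
EXON2_5UTR, EXON2_CODING = 61, 601
EXON3_CODING, EXON3_3UTR = 179, 786

# Total genomic span and spliced transcript length (without the poly(A) tail).
gene_span_bp = sum(EXONS_BP.values()) + sum(INTRONS_BP.values())
spliced_length_bp = sum(EXONS_BP.values())

# 5'-untranslated region: all of exon 1 plus the first 61 bp of exon 2.
five_prime_utr_bp = EXONS_BP["exon 1"] + EXON2_5UTR

# Coding region, split across exons 2 and 3, counted in codons (stop included).
coding_bp = EXON2_CODING + EXON3_CODING
codons, remainder = divmod(coding_bp, 3)

print(gene_span_bp, spliced_length_bp, five_prime_utr_bp, coding_bp, codons, remainder)
```

The exons and introns sum to about 22 kb, the 5'-untranslated region comes to 145 bp (matching the distance between position +1 and the initiation codon), and the coding region is 780 bp, that is, 260 codons including the termination codon. Because 601 is not a multiple of 3, the second intron interrupts a codon.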

Determination of the Transcription Initiation Site-The transcription initiation site of the BPGM mRNA was determined by two independent methods. Primer extension of reticulocyte mRNA using a 30-mer oligonucleotide complementary to a sequence located upstream from the initiation codon resulted in three major fragments, 153, 151, and 150 bases in length, and one minor transcript of 152 bases (Fig. 5A), corresponding to the sequence TGAT. In addition, S1 nuclease protection was performed using a single-stranded 5'-end-labeled fragment of complementary DNA, which is 135 bases in length (Fig. 5B) and homologous to the sequence of the first exon. The size of the S1 nuclease-resistant RNA-DNA hybrid is 84 bp. Both results indicate a similar 5' terminus, suggesting that exon 1 is the first one in the BPGM gene and that its 5' end corresponds to the transcription initiation site previously assigned. Position +1 was assigned to the A residue located 145 bases upstream from the initiation codon.
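The relation between the length of a primer extension product and the position of the corresponding 5' terminus can be made explicit. The sketch below assumes that the mRNA numbering used for primer A (nucleotides 122-151, see Materials and Methods) starts at position +1, and it uses the usual convention that there is no position 0; the function name is hypothetical.

```python
# mRNA position paired with the 5' end of primer A (primer spans nt 122-151).
# Assumption: this numbering starts at the assigned position +1.
PRIMER_5PRIME_END = 151


def start_position(product_length: int) -> int:
    """Map an extension product length to the mRNA 5' terminus it implies.

    Positions follow the +1/-1 convention (there is no position 0).
    """
    offset = PRIMER_5PRIME_END - product_length  # 0 means the product ends at +1
    return offset + 1 if offset >= 0 else offset


# Product lengths reported in the text: 153, 151, 150 (major) and 152 (minor).
for length in (153, 152, 151, 150):
    print(length, start_position(length))
```

Under these assumptions the four products map to positions -2, -1, +1, and +2, that is, to four adjacent nucleotides, in line with the statement that they correspond to the sequence TGAT with +1 assigned to the A residue.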

The 5'-Flanking Region of the Human BPGM Gene-The 5'-flanking sequence of the BPGM gene, extending 676 bp upstream from the cap site previously determined (derived from subclone H2, Fig. 3), is shown in Fig. 4. This putative promoter region contains the sequence TAGAAAA, located 30 bases upstream from the transcriptional initiation site,

[Figure 3 image not reproduced: map of the three exons, separated by introns of 8.8 and 11.5 kb, with restriction sites, the subcloned fragments, and the genomic clones EMBL4-C1 and Cos-1 drawn below.]

FIG. 3. Physical map of the human BPGM gene. The entire human BPGM gene is shown with coding sequences indicated by filled boxes and untranslated sequences of the mRNA by open boxes. Restriction sites were assigned by analysis of the sequence and by restriction mapping of the genomic clones EMBL4-C1 and Cos-1 indicated at the bottom of the figure. Enzymes employed were: E, EcoRI; H, HindIII; P, PvuII; EV, EcoRV. The fragments subcloned into M13 mp19 and sequenced by the dideoxy chain-termination method of Sanger (15) are shown below the restriction map: EcoRI fragments (E1 and E2), HindIII fragments (H1 and H2), and HindIII-EcoRI double-digested fragment (HE1).

[Figure 4 sequence listing: the nucleotide sequence is not reproduced here; it was deposited under GenBank/EMBL accession number J04059. Recoverable annotations: exon-intron boundaries at 84/IVS 1, IVS 1/85, 746/IVS 2, and IVS 2/747; intron sizes of 8.8 kb and 11.5 kb; positions of Primer A and Primer B. The deduced amino acid sequence reads:]

Met Ser Lys Tyr Lys Leu
Ile Met Leu Arg His Gly Glu Gly Ala Trp Asn Lys Glu Asn Arg Phe Cys Ser Trp Val Asp Gln Lys Leu Asn Ser Glu Gly
Met Glu Glu Ala Arg Asn Cys Gly Lys Gln Leu Lys Ala Leu Asn Phe Glu Phe Asp Leu Val Phe Thr Ser Val Leu Asn Arg
Ser Ile His Thr Ala Trp Leu Ile Leu Glu Glu Leu Gly Gln Glu Trp Val Pro Val Glu Ser Ser Trp Arg Leu Asn Glu Arg
His Tyr Gly Ala Leu Ile Gly Leu Asn Arg Glu Gln Met Ala Leu Asn His Gly Glu Glu Gln Val Arg Leu Trp Arg Arg Ser
Tyr Asn Val Thr Pro Pro Pro Ile Glu Glu Ser His Pro Tyr Tyr Gln Glu Ile Tyr Asn Asp Arg Arg Tyr Lys Val Cys Asp
Val Pro Leu Asp Gln Leu Pro Arg Ser Glu Ser Leu Lys Asp Val Leu Glu Arg Leu Leu Pro Tyr Trp Asn Glu Arg Ile Ala
Pro Glu Val Leu Arg Gly Lys Thr Ile Leu Ile Ser Ala His Gly Asn Ser Ser Arg Ala Leu Leu Lys His Leu Glu
Gly Ile Ser Asp Glu Asp Ile Ile Asn Ile Thr Leu Pro Thr Gly Val Pro Ile Leu
Leu Glu Leu Asp Glu Asn Leu Arg Ala Val Gly Pro His Gln Phe Leu Gly Asp Gln Glu Ala Ile Gln Ala Ala Ile Lys Lys
Val Glu Asp Gln Gly Lys Val Lys Gln Ala Lys Lys ***

FIG. 4. Nucleotide sequence of the human BPGM gene and flanking regions. Exon sequences are shown in upper case letters. The deduced amino acid sequence is shown below the nucleotide sequence. Nucleotide residue +1 denotes the A residue which was assigned as the transcription initiation site by primer extension and S1 protection experiments (Fig. 5). The TAG translation termination codon is indicated by three asterisks. The numbers refer to the cDNA sequence. The 5'-upstream elements, the polyadenylation signals, and the termination of transcription sequence of the mRNA found in the 3'-flanking region are underlined. The two sequences underlined with arrows represent the two primers used in the primer extension experiment (Primer A) and the S1 mapping analysis (Primer B) (Fig. 5).

[Figure 5 image not reproduced. Panel A: primer extension gel (lanes 1-6) with products of 150-153 nucleotides, and a schematic of primer A. Panel B: S1 protection gel (lanes 1-5) with bands at 135 and 84 nucleotides, and a schematic showing exon 1, IVS1, the AvaII site, primer B, the S1 probe, and the protected fragment.]

FIG. 5. Determination of the BPGM transcription initiation site. A, primer extension experiment using a 5'-³²P-end-labeled 30-mer synthetic oligonucleotide. The location of the primer is shown at the bottom of the figure. Its inverse complementary sequence is shown in Fig. 4 (Primer A). Two µg of reticulocyte poly(A)+ RNA (lane 5) or 100 µg of yeast tRNA (lane 6) were hybridized to the labeled primer, extended with avian myeloblastosis virus reverse transcriptase, and separated on a 6% polyacrylamide sequencing gel followed by autoradiography at -80 °C with intensifying screens. Lanes 1-4 represent sequencing reactions used as size markers. B, S1 nuclease protection experiment using a single-stranded 5'-end-labeled fragment of 135 nucleotides, complementary to BPGM mRNA. The location of the probe is shown at the bottom of the figure. The inverted complementary sequence of the primer used in preparation of the single-stranded probe is shown in Fig. 4 (Primer B). The protected S1 nuclease fragment from 2 µg of reticulocyte poly(A)+ RNA (lane 5) was analyzed on a 6% polyacrylamide sequencing gel. The size markers (lanes 1-4) are as in A. The band at 135 nucleotides represents the undigested probe. The major protected fragment of 84 nucleotides in length is indicated.

which could be analogous to the classical TATA box. The sequence CCAAT was also found at positions -75 and -178 and is analogous to the classical CCAAT box. The 5'-flanking region of this gene does not show any GC-rich sequence and lacks the classical GC box sequence, GGGCGG, or its inverse complement; nevertheless, a GGCCGGG motif was found at position -109.
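A search for the GC box "or its inverse complement" amounts to scanning one strand for both GGGCGG and CCGCCC. The sketch below illustrates this kind of scan; it is not the program used in this work (sequences were analyzed with programs derived from those of Staden), the function names are hypothetical, and the test strings are toy sequences rather than the BPGM promoter.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
GC_BOX = "GGGCGG"  # classical GC box core, as given in the text


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif_both_strands(seq: str, motif: str) -> list[tuple[int, str]]:
    """Return sorted (0-based offset, strand) pairs for every motif occurrence.

    The "-" strand is searched by looking for the reverse complement of the
    motif on the given strand; overlapping occurrences are reported.
    """
    seq = seq.upper()
    hits = set()
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.add((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Toy sequences (hypothetical) showing a hit on each strand and a non-hit.
print(find_motif_both_strands("AAGGGCGGTT", GC_BOX))   # motif on the given strand
print(find_motif_both_strands("TTCCGCCCAA", GC_BOX))   # motif on the opposite strand
print(find_motif_both_strands("TTGGCCGGGAA", GC_BOX))  # GGCCGGG alone gives no hit
```

The last call shows why the GGCCGGG motif noted at position -109 is reported separately: it contains neither GGGCGG nor CCGCCC and therefore does not count as a GC box under this definition.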

The 3' End of the Human BPGM Gene-The sequence downstream of the BPGM gene is shown in Fig. 4. The genomic clone Cos-1 is colinear with the cDNA sequence until the beginning of the poly(A) stretch, with an overlap of three As. There is no additional polyadenylation signal in the 200 nucleotides downstream, suggesting that this is the real 3' end of the BPGM transcript. Additionally, a GTGTGTTTGG motif was found 35 bp downstream from the AATAAA signal. This motif is homologous to the consensus sequence YGTGTTYY located 24-38 bp from the polyadenylation site (22).
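The degree of homology between the GTGTGTTTGG motif and the YGTGTTYY consensus can be quantified by sliding the consensus along the motif and counting mismatches, with Y standing for a pyrimidine (C or T) under the IUPAC nucleotide code. The following is a minimal sketch of that comparison; the function names are illustrative and only a subset of the IUPAC codes is included.

```python
# Subset of the IUPAC nucleotide ambiguity codes (Y = pyrimidine, R = purine).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG", "N": "ACGT",
}


def mismatches(window: str, consensus: str) -> int:
    """Count positions where a base is not allowed by the consensus code."""
    return sum(base not in IUPAC[code] for base, code in zip(window, consensus))


def best_alignment(seq: str, consensus: str) -> tuple[int, int]:
    """Return (offset, mismatches) of the best ungapped placement of consensus."""
    k = len(consensus)
    scores = [(mismatches(seq[i:i + k], consensus), i)
              for i in range(len(seq) - k + 1)]
    n_mismatch, offset = min(scores)
    return offset, n_mismatch


offset, n_mismatch = best_alignment("GTGTGTTTGG", "YGTGTTYY")
print(offset, n_mismatch)
```

The best placement starts at the second nucleotide of the motif (TGTGTTTG) and matches the consensus at seven of eight positions; the only mismatch is the final G, where a pyrimidine is expected.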

DISCUSSION

The overlapping cosmid and phage clones studied contain the complete BPGM coding unit and span around 22 kb of the genome. The exact correspondence of the restriction pattern of a genomic Southern blot of human DNA with the map of the genomic clones confirms that the human genome contains only one gene for BPGM. This is in agreement with the single chromosomal location on chromosome 7 observed by in situ hybridization of the cDNA probe (9). The human BPGM gene consists of 3 exons of 84, 662, and 965 bp in length, separated by two introns of about 8.8 and 11.5 kb in length.

Northern blot analysis revealed that BPGM mRNA is present at a very high level in all erythropoietic tissues studied. This correlates well with the fact that until now BPGM could be detected by enzymatic assays and in vitro mRNA translation (7) only in erythropoietic cells. BPGM was consequently considered an erythroid cell-specific protein. Nevertheless, its product, 2,3-DPG, is detected in minute amounts in nonerythroid cells, where it acts as a cofactor of the glycolytic enzyme phosphoglycerate mutase (23). Until recently, it was assumed that phosphoglycerate mutase could account for the production of 2,3-DPG because of its low diphosphoglycerate mutase activity. Nevertheless, the amount of 2,3-DPG found in nonerythroid tissues is too high to be explained by this low activity (23). According to our results, the trace amounts of BPGM mRNA detected in adult liver indicate that a low level of the BPGM protein could be synthesized in nonerythroid tissues.

As a first step toward understanding BPGM gene expression, the 5'-flanking sequence was analyzed and compared with that of the globin genes, which are expressed as major components in the same erythroid cells. The region upstream of most genes transcribed by RNA polymerase II contains a TATA box and a CCAAT box found 25-30 bp and about 80 bp, respectively, upstream of the transcription start site (24, 25). As can be seen in Fig. 4, the BPGM gene contains the sequence TAGAAAA 30 bp upstream from the transcription start site, which in six out of seven positions conforms to the consensus sequence TATA(A/T)A(A/T) of the TATA box (25). A sequence similar to the CCAAT box is found at positions -75 and -178. The downstream putative CCAAT box of the BPGM promoter has the sequence TTCCAAT and contains two substitutions as compared to that of the human β-globin promoter, GGCCAAT. The sequence at -178, AACCAATCA, bears considerable homology to the consensus sequence of the CCAAT box, CCAATCA, also present in the α-globin promoter (26). Multiple copies of the CCAAT box element are also found in many other genes, in particular in the human β- and γ-globin genes (27). However, the exact contribution of these far upstream elements to the expression of such genes remains to be proved (27, 28).
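The position counts quoted in this paragraph can be reproduced directly. The sketch below assumes that the TATA box consensus reads TATA(A/T)A(A/T), that is, T, A, T, A, then A or T, then A, then A or T; the sequences are those given in the text, and the function names are hypothetical.

```python
# TATA box consensus, one entry per position; "AT" means A or T is allowed.
# Assumption: the consensus is TATA(A/T)A(A/T).
TATA_CONSENSUS = ["T", "A", "T", "A", "AT", "A", "AT"]


def matching_positions(seq: str, consensus: list[str]) -> int:
    """Count positions at which seq carries a base allowed by the consensus."""
    return sum(base in allowed for base, allowed in zip(seq, consensus))


def substitutions(a: str, b: str) -> int:
    """Count differing positions between two sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# BPGM element 30 bp upstream of the start site versus the TATA consensus.
tata_matches = matching_positions("TAGAAAA", TATA_CONSENSUS)

# Downstream BPGM CCAAT element versus the human beta-globin element.
ccaat_substitutions = substitutions("TTCCAAT", "GGCCAAT")

# The element at -178 contains the CCAATCA consensus as an exact substring.
contains_consensus = "CCAATCA" in "AACCAATCA"

print(tata_matches, ccaat_substitutions, contains_consensus)
```

The comparison gives six matching positions out of seven for TAGAAAA (the G at the third position is the only deviation) and two substitutions between TTCCAAT and GGCCAAT, both in the first two positions, so the CCAAT core itself is unchanged.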

In the 5'-flanking region of the BPGM gene, none of the characteristics of housekeeping gene promoters were found: there is no GC-rich sequence and no GC box (with a consensus sequence GGGCGG) (29). Therefore, the putative promoter signals described for the BPGM gene are similar to signals found in many gene promoters that regulate tissue-specific expression (30-32). These genes, like the human BPGM gene, encode proteins that are synthesized by a limited repertoire of cells.
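A search for GC boxes of this kind amounts to scanning the flanking sequence for the hexamer GGGCGG on both strands. The sketch below assumes that a match in either orientation counts as a hit. The function names are hypothetical, and the two test strings are invented for illustration; neither is the BPGM 5'-flanking sequence, which is given in Fig. 4.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif_both_strands(seq, motif):
    """Return sorted (0-based start, strand) hits of `motif` in `seq`.

    A '-' hit means the reverse complement of the motif occurs on the
    given strand, i.e. the motif itself lies on the opposite strand.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA string."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Invented AT-rich toy sequence (NOT the BPGM promoter): no GC box expected.
toy_at_rich = "ATTAGAAAATTCCAATTTAACCAATCATTT"
# Invented positive control carrying GGGCGG once on each strand.
toy_gc_box = "AAGGGCGGTTCCGCCCAA"

print(find_motif_both_strands(toy_at_rich, "GGGCGG"))  # []
print(find_motif_both_strands(toy_gc_box, "GGGCGG"))   # [(2, '+'), (10, '-')]
print(gc_fraction(toy_at_rich) < gc_fraction(toy_gc_box))  # True
```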

Comparison of the human BPGM gene with other genes expressed in the same repertoire of cells could aid in defining an "erythroid promoter." Such a promoter was studied for the α-, β-, and γ-globin genes (27) and for the human porphobilinogen deaminase gene (33); the promoter of the latter appears to be very similar to that of the β-globin gene. They contain, in addition to the classical TATA and CCAAT boxes, a third element 5-10 bases further upstream, called the CACA box, with the consensus sequence GCCACACCCT. This element is duplicated in the β-globin gene promoter, partially duplicated in the human porphobilinogen deaminase promoter, and occurs once in the γ promoter. The human BPGM promoter does not contain a CACA box, nor does it contain a GC box sequence, although such an element is found nine times in the human α1- and α2-globin genes. Thus, at present, comparisons of these promoters reveal certain similarities but also significant differences, making it difficult to define an erythroid-type promoter. It is noteworthy that the control of the human β-globin gene is not restricted to the promoter and that elements located farther in the 5'- and the 3'-flanking regions confer the erythroid cell specificity and the high level of expression of the β-globin gene (34). Experiments designed to test the functional contribution of putative cis-acting sequences are necessary to prove their role in the regulation of the transcription of the BPGM gene and the cell specificity of its promoter.

A comparison of the amino acid sequences of human BPGM and yeast phosphoglycerate mutase indicates a high degree of similarity (49%), suggesting a common origin and tertiary structure for both enzymes (8, 35). Such a structural homology is also found with the amino acid sequences of human muscle phosphoglycerate mutase (36) and brain-type phosphoglycerate mutase (37). X-ray crystallographic studies of the yeast phosphoglycerate mutase (38-41) have shown that the three-dimensional structure of the yeast enzyme subunit is composed of three folding lobes. Among them, lobe C is of particular interest because of the location of the second intron in the BPGM gene. Lobe C contains two subdomains, and it is relevant to note that the position of intron 2 of BPGM, as indicated in Fig. 4, can be localized between these two subdomains, if one assumes that the subunit spatial structure of the human BPGM can be compared to that of the yeast phosphoglycerate mutase. In addition, as reported by Campbell et al. (39), a detailed study of the folding of the first 185 amino acid residues of yeast phosphoglycerate mutase reveals a great similarity with the folding of the first 150 residues of lactate dehydrogenase, and it is thus possible that the mutase fold has evolved from a lactate dehydrogenase-like protein. Consequently, it can be assumed that during evolution an amino acid segment was added to a precursor protein of the yeast phosphoglycerate mutase and human BPGM in order to gain a particular function (e.g., the phosphoglycerate mutase function). In this way, intron 2 of BPGM, which separates this last subdomain from the others, could be the vestige of this phenomenon.

Acknowledgments-We are grateful to Drs. C. Y. Lau (San Francisco, CA) and M. Goossens (Créteil, France) for providing us with the cosmid human genomic library; to Dr. A. Kahn (Paris) for the phage genomic library; to Dr. A. Dubart for helpful discussions; to Dr. D. M. Ojcius for his help during the revision of the manuscript; and to A. M. Dulac for the preparation of the manuscript and figures. We are especially indebted to Dr. P. H. Romeo for valuable advice. Sequence data analysis was performed with the help of the French Ministère de la Recherche et des Enseignements Supérieurs.

REFERENCES

1. Rosa, R., Gaillardon, J., and Rosa, J. (1973) Biochem. Biophys. Res. Commun. 51, 536-542
2. Sasaki, R., Ikura, K., Sugimoto, E., and Chiba, H. (1975) Eur. J. Biochem. 50, 581-593
3. Rose, Z. B., and Dube, S. (1976) Arch. Biochem. Biophys. 177, 284-292
4. Bunn, H. F., and Forget, B. G. (1986) Hemoglobin: Molecular, Genetic and Clinical Aspects, W. B. Saunders Co., Philadelphia
5. Narita, H., Ikura, K., Yanagawa, S., Sasaki, R., Chiba, H., Saimyoji, H., and Kumagai, N. (1980) J. Biol. Chem. 255, 5230-5235
6. Narita, H., Yanagawa, S., Sasaki, R., and Chiba, H. (1981) Biochem. Biophys. Res. Commun. 103, 90-96
7. Dubart, A., Romeo, P. H., Tsapis, A., Goossens, M., Rosa, R., and Rosa, J. (1984) Biochem. Biophys. Res. Commun. 120, 441-447
8. Joulin, V., Peduzzi, J., Romeo, P. H., Rosa, R., Valentin, C., Dubart, A., Lapeyre, B., Blouquit, Y., Garel, M. C., Goossens, M., Rosa, J., and Cohen-Solal, M. (1986) EMBO J. 5, 2275-2283
9. Barichard, F., Joulin, V., Henry, I., Garel, M. C., Rosa, R., Valentin, C., Cohen-Solal, M., and Junien, C. (1987) Hum. Genet. 77, 283-285
10. Itoh, N., Nose, K., and Okamoto, H. (1979) Eur. J. Biochem. 97, 1-9
11. Ares, M. J., and Howell, S. H. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 5577-5581
12. Blin, N., and Stafford, D. W. (1976) Nucleic Acids Res. 3, 2303-2308
13. Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
14. Feinberg, A. P., and Vogelstein, B. (1984) Anal. Biochem. 137, 266-267
15. Godson, G. N., and Vapnek, D. (1973) Biochim. Biophys. Acta 299, 516-520
16. Benton, W. D., and Davis, R. W. (1977) Science 196, 180-182
17. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
18. Dale, R. M. K., McClure, B. A. M., and Houchins, J. P. (1985) Plasmid 13, 31-40
19. Staden, R. (1977) Nucleic Acids Res. 4, 4037-4051
20. Staden, R. (1980) Nucleic Acids Res. 8, 3673-3694
21. Mount, S. M. (1982) Nucleic Acids Res. 10, 459-472
22. McLauchlan, J., Gaffney, D., Whitton, J. L., and Clements, J. B. (1986) Nucleic Acids Res. 13, 1347-1368
23. Chiba, H., and Sasaki, R. (1978) Curr. Topics Cell. Regul. 14, 75-116
24. Breathnach, R., and Chambon, P. (1981) Annu. Rev. Biochem. 50, 349-383
25. Benoist, C., O'Hare, K., Breathnach, R., and Chambon, P. (1980) Nucleic Acids Res. 8, 127-142
26. Chodosh, L. A., Baldwin, A. S., Carthew, R. W., and Sharp, P. A. (1988) Cell 53, 11-24
27. Anagnou, N. P., Karlsson, S., Moulton, A. D., Keller, G., and Nienhuis, A. W. (1986) EMBO J. 5, 121-126
28. Jones, K. A., Kadonaga, J. T., Rosenfeld, P. J., Kelly, T. J., and Tjian, R. (1987) Cell 48, 79-89
29. Dynan, W. S., and Tjian, R. (1985) Nature 316, 774-778
30. Collins, C. J., Underdahl, J. P., Levene, R. B., Ravera, C. P., Morin, M. J., Dombalagian, M. J., Ricca, G., Livingston, D. M., and Lynch, D. C. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 4393-4397
31. Venta, P. J., Montgomery, J. C., Hewett-Emmett, D., Wiebauer, K., and Tashian, R. E. (1985) J. Biol. Chem. 260, 12130-12135
32. Ott, M.-O., Sperling, L., Herbomel, P., Yaniv, M., and Weiss, M. C. (1984) EMBO J. 3, 2505-2510
33. Chrétien, S., Dubart, A., Beaupain, D., Raich, N., Grandchamp, B., Rosa, J., Goossens, M., and Romeo, P. H. (1987) Proc. Natl. Acad. Sci. U. S. A. 85, 6-10
34. Grosveld, F., Blom van Assendelft, G., Greaves, D. R., and Kollias, G. (1987) Cell 51, 975-985
35. Fothergill-Gilmore, L. A. (1986) Trends Biochem. Sci. 11, 47-51
36. Shanske, S., Sakoda, S., Hermodson, M. A., Di Mauro, S., and Schon, E. A. (1987) J. Biol. Chem. 262, 14612-14617
37. Blouquit, Y., Calvin, M. C., Rosa, R., Promé, D., Promé, J. C., Pratbernou, F., Cohen-Solal, M., and Rosa, J. (1988) J. Biol. Chem. 263, in press
38. Campbell, J. W., Hodgson, G. I., and Watson, H. C. (1972) Nature New Biol. 240, 137-139
39. Campbell, J. W., Watson, H. C., and Hodgson, G. I. (1974) Nature 250, 301-303
40. Winn, S. I., Watson, H. C., Harkins, R. N., and Fothergill, L. A. (1981) Philos. Trans. R. Soc. Lond. B Biol. Sci. 293, 121-130
41. Amato, S. V., Rose, Z. B., and Liebman, M. N. (1984) Biochem. Biophys. Res. Commun. 121, 826-833