
Guanine-vacancy–bearing G-quadruplexes responsive to guanine derivatives

Xin-min Li^a, Ke-wei Zheng^a,1, Jia-yu Zhang^a, Hong-he Liu^a, Yi-de He^a, Bi-feng Yuan^b, Yu-hua Hao^a, and Zheng Tan^a,1

^a State Key Laboratory of Membrane Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, People’s Republic of China; and ^b Key Laboratory of Analytical Chemistry for Biology and Medicine (Ministry of Education), Department of Chemistry, Wuhan University, Wuhan 430072, People’s Republic of China

Edited by Philip C. Hanawalt, Stanford University, Stanford, CA, and approved October 13, 2015 (received for review August 26, 2015)

G-quadruplex structures formed by guanine-rich nucleic acids are implicated in essential physiological and pathological processes and nanodevices. G-quadruplexes are normally composed of four Gn (n ≥ 3) tracts assembled into a core of multiple stacked G-quartet layers. By dimethyl sulfate footprinting, circular dichroism spectroscopy, thermal melting, and photo-cross-linking, here we describe a unique type of intramolecular G-quadruplex that forms with one G2 and three G3 tracts and bears a guanine vacancy (G-vacancy) in one of the G-quartet layers. The G-vacancy can be filled up by a guanine base from GTP or GMP to complete an intact G-quartet by Hoogsteen hydrogen bonding, resulting in significant G-quadruplex stabilization that can effectively alter DNA replication in vitro at physiological concentration of GTP and Mg2+. A bioinformatic survey shows motifs of such G-quadruplexes are evolutionally selected in genes with unique distribution pattern in both eukaryotic and prokaryotic organisms, implying such G-vacancy–bearing G-quadruplexes are present and play a role in gene regulation. Because guanine derivatives are natural metabolites in cells, the formation of such G-quadruplexes and guanine fill-in (G-fill-in) may grant an environment-responsive regulation in cellular processes. Our findings thus not only expand the sequence definition of G-quadruplex formation, but more importantly, reveal a structural and functional property not seen in the standard canonical G-quadruplexes.

G-quadruplex | guanine-responsive | G-vacancy | nucleic acids

G-quadruplexes are four-stranded structures formed in guanine-rich nucleic acids (1–3). Canonical G-quadruplexes are composed of four tracts of consecutive guanines connected by three loops. The guanines in the guanine tracts (G tracts) are packed in a core unit (Fig. 1A) of a stack of multiple G-quartet layers, each with four guanine bases connected by eight Hoogsteen hydrogen bonds (Fig. 1B). G-quadruplex–forming sequences are not randomly distributed in the mammalian genomes but concentrated at physiologically relevant positions (4): for instance, promoters, telomeres, and immunoglobulin switch regions. These facts suggest G-quadruplex structures have implications in physiological processes. Indeed, experimental investigations have demonstrated the physiological function of G-quadruplexes in many aspects (5–8).

Studies on G-quadruplexes have mostly focused on sequences described by a consensus of G_{≥3}(N_{1–7}G_{≥3})_{≥3}, which can potentially form G-quadruplexes of three or more G-quartet layers with three loops of one to seven nucleotides (Fig. 1A) (9, 10). In recent years, the definition describing the capability of G-quadruplex formation has been broadened. Sequences with a loop up to 11 or 15 nucleotides were found capable of forming stable G-quadruplexes when the other two loops are sufficiently short (11, 12). The continuity of guanines in G tracts was also relaxed by the finding of G-quadruplexes with broken (13, 14) or bulged (15, 16) G tracts. Besides these intramolecular G-quadruplexes in single-stranded DNA (ssDNA), we recently found a hybrid type of G-quadruplexes involving G tracts from both DNA and RNA transcript can form during transcription in double-stranded DNA (dsDNA) (16–19). In this case, a G-quadruplex may form with as few as two G tracts defined by G_{≥3}(N_{1–7}G_{≥3})_{≥1} on the nontemplate DNA strand instead of four. Although the DNA:RNA hybrid G-quadruplexes of three or more G-quartets are very stable, less stable hybrid G-quadruplexes of two G-quartets can also form in transcription (19). Moreover, our study showed a DNA:RNA hybrid G-quadruplex could form in transcribed mitochondrial DNA in competition with a bulge-bearing G-quadruplex, which may participate in priming the initiation of DNA replication (16). This example demonstrates that the structural polymorphism of G-quadruplexes has implications in physiological processes.

In this work, we describe a unique type of G-quadruplex bearing a guanine vacancy (G-vacancy) in a G-quartet layer. Such structures can form with one G2 and three G3 tracts in ssDNA and in transcribed dsDNA as well. By accepting a guanine base from guanine derivatives, such as GMP or GTP in solution, the G-vacancy is filled up to form an intact core unit, resulting in an enhanced thermal stability of the G-quadruplexes in a concentration- and charge-dependent manner. At physiological concentration of GTP and Mg2+, the stability enhancement can significantly affect in vitro DNA replication. In supporting the formation and functional role of the G-vacancy–bearing G-quadruplexes (GVBQ) in cells, we found that sequences with potential to form GVBQ are preferentially selected at the 5′ end of genes in both eukaryotic and prokaryotic organisms. Because guanine derivatives are natural metabolites in cells, we speculate the GVBQs may provide regulation in response to intracellular level of guanine derivatives, making them distinctive from the canonical G-quadruplex structures.
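The sequence definitions above (four G3 tracts for a canonical three-layer core; one G2 and three G3 tracts for a GVBQ) lend themselves to a simple pattern search. The following is a minimal sketch and not the survey method used in this work. It assumes Python's standard `re` module, hypothetical helper names (`tract_pattern`, `classify`), toy sequences rather than real gene sequences, and a loop definition that is stricter than the consensus in the text: loops are 1–7 nt with a non-G base at each end, so that every G tract has an exact length.

```python
import re

# Loop of 1-7 nt that starts and ends with a non-G base.
# Assumption for illustration: this keeps each G tract at an exact length;
# the consensus in the text allows any nucleotide in the loops.
LOOP = r"[ACT](?:[ACGT]{0,5}[ACT])?"


def tract_pattern(sizes):
    """Compile a pattern for four G tracts of the given sizes joined by loops."""
    tracts = ["G" * n for n in sizes]
    # Lookarounds stop the first and last tract from being part of a longer G run.
    return re.compile(r"(?<!G)" + LOOP.join(tracts) + r"(?!G)")


# Four G3 tracts: a canonical three-layer core.
CANONICAL_3333 = tract_pattern((3, 3, 3, 3))

# One G2 and three G3 tracts, with the G2 tract at each of the four positions.
GVBQ_PATTERNS = {
    "-".join(f"G{n}" for n in sizes): tract_pattern(sizes)
    for sizes in [(2, 3, 3, 3), (3, 2, 3, 3), (3, 3, 2, 3), (3, 3, 3, 2)]
}


def classify(seq):
    """Return the names of the G2/3G3 arrangements found in seq."""
    seq = seq.upper()
    return [name for name, pat in GVBQ_PATTERNS.items() if pat.search(seq)]


if __name__ == "__main__":
    # Toy sequences, not taken from any gene discussed in the text.
    print(classify("TTGGGAGGGTGGGAGGTT"))
    print(classify("TTGGACGGGTGGGAGGGTT"))
    print(bool(CANONICAL_3333.search("TTGGGAGGGTGGGAGGGTT")))
```

Anchoring each tract with non-G loop ends avoids counting a G3 tract as a G2 tract plus a loop base, at the cost of missing motifs whose loops begin or end with guanine.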

Significance

Guanine-rich nucleic acids fold into a four-stranded structure named G-quadruplex that has implications in essential cellular processes, pharmaceutical applications, and nanodevices. We found a unique type of G-quadruplex that contains a G-vacancy and is stabilized by guanine derivatives, such as GTP at physiological concentration, through fill-in of a guanine base at the G-vacancy. In response to changes in the concentration of guanine derivatives, this type of G-quadruplex is able to manipulate the tracking activity of protein on DNA, as exemplified by a DNA polymerase-catalyzed DNA synthesis. Because guanine derivatives are natural metabolites in cells, such G-quadruplexes may potentially play an environment-responsive regulatory role in cellular processes, a functional property not found in the canonical G-quadruplexes.

Author contributions: K.-w.Z. and Z.T. designed research; X.-m.L., K.-w.Z., J.-y.Z., H.-h.L., Y.-d.H., and B.-f.Y. performed research; X.-m.L., Y.-h.H., and Z.T. analyzed data; and Z.T. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Freely available online through the PNAS open access option.

^1 To whom correspondence may be addressed. Email: [email protected] or [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1516925112/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1516925112 PNAS | November 24, 2015 | vol. 112 | no. 47 | 14581–14586

BIOCHEMISTRY


Results

G-Quadruplex Formation in G2/3G3 ssDNA and G-Quartet Completion by G Fill-In. Previous studies in our laboratory suggested that GMP or GTP might affect the structure of the G-quadruplex. To perform a systematic investigation, we first studied three ssDNAs containing a native (MYOG-3332) or modified G-core (MYOG-2332 and MYOG-3333) sequence from the human MYOG (Myogenin) gene in 50 mM K+ solution containing PEG 200. The three MYOG DNAs effectively formed intramolecular G-quadruplexes as judged from native gel electrophoresis (Fig. S1). Circular dichroism (CD) spectroscopy showed characteristic positive peaks at 295 and 265 nm for MYOG-2332, suggesting that it might form an antiparallel G-quadruplex, whereas the other two DNAs might form a parallel G-quadruplex according to the positive peak at 265 nm and negative peak at 245 nm (Fig. 2A).

The effect of GMP on G-quadruplex formation was analyzed by dimethyl sulfate (DMS) footprinting (20) (Fig. 2 B and C), in which guanine residues in a G-quadruplex were protected from methylation and subsequent cleavage by the Hoogsteen hydrogen bonds at the N7s (Fig. 1B). DNAs in Li+ solution without PEG were used as structure-less references because PEG can promote G-quadruplex formation under salt-deficient conditions (21). It was anticipated that the Gs in a G tract would be better protected in K+ than in Li+ solution. In principle, the MYOG-2332, with a G2-G3-G3-G2 arrangement of G tracts, is able to form a G-quadruplex of two G-quartet layers. In agreement with this, greater protection to two Gs in each G tract was observed in K+ than in Li+ solution, with the exception that the first G in the G3 tract immediately downstream of the first G2 tract from the 5′ end was heavily cleaved in K+ solution with a magnitude greater than that in the Li+ solution (Fig. 2C, Top, red arrowhead). Previous studies showed that such hyper-cleavages could occur at the Gs in the terminal G-quartet of a G-quadruplex, and the reason for that was not clear (22). Addition of GMP to the DNA did not alter the cleavage (blue vs. green peak).

For the MYOG-3332, which had a G3-G3-G3-G2 arrangement, the two G3 tracts in the middle of the sequence were well protected in K+ solution without GMP (Fig. 2C, Middle, green curve). Like the MYOG-2332, hyper-cleavage was also observed at the last G in the first G3 tract from the 5′ end (Fig. 2C, Middle, red arrowhead). Given the parallel folding topology of this DNA suggested by the CD spectrum (Fig. 2A), a preferable structure was deduced to fit this particular protection pattern (Fig. 2D, left scheme). This structure consisted of two intact G-quartets and one G-vacancy–bearing G-quartet, or G-triad, in which two guanine residues would be protected by Hoogsteen hydrogen bonds and one would be exposed to cleavage (Fig. 1C). The hyper-cleaved G became protected once GMP was added (blue vs. green peak), suggesting that the exposed N7 formed a Hoogsteen hydrogen bond. This fact implied that the G-vacancy was filled up with a guanine from GMP, resulting in an intact G-quartet to protect the G in the G-triad (Fig. 2D, right scheme). Such a G fill-in made the DMS profile of MYOG-3332 with GMP resemble that of the MYOG-3333. In the MYOG-3333, a G-vacancy was not present or, in other words, was filled up by an endogenous guanine (Fig. 2E); therefore, no hyper-cleavage and corresponding protection were seen in the absence or presence of GMP (Fig. 2C, Bottom). We also analyzed another set of DNAs derived from the HIF1α (hypoxia inducible factor 1, alpha subunit) gene (Fig. 3) that had a reversed G-tract arrangement of G2-G3-G3-G3 and different loops. Similar results were obtained with respect to the CD spectrum, hyper-cleavage, and its protection by G fill-in. In both sets of DNA, the protection mediated by G fill-in was G-vacancy specific because it was not seen with the 3-3-3-3 G tract arrangement (Figs. 2C and 3C, Bottom). The protection was also G-quadruplex specific because protection was not seen for the orphan G (black arrowhead in Fig. 3B, lane 5, and 3C, Middle).

G Fill-In Requires N7 and Depends on Guanine Derivative Concentration and Charge. We further compared 7-deaza-GTP (dzGTP), GTP, and GMP for their ability to fill in the G-vacancy in the GVBQ of the MYOG-3332, which was assessed by the protection to the corresponding G residue prone to hyper-cleavage (Fig. 4A, red arrow) in DMS footprinting. With its N7 being replaced by a carbon, dzGTP is unable to form a Hoogsteen hydrogen bond; as a result, dzGTP brought little protection to the hyper-cleavage (Fig. 4B, lanes 1–4; 4C, Top, red arrow). With a N7, G fill-in was detected with both GTP and GMP as judged from the suppression of the hyper-cleavage (Fig. 4B, lanes 5–12, and 4C, Middle and Bottom, red arrow). The protection to the hyper-cleaved G with GTP/GMP and the requirement of the N7 in the GTP implied that the interaction of a guanine derivative with the GVBQ involved a formation of a Hoogsteen hydrogen bond, which could only be implemented by a G fill-in at the G-vacancy.

Similar protection to the hyper-cleavage and a requirement of N7 was also observed for the HIF1α-2333 DNA (Fig. 5). For both MYOG-3332 and HIF1α-2333, the protection was concentration dependent, with a higher concentration leading to greater protection (Figs. 4 and 5 and Fig. S2). The G fill-in had to overcome the repulsion between the negative charges in the DNA and GTP/GMP. In correlation with this, a better protection was seen with GMP than with GTP.

Fig. 1. Scheme of a stacked core unit (A) of a parallel G-quadruplex with three G-quartet layers. Each G-quartet (B) consists of four guanine bases (dashed polygon) connected by eight Hoogsteen hydrogen bonds (hashed bonds). The N7 in each guanine base is indicated in green. Removing one guanine from a G-quartet exposes the corresponding N7 (C, red circle).

Fig. 2. G-quadruplex formation in MYOG G-core ssDNAs detected by (A) CD spectroscopy and (B and C) DMS footprinting. (A) CD spectra of MYOG G-quadruplexes. (B) DNA cleavage fragments resolved by denaturing gel electrophoresis. (C) Digitization of the gel in B. (D) Scheme of G-quartet completion by G fill-in in the GVBQ of MYOG-3332. (E) Structure of MYOG-3333 G-quadruplex. Red arrowhead in B–D indicates hyper-cleaved guanine residue that was protected by a G fill-in with GMP in MYOG-3332.

G Fill-In Stabilizes GVBQs. G-quadruplexes of two G-quartets are much less stable than those of three G-quartets (19). We examined the thermal stability of the structures formed by the MYOG DNAs (Fig. 6A) by thermal melting monitored by FRET (23) in 50 mM K+ solution. Among them, the two G-quartet G-quadruplex formed by MYOG-2332 showed the lowest stability with a T1/2 of 51 °C. On the other hand, the G-quadruplex of three G-quartets formed by MYOG-3333 was so stable that the melting curve did not reach a top plateau and a T1/2 could not be determined. For this reason, we lowered the K+ to 1 mM and obtained an intact melting curve, which yielded a T1/2 of 78 °C. With an additional G-triad at one side of a two G-quartet G-quadruplex, the GVBQ of MYOG-3332 showed a stability (T1/2 = 68 °C) between that of the MYOG-2332 and MYOG-3333.

Completion of an intact G-quartet by G fill-in enhanced the stability of the MYOG-3332 GVBQ in a concentration-dependent manner (Fig. 6B). GTP and GMP at 5 mM enhanced the T1/2 of the GVBQ by 3 and 5 °C, respectively (Table S1). A smaller increment in the T1/2 for the GTP than for the GMP reflected a charge dependence, which was fully in agreement with the protection to the hyper-cleaved guanine in the DMS footprinting (Fig. 4). dzGTP failed to stabilize the MYOG-3332 GVBQ because of its inability to form the required Hoogsteen hydrogen bond in the G fill-in. Again, the G-vacancy dependence was also indicated by the fact that the G-quadruplex of MYOG-3333 was not stabilized by the two compounds. Similarly, the GVBQ formed by the HIF1α-2333 was also stabilized by GMP and GTP but not by dzGTP (Fig. S3A).
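The T1/2 values above follow the definition given in the Fig. 6 legend: the temperature at which the fluorescence reaches the midvalue between its minimum and maximum. A minimal sketch of that calculation is shown below. It is an illustration, not the analysis code used in this work; the function name `t_half`, the linear interpolation between neighboring data points, and the toy melting curve are all assumptions.

```python
def t_half(temps, signal):
    """Temperature at which the signal first reaches the midvalue between
    its minimum and maximum, using linear interpolation between points."""
    if len(temps) != len(signal) or len(temps) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mid = (min(signal) + max(signal)) / 2.0
    for i in range(len(signal) - 1):
        a, b = signal[i], signal[i + 1]
        # The midvalue lies between (or on) two neighboring readings.
        if a != b and (a - mid) * (b - mid) <= 0:
            frac = (mid - a) / (b - a)
            return temps[i] + frac * (temps[i + 1] - temps[i])
    raise ValueError("signal never crosses its midvalue")


if __name__ == "__main__":
    # Toy melting curve (hypothetical values, not data from this study).
    temps = [40.0, 50.0, 60.0, 70.0, 80.0]
    fluorescence = [0.0, 0.0, 0.25, 0.75, 1.0]
    print(t_half(temps, fluorescence))  # 65.0
```

The shift in T1/2 caused by a guanine derivative would then be the difference between the values obtained with and without the compound. Note that a curve lacking a top plateau, as described for MYOG-3333 in 50 mM K+, has no reliable maximum, so this calculation would not be meaningful for it.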

Confirmation of G Fill-In in GVBQ by Photo-Cross-Linking. The requirement of G-vacancy and Hoogsteen hydrogen bonding in our aforementioned results strongly supported a G fill-in in the G-triad-to-G-quartet conversion. To further confirm this, we synthesized a trifunctional compound sulfosuccinimidyl-2-[6-(biotinamido)-2-(p-azidobenzamido) hexanoamido]ethyl-1,3′-dithiopropionate (SBED)-GMP (Fig. 7A and Fig. S4). The molecule carried a guanine base that can fill in a G-vacancy, a phenyl azide group that can react with the primary amine in adenine, guanine, and cytosine to covalently cross-link a DNA, and a biotin moiety that can bind a streptavidin. We incubated the three MYOG DNAs separately with the SBED-GMP and induced cross-linking by UV light. The DNAs were then resolved by denaturing electrophoresis. Cross-linking occurred in the MYOG-3332 as indicated by an extra band migrating behind the original DNA (Fig. 7B, lane 6). In contrast, little cross-linking was seen in the other two DNAs. Because the SBED-GMP had a biotin, the cross-linked MYOG-3332 could be further shifted by streptavidin in a native gel electrophoresis (Fig. 7C, lane 6). These results therefore confirmed the G fill-in in the GVBQ of MYOG-3332. Cross-linking and mobility shift were also observed with the HIF1α-2333 GVBQ (Fig. S5).

GVBQ Formation, G Fill-In, and Stability Enhancement in Other G2/3G3 Combinations. Our aforementioned results used DNA with a G3-G3-G3-G2 or G2-G3-G3-G3 G-tract arrangement. In general, the G2 tract can be placed at four different positions. To find out the generality of G fill-in, we tested three additional native sequences from human genes, namely, LRRC42 (leucine rich repeat containing 42), ABTB2 [ankyrin repeat and BTB (POZ) domain containing 2], and TSC22D3 (TSC22 domain family, member 3), which had an arrangement of G3-G3-G2-G3, G3-G2-G3-G3, and G2-G3-G3-G3, respectively (Fig. S1). Similar to the MYOG-3332, the three DNAs all featured a positive peak at 265 nm and a negative peak at 245 nm in their CD spectrum, suggesting they formed parallel G-quadruplexes (Fig. S6A).

In DMS footprinting, the G tracts in all DNAs were better protected in K+ than in Li+ solution (Fig. S6 B–G), except the hyper-cleavage in the G3 immediately downstream of the G2 tract as we saw in the MYOG-3332. Unlike the MYOG-3332, which showed only one hyper-cleavage peak, the LRRC42 and ABTB2 displayed two hyper-cleavage peaks at both the 5′ and 3′ G in the G3 tract, which were all prevented by GMP (Fig. S6 B, C, E, and F, red arrowhead). This particular cleavage/protection pattern could be explained by an alternative G fill-in to the two ends of the G3 tract (schemes on the right side of the panels). In TSC22D3, the cleavage/protection more or less resembled that in the MYOG-3332 in that the Gs in the G tracts were all protected except the first G in the G3 tract immediately downstream of the G2 tract (Fig. S6 D and G, red arrowhead). The hyper-cleavage seems to be a common feature in all of the DNAs, implying that the unprotected guanine in the G-triad (Fig. 1C, red circle) was highly exposed to chemical attack in the DMS footprinting and rescued by the G fill-in. Again, the orphan Gs were not protected in the three DNAs (Fig. S6 B–G, black arrowhead), further confirming that the G fill-in was specific to G-vacancy.

The GVBQs formed in these DNAs were also stabilized by a G fill-in with GTP and GMP (Fig. S3 B–D). At 5 mM, GMP led to an increment in T1/2 of 7, 12, and 10 °C, respectively, for LRRC42, ABTB2, and TSC22D3, whereas GTP was less effective. Similarly, dzGTP failed to stabilize any GVBQ, indicating a requirement of Hoogsteen hydrogen bonding. For comparison, the melting data were summarized in Table S1. Again, the G fill-in for the three DNAs was also confirmed by cross-linking with SBED-GMP and subsequent mobility shift with streptavidin (Fig. S7).

Fig. 3. G-quadruplex formation in HIF1α G-core ssDNAs detected by (A) CD spectroscopy and (B and C) DMS footprinting. (A) CD spectra of HIF1α-2333 G-quadruplexes. (B) DNA cleavage fragments resolved by denaturing gel electrophoresis. (C) Digitization of the gel in B. (D) Scheme of G-quartet completion by G fill-in in the GVBQ of HIF1α-2333. (E) Structure of HIF1α-3333 G-quadruplex. Red arrowhead in B–D indicates hyper-cleaved guanine residue that was protected by a G fill-in with GMP in HIF1α-2333. Black arrowhead indicates the orphan guanine that was not assembled in the G-quadruplex and hence not protected by GMP from cleavage in the HIF1α-2333 (blue vs. green peak).

Fig. 4. Protection of the hyper-cleavage-prone guanine in the GVBQ of MYOG-3332 by G fill-in with different guanine derivatives in DMS footprinting. (A) Structure of the GVBQ of MYOG-3332 with a G fill-in. (B) DNA cleavage fragments resolved by denaturing gel electrophoresis. (C) Digitization of the gel in B. Red arrowhead indicates hyper-cleaved guanine residue that was protected by G fill-in with a guanine derivative.

GVBQ Formation and G Fill-In in G2/3G3 ssDNA in Solution Without PEG. Our aforementioned experiments were carried out in a K+ solution containing PEG. PEG stabilizes G-quadruplexes (21, 24), thus benefiting their detection in vitro. We also assessed the formation of GVBQs and G fill-in in PEG-free K+ solution. CD spectroscopy showed that MYOG-3332, LRRC42, ABTB2, and TSC22D3 ssDNA might all form parallel G-quadruplexes (Fig. S8A). In DMS footprinting, the four DNAs all displayed hyper-cleavage and protection patterns (Fig. S8 B–F) similar to those obtained in the K+ solution containing PEG (Figs. 2C and 3C and Fig. S6 E–G). G fill-in was also detected by photo-cross-linking and mobility shift with SBED-GMP (Fig. S8G). All these results showed that the formation of GVBQ and G fill-in also occurred in the absence of PEG.

G-Quadruplex Formation in Transcribed G2/3G3 dsDNA. Here we show that GVBQ can also form in dsDNA in transcription. Three dsDNAs containing the MYOG-3332, MYOG-2332, and MYOG-3333 G-core were transcribed in the presence of dzGTP, GTP, and GMP. G-quadruplex formation was analyzed by DMS footprinting. For the MYOG-2332 (Fig. 8A, lanes 1–4), transcriptions in the presence of different guanine derivatives all led to a protection to two Gs in each G tract (Fig. 8B, Top), indicating that G-quadruplexes of two G-quartets formed. For the MYOG-3332 (Fig. 8A, lanes 5–8), transcription with dzGTP resulted in a protection to only two Gs in each G3 tract (Fig. 8B, Middle). Because dzGTP lacked the N7 and thus prevented the RNA transcript from forming G-quadruplex, the protection pattern indicated that an intramolecular DNA G-quadruplex of two G-quartets formed (Fig. 8C, left scheme). However, when we supplied GMP with dzGTP or conducted the transcription with GTP and GMP, the previously cleaved Gs in each G3 tract became protected. The magnitude of protection in these two cases was similar to but less than that to the G3 tracts in the MYOG-3333 (Fig. 8B, Bottom), which was capable of forming a G-quadruplex of three intact G-quartets.

The protection of all of the Gs in the G3 tracts in the MYOG-3332 brought by GMP/GTP suggested a formation of new forms of G-quadruplexes in which all of the Gs in all of the G3 tracts participated in G-quadruplex assembly. In principle, there were two possibilities. First, the new G-quadruplexes might involve an alternative alignment of the G2 to the 5′ or 3′ side of the G3 tracts (Fig. 8C). However, it is difficult to imagine how GMP/GTP in solution would drive such a change in the alignment. The fact that the protection was seen with GMP/GTP and not with dzGTP implied formation of a Hoogsteen hydrogen bond at the N7 was required for the protection. Given this, a plausible interpretation would be a formation of GVBQ of three G-quartets with a G-vacancy in a terminal G-quartet (Fig. 8D, center scheme). The guanine base from GMP or GTP then filled up the G-vacancy to complete a G-quartet (Fig. 8D, right scheme) and thus protected all of the Gs in the G3 tracts from cleavage. dzGTP failed to do so because it lacked the N7 to form the required Hoogsteen hydrogen bond in the G-quartet.

Effect of GVBQ and G Fill-In on in Vitro DNA Replication. We exam-ined the effect of GVBQ stabilization by GMP/GTP on an in vitroDNA replication in which primer extension was catalyzed by DNApolymerase through a GVBQ-containing DNA template. TheGVBQ caused premature termination (PT) of replication, whichwas significantly enhanced by both GMP and GTP (Fig. 9 A andB). For MYOG-3332, the PT at the GVBQ increased by morethan threefold at 1 mM GMP/GTP. Because GMP/GTP at 1 mMincreased the T1/2 of the MYOG-3332 GVBQ by only 1–2 °C (Fig.6B), we reasoned that the interaction between the GVBQ andGMP/GTP might be reinforced by the Mg2+ in the reaction buffer.Indeed, a more dramatic increase in the T1/2 of the GVBQ ofMYOG-3332 and TSC22D3 was observed in the presence ofphysiological concentration of 2 mM Mg2+ (25, 26) (Fig. 9C).Particularly, GTP became as effective as GMP in stabilizing theGVBQs. The effect of GMP/GTP was GVBQ specific becausethey had little effect on the random DNA without a GVBQ andMYOG-3332 and TSC22D3 when they were replicated in Li+

Fig. 5. Protection of the hyper-cleave-prone guanine in the GVBQ of HIF1α-2333 by G fill-in with different guanine derivatives in DMS footprinting. (A) Structure of the GVBQ of HIF1α-2333 with a G fill-in. (B) DNA cleavage fragments resolved by denaturing gel electrophoresis. (C) Digitization of the gel in B. Red arrowhead indicates the hyper-cleaved guanine residue that was protected by G fill-in with a guanine derivative.

Fig. 6. G fill-in stabilized the GVBQ of MYOG-3332 in FRET melting. (A) Melting of MYOG GVBQ in the absence of guanine derivatives. Curves were obtained in a solution containing 50 mM K+ (solid) or in 1 mM K+ plus 49 mM Li+ (dashed). T1/2 gives the temperature for the fluorescence to reach the midvalue between the minimum and maximum values. (B) Effect of guanine derivatives on the T1/2 of G-quadruplex from MYOG-2332, MYOG-2333, and MYOG-3333. Assay was carried out in a solution containing 1 mM K+ plus 49 mM Li+ for MYOG-3333 and 50 mM K+ for the others at various concentrations of guanine derivatives.
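The T1/2 definition in the Fig. 6 legend can be illustrated with a short calculation. The sketch below is not the authors' analysis code; it assumes a melting curve sampled at discrete temperatures and uses linear interpolation between the two points that bracket the midvalue. The function name and the data are hypothetical and chosen only for illustration.

```python
def half_transition_temperature(temps, fluorescence):
    """Temperature at which the signal first reaches the midvalue between
    its minimum and maximum (linear interpolation between samples)."""
    if len(temps) != len(fluorescence) or len(temps) < 2:
        raise ValueError("need two equal-length series with at least two points")
    mid = (min(fluorescence) + max(fluorescence)) / 2.0
    points = list(zip(temps, fluorescence))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 == mid:
            return float(t0)
        # A sign change means the midvalue lies strictly between f0 and f1.
        if (f0 - mid) * (f1 - mid) < 0:
            return t0 + (mid - f0) * (t1 - t0) / (f1 - f0)
    if points[-1][1] == mid:
        return float(points[-1][0])
    raise ValueError("curve never reaches the midvalue")


# Made-up normalized FRET melting data (temperature in degrees C).
temps = [20, 30, 40, 50, 60, 70, 80]
signal = [0.0, 0.0, 0.1, 0.3, 0.8, 1.0, 1.0]
t_half = half_transition_temperature(temps, signal)
```

With these made-up values the midvalue 0.5 falls between the 50 °C and 60 °C samples. A stabilizing ligand would shift the whole curve, and therefore T1/2, toward higher temperature.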

14584 | www.pnas.org/cgi/doi/10.1073/pnas.1516925112 Li et al.


solution that does not stabilize G-quadruplex (Fig. 9A). We attempted to examine the effect of GVBQ on in vitro transcription by T7 RNA polymerase but did not succeed because efficient RNA synthesis requires starting with a guanine residue such that GMP or GTP had to be used in all samples.

Occurrence of GVBQ-Forming Motifs in Eukaryotic and Prokaryotic Genes. To seek information regarding the potential formation and function of GVBQ in cells, we conducted bioinformatic surveys on the distribution of GVBQ-forming motifs in both eukaryotic and prokaryotic genes. The results show that such motifs are enriched near the transcription start site (TSS) in mammalian genes (Fig. 10 A and B), suggesting that GVBQ may form and play a functional role in cells. A survey of the whole human genome (hg19) found a total of ∼220,000 potential GVBQ-forming motifs with a G-tract combination of G2/3G3–4, which is on the same order as the number of canonical G-quadruplex-forming motifs (360,000) found in this survey.

Although a survey on a single lower species, such as Saccharomyces cerevisiae, did not result in a clear occurrence pattern (27) because of the limited number of genes, we were able to obtain one when 53 fungi species available in the Ensembl genome database, including S. cerevisiae, were analyzed. The results showed a positive selection of GVBQ-forming motifs on the template DNA strand and a small negative selection on the nontemplate DNA strand around the TSS (Fig. 10C). Because TSS coordinates are not available for bacteria, we only surveyed within bacterial genes. The results from 4,222 chromosomes available in the National Center for Biotechnology Information (NCBI) database showed a strong positive selection of such motifs on the template DNA strand near the 5′ end of genes, but not on the nontemplate DNA strand (Fig. 10D). The distribution of GVBQ-forming motifs in all of the analyzed organisms closely resembled that of the canonical G-quadruplexes (Fig. 10 E–H), except at a lower frequency. All these results are supportive of the existence and functional role of GVBQs in the eukaryotic and prokaryotic kingdoms.
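As a rough illustration of what a G2/3G3–4 motif search might look like, the following sketch scans a sequence with a regular expression. It is not the authors' survey code (which is described in the SI). The assumptions here are hypothetical choices: exactly one tract is a G2 that is not flanked by another G, the other three tracts are G3–4, loops are any one to seven nucleotides, and every start position where the pattern matches is reported.

```python
import re

LOOP = "[ACGT]{1,7}"        # loop of one to seven nucleotides (assumption: any base)
G3_4 = "G{3,4}"             # tract of three or four guanines
G2 = "(?<!G)GG(?!G)"        # tract of exactly two guanines, not flanked by G


def build_gvbq_pattern():
    """Build one alternative per possible position of the G2 tract."""
    alternatives = []
    for g2_index in range(4):
        tracts = [G3_4] * 4
        tracts[g2_index] = G2
        alternatives.append(LOOP.join(tracts))
    # A lookahead wrapper lets overlapping candidates be reported by start position.
    return re.compile("(?=(" + "|".join(alternatives) + "))")


GVBQ_PATTERN = build_gvbq_pattern()


def find_gvbq_starts(sequence):
    """Return 0-based start positions of putative GVBQ-forming motifs."""
    return [m.start() for m in GVBQ_PATTERN.finditer(sequence.upper())]


# Made-up test sequences, not taken from the paper.
g2_first = find_gvbq_starts("TTGGTAGGGTAGGGTAGGGTT")    # G2 tract at the 5' side
g2_third = find_gvbq_starts("GGGTAGGGTAGGTAGGG")        # G2 tract in third position
canonical = find_gvbq_starts("TTGGGTAGGGTAGGGTAGGGTT")  # four G3 tracts, no G2
```

Requiring the G2 tract to be flanked by non-G bases keeps a canonical four-G3 sequence from being counted as a GVBQ motif. A real survey would also need to scan both strands and decide how to treat overlapping or nested matches.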

Discussion

In this work, we found a different type of intramolecular G-quadruplexes that can form in sequences not following the consensus of G≥3(N1–7G≥3)≥3. Using one G2 and three G3 tracts, the G-quadruplexes contained a G-triad layer or, in other words, an incomplete G-quartet with a G-vacancy at one corner. The G-vacancy can be filled up by a guanine base when it is available from solution to complete a G-quartet. The G fill-in was indicated by the requirement of N7 in the guanine base needed for the formation of a Hoogsteen hydrogen bond in a G-quartet, supported by protection against guanine hyper-cleavage and by G-quadruplex stabilization, and further confirmed by photo-cross-linking, all in a G-vacancy– and N7-dependent manner. This finding expands not only the possibility of G-quadruplex formation in guanine-rich nucleic acids, but also the structural diversity of G-quadruplexes.

Fig. 8. G-quadruplex formation in transcribed MYOG dsDNAs detected by DMS footprinting. (A) DNA cleavage fragments resolved by denaturing gel electrophoresis. (B) Digitization of the gel in A. (C) Possible alternative G2 tract alignment that could lead to G3 tracts protection. (D) G fill-in at the G-vacancy by GMP/GTP that could lead to G3 tracts protection. DNA was not transcribed (NT) or transcribed in the presence of the indicated guanine derivative(s). Red arrowhead in B–D indicates the guanine residue attacked in the absence and protected in the presence of GMP/GTP. Blue curves in B largely overlap with the green ones such that they are barely visible.

Fig. 9. Effect of G fill-in in GVBQ on in vitro DNA replication and stability in the presence of 2 mM Mg2+. (A) Inhibition of DNA primer extension by stabilization of GVBQ of MYOG-3332 and TSC22D3 with GMP/GTP, assayed by DNA-templated primer extension. Marker was made by the same extension reaction with a template without the G-core and its 3′ flanking sequence. (B) Ratio of premature termination (PT) over full-length (FL) replicon (±SD) in three independent experiments in K+ solution demonstrated in A. (C) Stabilization of the GVBQ of MYOG-3332 and TSC22D3 as a function of GMP/GTP concentration assayed by FRET melting.
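The PT/FL quantity in Fig. 9B is a ratio of band intensities summarized over replicates. A minimal sketch of that summary follows. It assumes band intensities have already been quantified from the gel and uses the sample standard deviation; the paper does not state which SD estimator was used, and the numbers below are invented for illustration.

```python
from statistics import mean, stdev


def pt_over_fl(pt_intensity, fl_intensity):
    """Ratio of premature-termination to full-length product intensity."""
    if fl_intensity <= 0:
        raise ValueError("full-length intensity must be positive")
    return pt_intensity / fl_intensity


def summarize_replicates(replicates):
    """Return (mean, sample SD) of PT/FL over (PT, FL) intensity pairs."""
    ratios = [pt_over_fl(pt, fl) for pt, fl in replicates]
    return mean(ratios), stdev(ratios)


# Hypothetical intensities from three independent experiments (arbitrary units).
no_ligand = [(10.0, 100.0), (20.0, 100.0), (30.0, 100.0)]
ratio_mean, ratio_sd = summarize_replicates(no_ligand)
```

Comparing the mean ratio with and without GMP/GTP gives the fold change in premature termination, such as the more than threefold increase reported for MYOG-3332.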

Fig. 7. Confirmation of G fill-in in the GVBQ of MYOG-3332 by photo-cross-linking and subsequent electrophoretic mobility shift. (A) Trifunctional SBED-GMP used for cross-linking. (B) Cross-linking between SBED-GMP and MYOG-3332 DNA detected by denaturing gel electrophoresis. (C) Mobility shift of cross-linked MYOG-3332 DNA by streptavidin (SA) detected by native gel electrophoresis. Schemes at the left side of gel indicate the structure of the corresponding DNA bands. Open triangle indicates cross-linked DNA.

Li et al. PNAS | November 24, 2015 | vol. 112 | no. 47 | 14585


One unique characteristic of the GVBQs distinctive from the canonical G-quadruplexes is their capability to interact with guanine derivatives. By a G fill-in of a G-vacancy, the G-quartet completion doubles the number of Hoogsteen hydrogen bonds from 4 to 8 in the G-quartet (Fig. 1C), resulting in a full stacking with the neighboring G-quartet. Therefore, the GVBQ is dramatically stabilized by GMP/GTP, and the stabilization is further enhanced by a physiological concentration of Mg2+. In the presence of 2 mM Mg2+, GTP is as effective as GMP in stabilizing GVBQs (Table S1) and affecting replication (Fig. 9). Guanine derivatives, such as GMP, GDP, and GTP, are natural metabolites in cells. As the most dominant species, free GTP is present at ∼0.5 mM in animal cells (28) and ∼5 mM in bacterial cells (29). Free Mg2+ content ranges from 0.15 to 6 mM in animal cells (25) and 2–3 mM in bacterial cells (26). The stability and the replication of GVBQ-containing DNA could be effectively manipulated by submillimolar and millimolar GTP (Fig. 9). This fact implies that the intracellular GTP level in both animal and bacterial cells is able to influence the stability of GVBQs and likely the related cellular processes.

The in vitro formation of GVBQ and G fill-in with a physiological concentration of GTP, as well as the enrichment of GVBQ-forming motifs near the 5′ end of genes, support the formation of GVBQ in cells. The ability of the GVBQs to respond to changes in the concentration of guanine derivatives suggests a possibility for the GVBQs to provide environment-responsive regulation in cellular processes. On the other hand, the G fill-in may also constitute a unique pathway for drug targeting toward such regulations. Because their number in the human genome is comparable to that of the canonical G-quadruplexes, the GVBQs represent a unique category of G-quadruplex structures distinctive from the canonical G-quadruplexes.

In our current study, only G2/3G3 DNAs were used. Sequences with potential to form GVBQs may go beyond this rule. For instance, a G2 tract may combine with three G tracts with three to four consecutive guanines (G2/3G3–4). GVBQs may also form with a G tract format of Gn-1/3Gn (n ≥ 3) with n − 1 intact G-quartet layers. We also expect that a GVBQ may accommodate more than one G-vacancy when it has a sufficient number of intact G-quartets. These possibilities may further diversify the formation of GVBQs and their physiological implications.

Materials and Methods

Details are in SI Materials and Methods, including chemicals and oligonucleotides, preparation of DNAs, in vitro transcription, DMS footprinting, CD spectroscopy, photo-cross-linking, EMSA, thermal melting, DNA polymerase stop assay, and computational survey.

ACKNOWLEDGMENTS. This work was supported by Ministry of Science and Technology of China Grants 2013CB530802 and 2012CB720601 and National Science Foundation of China Grants 31470783 and 21432008.

1. Burge S, Parkinson GN, Hazel P, Todd AK, Neidle S (2006) Quadruplex DNA: Sequence, topology and structure. Nucleic Acids Res 34(19):5402–5415.

2. Balasubramanian S, Hurley LH, Neidle S (2011) Targeting G-quadruplexes in gene promoters: A novel anticancer strategy? Nat Rev Drug Discov 10(4):261–275.

3. Patel DJ, Phan AT, Kuryavyi V (2007) Human telomere, oncogenic promoter and 5′-UTR G-quadruplexes: Diverse higher order DNA and RNA targets for cancer therapeutics. Nucleic Acids Res 35(22):7429–7455.

4. Tarsounas M, Tijsterman M (2013) Genomes and G-quadruplexes: For better or for worse. J Mol Biol 425(23):4782–4789.

5. Cahoon LA, Seifert HS (2009) An alternative DNA structure is necessary for pilin antigenic variation in Neisseria gonorrhoeae. Science 325(5941):764–767.

6. Rodriguez R, et al. (2012) Small-molecule-induced DNA damage identifies alternative DNA structures in human genes. Nat Chem Biol 8(3):301–310.

7. Gray LT, Vallur AC, Eddy J, Maizels N (2014) G quadruplexes are genomewide targets of transcriptional helicases XPB and XPD. Nat Chem Biol 10(4):313–318.

8. Nguyen GH, et al. (2014) Regulation of gene expression by the BLM helicase correlates with the presence of G-quadruplex DNA motifs. Proc Natl Acad Sci USA 111(27):9905–9910.

9. Todd AK, Johnston M, Neidle S (2005) Highly prevalent putative quadruplex sequence motifs in human DNA. Nucleic Acids Res 33(9):2901–2907.

10. Huppert JL, Balasubramanian S (2005) Prevalence of quadruplexes in the human genome. Nucleic Acids Res 33(9):2908–2916.

11. Agrawal P, Lin C, Mathad RI, Carver M, Yang D (2014) The major G-quadruplex formed in the human BCL-2 proximal promoter adopts a parallel structure with a 13-nt loop in K+ solution. J Am Chem Soc 136(5):1750–1753.

12. Guédin A, Gros J, Alberti P, Mergny JL (2010) How long is too long? Effects of loop size on G-quadruplex stability. Nucleic Acids Res 38(21):7858–7868.

13. Phan AT, Kuryavyi V, Burge S, Neidle S, Patel DJ (2007) Structure of an unprecedented G-quadruplex scaffold in the human c-kit promoter. J Am Chem Soc 129(14):4386–4392.

14. Chen Y, et al. (2012) The major G-quadruplex formed in the human platelet-derived growth factor receptor β promoter adopts a novel broken-strand structure in K+ solution. J Am Chem Soc 134(32):13220–13223.

15. Mukundan VT, Phan AT (2013) Bulges in G-quadruplexes: Broadening the definition of G-quadruplex-forming sequences. J Am Chem Soc 135(13):5017–5028.

16. Zheng KW, et al. (2014) A competitive formation of DNA:RNA hybrid G-quadruplex is responsible to the mitochondrial transcription termination at the DNA replication priming site. Nucleic Acids Res 42(16):10832–10844.

17. Zheng KW, et al. (2013) Co-transcriptional formation of DNA:RNA hybrid G-quadruplex and potential function as constitutional cis element for transcription control. Nucleic Acids Res 41(10):5533–5541.

18. Wu RY, Zheng KW, Zhang JY, Hao YH, Tan Z (2015) Formation of DNA:RNA hybrid G-quadruplex in bacterial cells and its dominance over the intramolecular DNA G-quadruplex in mediating transcription termination. Angew Chem Int Ed Engl 54(8):2447–2451.

19. Xiao S, et al. (2014) Formation of DNA:RNA hybrid G-quadruplexes of two G-quartet layers in transcription: Expansion of the prevalence and diversity of G-quadruplexes in genomes. Angew Chem Int Ed Engl 53(48):13110–13114.

20. Sun D, Hurley LH (2010) Biochemical techniques for the characterization of G-quadruplex structures: EMSA, DMS footprinting, and DNA polymerase stop assay. Methods Mol Biol 608:65–79.

21. Kan ZY, et al. (2006) Molecular crowding induces telomere G-quadruplex formation under salt-deficient conditions and enhances its competition with duplex formation. Angew Chem Int Ed Engl 45(10):1629–1632.

22. Guo K, Gokhale V, Hurley LH, Sun D (2008) Intramolecularly folded G-quadruplex and i-motif structures in the proximal promoter of the vascular endothelial growth factor gene. Nucleic Acids Res 36(14):4598–4608.

23. De Cian A, et al. (2007) Fluorescence-based melting assays for studying quadruplex ligands. Methods 42(2):183–195.

24. Zheng KW, Chen Z, Hao YH, Tan Z (2010) Molecular crowding creates an essential environment for the formation of stable G-quadruplexes in long double-stranded DNA. Nucleic Acids Res 38(1):327–338.

25. Romani A, Scarpa A (1992) Regulation of cell magnesium. Arch Biochem Biophys 298(1):1–12.

26. Cayley S, Lewis BA, Guttman HJ, Record MT, Jr (1991) Characterization of the cytoplasm of Escherichia coli K-12 as a function of external osmolarity. Implications for protein-DNA interactions in vivo. J Mol Biol 222(2):281–300.

27. Xiao S, Zhang JY, Zheng KW, Hao YH, Tan Z (2013) Bioinformatic analysis reveals an evolutional selection for DNA:RNA hybrid G-quadruplex structures as putative transcription regulatory elements in warm-blooded animals. Nucleic Acids Res 41(22):10379–10390.

28. Traut TW (1994) Physiological concentrations of purines and pyrimidines. Mol Cell Biochem 140(1):1–22.

29. Bennett BD, et al. (2009) Absolute metabolite concentrations and implied enzyme active site occupancy in Escherichia coli. Nat Chem Biol 5(8):593–599.

30. Han H, Hurley LH, Salazar M (1999) A DNA polymerase stop assay for G-quadruplex-interactive compounds. Nucleic Acids Res 27(2):537–542.

Fig. 10. Distribution of potential GVBQ-forming motifs in comparison with the canonical G-quadruplex–forming sequences (GQ) in prokaryotic and eukaryotic genes. Motifs bearing one G2 and three G3–4 tracts, with loop sizes from one to seven nucleotides, are counted on both the nontemplate (red) and template (green) DNA strands. Survey on fungi and bacteria covered genes from 53 strains and 4,222 chromosomes, respectively, whose sequences were available in the Ensembl and NCBI databases. Frequency was normalized to the number of sequences and expressed as the number of occurrences in 100 sequences in a 100- (human, rat, and fungi) or 25-nt (bacteria) window.
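The normalization described in the Fig. 10 legend can be sketched as follows. This is an illustrative reading of the legend, not the authors' code: it assumes motif positions are given relative to a reference point such as the TSS, and that windows are fixed and non-overlapping, which the legend does not specify. The function name and the positions are hypothetical.

```python
from collections import Counter


def windowed_frequency(positions_per_sequence, window=100):
    """Occurrences per 100 sequences in each fixed window.

    positions_per_sequence: one list of motif positions (relative to the
    reference point, negative = upstream) for every surveyed sequence,
    including empty lists for sequences without any motif.
    Returns {window_start: occurrences per 100 sequences}.
    """
    n_sequences = len(positions_per_sequence)
    if n_sequences == 0:
        raise ValueError("no sequences supplied")
    counts = Counter()
    for positions in positions_per_sequence:
        for pos in positions:
            # Floor division maps e.g. -50 to the window starting at -100.
            counts[(pos // window) * window] += 1
    return {start: 100.0 * n / n_sequences for start, n in sorted(counts.items())}


# Made-up motif positions for four sequences.
freq = windowed_frequency([[-50, 20], [30], [], [250]], window=100)
```

Sequences without any motif still count toward the denominator, which is what makes the value a frequency per 100 sequences rather than per 100 motif-containing sequences. For the bacterial survey the legend uses a 25-nt window, which here would correspond to `window=25`.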

