
Plant Physiol. (1996) 112: 1151-1166

Ancestral Multipartite Units in Light-Responsive Plant Promoters Have Structural Features Correlating with Specific Phototransduction Pathways¹

Gerardo R. Argüello-Astorga and Luis R. Herrera-Estrella*

Departamento de Ingeniería Genética de Plantas, Centro de Investigación y de Estudios Avanzados, Apartado Postal 629 (36500) Irapuato, Guanajuato, México

Regulation of plant gene transcription by light is mediated by multipartite cis-regulatory units. Previous attempts to identify structural features that are common to all light-responsive elements (LREs) have been unsuccessful. To address the question of what is needed to confer photoresponsiveness to a promoter, the upstream sequences from more than 110 light-regulated plant genes were analyzed by a new, phylogenetic-structural method. As a result, 30 distinct conserved DNA module arrays (CMAs) associated with light-responsive promoter regions were identified. Several of these CMAs have remained invariant throughout the evolutionary radiation of angiosperms and are conserved between homologous genes as well as between members of different gene families. The identified CMAs share a gene superfamily-specific core that correlates with the particular phytochrome-dependent transduction pathway that controls their expression, i.e. ACCTA(A/C)C(A/C) for the cGMP-dependent phenylpropanoid metabolism-associated genes, and CATA(A/T)CR for the Ca2+/calmodulin-dependent photosynthesis-associated nuclear genes. In addition to suggesting a general model for the functional and structural organization of LREs, the data obtained in this study indicate that angiosperm LREs probably evolved from complex cis-acting elements involved in regulatory processes other than photoregulation in gymnosperms.


Photosynthetic organisms have evolved complex biochemical systems to perceive and respond to light of different wavelengths. Three classes of photoreceptors have been identified in higher plants: red- and far-red-light-absorbing phytochromes, blue-light receptors, and UV-light receptors (Ahmad and Cashmore, 1993; Furuya, 1993; Quail, 1994). Light signals that are absorbed by these photoreceptors and transduced by associated molecular systems regulate the expression of many genes at both the transcriptional and the posttranscriptional levels (Silverthorne and Tobin, 1984; Gallie, 1993). Some light-responsive genes, such as those encoding the small subunit of Rubisco (rbcS genes) and chalcone synthase (chs genes) enzymes, are dependent on more than one photoreceptor, whereas others are regulated by a single type of photosystem (reviewed by Thompson and White, 1991).

¹ This work was supported in part by grant no. 75191-526901 from the Howard Hughes Medical Institute to L.R.H.-E. G.R.A.-A. is indebted to the Consejo Nacional de Ciencia y Tecnología-Mexico for a doctoral fellowship.

* Corresponding author; e-mail lherrera@mvaxl.red.cinvestav.mx; fax 52-462-4-58-49.

Recently, important progress in the elucidation of the signal transduction pathways linking one of these photoreceptors (phytochrome) with the expression of some genes has been achieved (reviewed by Bowler and Chua, 1994). Three different phytochrome-associated transduction pathways have been proposed: the first is dependent on cGMP, which activates the genes encoding anthocyanin biosynthetic enzymes (e.g. chs); the second is dependent on calcium/calmodulin, which regulates a subset of chloroplast-associated genes (e.g. rbcS); and the third is dependent on both calcium and cGMP, which controls another subset of genes that are involved in chloroplast development (Neuhaus et al., 1993; Bowler et al., 1994a, 1994b). It is conceivable that these transduction pathways target different transcription factors, but, to our knowledge, this is not yet known. In fact, despite extensive studies of light-responsive gene promoters and the identification and functional characterization of a plethora of cis-regulatory elements and trans-acting factors involved in their regulation (reviewed by Batschauer et al., 1994; Tobin and Kehoe, 1994; Terzaghi and Cashmore, 1995), unequivocal experimental evidence indicating an essential role in phytochrome responsiveness exists for only two short promoter subregions from genes encoding chlorophyll a/b-binding proteins (cab), namely the LS5-LS7 region from the Lemna cabAB19 gene (Kehoe et al., 1994) and the CGF-1 factor-binding site from the Arabidopsis cab2 gene (Anderson and Kay, 1995). These two regions were unable to direct detectable transcriptional activity by themselves, in the context of homologous or heterologous minimal promoters (Anderson et al., 1994; Kehoe et al., 1994), suggesting that additional cis-regulatory elements are involved in mediating photoresponses in these genes.

These and many other experimental data have led to the general hypothesis that the plant LREs are actually composite elements, i.e. aggregates of cognate sequences for different transcription factors, which interact to regulate gene expression (Schulze-Lefert et al., 1989; Terzaghi and Cashmore, 1995). This hypothesis has been confirmed both for the so-called unit 1 from parsley and mustard chs genes (Schulze-Lefert et al., 1989; Rocholl et al., 1994; Kaiser et al., 1995) and for the I-G unit from tobacco rbcS genes (Argüello-Astorga and Herrera-Estrella, 1995), which are the shortest native sequences from phytochrome-dependent genes that have been shown to function as LREs in gain-of-function experiments. Moreover, the presumed composite structure of LREs may explain why genes that are activated by the same phytochrome-associated transduction pathway, such as the rbcS and cab genes, display such marked differences in their responses to light in terms of the intensity and spectral quality required for their activation, time course and level of the induction, and phytochrome escape kinetics (White et al., 1992, 1995; Lubberstedt et al., 1994a). It has been proposed that a combinatorial interaction, which is characteristic of complex elements, may allow genes to be induced with different kinetics in response to the same stimulus (Hill and Treisman, 1995).

Abbreviations: CMA, conserved modular arrangement; IB, I-box; IBF, IB-binding factor; LRE, light-responsive element; PhANGs, photosynthesis-associated nuclear genes; PheMAGs, phenylpropanoid metabolism-associated genes.

It is foreseeable that the identification and functional characterization of other minimal, composite photoresponsive units, which are analogous to the chs unit 1 and rbcS I-G unit, will allow us to address unresolved questions concerning the photocontrol of transcriptional processes in plants: What is needed to confer phytochrome responsiveness to a promoter? Why do LREs from related genes, or independent LREs from the same promoter, differ both in sequence and in overall functional properties? Do phytochrome-associated transduction pathways target diverse transcription factors, or do they interact only with a set of related regulatory proteins? Therefore, the systematic comparison of DNA motif combinations, which function as LREs in genes that are dependent on the same signal transduction pathway, might reveal structural features that are common to all of them, providing important clues in understanding the molecular events underlying the coordinated regulation of their expression by specific photoreceptors.

We recently identified a composite light-responsive unit from rbcS genes using a new approach that combines the phylogenetic-structural analysis of promoter sequences with gain-of-function experiments (Argüello-Astorga and Herrera-Estrella, 1995). The same method of sequence analysis was successfully used to identify the cognate sequences for the replication-associated proteins and the functional target for a late-expression trans-activating factor in the Geminiviridae virus family (Argüello-Astorga et al., 1994). In the present study we have utilized this phylogenetic-structural approach to analyze the promoter region from more than 100 light-inducible plant genes, in an attempt to delimit conserved DNA motif arrays (i.e. putative composite elements) potentially involved in transcriptional photoresponses. As a result, 30 different DNA motif arrangements, which are conserved in diverse plant lineages, were identified. We show that several subsets of these CMAs are related both structurally and evolutionarily and that their presence correlates with experimentally defined light-responsive promoter regions. These CMAs share a gene superfamily-specific core sequence, which, in turn, correlates with the phototransduction pathway that controls the expression of each gene group. Based on data derived from this analysis, we propose a general hypothesis about the structural and functional organization of LREs in plant genes.

MATERIALS AND METHODS

General Approach

Two hypotheses were used as heuristic assumptions in the present analysis: (a) LREs in plant genes are composite elements, i.e. multipartite cis-regulatory units, and (b) the specific combinations, spacing, and/or relative orientation of individual factor-binding sites, which constitute a multipartite LRE, are critical determinants for its function, and, therefore, these modular arrangements should be conserved in evolution as a unit. These hypotheses were the basis for the search of CMAs, which are consistently associated with photoresponsive promoter regions. Two major groups of light-inducible plant genes were included in this study: (a) PhANGs, a superfamily of chloroplast protein-encoding genes, and (b) PheMAGs, which are involved in the biosynthesis of photoprotective pigments (e.g. anthocyanins) and in the active defense reactions of plants (reviewed by Hahlbrock and Scheel, 1989). Phytochrome control of PhANGs is exerted via calcium/calmodulin- or cGMP/calcium-dependent transduction pathways, whereas PheMAGs (e.g. chs) are apparently regulated via the cGMP-dependent transduction pathway (Bowler et al., 1994a, 1994b).

A systematic comparative analysis of promoter regions from PhANGs and PheMAGs, in which the existence of an LRE has been experimentally established, was conducted. Short promoter regions, including at least two different DNA stretches longer than 6 bp (putative individual factor-binding sites or "phylogenetic footprints" [Gumucio et al., 1992]), in which nucleotide sequence, spacing, and position relative to the transcription start site are conserved in a phylogenetic series, were defined as CMAs. An identified CMA was considered significant when it was conserved between distantly related plant species, because divergence in nonrelevant promoter segments is expected to increase with time from a specific plant lineage split event (Avise, 1994). Since protein-protein interactions between trans-acting factors facilitate binding to imperfect target sites (i.e. cooperative binding; Miner and Yamamoto, 1991; Wright and Funk, 1993), stringent sequence conservation of individual DNA motifs within a CMA was not expected.
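The footprint criterion above can be illustrated with a minimal Python sketch. It is not the authors' procedure, which relied on alignment software plus manual adjustment; it only shows one way to flag runs of more than 6 contiguous conserved positions in an existing gapped alignment and to ask whether at least two such runs are present. The function names, the `min_fraction` tolerance, and the toy sequences are assumptions introduced for illustration, and conservation of spacing and of position relative to the transcription start site is not checked here.

```python
from typing import List, Tuple


def conserved_columns(aligned: List[str], min_fraction: float = 1.0) -> List[bool]:
    """Flag alignment columns whose most common base reaches min_fraction.

    Columns containing a gap character ('-') are never counted as conserved.
    min_fraction is a hypothetical tolerance, not a value from the paper.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned sequences must have equal length")
    flags = []
    for col in zip(*aligned):
        bases = [b.upper() for b in col]
        if "-" in bases:
            flags.append(False)
            continue
        top = max(bases.count(b) for b in set(bases))
        flags.append(top / len(bases) >= min_fraction)
    return flags


def footprints(aligned: List[str], min_len: int = 7,
               min_fraction: float = 1.0) -> List[Tuple[int, int]]:
    """Return half-open (start, end) intervals of conserved runs >= min_len.

    min_len=7 encodes "longer than 6 bp".
    """
    flags = conserved_columns(aligned, min_fraction)
    runs, start = [], None
    for i, ok in enumerate(flags + [False]):  # sentinel closes a trailing run
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


def is_candidate_cma(aligned: List[str], min_len: int = 7,
                     min_fraction: float = 1.0) -> bool:
    """A region qualifies here if it holds at least two footprints."""
    return len(footprints(aligned, min_len, min_fraction)) >= 2


# Toy alignment (hypothetical sequences, not taken from the paper).
toy = [
    "GATAAGATAACCACGTGGC",
    "GATAAGATTTCCACGTGGC",
    "GATAAGATGGCCACGTGGC",
]
print(footprints(toy))
print(is_candidate_cma(toy))
```

Because the text states that stringent conservation of individual motifs was not expected, lowering `min_fraction` below 1.0 is one simple way to tolerate occasional mismatches; the appropriate threshold is a judgment call that the paper does not quantify.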

Sequences

The sequences used were obtained from the GenBank/EMBL nucleotide sequence databases. Sequences were identified by a keyword search. Published compilations of promoter sequences from specific gene groups were also utilized (Manzara and Gruissem, 1988; Mitra et al., 1989; Piechulla et al., 1991), and the upstream sequences from more than 90 chloroplast-associated nuclear genes and more than 25 PheMAGs were analyzed. The GenBank/EMBL accession numbers or references for sequences from the genes that are quoted in the figures are as follows. rbcS genes: cotton rbcS1, X54091; tomato rbcS1, M13542; tomato rbcS-2, X66069; tomato rbcS3A, S44160; potato rbcS 2, X69760; Nicotiana plumbaginifolia rbcS-8B (Poulsen and Chua, 1988); pea rbcS-3A, M21356; sunflower rbcS1, Y00431; Arabidopsis rbcS3B, X13611; soybean rbcS SRS4, M16889; Larix laricina rbcS1, X16039; wheat rbcS1, M37328; Lemna SSdA, S45166; maize rbcS Zm-3, U09743. cab genes: Arabidopsis cab-1, cab-2, and cab-3 (Mitra et al., 1989); pea AB80, K02067; tobacco cabE, X12512; wheat cab-1, X05823; maize cabM7, X53398; maize cab 48, X63205; Lemna AB19, M12152; rice cab-2R, X13909; Pinus thunbergii cab-6 (Kojima et al., 1992). Gap genes: Arabidopsis GapA, L14743; Arabidopsis GapB, L14749; pea gpal, X52148; maize Gpal, X15408. Fed genes: Arabidopsis Fd2, X51370; pea Fed-1, X14207; wheat PetF, X75089. Pc genes: pea Pc, S66544; spinach petE, X52288. ppdk genes: Flaveria pdk, X79095; maize C4ppdkZm1, S46964. rbcA genes: Arabidopsis rbcA, A86720; barley rcaB, M55449; spinach rbcA, S45033. SBP genes: Arabidopsis SBP, S74719; wheat SBP, S63737. Ls genes: potato ST-LS1, X04753. Atp genes: spinach AtpC, X76131; spinach AtpD, X61362. CHS genes: Antirrhinum CHS, X03710; Arabidopsis CHS, M20308; barley CHS, X58339; carrot CHS2, D16255; parsley CHS1, M35515; maize C2, X60205; mustard CHS1, X16437; soybean chs15, X16184. PAL genes: parsley PAL-1, X15473; pea PAL-1 (Yamada et al., 1992); tomato Pal-1, M90692. 4cl genes: parsley 4CL-1, X05350; potato 4CL-1, M62755.

Sequence Alignments

Sequence alignments were carried out with the Geneworks 2.0 package (Intelligenetics, Mountain View, CA), followed by a manual adjustment when necessary. The general procedure consisted of searching for significant similarities between a specific LRE-containing promoter region and structurally equivalent segments in homologous genes from closely related plant species. Subsequently, comparisons with analogous regions from distantly related species were carried out. Adjustments in the alignments were made to maximize modular similarity in a phylogenetic series. Discontinuous sequence similarity was not considered significant, and gaps were allowed only between conserved DNA stretches of more than 6 contiguous bp, which defines phylogenetic footprints (Gumucio et al., 1992). All of the alignments are available from the authors upon request.

RESULTS

Phylogenetic Series of PhANG and PheMAG Minimal Photoresponsive Units

To establish the criteria for significant modular similarity between functionally analogous promoter regions, a comparative analysis of the DNA segments that were homologous to a well characterized light-responsive unit from both a PheMAG and a PhANG, respectively, was conducted, which included the unit 1 from parsley and mustard chs genes (Schulze-Lefert et al., 1989; Rocholl et al., 1994) and the I-G unit from tobacco rbcS genes (Argüello-Astorga and Herrera-Estrella, 1995).

The search for sequences that were modularly similar to unit 1 in more than 15 chs genes from 12 plant species led to the unexpected finding that they are present in only the genes of plants belonging to two unrelated families, Brassicaceae and Gramineae (Fig. 1A). This bizarre result is, however, in complete agreement with phylogenies derived from CHS protein sequences, which classify the chs genes of the above-mentioned two plant families in a cluster that is separate from other dicotyledonous chs genes (Durbin et al., 1995). Alignment of the unit 1 homologous regions allowed us to distinguish three simple, conserved modules or phylogenetic footprints (Fig. 1A), one of them corresponding to the parsley chs box II (a G-box element) and the other two partially comprising the box I element (Schulze-Lefert et al., 1989), suggesting that the latter is a combined binding site for a heteromeric complex rather than for a single regulatory protein. Sequence conservation of DNA modules 1 and 2 is relatively high, but considerable degeneration in module 3 (box II) is apparent in this phylogenetic series. It is interesting that the similarity between particular modules is not related to the phylogenetic distance between species (e.g. in Fig. 1A, compare modules 1 and 3 from parsley unit 1 with those from maize and mustard genes).

The search for promoter regions that were structurally similar to the tobacco I-G unit in rbcS genes from 30 species of dicotyledons, 8 monocotyledons, and 1 gymnosperm revealed that they actually exist in at least one member of the multicopy-gene family in all of the examined dicotyledonous species, in one monocotyledonous plant (maize), and in the conifer L. laricina. Alignment of the I-G unit homologous regions showed that the heptameric IB core sequence GATAAGR (Giuliano et al., 1988) is conserved throughout all of the phylogenetic series (with the exception of the Larix gene; see Fig. 1B). It is interesting that one additional phylogenetic footprint adjacent to the IB core motif was found in the genes from dicotyledons (GA-motif in Fig. 1B). Together, the IB core sequence and GA-motif constitute the binding site for a light-modulated nuclear factor IBF-1a (Borello et al., 1993). Spacing between IB and G-box elements is not strictly conserved in evolution, although it is less variable within certain plant lineages such as the Solanaceae (not shown). Close analysis of the DNA segments that separate the IB and G-box revealed that members of the multicopy rbcS family display divergent nucleotide sequences, which are conserved in orthologous genes from diverse plant species (Fig. 1C). Therefore, they should be properly recognized as "family member-specific" motifs, which might be cis-acting elements that confer additional regulatory properties to the basic, structurally invariant, I-G functional unit. This was suggested by the recent finding of a fruit-specific nuclear factor that binds to an IB-overlapping element (F-box) in tomato rbcS-3A (Meier et al., 1995).


Figure 1. A, Comparison between promoter regions from chs genes that are structurally similar to unit 1 of the parsley chs gene. B, Short phylogenetic series of the (IB)-(G-box) region from rbcS genes. C, Region of the I- and G-box elements from three tomato rbcS genes that illustrates the distinction between invariant and variable conserved DNA modules. Mismatches between aligned sequences are denoted by asterisks (*). Delimitation of chs box I and box II, the IBF-1a-binding site, and the F-box is according to Schulze-Lefert et al. (1989), Borello et al. (1993), and Meier et al. (1995), respectively. References for gene promoter sequences are indicated in "Materials and Methods."

[Figure 1 graphic: the alignments are not recoverable from the extracted text. Surviving labels: panel A (chs unit 1), modules 3, 2, and 1 with box II and box I, shown for parsley CHS, Arabidopsis CHS, and barley CHS (monocots); panel B (rbcS I-G unit), Larix rbcS (gymnosperms), maize rbcSZm3 (monocots), and tobacco rbcS (dicots), with the GA-motif, I-box, G-box, and IBF-1a binding site marked; panel C, invariant and family member-specific modules in homologous genes from potato, tomato, pea, and tobacco, with the F-box, I-box, and G-box marked. The only sequence that survives is tomato rbcS1: GGATGAGATAAGATT CACCATATTCCG ACACGTGGCACC.]

Phylogenetic-Structural Analysis of PhANG Promoters

Comparative analysis of multicopy-gene families (e.g. cab and rbcS) is complicated by the fact that numerous conserved sequence motifs may be found within their promoters (Manzara and Gruissem, 1988; Piechulla et al., 1991), although not all of them are involved in photoregulation. Therefore, analysis should be focused on the search for the minimal combination of sequence elements that are invariant components of a series of homologous promoter regions, in which an LRE has been experimentally identified. For example, a number of works have established the existence of a photoresponsive element within the proximal segment from three different Arabidopsis cab promoters (Ha and An, 1988; Mitra et al., 1989; Anderson et al., 1994; Kenigsbuch and Tobin, 1995). Comparative analysis of these regions and their counterparts in genes from species belonging to the three main higher plant lineages allowed us to identify six conserved DNA modules (Fig. 2). Only some of them, however, are invariant components of these regions and correlate with all of the experimentally defined LREs in orthologous cab genes. Therefore, the arrangement, including these invariant DNA modules, was defined as an LRE-associated CMA (Fig. 2).

Analysis of promoter sequences from approximately 40 rbcS, approximately 35 cab, and more than 25 additional PhANGs led to the identification of 24 LRE-associated modular arrangements. Some of these CMAs are conserved from gymnosperms to monocotyledons (e.g. cab-CMA1 and rbcS-CMA5 [I-G unit]), whereas others are present in a single plant family (e.g. rbcS-CMA2). The identified CMAs are generally located within the first 250-nucleotide promoter segment and are composed of two or three small conserved DNA modules (Fig. 3).

The observation that all of the shortest PhANG promoter regions, where the presence of a light-responsive unit has been experimentally established, contain one CMA is relevant. These are the tobacco rbcS gene 52-bp I-G region (Argüello-Astorga and Herrera-Estrella, 1995), a 56-bp DNA segment from the maize C4ppdkZm1 (encoding pyruvate, Pi dikinase) gene (Sheen, 1991), a 65-bp truncated pea rbcS 3A gene promoter (Kuhlemeier et al., 1989), a 72-bp region from the spinach rbcA (encoding Rubisco activase) gene promoter (Orozco and Ogren, 1993), a 78-bp segment from the Arabidopsis cab2 gene (Anderson et al., 1994), and an 89-bp region from the spinach plastocyanin gene promoter (Lubberstedt et al., 1994b; Fig. 3). Moreover, the region made up of the two 10-bp sequences (LS5 and LS7) involved in phytochrome responsiveness of the Lemna cabAB19 gene (Kehoe et al., 1994) also corresponds to a monocotyledon-specific CMA (cab-CMA4). An additional CMA, AtpCD-CMA*, was defined by its modular similarities between short, photoresponsive regions from two coordinately regulated spinach genes (AtpC and AtpD) encoding different subunits of the chloroplastic ATP synthase and for which there are no homologous genes in which this sequence is known (Bolle et al., 1996).

[Figure 2 graphic: the diagram is not recoverable from the extracted text. Surviving labels: modular arrangements of cab promoters from mustard Cab1, petunia Cab 22R, maize cab-1, two tomato cab genes, and Pinus thunbergii cab-6; lineage labels for monocots and gymnosperms; an angiosperm common ancestor annotated "Phytochrome responsiveness?"; a 25-nt scale bar; and a flow from phylogenetic-structural analysis to the theoretical definition of the basic structural unit associated with an LRE (CMA-1).]

Figure 2. Phylogenetic-structural analysis of cab genes. Six different conserved DNA motifs present within the TATA-proximal region of cab type 1 genes were identified. Only some of them correlate well with experimentally defined LREs. Thus, module 6 lies outside of the minimal light-responsive regions from Arabidopsis cab2 and cab3 (Mitra et al., 1989; Anderson et al., 1994), whereas module 5 was excluded from engineered Arabidopsis cab1 gene promoters without an effect on its photoresponsiveness (Ha and An, 1988). Since a cab gene (P. thunbergii cab-6) containing only modules 1 and 4 seems to be nonphotoregulated (Kojima et al., 1992, 1994), the minimal DNA modular array, which is consistently associated with light-responsive regions, is the one constituted by modules 1, 2, and 4. Individual conserved DNA modules are: 1 to 3, GATA motifs (Gidoni et al., 1989); 4, CCAAT box; 5, motif related to the ABF-2-binding site (Argüello et al., 1992) and contained within the region bound by CA-1, CUF2, and CUF3 factors (Sun et al., 1993; Carré and Kay, 1995); 6, CUF-1 binding-site (Anderson et al., 1994).

Analysis of PheMAG Promoters

Three families of genes involved in the phenylpropanoid biosynthetic pathway have been studied in relation to the photocontrol of gene expression: chs, pal (encoding the phenylalanine ammonia-lyase enzyme), and 4cl (encoding the 4-coumarate:CoA ligase enzyme). Comparative analysis of the upstream sequences from more than 25 PheMAGs from both monocotyledonous and dicotyledonous plant species allowed us to identify six CMAs. Their modular composition, arrangement, and localization within the promoter are summarized in Figure 3A. Most of the PheMAG CMAs are localized in the promoter proximal region, and their modular components include several previously described cis-acting elements (see legend to Fig. 3). A remarkable fact is that five of the identified PheMAG CMAs completely overlap with the UV-light-induced in vivo footprints found in parsley chs, pal, and 4cl gene promoters (Lois et al., 1989; Schulze-Lefert et al., 1989). The remaining CMA, namely chs-CMA3, is contained within light-responsive regions from both soybean chs15 and Antirrhinum Chs promoters (Lipphardt et al., 1988; Wingender et al., 1990).

Phylogenetic conservation of these CMAs is apparently high because they are present in PheMAGs from all of the examined plant species. Thus, pal-CMA1 was found in genes from such diverse plant species as Arabidopsis (Brassicaceae), parsley (Apiaceae), tomato (Solanaceae), pea and bean (Fabaceae), and poplar (Salicaceae) (shown in Fig. 3A only in part). Likewise, chs-CMA3 was found in genes from several dicotyledonous species such as carrot (Apiaceae); Antirrhinum (Scrophulariaceae); bean, pea, soybean, and Trifolium (Fabaceae); and petunia (Solanaceae). In the oddest case, both chs-CMA1 (unit 1) and chs-CMA2 were found in genes from parsley (a dicotyledon) and barley (a monocotyledon), whereas in the genes of their close relatives, mustard and Arabidopsis, and maize, respectively, only chs-CMA1 is present. Phylogenetic and evolutionary implications of these observations will be discussed elsewhere (G.R. Argüello-Astorga and L.R. Herrera-Estrella, unpublished data).

Figure 3. Legend appears on facing page.

CMAs from Different Genes Are Structurally Similar

Genes involved in the same physiological process, such as diverse PheMAGs and PhANGs, generally display a coordinated pattern of expression. Hence, it is anticipated that different members of those gene superfamilies will display certain similarities in promoter architecture, particularly in the regions that mediate responses to light- and chloroplast-derived signals. A systematic structural comparison of identified CMAs was conducted to verify this prediction. The analysis revealed the existence of striking structural analogies between CMAs from several unrelated genes. Examples are rbcS-CMA5 and fed-CMA1, which have an (IB)-(G-box) basic arrangement; rbcS-CMA4 and cab-CMA2, which have a (G-box)-(IB) arrangement; rbcS-CMA3 and Ls-CMA1, which have a (GT-1)-(IB)-(GT-1) module array; and rbcS-CMA1, Pc-CMA1, and AtpCD-CMA*, all of which display a "LAMP" element (actually an inverted IB motif; Grob and Stuber, 1987) that is closely associated with the TATA box (Fig. 4A).

In certain cases the overall sequence similarity between CMAs from heterologous promoter segments is even higher than that observed between equivalent regions from homologous genes. For example, pea cabAB80 CMA2 displays 56% sequence identity with soybean cab 4 CMA2 and 61% with the unrelated pea rbcS3A CMA4 (not shown). Structural resemblances between CMAs from nonhomologous genes suggest that similar, multicomponent regulatory complexes could assemble on these composite elements to coordinately activate (or repress) the expression of those genes.
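The identity figures quoted above are pairwise percentages over aligned positions. Purely as an illustration, the following Python sketch computes such a percentage for two sequences that have already been aligned. The function name, the gap convention, and the toy sequences are assumptions introduced here; they are not the authors' procedure or data.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only the columns where both sequences carry a nucleotide.
    columns = [(x, y) for x, y in zip(a.upper(), b.upper()) if gap not in (x, y)]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


# Toy, hypothetical aligned fragments (not taken from the paper).
seq_a = "CACGTGGC-ATAAGG"
seq_b = "CACGTGTCAATAAGA"
print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```

Whether gapped columns are excluded or counted as mismatches changes the result, so any comparison of identity values should state which convention was used.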

Similarities in promoter architecture between coordi- nately expressed pal and 4cl genes have been previously observed (Lois et al., 1989; Logemann et al., 1995) and were confirmed in this analysis.

Different CMAs Are Associated with Redundant LREs

Several phytochrome-responsive gene promoters contain multiple LREs that can function independently of one another. Such multiple LREs are generally deemed to be "redundant," but experimental observations suggest that their overall functional properties are not identical. For example, an LRE-containing proximal region from the maize C4ppdkZm1 promoter displays activity that is dependent on both chloroplast development and light pulses, whereas a second, more upstream region directs an expression program that is not dependent on mature chloroplasts and requires continuous illumination (Sheen, 1991). Subtle functional differences between photoresponsive units 1 and 2 from the parsley chs gene (Schulze-Lefert et al., 1989; Block et al., 1990) and three LRE-containing regions from the pea rbcS 3A gene promoter (Kuhlemeier et al., 1988, 1989; Gilmartin and Chua, 1990) also have been identified. A distinct structural composition and organization of redundant LREs might explain the functional differences, such as that recently established for two composite, auxin-responsive elements from a soybean gene promoter (Ulmasov et al., 1995).

Accordingly, comparative analysis of PhANG and PheMAG promoters showed that they generally contain more than one CMA (Fig. 4B). Close examination of the distribution and relative position of the identified CMAs revealed that several of them correlate with known redundant LRE-containing regions from specific genes. For example, chs-CMA1 and chs-CMA2 from the parsley chs gene correspond to the light-responsive units 1 and 2, respectively; rbcS-CMA1, rbcS-CMA3, and rbcS-CMA4/CMA5 are found within the -50/+15, -166/-50, and -410/-166 photoresponsive regions from the pea rbcS 3A promoter (Gilmartin et al., 1990), respectively; and ppdk-CMA1 and ppdk-CMA2 are contained within the -108/-52 and -347/-109 LRE-containing regions from the maize C4ppdkZm1 gene (Sheen, 1991) (Fig. 4B). The finding that the modular composition of multiple CMAs from the same promoter is different (i.e. they are not structurally redundant) corresponds well with the anticipated structural features of PhANG and PheMAG multiple LREs.

Figure 3. CMAs from diverse PhANGs and PheMAGs. Structurally homologous segments of light-responsive promoters are aligned, and conserved DNA modules are enclosed in rectangles, being denominated according to diverse authors (see below). Previously undesignated modules with no homology to known regulatory elements are arbitrarily denominated, using (") for denotation. DNA modules marked with an asterisk (*) contain sequences related to either the IB core or the LAMP element, and the ones marked with a plus (+) contain sequences related to the H-box core. Position of aligned sequences relative to the transcription start point is indicated. When this has not been defined, numbering was conventionally made by assigning -30 to the 5' end of the putative TATA box and is indicated by an asterisk on the number. Plant lineages in which identified CMAs were found are listed, and one specific CMA-containing region where an LRE has been experimentally identified is also indicated. #, Light-inducible DNase hypersensitive site; ##, light-modulated in vivo footprinted region. A, Two or three homologous CMAs are aligned to show variations in spacing and nucleotidic sequence of the individual DNA modules. B, Only one representative CMA is shown. Module denominations: LAMP element, Grob and Stuber, 1987; rbcS motif 15 and 17, Manzara and Gruissem, 1988; box II to III, Green et al., 1987; 3AF3 and 3AF5 sites, Sarokin and Chua, 1992; G-box and IB, Giuliano et al., 1988; M consensus, Schafner and Sheen, 1991; Lemna X- and Y-boxes, Buzby et al., 1990; GATA motifs, Gidoni et al., 1989; Gap-box, Kwon et al., 1994; H-box, Loake et al., 1992; L-box, Logemann et al., 1995; box 4, Yamada et al., 1992; PC2 region, Lubberstedt et al., 1994b. "DREP module" is a conserved sequence including the AAAAT motif, which is critical for functional interactions between AtpC promoter and a putative in-dark repressor (Bolle et al., 1996). References for promoter sequences are in "Materials and Methods."

Figure 4. A, Comparison of CMAs from unrelated genes in the same plant species. The promoter region from potato rbcS 2 comprised CMA3 and the G-box module of the adjacent CMA5. The Ls-CMA1 sequence is inverted. The DREP module is as in Figure 3B and includes sequences presumably essential for binding at DNA of a putative repressor in the dark. B, Simplified schemes of plant promoters harboring two or more CMAs. These are identified by a number below the bar. The position of CMAs relative to the transcription start site is also indicated.

Evolutionary Relationships between LRE-Associated CMAs

Since most of the known PhANGs and PheMAGs from both monocotyledons and dicotyledons seem to be regulated by light at the transcriptional level, it is conceivable that the homologous genes in their common ancestor were similarly regulated. Therefore, it is reasonable to consider that some multipartite photoresponsive units might be conserved in orthologous genes from these two angiosperm lineages. Our analysis shows that several CMAs are present in the genes from both monocotyledonous and dicotyledonous plant species, and these are cab-CMA1, chs-CMA1 and chs-CMA2, fed-CMA1, gapA-CMA1, ppdk-CMA1 and ppdk-CMA2, sbp-CMA1, rbcA-CMA1, and rbcS-CMA4 and rbcS-CMA5 (Figs. 1-3).

Since other cis-regulatory units that are also involved in transcriptional photocontrol may have diverged significantly in sequence throughout the process of angiosperm radiation, phylogenetic relatedness of LREs in genes from different plant lineages may be obscured. Therefore, a closer structural analysis of CMAs that were restricted to specific plant groups was performed in an attempt to trace plausible evolutionary relationships with the more conserved modular arrangements. This new analysis revealed that many of the identified LRE-associated CMAs are phylogenetically related. For instance, apparently unrelated cis-acting elements such as the X- and Y-boxes from the Lemna rbcS genes (Buzby et al., 1990) and the I- and G-box elements from the Solanaceae rbcS-CMA5 (I-G unit) seem to be evolutionarily derived from the same ancestral regulatory promoter region (Fig. 5A). An analogous conclusion may be derived from the comparison between the cab-CMA4 region from the wheat cab1 gene and the cab-CMA5 from the rice cab 2R gene (Fig. 5B; see also Fig. 3A). Thus, the apparent high diversity of LRE-associated CMAs that were found in genes from modern plants may be reduced, from a phylogenetic point of view, to only a few primary cis-regulatory units. For example, rbcS-CMAs 4, 5, 6, and 7 seem to have been derived from only one ancestral complex regulatory arrangement (Fig. 5A) and two monocotyledonous cab-CMAs from only one (Fig. 5B). Moreover, rbcS-CMA2 and rbcS-CMA3 partially overlap in Solanaceae (Fig. 3), and phylogenetic relatedness between Fabaceae cab-CMA2 and Solanaceae cab-CMA3 is suggested by structural similarities found between some tomato (e.g. cab 7) and pea (e.g. AB80) cab promoters (not shown). Common ancestry for chs-CMA1 and chs-CMA3 is also suggested by both their position relative to the TATA box and their modular composition (Figs. 1 and 3).

Figure 5. Evolutionary relatedness of diverse LRE-associated CMAs. A, A tripartite modular array (I-G-I boxes) is found in rbcS genes from plant species belonging to the three main lineages of higher plants. Divergence in overall structural organization of this region has taken place either by incorporation of new DNA motifs closely associated with primitive modules or by secondary loss of some of them in specific plant lineages (e.g. CMA4 in Solanaceae) or in individual members of this multicopy gene family (e.g. pea rbcS genes). Additional conserved modules are: MC, monocotyledonous consensus; CG and TG motifs, as in Figure 3; GA, RGATGA motif (Fig. 1B); 1A, 1B, and 1C, family-member-specific modules in Figure 1C. B, Evolutionary relationships between cab-CMA4 and cab-CMA5 in genes from monocotyledonous plant species. Since some cab genes from Gramineae (Poaceae) have promoter regions modularly similar to the Lemna cabAB19 LS5-LS7 region, it is inferred that the common ancestor of these two monocotyledonous lineages harbored similar sequence motifs in its homologous genes. CabA designates modules similar to the so-called footprinting A from wheat cab-1 gene (Gotor et al., 1993). An asterisk (*) indicates high similarity but not identity, and ++ denotes lower similarity in the nucleotidic sequence relative to Lemna DNA modules.

Evolutionary relationships between well-studied composite photoresponsive units such as chs-CMA1, cab-CMA4, and rbcS-CMA5 with less-known CMAs provide additional albeit indirect evidence supporting the hypothesis that the identified CMAs are actually LREs.

Comparative Analysis of LRE-Associated CMAs

To explore the existence of structural features that are common to light-responsive promoter regions from PhANGs and PheMAGs (the central question addressed in this work), a comparative analysis of all of the identified CMAs was made. Several important observations were derived from this analysis.

G-box (core motif, CACGTG; Giuliano et al., 1988) and G-box-like elements are common components of CMAs that are present in different gene families. For example, they are found in all of the chs-CMAs, in three cab-CMAs, in two rbcS-CMAs, and in other PhANG CMAs (Fig. 3). However, several CMAs from well-studied LRE-containing promoter regions lack any discernible G-box motif, e.g. rbcS-CMAs 1, 2, 3, and 6; cab-CMAs 1 and 4; and the CMAs found in gapA, gapB, ppdk, atpC-D, pal, and 4cl genes.

GT-1 elements (core motif, GGTTAA; Green et al., 1988) are invariant components of only a few CMAs from dicotyledonous PhANGs such as rbcS-CMAs 2 and 3 and Ls-CMA1. Modules that are similar in sequence to the rbcS box II (GTGTGGTTAATATG), the prototype of GT-1 elements (Green et al., 1987), are also found in cab-CMA3, rbcS-CMA6, Pc-CMA2, rbcA-CMA1, and gapB-CMA1 (Fig. 3). However, their similarity lies mainly outside the defined hexameric GT-1 core sequence, in the overlapping GTGTG motif. In fact, the less conserved positions within box II in an rbcS-CMA3 phylogenetic series include the central TTAA motif (not shown; see also Manzara and Gruissem, 1988). Typical GT-1 elements were not found in any of the identified CMAs from PheMAGs.

All of the identified PhANG CMAs include at least one DNA module with a core sequence that is identical with or related to the IB core motif (GATAAGR) or its inverted version, the LAMP element (YCTTATC) (Fig. 6). In some cases the similarity with the IB motif is evident only in certain members of a phylogenetic series (e.g. in Lemna but not in wheat cab-CMA4; Fig. 3A). The proper spacing of critical nucleotides relative to other, more conserved DNA modules probably compensates for sequence degeneration via protein-protein interactions between regulatory factors (Wright and Funk, 1993).

It was also found that certain PhANG CMA modules display a consensus with minimal but specific differences with respect to the canonical IB motif. For instance, motif 15 from Solanaceae rbcS-CMA2 is GATGAGG (Manzara and Gruissem, 1988), and IB/LAMP modules from gapA-CMA1 and rbcA-CMA1 display the consensus GATTAGATT and GGATTGG, respectively (Figs. 3 and 6).
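The relationship between the IB core and the LAMP element, and the idea of a module being "related to" a core, can be made concrete with a short Python sketch. It is an illustration under stated assumptions: the helper names are hypothetical, degenerate positions are read with the standard IUPAC codes (R = A or G, Y = C or T), and relatedness is approximated by a simple mismatch count, which is not necessarily the criterion applied in this study.

```python
# Standard IUPAC nucleotide codes: each symbol maps to the bases it allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT",
    "S": "CG", "W": "AT", "N": "ACGT",
}
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "M": "K", "K": "M",
    "S": "S", "W": "W", "N": "N",
}


def reverse_complement(seq: str) -> str:
    """Reverse complement that also handles degenerate IUPAC symbols."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def mismatches(module: str, consensus: str) -> int:
    """Number of positions at which `module` is not allowed by `consensus`."""
    if len(module) != len(consensus):
        raise ValueError("module and consensus must have equal length")
    return sum(base not in IUPAC[code]
               for base, code in zip(module.upper(), consensus.upper()))


IB_CORE = "GATAAGR"                   # IB core motif as given in the text
LAMP = reverse_complement(IB_CORE)    # the inverted version of the IB core

print(LAMP)                               # YCTTATC
print(mismatches("GATGAGG", IB_CORE))     # 1 (motif 15 of rbcS-CMA2)
print(mismatches("GATAAGG", IB_CORE))     # 0
```

Under these assumptions the LAMP element is exactly the reverse complement of the IB core, and motif 15 departs from the canonical core at a single position.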

Figure 6. Comparisons between DNA modules from specific CMAs containing sequences similar to either the LAMP/IB or the H-box core motifs. Sequences correspond to modules indicated with an asterisk (*) or a plus (+) in Figure 3. They are not a consensus but the actual sequence from one specific gene selected from among those that are shown in the latter figure. Plant species that correspond to the shown module are indicated by a number at the right side and are as follows: 1, tobacco; 2, pea; 3, Lemna; 4, maize; 5, spinach; 6, wheat; 7, Arabidopsis; 8, parsley; 9, barley; 10, soybean; and 11, potato. In several cases the complementary sequence is shown, and this is indicated.

A final, important observation is that all of the identified CMAs from PheMAGs contain a conserved module related to the chs H-box (Loake et al., 1992) or pal L-box (Logemann et al., 1995) core motif ACCTA(A/C)C(A/C) (Fig. 6). Therefore, a structural feature that is common to the light-responsive regions of genes belonging to a particular superfamily of light-dependent genes does exist, and the implications of such findings are discussed below.
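As an illustration of what "related to the core motif ACCTA(A/C)C(A/C)" can mean operationally, the Python sketch below compares octamers against that core. The octamers are those legible in the PheMAG part of Figure 6, without asserting which CMA each belongs to. The function name and the use of a mismatch count as a measure of relatedness are assumptions made for this example, not the authors' stated criterion.

```python
# Bases allowed at each position of the H-box/L-box core ACCTA(A/C)C(A/C).
HBOX_CORE = ["A", "C", "C", "T", "A", "AC", "C", "AC"]


def hbox_mismatches(octamer: str) -> int:
    """Count the positions at which an 8-nt module departs from the core."""
    if len(octamer) != len(HBOX_CORE):
        raise ValueError("expected an 8-nt sequence")
    return sum(base not in allowed
               for base, allowed in zip(octamer.upper(), HBOX_CORE))


# Octamers as read from the PheMAG part of Figure 6.
modules = ["ACCTAACC", "ACCTACAC", "ACCTACCC", "ACCTACCA",
           "CCCTAACC", "ACCTACCA", "ACCTAACA"]

for seq in modules:
    n = hbox_mismatches(seq)
    label = "identical to core" if n == 0 else f"related, mismatches: {n}"
    print(seq, label)
```

With this reading, five of the seven octamers fit the core exactly and the other two differ from it at one position each.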

DISCUSSION

Phylogenetic-structural analysis of the promoter region from more than 110 photoinducible plant genes has allowed us to identify 30 LRE-associated DNA motif arrangements that are conserved in 15 different gene families. These CMAs contain practically all of the regulatory elements that have been shown to play a role in light responsiveness of these genes. Several lines of direct and indirect evidence support the notion that the identified CMAs are actually involved in the control of plant gene transcription by light signals. First, the shortest known photoresponsive promoter regions from both PhANGs and PheMAGs contain one CMA. Second, several nonhomologous genes whose expression is coordinately regulated by light contain CMAs that are structurally similar. Third, localization of multiple CMAs in the same promoter correlates with known redundant LREs. Fourth, diverse LRE-associated CMAs from a specific gene family constitute a set of phylogenetically related promoter modules, which conceivably display similar regulatory functions. Fifth, the CMAs from genes that belong to the same superfamily have a set of related sequence motifs in common, which suggests the participation of a family of related regulatory factors in the transcriptional control of these genes.

Are Some of the IB- and H-Box-Related Elements Binding Sites for Heteromeric Complexes?

The finding that light-responsive units of a presumed bipartite nature (e.g. rbcS I-G unit) are actually more complex from a structural viewpoint may be significant. Alignment of CMAs in a phylogenetic series allowed us to detect two distinct components in otherwise apparently single DNA modules. This is the case of box 1 from chs unit 1, comprising motifs 1 and 2 (Fig. 1), and of the I-box (IB) elements associated with the G-box module in rbcS-CMA4/5. The latter display distinct "broadened" consensus sequences in dicotyledons, which are RGATGAGATAAGAT (20 genes from species belonging to 10 genera and five families compared; not shown) and TGGTGNNYAAYGATAAGG (derived from 11 genes from plants belonging to four dicotyledonous families) for rbcS-CMA5 and rbcS-CMA4 IB modules, respectively. Cab-CMA1 also includes an IB element, which is part of a conserved array of three GATA motifs found in many cab gene promoters (Gidoni et al., 1989) and whose consensus in angiosperms is GATANNGATAN(6-8)GATAAGR (Fig. 2).
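The angiosperm consensus GATANNGATAN(6-8)GATAAGR combines fixed motifs, degenerate positions, and a variable spacer. One way to search for such an array, sketched here in Python, is to translate it into a regular expression. The sketch assumes that N means any nucleotide, that R means A or G, and that N(6-8) denotes a spacer of six to eight nucleotides. The function name is hypothetical and the test sequence is synthetic.

```python
import re

# GATA NN GATA N(6-8) GATAAG R, written as a regular expression.
GATA_ARRAY = re.compile(r"GATA[ACGT]{2}GATA[ACGT]{6,8}GATAAG[AG]")


def find_gata_arrays(promoter: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping hit."""
    return [(m.start(), m.group())
            for m in GATA_ARRAY.finditer(promoter.upper())]


# Synthetic example: three GATA motifs, with a 7-nt spacer before the IB core.
synthetic = "ttcaGATACCGATATTTCCAAGATAAGGctt"
print(find_gata_arrays(synthetic))
```

The pattern scans only the strand it is given; to detect inverted arrays, the reverse complement of the promoter would have to be searched as well.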

The conservation of sequences flanking the heptameric IB core motif in these PhANG CMAs suggests that either different yet related IBFs interact with specific IB modules or that these are actually combined binding sites for different IBF-containing heteromeric regulatory complexes. Moreover, the presumption that known nuclear IBF activities, such as LRF-1, IBF-1a, GAF-1, and others (reviewed in Batschauer et al., 1994; Terzaghi and Cashmore, 1995), are indeed heteromeric protein complexes allows us to explain why in some cases sequences flanking IBs rather than the IB core motif determine both in vitro binding specificity (Borello et al., 1993; Carré and Kay, 1995) and the functional properties of the IB elements (Lubberstedt et al., 1994b).

Are the IB- and the H-Box-Related Modules the Light- Specific Elements of Composite LREs?

From a structural point of view, LRE-associated CMAs from PhANGs may be considered as variants of a common theme: a combination of an IB/LAMP-related element with one or more additional conserved DNA modules arranged in a well-defined and characteristic manner. Likewise, CMAs from PheMAGs would be combinations of an H-box-related module with diverse conserved DNA motifs. This organization is reminiscent of the composite elements that are responsive to steroid hormones. They are found in a variety of forms in families of mammalian genes that are regulated by these hormones but display different spatial and temporal patterns of expression in response to them (Miner and Yamamoto, 1991; Yamamoto et al., 1992). A further analogy leads us to consider IB/LAMP- and H-box-related modules as the structural equivalents of cognate sequences for a specific hormone-receptor complex, and adjacent elements as the binding sites for other regulatory factors that ultimately determine where and when this complex enhances or represses transcription (Miner and Yamamoto, 1991).

Several independent lines of evidence suggest that the IB and IB-related elements might be the critical (i.e. light-specific) components of PhANG LREs: (a) two elements from cab genes that have been unequivocally defined as responsive to phytochrome-derived signals, namely the CGF-1 cognate site and the LS7 element, contain IB-like motifs (Kehoe et al., 1994; Anderson and Kay, 1995); (b) the IB is the only regulatory sequence common to monocotyledon and rbcS genes that has been shown to be involved in their regulation by light (Rolfe and Tobin, 1991; Schafner and Sheen, 1991); (c) most (if not all) of the known plant nuclear factors whose relative concentration and/or phosphorylation state is modulated by light bind to IB/LAMP-related sequences; they include the LRF-1 (Buzby et al., 1990), GAF-1 (Gilmartin et al., 1990), 3AF3 and 3AF5 (Sarokin and Chua, 1992), and IBF-1a (Borello et al., 1993) factors; (d) in vivo methylation interference experiments showed light-modulated changes in methylation patterns around the IB-like "motif 15" (rbcS-CMA2) from tomato rbcS 3B (Manzara et al., 1991); (e) the only light-induced DNase-I hypersensitive site found in the pea rbcS3.6 promoter was centered around the IB element (Gorz et al., 1988); and (f) the presence of IB/LAMP-related sequences is the only structural feature that is shared by all of the known photoresponsive regions from PhANG promoters.

On the other hand, the relevance of the H-box-related elements in the photocontrol of PheMAG transcription is highlighted by the fact that the only structural feature in common between the light-induced in vivo footprinted regions in parsley chs, pal, and 4cl genes is the presence of these elements (Lois et al., 1989; Schulze-Lefert et al., 1989).

A General Model of LRE Organization

Integration of the data obtained in this work with results from a number of published studies allows us to propose a hypothetical model of structural and functional organization of PhANG and PheMAG LREs. This model is analogous to those proposed for complex cis-acting elements that are responsive to hormones (Rogers and Rogers, 1992; Ulmasov et al., 1995) or to a variety of environmental stimuli (deVetten and Ferl, 1994). The model is based on two general hypotheses: (a) LREs are multipartite cis-regulatory elements whose overall activity is the result of synergistic interactions between cognate transcription factors, and (b) modular components of LREs consist of two general classes: "light-specific" elements and "coupling" elements.

Light-specific elements are presumably bound by transcription factors that are targeted (directly or indirectly) by light-signal transduction systems, and they confer photoresponsiveness to the overall modular complex by either activating or repressing gene expression. Coupling elements would bind regulatory proteins that couple the light stimulus to transcription in a cell- and developmental stage-specific way and perhaps also determine the relative strength of photodependent gene expression.

We propose that the IB/LAMP-related modules are the light-specific elements from PhANG LREs and that the H-box-related motifs are the analogous elements from PheMAG LREs. Elements such as the CCAAT-, Gap-, and G-boxes, among others, probably function as coupling elements. In principle, cognate transcription factors of these elements may be cell-specific and/or developmentally regulated and, in addition, dependent on signal transduction systems other than those activated by light. Therefore, the organization of LREs suggested in this model could help to explain both why genes that are dependent on the same phototransduction pathway display different spatial and temporal patterns of expression, and how light signals are coupled to other exogenous or endogenous stimuli such as hormonal, metabolic, or chloroplast-derived signals, making it possible to have more versatile control of photoregulated gene expression.
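The two-class organization proposed in this model can be restated as a small data model. The Python sketch below is only a schematic summary of the hypothesis: the class and function names are hypothetical, module names are used as plain labels rather than sequences, and the assignment of labels to classes simply follows the proposals made above (IB/LAMP-related and H-box-related modules as light-specific elements; CCAAT-, Gap-, and G-boxes as coupling elements).

```python
from dataclasses import dataclass
from typing import List, Optional

# Module classes as proposed in the model; keys are labels, not sequences.
LIGHT_SPECIFIC = {"IB": "PhANG", "LAMP": "PhANG", "H-box": "PheMAG"}
COUPLING = {"CCAAT-box", "Gap-box", "G-box"}


@dataclass
class CompositeLRE:
    name: str
    modules: List[str]

    def light_specific(self) -> List[str]:
        """Modules that the model treats as targets of light signalling."""
        return [m for m in self.modules if m in LIGHT_SPECIFIC]

    def coupling(self) -> List[str]:
        """Modules that the model treats as coupling elements."""
        return [m for m in self.modules if m in COUPLING]

    def superfamily(self) -> Optional[str]:
        """Superfamily implied by the light-specific modules, if unambiguous."""
        kinds = {LIGHT_SPECIFIC[m] for m in self.light_specific()}
        return kinds.pop() if len(kinds) == 1 else None


# rbcS-CMA5 is described in the text as an (IB)-(G-box) arrangement.
unit = CompositeLRE("rbcS-CMA5", ["IB", "G-box"])
print(unit.superfamily(), unit.light_specific(), unit.coupling())
```

In this representation a unit without any light-specific module has no implied superfamily, which mirrors the model's claim that coupling elements alone do not confer photoresponsiveness.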

There is still a query relevant to the composite LRE model, namely the actual function of the GT-1 factor-binding sites in transcriptional photoregulation. A number of these elements are found in both PhANG and PheMAG promoters, and some of them have been proposed to be involved in light regulation (Gilmartin et al., 1990). However, several studies have shown that deletion or mutation of GT-1 sites does not affect the photoresponsiveness of some PhANG promoters (reviewed by Terzaghi and Cashmore, 1995), and it has been established that GT-1 elements that were found in bean chs25 and rice phyA promoters function as silencers and constitutive activating elements, respectively (Lawton et al., 1991; Dehesh et al., 1992). Moreover, it has been reported that GT-1-like factors interact with nonphotoresponsive promoters (Buchel et al., 1996). Direct evidence of the participation of the GT-1 sites in light-regulatory processes has been obtained only in the case of pea rbcS box II, a 14-bp element considerably larger than the defined 6-bp GT-1 core consensus (GGTTAA; Green et al., 1988) and that is a modular component of the rbcS-CMA3. A synthetic box II tetramer was able to confer photoresponsiveness to a truncated (-90) but not to a minimal (-46) cauliflower mosaic virus 35S promoter (Lam and Chua, 1990). It is interesting that the nuclear factor (GT-1) binding to such a box II tetramer is very similar in sequence specificity and several physicochemical properties to the GATA-binding factors IBF-2b and CGF-1 (Teakle and Kay, 1995). IBF-2b binds the IB element of the tomato nitrate reductase gene (Borello et al., 1993), and CGF-1 binds the three GATA motifs from Arabidopsis cab2-CMA1 region (Anderson et al., 1994). Therefore, it is plausible that diverse GT-1-like factors actually exist in plants, some of which would not be involved in photoregulation, whereas the specific GT-1 factor interacting in vivo with box II would be more related to the family of IB- or GATA-binding proteins that our model proposes are the targets of light-signal transduction systems. In this context, it could prove to be significant that box II of some rbcS genes from solanaceous plants, and similar elements in some cab genes, display a GATATTA core motif, suggesting a structural relation with more typical IB-related elements (see examples of rbcS-CMA3 and cab-CMA3 in Fig. 3; see also Manzara and Gruissem, 1988).

Are Composite LREs Positive or Negative Elements?

A hypothetical model for LRE function should consider the existence of contradictory evidence with regard to the basic mechanism underlying transcriptional photoregulation. Indeed, a set of experimental observations suggests that the latter process is positive in nature (i.e. regulatory factors activate transcription in light and become inactive in dark), whereas other lines of evidence indicate that a repression-in-dark mechanism is involved. For instance, several plant mutants exist in which PhANGs are expressed in the dark at levels that are similar to those in the light (reviewed by McNellis and Deng, 1995), and promoter regions from some of those genes have been shown to contain cis-acting elements that function as silencers in the absence of light (Kuhlemeier et al., 1987, 1989; Stockhaus et al., 1989). On the other hand, mutational studies from PhANG promoters have established a positive role in transcription for a variety of IB-related motifs, including some unequivocally defined as phytochrome-responsive elements (Gidoni et al., 1989; Donald and Cashmore, 1990; Schafner and Sheen, 1991; Kehoe et al., 1994; Anderson and Kay, 1995; Argüello-Astorga and Herrera-Estrella, 1995; I. Meier and W. Gruissem, unpublished data).

These paradoxical observations might be reconciled, at least theoretically, if a partial or full overlap of positive and negative regulatory elements is postulated. For example, the distinct IBFs might, in combination with other positive regulators, activate transcription in a light-independent manner. In the dark, however, nuclear regulators involved in repression of gene expression, such as some de-etiolated and constitutive photomorphogenic proteins (McNellis and Deng, 1995), could interact with IBFs, forming a repressor complex, which would become inactive once light is perceived (Fig. 7). Nevertheless, the possibility that light-modulated activating factors really exist cannot be excluded and should be examined.

[Figure 7 diagram. Recoverable labels: Light; Ca2+/calmodulin; Signal A; Signal B; Minimal Light-Responsive Unit; Positive Regulation Scenario; Negative Regulation Scenario.]

Figure 7. Simplified hypothetical scheme of events leading to transcriptional regulation of PhANGs and PheMAGs by phytochrome-associated transduction systems. Light is absorbed by this photoreceptor, and subsequently components of the associated transducing system target specific transcription factors, which become either active or inactive, according to their actual regulatory function. In a positive regulation scenario, both the IB/LAMP-binding factors (IBFs) and the H-box binding factors (HBFs) are activated by light and can then bind either alone or associated with other transcription factors (forming a heteromeric regulatory complex) at their DNA cognate sites (cis elements). If proper exogenous and/or endogenous signals are concurrent with the light stimulus, the photoresponsive unit can activate gene transcription. In a negative regulation scenario, IBFs and near-bounded factors functionally interact to activate transcription independently of light conditions. In the absence of light, however, de-etiolated/constitutive photomorphogenic-like regulators interact with the IBFs, forming a multicomponent repressor complex. Phototransduction pathways have been represented according to the model proposed by Bowler et al. (1994b) for phytochrome control of cab and chs genes.
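The logic of the two scenarios can be made explicit with a toy Boolean sketch. It is an illustrative simplification under stated assumptions, not a quantitative model: the function and parameter names are hypothetical, the transcription factors are reduced to on/off states, and the "repressor" stands for the de-etiolated/constitutive photomorphogenic-like regulators mentioned in the text.

```python
def positive_scenario(light, other_signals=True):
    """Positive regulation: factors are active only in light.

    Assumption: IBFs/HBFs are switched on by light, and transcription also
    needs the concurrent exogenous/endogenous signals.
    """
    factors_active = light
    return factors_active and other_signals


def negative_scenario(light, repressor_functional=True):
    """Negative regulation: activation is light independent.

    Assumption: a repressor complex forms only in the dark and only if the
    DET/COP-like regulator is functional; otherwise the gene is transcribed.
    """
    repressor_complex = (not light) and repressor_functional
    return not repressor_complex


if __name__ == "__main__":
    for light in (True, False):
        print(
            "light" if light else "dark",
            positive_scenario(light),
            negative_scenario(light),
            negative_scenario(light, repressor_functional=False),
        )
```

Under these assumptions both scenarios give the same wild-type behavior (expression in light, none in dark), which is why they are hard to distinguish from promoter studies alone. Only the negative scenario predicts dark expression when the repressor is lost, as in the mutants reviewed by McNellis and Deng (1995).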

Are Some CMAs Also Involved in Chloroplast-Dependent Expression of PhANGs?

Evidence assembled in this study indicates that the existence of several LRE-associated CMAs predates the monocotyledon-dicotyledon split event (between 130 and 110 million years ago according to recent estimates; reviewed by Crane et al., 1995) and in some cases even the divergence of gymnosperm-proangiosperm lineages. Since several gymnosperm PhANGs, including the cab-CMA1-containing Pinus thunbergii cab6 gene (Fig. 2), are apparently not regulated by light at the transcriptional level (Kojima et al., 1992, 1994), the question arises of whether the primary function of ancestral CMAs was photoregulation. In this context, observations relating PhANG CMAs with photosynthetic cell-specific expression may prove to be important. In fact, it is well established that PhANG transcription is dependent on the developmental stage of the plastids and that specific promoter segments are involved in the response to chloroplast-derived signals mediating such control (reviewed by Taylor, 1989). It has been observed that these sequences generally overlap with light-responsive promoter regions (Simpson et al., 1986; Bolle et al., 1994, 1996). Accordingly, we have found that short sequences, including either the tobacco rbcS-CMA5 or the pea AB80 cab-CMA2, mediate not only light-regulated but also chloroplast-dependent gene expression (Argüello-Astorga and Herrera-Estrella, 1995; G.R. Argüello-Astorga, unpublished data). Together, these observations suggest the remarkable possibility that some of the PhANG CMAs identified here might also be targets for plastid-signal transduction systems, and they allow us to speculate that the primary function of ancestral PhANG CMAs was probably related to the coordination of gene expression between nuclear and chloroplast genomes rather than to light regulation.

ACKNOWLEDGMENTS

We would like to thank Luisa López-Ochoa, Plinio Guzman, and Roberto Ruiz-Medrano for critical reading of this manuscript and many helpful suggestions. The authors are members of the Sistema Nacional de Investigadores, México.

Received April 18, 1996; accepted August 12, 1996. Copyright Clearance Center: 0032-0889/96/112/1151/16.


LITERATURE CITED

Ahmad M, Cashmore AR (1993) HY4 gene of A. thaliana encodes a protein with characteristics of a blue-light photoreceptor. Nature 366: 162-166

Anderson SL, Kay SA (1995) Functional dissection of circadian clock- and phytochrome-regulated transcription of the Arabidopsis CAB2 gene. Proc Natl Acad Sci USA 92: 1500-1504

Anderson SL, Teakle GR, Martino-Catt SJ, Kay SA (1994) Circadian clock- and phytochrome-regulated transcription is conferred by a 78 bp cis-acting domain of the Arabidopsis CAB2 promoter. Plant J 6: 457-470

Argüello G, García-Hernández E, Sánchez M, Gariglio P, Herrera-Estrella LR, Simpson J (1992) Characterization of DNA sequences that mediate nuclear protein binding to the regulatory region of the Pisum sativum (pea) chlorophyll a/b binding protein gene AB80: identification of a repeated heptamer motif. Plant J 2: 301-309

Argüello-Astorga GR, Guevara-González RG, Herrera-Estrella LR, Rivera-Bustamante RF (1994) Geminivirus replication origins have a group-specific organization of iterative elements: a model for replication. Virology 203: 90-100

Argüello-Astorga GR, Herrera-Estrella LR (1995) Theoretical and experimental delimitation of minimal photoresponsive elements in Cab and rbcS genes. In M Terzi, R Cella, A Falavigna, eds, Current Issues in Plant Molecular and Cellular Biology. Kluwer Academic, Dordrecht, The Netherlands, pp 501-511

Avise J (1994) Molecular Markers, Natural History and Evolution. Chapman and Hall, New York

Batschauer A, Gilmartin PM, Nagy F, Schafer E (1994) The molecular biology of photoregulated genes. In RE Kendrick, GHM Kronenberg, eds, Photomorphogenesis in Plants, Ed 2. Kluwer Academic, Dordrecht, The Netherlands, pp 559-599

Block A, Dangl JL, Hahlbrock K, Schulze-Lefert P (1990) Functional borders, genetic fine structure, and distance requirements of cis elements mediating light-responsiveness of the parsley chalcone synthase promoter. Proc Natl Acad Sci USA 87: 5387-5391

Bolle C, Kusnetsov VV, Herrmann RG, Oelmüller R (1996) The spinach AtpC and AtpD genes contain elements for light-regulated, plastid-dependent and organ-specific expression in the vicinity of the transcription start sites. Plant J 9: 21-30

Bolle C, Sopory S, Lübberstedt T, Klösgen RB, Herrmann RG, Oelmüller R (1994) The role of plastids in the expression of nuclear genes for thylakoid proteins studied with chimeric β-glucuronidase gene fusions. Plant Physiol 105: 1355-1364

Borello U, Ceccarelli E, Giuliano G (1993) Constitutive, light-responsive and circadian clock-responsive factors compete for the different I-box elements in plant light-regulated promoters. Plant J 4: 611-619

Bowler C, Chua N-H (1994) Emerging themes of plant signal transduction. Plant Cell 6: 1529-1541

Bowler C, Neuhaus G, Yamagata H, Chua N-H (1994a) Cyclic GMP and calcium mediate phytochrome phototransduction. Cell 77: 73-81

Bowler C, Yamagata H, Neuhaus G, Chua N-H (1994b) Phytochrome signal transduction pathways are regulated by reciprocal control mechanisms. Genes Dev 8: 2188-2202

Buchel AS, Molenkamp R, Bol JF, Linthorst HJM (1996) The PR-1a promoter contains a number of elements that bind GT-1 like nuclear factors with different affinity. Plant Mol Biol 30: 493-504

Buzby JS, Yamada T, Tobin EM (1990) A light-regulated DNA-binding activity interacts with a conserved region of a Lemna gibba rbcS promoter. Plant Cell 2: 805-814

Carrasco P, Manzara T, Gruissem W (1993) Developmental and organ-specific changes in DNA-protein interactions in the tomato rbcS3B and rbcSA3C promoter regions. Plant Mol Biol 21: 1-15

Carré I, Kay SA (1995) Multiple DNA-protein complexes at a circadian-regulated promoter element. Plant Cell 7: 2039-2051

Castresana C, Garcia-Luque I, Alonso E, Malik VS, Cashmore AR (1988) Both positive and negative regulatory elements mediate expression of a photoregulated CAB gene from Nicotiana plumbaginifolia. EMBO J 7: 1929-1936

Conley TR, Park SC, Kwon HB, Peng HP, Shih MC (1994) Characterization of cis-acting elements in light-regulation of the nuclear gene encoding the A subunit of chloroplast isozymes of glyceraldehyde 3-phosphate dehydrogenase from Arabidopsis thaliana. Mol Cell Biol 14: 2523-2533

Crane PR, Friis EM, Pedersen KR (1995) The origin and early diversification of angiosperms. Nature 374: 27-33

Dehesh K, Hung H, Tepperman JM, Quail PH (1992) GT2: a transcription factor with twin autonomous DNA-binding domains of closely related but different target sequence specificity. EMBO J 11: 4131-4144

deVetten NC, Ferl RJ (1994) Transcriptional regulation of environmentally inducible genes in plants by an evolutionary conserved family of G-box binding factors. Int J Biochem 16: 1055-1068

Donald RGK, Cashmore AR (1990) Mutation in either G box or I box sequences profoundly affects expression from the Arabidopsis thaliana rbcS-IA promoter. EMBO J 9: 1717-1726

Durbin ML, Learn GH, Huttley GA, Clegg MT (1995) Evolution of the chalcone synthase gene family in the genus Ipomoea. Proc Natl Acad Sci USA 92: 3338-3342

Furuya M (1993) Phytochromes: their molecular species, gene families and functions. Annu Rev Plant Physiol Plant Mol Biol 44: 617-645

Gallie DR (1993) Posttranscriptional regulation of gene expression in plants. Annu Rev Plant Physiol Plant Mol Biol 44: 77-105

Gidoni D, Brosio P, Bond-Nutter D, Bedbrook J, Dunsmuir P (1989) Novel cis-acting elements in petunia Cab gene promoters. Mol Gen Genet 215: 337-344

Gilmartin PM, Chua N-H (1990) Localization of a phytochrome-responsive element within the upstream region of pea rbcS-3A. Mol Cell Biol 10: 5565-5568

Gilmartin PM, Sarokin L, Memelink J, Chua N-H (1990) Molecular light-switches for plant genes. Plant Cell 2: 369-378

Giuliano G, Pichersky E, Malik VS, Timko MP, Scolnick PA, Cashmore AR (1988) An evolutionarily conserved protein binding sequence upstream of a plant light-regulated gene. Proc Natl Acad Sci USA 85: 7089-7093

Görz A, Schäfer W, Hirasawa E, Kahl G (1988) Constitutive and light-induced DNAseI hypersensitive sites in the rbcS genes of pea (Pisum sativum). Plant Mol Biol 11: 561-573

Gotor C, Romero LC, Inouye K, Lam E (1993) Analysis of three tissue-specific elements from the wheat Cab-1 enhancer. Plant J 3: 509-518

Green PJ, Kay SA, Chua N-H (1987) Sequence-specific interactions of a pea nuclear factor with light-responsive elements upstream of the rbcS-3A gene. EMBO J 6: 2543-2549

Green PJ, Yong M-H, Cuozzo M, Kano-Murakami Y, Silverstein P, Chua N-H (1988) Binding site requirements for pea nuclear protein GT-1 correlate with sequences required for light-dependent transcriptional activation of the rbcS-3A gene. EMBO J 7: 4035-4044

Grob U, Stuber K (1987) Discrimination of phytochrome dependent light-inducible from non-light-inducible plant genes. Prediction of a common light-responsive element (LRE) in phytochrome dependent light-inducible genes. Nucleic Acids Res 15: 9957-9972

Gumucio DL, Heilstedt-Williamson H, Gray TA, Tarle SA, Shelton DA, Tagle DA, Slightom JL, Goodman M, Collins FS (1992) Phylogenetic footprinting reveals a nuclear protein which binds to silencer sequences in the human gamma and epsilon globin genes. Mol Cell Biol 12: 4919-4929

Ha S-B, An G (1988) Identification of upstream regulatory elements involved in the developmental expression of the Arabidopsis thaliana cab1 gene. Proc Natl Acad Sci USA 85: 8017-8021

Hahlbrock K, Scheel D (1989) Physiology and molecular biology of phenylpropanoid metabolism. Annu Rev Plant Physiol Plant Mol Biol 40: 347-369

Hill CS, Treisman R (1995) Transcriptional regulation by extracellular signals: mechanisms and specificity. Cell 80: 199-211


Kaiser T, Emmler K, Kretsch T, Weisshaar B, Schafer E, Batschauer A (1995) Promoter elements of the mustard CHS1 gene are sufficient for light regulation in transgenic plants. Plant Mol Biol 28: 219-229

Kehoe DM, Degenhardt J, Winicov I, Tobin EM (1994) Two 10-bp regions are critical for phytochrome regulation of a Lemna gibba Lhcb gene promoter. Plant Cell 6: 1123-1134

Kenigsbuch D, Tobin EM (1995) A region of the Arabidopsis Lhcb1*3 promoter that binds to CA-1 activity is essential for high expression and phytochrome regulation. Plant Physiol 108: 1023-1027

Kojima K, Sasaki S, Yamamoto N (1994) Light-independent and tissue-specific expression of a reporter gene mediated by the pine cab-6 promoter in transgenic tobacco. Plant J 6: 591-596

Kojima K, Yamamoto N, Sasaki S (1992) Structure of the pine (Pinus thunbergii) chlorophyll a/b binding protein gene expressed in the absence of light. Plant Mol Biol 19: 405-410

Kuhlemeier C, Cuozzo M, Green P, Goyvaerts E, Ward K, Chua N-H (1988) Localization and conditional redundancy of regulatory elements in rbcS-3A, a pea gene encoding the small subunit of ribulose-bisphosphate carboxylase. Proc Natl Acad Sci USA 85: 4662-4666

Kuhlemeier C, Fluhr R, Green PJ, Chua N-H (1987) Sequences in the pea rbcS-3A gene have homology to constitutive mammalian enhancers but function as negative regulatory elements. Genes Dev 1: 247-255

Kuhlemeier C, Strittmatter G, Ward K, Chua N-H (1989) The pea rbcS-3A promoter mediates light-responsiveness but not organ specificity. Plant Cell 1: 471-478

Kwon H-B, Park S-C, Peng H-P, Goodman HM, Dewdney J, Shih MC (1994) Identification of a light-responsive region of the nuclear gene encoding the B subunit of chloroplast glyceraldehyde 3-phosphate dehydrogenase from Arabidopsis thaliana. Plant Physiol 105: 357-367

Lam E, Chua N-H (1990) GT-1 binding site confers light-responsive expression in transgenic tobacco. Science 248: 471-474

Lawton MA, Dean SM, Dron M, Kooter JM, Kragh KM, Harrison MS, Yu L, Tanguay L, Dixon RA, Lamb CJ (1991) Silencer region of a chalcone synthase promoter contains multiple binding sites for a factor, SBF-1, closely related to GT-1. Plant Mol Biol 16: 235-249

Lipphardt S, Brettschneider R, Kreuzaler F, Schell J, Dangl JL (1988) UV-inducible transient expression in parsley protoplasts identifies regulatory cis-elements of a chimeric Antirrhinum majus chalcone synthase gene. EMBO J 7: 4027-4033

Loake GJ, Faktor O, Lamb CJ, Dixon RA (1992) Combination of H-box [CCTACC(N)7CT] and G-box (CACGTG) cis-elements is necessary for feed-forward stimulation of a chalcone synthase promoter by the phenylpropanoid-pathway intermediate p-coumaric acid. Proc Natl Acad Sci USA 89: 9230-9234

Logemann E, Parniske M, Hahlbrock K (1995) Modes of expression and common structural features of the complete phenylalanine ammonia-lyase gene family in parsley. Proc Natl Acad Sci USA 92: 5905-5909

Lois R, Dietrich A, Hahlbrock K, Schulz W (1989) A phenylalanine ammonia-lyase gene from parsley: structure, regulation and identification of elicitor and light responsive cis-acting elements. EMBO J 8: 1641-1648

Lübberstedt T, Bolle CEH, Sopory S, Flieger K, Herrmann RG, Oelmüller R (1994a) Promoters from genes for plastid proteins possess regions with different sensitivities toward red and blue light. Plant Physiol 104: 997-1006

Lübberstedt T, Oelmüller R, Wanner G, Herrmann RG (1994b) Interacting cis elements in the plastocyanin promoter from spinach ensure regulated high-level expression. Mol Gen Genet 242: 602-613

Manzara T, Carrasco P, Gruissem W (1991) Developmental and organ-specific changes in promoter DNA-proteins interactions in the tomato rbcS gene family. Plant Cell 3: 1305-1316

Manzara T, Gruissem W (1988) Organization and expression of the genes encoding ribulose-1,5-bisphosphate carboxylase in higher plants. Photosynth Res 16: 117-139

McNellis TW, Deng X-W (1995) Light control of seedling morphogenetic pattern. Plant Cell 7: 1749-1761

Meier I, Callan KL, Fleming AJ, Gruissem W (1995) Organ-specific differential regulation of a promoter subfamily for the ribulose-1,5-bisphosphate carboxylase/oxygenase small subunit genes in tomato. Plant Physiol 107: 1105-1118

Miner JN, Yamamoto KR (1991) Regulatory crosstalk at composite response elements. Trends Biochem Sci 16: 423-426

Mitra A, Choi HK, An G (1989) Structural and functional analyses of Arabidopsis thaliana chlorophyll a/b-binding protein (cab) promoters. Plant Mol Biol 12: 169-179

Nagy F, Fluhr R, Kuhlemeier C, Kay S, Boutry M, Green P, Poulsen C, Chua N-H (1986) Cis-acting elements for selective expression of two photosynthetic genes in transgenic plants. Philos Trans R Soc Lond Biol Sci 314: 493-500

Neuhaus G, Bowler Ch, Kern R, Chua N-H (1993) Calcium/calmodulin-dependent and -independent phytochrome signal transduction pathways. Cell 73: 937-952

Orozco BM, Ogren WL (1993) Localization of light-inducible and tissue-specific regions of the spinach ribulose-1,5-bisphosphate carboxylase/oxygenase (rubisco) activase promoter in transgenic tobacco plants. Plant Mol Biol 23: 1129-1138

Piechulla B, Kellman JW, Pichersky E, Schwartz E, Forster HH (1991) Determination of steady-state mRNA levels of individual chlorophyll a/b binding protein genes of the tomato cab gene family. Mol Gen Genet 230: 413-422

Poulsen C, Chua N-H (1988) Dissection of 5' upstream sequences for selective expression of the Nicotiana plumbaginifolia rbcS-8B gene. Mol Gen Genet 214: 16-23

Quail PH (1994) Phytochrome genes and their expression. In RE Kendrick, GHM Kronenberg, eds, Photomorphogenesis in Plants, Ed 2. Kluwer Academic, Dordrecht, The Netherlands, pp 71-104

Rocholl M, Talke-Messerer C, Kaiser T, Batschauer A (1994) Unit 1 of the mustard chalcone synthase promoter is sufficient to mediate light responses from different photoreceptors. Plant Sci 97: 189-198

Rogers JC, Rogers SW (1992) Definition and functional implications of gibberellin and abscisic acid cis-acting hormone response complexes. Plant Cell 4: 1443-1451

Rolfe SA, Tobin EM (1991) Deletion analysis of a phytochrome-regulated monocot rbcS promoter in a transient assay system. Proc Natl Acad Sci USA 88: 2683-2686

Sarokin LP, Chua N-H (1992) Binding sites for two novel phosphoproteins, 3AF5 and 3AF3, are required for rbcS-3A expression. Plant Cell 4: 473-483

Schafner AR, Sheen J (1991) Maize rbcS promoter activity depends on sequence elements not found in dicot rbcS promoters. Plant Cell 3: 997-1012

Sheen J (1991) Molecular mechanisms underlying the differential expression of maize pyruvate, orthophosphate dikinase genes. Plant Cell 3: 225-245

Schulze-Lefert P, Becker-André M, Schulz W, Hahlbrock K, Dangl JL (1989) Functional architecture of the light-responsive chalcone synthase promoter from parsley. Plant Cell 1: 707-714

Silverthorne J, Tobin E (1984) Demonstration of transcriptional regulation of specific genes by phytochrome action. Proc Natl Acad Sci USA 81: 1112-1116

Simpson J, Van Montagu M, Herrera-Estrella LR (1986) Photosynthesis-associated gene families: differences in response to tissue-specific and environmental factors. Science 233: 34-38

Stockhaus J, Schell J, Willmitzer L (1989) Identification of enhancer elements in the upstream region of the nuclear photosynthetic gene ST-LS1. Plant Cell 1: 805-813

Sun L, Doxsee RA, Harel E, Tobin EM (1993) CA-1, a novel phosphoprotein, interacts with the promoter of the cab 140 gene in Arabidopsis and is undetectable in det1 mutant seedlings. Plant Cell 5: 109-121

Taylor WC (1989) Regulatory interactions between nuclear and plastid genomes. Annu Rev Plant Physiol Plant Mol Biol 40: 211-233

Teakle GR, Kay SA (1995) The GATA-binding protein CGF-1 is closely related to GT-1. Plant Mol Biol 29: 1253-1266


Terzaghi WB, Cashmore AR (1995) Light-regulated transcription. Annu Rev Plant Physiol Plant Mol Biol 46: 445-474

Thompson WF, White MJ (1991) Physiological and molecular studies of light-regulated nuclear genes in higher plants. Annu Rev Plant Physiol Plant Mol Biol 42: 423-466

Tobin EM, Kehoe DM (1994) Phytochrome regulated gene expression. Semin Cell Biol 5: 335-346

Ulmasov T, Liu Z-B, Hagen G, Guilfoyle TJ (1995) Composite structure of auxin response elements. Plant Cell 7: 1611-1623

Vorst O, van Dam F, Weisbeek P, Smeekens S (1993) Light-regulated expression of the Arabidopsis thaliana ferredoxin A gene involves both transcriptional and post-transcriptional processes. Plant J 3: 793-803

White MJ, Fristensky BW, Falconet D, Childs LC, Watson JC (1992) Expression of the chlorophyll-a/b-protein multigene family in pea (Pisum sativum L.). Planta 188: 190-198

White MJ, Kaufman LS, Horwitz BA, Briggs WR, Thompson WF (1995) Individual members of the Cab gene family differ widely in fluence response. Plant Physiol 107: 161-165

Wingender R, Rohrig H, Horicke C, Schell J (1990) cis-Regulatory elements involved in ultraviolet light regulation and plant defense. Plant Cell 2: 1019-1026

Wright WE, Funk WD (1993) Casting for multicomponent DNA-binding complexes. Trends Biochem Sci 18: 77-80

Yamada T, Tanaka Y, Sriprasertsak P, Kato H, Hashimoto T, Shimizu H, Shiraushi T (1992) Phenylalanine ammonia-lyase genes from Pisum sativum: structure, organ specific expression and regulation by fungal elicitor and suppressor. Plant Cell Physiol 33: 715-725

Yamamoto KR, Pearce D, Thomas J, Miner JN (1992) Combinatorial regulation at a mammalian composite response element. In SL McKnight, KR Yamamoto, eds, Transcriptional Regulation, Vol 2. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, pp 1169-1192