Identification of genes differentially over-expressed in lung squamous cell carcinoma using combination of cDNA subtraction and microarray analysis


Tongtong Wang*,1, Deborah Hopkins1, Cheryl Schmidt1, Sandra Silva1, Raymond Houghton1, Hiroshi Takita2, Elizabeth Repasky3 and Steven G Reed1

1Department of Tumor Antigen Discovery, Corixa Corporation, 1124 Columbia Street, Seattle, Washington, WA 98104, USA; 2Department of Surgery at Roswell Park Cancer Center, Buffalo, New York, NY 14263, USA; 3Department of Immunology at Roswell Park Cancer Center, Bldg. CCC Room 411, ELM and Carlton Streets, Buffalo, New York, NY 14263, USA

In order to develop effective vaccine products against human cancer, we are interested in identifying genes over-expressed in tumor cells. Through a combination of cDNA library subtraction and microarray technology, we identified seventeen genes preferentially expressed in lung squamous cell carcinoma, including four novel genes. To date, the expression profiles of these genes have been confirmed by Northern and/or real-time RT-PCR analysis, and several genes were also found to be expressed in head and neck squamous tumors. Thus, these combined methods represent a high-throughput approach for identifying tumor-specific genes. Furthermore, the characterization of these genes reported here will allow them to be exploited for their diagnostic, prognostic, and therapeutic potential, including immunotherapy and antibody-based anticancer therapy. Oncogene (2000) 19, 1519–1528.

Keywords: lung squamous cell carcinoma (LSCC); microarray; cDNA subtraction; connexin 26; plakophilin 1; keratin

Introduction

Lung cancer is a leading cause of death among all types of cancer. Approximately one million people worldwide die from lung cancer each year, while six million people die of all types of cancer. Though the incidence of this disease among men has decreased in recent years, the number of cases in women has increased significantly. As a result, the death rate for women from lung cancer has surpassed that from breast cancer. There are two major types of lung cancer: non-small cell lung carcinoma (NSCLC), which comprises 80% of total incidence, and small cell lung carcinoma (SCLC), which accounts for 20% of all cases. The 5-year survival rate for either category is less than 10%. NSCLC includes squamous cell carcinoma (LSCC), adenocarcinoma, and large cell carcinoma. Current treatment for NSCLC patients includes surgery if the tumor is confined and/or a combination of chemotherapy and radiation therapy. Unfortunately, at the time of lung cancer diagnosis, the tumor has often become disseminated. A lack of dependable markers for early diagnosis and treatment has impeded the management of lung cancer. Although several genes have been reported and tested as diagnostic and prognostic markers for lung cancer, e.g. carcinoembryonic antigen (CEA), urokinase plasminogen activator, squamous cell carcinoma antigen (SCC), cytokeratin 19 fragment (CYFRA 21-1) (Morita et al., 1998; Pastor et al., 1997; Brechot et al., 1997), and PGP 9.5 (Hibi et al., 1999), there is room for much improvement in this area.

The identification of tumor markers is also important as such markers may be developed as antigens for tumor therapy. To date, melanoma antigens have been isolated primarily through immunological identification (Robbins and Kawakami, 1996). Using stably or transiently transfected cDNA libraries, antigens that specifically stimulated tumor-reactive cytotoxic T-cells have been identified on antigen presenting cells expressing appropriate MHC class I molecules. These melanoma antigens are either tumor-specific, with normal tissue expression limited to testes and placenta, such as MAGE-1 and MAGE-3, or tissue-specific, with lower level expression in normal melanocytes, like gp-100 and MART-1 (Robbins and Kawakami, 1996). A well-characterized non-melanoma antigen is Her2/Neu, which is not tumor- or tissue-specific but is often over-expressed in breast and ovarian carcinoma. Her2/Neu derived peptides have been shown to be recognized by breast and ovarian cytotoxic T-cells (Peoples et al., 1995; Fisk et al., 1995). Along this line, we are seeking approaches for identifying genes that are over-expressed in lung tumors. These genes can then be evaluated as potential immunotherapeutic targets as determined by their abilities to generate antigen-specific cytotoxic T-cells capable of recognizing tumor cells.

Recent developments in cDNA microarray technology (Shena et al., 1995) have allowed the tissue expression profiles of thousands of genes to be compared simultaneously (DeRisi et al., 1996, 1997; Lyer et al., 1999). The limitation of this technology, however, is that genes with abundant messages will dominate the corresponding cDNAs to be arrayed, limiting the representation of genes expressed at lower levels. To increase the potential for identifying both abundant and less abundant messages expressed in lung tumors, we combined cDNA subtractive methodology with microarray technology. Here we report the identification and characterization of 17 genes that are over-expressed in LSCC relative to normal tissues. To our knowledge, this is the first report of extensive analysis and characterization of a large number of differentially expressed lung squamous cell carcinoma genes. The report of these genes will facilitate the critical assessment of their diagnostic as well as therapeutic potential.

Oncogene (2000) 19, 1519–1528 © 2000 Macmillan Publishers Ltd All rights reserved 0950–9232/00 $15.00

www.nature.com/onc

*Correspondence: T Wang
Received 13 October 1999; revised 11 January 2000; accepted 14 January 2000

Results

Generation and characterization of LSCC specific cDNA libraries

To enrich for genes preferentially expressed in lung squamous cell carcinomas, we generated three lung squamous tumor specific cDNA libraries, referred to as LST-S1, LST-S2, and LST-S3 (for details, see Materials and methods). In the LST-S1 subtracted cDNA library, we repeatedly recovered genes such as keratin isoform 6 and 58 kD type II keratin. As keratinization is one of the major characteristics of all squamous cell carcinomas, recovery of keratin and keratin-related genes indicates that the subtraction successfully enriched for genes differentially expressed in lung squamous tumors. However, the complexity of the library was apparently low, as evidenced by the repeated isolation of the same clones. To solve this problem, a second subtracted library, LST-S2, was constructed using an additional driver comprising a pool of five genes that are highly enriched in LST-S1. The LST-S2 library contained ~20 000 independent clones. DNA from 300 randomly picked clones was purified and sequenced to estimate the complexity of the LST-S2 library. Among 227 clones sequenced, 114 were unique and 57 were novel (with no EMBL/GenBank hits). Similarly, LST-S3 was generated using a pool of 20 abundant genes from LST-S1 and LST-S2 as additional driver DNA, and roughly 700 clones were recovered. Sequence analysis of 20 clones revealed eight additional novel clones and only two cDNAs that overlapped with sequences from LST-S2. These results suggest that by including abundant tester-specific genes in the driver DNA, we were able to enrich for LSCC specific genes that are expressed at lower levels, and thus increased the complexity of the subtracted cDNA library.
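The complexity estimate for LST-S2 comes down to simple proportions of the sequenced clones. The following is a minimal sketch that uses only the counts reported above; the function name and the returned keys are illustrative and do not come from the study.

```python
def complexity_summary(sequenced: int, unique: int, novel: int) -> dict:
    """Summarize library complexity as fractions of the sequenced clones.

    Illustrative helper (not from the paper): 'unique' counts distinct
    sequences, 'novel' counts sequences without database hits.
    """
    if sequenced <= 0:
        raise ValueError("sequenced must be positive")
    return {
        "unique_fraction": unique / sequenced,
        "novel_fraction": novel / sequenced,
        "novel_among_unique": novel / unique if unique else 0.0,
    }


# Counts reported for the LST-S2 library: 227 sequenced, 114 unique, 57 novel.
lst_s2 = complexity_summary(sequenced=227, unique=114, novel=57)
for name, value in lst_s2.items():
    print(f"{name}: {value:.2f}")
```

With the reported counts, about half of the sequenced LST-S2 clones were unique and about a quarter were novel.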

Identification of LSCC specific antigens

Over 3200 clones from the LST-S2 and LST-S3 libraries were pre-screened by colony hybridization using a pool of the eight most frequently identified genes to minimize the redundancy of the clones to be analysed. Roughly 700 clones were eliminated by this step and the remaining 2500 cDNA clones were PCR amplified. Clones with no insert, a small insert (less than 200 bp), or multiple inserts were further eliminated; a final set of 2002 cDNA clones was arrayed and 25 sets of these arrays were fabricated. The expression profiles of these cDNA clones were analysed using 23 pairs of probes, and Table 1 summarizes the probe identity, tissue sources, and the balanced coefficient for each pair of probes (Cy3/Cy5, see Materials and methods). The quality of the probes depends on the purity, the integrity and the quantity of the mRNA. In addition, the half-lives of the Cy3 and Cy5 cyanine dyes differ, and the two dyes may not be incorporated into the probes with equal efficiency. As a result, a balanced coefficient (Cy3/Cy5) is needed to normalize the hybridization signals for any given pair of probes. This can be achieved in two ways. Ideally, the Cy3 and Cy5 signals can be normalized by hybridizing the same cDNA elements with a pair of probes labeled with Cy3 and Cy5 and then, in reverse, with Cy5 and Cy3, respectively. However, this doubles the amount of work and the amount of RNA used for probe synthesis, which can be very limited. Alternatively, we have chosen to balance the Cy3 and Cy5 signals using the entire set of cDNA elements as references. Therefore, the balanced coefficient for each pair of probes is the ratio of the average fluorescence intensities of the Cy3 and Cy5 signals (see Figure 2 for an example). Since multiple probe pairs were used in this study and each hybridization is an independent experiment resulting from competitive hybridization of the two probes, the fluorescence signal cannot be compared quantitatively between probe pairs or chips. In other words, quantification applies only to cDNA elements within a chip hybridized with the same pair of probes.

Table 1 poly A+ probes used for microarray analyses

Probes | Cy3 | Cy5 | Balanced coefficient
1 | Normal lung (8009N) | Normal skin (S5) | 0.38
2 | Lung squamous tumor (96A) | Normal lung (CT-2) | 1.5
3 | Lung pleural effusion (86-52) | Normal lymph nodes (CT6) | 1.06
4 | Colon tumor (S18) | Normal colon (I1) | 0.23
5 | Lung squamous tumor (9688T) | Normal liver (CT1) | 1.95
6 | Lung squamous tumor (Pooled) | Normal pancreas (S2) | 2.1
7 | Bronchioloalveolar adenocarcinoma | Normal breast (S73) | 0.35
8 | Lung squamous tumor (9681T) | Normal heart (CT5) | 1.51
9 | Lung pleural effusion (86-52) | Normal bone marrow (CT4) | 0.84
10 | Bronchioloalveolar adenocarcinoma | Normal large intestine (S55) | 0.48
11 | Lung squamous tumor (9688T) | Normal kidney (CT9) | 1.35
12 | Lung squamous tumor (Pooled) | Normal stomach (S6) | 1.47
13 | Lung squamous tumor (Pooled) | Normal lung (NL873) | 2.71
14 | Lung squamous tumor (96A) | Resting PBMC (S39) | 1.24
15 | Lung squamous tumor (96A) | Normal brain (CT2) | 1.76
16 | Lung adenocarcinoma (9680T) | Normal small intestine (CT10) | 0.35
17 | Lung adenocarcinoma (8009T) | Normal bladder (S9) | 4.77
18 | Lung squamous tumor (9681T) | Normal salivary gland (CT8) | 4.99
19 | Lung adenocarcinoma (8009T) | Matched normal lung (8009N) | 3.65
20 | Lung squamous tumor (9681T) | Matched normal lung (9681N) | 3.83
21 | Lung squamous tumor (96A) | Normal lung (CT-1) | 48*
22 | Lung squamous tumor (9681T) | Normal lung (CT-2) | 4.45
23 | Lung squamous tumor (9688T) | Matched normal lung (9688N) | 2.98

200 ng poly A+ RNA was used to generate a single probe. The RNA was reverse transcribed and labeled with Cy3 or Cy5 fluorescent tagged nucleotides. The balanced coefficient is calculated as the fluorescence ratio of Cy3 to Cy5 using all clones on the chip as references. All tumors listed above are primary tumors except the lung pleural effusion (86-52), which is metastatic lung adenocarcinoma. There are four lung squamous tumors, four lung adenocarcinoma specimens, one colon tumor, and a variety of other normal tissues that are either distal matched normal tissues of cancer patients or from a commercial source (Clontech, Palo Alto, CA, USA). *The balanced coefficient for this pair of probes is out of linear range and, as a result, the differential expression index no longer correlates with the fluorescence intensities. See Figure 2 for details
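The chip-wide balancing described above can be sketched in a few lines. The balanced coefficient follows the definition in the text (average Cy3 intensity divided by average Cy5 intensity over all elements on one chip). The signed fold convention in `balanced_fold` is an assumption made for illustration, since the exact formula is given in Materials and methods and not here. The intensities are made-up numbers, not data from the study.

```python
from statistics import mean


def balanced_coefficient(cy3, cy5):
    """Average Cy3 intensity divided by average Cy5 intensity for one chip."""
    return mean(cy3) / mean(cy5)


def balanced_fold(cy3_signal, cy5_signal, coefficient):
    """Signed fold difference for one cDNA element after balancing.

    ASSUMED convention (not taken from the paper): the Cy5 signal is scaled
    by the coefficient; a positive value means stronger in Cy3, a negative
    value means stronger in Cy5.
    """
    if cy3_signal <= 0 or cy5_signal <= 0 or coefficient <= 0:
        raise ValueError("signals and coefficient must be positive")
    scaled_cy5 = cy5_signal * coefficient
    if cy3_signal >= scaled_cy5:
        return cy3_signal / scaled_cy5
    return -(scaled_cy5 / cy3_signal)


# Hypothetical intensities for four elements hybridized with one probe pair.
cy3 = [1200.0, 300.0, 4500.0, 800.0]
cy5 = [400.0, 250.0, 900.0, 450.0]

coef = balanced_coefficient(cy3, cy5)
folds = [balanced_fold(a, b, coef) for a, b in zip(cy3, cy5)]
print(f"balanced coefficient: {coef:.2f}")
print([round(f, 2) for f in folds])
```

Because the coefficient is computed per chip, the resulting folds are only comparable among elements hybridized with the same probe pair, which matches the restriction stated in the text.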

The expression profiles of all cDNA elements on the chip are illustrated using a lung squamous tumor probe paired with a normal kidney probe (Figure 1a, probe pair 11) and a lung squamous tumor probe paired with a matched normal lung probe (Figure 1b, probe pair 20). A large number of clones (within the triangle areas) were found to be over-expressed in these lung squamous tumors. LSCC specific genes were scored based upon their expression in over 50% of lung squamous tumors, as compared with undetectable or limited expression in normal tissues. Upon sequence analysis, we identified 17 genes differentially over-expressed in lung squamous cell carcinoma, including 13 known genes and four novel cDNA sequences. The candidates, their corresponding genes (if known), and the frequency with which each gene was recovered are listed in Table 2. The fact that several genes were recovered multiple times independently through microarray technology demonstrates that this methodology is a reliable and reproducible tool for gene discovery.
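The scoring rule stated above (expression in over 50% of lung squamous tumors, with undetectable or limited expression in normal tissues) can be written as a small filter. This is a sketch under stated assumptions: the detection threshold and the number of normal tissues tolerated as "limited expression" are hypothetical parameters, and the expression values are invented for illustration.

```python
def is_lscc_candidate(tumor_levels, normal_levels,
                      detect_threshold=2.0, max_normal_hits=1):
    """Return True if a clone meets the LSCC scoring rule.

    detect_threshold and max_normal_hits are HYPOTHETICAL cutoffs; the text
    only states 'over 50% of tumors' and 'undetectable or limited' expression
    in normal tissues.
    """
    if not tumor_levels:
        raise ValueError("at least one tumor sample is required")
    tumor_hits = sum(level >= detect_threshold for level in tumor_levels)
    normal_hits = sum(level >= detect_threshold for level in normal_levels)
    expressed_in_majority = tumor_hits / len(tumor_levels) > 0.5
    return expressed_in_majority and normal_hits <= max_normal_hits


# Invented fold values: four lung squamous tumors, three normal tissues.
print(is_lscc_candidate([5.0, 4.0, 3.0, 0.5], [0.2, 0.1, 0.3]))  # 3 of 4 tumors
print(is_lscc_candidate([5.0, 4.0, 0.5, 0.5], [0.2, 0.1, 0.3]))  # only 2 of 4
```

With four tumors, the strict "over 50%" comparison means at least three must show expression, which matches the "three out of four" results reported for several genes.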

Expression profile of LSCC specific genes

Shown in Figure 2 is the dual color expression analysis of 17 LSCC genes including four novel candidates. The probe pair number, 1 through 23, is listed on the left of each panel (also refer to Table 1). Expression levels in probe wells are illustrated by both color images and the fold of differential expression. To confirm the expression profiles of these genes with independent methodologies, we examined a number of genes using Northern and/or quantitative real-time RT-PCR analysis. Representative genes analysed by Northern analysis (L513S, L514S, L519S and L520S) and RT-PCR analysis (L513S, L522S, L524S and L528S) are shown in Figures 3 and 4, respectively. To date, we have found that the expression profile of each gene revealed by cDNA microarray analysis correlates well with Northern and/or real-time RT-PCR analysis (Table 2). Furthermore, genes that are differentially expressed at lower levels, such as L522S, L524S and L531S, and are therefore difficult to detect by Northern analysis, were confirmed by the sensitive real-time RT-PCR assay (Table 2). For instance, L513S is over-expressed in three out of four lung squamous tumors by both Northern and RT-PCR analysis (Figures 3a and 4a), consistent with what we observed from cDNA microarray data (Figure 2a). In addition, L513S is also over-expressed in four out of four head and neck squamous tumors, as revealed by RT-PCR analysis (Figure 4a). Importantly, expression of L513S was consistently not detected in either lung adenocarcinoma or a panel of normal tissues including lung, brain, kidney, liver, breast, colon, bronchiole epithelial cells, and pancreas by three independent approaches (Figures 2b, 3a and 4a). Lower level expression of L513S in normal epithelial cell containing tissues such as skin (Figures 2b and 4a), esophagus, trachea, and soft palate (Figure 4a) is also well demonstrated. Similarly, L519S is expressed in normal breast and skin besides tumors, as shown by cDNA microarray and Northern analysis (Figures 2c and 3c) and as indicated by RT-PCR analysis (Table 2). It is noteworthy that inconsistent expression profiles were observed occasionally for genes that we have analysed. However, we believe that these are largely attributable to the sensitivity of the assay and/or the quality of individual samples. For instance, expression of L513S and L520S detected in normal skin and/or salivary gland by cDNA microarray analysis was not reflected in the Northern blots. This may be due to the lower sensitivity of Northern analysis and to the integrity of the RNA sample (notice that the ratio of 28S rRNA to 18S rRNA for the skin sample is <1, an indication of partial degradation). In conclusion, results of Northern analysis on L513S, L514S, L519S and L520S and real-time RT-PCR analysis on L513S, L522S, L524S and L528S showed consistent correlation with the cDNA microarray data, thus further validating the high-throughput approach for identifying differentially expressed genes using a combination of subtraction and cDNA microarray technologies.

Figure 1 The scatter plot of fluorescence intensity of 2002 cDNA clones hybridized with a lung squamous tumor probe (9688T) and a normal kidney probe (CT9) in (a) and with a pair of lung squamous tumor (9681T) and normal lung (9681N) probes in (b). The cDNA clones within the triangle boxes represent the cDNAs that are differentially over-expressed in these lung squamous tumors

Table 2 Summary of cDNA clones that are over-expressed in lung squamous cell carcinoma*

Clone | Identity | No. of hits | Northern | RT-PCR | References
L529S | Connexin 26 | 2 | ND | ND | Duflot-Dancer et al., 1997; Jamieson et al., 1998
L525S | Plakophilin 1 | 2 | ND | + | Moll et al., 1997
L527S | Cytokeratin 13 | 2 | ND | ND |
L513S | Pemphigus Vulgaris Antigen (PVA), gp130 | 2 | +++ | +++ | Amagai et al., 1991
L520S | SPRC (SPR1 Homologue) | 1 | ++ | ND | Robinson et al., 1994
L521S | SPR1 | 2 | ND | ND | Hu et al., 1998
L515S | IGF-beta2 | 1 | ND | ND | Occleston and Walker, 1993
L516S | Aldose Reductase Homolog | 1 | ND | ND |
L524S | Parathyroid Hormone-like Protein | 1 | Failed | +++ | Davidson et al., 1996
L526S | Ataxia Telangiectasia Group D Associated Protein | 1 | ND | ND | Morgan and Kastan, 1997; Khanna et al., 1998
L522S | ADH7 | 5 | Failed | ++ | Zgombic-Knight et al., 1995
L523S | KOC | 1 | +++ | +++ | Mueller-Pillasch et al., 1997
L528S | NMB (pMEL17 Homolog) | 14 | ND | +++ | Weterman et al., 1995
L519S | Novel | 10 | ++ | ++ |
L514S | Novel | 1 | +++ | +++ |
L530S | Novel | 2 | +++ | ++ |
L531S | Novel | 1 | Failed | ++ |

*Not listed in the table are the type II keratin genes, such as keratin isoform 6 and 58 kD type II keratin, that are highly expressed in lung squamous cell carcinoma. ND: Denotes that the type of analysis was not done; Failed denotes that Northern analyses were attempted with no success, probably due to low level of messages. +: Represents the degree of data consistency between Northern or real-time analysis and microarray analysis; +: Suggests inconsistent data observed between microarray and RT-PCR analysis

Discussion

As discussed earlier in the case of melanoma antigens, we expected to identify genes that were either tumor specific or squamous cell specific. Based upon the expression profiles of known genes and previous studies, the genes are grouped into three categories: genes that are tissue specific (Figure 2a), genes that are tumor specific (Figure 2b), and genes that are novel (Figure 2c). The significance of these results is discussed below and a summary of the discussion is given in Table 3.

Tissue-specific genes

Tissue specific genes include L529S, L525S, L527S, L513S, L520S and L521S. L529S, L525S and L527S are cytoskeletal components. L513S, L520S and L521S are squamous cell specific markers (Table 3). L529S encodes connexin 26, a gap junction protein. It is highly expressed in lung squamous tumor 9688T, and moderately over-expressed in two other tumors (9681T and 96A). Notably, however, lower level expression of connexin 26 is also detectable in normal skin, colon, liver, and stomach. Since connexin 26 participates in gap junction intercellular communication (GJIC), thought to be a cellular mechanism for tumor suppression (Duflot-Dancer et al., 1997), up-regulation of connexin 26 in LSCC was unexpected. Though the over-expression of connexin 26 in some breast tumors has been reported (Jamieson et al., 1998), it remains to be seen whether L529S is a mutated or non-mutated form of connexin 26. L525S is plakophilin 1, a desmosomal protein found in plaque-bearing adhering junctions of skin (Moll et al., 1997). mRNA for L525S is highly elevated in three out of four lung squamous tumors tested. Expression of L525S in normal skin may prevent it from being developed as a therapeutic vaccine due to potential autoimmune responses, although it could prove useful as a diagnostic marker. To our knowledge, this is the first report of over-expression of plakophilin 1 in LSCC. We have also repeatedly identified keratin 6 isoform and type II 58 kD keratin in our initial subtracted library, LST-S1 (Table 3). In addition, we have recovered cytokeratin 13, L527S. As discussed earlier, keratin and keratin-related genes, e.g. CYFRA 21-1, have been extensively documented as potential markers for lung cancer (Pastor et al., 1997; Brechot et al., 1997). Here we show that a variety of keratin-related genes are up-regulated in LSCC and that the overall expression profiles of these keratin genes are similar to that of L527S.

Figure 2 Microarray analysis of potential lung squamous tumor antigens. (a) Shows genes that are squamous tissue specific, as these genes are also expressed in normal squamous cell containing tissues such as skin (probe pair 1). (b) Illustrates tumor specific candidates, which include lung squamous tumor specific genes and genes that are also found in other tumor targets. (c) Denotes the expression profile of four novel genes, L514S, L519S, L530S and L531S. In all cases, the data are shown by both a color image of hybridization intensities (white being the strongest and dark being the weakest) and the fold of balanced differential expression between probes in the Cy3 and Cy5 channels. The probe pairs are from 1 to 23 (also refer to Table 1). '+' and '-' indicate the expression being stronger in the Cy3 and Cy5 channel, respectively (see Materials and methods for detail). The arrows in both (a) and (b) denote probe pair 21, for which the balanced coefficient is so distorted (see * in Table 1) that the balanced differential expression number is no longer correlated with the color image

L513S, L520S and L521S are not cytoskeletal components, but appear to be specific to squamous cells as they are present in normal skin (Figure 2a). Both cDNA microarray and Northern analysis show elevated expression of these genes in lung squamous tumors (Figures 2a and 3a,d), as also revealed by real-time RT-PCR analysis for L513S (Figure 4a). L513S encodes a protein that was first isolated as a pemphigus vulgaris antigen using patient auto-antibodies (Amagai et al., 1991). L520S (SPRC) (Robinson et al., 1994) and L521S (SPR1) belong to a family of small proline-rich proteins, representing markers for fully differentiated squamous cells. L521S has recently been described as a specific marker for lung squamous tumor (Hu et al., 1998). Although L520S and L521S have similar expression profiles in lung squamous tumors, they seem to differ in their normal tissue expression. L520S is up-regulated in normal salivary gland whereas expression of L521S is dominant in normal skin (Figure 3a). Interestingly, Northern analysis also showed that L520S is highly expressed in esophagus and trachea. These results suggest that L520S and L521S have distinct roles in cell differentiation and tumorigenicity.

Table 3 Summary of genes differentially expressed in LSCC

Tissue-specific genes
Cytoskeletal components | Differentiated squamous cell markers
L529S (connexin 26) | L513S (Pemphigus Vulgaris Antigen)
L525S (Plakophilin 1) | L520S (SPRC, SPR1 homolog)
L527S (cytokeratin 13) | L521S (SPR1)
Keratin 6 isoform* |
Type II 58 kD Keratin* |

Tumor-specific genes
Cell signaling components | Enzymes | Shared antigens
L515S (IGF-beta2) | L516S (Aldose reductase homolog) | L523S (KOC)
L524S (PTHrP) | L522S (ADH7) | L528S (NMB)
L526S (ATM) | |

*denotes keratin 6 isoform and type II 58 kD keratin that were repeatedly recovered from LST-S1

Figure 3 Northern analysis of L513S (a), L514S (b), L519S (c) and L520S (d). Each panel of RNA includes four lung squamous tumors (Squamous), four lung adenocarcinoma (Adenocarc.), three normal lung samples, and various other normal tissues


Tumor-specific genes

The tumor specific genes can be subdivided into three groups: cell signaling components (L515S, L524S, L526S), enzymes (L516S, L522S), and shared tumor antigens (L523S, L528S) (see Table 3). L515S is moderately expressed in lung squamous tumors and encodes IGF-beta2. The mechanism underlying the elevation of L515S messages is not clear, except that secretion of multiple growth factors including IGF-beta2 by a non-small cell carcinoma cell line has been documented (Occleston and Walker, 1993). Parathyroid hormone-related peptide (PTHrP), L524S, is known to cause humoral hypercalcaemia associated with malignant tumors such as leukemia, prostate and breast cancer. It is also believed that PTHrP is most commonly associated with squamous carcinoma of the lung and rarely with lung adenocarcinoma (Davidson et al., 1996). It is therefore not surprising to see elevated expression of L524S in both lung and head and neck squamous tumors (Figure 4c). Finally, L526S is a gene called ATM. Mutations in ATM cause a genetic disorder in humans known as ataxia telangiectasia, characterized by immunodeficiency, progressive cerebellar ataxia, radiosensitivity, and cancer predisposition (Morgan and Kastan, 1997). Recent studies have shown that ATM encodes a 350 kD protein containing a PI-3 kinase domain and is required for optimal function of p53 (Khanna et al., 1998). Specifically, ATM activates the p53-mediated cell cycle checkpoint through direct binding and phosphorylation of p53. As shown in Figure 2b, over-expression of ATM is observed in all LSCC tested. However, it is not known whether up-regulation of ATM is a cause or a result of LSCC. Because 40% of lung cancer is associated with p53 mutations (Hollstein et al., 1991), it is likely that over-expression of ATM is a result of compensation for loss of p53 function. On the other hand, the expression of ATM seems to be squamous cell specific, as indicated by its expression in normal skin (see probe pair 1 in Figure 2b). Importantly, expression of ATM was also detected in a metastatic but not in other lung adenocarcinomas (compare probe pairs 3 and 9 with probe pairs 7, 10, 16, 17 and 19). This result suggests that the ATM gene may also be involved in metastasis. Taken together, the expression profile of ATM and its association with p53 suggest that this molecule plays an important role in tumor biology, making it an interesting target for studying the mechanism of tumorigenicity of LSCC.

Figure 4 Real-time RT-PCR analysis of L513S (a), L522S (b), L524S (c), and L528S (d). Each gene is analysed using an identical panel of 36 cDNA samples comprising four lung squamous tumors (1-4), one each of lung adenocarcinoma, mesothelioma, and large cell carcinoma (5-7), four head and neck squamous tumors (8-11), five normal lung tissue samples (12-16), and various other normal tissues (17-36). The expression of each gene for each cDNA sample is normalized against internal actin transcripts and is plotted as copies per 1000 pg actin (see Materials and methods for details). Notice that L513S and L528S are much more abundant than L522S and L524S

L516S is an aldose reductase homologue and L522S is a class IV alcohol dehydrogenase, ADH7 (Zgombic-Knight et al., 1995). Similar to L526S, L516S is also up-regulated in metastatic, but not in primary, lung adenocarcinoma (see probe pairs 3 and 9 and compare with probe pairs 7, 10, 16, 17 and 19), an indication of a potential role in metastasis and/or as a marker for prognosis. Both L516S and L522S are moderately over-expressed in lung squamous tumors as shown in Figures 2 and 4b. Unlike other alcohol dehydrogenases, ADH7 is not detected in liver and is believed to be most effective as a retinol dehydrogenase (Zgombic-Knight et al., 1995). It will be interesting to see whether the up-regulation of these enzymes is associated with the hypoxic status of the tumor.

In the final group, L523S encodes the KOC protein, which contains RNA-binding motifs (KH domains). It was first isolated as a gene differentially over-expressed in human pancreatic cancer cell lines and pancreatic cancer tissues (Mueller-Pillasch et al., 1997). Recently, another novel cytoplasmic protein containing KH domains has been identified as an autoantigen in human hepatocellular carcinoma (Zhang et al., 1999). Identification of L523S in lung squamous tumors suggests that this gene may be a shared antigen among LSCC, pancreatic cancer, and hepatocellular carcinoma. Similar to L522S, L523S is only moderately over-expressed in LSCC, yet its expression is low in all normal tissues tested (Figure 2b and Table 3). Therefore, both L522S and L523S can be further evaluated for their diagnostic and therapeutic value. Unlike L523S, L528S is highly expressed in two lung squamous tumors, with moderate expression in two other squamous tumors, one lung adenocarcinoma (8009T), and some normal tissues including skin, lymph nodes, heart, stomach and lung, as shown by cDNA microarray analysis (Figure 2b). Nevertheless, it is differentially expressed in lung squamous tumors compared with other tumors and normal tissues (Figure 4d). The NMB gene is similar to the precursor of the melanocyte specific gene pMEL17 and is reported to be preferentially expressed in melanoma cell lines with low metastatic potential (Weterman et al., 1995). Because gp-100, the human homologue of pMEL17, has been validated as a melanoma antigen, the isolation of NMB in LSCC suggests that this potentially shared antigen and its associated antibodies could be developed as anti-cancer products.

Expression of LSCC specific genes in head and neck squamous tumors

Head and neck squamous tumors closely resemble LSCC in their microscopic characteristics. We have found that a subset of LSCC specific genes, such as L513S and L524S, is elevated in head and neck squamous tumors whereas others (L522S and L528S) are not, as shown in Figure 4. Although the other LSCC genes identified here remain to be studied, our preliminary results suggest that distinct molecules may be involved in the tumorigenicity of these two types of squamous tumors, which can potentially be differentiated at the molecular level.

Comparison with SAGE analysis

Studies have been initiated on NSCLC differentially expressed genes using Serial Analysis of Gene Expression (SAGE) (Hibi et al., 1998). A total of 108 000 and 118 000 tags were sequenced from two SAGE libraries derived from lung squamous tumors and normal bronchial tracheal epithelial cells, respectively, and the sequenced tags were compared between the tumor and normal SAGE libraries. Over a dozen genes were found to be over-expressed in NSCLC, and three of these genes, PGP 9.5, B-myb, and human mutT, were confirmed to be differentially expressed in lung cancers. Interestingly, these three genes do not overlap with the seventeen genes that we have identified here, and this may be due to several factors. First, differences between the tumor samples used in the two studies may account for the discrepancy. Secondly, SAGE analysis tends to identify moderately to abundantly expressed genes, whereas our subtraction strategy was designed to selectively enrich for less abundantly expressed genes, allowing us to identify differentially expressed genes that are expressed at a lower level, such as L522S, L524S, and L531S. Finally, our criteria for selecting LSCC specific genes may be more stringent than those of the SAGE analysis, because our subtractions were performed using a number of different normal tissues and candidate genes were selected based on their relatively low level of expression in a panel of 23 normal tissues, whereas the SAGE analysis was limited to one normal epithelial cell type as reference. However, SAGE analysis is less expensive to perform and requires a less sophisticated technology platform. We believe that it is beneficial to have parallel approaches since they may complement each other, as we have already seen here.
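Because the two SAGE libraries differ in size (108 000 versus 118 000 tags), raw tag counts are not directly comparable. The sketch below shows one common way to put counts on a shared scale before comparing them. It is an illustration only and not necessarily the procedure used by Hibi et al.: the function names, the per-100 000 scaling, and the example counts are assumptions introduced here.

```python
def tags_per_100k(count, library_size):
    """Scale a raw tag count to tags per 100 000 sequenced tags."""
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    return count * 100_000 / library_size


def tumor_to_normal_ratio(tumor_count, normal_count,
                          tumor_size=108_000, normal_size=118_000):
    """Ratio of normalized tag abundance, tumor over normal.

    The default library sizes are the tag totals quoted in the text.
    Returns infinity when a tag seen in the tumor library is absent from
    the normal library, and NaN when it is absent from both.
    """
    tumor = tags_per_100k(tumor_count, tumor_size)
    normal = tags_per_100k(normal_count, normal_size)
    if normal == 0:
        return float("inf") if tumor > 0 else float("nan")
    return tumor / normal


# Hypothetical counts for a single tag (not data from either study)
ratio = tumor_to_normal_ratio(tumor_count=54, normal_count=59)
print(ratio)
```

In this made-up case the raw counts differ, but the normalized abundances are equal, which is why scaling by library size matters before calling a tag over-expressed.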

In conclusion, we have used a combination of cDNA library subtraction and microarray analysis for the identification and initial characterization of 17 genes preferentially expressed in lung squamous cell carcinoma. Expression profiles of these genes were confirmed by Northern and/or real-time RT–PCR analysis. Protein expression and localization of these genes will be evaluated by immunohistochemistry. Genes that are tumor specific, such as L522S, L523S, L524S, L526S and L528S, will also be tested for their antigenicity and ability to generate T-cell responses. Finally, full-length cDNA isolation for the novel candidates L514S, L519S, L530S and L531S is currently in progress and should provide us with additional tools for developing lung cancer diagnostic and therapeutic products.

Materials and methods

Tissue and RNA sources

Tumor and some normal tissues used in this study were obtained from the Cooperative Human Tissue Network (CHTN), the National Disease Research Interchange (NDRI), and Roswell Park Cancer Center.

Construction of a cDNA library using LSCC

A human lung squamous cell carcinoma cDNA library was constructed from poly A+ RNA extracted from a pool of two patient tissues using a Superscript Plasmid System for cDNA Synthesis and Plasmid Cloning Kit (GIBCO BRL Life Technologies, Gaithersburg, MD, USA) with modifications. Briefly, BstXI/EcoRI adaptors (Invitrogen, San Diego, CA, USA) were used and cDNA was cloned into pcDNA3.1+ vector (Invitrogen) that was digested with BstXI and EcoRI. A total of 2.7 × 10^6 independent colonies were obtained, with 100% of clones having inserts and the average insert size being 2100 base pairs.

Construction of cDNA libraries using normal lung, heart and liver tissues

Using the same procedure, a normal human lung cDNA library was prepared from a pool of four lung tissue specimens, and a normal human liver and heart cDNA library was generated from equal amounts of total RNA isolated from heart and liver tissues. The normal lung library contained 1.4 × 10^6 independent colonies, with 90% of clones having inserts and the average insert size being 1800 base pairs. The normal heart and liver cDNA library contained 1.7 × 10^6 independent colonies, with 100% of clones having inserts and the average insert size being 1600 base pairs.

Lung squamous cell carcinoma-specific subtracted cDNA libraries

To enrich for genes preferentially expressed in LSCC, we performed cDNA library subtractions using the above lung squamous cell cDNA library as the tester and normal tissue cDNA libraries as the driver, as previously described (Sargent and Dawid, 1983; Duguid and Dinauer, 1990), with modifications. Normal lung, liver and heart cDNAs (40 µg of each) were digested with BamHI and XhoI, followed by phenol-chloroform extraction and ethanol precipitation. The DNA was then labeled with photoprobe long-arm biotin (Vector Laboratories, Burlingame, CA, USA) and the resulting material was ethanol precipitated and dissolved in H2O at 2 mg/ml to prepare driver DNA. For tester DNA, 10 µg of lung squamous cell carcinoma cDNA was digested with NotI and SpeI, followed by phenol-chloroform extraction and size fractionation using Chroma spin-400 columns (Clontech, Palo Alto, CA, USA). Five µg of tester DNA was mixed with 25 µg of driver DNA and hybridized at 68°C after adding an equal volume of 2× hybridization buffer (1.5 M NaCl/10 mM EDTA/50 mM HEPES pH 7.5/0.2% sodium dodecyl sulfate). Following hybridization, several rounds of streptavidin treatment and phenol/chloroform extraction were performed to remove biotinylated DNA, i.e. both driver DNA and tester DNA hybridized to driver DNA. The subtracted DNA, enriched for tester-specific DNA, was then hybridized to additional driver DNA for a second round of subtraction. After the second round of subtraction, DNA was precipitated and ligated into pBCSK+ plasmid vector (Stratagene, La Jolla, CA, USA) to generate a Lung Squamous Tumor-specific Subtracted cDNA library, referred to as LST-S1, LST-S2, LST-S3 etc.

To analyse the subtracted libraries, 20 to 300 clones were randomly picked and plasmid DNA was prepared for sequence analysis with a Perkin Elmer/Applied Biosystems Division Automated Sequencer Model 373A and/or Model 377 (Foster City, CA, USA). These sequences were compared to sequences in the GenBank and human EST databases. The redundancy and the complexity of each subtracted cDNA library were then estimated based on the frequency of each unique cDNA recovered. Highly redundant cDNAs were then used as probes to pre-screen the subtracted cDNA libraries to eliminate redundant cDNA fragments from those to be analysed by microarray technology.
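The redundancy estimate described here, based on how often each unique cDNA is recovered among the sequenced clones, can be sketched as follows. This is a minimal illustration rather than the authors' actual procedure: the function name `summarize_library`, the 10% cutoff for calling a cDNA highly redundant, and the example clone names are all assumptions introduced for this sketch.

```python
from collections import Counter


def summarize_library(clone_hits, redundant_fraction=0.10):
    """Summarize sequenced clones picked from one subtracted library.

    clone_hits: one identifier per sequenced clone (for example, the name
        of its best database match).
    redundant_fraction: hypothetical cutoff; a cDNA is flagged as highly
        redundant when it makes up at least this fraction of the clones.
    """
    counts = Counter(clone_hits)
    total = len(clone_hits)
    frequencies = {name: n / total for name, n in counts.items()}
    highly_redundant = sorted(
        name for name, freq in frequencies.items() if freq >= redundant_fraction
    )
    return {
        "clones_sequenced": total,
        "unique_cdnas": len(counts),
        "frequencies": frequencies,
        "highly_redundant": highly_redundant,
    }


# Made-up example: 20 sequenced clones, two of them recovered repeatedly
example = ["keratin"] * 10 + ["geneA"] * 4 + [f"single{i}" for i in range(6)]
summary = summarize_library(example)
print(summary["unique_cdnas"], summary["highly_redundant"])
```

The cDNAs returned in `highly_redundant` correspond to the candidates that would be used as probes in the pre-screening step, while the count of unique cDNAs gives a rough measure of library complexity.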

Analysis of cDNA expression using microarray technology

A total of 2002 cDNA fragments isolated in LST-S1, LST-S2 and LST-S3 were PCR amplified from individual colonies. Their mRNA expression profiles in lung tumor, normal lung, and other normal and tumor tissues were examined using cDNA microarray technology (Incyte, Palo Alto, CA, USA) as described (Shena et al., 1995). In brief, these clones were arrayed onto glass slides as multiple replicas, with each location corresponding to a unique cDNA clone (as many as 10 000 clones can be arrayed on a single slide, or chip). Each chip was hybridized with a pair of cDNA probes that were fluorescence-labeled with Cy3 and Cy5, respectively. Typically, 200 ng of polyA+ RNA was used to generate each cDNA probe. After hybridization, the chips were scanned and the fluorescence intensity recorded for both Cy3 and Cy5 channels. There were multiple built-in quality control steps. First, the probe quality was monitored using a panel of 18 ubiquitously expressed genes. Secondly, the control plate also had yeast DNA fragments, for which complementary RNA was spiked into the probe synthesis to measure the quality of the probe and the sensitivity of the analysis. Currently, the technology offers a sensitivity of 1 in 100 000 copies of mRNA. Finally, the reproducibility of this technology was ensured by including duplicated control cDNA elements at different locations. Further validation of the process came from the observation that several differentially expressed genes were identified multiple times in the study, and the expression profiles for these genes were very comparable (not shown).
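A two-channel readout of this kind is usually summarized as a per-element intensity ratio, and replicate spots of the same clone give a simple reproducibility check. The sketch below illustrates that idea only. The text does not state how Incyte processed the scans or which dye labeled which probe, so the function names, the intensity floor, and the example intensities are assumptions.

```python
from statistics import mean


def channel_ratio(cy3, cy5, floor=1.0):
    """Cy3/Cy5 intensity ratio for one array element.

    floor is a hypothetical minimum intensity that keeps near-zero
    signals from producing extreme or undefined ratios.
    """
    return max(cy3, floor) / max(cy5, floor)


def summarize_clone(replicate_spots, floor=1.0):
    """Combine replicate spots of one clone.

    replicate_spots: list of (cy3, cy5) intensity pairs.
    Returns the mean ratio and the fold spread between the highest and
    lowest replicate ratio (1.0 means perfect agreement).
    """
    ratios = [channel_ratio(c3, c5, floor) for c3, c5 in replicate_spots]
    return {"mean_ratio": mean(ratios), "spread": max(ratios) / min(ratios)}


# Made-up intensities for two replicate spots of the same clone
result = summarize_clone([(8000.0, 1000.0), (6000.0, 1000.0)])
print(result)
```

A small spread across duplicated elements is the kind of agreement the built-in reproducibility controls are meant to confirm.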

Northern analysis of LSCC specific genes

Northern blots were performed using 20 µg of total RNA for each sample according to standard procedures. Each blot was stained with 0.02% methylene blue to reveal ribosomal RNA before hybridization. Pictures and autoradiographs were scanned and processed with Photoshop 4.0.

Quantitative real-time RT–PCR analysis of LSCC-specific genes

The quantitative RT–PCR analysis was performed using the TaqMan™ chemistry provided by Perkin Elmer/ABI (PE Biosystems, Foster City, CA, USA). The TaqMan probe contains a reporter dye at the 5' end (FAM) and a quencher dye at the 3' end (TAMRA) (Perkin Elmer/Applied Biosystems Division, Foster City, CA, USA). Target-specific PCR amplification results in cleavage and release of the reporter dye from the quencher-containing probe by the nuclease activity of AmpliTaq Gold™ (PE Biosystems). Thus, the fluorescence signal generated from the released reporter dye is proportional to the amount of PCR product. To compare the relative level of gene expression in multiple tissue samples, a panel of cDNA was constructed using RNA from tissues of interest, and real-time RT–PCR was performed using gene specific primers to quantify the copy number in each cDNA sample. Each cDNA sample was run in duplicate and each reaction was repeated on duplicate plates. The final real-time RT–PCR result is reported as the average copy number of a gene of interest normalized against the average actin copy number in each cDNA sample. All RT–PCR reactions were performed on an ABI PRISM 7700 Detector (PE Biosystems).
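The normalization described above, combined with the "copies per 1000 pg actin" scale used in Figure 4, can be sketched as a short calculation. This is a hedged illustration: it assumes the actin measurement for each reaction is expressed in picograms, and the function name and the replicate values are hypothetical rather than taken from the study.

```python
from statistics import mean


def normalized_expression(gene_copies, actin_pg, per_pg=1000.0):
    """Average gene copy number per `per_pg` picograms of actin.

    gene_copies: replicate copy-number measurements for the gene of
        interest in one cDNA sample (for example, duplicate wells on
        duplicate plates, giving four values).
    actin_pg: replicate actin measurements, in pg, for the same sample.
    """
    actin_mean = mean(actin_pg)
    if actin_mean <= 0:
        raise ValueError("actin signal must be positive")
    return mean(gene_copies) / actin_mean * per_pg


# Hypothetical replicate measurements for one cDNA sample
value = normalized_expression(
    gene_copies=[1200.0, 1000.0, 1100.0, 1100.0],
    actin_pg=[500.0, 600.0, 500.0, 600.0],
)
print(value)
```

Averaging the replicates before dividing, rather than averaging per-well ratios, follows the wording of the text, which normalizes the average gene copy number against the average actin value.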

Acknowledgments

We are very grateful to Dr Jill M Siegfried at the University of Pittsburgh, who also provided us with some tissues used in this study. We thank Dr Jiangchun Xu for coordinating microarray analysis with Incyte.


References

Amagai M, Klaus-Kovtun V and Stanley JR. (1991). Cell, 67, 869 – 877.

Brechot JM, Chevret S, Nataf J, Le Gall C, Fretault J, Rochemaure J and Chastang C. (1997). Eur. J. Cancer, 33, 385 – 391.

Davidson LA, Black M, Carey FA, Logue F and McNicol AM. (1996). J. Pathol., 178, 398 – 401.

DeRisi J, Iyer VR and Brown PO. (1997). Science, 278, 680 – 686.

DeRisi J, Penland L, Brown PO, Bittner ML, Meltzer PS, Ray M, Chen Y, Su YA and Trent JM. (1996). Nat. Genet., 14, 457 – 460.

Duflot-Dancer A, Mesnil M and Yamasaki H. (1997). Oncogene, 5, 2151 – 2158.

Duguid JR and Dinauer MC. (1990). Nucleic Acids Res., 18, 2789 – 2792.

Fisk B, Blevins TL, Wharton J and Ioannides C. (1995). J. Exp. Med., 181, 2109 – 2117.

Hibi K, Liu Q, Beaudry GA, Madden SL, Westra WH, Wehage SL, Yang SC, Heitmiller RF, Bertelsen AH, Sidransky D and Jen J. (1998). Cancer Res., 58, 5690 – 5694.

Hibi K, Westra WH, Borges M, Goodman S, Sidransky D and Jen J. (1999). Am. J. Pathol., 155, 711 – 715.

Hollstein M, Sidransky D, Vogelstein B and Harris CC. (1991). Science, 253, 49 – 53.

Hu R, Wu R, Deng J and Lau D. (1998). Lung Cancer, 20, 25 – 30.

Jamieson S, Going JJ, D'Arcy R and George WD. (1998). J. Pathol., 184, 37 – 43.

Khanna KK, Keating KE, Kozlov S, Scott S, Gatei M, Hobson K, Taya Y, Gabrielli B, Chan D, Lees-Miller SP and Lavin MF. (1998). Nat. Genet., 20, 398 – 400.

Lyer VR, Eisen MB, Ross DT, Schuler G, Moore T, Lee JCF, Trent JM, Staudt LM, Hudson Jr J, Boguski MS, Lashkari D, Shalon D, Botstein D and Brown PO. (1999). Science, 283, 83 – 87.

Moll I, Kurzen H, Langbein L and Franke WW. (1997). J. Invest. Dermatol., 108, 139 – 146.

Morgan SE and Kastan MB. (1997). Adv. Cancer Res., 71, 1 – 25.

Morita S, Sato A, Hayakawa H, Ihara H, Urano T, Takada Y and Takada A. (1998). Int. J. Cancer, 78, 286 – 292.

Mueller-Pillasch F, Lacher U, Wallrapp C, Micha A, Zimmerhackl F, Hameister H, Varga G, Friess H, Buchler M, Beger HC, Vila MR, Adler G and Gress TM. (1997). Oncogene, 14, 2729 – 2733.

Occleston NL and Walker C. (1993). Cancer Lett., 71, 203 – 210.

Pastor A, Menendez R, Cremades MJ, Pastor V, Llopis R and Aznar J. (1997). Eur. Respir. J., 10, 603 – 609.

Peoples GE, Goedegebuure PS, Smith R, Linehan DC, Yoshino I and Eberlein TJ. (1995). Proc. Natl. Acad. Sci. USA, 92, 432 – 436.

Robbins PF and Kawakami Y. (1996). Curr. Opin. Immunol., 8, 628 – 636.

Robinson PA, Marley JJ, High AS and Hume WJ. (1994). Arch. Oral Biol., 39, 251 – 259.

Sargent TL and Dawid IB. (1983). Science, 222, 135 – 139.

Shena M, Shalon D, Davis RW and Brown PO. (1995). Science, 270, 467 – 470.

Weterman MAJ, Ajubi N, van Dinter IMR, Degen WGJ, van Muijen GNP, Ruiter DJ and Bloemers HPJ. (1995). Int. J. Cancer, 60, 73 – 81.

Zgombic-Knight M, Foglio MH and Duester G. (1995). J. Biol. Chem., 270, 4305 – 4311.

Zhang JY, Chan EKL, Peng XX and Tan EM. (1999). J. Exp. Med., 189, 1101 – 1110.
