ORIGINAL ARTICLE
High-density SNP association study and copy number variation analysis of the AUTS1 and AUTS5 loci implicate the IMMP2L–DOCK4 gene region in autism susceptibility
E Maestrini 1,11, AT Pagnamenta 2,11, JA Lamb 2,3,11, E Bacchelli 1, NH Sykes 2, I Sousa 2, C Toma 1, G Barnby 2, H Butler 2, L Winchester 2, TS Scerri 2, F Minopoli 1, J Reichert 4, G Cai 4, JD Buxbaum 4, O Korvatska 5, GD Schellenberg 6, G Dawson 7,8, A de Bildt 9, RB Minderaa 9, EJ Mulder 9, AP Morris 2, AJ Bailey 10 and AP Monaco 2, IMGSAC 12
1 Department of Biology, University of Bologna, Bologna, Italy; 2 The Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK; 3 Centre for Integrated Genomic Medical Research, University of Manchester, Manchester, UK; 4 Department of Psychiatry, Seaver Autism Research Center, Mount Sinai School of Medicine, New York, NY, USA; 5 Geriatric Research Education and Clinical Centre, Veterans Affairs Puget Sound Health Care System, Seattle Division, Seattle, WA, USA; 6 Department of Pathology and Laboratory Medicine, University of Pennsylvania School of Medicine, Philadelphia, PA, USA; 7 Autism Speaks, New York, NY, USA; 8 Department of Psychology, University of Washington, Seattle, WA, USA; 9 Department of Psychiatry, Child and Adolescent Psychiatry, University Medical Center Groningen, Groningen, The Netherlands and 10 University Department of Psychiatry, Warneford Hospital, Oxford, UK
Autism spectrum disorders are a group of highly heritable neurodevelopmental disorders with a complex genetic etiology. The International Molecular Genetic Study of Autism Consortium previously identified linkage loci on chromosomes 7 and 2, termed AUTS1 and AUTS5, respectively. In this study, we performed a high-density association analysis in AUTS1 and AUTS5, testing more than 3000 single nucleotide polymorphisms (SNPs) in all known genes in each region, as well as SNPs in non-genic highly conserved sequences. SNP genotype data were also used to investigate copy number variation within these regions. The study sample consisted of 127 and 126 families, showing linkage to the AUTS1 and AUTS5 regions, respectively, and 188 gender-matched controls. Further investigation of the strongest association results was conducted in an independent European family sample containing 390 affected individuals. Association and copy number variant analysis highlighted several genes that warrant further investigation, including IMMP2L and DOCK4 on chromosome 7. Evidence for the involvement of DOCK4 in autism susceptibility was supported by independent replication of association at rs2217262 and the finding of a deletion segregating in a sib-pair family.
Molecular Psychiatry (2010) 15, 954–968; doi:10.1038/mp.2009.34; published online 28 April 2009
Keywords: autistic disorder; disease susceptibility; single nucleotide polymorphisms; linkage disequilibrium; chromosome 7; chromosome 2
Received 20 October 2008; revised 19 February 2009; accepted 2 April 2009; published online 28 April 2009
Correspondence: Professor AP Monaco, Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN, UK. E-mail: [email protected] or Professor AJ Bailey, University Department of Psychiatry, Warneford Hospital, Headington, Oxford OX3 7JX, UK. E-mail: [email protected]
11 These authors contributed equally to this work.
12 IMGSAC: see list of authors in Supplementary Information.
Molecular Psychiatry (2010) 15, 954–968 © 2010 Macmillan Publishers Limited. All rights reserved 1359-4184/10 www.nature.com/mp
Introduction
Autism (OMIM: %209850) is a complex neurodevelopmental disorder characterized by impairments in reciprocal social interaction, difficulties in verbal and nonverbal communication, stereotyped behaviors and interests, and an onset in the first 3 years of life. Autism belongs to the group of pervasive developmental disorders (PDD), also known as autism spectrum disorders (ASDs), which also include Asperger syndrome and pervasive developmental disorder not otherwise specified (PDD-NOS). The estimated population prevalence of core autism is around 15–20 in 10 000, with a male/female sex ratio of approximately 4:1. When all ASD subtypes are combined the prevalence is several times higher, reaching 116 in 10 000.1–3
Several lines of evidence indicate that genetic factors are important in susceptibility to idiopathic
autism. Twin studies show a concordance of 60–92% for monozygotic (MZ) twins and 0–10% for dizygotic (DZ) twins, depending on phenotypic definitions, and the sibling recurrence risk is 25–60 times higher than the population prevalence.4 Furthermore, relatives of affected probands show a higher incidence of milder cognitive or behavioral features, consistent with the hypothesis of a ‘spectrum’ of severity.5
Autism spectrum disorders exhibit wide clinical variability and a high degree of genetic heterogeneity. A variety of chromosomal abnormalities are found in a small proportion of affected individuals (6–7%), most frequently in syndromic cases with dysmorphic features and cognitive impairment.6 The autism phenotype is also associated with known genetic conditions such as the Fragile X syndrome and tuberous sclerosis. Recently, rare ASD-causing mutations were reported in a number of genes, including NLGN3, NLGN4,7 NRXN1,8 SHANK39 and NHE9.10
In recent years, the development of DNA microarray technologies has revealed that submicroscopic deletions and duplications of DNA, known as copy number variants (CNVs), may be significant in autism susceptibility.11–14 Recent surveys identified a higher rate of de novo CNVs in autism pedigrees compared to controls, with the increase more pronounced in singleton than in multiplex families.10,12,13 Nevertheless, it remains difficult to interpret the significance of the numerous CNVs identified in ASDs, to distinguish those that influence susceptibility from normal polymorphic variation and to understand how they might interact with other genetic and non-genetic factors.
Although individually rare, highly penetrant abnormalities, such as microdeletions/microduplications or point mutations, may have a significant role in ASDs, genetic susceptibility is also likely to result from the combined action of several common genetic variants. Common variation in several candidate genes has been implicated in autism (MET, CNTNAP2, SLC6A4, RELN, GABRB3),15 but in most cases consistent replication has not been achieved.
Since the strong genetic component in ASDs was clearly demonstrated over a decade ago, a large number of molecular genetic studies have searched for susceptibility genes, following the general approach of a genome-wide linkage scan using affected sibling/relative pair families. The International Molecular Genetic Study of Autism Consortium (IMGSAC) identified the first autism linkage locus on chromosome 7q21–q32 (designated autism susceptibility locus 1, AUTS1) with a multipoint maximum LOD score (MLS) of 2.53 in 87 families.16 This result was confirmed in follow-up studies conducted by the IMGSAC using additional families and markers.17,18
Another linkage susceptibility locus (AUTS5) was identified by IMGSAC on chromosome 2q24–q33 with an MLS of 3.74 in 152 affected sibling pairs.17
Replication of linkage signals in independent studies has proven difficult for ASDs. To date, 13 whole-genome linkage scans for ASDs have been published,15 and no single locus has been consistently confirmed in all studies. This finding is likely to result from the small effect size attributable to individual genes, as well as from the clinical and genetic complexity of ASDs; differences in ascertainment and inclusion criteria may have been additional factors. However, AUTS1 is one of the few identified loci that has been supported by overlapping positive results in multiple multiplex collections,19,20 and in meta-analyses.21,22 Similarly, the chromosome 2q locus is supported by overlapping linkage findings in another two independent genome scans,23,24 and by homozygosity mapping in consanguineous families.10
The largest genome scan published to date, carried out by the Autism Genome Project (AGP) using Affymetrix 10K single nucleotide polymorphism (SNP) arrays and 1181 multiplex families, also provided some support for both the chromosome 2q and 7q loci within the families of inferred European ancestry.8
Despite the support for linkage on chromosomes 2q and 7q, the candidate genomic intervals remain broad, each spanning approximately 40 Mb and containing approximately 200 known genes. Systematic screening and association studies of several positional candidate genes on chromosomes 2q and 7q have been conducted by the IMGSAC,25–29 but these studies have not led to the identification of confirmed autism susceptibility variants. Owing to the recent technological advances in high-density SNP genotyping and bioinformatic resources, we focused our efforts on performing a gene-based high-density SNP association study of the autism susceptibility loci on chromosomes 2q and 7q implicated by IMGSAC linkage studies. SNP genotype data were also used to investigate copy number variation within these regions. The genetic architecture of ASDs is likely to be extremely complex, with disease risk determined by both common variants of modest effect, as well as rare variants with a range of effect sizes. The strategy of focusing on linkage regions for fine-mapping studies by high-density association screens will prioritize genes containing penetrant rare variants, which would not be well identified through association analysis. However, we might expect that genes containing such variants also contain more common variants of lesser effect and thus are still natural candidates to follow up through association studies.
Genotyping was conducted in two stages, based on HapMap Phase I and Phase II data, respectively. In total, 3002 SNPs were genotyped in each region, directly testing 173 genes on chromosome 2 and 270 genes on chromosome 7. The study sample consisted of 126 and 127 affected individuals and their parents, selected from 293 IMGSAC multiplex families based on identity-by-descent (IBD) sharing on chromosomes 2q and 7q, respectively, as well as 188 gender-matched controls. This study design, where the same probands are used for family-based and case–control analysis, should be more robust against the respective weaknesses of the case–control and TDT approaches (such as population structure and segregation distortion, respectively), and extract the maximum information from our sample.30 Moreover, by selecting families showing excess allele sharing in the region of interest, we are likely to increase the frequency of the disease-associated alleles in the case sample, thereby increasing the power of association studies.31 Power calculations were performed over a range of risk allele frequencies and odds ratios (OR), confirming that the strategy of selecting families for increased IBD sharing outperformed a strategy in which families are selected at random, given fixed genotyping resources (see Supplementary Information).
Our study thus represents a deep exploration of SNP and copy number variation within genic regions of the two autism linkage loci on chromosomes 2q and 7q and pinpoints several genes that need further investigation.
Materials and methods
Study populations
The chromosome 2 primary sample included 126 independent autism families, comprising 371 individuals (119 parent–parent–child trios and 7 single parent–child pairs). The chromosome 7 primary sample included 127 independent autism families (117 parent–parent–child trios and 10 single parent–child pairs). All families were Caucasian (Table 1). The assessment methods and diagnostic criteria used by the IMGSAC have been described in detail previously.17 Diagnosis was based on the Autism Diagnostic Interview-Revised (ADI-R), the Autism Diagnostic Observation Schedule (ADOS) and clinical evaluation. Karyotypes were obtained on all affected individuals when possible, and gross karyotypic abnormalities were excluded in at least one affected individual per family in 93% of families and in both affected individuals in 83% of families.
Trios for the primary sample were selected from the 293 multiplex families in the IMGSAC multiplex collection (using one affected sib per family) based on IBD sharing on chromosomes 2q and 7q, respectively. Calculation of IBD states was based on microsatellite marker data available from our genome scan18 and fine-mapping studies (unpublished data). Ranked Z-scores were calculated for each family using Merlin32 at the linkage peak position (D2S2302-D2S2310 and D7S2430-D7S684 for chromosomes 2 and 7, respectively).
Two main sample collections were used for replication (Table 1): (1) ‘IMGSAC replication’ (IMGSAC-R) sample: 260 parent-affected child trios or pairs and 34 single cases and (2) ‘Northern Dutch’ sample (ND): 96 singleton families from the north of the Netherlands, including 82 parent–parent–child trios and 14 parent–child pairs. Both replication sample collections fulfilled diagnostic criteria for Case ‘Type 1’ or ‘Type 2’ as defined by IMGSAC17 (meet ADI-R criteria or one point below threshold on one behavioral domain, meet ADOS/ADOS-G criteria for autism or PDD, performance IQ > 35). An extended Northern Dutch sample (ND-all; Table 1) was available, including 108 cases that did not meet stringent criteria for one of the following reasons: (1) met ADI-R criteria but failed to meet ADOS criteria or did not undergo ADOS evaluation, (2) met ADI-R and ADOS criteria but had an IQ score < 35, (3) did not meet full criteria for ASD on the ADI-R.

Table 1 Description of samples

| Sample | Autism sample: total affected | Sex (M/F) | Family type | Country of origin | Controls: total | Sex (M/F) | Country |
|---|---|---|---|---|---|---|---|
| IMGSAC chr.2 | 126 | 103:23 | PPC 119, PC 7 | 73 UK, 25 USA, 16 Netherlands, 8 Germany, 3 France, 1 Greece | 188 | 154:34 | UK |
| IMGSAC chr.7 | 127 | 101:26 | PPC 117, PC 10 | 66 UK, 28 USA, 13 Netherlands, 9 France, 7 Germany, 3 Denmark, 1 Greece | 188 | 148:40 | UK |
| IMGSAC replication | 294 | 236:58 | PPC 213, PC 47, C 34 | 129 UK, 85 Italy, 32 Germany, 31 Netherlands, 10 Denmark, 7 France | 180 | 144:36 | 133 UK, 47 Italy |
| ND | 96 | 85:11 | PPC 82, PC 14 | North of the Netherlands | | | |
| ND-all | 204 | 175:29 | PPC 165, PC 39 | North of the Netherlands | | | |

Abbreviations: C, single case; F, female; IMGSAC, International Molecular Genetic Study of Autism Consortium; M, male; ND, Northern Dutch; PC, parent–child pairs; PPC, parent–parent–child trios.
The most significant SNPs from the chromosome 2 locus were also tested in a collection of 358 multiplex families (‘Mount Sinai’ sample), which have been previously described.23,33 Similarly, three SNPs from two of the most strongly associated genes in the case–control and family-based analysis on chromosome 7 were genotyped in 62 Caucasian families selected for IBD sharing from a sample of 222 families19 showing linkage to the same region of chromosome 7 (‘University of Washington’ sample).
Controls used in the primary experiment included 188 DNA samples from UK random blood donors from the ECACC HRC panels,34 sex-matched with the autism case sample. The additional set of 180 controls genotyped in the replication phase included 92 DNAs from ECACC HRC panels, 41 random donors from the UK and 47 random donors from Italy.
The study was reviewed by the relevant local ethics committees.
Genotyping
Single nucleotide polymorphisms for the primary analysis were genotyped using the GoldenGate assay (Illumina, San Diego, CA, USA) on an Illumina BeadStation according to the manufacturer’s instructions. BeadArrays were scanned using the BeadArray Reader at 532 and 647 nm. BeadStudio genotyping module (version 3.2.23) was used to generate genotypes.
Genotyping was conducted in two parallel stages for both chromosomal loci. A total of 3072 SNPs were genotyped in each stage using two custom 1536-plex Illumina arrays, one for each chromosome. The regions of interest ranged from 94.246 to 136.661 Mb on chromosome 7 and from 152.305 to 191.605 Mb on chromosome 2 (NCBI Build 36). These intervals were defined using the approximate 1-LOD drop of the linkage peaks on the two chromosomes, based on IMGSAC microsatellite marker data.18
In the first stage of this study, we evaluated the patterns of linkage disequilibrium (LD) and the distribution of haplotype blocks in the CEU genotype data from the HapMap project release 13 (HapMap Phase I data). Genic regions were defined by NCBI Build 34, by merging all RefSeq and UCSC Known Genes, including all exonic, intronic and 3′ UTR sequences, as well as 5 kb upstream of the 5′ end. A total of 1496 tag SNPs on both chromosome 2q and 7q were identified using HaploView35 and the Gabriel algorithm for block definition from LD blocks overlapping all genic regions.
In the second stage of genotyping, we took advantage of the higher-density HapMap Phase II data to better represent genetic variation in regions of lower LD not previously captured by the HapMap Phase I data. We also used the latest genome annotation (NCBI Build 36) to investigate novel genes and ensure comprehensive coverage of all intragenic and putative regulatory regions on both chromosomes. We identified ‘non-genic’ evolutionary conserved regions from PhastCons elements.36 We downloaded SNP genotype data from the CEU population from HapMap release 22, and selected all SNPs in all genic regions and in the top 5% of non-genic PhastCons elements. We also selected all nonsynonymous SNPs with minor allele frequency (MAF) ≥ 0.05. We then used the Tagger program from HaploView35 (version 4) to select a second set of 1516 tag SNPs for each chromosomal region. Parameters used for Tagger were r² ≥ 0.75 (chromosome 2) and r² ≥ 0.63 (chromosome 7), minimum MAF of 5%, aggressive tagging and force including SNPs already genotyped in stage 1. We estimated that our two sets of SNPs were able to tag 96 and 85% of intragenic HapMap SNP variation (MAF > 0.05) with r² > 0.8 on chromosomes 2 and 7, respectively.
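The tagging thresholds above are expressed as pairwise r² between SNPs. As an illustration only, the following sketch computes r² for two biallelic SNPs from phased haplotypes and checks it against a threshold. The function names and the 0/1 allele coding are assumptions made for this example; this is not the Tagger implementation, which also handles multimarker tags and unphased data.

```python
def r_squared(hap_a, hap_b):
    """Pairwise LD r^2 for two biallelic SNPs.

    hap_a, hap_b: alleles (coded 0/1) carried at each SNP by the same
    ordered list of phased haplotypes.
    """
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("haplotype lists must be non-empty and of equal length")
    n = len(hap_a)
    p_a = sum(hap_a) / n                      # frequency of allele 1 at SNP A
    p_b = sum(hap_b) / n                      # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


def is_tagged(hap_tag, hap_target, threshold=0.8):
    """True if the tag SNP captures the target SNP at the given r^2 threshold."""
    return r_squared(hap_tag, hap_target) >= threshold
```

With this definition, two SNPs whose alleles always co-occur give r² = 1 and independent SNPs give r² = 0, so lowering the threshold (0.75 or 0.63 in the text) trades coverage per tag against fidelity.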
Genotypes for 212 SNPs (99 on chromosome 2 and 113 on chromosome 7), previously generated by the AGP using the Affymetrix 10K version 2 SNP array,8 were available on the IMGSAC family sample and were also included in the family-based association analysis.
A total of 50 genome-wide unlinked SNPs were genotyped for detection of population stratification,37 and 10 chromosome X SNPs were also included to estimate levels of mistyping. In addition, for regions of high LD, where tagging SNPs captured the most genetic variation, extra SNPs were chosen in case of genotyping failure.
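The reasoning behind the chromosome X SNPs is that males carry a single X, so any heterozygous call in a male at an X-linked SNP must be a genotyping error. A minimal sketch of that check follows, assuming the SNPs lie outside the pseudoautosomal regions; the data layout (dictionaries keyed by sample ID, two-letter genotype strings, `None` for no-calls) and the function name are hypothetical.

```python
def male_x_het_rate(genotypes, sexes):
    """Fraction of called X-linked genotypes in males that are heterozygous.

    genotypes: dict sample_id -> list of two-letter calls ('AA', 'AG', ...)
               or None for a no-call.
    sexes:     dict sample_id -> 'M' or 'F'.
    """
    het = 0
    called = 0
    for sample, calls in genotypes.items():
        if sexes.get(sample) != "M":
            continue                      # only males are informative here
        for call in calls:
            if call is None:
                continue                  # skip no-calls
            called += 1
            if call[0] != call[1]:
                het += 1                  # impossible genotype for a single X
    return het / called if called else float("nan")
```

The returned fraction is only a lower bound on the per-genotype error rate, because errors that turn one homozygous call into another are invisible to this check.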
Replication SNP genotyping
Single nucleotide polymorphisms for replication were genotyped using a combination of the Mass Extend iPLEX Gold (Sequenom, San Diego, CA, USA) and TaqMan platforms. A 100% genotyping concordance was observed for two replicate DNA samples genotyped in each experiment. Twenty-five genome-wide SNPs were also genotyped in the IMGSAC-R sample to test for population stratification.
Statistical analysis
Association analysis. We evaluated evidence of association using both ‘frequentist’ and Bayesian statistical approaches.
Primary association analysis of the 5880 SNPs (including the 212 SNPs available from the AGP linkage study8) successfully genotyped in the IMGSAC data set at the two loci was carried out using the PLINK package.38 To extend the amount of information captured by single-marker tests, an additional set of two-marker haplotype tags was devised using the ‘aggressive’ option of the Tagger program39 implemented in HaploView.35 In total, 3526 tests (2959 single-marker tests and 567 haplotype tags) were performed for the chromosome 2 study, and 3380 tests (2921 single-marker tests and 459 haplotype tags) for the chromosome 7 study.
Standard TDT from PLINK was used for family-based analysis, and the Cochran–Armitage trend test (1 degree of freedom) for the case–control analysis. Haplotype-based tests were calculated using PLINK.
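For readers unfamiliar with the case–control statistic, the sketch below works through a 1-degree-of-freedom Cochran–Armitage trend test on a 2 × 3 table of genotype counts, using additive scores 0, 1, 2. It is a textbook formulation written for illustration; the function name is hypothetical, and PLINK's own implementation may differ in details such as the variance estimator.

```python
import math


def cochran_armitage_trend(cases, controls, scores=(0, 1, 2)):
    """Return (chi2, p) for the 1-df Cochran-Armitage trend test.

    cases, controls: genotype counts ordered by number of copies of the
    tested allele (0, 1, 2).
    scores: weights per genotype class; (0, 1, 2) is the additive model.
    """
    n_cases = sum(cases)
    n_controls = sum(controls)
    total = n_cases + n_controls
    col_totals = [r + s for r, s in zip(cases, controls)]
    sum_xr = sum(x * r for x, r in zip(scores, cases))
    sum_xn = sum(x * n for x, n in zip(scores, col_totals))
    sum_x2n = sum(x * x * n for x, n in zip(scores, col_totals))
    variance_term = n_cases * n_controls * (total * sum_x2n - sum_xn ** 2)
    if variance_term == 0:
        raise ValueError("table has no variation in genotype or status")
    chi2 = total * (total * sum_xr - n_cases * sum_xn) ** 2 / variance_term
    # Upper tail of a chi-square with 1 df: P(Z^2 > chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

For example, `cochran_armitage_trend((10, 20, 30), (30, 20, 10))` gives a chi-square of 20, while identical genotype distributions in cases and controls give 0. Because the test scores genotypes additively rather than comparing allele counts, it does not require Hardy–Weinberg equilibrium in the combined sample.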
Bayesian logistic regression analysis was performed using the GENEBPM algorithm,40,41 again using both a case–control and family-based approach (see Supplementary Methods). The logistic regression model allowed for additive and dominance effects of unobserved causal variants, a main effect of gender as well as for parent-of-origin effects in the family-based analysis. GENEBPM analyses were performed using a sliding window of five SNPs across each chromosomal region. For comparison with frequentist single-SNP analyses, the GENEBPM algorithm was also applied to each SNP in turn (that is, single-SNP ‘haplotypes’).
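The sliding-window layout can be sketched as follows. This only enumerates the windows of consecutive SNPs; it does not reproduce GENEBPM itself. The step of one SNP between windows is an assumption, since the text does not state it, and the function name is hypothetical.

```python
def sliding_windows(snps, size=5):
    """Return all windows of `size` consecutive SNPs, advancing one SNP at a time.

    snps: SNP identifiers ordered by chromosomal position.
    A region with fewer than `size` SNPs yields no windows.
    """
    if size < 1:
        raise ValueError("window size must be at least 1")
    return [snps[i:i + size] for i in range(len(snps) - size + 1)]
```

Setting `size=1` reproduces the single-SNP case mentioned above, where each "haplotype" is one marker.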
Replication analysis
Association analysis of the IMGSAC-R and ND replication data sets was carried out using the UNPHASED package,42 given the presence of a higher proportion of families with missing parents (24%) (Table 1). UNPHASED implements maximum-likelihood-based association analysis for nuclear families and unrelated subjects allowing for missing genotypes and uncertain haplotype phase. In the presence of missing data it has only minor loss of robustness to population stratification and is more powerful than standard TDT.42
Analysis of the combined primary and replication cohorts was also carried out using UNPHASED, again using both a case–control approach and a family-based approach. Only the IMGSAC and IMGSAC-R data sets were combined for the population-based meta-analysis, because appropriate controls were not available for the ND population.
Copy number variation
We used transmission patterns of SNP genotypes within parent–offspring families to detect Mendelian errors consistent with the presence of a deletion. In addition, the clustering of all SNP genotypes was visually examined to identify abnormal clustering patterns or outlying samples that might point to CNVs associated with the autism phenotype. Sequencing was carried out to confirm the presence of microdeletions, CNVs or secondary SNPs.
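The logic of the Mendelian-error screen is that a child who inherits a deletion from one parent is hemizygous and is therefore called homozygous for the allele received from the other parent, which can contradict the genotype of the deletion-carrying parent. The sketch below is one simple way to flag such patterns in a trio. The genotype encoding (two-letter strings, `None` for no-calls), the category labels and the function names are assumptions made for this example, not the authors' procedure.

```python
def classify_trio_genotype(father, mother, child):
    """Classify one SNP in a parent-parent-child trio.

    Returns 'consistent' if the child can receive one allele from each parent,
    'deletion_candidate' if the child is an apparent homozygote for an allele
    seen in only one parent (as expected when the other parent transmits a
    deletion), and 'other_error' for any other Mendelian inconsistency.
    """
    a, b = child[0], child[1]
    if (a in father and b in mother) or (b in father and a in mother):
        return "consistent"
    if a == b and (a in father or a in mother):
        return "deletion_candidate"
    return "other_error"


def deletion_candidate_snps(father_calls, mother_calls, child_calls):
    """Indices of SNPs whose trio genotypes are consistent with a deletion."""
    hits = []
    for i, (f, m, c) in enumerate(zip(father_calls, mother_calls, child_calls)):
        if f is None or m is None or c is None:
            continue                      # skip SNPs with a no-call
        if classify_trio_genotype(f, m, c) == "deletion_candidate":
            hits.append(i)
    return hits
```

A single flagged SNP is more likely to be a genotyping error than a CNV; a run of flagged or uninformative homozygous SNPs at adjacent positions is what would motivate follow-up by an independent assay.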
After exclusion of whole-genome amplified samples, data from both GoldenGate arrays were combined for each region, no-calls were deleted, and the combined data were run on QuantiSNP43 (version 1.0). CNV validation and screening was carried out by multiplex PCR and quantitative multiplex PCR of short fluorescent fragments (QMPSF).44 Positive results were confirmed in a second independent QMPSF assay.
The distal breakpoint of the deletion detected in family 15-0084 was better defined by quantitative PCR (qPCR) of DOCK4 exons 52, 37, 31, 14 and 7, with GAPDH as a reference.
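The text does not say how the qPCR data were quantified. One common choice for this kind of experiment is the comparative Ct (ΔΔCt) method, sketched below under the assumptions of roughly 100% amplification efficiency and a control individual with two copies of the target. The function name, argument names and Ct values in the usage note are hypothetical.

```python
def relative_copy_number(ct_target_sample, ct_ref_sample,
                         ct_target_control, ct_ref_control,
                         control_copies=2):
    """Estimate target copy number by the comparative Ct method.

    Each delta-Ct normalizes the target exon to the reference gene
    (GAPDH in the text); the difference between sample and control
    delta-Ct values gives the fold change as 2 ** (-delta_delta_ct).
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return control_copies * 2 ** (-delta_delta)
```

Under these assumptions, an exon inside a heterozygous deletion reaches threshold about one cycle later than in the control, for example `relative_copy_number(26.0, 20.0, 25.0, 20.0)`, which corresponds to one copy, while an exon outside the deletion gives a value near two. Testing exons in order along the gene then brackets the breakpoint between the last one-copy exon and the first two-copy exon.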
Additional information is available as Supplementary Methods.
Results
Genotyping
A total of 6004 SNPs (3002 in each chromosome region) were genotyped using the Illumina GoldenGate technology. After quality control procedures, we excluded 336 markers for one or more of the following reasons: MAF < 0.05, more than 1 Mendelian error, genotyping rate < 90%, poor clustering and deviations from Hardy–Weinberg equilibrium (P < 0.001) in the control population.
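The numeric exclusion criteria can be expressed as a simple filter. The sketch below is illustrative only: it uses a 1-degree-of-freedom chi-square goodness-of-fit test for Hardy–Weinberg equilibrium, which is an assumption because the text does not name the test used, and it omits the cluster-quality criterion, which was assessed from the genotype plots. Function and parameter names are hypothetical.

```python
import math


def hwe_p_value(n_aa, n_ab, n_bb):
    """Chi-square (1 df) test of Hardy-Weinberg proportions from genotype counts."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)       # frequency of allele A
    q = 1 - p
    if p == 0 or q == 0:
        return 1.0                        # monomorphic SNP: nothing to test
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(maf, call_rate, mendel_errors, hwe_p):
    """Apply the thresholds listed in the text to one SNP."""
    return (maf >= 0.05
            and call_rate >= 0.90
            and mendel_errors <= 1
            and hwe_p >= 0.001)
```

Here `hwe_p` would be computed in the control sample only, as in the text, so that a true association signal in cases does not cause a SNP to be discarded.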
For the 5668 (94%) SNPs that passed quality control, the genotyping efficiency exceeded 99.7%, with an estimated error rate from duplicate SNPs and from heterozygote calls of X chromosome SNPs in males in the order of 2–5 × 10⁻⁴. In summary, 2860 SNPs from the chromosome 2q23.3–q32.3 region were successfully genotyped in 559 DNA samples including 126 affected individuals, 245 parents and 188 gender-matched controls from the ECACC collection; 2808 SNPs from the chromosome 7q21.3–q33 region were successfully genotyped in 559 DNA samples including 127 affected individuals, 244 parents and 188 ECACC gender-matched controls. In addition, our family-based analysis included genotypes from 212 SNPs (99 on chromosome 2 and 113 on chromosome 7), which were generated by the AGP using the Affymetrix 10K version 2 SNP array.8
There was no significant difference in the pattern of LD between our sample and the HapMap CEU sample, indicating that the LD structure in the HapMap CEU data can be readily applied to our autism sample. SNPs were selected to capture efficiently the large majority of the currently known variation in all intragenic regions and highly conserved non-genic elements (see Supplementary Methods).
Population stratification
The presence of stratification in a population-based association study that is not suitably accounted for in case–control analysis can lead to an increase in the false-positive error rate. Furthermore, haplotype analyses in family-based association studies are not robust to population stratification if random mating is assumed among parents in the haplotype estimation step.
We tested for population structure in our primary IMGSAC sample using Structure45,46 software and 50 unlinked genome-wide SNPs. Comparing the fit of the admixture model for K = 1, 2 and 3 strata, we found strongest support for a model of no stratification (K = 1) in both of the following groups of individuals: (1) probands, controls and HapMap CEU founders; and (2) parents and HapMap CEU founders. Similarly, no evidence of stratification was detected in the combined IMGSAC primary and IMGSAC-R sample, using 25 unlinked genome-wide SNP markers. These results reassure us that no strong population stratification is present in our IMGSAC primary and IMGSAC-R sample.
Association analysis
The results of the case–control (Cochran–Armitage trend test) and family-based analysis (TDT) are shown in Figure 1 and summarized in Table 2.
Chromosome 2 association results
Three SNPs in the NOSTRIN gene provided the strongest association in the case–control analysis (rs7583629, P = 3.2 × 10⁻⁵; rs829957, P = 9.0 × 10⁻⁵; rs482435, P = 1.4 × 10⁻⁴), followed by rs1020626 (P = 3.8 × 10⁻⁴) in the FAM130A2 gene.
For the TDT analysis the strongest results came from SNPs in the ZNF533 gene (rs11885327, P = 8.0 × 10⁻⁴; rs1964081, P = 1.4 × 10⁻³), and an SNP in the UPP2 gene (rs6709528, P = 8.0 × 10⁻⁴).
Single-marker logistic regression analysis provided a similar ranking of results. In the case–control analysis the most strongly associated SNP rs7583629 in NOSTRIN provided a log10 Bayes factor (logBF) of 2.9, whereas in the family-based analysis the top signal was for rs1139 in the ZNF533 gene (logBF = 1.7). GENEBPM multimarker analysis using 5-SNP sliding windows (Supplementary Figure S1) showed increased evidence in favor of association for the NOSTRIN locus (logBF = 3.2) in the case–control analysis, but did not identify additional interesting signals. Family-based multimarker analysis revealed an additional association signal with a haplotype spanning 75 kb in the METTL8 gene (logBF = 2.3).
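To give a sense of scale for the logBF values, a log10 Bayes factor can be converted back to a Bayes factor and combined with prior odds of association to obtain posterior odds. The sketch below shows the arithmetic only; the prior odds used in any real interpretation are a modelling choice not given in the text, and the function names are hypothetical.

```python
def bayes_factor(log10_bf):
    """Convert a log10 Bayes factor to the Bayes factor itself."""
    return 10 ** log10_bf


def posterior_odds(log10_bf, prior_odds):
    """Posterior odds of association = Bayes factor x prior odds."""
    return bayes_factor(log10_bf) * prior_odds


def posterior_probability(log10_bf, prior_odds):
    """Posterior probability of association implied by the posterior odds."""
    odds = posterior_odds(log10_bf, prior_odds)
    return odds / (1 + odds)
```

A logBF of 2.9 corresponds to a Bayes factor of roughly 800, meaning the data are about 800 times more probable under association than under no association. Whether that is convincing depends on the prior: with many hundreds of windows tested and small prior odds for each, the posterior probability can remain modest.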
Chromosome 7 association results
The strongest signal for the case–control (trend test) analysis was from IMMP2L (rs12537269, P = 1.2 × 10⁻⁴; rs1528039, P = 6.3 × 10⁻⁴) and just upstream of SMO (rs6962740, P = 3.4 × 10⁻⁴). The TDT test implicated Plexin A4 (PLXNA4; rs4731863, P = 1.0 × 10⁻⁴) and cut-like homeobox 1 isoform b (CUX1; rs875659, P = 2.0 × 10⁻⁴).
Single-marker logistic regression analysis provided a similar ranking of results in the case–control analysis, with rs12537269 in IMMP2L showing the strongest evidence (logBF = 2.9). In the family-based analysis, the strongest result was seen for rs4730037 in LHFPL3 (logBF = 2.1), closely followed by rs4731863 in PLXNA4 (logBF = 2.0). Moreover, GENEBPM revealed a parent-of-origin effect in the IMMP2L locus, with increased risk for causal variants inherited from the father compared to those inherited from the mother. For this reason we investigated SNPs in IMMP2L by parent-specific TDT, which revealed a P-value of 0.01 for rs2030781, with a transmitted/untransmitted allele ratio of 31:14 for paternal transmissions (Table 2).
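The reported paternal transmission ratio can be checked against the standard TDT statistic, which compares transmitted and untransmitted counts from heterozygous parents with a McNemar-type chi-square on 1 degree of freedom. The sketch below applies that textbook formula to the 31:14 counts; the function name is hypothetical, and restricting the counts to paternal transmissions is what makes the test parent-specific.

```python
import math


def tdt(transmitted, untransmitted):
    """Transmission disequilibrium test from heterozygous-parent counts.

    Returns (chi2, p) with chi2 = (b - c)^2 / (b + c) on 1 degree of freedom.
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Upper tail of a chi-square with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Paternal transmissions of rs2030781 as reported in the text
chi2, p_value = tdt(31, 14)
```

With 31 transmitted and 14 untransmitted alleles the statistic is 289/45, about 6.42, which gives a P-value close to the 0.01 reported for the parent-specific test.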
GENEBPM multimarker analysis using 5-SNP sliding windows showed increased evidence of association for the IMMP2L locus in the case–control analysis (logBF = 2.9) and for PLXNA4 (logBF = 2.9) in the family-based analysis, but did not identify additional interesting signals (Supplementary Figure S1).

Figure 1 Graphical representation of chromosome 2 and 7 association results. −Log10 P-values are plotted against the chromosome position. (a) P-values obtained for single markers (Cochran–Armitage trend test) and 2-SNP haplotype case–control association (PLINK). (b) P-values for single-marker TDT and 2-SNP haplotype TDT.

Table 2 Summary of primary association results
Columns: SNP/haplotype; Chr.; Position; Gene; Risk allele; P-value; LogBF. Where two P-value/LogBF pairs are given, the first is from the family-based analysis and the second from the case–control analysis.
rs1427395  2  153 442 168  PhastCons(a)  T  0.0016  0.78  0.0133  0.66
rs3769357  2  157 101 520  GPD2  A  0.0031  1.04
rs6437129, rs6709528_CC  2  158 669 000  UPP2  CC  9.72E-04  1.00
rs6437133  2  158 672 533  UPP2  C  0.0045  0.65
rs6709528  2  158 678 671  UPP2  T  8.00E-04  1.09
rs12620556  2  158 905 017  LOC130940  A  0.0041  0.45  0.0153  0.55
rs764660  2  165 921 543  SCN2A  C  0.0047  0.68
rs1020626  2  166 106 880  FAM130A2  T  3.80E-04  1.92
rs10930170  2  166 107 713  FAM130A2  G  0.0020  1.40
rs829957  2  169 367 080  NOSTRIN  T  0.0116  0.59  9.03E-05  2.47
rs6433093  2  169 367 190  NOSTRIN  A  5.59E-04  1.94
rs7583629  2  169 381 125  NOSTRIN  A  0.0027  1.09  3.22E-05  2.92
rs482435  2  169 384 291  NOSTRIN  C  0.0084  0.78  1.39E-04  2.54
rs2098802  2  170 760 429  MYO3B  G  0.0042  1.02
rs6738892  2  170 768 975  MYO3B  A  0.0035  1.13
rs13007575  2  174 386 625  PhastCons(a)  A  0.0077  0.81  0.0026  1.19
rs6717587  2  175 865 296  PhastCons(a)  A  0.0015  1.40
rs1434087  2  178 912 043  OSBPL6  T  0.0010  1.56
rs7590028  2  180 257 688  ZNF533  T  0.0010  1.80
rs11885327  2  180 276 318  ZNF533  C  8.00E-04  1.56  0.0271  0.56
rs11885327, rs1964081_TG  2  180 287 992  ZNF533  TG  0.0019  1.54
rs2008230, rs1964081_GG  2  180 288 449  ZNF533  GG  0.0030  1.04
rs881737, rs1964081_GG  2  180 294 093  ZNF533  GG  0.0016  1.75
rs1964081  2  180 299 666  ZNF533  A  0.0014  1.61
rs2126424, rs1139_CG  2  180 312 034  ZNF533  CG  0.0073  1.18  0.0038  1.11
rs1139  2  180 318 326  ZNF533  G  0.0067  1.73  0.0049  1.26
rs415994  2  183 266 932  5′ of DNAJC10  C  0.0027  1.54
rs3755248  2  188 078 477  TFPI  T  0.0036  0.90
rs7573488  2  188 106 325  TFPI  G  0.0046  0.91
rs3811608  2  191 043 302  FLJ20160  T  0.0023  0.81
rs6757698  2  191 071 741  FLJ20160  C  0.0012  0.31
rs12538145  7  95 636 377  SLC25A13  C  0.0039  0.52  0.0168  0.55
rs2307355  7  99 531 488  MCM7  A  0.0040  0.88
rs11768465  7  100 036 322  FBX024  C  0.0184  0.57  0.0031  1.21
rs875659  7  101 696 376  CUX1  C  2.00E-04  1.82
rs3819479  7  103 184 318  RELN  T  0.0028  1.02
rs6976167  7  103 848 209  LHFPL3  T  0.0026  1.16
rs12666599  7  103 905 157  LHFPL3  T  0.0038  0.89
rs4730037  7  104 129 973  LHFPL3  C  0.0032  2.07
rs176481  7  105 515 161  SYPL1  T  0.0385  8.19E-04  1.92
rs9690688  7  107 507 398  LAMB4  T  0.0047  0.75
rs6951925  7  108 588 860  NT_007933.689(b)  G  0.0029  1.20
rs1464895  7  110 111 977  IMMP2L  A  0.0049  0.95
rs2030781  7  110 149 994  IMMP2L  C  0.011(c)  1.57  0.0037  1.15
rs12537269  7  110 184 783  IMMP2L  A  1.20E-04  2.85
rs10500002  7  110 229 091  IMMP2L  T  0.0012  1.58
rs1528039  7  110 230 008  IMMP2L  C  6.28E-04  1.77
rs12531640  7  110 266 771  IMMP2L  T  0.0021  1.27
rs2217262  7  111 583 613  DOCK4  A  0.0143  0.41  0.0042  1.02
Analysis of the LD landscape across the AUTS1 region, using both HapMap (CEU) and data from the 127 probands used in our primary sample, indicated that the six associated SNPs in IMMP2L (Table 2) are all within a single block of LD, and thus likely to be indexing the same effect. In contrast, the modest association seen in the first intron of the neighboring DOCK4 gene was in a separate block of LD.
Replication
We attempted replication of 56 SNPs (28 on each chromosome) that attained the most significant association results in primary case–control and TDT analyses (Table 3; Supplementary Table S1). The replication population consisted of the IMGSAC-R and the ND collections, including 390 affected individuals (see Table 1; Materials and methods for a description of samples). Family-based analysis of the replication sample showed significant overtransmission of the common allele of SNP rs2217262 in the DOCK4 gene (P = 9.2 × 10⁻⁴, OR = 2.28, confidence interval 1.37–3.77) (Table 3; Supplementary Table S1). This result remains significant after Bonferroni correction for multiple testing (28 SNPs tested on chromosome 7, P = 0.026). The trend toward association of rs2217262 (P = 0.029) was also seen in the extended ND sample, which included additional subjects fulfilling broader diagnostic criteria (ND-all, 204 affected subjects; Table 1).
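The Bonferroni-corrected value quoted for rs2217262 follows directly from multiplying the nominal P-value by the number of SNPs tested on chromosome 7. A minimal sketch of that arithmetic (the function name is hypothetical):

```python
def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted P-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Nominal P-value for rs2217262 and the 28 chromosome 7 SNPs tested
p_corrected = bonferroni(9.2e-4, 28)
print(f"corrected P = {p_corrected:.3f}")
```

The product 9.2 × 10⁻⁴ × 28 ≈ 0.026 stays below 0.05, which is the sense in which the result survives correction.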
The remaining SNPs did not show significant replication after correction for multiple testing, and no parent-of-origin effects were seen for rs2030781.
Finally, the 56 SNPs selected for replication were investigated in the combined primary and replication data sets; only 7 SNPs attained uncorrected significance of P < 0.001 (Table 4). The DOCK4 SNP rs2217262 reached a nominal significance of P = 5.23 × 10⁻⁵ in the family-based analysis of all cohorts (IMGSAC primary, IMGSAC-R and ND). In the case–control analysis of the combined IMGSAC collections (421 cases and 368 controls), rs12537269 in IMMP2L achieved the most significant result (P = 7.3 × 10⁻⁵). Additional loci retaining association evidence in case–control meta-analysis were ZNF533 on chromosome 2, and TSPAN12, FEZF1 and SLC13A1 on chromosome 7.
Several SNPs in the most interesting genes from the primary analysis were also tested in two additional family collections, which had previously shown evidence of linkage to the chromosome 2q and 7q loci19,23,33 (Supplementary Table S1). Five SNPs in NOSTRIN, ZNF533 and OSBPL6 were tested in a sample of 358 multiplex families (‘Mount Sinai’ cohort),23,33 but no significant results were obtained. Of the 28 AUTS1 replication SNPs, 3 in IMMP2L and CUX1 were genotyped in 62 Caucasian families selected for IBD sharing from 222 families showing linkage to the same region of chromosome 7 (‘University of Washington’ sample),19 again with no evidence for association.

Table 2 Continued
Columns: SNP/haplotype; Chr.; Position; Gene; Risk allele; P-value; LogBF. Where two P-value/LogBF pairs are given, the first is from the family-based analysis and the second from the case–control analysis.
rs989613  7  113 233 792  NT_007933.632(b)  G  0.0049  0.53
rs7807053  7  120 137 743  KCND2  A  0.0022  1.18
rs41620  7  120 213 054  3′ of TSPAN12  A  0.0046  1.07
rs2525720  7  120 392 266  ING3  A  0.0049  1.25
rs538558  7  121 724 673  3′ of FEZF1  A  0.0065  0.98
rs11978485  7  122 480 367  3′ of SLC13A1  G  0.0295  0.98  0.0039  1.23
rs6962740  7  128 614 047  5′ of SMO  G  3.39E-04  2.14
rs4110091  7  128 719 985  AHCYL2  T  0.0012  1.56
rs2030974  7  129 693 119  5′ of CPA2  C  0.0197  0.56  0.0032  1.17
rs2171493  7  129 693 383  5′ of CPA2  C  0.0412  0.39  0.0038  1.18
rs13226219  7  129 806 727  5′ of CPA1  T  0.0032  1.26
rs1863009  7  130 649 715  AK054623  T  9.20E-04  1.60
rs7787173  7  131 107 683  NT_007933.1017(b)  A  6.00E-04  1.12
rs4731863  7  131 674 323  PLXNA4  T  1.00E-04  2.02  0.0321  0.45
Only SNPs showing P < 0.05 in either family-based or case–control analysis are reported. P-values > 0.05 are not shown. P-values < 0.001 are in bold. The reported risk allele is consistent in the two approaches.
(a) PhastCons, highly conserved region.
(b) Predicted genes, reference sequence annotation changed from Build 34.
(c) Parent-specific TDT.
Copy number variation
A Mendelian error in one family for SNP rs7585982 pinpointed a potentially interesting deletion in the UPP2 gene on chromosome 2. The deletion boundaries were defined by sequence analysis of additional SNPs flanking rs7585982. Using long-range PCR followed by sequencing, we refined the deletion to 5897 bp of the UPP2 gene (158 681 612–158 687 508 bp; UCSC Build 36), removing two coding exons (exons 6 and 7) and predicted to cause a frameshift leading to a premature termination codon (Supplementary Figure S2A). This deletion was not present in the Database of Genomic Variants (DGV, http://projects.tcag.ca/variation/), suggesting it could be an autism-specific CNV. We screened the same sample used for the SNP association experiment (126 cases and 188 controls) for the presence of this deletion using multiplex PCR (Supplementary Figure S2B). The frequency of the deletion was not significantly different between cases and controls (1.6 and 3.2%, respectively, P = 0.2). To investigate if the deletion segregates with the ASD phenotype, we also screened 265 sib-pair families from the IMGSAC collection, including relatives of the 126 cases. Of these, we found 30 families with a parent carrying the deleted allele, and in only 13 families was it transmitted to affected children (in 5 families to both affected siblings and in 8 families to a single affected individual). These results suggest that the UPP2 deletion is not involved in autism susceptibility. The coding sequence of UPP2 was also sequenced in 47 unrelated subjects, including 12 probands carrying the deletion of exons 6 and 7; no novel coding variants were identified, except one silent change in exon 4 in only one individual.
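Two figures in the paragraph above can be checked with simple arithmetic: the deletion length implied by the reported coordinates, and the share of transmission opportunities in which the deletion reached an affected child. The sketch below is illustrative only. It assumes that each of the 30 families has exactly one heterozygous carrier parent and exactly two affected siblings screened; neither assumption is stated explicitly in the text.

```python
# Deletion boundaries reported in the text (chromosome 2, UCSC Build 36)
start, end = 158_681_612, 158_687_508
deletion_length = end - start + 1  # coordinates treated as inclusive

# Transmission of the deletion in the families with a carrier parent
families_with_carrier = 30
both_sibs, one_sib = 5, 8          # families transmitting to 2 or 1 affected sibs
affected_per_family = 2            # ASSUMPTION: two affected sibs per family

transmissions = 2 * both_sibs + one_sib
opportunities = families_with_carrier * affected_per_family
fraction = transmissions / opportunities

print(f"deletion length = {deletion_length} bp")
print(f"transmitted in {transmissions} of {opportunities} opportunities "
      f"({fraction:.0%}); Mendelian expectation is 50%")
```

Under these assumptions the deletion was passed on in 18 of 60 opportunities, below the 50% expected by chance, which is consistent with the conclusion that it does not segregate with the ASD phenotype.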
By combining data from both SNP arrays for each candidate region, a sufficient SNP density was achieved to carry out copy number analysis on these samples using QuantiSNP.43 We detected 17 CNVs in seven regions of chromosome 7 and 6 CNVs in five regions of chromosome 2 (Supplementary Table S2). For the chromosome 7 analysis, an ~800 kb duplication was detected in family 13-3023 that was transmitted from father to proband (Supplementary Figure S3). This duplication includes two genes: IMMP2L and DOCK4. Another duplication overlapping EMID2 and RABL5 was detected in three families where it was transmitted from mother to proband, whereas a smaller duplication containing only EMID2 was detected in a father, but not transmitted, and in one control. A third CNV in EXOC4 was detected as a nontransmitted loss in a father and as a gain (four copies) in a control.
On chromosome 2, five duplications and one deletion were detected in parents and a single control, but never transmitted to an affected child (Supplementary Table S2).

Table 3 Family-based analysis of replication samples using UNPHASED
Columns: SNP; Chr.; Gene; Alleles; Risk allele; then P-value, Ca-Freq and Co-Freq for IMGSAC-R (294 affected subjects), for ND (96 affected subjects) and for IMGSAC-R + ND (390 affected subjects).
rs1427395  2  PhastCons(a)  A/T  T  0.3634  0.564  0.532  0.0216  0.500  0.387  0.0505  0.547  0.496
rs6437133  2  UPP2  C/T  C  0.1630  0.543  0.500  0.0395  0.482*  0.599*  0.7106  0.529  0.519
rs12620556  2  LOC130940  A/G  A  0.8454  0.899  0.905  0.0235  0.905*  0.965*  0.2247  0.901  0.920
rs13007575  2  PhastCons(a)  A/G  A  0.1337  0.921  0.946  0.0063  0.958  0.886  0.8726  0.931  0.929
rs1434087  2  OSBPL6  C/T  T  0.0399  0.928  0.890  0.7597  0.916  0.925  0.0988  0.925  0.898
rs11768465  7  FBX024  C/T  C  0.3915  0.785  0.765  0.0184  0.761*  0.863*  0.6561  0.779  0.790
rs1464895  7  IMMP2L  A/G  A  0.4854  0.161  0.145  0.0042  0.120*  0.235*  0.3043  0.150  0.170
rs12537269  7  IMMP2L  A/G  A  0.0485  0.262  0.210  0.7737  0.255  0.243  0.0667  0.260  0.220
rs2217262  7  DOCK4  A/C  A  0.0272  0.955  0.924  0.0055  0.979  0.916  9.21E-04  0.962  0.921
rs2171493  7  5′ of CPA2  A/C  C  0.0230  0.242*  0.301*  0.8506  0.216  0.224  0.0459  0.235*  0.282*
rs4731863  7  PLXNA4  A/T  T  0.1591  0.907  0.931  0.0987  0.891  0.938  0.0391  0.903*  0.934*
Abbreviations: Ca-Freq, frequency in affected offspring; Co-Freq, frequency in untransmitted parental alleles; IMGSAC-R, International Molecular Genetic Study of Autism Consortium-replication; ND, Northern Dutch; SNP, single nucleotide polymorphism.
Only nominal P-values < 0.05 are shown. Allele frequencies are reported for the risk allele detected in the primary association analysis. Flip-flop of associated allele is flagged by an asterisk.
(a) PhastCons, highly conserved region.
Most of the identified CNVs are well represented in the DGV, suggesting that they do not have a major function in autism susceptibility. However, the duplication involving IMMP2L and DOCK4 warranted further analysis, as it involved two adjacent genes showing possible SNP association with autism. Therefore we developed a QMPSF assay able to simultaneously test CNVs in exons 2, 3 and 6 of IMMP2L, exon 4 of LRRN3 and the last exon of DOCK4 (number 52). We validated the duplication in family 13-3023, identified by QuantiSNP, and verified that it is transmitted from the father to the affected son, but it was not transmitted to the other affected sib or to an unaffected sib. Screening of 475 UK controls and 285 IMGSAC multiplex families with 487 affected individuals was then carried out using the QMPSF assay, to check if CNVs in these genic regions segregated with the autism phenotype in families and/or have a higher frequency in cases than controls. We identified six additional deletions of different length, of which some were transmitted. One deletion disrupted exons 2 and 3 of IMMP2L and the last exon of DOCK4, and was transmitted from the mother to both affected sons, as well as to a daughter, who did not have an ASD. Both the carrier mother and daughter were reported to have dyslexia. qPCR indicated that the deletion distal breakpoint is located between exons 31 and 14 of DOCK4 (Supplementary Figure S4). Two smaller deletions were transmitted from the parent to only one of their affected children, one was found only in the father but not transmitted and the other two were found in controls. The relative length and position of the CNVs identified are depicted in Figure 2.
Discussion
Several linkage studies have suggested that chromosomes 2q and 7q may harbor one or more genes contributing to the risk for developing an ASD. Here, we have presented a comprehensive high-density SNP genotyping, association and CNV study covering the 2q23.3–q32.3 and 7q21.3–q33 chromosome regions. We have tested more than 3000 SNPs in each region, covering all known genes, as well as in highly conserved non-genic sequences.
The complementary case–control and family-based approach taken in our study allowed us to extract the maximum information from our sample, taking into consideration the advantages and disadvantages of the two different approaches. Case–control studies are more powerful compared to family-based approaches, but are sensitive to the presence of population stratification. Structure analysis using 50 genome-wide SNPs did not reveal strong population stratification, although we cannot exclude that undetected low levels may be present. Family-based approaches are more robust to confounding by population stratification and in addition they enable testing for parent-of-origin effects.
Table 4 Combined analysis of primary and replication samples
Columns: Chr; SNP; Gene location; Risk allele; family-based analysis, all samples combined(a): P-value (Ca-Co Freq); case–control analysis, IMGSAC samples combined(b): P-value (Ca-Co Freq); OR (CI).
7  rs41620  3′ of TSPAN12  A  0.08796 (0.77, 0.74)  8.14E-04 (0.78, 0.71)  1.48 (1.18–1.86)
7  rs538558  3′ of FEZF1  A  0.4688 (0.36, 0.34)  5.77E-04 (0.37, 0.28)  1.45 (1.17–1.80)
7  rs11978485  3′ of SLC13A1  G  0.04972 (0.82, 0.79)  2.89E-04 (0.84, 0.76)  1.59 (1.24–2.04)
Abbreviations: Ca-Co freq, risk allele frequency in affected offspring and in untransmitted parental alleles (family-based) or in control (case–control); IMGSAC, International Molecular Genetic Study of Autism Consortium; OR (CI), odds ratio and 95% confidence interval; SNP, single nucleotide polymorphism.
Results generated by UNPHASED. Only SNPs with nominal P < 0.001 are shown. P-values < 0.001 are in bold.
(a) IMGSAC primary sample, IMGSAC-R, ND (515–516 affected individuals).
(b) IMGSAC primary sample, IMGSAC-R (420–421 cases, 368 controls).
Although the strongest signals identified by the two approaches did not coincide, comparison of the results led us to pinpoint the most interesting loci supported by both methods, albeit with different strength. In addition, consistency of the results obtained by frequentist and Bayesian approaches suggested that our strongest signals are independent of the analysis method.
Primary association analysis of the chromosome 2 region identified the most interesting results in NOSTRIN, UPP2 and ZNF533. NOSTRIN encodes the nitric oxide synthase trafficker. Interestingly, the nitric oxide signaling pathway has been recently shown to be overrepresented in genes disrupted by CNVs in schizophrenia.47 However, the NOSTRIN association was stronger in the case–control analysis with only minor support from the TDT, and it was not confirmed in the replication sample or in the combined meta-analysis, suggesting that it might represent a false-positive result.
Similarly, the ZNF533 association was not replicated; however, rs7590028 remained one of the strongest signals in case–control combined analysis of IMGSAC samples. ZNF533 encodes a protein containing four matrin-type zinc fingers and is highly conserved in evolution. Given its putative nuclear location, it is thought to act as a repressor of transcription, although no specific targets are currently known. ZNF533 is widely expressed in adult tissues, including brain. Expression of all isoforms in fetal brain was confirmed by reverse transcriptase–PCR (data not shown). Deletions including ZNF533 have been described in several patients with a neurological phenotype including mental retardation,48,49 and other zinc-finger genes have also been implicated in mental retardation cases.50–52 The zinc-finger gene ZNF804A was recently identified as the strongest result in a genome-wide association study of schizophrenia and bipolar disorder,53 suggesting that they may act as transcription regulators in a wide range of human cognitive processes.
On chromosome 7, the most significant association result from the primary cohort was in the IMMP2L gene. Although SNPs in this gene failed to replicate in independent samples, the IMMP2L intronic SNP rs12537269 achieved the strongest result in the case–control meta-analysis of the IMGSAC sample (P = 7.3 × 10⁻⁵). This gene encodes an inner mitochondrial membrane protease-like protein and is a plausible candidate for autism, because it was previously reported to be disrupted in an individual with Tourette syndrome, a complex neuropsychiatric disorder showing phenotypic overlap with ASDs.54
Moreover, IMMP2L contains a neuronal leucine-rich repeat gene (LRRN3) nested within its large third intron. The expression profile of LRRN3 also makes it an interesting candidate gene for autism, as it is most highly expressed in fetal brain. Studies in Drosophila demonstrate that many members of the LRR family play an essential role in target recognition, axonal pathfinding and cell differentiation during neural development,55 and murine studies suggest these LRR proteins could have similar functions in mammalian neural development.56

Figure 2 Summary of IMMP2L and DOCK4 copy number variants (CNVs). Fragments tested by QMPSF are shown as red bars at the top. CNVs from the Database of Genomic Variants (DGV) are shown as orange bars. Deletions and duplications identified in affected individuals and in parents or controls are depicted at the bottom. Dashed and continuous lines indicate the maximum and minimum length of the CNVs, respectively. The distal breakpoint of the deletion in pedigree 15-0084 was defined by qPCR. The distal breakpoint of the duplication in pedigree 13-3023 was not defined precisely.
The only SNP that achieved significant replication, after Bonferroni correction for multiple testing, is rs2217262 in the neighboring gene DOCK4, also a good autism candidate. This gene encodes a protein that activates Rac GTPase and is often deleted during tumor progression.57 A recent study in rats indicates that DOCK4 is predominantly expressed in the hippocampus as well as in the lung.58 This study further demonstrated that in cultured hippocampal neurons, DOCK4 is upregulated at the same time as dendrites start growing, and that knockdown of this gene by RNA interference results in impaired dendritic morphogenesis.
The association result for rs2217262 indicates that the common allele in the population is associated with increased risk for autism, or the minor allele is a ‘protective’ variant. It has been shown that in the presence of missing data, SNPs with a low MAF may show a bias in TDT, resulting in artificial overtransmission of the common allele.59 This problem is not likely to apply to rs2217262, as this association was supported also by case–control analysis.
Although only the rs2217262 association was confirmed by replication analysis, suggesting that the other results may represent false positives, this polymorphism (with MAF only about 5%) would not alone account for the linkage signal seen at AUTS1 in the IMGSAC sample. It is thus possible that multiple loci might contribute to the overall linkage seen for this region, and that the other significant SNPs from primary analysis may in reality be true signals but with lower OR, which our replication study was underpowered to detect. We do recognize that several limitations may have affected our replication sample. The primary sample was composed of trios selected from multiplex families based on IBD sharing, thereby more likely to be enriched for susceptibility alleles. By contrast, the replication population was a more heterogeneous sample, not preselected on linkage, and was mostly composed of singleton families. Power calculation suggested that our replication sample (IMGSAC-R and ND) should give us sufficient power to replicate the most significant primary results. However, the well-known ‘winner's curse’ theory also suggests that the effect sizes from the initial study may have been overestimated, thus requiring a much larger sample for replication. We did not detect presence of structure in the combined IMGSAC primary and IMGSAC-R samples, but it is possible that heterogeneity may be present among the different samples used in this study (ND, Mount Sinai and University of Washington). This could have also contributed to the lack of replication, as could have gene–environment interactions, when different environmental exposures are present between population samples.
De novo and/or inherited CNVs are emerging as important causes of ASDs and other complex disorders.8,11–13 Hence we exploited our dense SNP genotyping data to mine for structural variants. The most interesting discovery is the occurrence of deletions and duplications in four independent families in the IMMP2L/DOCK4 locus, given the coincident SNP association also seen for these genes. A maternal deletion was transmitted to both affected sons and the unaffected daughter in family 15-0084. In all other instances (two deletions and one duplication) the second affected sib did not inherit the CNV. Interestingly, the maternally segregating deletion extends to the 3′ end of the DOCK4 gene, whereas the non-segregating deletions or those identified in controls and in the DGV were limited to IMMP2L. Taken together, these data seem to suggest that a copy number loss of DOCK4 may influence susceptibility to ASDs, whereas duplications may not be damaging. The effect of DOCK4 deletions might be less penetrant in women because the mother and the unaffected daughter also carried the deletion. Larger studies will be needed to confirm this hypothesis.
The predominantly gene-based nature of our study represents a possible limitation, as we may have missed susceptibility alleles in intergenic regions. Recent findings from the ENCODE Consortium emphasize the importance of looking at noncoding sequence, as several functional elements in the genome seem to be in these regions.60 We attempted to minimize this limitation by including several SNPs in non-genic evolutionary conserved elements.
Our study also suggests that no common variants of large effect size are present within genic regions at AUTS1 and AUTS5 and highlights the importance of very large sample sizes for identification of robust associations and rare CNVs with sufficient power for statistical significance. Evidence from recent genome-wide association studies for various disorders clearly shows that effect sizes for loci contributing to complex traits are generally lower than those predicted a few years ago.61 Several whole-genome association and CNV studies for autism are currently in progress by large consortia, and it will be interesting to see if any of the genes highlighted by this study are also identified by these extensive studies.
It is possible that rare variants, both point mutations and CNVs, may account for a larger fraction of the overall genetic risk in complex psychiatric disorders than previously assumed. The present study was not designed to assess the contribution of rare sequence variants and our results do not preclude that the chromosome 2q and 7q linkage regions may harbor rare variation showing allelic heterogeneity across families, which may require resequencing to uncover.
The inconclusive findings identified with this study reflect the status of the field of autism genetics and suggest that classical approaches such as linkage and association analyses alone may not be sufficient to deal with the genetic and phenotypic heterogeneity seen in autism. One recent study of note used homozygosity mapping to uncover a number of large homozygous deletions in consanguineous pedigrees, highlighting the utility of this approach for heterogeneous disorders like autism.10 Another successful study found linkage to 15q13.3–q14 in a subset of families with IQ ≥ 70, suggesting that the use of informative subphenotypes to define homogeneous sets of ASD families could be very important in detecting susceptibility loci involved in autism.62
Finally, another report indicated that the level of somatic CNVs between MZ twins may be higher than expected.63 If confirmed, this finding could be a powerful tool for identification of autism susceptibility loci in MZ twins with a discordant phenotype. We believe a combination of these (and other) novel approaches, together with traditional methods, will be required to uncover all the genes and biological pathways leading to autism.
In summary, the present high-density SNP association and CNV screen have provided evidence that variants in the IMMP2L/DOCK4 locus on chromosome 7 and in ZNF533 on chromosome 2 may increase susceptibility to ASDs. Association of the common allele of SNP rs2217262 in DOCK4 was supported by an independent replication, whereas the associations in IMMP2L and ZNF533 are not sufficiently significant in the context of multiple testing and warrant further studies.
Conflict of interest
The authors declare no conflict of interest.
Acknowledgments
We thank all the families who have participated in the study and the professionals who made this study possible. We also thank John Broxholme for bioinformatics support, Joseph Trakalo and Chris Allan at the WTCHG core genomics facility for Illumina and Sequenom genotyping, respectively. We especially thank Professor Giovanni Romeo at the Medical Genetics Unit, S Orsola-Malpighi Hospital, University of Bologna for his generous provision of laboratory space and equipment to EM, EB and CT. The CPEA (Collaborative Program of Excellence in Autism) thank Jeffery Munson, and Raphael Bernier and Annette Estes. This work was funded by the Nancy Lurie Marks Family Foundation; the Simons Foundation; the EC Sixth FP AUTISM MOLGEN, Telethon-Italy; the Korczak Foundation for Autism and Related Disorders; the Netherlands Organization for Scientific Research (NWO). The IMGSAC was funded by UK Medical Research Council, Wellcome Trust, BIOMED 2 (CT-97-2759), EC Fifth Framework (QLG2-CT-1999-0094), Deutsche Forschungsgemeinschaft, Fondation France Telecom, Conseil Regional Midi-Pyrenees, Danish Medical Research Council, Sofiefonden, Beatrice Surovell Haskells Fund for Child Mental Health Research of Copenhagen, Danish Natural Science Research Council (9802210) and National Institutes of Health (U19 HD35482, MO1 RR06022, K05 MH01196, K02 MH01389). AJ Bailey is the Cheryl and Reece Scott Professor of Psychiatry. AP Monaco is a Wellcome Trust principal research fellow.
References
1 Chakrabarti S, Fombonne E. Pervasive developmental disorders in preschool children: confirmation of high prevalence. Am J Psychiatry 2005; 162: 1133–1141.
2 Fombonne E. Epidemiology of autistic disorder and other pervasive developmental disorders. J Clin Psychiatry 2005; 66(Suppl 10): 3–8.
3 Baird G, Simonoff E, Pickles A, Chandler S, Loucas T, Meldrum D et al. Prevalence of disorders of the autism spectrum in a population cohort of children in South Thames: the Special Needs and Autism Project (SNAP). Lancet 2006; 368: 210–215.
4 Bailey A, Le Couteur A, Gottesman I, Bolton P, Simonoff E, Yuzda E et al. Autism as a strongly genetic disorder: evidence from a British twin study. Psychol Med 1995; 25: 63–77.
5 Bolton P, Macdonald H, Pickles A, Rios P, Goode S, Crowson M et al. A case–control family history study of autism. J Child Psychol Psychiatr 1994; 35: 877–900.
6 Vorstman JA, Staal WG, van Daalen E, van Engeland H, Hochstenbach PF, Franke L. Identification of novel autism candidate regions through analysis of reported cytogenetic abnormalities associated with autism. Mol Psychiatry 2006; 11: 1, 18–28.
7 Jamain S, Quach H, Betancur C, Rastam M, Colineaux C, Gillberg IC et al. Mutations of the X-linked genes encoding neuroligins NLGN3 and NLGN4 are associated with autism. Nat Genet 2003; 34: 27–29.
8 Szatmari P, Paterson AD, Zwaigenbaum L, Roberts W, Brian J, Liu XQ et al. Mapping autism risk loci using genetic linkage and chromosomal rearrangements. Nat Genet 2007; 39: 319–328.
9 Durand CM, Betancur C, Boeckers TM, Bockmann J, Chaste P, Fauchereau F et al. Mutations in the gene encoding the synaptic scaffolding protein SHANK3 are associated with autism spectrum disorders. Nat Genet 2007; 39: 25–27.
10 Morrow EM, Yoo SY, Flavell SW, Kim TK, Lin Y, Hill RS et al. Identifying autism loci and genes by tracing recent shared ancestry. Science 2008; 321: 218–223.
11 Christian SL, Brune CW, Sudi J, Kumar RA, Liu S, Karamohamed S et al. Novel submicroscopic chromosomal abnormalities detected in autism spectrum disorder. Biol Psychiatry 2008; 63: 1111–1117.
12 Marshall CR, Noor A, Vincent JB, Lionel AC, Feuk L, Skaug J et al. Structural variation of chromosomes in autism spectrum disorder. Am J Hum Genet 2008; 82: 477–488.
13 Sebat J, Lakshmi B, Malhotra D, Troge J, Lese-Martin C, Walsh T et al. Strong association of de novo copy number mutations with autism. Science 2007; 316: 445–449.
14 Ullmann R, Turner G, Kirchhoff M, Chen W, Tonge B, Rosenberg C et al. Array CGH identifies reciprocal 16p13.1 duplications and deletions that predispose to autism and/or mental retardation. Hum Mutat 2007; 28: 674–682.
15 Abrahams BS, Geschwind DH. Advances in autism genetics: on the threshold of a new neurobiology. Nat Rev Genet 2008; 9: 341–355.
16 IMGSAC. A full genome screen for autism with evidence for linkage to a region on chromosome 7q. Hum Molec Genet 1998; 7: 571–578.
17 IMGSAC. A genomewide screen for autism: strong evidence for linkage to chromosomes 2q, 7q, and 16p. Am J Hum Genet 2001; 69: 570–581.
18 Lamb JA, Barnby G, Bonora E, Sykes N, Bacchelli E, Blasi F et al. Analysis of IMGSAC autism susceptibility loci: evidence for sex limited and parent of origin specific effects. J Med Genet 2005; 42: 132–137.
19 Schellenberg GD, Dawson G, Sung YJ, Estes A, Munson J, Rosenthal E et al. Evidence for multiple loci from a genome scan of autism kindreds. Mol Psychiatry 2006; 11: 1049–1060, 979.
20 Pericak-Vance MA, Wolpert CM, Menold MM, Bass MP, Hauser ER, Donnelly SL et al. Chromosome 7 and autistic disorder (AD). Am J Hum Genet 1998; 63: A16.
21 Trikalinos TA, Karvouni A, Zintzaras E, Ylisaukko-oja T, Peltonen L, Jarvela I et al. A heterogeneity-based genome search meta-analysis for autism-spectrum disorders. Mol Psychiatry 2006; 11: 29–36.
22 Badner JA, Gershon ES. Regional meta-analysis of published data supports linkage of autism with markers on chromosome 7. Mol Psychiatry 2002; 7: 56–66.
23 Buxbaum JD, Silverman JM, Smith CJ, Kilifarski M, Reichert J, Hollander E et al. Evidence for a susceptibility gene for autism on chromosome 2 and for genetic heterogeneity. Am J Hum Genet 2001; 68: 1514–1520.
24 Shao Y, Raiford KL, Wolpert CM, Cope HA, Ravan SA, Ashley-Koch AA et al. Phenotypic homogeneity provides increased support for linkage on chromosome 2 in autistic disorder. Am J Hum Genet 2002; 70: 1058–1061.
25 Bonora E, Bacchelli E, Levy ER, Blasi F, Marlow A, Monaco AP et al. Mutation screening and imprinting analysis of four candidate genes for autism in the 7q32 region. Mol Psychiatry 2002; 7: 289–301.
26 Bonora E, Beyer KS, Lamb JA, Parr JR, Klauck SM, Benner A et al. Analysis of reelin as a candidate gene for autism. Mol Psychiatry 2003; 8: 885–892.
27 Bonora E, Lamb JA, Barnby G, Sykes N, Moberly T, Beyer KS et al. Mutation screening and association analysis of six candidate genes for autism on chromosome 7q. Eur J Hum Genet 2005; 13: 198–207.
28 Bacchelli E, Blasi F, Biondolillo M, Lamb JA, Bonora E, Barnby G et al. Screening of nine candidate genes for autism on chromosome 2q reveals rare nonsynonymous variants in the cAMP-GEFII gene. Mol Psychiatry 2003; 8: 916–924.
29 Blasi F, Bacchelli E, Carone S, Toma C, Monaco AP, Bailey AJ et al. SLC25A12 and CMYA3 gene variants are not associated with autism in the IMGSAC multiplex family sample. Eur J Hum Genet 2006; 14: 123–126.
30 Ackerman H, Usen S, Jallow M, Sisay-Joof F, Pinder M, Kwiatkowski DP. A comparison of case–control and family-based association methods: the example of sickle-cell and malaria. Ann Hum Genet 2005; 69: 559–565.
31 Fingerlin TE, Boehnke M, Abecasis GR. Increasing the power and efficiency of disease-marker case–control association studies through use of allele-sharing information. Am J Hum Genet 2004; 74: 432–443.
33 Ramoz N, Cai G, Reichert JG, Silverman JM, Buxbaum JD. An analysis of candidate autism loci on chromosome 2q24–q33: evidence for association to the STK39 gene. Am J Med Genet B Neuropsychiatr Genet 2008; 147B: 1152–1158.
34 http://www.hpacultures.org.uk/collections/ecacc.jsp.
35 Barrett JC, Fry B, Maller J, Daly MJ. HaploView: analysis and visualization of LD and haplotype maps. Bioinformatics 2005; 21: 263–265.
36 Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res 2005; 15: 1034–1050.
37 Seldin MF, Shigeta R, Villoslada P, Selmi C, Tuomilehto J, Silva G et al. European population substructure: clustering of northern and southern populations. PLoS Genet 2006; 2: e143.
38 Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007; 81: 559–575.
39 de Bakker PI, Yelensky R, Pe’er I, Gabriel SB, Daly MJ, Altshuler D. Efficiency and power in genetic association studies. Nat Genet 2005; 37: 1217–1223.
40 Morris AP. Direct analysis of unphased SNP genotype data in population-based association studies via Bayesian partition modelling of haplotypes. Genet Epidemiol 2005; 29: 91–107.
41 Morris AP. A flexible Bayesian framework for modeling haplotype association with disease, allowing for dominance effects of the underlying causative variants. Am J Hum Genet 2006; 79: 679–694.
42 Dudbridge F. Likelihood-based association analysis for nuclear families and unrelated subjects with missing genotype data. Hum Hered 2008; 66: 87–98.
43 Colella S, Yau C, Taylor JM, Mirza G, Butler H, Clouston P et al. QuantiSNP: an Objective Bayes Hidden-Markov Model to detect and accurately map copy number variation using SNP genotyping data. Nucleic Acids Res 2007; 35: 2013–2025.
44 Saugier-Veber P, Goldenberg A, Drouin-Garraud V, de La Rochebrochard C, Layet V, Drouot N et al. Simple detection of genomic microdeletions and microduplications using QMPSF in patients with idiopathic mental retardation. Eur J Hum Genet 2006; 14: 1009–1017.
45 Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multilocus genotype data. Genetics 2000; 155: 945–959.
46 Falush D, Stephens M, Pritchard JK. Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics 2003; 164: 1567–1587.
47 Walsh T, McClellan JM, McCarthy SE, Addington AM, Pierce SB, Cooper GM et al. Rare structural variants disrupt multiple genes in neurodevelopmental pathways in schizophrenia. Science 2008; 320: 539–543.
48 Mencarelli MA, Caselli R, Pescucci C, Hayek G, Zappella M, Renieri A et al. Clinical and molecular characterization of a patient with a 2q31.2–32.3 deletion identified by array-CGH. Am J Med Genet A 2007; 143A: 858–865.
49 Monfort S, Rosello M, Orellana C, Oltra S, Blesa D, Kok K et al. Detection of known and novel genomic rearrangements by array based comparative genomic hybridisation: deletion of ZNF533 and duplication of CHARGE syndrome genes. J Med Genet 2008; 45: 432–437.
50 Shoichet SA, Hoffmann K, Menzel C, Trautmann U, Moser B, Hoeltzenbein M et al. Mutations in the ZNF41 gene are associated with cognitive deficits: identification of a new candidate for X-linked mental retardation. Am J Hum Genet 2003; 73: 1341–1354.
52 Lugtenberg D, Yntema HG, Banning MJ, Oudakker AR, Firth HV, Willatt L et al. ZNF674: a new Kruppel-associated box-containing zinc-finger gene involved in nonsyndromic X-linked mental retardation. Am J Hum Genet 2006; 78: 265–278.
53 O’Donovan MC, Craddock N, Norton N, Williams H, Peirce T, Moskvina V et al. Identification of loci associated with schizophrenia by genome-wide association and follow-up. Nat Genet 2008.
54 Petek E, Windpassinger C, Vincent JB, Cheung J, Boright AP, Scherer SW et al. Disruption of a novel gene (IMMP2L) by a breakpoint in 7q31 associated with Tourette syndrome. Am J Hum Genet 2001; 68: 848–858.
55 Battye R, Stevens A, Perry RL, Jacobs JR. Repellent signaling by Slit requires the leucine-rich repeats. J Neurosci 2001; 21: 4290–4298.
57 Yajnik V, Paulding C, Sordella R, McClatchey AI, Saito M, Wahrer DC et al. DOCK4, a GTPase activator, is disrupted during tumorigenesis. Cell 2003; 112: 673–684.
58 Ueda S, Fujimoto S, Hiramoto K, Negishi M, Katoh H. Dock4 regulates dendritic development in hippocampal neurons. J Neurosci Res 2008; 86: 3052–3061.
59 Mitchell AA, Cutler DJ, Chakravarti A. Undetected genotyping errors cause apparent overtransmission of common alleles in the transmission/disequilibrium test. Am J Hum Genet 2003; 72: 598–610.
60 Birney E, Stamatoyannopoulos JA, Dutta A, Guigo R, Gingeras TR, Margulies EH et al. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 2007; 447: 799–816.
61 WTCCC. Genome-wide association study of 14 000 cases of seven common diseases and 3000 shared controls. Nature 2007; 447: 661–678.
63 Bruder CE, Piotrowski A, Gijsbers AA, Andersson R, Erickson S, de Stahl TD et al. Phenotypically concordant and discordant monozygotic twins display different DNA copy-number-variation profiles. Am J Hum Genet 2008; 82: 763–771.
This work is licensed under the Creative Commons Attribution-NonCommercial-No Derivative Works 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/
Supplementary Information accompanies the paper on the Molecular Psychiatry website (http://www.nature.com/mp)