EUROPEAN ORGANISATION FOR NUCLEAR RESEARCH (CERN)

CERN-PH-EP-2014-262
Submitted to: Journal of High Energy Physics

Evidence for the Higgs-boson Yukawa coupling to tau leptons with the ATLAS detector

The ATLAS Collaboration

Abstract

Results of a search for H → ττ decays are presented, based on the full set of proton–proton collision data recorded by the ATLAS experiment at the LHC during 2011 and 2012. The data correspond to integrated luminosities of 4.5 fb⁻¹ and 20.3 fb⁻¹ at centre-of-mass energies of √s = 7 TeV and √s = 8 TeV respectively. All combinations of leptonic (τ → ℓνν̄ with ℓ = e, μ) and hadronic (τ → hadrons ν) tau decays are considered. An excess of events over the expected background from other Standard Model processes is found with an observed (expected) significance of 4.5 (3.4) standard deviations. This excess provides evidence for the direct coupling of the recently discovered Higgs boson to fermions. The measured signal strength, normalised to the Standard Model expectation, of μ = 1.43^{+0.43}_{−0.37} is consistent with the predicted Yukawa coupling strength in the Standard Model.

© 2015 CERN for the benefit of the ATLAS Collaboration.
Reproduction of this article or parts of it is allowed as specified in the CC-BY-3.0 license.

arXiv:1501.04943v2 [hep-ex] 26 Jan 2015

Contents

1 Introduction
2 The ATLAS detector and object reconstruction
3 Data and simulated samples
4 Event selection and categorisation
  4.1 Event selection
  4.2 Analysis categories
  4.3 Higgs boson candidate mass reconstruction
5 Boosted decision trees
6 Background estimation
  6.1 Background from Z → ττ production
  6.2 Background from misidentified leptons or hadronically decaying taus
  6.3 Z → ee and Z → μμ background
  6.4 W+jets background
  6.5 Background from top-quark production
  6.6 Diboson background
  6.7 Contributions from other Higgs boson decays
  6.8 Validation of background estimates
7 Systematic uncertainties
  7.1 Experimental uncertainties
  7.2 Background modelling uncertainties
  7.3 Theoretical uncertainties
8 Signal extraction procedure
9 Results
10 Cut-based analysis
11 Conclusions
12 Acknowledgements

1 Introduction

The
investigation of the origin of electroweak symmetry breaking and, related to this, the experimental confirmation of the Brout–Englert–Higgs mechanism [1–6] is one of the prime goals of the physics programme at the Large Hadron Collider (LHC) [7]. With the discovery of a Higgs boson with a mass of approximately 125 GeV by the ATLAS [8] and CMS [9] collaborations, an important milestone has been reached. More precise measurements of the properties of the discovered particle [10, 11] as well as tests of the spin–parity quantum numbers [12–14] continue to be consistent with the predictions for the Standard Model (SM) Higgs boson.

These measurements rely predominantly on studies of the bosonic decay modes, H → γγ, H → ZZ and H → WW. To establish the mass generation mechanism for fermions as implemented in the SM, it is of prime importance to demonstrate the direct coupling of the Higgs boson to fermions and the proportionality of its strength to mass [15]. The most promising candidate decay modes are the decays into tau leptons, H → ττ, and bottom quarks (b-quarks), H → bb̄. Due to the high background, the search for decays to bb̄ is restricted to Higgs bosons produced in modes which have a more distinct signature but a lower cross-section, such as H production with an associated vector boson. The smaller rate of these processes in the presence of still large background makes their detection challenging. More favourable signal-to-background conditions are expected for H → ττ decays. Recently, the CMS Collaboration published evidence for H → ττ decays at a significance of 3.2σ [16], and an excess corresponding to a significance of 2.1σ in the search for H → bb̄ decays [17]. The combination of channels provides evidence for fermionic couplings with a significance of 3.8σ [18]. The yield of events in the search for H → bb̄ decays observed by the ATLAS Collaboration has a signal significance of 1.4σ [19]. The Tevatron experiments have observed an excess corresponding to 2.8σ in the H → bb̄ search [20].

In this
paper, the results of a search for H → ττ decays are presented, based on the full proton–proton dataset collected by the ATLAS experiment during the 2011 and 2012 data-taking periods, corresponding to integrated luminosities of 4.5 fb⁻¹ at a centre-of-mass energy of √s = 7 TeV and 20.3 fb⁻¹ at √s = 8 TeV. These results supersede the earlier upper limits on the cross section times the branching ratio obtained with the 7 TeV data [21]. All combinations of leptonic (τ → ℓνν̄ with ℓ = e, μ) and hadronic (τ → hadrons ν) tau decays are considered.¹ The corresponding three analysis channels are denoted by τ_lep τ_lep, τ_lep τ_had, and τ_had τ_had in the following. The search is designed to be sensitive to the major production processes of a SM Higgs boson, i.e. production via gluon fusion (ggF) [22], vector-boson fusion (VBF) [23], and associated production (VH) with V = W or Z. These production processes lead to different final-state signatures, which are exploited by defining an event categorisation. Two dedicated categories are considered to achieve both a good signal-to-background ratio and good resolution for the reconstructed ττ invariant mass. The VBF category, enriched in events produced via vector-boson fusion, is defined by the presence of two jets with a large separation in pseudorapidity. The boosted category contains events where the reconstructed Higgs boson candidate has a large transverse momentum. It is dominated by events produced via gluon fusion with additional jets from gluon radiation. In view of the signal-to-background conditions, and in order to exploit correlations between final-state observables, a multivariate analysis technique, based on boosted decision trees (BDTs) [24–26], is used to extract the final results. As a cross-check, a separate analysis where cuts on kinematic variables are applied is carried out.

¹ Throughout this paper the inclusion of charge-conjugate decay modes is implied.

2 The ATLAS detector and object reconstruction

The ATLAS detector [27] is
a multi-purpose detector with a cylindrical geometry. It comprises an inner detector (ID) surrounded by a thin superconducting solenoid, a calorimeter system and an extensive muon spectrometer in a toroidal magnetic field. The ID tracking system consists of a silicon pixel detector, a silicon microstrip detector (SCT), and a transition radiation tracker (TRT). It provides precise position and momentum measurements for charged particles and allows efficient identification of jets containing b-hadrons (b-jets) in the pseudorapidity range |η| < 2.5. The ID is immersed in a 2 T axial magnetic field and is surrounded by high-granularity lead/liquid-argon (LAr) sampling electromagnetic calorimeters which cover the pseudorapidity range |η| [...] 30 GeV. To reduce the contamination of jets by
additional interactions in the same or neighbouring bunch crossings (pile-up), tracks originating from the primary vertex must contribute a large fraction of the p_T when summing the scalar p_T of all tracks in the jet. This jet vertex fraction (JVF) is required to be at least 75% (50%) for jets with |η| < 2.4 in the 7 TeV (8 TeV) dataset. Moreover, for the 8 TeV dataset, the JVF selection is applied only to jets with p_T < 50 GeV. Jets with no associated tracks are retained.

In the pseudorapidity range |η| < 2.5, b-jets are selected using a tagging algorithm [38]. The b-jet tagging algorithm has an efficiency of 60–70% for b-jets in simulated tt̄ events. The corresponding light-quark jet misidentification probability is 0.1–1%, depending on the jet's p_T and η.

Hadronically decaying tau leptons are reconstructed starting from clusters of energy in the electromagnetic and hadronic calorimeters. The τ_had³ reconstruction is seeded by the anti-k_t jet finding algorithm with a radius parameter R = 0.4. Tracks with p_T > 1 GeV within a cone of radius 0.2 around the cluster barycentre are matched to the τ_had candidate, and the τ_had charge is determined from the sum of the charges of its associated tracks.
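The track matching and charge assignment described above can be illustrated with a minimal Python sketch. It is not the ATLAS reconstruction software: the track tuple layout, the function names and the example values are assumptions made for illustration. Only the p_T > 1 GeV requirement and the cone of radius 0.2 are taken from the text.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2), with d_phi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def match_tracks(cand_eta, cand_phi, tracks, min_pt=1.0, cone=0.2):
    """Keep tracks (pt [GeV], eta, phi, charge) with pt > min_pt inside the cone."""
    return [t for t in tracks
            if t[0] > min_pt and delta_r(cand_eta, cand_phi, t[1], t[2]) < cone]

def tau_charge(matched_tracks):
    """Candidate charge: sum of the charges of the associated tracks."""
    return sum(t[3] for t in matched_tracks)

# Hypothetical candidate at (eta, phi) = (0.50, 1.00) with five nearby tracks.
tracks = [
    (12.0, 0.52, 1.03, +1),
    (8.0, 0.45, 0.95, +1),
    (5.0, 0.55, 1.10, -1),
    (0.6, 0.50, 1.00, -1),   # fails the pT requirement
    (9.0, 0.90, 1.00, -1),   # outside the cone
]
matched = match_tracks(0.50, 1.00, tracks)
n_prong = len(matched)
charge = tau_charge(matched)
```

In this toy case three tracks are matched and their charges sum to +1, so the candidate would be a 3-prong candidate with unit charge.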
The rejection of jets is provided in a separate identification step using discriminating variables based on tracks with p_T > 1 GeV and the energy deposited in calorimeter cells found in the core region (ΔR < 0.2) and in the region 0.2 < ΔR < 0.4 around the τ_had candidate's direction. Such discriminating variables are combined in a boosted decision tree and three working points, labelled tight, medium and loose [39], are defined, corresponding to different τ_had identification efficiency values.

³ In the following, the τ_had symbol always refers to the visible decay products of the τ hadronic decay.

In this analysis, τ_had candidates with p_T > 20 GeV and |η| < 2.47 are used. The τ_had candidates are required to have charge ±1, and must be 1- or 3-track (prong) candidates. In addition, a sample without the charge and track multiplicity requirements is retained for background modelling in the τ_had τ_had channel, as described in section 6.2. The identification efficiency for τ_had candidates satisfying the medium criteria is of the order of 55–60%. Dedicated criteria [39] to separate τ_had candidates from misidentified electrons are also applied, with a selection efficiency for true τ_had decays of 95%. The probability to misidentify a jet with p_T > 20 GeV as a τ_had candidate is typically 1–2%.

Following their reconstruction, candidate leptons, hadronically decaying taus and jets may point to the same energy deposits
in the calorimeters (within ΔR [...] 40 GeV, Z → μμ events can be selected from the data with high efficiency and purity. To replace the muons in the selected events, all tracks associated with the muons are removed and calorimeter cell energies associated with the muons are corrected by subtracting the corresponding energy depositions in a single simulated Z → μμ event with the same kinematics. Finally, both the track information and the calorimeter cell energies from a simulated Z → ττ decay are added to the data event. The decays of the tau leptons are simulated by Tauola [71].

⁴ These processes are hereafter for simplicity denoted by Z → ττ and Z → ℓℓ respectively, even though the whole continuum above and below the Z peak is considered.

The tau lepton kinematics are matched to the kinematics of the muons they are replacing, including polarisation and spin correlations [72], and the mass difference between the muons and the tau leptons is accounted for. This hybrid sample is referred to as embedded data in the
following.

Other background processes are simulated using different generators, each interfaced to Pythia [46, 73] or Herwig [74, 75] to provide the parton shower, hadronisation and the modelling of the underlying event, as indicated in table 1. For the Herwig samples, the decays of tau leptons are also simulated using Tauola [71]. Photon radiation from charged leptons for all samples is provided by Photos [76]. The samples for W/Z+jets production are generated with Alpgen [77], employing the MLM matching scheme [78] between the hard process (calculated with LO matrix elements for up to five partons) and the parton shower. For WW production, the loop-induced gg → WW process is also generated, using the gg2WW [79] program. In the AcerMC [80], Alpgen, and Herwig event generators, the CTEQ6L1 parameterisation of the PDFs is used, while the CT10 parameterisation is used for the generation of events with gg2WW. The normalisation of these background contributions is either estimated from control regions using data, as described in section 6, or taken from the cross sections quoted in table 1.

For all samples, a full simulation of the ATLAS detector response [81] using the Geant4 program [82] was performed. In addition, events from minimum-bias interactions were simulated using the AU2 [83] parameter tuning of Pythia8. The AU2 tune includes the set of optimised parameters for the parton shower, hadronisation, and multiple parton interactions. The minimum-bias events are overlaid on the simulated signal and background events according to the luminosity profile of the recorded data. The contributions from these pile-up interactions are simulated both within the same bunch crossing as the hard-scattering process and in neighbouring bunch crossings. Finally, the resulting simulated events are processed through the same reconstruction programs as the data.

4 Event selection and categorisation

4.1 Event selection

Single lepton, dilepton and di-τ_had triggers were used to
select the events for the analysis. A summary of the triggers used by each channel at the two centre-of-mass energies is reported in table 2. Due to the increasing luminosity and the different pile-up conditions, the online p_T thresholds increased during data-taking in 2011 and again for 2012, and more stringent identification requirements were applied for the data-taking in 2012. The p_T requirements on the objects in the analysis are usually 2 GeV higher than the trigger requirements, to ensure that the trigger is fully efficient.

In addition to applying criteria to ensure that the detector was functioning properly, requirements to increase the purity and quality of the data sample are applied by rejecting non-collision events such as cosmic rays and beam-halo events. At least one reconstructed vertex is required with at least four associated tracks with p_T > 400 MeV and a position consistent with the beam
spot.

Signal (m_H = 125 GeV) | MC generator | σ × B [pb], √s = 8 TeV | QCD order
ggF, H → ττ | Powheg [42–45] + Pythia8 [46] | 1.22 | NNLO+NNLL [48–53, 84]
VBF, H → ττ | Powheg + Pythia8 | 0.100 | (N)NLO [57–59, 84]
WH, H → ττ | Pythia8 | 0.0445 | NNLO [62, 84]
ZH, H → ττ | Pythia8 | 0.0262 | NNLO [62, 84]

Background | MC generator | σ × B [pb], √s = 8 TeV | QCD order
W(→ ℓν), (ℓ = e, μ, τ) | Alpgen [77] + Pythia8 | 36800 | NNLO [85, 86]
Z/γ*(→ ℓℓ), 60 GeV < m_ℓℓ < 2 TeV | Alpgen + Pythia8 | 3910 | NNLO [85, 86]
Z/γ*(→ ℓℓ), 10 GeV < m_ℓℓ < 60 GeV | Alpgen + Herwig [87] | 13000 | NNLO [85, 86]
VBF Z/γ*(→ ℓℓ) | Sherpa [88] | 1.1 | LO [88]
tt̄ | Powheg + Pythia8 | 253† | NNLO+NNLL [89–94]
Single top: Wt | Powheg + Pythia8 | 22† | NNLO [95]
Single top: s-channel | Powheg + Pythia8 | 5.6† | NNLO [96]
Single top: t-channel | AcerMC [80] + Pythia6 [73] | 87.8† | NNLO [97]
qq̄ → WW | Alpgen + Herwig | 54† | NLO [98]
gg → WW | gg2WW [79] + Herwig | 1.4† | NLO [79]
WZ, ZZ | Herwig | 30† | NLO [98]
H → WW | same as for H → ττ signal | 4.7† |

Table 1. Monte Carlo generators used to model the signal and the background processes at √s = 8 TeV. The cross sections times branching fractions (σ × B) used for the normalisation of some processes (many of these are subsequently normalised to data) are included in the last column together with the perturbative order of the QCD calculation. For the signal processes the H → ττ SM branching ratio is included, and for the W and Z/γ* background processes the branching ratios for leptonic decays (ℓ = e, μ, τ) of the bosons are included. For all other background processes, inclusive cross sections are quoted (marked with a †).

With respect to the object identification requirements described in section 2, tighter criteria are applied to address the different background contributions and compositions in the different analysis channels. Higher p_T thresholds are applied to electrons, muons, and τ_had candidates according to the trigger conditions satisfied by the event, as listed in table 2. For the channels involving leptonic tau decays, τ_lep τ_lep and τ_lep τ_had, additional isolation criteria for electrons and muons, based on tracking and calorimeter information, are used to suppress the background from misidentified jets or from semileptonic decays of charm and bottom hadrons. The calorimeter isolation variable I(E_T, ΔR) is defined as the sum of the total transverse energy in the calorimeter in a cone of size ΔR around the electron cluster or the muon track, divided by the E_T of the electron cluster or the p_T of the muon respectively.
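As a hedged illustration of the calorimeter isolation ratio just defined, the following Python sketch sums the transverse energy of deposits inside a cone and divides by the lepton's E_T (or p_T). The data layout, the function names, the example numbers and the threshold are assumptions for illustration only. In particular, the sketch assumes that the list of deposits does not contain the lepton's own energy, which the text does not specify.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2), with d_phi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def calo_isolation(lepton, deposits, cone):
    """Relative isolation: summed E_T inside the cone divided by the lepton E_T (or p_T).

    lepton   -- (et, eta, phi) of the electron cluster or muon track
    deposits -- iterable of (et, eta, phi); assumed to exclude the lepton itself
    """
    et, eta, phi = lepton
    cone_sum = sum(d_et for d_et, d_eta, d_phi in deposits
                   if delta_r(eta, phi, d_eta, d_phi) < cone)
    return cone_sum / et

# Hypothetical 30 GeV electron cluster with three nearby calorimeter deposits.
lepton = (30.0, 0.10, 0.50)
deposits = [(1.2, 0.15, 0.55), (0.6, 0.05, 0.40), (4.0, 0.60, 0.50)]
iso = calo_isolation(lepton, deposits, cone=0.2)
is_isolated = iso < 0.08   # illustrative threshold, not a value from the analysis
```

Only the first two deposits fall inside the cone, so the ratio is 1.8 / 30 = 0.06 in this toy example.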
The track-based isolation I(p_T, ΔR) is defined as the sum of the transverse momenta of tracks within a cone of ΔR around the electron or muon track, divided by the E_T of the electron cluster or the muon p_T respectively. The isolation requirements applied are slightly different for the two centre-of-mass energies and are listed in table 3.

√s = 7 TeV
Trigger | Trigger-level thresholds, p_T [GeV] | Analysis-level thresholds [GeV]: τ_lep τ_lep | τ_lep τ_had | τ_had τ_had
Single electron | 20–22 | eμ: p_T^e > 22–24, p_T^μ > 10 | eτ_had: p_T^e > 25, p_T^τ > 20 | -
Single muon | 18 | μμ: p_T^μ1 > 20, p_T^μ2 > 10; eμ: p_T^μ > 20, p_T^e > 15 | μτ_had: p_T^μ > 22, p_T^τ > 20 | -
Di-electron | 12/12 | ee: p_T^e1 > 15, p_T^e2 > 15 | - | -
Di-τ_had | 29/20 | - | - | τ_had τ_had: p_T^τ1 > 35, p_T^τ2 > 25

√s = 8 TeV
Trigger | Trigger-level thresholds, p_T [GeV] | Analysis-level thresholds [GeV]: τ_lep τ_lep | τ_lep τ_had | τ_had τ_had
Single electron | 24 | eμ: p_T^e > 26, p_T^μ > 10; ee: p_T^e1 > 26, p_T^e2 > 15 | eτ_had: p_T^e > 26, p_T^τ > 20 | -
Single muon | 24 | - | μτ_had: p_T^μ > 26, p_T^τ > 20 | -
Di-electron | 12/12 | ee: p_T^e1 > 15, p_T^e2 > 15 | - | -
Di-muon | 18/8 | μμ: p_T^μ1 > 20, p_T^μ2 > 10 | - | -
Electron+muon | 12/8 | eμ: p_T^e > 15, p_T^μ > 10 | - | -
Di-τ_had | 29/20 | - | - | τ_had τ_had: p_T^τ1 > 35, p_T^τ2 > 25

Table 2. Summary of the triggers used to select events for the different analysis channels at the two centre-of-mass energies. The transverse momentum thresholds applied at trigger level and in the analysis are listed. When more than one trigger is used, a logical OR is taken and the trigger efficiencies are calculated accordingly.

In the τ_had τ_had channel, isolated taus are selected by requiring that there are no tracks with p_T > 0.5 GeV in an isolation region of 0.2 [...] 25 GeV.

Within the
collinear approximation [99], i.e. assuming that the tau directions are given by the directions of the visible tau decay products and that the momenta of the neutrinos constitute the missing transverse momentum, the tau momenta can be reconstructed. For tau decays, the fractions of the tau momenta carried by the visible decay products, x_τ,i = p_vis,i / (p_vis,i + p_mis,i), with i = 1, 2, are expected to lie in the interval 0 [...] 70 GeV are rejected. Contributions from tt̄ events are reduced by rejecting events with a b-jet with p_T > 30 GeV.

The τ_had τ_had channel: one isolated medium τ_had candidate and one isolated tight τ_had candidate with opposite-sign (OS) charges are required. Events with electron or muon candidates are rejected. For all data, the missing transverse momentum must satisfy E_T^miss > 20 GeV and its direction must either be between the two visible τ_had candidates in φ or within [...] 40 GeV, to eliminate low-mass Z/γ* events. Although this category is dominated by VBF events, it also includes smaller contributions from ggF and VH production.

The boosted category targets events with a boosted
Higgs boson produced via ggF. Higgs boson candidates are therefore required to have large transverse momentum, p_T^H > 100 GeV. The p_T^H is reconstructed using the vector sum of the missing transverse momentum and the transverse momentum of the visible tau decay products. In the τ_lep τ_lep channel, at least one jet with p_T > 40 GeV is required. The jet requirement selects a region of the phase space where the E_T^miss of same-flavour events is well modelled by simulation. In order to define an orthogonal category, events passing the VBF category selection are not considered. This category also includes small contributions from VBF and VH production.

⁷ $m_T = \sqrt{2\, p_T^{\ell}\, E_T^{\mathrm{miss}}\, (1 - \cos\Delta\phi)}$, where Δφ is the azimuthal separation between the directions of the lepton and the missing transverse momentum.

Preselection cuts
  τ_lep τ_lep:
    Exactly two isolated opposite-sign leptons
    Events with τ_had candidates are rejected
    30 GeV < m_vis^ττ < 100 (75) GeV for DF (SF) events
    Δφ_ℓℓ < 2.5
    E_T^miss > 20 (40) GeV for DF (SF) events
    E_T^miss,HPTO > 40 GeV for SF events
    p_T^ℓ1 + p_T^ℓ2 > 35 GeV
    Events with a b-tagged jet with p_T > 25 GeV are rejected
    0.1 < x_τ1, x_τ2 < 1
    m_coll^ττ > m_Z − 25 GeV
  τ_lep τ_had:
    Exactly one isolated lepton and one medium τ_had candidate with opposite charges
    m_T < 70 GeV
    Events with a b-tagged jet with p_T > 30 GeV are rejected
  τ_had τ_had:
    One isolated medium and one isolated tight opposite-sign τ_had candidate
    Events with leptons are vetoed
    E_T^miss > 20 GeV
    E_T^miss points between the two visible taus in φ, or min[Δφ(τ, E_T^miss)] < π/4
    0.8 < ΔR(τ_had1, τ_had2) < 2.4
    Δη(τ_had1, τ_had2) < 1.5

VBF category selection cuts
  τ_lep τ_lep:
    At least two jets with p_T^j1 > 40 GeV and p_T^j2 > 30 GeV
    Δη(j1, j2) > 2.2
  τ_lep τ_had:
    At least two jets with p_T^j1 > 50 GeV and p_T^j2 > 30 GeV
    Δη(j1, j2) > 3.0
    m_vis^ττ > 40 GeV
  τ_had τ_had:
    At least two jets with p_T^j1 > 50 GeV and p_T^j2 > 30 GeV
    p_T^j2 > 35 GeV for jets with |η| > 2.4
    Δη(j1, j2) > 2.0

Boosted category selection cuts
  τ_lep τ_lep:
    At least one jet with p_T > 40 GeV
  All channels:
    Failing the VBF selection
    p_T^H > 100 GeV

Table 4. Summary of the event selection for the three analysis channels. The requirements used in both the preselection and for the definition of the analysis categories are given. The labels (1) and (2) refer to the leading (highest p_T) and subleading final-state objects (leptons, τ_had, jets). The variables are defined in the text.

While these categories are conceptually identical across the three channels, differences in the dominant background contributions require different selection criteria. For both categories, the requirement on jets is inclusive and additional jets, apart from those passing the category requirements, are allowed.

For the τ_had τ_had channel, the so-called rest category is used as a control region. In this category, events passing the preselection requirements but not passing the VBF or boosted category selections are considered. This category is used to constrain the Z → ττ and multijet background contributions. The signal contamination in this category is negligible.

4.3 Higgs boson candidate mass reconstruction

The di-tau invariant mass (m_MMC^ττ) is
reconstructed using the missing mass calculator (MMC) [100]. This requires solving an underconstrained system of equations for six to eight unknowns, depending on the number of neutrinos in the ττ final state. These unknowns include the x-, y-, and z-components of the momentum carried by the neutrinos for each of the two tau leptons in the event, and the invariant mass of the two neutrinos from any leptonic tau decays. The calculation uses the constraints from the measured x- and y-components of the missing transverse momentum, and the visible masses of both tau candidates. A scan is performed over the two components of the missing transverse momentum vector and the yet undetermined variables. Each scan point is weighted by its probability according to the E_T^miss resolution and the tau decay topologies. The estimator for the ττ mass is defined as the most probable value of the scan points.

The MMC algorithm provides a solution for 99% of the H → ττ and Z → ττ events. This is a distinct advantage compared to the mass calculation using the collinear approximation where the failure rate is higher due to the implicit collinearity assumptions. The small loss rate of about 1% for signal events is due to large fluctuations of the E_T^miss measurement or other scan variables.

Figure 1 shows reconstructed m_MMC^ττ mass distributions for H → ττ and Z → ττ events in the τ_lep τ_had VBF and boosted categories. The mass resolution, defined as the ratio between the full width at half maximum (FWHM) and the peak value of the mass distribution (m_peak), is found to be ≈ 30% for all categories and channels.
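The resolution figure of merit quoted above, FWHM / m_peak, can be extracted from a binned mass distribution along the following lines. This is a minimal sketch and not the procedure used in the analysis: the function name, the linear interpolation between bin centres and the Gaussian toy histogram (mean 122.5 GeV, width 16 GeV) are all assumptions chosen for illustration.

```python
import math

def fwhm_over_peak(centres, contents):
    """Return (m_peak, FWHM / m_peak) for a single-peaked binned distribution."""
    i_peak = max(range(len(contents)), key=contents.__getitem__)
    half = contents[i_peak] / 2.0

    def crossing(step):
        # Walk away from the peak while the neighbouring bin is still above half
        # maximum, then interpolate linearly between the two bin centres.
        i = i_peak
        while 0 <= i + step < len(contents) and contents[i + step] >= half:
            i += step
        j = i + step
        if not 0 <= j < len(contents):
            raise ValueError("distribution does not fall below half maximum")
        frac = (contents[i] - half) / (contents[i] - contents[j])
        return centres[i] + frac * (centres[j] - centres[i])

    width = crossing(+1) - crossing(-1)
    return centres[i_peak], width / centres[i_peak]

# Toy distribution: 5 GeV bins from 0 to 200 GeV filled with a Gaussian shape.
centres = [2.5 + 5.0 * i for i in range(40)]
contents = [math.exp(-0.5 * ((m - 122.5) / 16.0) ** 2) for m in centres]
m_peak, resolution = fwhm_over_peak(centres, contents)
```

For a Gaussian shape the FWHM is about 2.355 times the width, so this toy input gives a ratio close to 0.31, similar in size to the resolution quoted in the text.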
[Figure 1: two panels showing the fraction of events per 5 GeV as a function of m_MMC^ττ [GeV] between 0 and 200 GeV, for Z → ττ and H(125) → ττ in the eτ_had + μτ_had final states. (a) VBF: m_peak(Z → ττ) = 92.4 GeV, m_peak(H → ττ) = 123.2 GeV, FWHM/m_peak ≈ 30%. (b) Boosted: m_peak(Z → ττ) = 90.4 GeV, m_peak(H → ττ) = 122.3 GeV, FWHM/m_peak ≈ 30%.]

Figure 1. The reconstructed invariant ττ mass, m_MMC^ττ, for H → ττ (m_H = 125 GeV) and Z → ττ events in MC simulation and embedding respectively, for events passing (a) the VBF category selection and (b) the boosted category selection in the τ_lep τ_had channel.

5 Boosted decision trees

Boosted decision trees are used in each category to extract the Higgs boson signal from the large number of background events. Decision trees [24] recursively partition the parameter space into multiple regions where signal or background purities are enhanced. Boosting is a method which improves the performance and stability of decision trees and involves the combination of many trees into a single final discriminant [25, 26]. After boosting, the final score undergoes a transformation to map the scores on the interval −1 to +1. The most signal-like events have scores near 1 while the most background-like events have scores near −1.

Separate BDTs are trained for each analysis category and channel with signal and background samples, described in section 6, at √s = 8 TeV. They are then applied to the analysis of the data at both centre-of-mass energies. The separate training naturally exploits differences in event kinematics between different Higgs boson production modes. It also allows different discriminating variables to be used to address the different background compositions in each channel. For the training in the VBF category, only a VBF Higgs production signal sample is used, while training in the boosted category uses ggF, VBF, and VH signal samples. The Higgs boson mass is chosen to be m_H = 125 GeV for all signal samples.
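The idea of combining many weak trees into one discriminant with scores between −1 and +1 can be sketched in a few lines of Python. The sketch below uses AdaBoost with single-cut trees (decision stumps) as a stand-in, which is an assumption: this section does not state which boosting algorithm or tree depth the analysis uses. The two input variables, their toy Gaussian distributions and all function names are likewise hypothetical.

```python
import math
import random

def stump_predict(stump, x):
    """A depth-one tree: +sign above the cut on one variable, -sign below."""
    var, cut, sign = stump
    return sign if x[var] > cut else -sign

def train_stump(X, y, w):
    """Find the single cut with the smallest weighted misclassification rate."""
    best_err, best = None, None
    for var in range(len(X[0])):
        for cut in sorted({x[var] for x in X}):
            for sign in (+1, -1):
                stump = (var, cut, sign)
                err = sum(wi for x, yi, wi in zip(X, y, w)
                          if stump_predict(stump, x) != yi)
                if best_err is None or err < best_err:
                    best_err, best = err, stump
    return best, best_err

def train_bdt(X, y, n_trees=10):
    """AdaBoost: reweight misclassified events and add one stump per iteration."""
    w = [1.0 / len(X)] * len(X)
    ensemble = []
    for _ in range(n_trees):
        stump, err = train_stump(X, y, w)
        err = min(max(err, 1e-9), 1.0 - 1e-9)      # guard against log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, x))
             for x, yi, wi in zip(X, y, w)]
        norm = sum(w)
        w = [wi / norm for wi in w]
    return ensemble

def bdt_score(ensemble, x):
    """Weighted vote of all stumps, normalised to the interval [-1, +1]."""
    total = sum(alpha for alpha, _ in ensemble)
    return sum(alpha * stump_predict(s, x) for alpha, s in ensemble) / total

# Toy training sample: (mass-like variable, rapidity-gap-like variable).
rng = random.Random(1)
signal = [(rng.gauss(125.0, 15.0), rng.gauss(4.0, 1.0)) for _ in range(60)]
background = [(rng.gauss(90.0, 15.0), rng.gauss(2.5, 1.0)) for _ in range(60)]
X = signal + background
y = [+1] * len(signal) + [-1] * len(background)
bdt = train_bdt(X, y)
mean_sig = sum(bdt_score(bdt, x) for x in signal) / len(signal)
mean_bkg = sum(bdt_score(bdt, x) for x in background) / len(background)
```

Dividing the weighted vote by the sum of the tree weights is one simple way to obtain a score bounded by −1 and +1. Signal-like toy events receive higher average scores than background-like ones.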
The BDT input variables used at both centre-of-mass energies are listed in table 5. Most of these variables have straightforward definitions, and the more complex ones are defined in the following.

• ΔR(τ1, τ2): the distance ΔR between the two leptons, between the lepton and τ_had, or between the two τ_had candidates, depending on the decay mode.

• p_T^Total: magnitude of the vector sum of the transverse momenta of the visible tau decay products, the two leading jets, and E_T^miss.

• Sum p_T: scalar sum of the p_T of the visible components of the tau decay products and of the jets.

• E_T^miss φ centrality: a variable that quantifies the relative angular position of the missing transverse momentum with respect to the visible tau decay products in the transverse plane. The transverse plane is transformed such that the directions of the tau decay products are orthogonal, and that the smaller φ angle between the tau decay products defines the positive quadrant of the transformed plane. The E_T^miss φ centrality is defined as the sum of the x- and y-components of the E_T^miss unit vector in this transformed plane.

• Sphericity: a variable that describes the isotropy of the energy flow in the event [101]. It is based on the quadratic momentum tensor

  $$S^{\alpha\beta} = \frac{\sum_i p_i^{\alpha} p_i^{\beta}}{\sum_i |\vec{p}_i|^{2}}. \qquad (5.1)$$

  In this equation, α and β are the indices of the tensor. The summation is performed over the momenta of the selected leptons and jets in the event. The sphericity of the event (S) is then defined in terms of the two smallest eigenvalues of this tensor, λ_2 and λ_3,

  $$S = \frac{3}{2}\,(\lambda_2 + \lambda_3). \qquad (5.2)$$

• min(Δη_ℓ1ℓ2,jets): the minimum Δη between the dilepton system and either of the two jets.

• Object η centrality: a variable that quantifies the η position of an object (an isolated lepton, a τ_had candidate or a jet) with respect to the two leading jets in the event. It is defined as

  $$C_{\eta_1,\eta_2}(\eta) = \exp\left[ \frac{-4}{(\eta_1 - \eta_2)^2} \left( \eta - \frac{\eta_1 + \eta_2}{2} \right)^{2} \right], \qquad (5.3)$$

  where η, η_1 and η_2 are the pseudorapidities of the object and the two leading jets respectively.
  This variable has a value of 1 when the object is halfway in η between the two jets, 1/e when the object is aligned with one of the jets, and < 1/e when the object is not between the jets in η. In the τ_lep τ_lep channel the η centrality of a third jet in the event, C_η1,η2(η_j3), and the product of the η centralities of the two leptons are used as BDT input variables, while in the τ_lep τ_had channel the η centrality of the lepton, C_η1,η2(η_ℓ), is used, and in the τ_had τ_had channel the η centrality of each τ, C_η1,η2(η_τ1) and C_η1,η2(η_τ2), is used. Events with only two jets are assigned a dummy value of −0.5 for C_η1,η2(η_j3).

Among these variables the most discriminating ones include m_MMC^ττ, ΔR(τ1, τ2) and Δη(j1, j2). Figure 2 shows the distributions of selected BDT input variables. For the VBF category, the distributions of Δη(j1, j2) are shown for all three channels. For the boosted category, the distributions of ΔR(τ1, τ2) are shown for the τ_lep τ_had and τ_had τ_had channels and the distribution of the p_T of the leading jet is shown for the τ_lep τ_lep channel. For all distributions, the data are compared to the predicted SM backgrounds at √s = 8 TeV. The corresponding uncertainties are indicated by the shaded bands. All input distributions are well described, giving confidence that the background models (from simulation and data) describe well the relevant input variables of the BDT. Similarly, good agreement is found for the distributions at √s = 7 TeV.
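Two of the more complex BDT inputs defined in this section, the sphericity of eqs. (5.1) and (5.2) and the object η centrality of eq. (5.3), can be evaluated with the short Python sketch below. It is an illustration under assumptions rather than the analysis code: the function names and the example momenta are hypothetical, and the eigenvalues are obtained with the standard closed-form (trigonometric) solution for a real symmetric 3×3 matrix so that only the standard library is needed.

```python
import math

def sphericity(momenta):
    """S = 3/2 (lambda_2 + lambda_3) for a list of (px, py, pz) momenta."""
    norm = sum(px * px + py * py + pz * pz for px, py, pz in momenta)
    a = [[sum(p[i] * p[j] for p in momenta) / norm for j in range(3)]
         for i in range(3)]
    off = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if off == 0.0:
        # The tensor is already diagonal.
        eig = sorted((a[0][0], a[1][1], a[2][2]), reverse=True)
    else:
        q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
        p = math.sqrt((sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * off) / 6.0)
        b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
             for i in range(3)]
        det = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
               - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
               + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
        phi = math.acos(max(-1.0, min(1.0, det / 2.0))) / 3.0
        lam1 = q + 2.0 * p * math.cos(phi)                        # largest
        lam3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)  # smallest
        eig = [lam1, 3.0 * q - lam1 - lam3, lam3]
    return 1.5 * (eig[1] + eig[2])

def eta_centrality(eta, eta1, eta2):
    """Object eta centrality of eq. (5.3) with respect to jets at eta1 and eta2."""
    return math.exp(-4.0 / (eta1 - eta2) ** 2 * (eta - 0.5 * (eta1 + eta2)) ** 2)

# Hypothetical momentum configurations (arbitrary units).
s_pencil = sphericity([(1.0, 1.0, 0.0), (-1.0, -1.0, 0.0)])      # back-to-back
s_isotropic = sphericity([(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                          (0, -1, 0), (0, 0, 1), (0, 0, -1)])
c_middle = eta_centrality(0.0, -2.0, 2.0)
c_on_jet = eta_centrality(2.0, -2.0, 2.0)
c_outside = eta_centrality(3.0, -2.0, 2.0)
```

The back-to-back configuration gives a sphericity of 0 and the isotropic one gives 1, while the centrality takes the values 1, 1/e and below 1/e for an object halfway between the jets, aligned with a jet and outside the jets, matching the properties stated in the text.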
Columns: Variable | VBF: τ_lep τ_lep, τ_lep τ_had, τ_had τ_had | Boosted: τ_lep τ_lep, τ_lep τ_had, τ_had τ_had
Variables:
  m_MMC^ττ
  ΔR(τ1, τ2)
  Δη(j1, j2)
  m_j1,j2
  η_j1 × η_j2
  p_T^Total
  Sum p_T
  p_T^τ1 / p_T^τ2
  E_T^miss φ centrality
  m_ℓ,ℓ,j1
  m_ℓ1,ℓ2
  Δφ(ℓ1, ℓ2)
  Sphericity
  p_T^ℓ1
  p_T^j1
  E_T^miss / p_T^ℓ2
  m_T
  min(Δη_ℓ1ℓ2,jets)
  C_η1,η2(η_ℓ1) · C_η1,η2(η_ℓ2)
  C_η1,η2(η_ℓ)
  C_η1,η2(η_j3)
  C_η1,η2(η_τ1)
  C_η1,η2(η_τ2)

Table 5. Discriminating variables used in the training of the BDT for each channel and category at √s = 8 TeV. The more complex variables are described in the text. The filled circles indicate which variables are used in each case.

[Figure 2: six pre-fit panels (ATLAS, 20.3 fb⁻¹, √s = 8 TeV), each comparing data with the predicted backgrounds (Z → ττ, tt̄ + single-top, Others, fakes) and their uncertainty band, with the H(125) → ττ signal scaled by 50. (a) VBF, μμ + eμ + ee: events / 0.35 versus Δη(j1, j2). (b) Boosted, μμ + eμ + ee: events / 20 GeV versus p_T^j1 [GeV]. (c) VBF, eτ_had + μτ_had: events / 0.2 versus Δη(j1, j2). (d) Boosted, eτ_had + μτ_had: events / 0.2 versus ΔR(τ1, τ2). (e) VBF, τ_had τ_had: events / 0.5 versus Δη(j1, j2). (f) Boosted, τ_had τ_had: events / 0.2 versus ΔR(τ1, τ2).]

Figure 2. Distributions of important BDT input variables for the three channels and the two categories (VBF, left) and (boosted, right) for data collected at √s = 8 TeV. The distributions are shown for (a) the separation in pseudorapidity of the jets, Δη(j1, j2), and (b) the transverse momentum of the leading jet p_T^j1 in the τ_lep τ_lep channel, for (c) Δη(j1, j2) and (d) ΔR(τ1, τ2), the distance ΔR between the lepton and τ_had, in the τ_lep τ_had channel and for (e) Δη(j1, j2) and (f) ΔR(τ1, τ2), the distance ΔR between the two τ_had candidates, in the τ_had τ_had channel. The contributions from a Standard Model Higgs boson with m_H = 125 GeV are superimposed, multiplied by a factor of 50. These figures use background predictions made without the global fit defined in section 8. The error band includes statistical and pre-fit systematic uncertainties.

6 Background estimation

The different
final-state topologies of the three analysis channels have different background compositions which necessitate different strategies for the background estimation. In general, the number of expected background events and the associated kinematic distributions are derived from a mixture of data-driven methods and simulation. The normalisation of several important background contributions is performed by comparing the simulated samples of individual background sources to data in regions which only have a small or negligible contamination from signal or other background events. The control regions used in the analysis are summarised in table 6.

Common to all channels is the dominant Z → ττ background, for which the kinematic distributions are taken from data by employing the embedding technique, as described in section 3. Background contributions from jets that are misidentified as hadronically decaying taus (fake backgrounds) are estimated by using either a fake-factor method or samples of non-isolated τ_had candidates. Likewise, samples of non-isolated leptons are used to estimate fake-lepton contributions from jets or hadronically decaying taus and leptons from other sources, such as heavy-quark decays.⁸ Contributions from various other physics processes with leptons and/or τ_had candidates in the final state are estimated using the simulation, normalised to the theoretical cross sections, as given in table 1. A more detailed discussion of the estimation of the various background components in the different channels is given in the following.

6.1 Background from Z → ττ production

A
reliable modelling of the irreducible Z → ττ background is an important ingredient of the analysis. It has been shown in other ATLAS analyses that existing Z+jets Monte Carlo simulation needs to be reweighted to model data correctly [102–104]. Additionally, it is not possible to select a sufficiently pure and signal-free Z → ττ control sample from data to model the background in the signal region. Therefore this background is estimated using embedded data, as described in section 3. This procedure was extensively validated using both data and simulation. To validate the subtraction procedure of the muon cell energies and tracks from data and the subsequent embedding of the corresponding information from simulation, the muons in Z → μμ events are replaced by simulated muons. The calorimeter isolation energy in a cone of ΔR = 0.3 around the muons from data before and after embedding is compared in figure 3(a). Good agreement is found, which indicates that no deterioration (e.g. possible energy biases) in the muon environment is introduced. Another important test validates the embedding of more complex Z → ττ events, which can only be performed in the simulation. To achieve a meaningful validation, the same MC generator with identical settings was used to simulate both the Z → μμ and Z → ττ events. The sample of embedded events is corrected for the bias due to the trigger, reconstruction and acceptance of the original muons. These corrections are determined from data as a function of p_T(μ) and η(μ), and allow the acceptance of the original selection to be corrected. The tau decay products are treated like any other objects obtained from the simulation, with one important difference due to the absence of trigger simulation in this sample. Trigger effects are parameterised from the simulation as a function of the tau decay product p_T.

⁸ For simplicity, leptons from heavy-quark decays are considered as fake leptons in the following.
Afterreplacing the muons with simulated taus, kinematic
distributions of the embedded
samplecanbedirectlycomparedtothefullysimulatedones. Asanexample,
thereconstructedinvariant mass, mMMC, is shown in gure 3(b), for
the lephad nal state. Good agreementis found and the observed
dierences are covered by the systematic uncertainties.
Similarly,good agreement is found for other variables, such as the
missing transverse momentum, thekinematic variables of the
hadronically decaying tau lepton or of the associated jets in
theevent. A direct comparison of the Z background in data and the
modelling using theembeddingtechniquealsoshowsgoodagreement.
Thiscanbeseeninseveralkinematicquantity distributions, which are
dominated byZ events, shown in gure 2.The normalisation of this
background process is taken from the nal t described insection 8.
The normalisation is independent for theleplep,lephad, andhadhad
analysischannels. [GeV]Tp , 0.3) TE ( IArbitrary
[Figure 3: (a) I(E_T, 0.3) × p_T [GeV], arbitrary units, for data and
embedded data in Z → μμ events, with the Emb./Data ratio below;
(b) m_ττ^MMC [GeV], arbitrary units, for MC and embedded MC in the
μτ_had + eτ_had final state (ATLAS Simulation), with the Emb./MC ratio
below.]

Figure 3. (a) The distribution of the calorimeter isolation energy
I(E_T, 0.3) × p_T within a cone of radius ΔR = 0.3 around the muons in
Z → μμ events from data, before and after the embedding of simulated
muons. (b) The distribution of the reconstructed invariant ττ mass,
m_ττ^MMC, in the τ_lep τ_had final state, for simulated Z → ττ events,
compared to the one obtained from simulated Z → μμ events after tau
embedding. The ratios of the values before and after the embedding and
between the embedded Z → μμ and Z → ττ events are given in (a) and (b)
respectively. The errors in (a) and (b) on the ratios (points)
represent the statistical uncertainties, while the systematic
uncertainties are indicated by the hatched bands in (b). The shaded
bands represent the statistical uncertainties from the Z → μμ data
events in (a) and from the Z → ττ simulation in (b).

6.2 Background from misidentified leptons or hadronically decaying taus

For the τ_lep τ_lep channel, all
background sources resulting from misidentified leptons are treated
together. In this approach, contributions from multijet and W+jets
production, as well as the part of the tt̄ background resulting from
decays to leptons and hadrons (tt̄ → ℓνb qqb) are included. A control
sample is defined in data by inverting the isolation requirements for
one of the two leptons, while applying all other signal region
requirements. The contributions from other background channels
(dileptonic tt̄ decays (tt̄ → ℓνb ℓνb), Z → ee, Z → μμ, and diboson
production) are obtained from the simulation and are subtracted. From
this control sample a template is created. The normalisation factor is
obtained by fitting the p_T distribution of the subleading lepton at
an early stage of the preselection.

For the τ_lep τ_had channel,
the fake-factor method is used to derive estimates for the multijet,
W+jets, Z+jets, and semileptonic tt̄ background events that pass the
τ_lep τ_had selection due to a misidentified τ_had candidate. The fake
factor is defined as the ratio of the number of jets identified as
medium τ_had candidates to the number satisfying the loose, but not
the medium, criteria. Since the fake factor depends on the type of
parton initiating the jet and on the p_T of the jet, it is determined
as a function of p_T separately for samples enriched in quark- and
gluon-initiated jets. In addition, the fake factor is found to be
different for 1-track and 3-track candidates. Three different,
quark-jet dominated samples are used separately for the W+jets, tt̄
and Z+jets background components. They are defined by selecting the
high-m_T region (m_T > 70 GeV), by inverting the b-jet veto and by
requiring two leptons with an invariant mass consistent with m_Z
(80 GeV < m_ℓℓ < 100 GeV) respectively. In addition, a multijet sample
dominated by gluon-initiated jets is selected by relaxing the lepton
identification and requiring it to satisfy the loose identification
criteria. The
derived fake factors are found to vary from 0.124 (0.082) for
p_T = 20 GeV to 0.088 (0.038) for p_T = 150 GeV for 1-track (3-track)
candidates in the VBF category. The corresponding values for the
boosted category are 0.146 (0.084) for p_T = 20 GeV and 0.057 (0.033)
for p_T = 150 GeV. To obtain the fake-background estimate for the VBF
and boosted signal regions, these factors are then applied, weighted
by the expected relative W+jets, Z+jets, multijet, and tt̄ fractions,
to the events in regions defined by applying the selections of the
corresponding signal region, except that the τ_had candidate is
required to pass the loose and to fail the medium τ_had
identification. As an example, the good agreement between data and
background estimates is shown in figure 4(a) for the reconstructed ττ
mass for events in the high-m_T region, which is dominated by W+jets
production.

For the τ_had τ_had channel,
the multijet background is modelled using a template extracted from
data that pass the VBF or boosted category selection, where, however,
the taus fail the isolation and opposite-sign charge requirements (the
number-of-tracks requirement is not enforced). The normalisation of
the multijet background is first determined by performing a
simultaneous fit of the multijet (modelled by the data sample just
mentioned) and Z → ττ (modelled by embedding) templates after the
preselection cuts. The fit is performed for the distribution of the
difference in pseudorapidity between the two hadronic tau candidates,
Δη(τ_had, τ_had). The signal contribution is expected to be small in
this category.
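The simultaneous fit of two templates to one binned distribution can be illustrated with a minimal sketch. It assumes an unweighted least-squares fit solved through the 2×2 normal equations, which is a simplification chosen for brevity: the text does not specify the fit method used at this stage, and the final normalisations come from the global fit of section 8. The function name, the five-bin histograms and the scale factors are all hypothetical.

```python
# Toy illustration: fit the normalisations of two templates to a binned
# distribution by unweighted least squares (an assumed simplification,
# not the fit used in the analysis).

def fit_two_templates(data, multijet, ztautau):
    """Return scale factors (k_mj, k_z) minimising
    sum_i (data_i - k_mj * multijet_i - k_z * ztautau_i)**2."""
    if not (len(data) == len(multijet) == len(ztautau)):
        raise ValueError("all histograms must have the same binning")
    # Sums entering the 2x2 normal equations.
    s_mm = sum(m * m for m in multijet)
    s_zz = sum(z * z for z in ztautau)
    s_mz = sum(m * z for m, z in zip(multijet, ztautau))
    s_dm = sum(d * m for d, m in zip(data, multijet))
    s_dz = sum(d * z for d, z in zip(data, ztautau))
    det = s_mm * s_zz - s_mz * s_mz
    # The determinant vanishes when both templates have the same shape.
    if det <= 1e-12 * s_mm * s_zz:
        raise ValueError("templates have the same shape; fit is degenerate")
    k_mj = (s_dm * s_zz - s_dz * s_mz) / det
    k_z = (s_dz * s_mm - s_dm * s_mz) / det
    return k_mj, k_z


# Hypothetical pre-fit templates in five bins of delta-eta (not real data).
multijet_template = [100.0, 90.0, 70.0, 40.0, 20.0]
ztautau_template = [30.0, 60.0, 50.0, 20.0, 5.0]

# Pseudo-data built with known scale factors 1.2 and 0.9 as a closure test.
pseudo_data = [1.2 * m + 0.9 * z
               for m, z in zip(multijet_template, ztautau_template)]

k_mj, k_z = fit_two_templates(pseudo_data, multijet_template, ztautau_template)
print(f"multijet scale = {k_mj:.3f}, Z->tautau scale = {k_z:.3f}")
```

The fit can separate the two contributions only because the templates differ in shape; a binned Poisson likelihood would be the more usual choice when bin contents are small.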
The agreement between data and the background estimate for this
distribution is shown in figure 4(b) for the rest category defined in
section 4. The preselection normalisation is used as a reference point
and starting value for the global fit (see below) and is used for
validation plots. The final normalisations of the two important
background components, from multijet and Z → ττ events, are extracted
from the final global fit, as described in section 8, in which the
Δη(τ_had, τ_had) distribution for the rest category is included.
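The fake-factor method used for the τ_lep τ_had channel in this section can be sketched in a few lines. This is a toy illustration for a single (p_T, track-multiplicity) bin under stated assumptions: the counts, the process fractions and all function names are hypothetical, and the p_T dependence and the separate 1-track and 3-track factors of the actual analysis are omitted.

```python
# Toy illustration of a fake-factor estimate; all inputs are hypothetical.

def measure_fake_factor(n_medium, n_loose_not_medium):
    """Fake factor: jets passing the medium ID divided by jets passing
    the loose but not the medium ID."""
    if n_loose_not_medium <= 0:
        raise ValueError("need a non-empty loose-not-medium sample")
    return n_medium / n_loose_not_medium


def combined_fake_factor(fake_factors, fractions):
    """Average the per-process fake factors, weighted by the expected
    relative process fractions."""
    if set(fake_factors) != set(fractions):
        raise ValueError("fake factors and fractions must cover the same processes")
    total = sum(fractions.values())
    if total <= 0:
        raise ValueError("process fractions must sum to a positive value")
    return sum(fake_factors[p] * fractions[p] for p in fake_factors) / total


def fake_background_yield(n_loose_not_medium_events, fake_factor):
    """Scale the events whose tau_had candidate passes the loose but
    fails the medium ID by the combined fake factor."""
    return n_loose_not_medium_events * fake_factor


# Hypothetical counts for one bin, one entry per enriched sample.
ff = {
    "W+jets": measure_fake_factor(120, 1000),    # high-mT sample
    "ttbar": measure_fake_factor(110, 1000),     # inverted b-jet veto
    "Z+jets": measure_fake_factor(115, 1000),    # 80 GeV < m_ll < 100 GeV
    "multijet": measure_fake_factor(80, 1000),   # relaxed lepton ID
}
# Hypothetical expected relative process fractions.
frac = {"W+jets": 0.6, "ttbar": 0.1, "Z+jets": 0.1, "multijet": 0.2}

ff_combined = combined_fake_factor(ff, frac)
n_fake = fake_background_yield(2500, ff_combined)
print(f"combined fake factor = {ff_combined:.4f}, fake yield = {n_fake:.1f}")
```

In a fuller version the combined factor would be evaluated per event from the candidate p_T and track multiplicity, so the estimate becomes a sum of per-event weights rather than one multiplication.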
[Figure 4: (a) m_ττ^MMC [GeV], Events / 20 GeV, W+jets control region,
VBF category, μτ_had + eτ_had, 20.3 fb⁻¹, √s = 8 TeV; (b) Δη(τ, τ),
Events / 0.15, rest category, τ_had τ_had, pre-fit, 20.3 fb⁻¹,
√s = 8 TeV. Legend entries: Data, 50 × H(125) → ττ, Z → ττ,
tt̄+single-top (panel (a) only), Others, Fake τ, Uncert.]

Figure 4. (a) The distribution of the reconstructed invariant ττ mass,
m_ττ^MMC, for events in the W+jets control region, for the τ_lep τ_had
channel. (b) The separation in pseudorapidity of the τ_had candidates,
Δη(τ_had, τ_had), for the τ_had τ_had channel in the rest control
region. The expected SM Higgs boson signal contribution is
superimposed, multiplied by a factor 50. These figures use background
predictions made without the global fit defined in section 8. The
error band includes statistical and pre-fit systematic uncertainties.

6.3 Z → ee and Z → μμ background

The Drell–Yan Z/γ* → ee
and Z/γ* → μμ background channels are important contributions to the
final states with two same-flavour leptons. They also contribute to
the other channels. As described below, a simulation based on Alpgen
is used to estimate these background sources. Correction factors are
applied to account for differences between data and simulation.

In the τ_lep τ_lep channel, the Alpgen simulation is normalised to the
data in the Z-mass control region, 80 GeV < m_ℓℓ < 100 GeV, for each
category, and separately for Z → ee and Z → μμ events. The
normalisation factors are determined from the final fit described in
section 8. The distribution of the reconstructed ττ mass for events in
this control region is shown in figure 5(a).

In the τ_lep τ_had channel, the Z → ee and Z → μμ background estimates
are also based on simulation. The corrections applied for a τ_had
candidate depend on whether it originates from a lepton from the Z
boson decay or from a jet. In the first case, corrections from data,
derived from dedicated tag-and-probe studies, are applied to account
for the difference in the rate of misidentified τ_had candidates
resulting from leptons [21, 105]. This is particularly important for
Z → ee events with a misidentified τ_had candidate originating from a
true electron. In the second case, the fake-factor method described in
section 6.2 is applied.

In the τ_had τ_had channel, the contribution of this background is very
small and is taken from simulation.
[Figure 5: (a) m_ττ^MMC [GeV], Events / 20 GeV, Z → ℓℓ control region,
VBF category, μμ + eμ + ee, pre-fit, 20.3 fb⁻¹, √s = 8 TeV; legend
entries: Data, 50 × H(125) → ττ, Z → μμ, ee, Top+diboson, Z → ττ, Fake
lepton, Uncert. (b) Δη(j1, j2), Events / 0.2, top control region, VBF
category, μτ_had + eτ_had, pre-fit, 20.3 fb⁻¹, √s = 8 TeV; legend
entries: Data, 50 × H(125) → ττ, Z → ττ, tt̄+single-top, Others,
Fake τ, Uncert.]

Figure 5. (a) The distribution of the reconstructed invariant ττ mass,
m_ττ^MMC, for events in the Z → ℓℓ control region, for the τ_lep τ_lep
channel. (b) The distribution of the separation in pseudorapidity of
the two leading jets, Δη(j1, j2), for events in the top control
region, for the τ_lep τ_had channel. This figure uses background
predictions made without the global fit defined in section 8. The
error band includes statistical and pre-fit systematic uncertainties.

6.4 W+jets background

Events with W bosons and jets
constitute a background to all channels since leptonic W decays can
feed into all signatures when the true lepton is accompanied by a jet
which is falsely identified as a τ_had or a lepton candidate. This
process can also contribute via semileptonic heavy quark decays that
provide identified leptons.

As stated in section 6.2, for the τ_lep τ_lep and τ_lep τ_had channels,
the W+jets contributions are determined with data-driven methods. For
the τ_had τ_had channel, the W → τ_had ν background is estimated from
simulation. A correction is applied to account for differences in the
τ_had misidentification probability between data and simulation.

6.5 Background from top-quark production

Background contributions from tt̄ and single top-quark
production, where leptons or hadronically decaying taus appear in
decays of top quarks, are estimated from simulation in the τ_lep τ_lep
and τ_lep τ_had channels. The normalisation is obtained from data
control regions defined by requiring a b-jet instead of a b-veto. In
the τ_lep τ_had channel, a large value of the transverse mass m_T is
also required, to enhance the background from top-quark production and
to suppress the signal contribution. This background is also found to
be small for the τ_had τ_had channel and it is estimated using
simulation. The distribution of Δη(j1, j2) for events in the top
control region, for the τ_lep τ_had channel, is shown in figure 5(b).

6.6 Diboson background

The production of pairs of vector bosons (W⁺W⁻, ZZ and WZ), with
subsequent decays to leptons or jets, contributes especially to the
background in the τ_lep τ_lep channel. For all analysis channels, these
contributions are estimated from simulation, normalised to the NLO
cross sections indicated in table 1.

6.7 Contributions from other Higgs boson decays

In the τ_lep τ_lep channel, a non-negligible contribution from
H → WW → ℓνℓν exists and this process is considered as background.
Its contribution is estimated for m_H = 125 GeV using simulation. The
corresponding signal cross section is assumed to be the SM value and
is indicated in table 1.

6.8 Validation of background estimates

As
described above, the normalisations of important background sources
that are modelled with simulation are determined by fitting to data in
control regions. These normalisations are compared in table 7 to
predictions based on the theoretical cross sections for the 8 TeV
analysis. In most cases, the values obtained are compatible with unity
within the statistical uncertainties shown. For the top control region
in the VBF category of the τ_lep τ_had channel, the value is also in
agreement with unity if the experimental and theoretical systematic
uncertainties are included. The control-region normalisations are used
for validation plots, and they are used as starting values in the
final global fit described in section 8. The global fit does not
change any of these normalisations by more than 2%.

It is important to verify that the BDT output
distributions in data control regions are well described after the
various background determinations. Figure 6 shows distributions from
important control regions for the √s = 8 TeV dataset, i.e. the
Z → ττ-enriched control regions for the τ_lep τ_lep and τ_lep τ_had
channels, and the reconstructed invariant ττ mass side-band control
region (defined as m_ττ^MMC < 100 GeV or m_ττ^MMC > 150 GeV) for the
τ_had τ_had channel. The distributions are shown for both the VBF and
the boosted categories. All distributions are found to be well
described, within the systematic uncertainties.
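The comparison of control-region normalisations with unity, described at the start of this section, can be illustrated with a single-bin counting estimate. This is a simplified sketch, not the fit used in the analysis: it assumes one control region dominated by one background, takes only the Poisson uncertainty of the data count, and uses hypothetical yields and function names.

```python
import math

# Toy illustration of a control-region normalisation factor; the yields
# below are hypothetical and the estimate ignores systematic uncertainties.

def normalisation_factor(n_data, n_other, n_target_mc):
    """Scale factor for the target background in its control region:
    (data - other backgrounds) / simulated target yield, with the
    Poisson statistical uncertainty of the data count only."""
    if n_target_mc <= 0:
        raise ValueError("simulated target yield must be positive")
    k = (n_data - n_other) / n_target_mc
    stat_err = math.sqrt(n_data) / n_target_mc
    return k, stat_err


def compatible_with_unity(k, err, n_sigma=1.0):
    """True if k differs from 1 by at most n_sigma times its uncertainty."""
    return abs(k - 1.0) <= n_sigma * err


# Hypothetical control region: 1050 data events, 150 events expected from
# other backgrounds, 880 events predicted by the target simulation.
k_cr, err_cr = normalisation_factor(n_data=1050, n_other=150, n_target_mc=880.0)
print(f"k = {k_cr:.3f} +/- {err_cr:.3f} (stat), "
      f"compatible with unity: {compatible_with_unity(k_cr, err_cr)}")
```

A factor that is incompatible with unity within the statistical uncertainty alone, as reported for the top control region in the VBF category, may still agree once systematic uncertainties are added to `err`.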
ProcesslepleplephadhadhadZ``-enriched80