Higgs Pair Production in bbττ Final States at the HL-LHC
by
Jay Mathew Lawhorn
Submitted to the Department of Physics
in partial fulfillment of the requirements for the degree of
BACHELOR OF SCIENCE
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
February 2015
Jay Mathew Lawhorn, MMXV. All rights reserved.
The author hereby grants to MIT permission to reproduce and to
distribute publicly paper and electronic copies of this thesis document
in whole or in part in any medium now known or hereafter created.
Author: Signature redacted
Department of Physics
January 20, 2015
Certified by: Signature redacted
Markus Klute
Associate Professor
Thesis Supervisor
Accepted by: Signature redacted
Professor Nergis Mavalvala
Senior Thesis Coordinator, Department of Physics
Higgs Pair Production in bbττ Final States at the HL-LHC
by
Jay Mathew Lawhorn
Submitted to the Department of Physics
on January 20, 2015, in partial fulfillment of the
requirements for the degree of
BACHELOR OF SCIENCE
Abstract
A measurement of standard model Higgs pair production in bbττ final states at the
High Luminosity LHC is investigated. Higgs pair production can be used to measure
the Higgs trilinear coupling constant, which uniquely determines the shape of the
Higgs potential. The doubly hadronic, hadron-muon, and hadron-electron di-τ final
states are considered, with a shape analysis on either the stransverse mass (doubly
hadronic) or a BDT discriminant (hadron-muon, hadron-electron) distribution performed to extract expected significances. The expected 95% CL upper limit on the
cross section times branching ratio from a combination of all three channels is 2.2
times the SM value, with an expected ±1σ uncertainty on the measured cross section
of 67%, indicating this measurement is feasible.
Thesis Supervisor: Markus Klute
Title: Associate Professor
Acknowledgments
I am deeply grateful to Professor Markus Klute, my research supervisor. My experiences
at MIT and future as a scientist have been unquestionably shaped by the
amazing opportunities and challenges he presented me these past two years.
A special thank you to Aram Apyan, with whom I worked closely on all our
upgrade studies. Also thank you to my previous research supervisors and mentors:
Jim Annis, Jeff Kubo, Donna Kubik, James Battat, Professor Peter Fisher, and
Shawn Henderson. The diverse skill set I gained in their groups was invaluable.
Thanks to all the current and past MIT-CMS members who made the group
an easy and fun place to work, especially Kevin Sung, Leonardo di Matteo, Professor Christoph Paus, Max Goncharov, Valentina Dutta, Stephanie Brandt, Catherine
Medlock, and Allison Christian. Also thanks to my academic advisor Professor Jesse
Thaler, Miri Skolnik, and Stephen Benyas, as well as Allison Mann, Chelsea Levy,
Katharine Berry P.F., Dan Abercrombie, Ian Chen, Brandon Allen, Sid Narayanan,
and all my other friends.
My research was funded by the MIT Undergraduate Research Opportunities Program and the MIT International Science and Technology Initiatives program.
Contents
1 Introduction
2 Higgs Physics
3 Signal Process
3.1 The bbττ Final State
4 The Compact Muon Solenoid
4.1 Object Reconstruction
5 Background Processes
6 Monte Carlo Samples
7 Event Selection
8 Signal Extraction
8.1 Fully Hadronic Channel
8.2 Semi-Leptonic Channels
8.3 Statistical Interpretation
8.4 Uncertainties
9 Results
9.1 Cross Check Using 8 TeV Data Sets
9.2 14 TeV Results
10 Conclusions
A Semi-Leptonic BDT
List of Figures
2-1 Feynman diagrams contributing to gluon fusion Higgs pair production.
7-1 Predicted m_ττ (top) and m_bb (bottom) distributions in the τhτh channel. The background yields are expected contributions from SM processes, while the signal yield is the expected contribution from SM Higgs pair production scaled by a factor of five hundred.
7-2 Predicted m_ττ (top) and m_bb (bottom) distributions in the τμτh channel. The background yields are expected contributions from SM processes, while the signal yield is the expected contribution from SM Higgs pair production scaled by a factor of one thousand.
7-3 Predicted m_ττ (top) and m_bb (bottom) in the τeτh channel. The background yields are expected contributions from SM processes, while the signal yield is the expected contribution from SM Higgs pair production scaled by a factor of one thousand.
8-1 Predicted distribution of the pT(bb) (top) and mT2 (bottom) variables in the τhτh channel after mass window cuts. The background yields are expected contributions from SM processes, while the signal yield is the expected contribution from SM Higgs pair production scaled by a factor of one hundred.
8-2 Predicted distributions for the pT(bb) (top) and mT2 (bottom) variables in the τμτh channel after mass window cuts. The background yields are expected contributions from SM processes, while the signal yield is the expected contribution from SM Higgs pair production scaled by a factor of one thousand.
8-3 Predicted distributions for the pT(bb) (top) and mT2 (bottom) variables in the τeτh channel after mass window cuts. The background yields are expected contributions from SM processes, while the signal yield is the expected contribution from SM Higgs pair production scaled by a factor of five thousand.
8-4 Predicted distribution of the BDT discriminant in the τμτh (top) and τeτh (bottom) channels for the signal region. The background yields are expected contributions from SM processes, while the signal yield is the expected contribution from SM Higgs pair production scaled by a factor of five hundred.
9-1 Cross check with 8 TeV data for the τhτh channel. Predicted distributions for the mT2 variable before (top) and after (bottom) mass window cuts. The background yields are expected contributions from SM processes, while the signal yield is the expected contribution from SM Higgs pair production scaled by one thousand (top) or one hundred (bottom). In the bottom figure, the data is blinded for mT2 > 100 GeV.
9-2 Cross check with 8 TeV data for the τμτh channel. Predicted distributions for the mT2 variable before (top) and after (bottom) mass window cuts. The background yields are expected contributions from SM processes, while the signal yield is the expected contribution from SM Higgs pair production scaled by ten thousand (top) or one thousand (bottom).
A-1 Input variable distributions for τμτh channel BDT. The signal is shown in blue, while the background is red.
A-2 Input variable distributions for τeτh channel BDT. The signal is shown in blue, while the background is red.
A-3 Correlation matrices for the (top) signal and (bottom) background samples in τμτh channel.
A-4 Overtraining check for BDT classifier in the τμτh channel. The signal is shown in blue, while the background is red.
A-5 ROC curve for BDT classifier in the τμτh channel.
A-6 Correlation matrices for the (top) signal and (bottom) background samples in τeτh channel.
A-7 Overtraining check for BDT classifier in the τeτh channel. The signal is shown in blue, while the background is red.
A-8 ROC curve for BDT classifier in the τeτh channel.
List of Tables
3.1 Expected number of SM HH → bbττ events separated by ττ final state at √s = 14 TeV in 3000 fb⁻¹ of data.
6.1 List of SM background categories generated for CMS upgrade studies, including the main processes for each category, generator-level final states, and order in each coupling.
7.1 Summary of object-level selection criteria for each di-τ channel. The absolute isolation variable I is properly defined in the 8 TeV H → ττ analysis but not used here. The relative isolation variable R, as previously defined, is used in its place.
7.2 Expected yields in each channel for 3000 fb⁻¹ of integrated luminosity after baseline selection requirements.
7.3 Expected signal yields in each channel for 3000 fb⁻¹ of integrated luminosity after mass window requirements.
8.1 Expected signal yields in each channel for 3000 fb⁻¹ of integrated luminosity after requiring that the mT2 variable is greater than 100 GeV.
9.1 Cross check with 8 TeV full simulation at two stages of the τhτh cut-based selection. The 8 TeV columns are from full simulation MC and are scaled to 8 TeV with 21 fb⁻¹ integrated luminosity, except (*) the signal and tt̄ yields which are scaled to 14 TeV and 3000 fb⁻¹ integrated luminosity. The 14 TeV columns are from Delphes and are scaled to 3000 fb⁻¹ integrated luminosity.
9.2 Cross check with 8 TeV full simulation at two stages of the τμτh cut-based selection. The 8 TeV columns are from full simulation MC and are scaled to 8 TeV with 21 fb⁻¹ integrated luminosity, except (*) the signal and tt̄ yields which are scaled to 14 TeV and 3000 fb⁻¹ integrated luminosity. The 14 TeV columns are from Delphes and are scaled to 3000 fb⁻¹ integrated luminosity.
9.3 Statistical results of the analysis, showing the asymptotic 95% CL upper limit on the expected cross section and the expected ±1σ uncertainties on the cross section measurement.
Chapter 1
Introduction
The standard model (SM) of particle physics was developed in the 1960s and 1970s to
explain observations at sub-atomic scale and correspondingly high energy. The theory
has been highly successful, predicting the existence of W and Z bosons, gluons, and
three heavy quarks before their respective discoveries. The recent discovery [1-3] of
the Higgs boson at the Large Hadron Collider (LHC) was another major confirmation
of the SM. All current measurements of Higgs boson properties are consistent with
SM predictions [4-6], but some phenomena cannot be measured in the data acquired.
Additionally, significant experimental evidence suggests the SM is not yet complete:
it does not account for the gravitational force, dark matter, or neutrino masses.
The LHC is scheduled to continue data taking until 2022 when the proposed High
Luminosity LHC (HL-LHC) project will make major upgrades to the accelerator
complex, increasing the number of collisions per second by a factor of 2.5. The HL-LHC is expected to take 3000 fb⁻¹ of data over ten years and will allow observations
of currently unobservable Higgs phenomena as well as precision measurements of
previously known SM parameters.
The Higgs trilinear coupling constant is one SM parameter that could be measured at the HL-LHC. This constant governs the rate of interactions involving three
Higgs bosons, including Higgs pair production events where one (highly off-shell)
Higgs boson decays to two on-shell Higgs bosons. The trilinear coupling constant is
particularly sensitive to non-SM phenomena in the Higgs sector because it uniquely
determines the width of the SM Higgs potential.
The SM cross section for Higgs
pair production is almost minimal due to interference between two possible Feynman
diagrams. Large deviations from the SM trilinear coupling constant increase the production cross section, making Higgs pair production searches uniquely sensitive to
non-SM processes.
This thesis investigates a measurement of the Higgs pair production cross section
in final states containing two b-quarks and two τ leptons at the Compact Muon
Solenoid (CMS) experiment during the HL-LHC run. The HL-LHC configuration of
the CMS detector is currently being designed, but will include a number of upgrades
to combat both detector aging and the harsh HL-LHC run conditions. This analysis,
as well as Higgs pair production analyses in the bbγγ and bbWW final states, will be
included in the upcoming technical proposal for that upgrade.
Chapter 2
Higgs Physics
The Higgs mechanism [7-12] was proposed in 1964 to preserve local gauge invariance in the Lagrangian density while allowing mass terms for the fermions and gauge
bosons [13-15]. In terms of the physical Higgs field H, the SM Lagrangian for Higgs
interactions with vector bosons, Higgs bosons, and fermions is given by

\[
\mathcal{L} = -\frac{m_f}{v}\,\bar{f} f H + \delta_V V_\mu V^\mu \left( \frac{2 m_V^2}{v} H + \frac{m_V^2}{v^2} H^2 \right) + \frac{m_H^2}{2v} H^3 + \frac{m_H^2}{8v^2} H^4 \tag{2.1}
\]

where m_f is the fermion mass, f is the fermion field, m_V and m_H are the vector boson
and Higgs boson masses, v is the Higgs vacuum expectation value, V is either a W or Z
boson, δ_W = 1, and δ_Z = 1/2 [16]. The CMS Higgs coupling measurements from a combination of
all analyzed final states in the full 8 TeV data set find all observed coupling constants
to be consistent with the SM [6]. However, the Higgs decay modes to lighter quarks
and leptons as well as the final two terms in Equation 2.1 cannot be probed in the
current data due to their vastly smaller cross sections.
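As a rough numerical illustration of the final two terms in Equation 2.1, the sketch below evaluates the cubic and quartic self-coupling coefficients, assuming the standard SM forms m_H²/(2v) and m_H²/(8v²). The inputs m_H = 125 GeV and v = 246.22 GeV are assumed values chosen for illustration, not numbers quoted in this chapter.

```python
# Illustrative only: SM Higgs self-coupling coefficients, assuming the
# potential terms (m_H^2 / 2v) H^3 and (m_H^2 / 8v^2) H^4.
M_H = 125.0   # GeV, assumed Higgs mass
VEV = 246.22  # GeV, assumed vacuum expectation value


def self_coupling_coefficients(m_h=M_H, v=VEV):
    """Return (cubic coefficient in GeV, dimensionless quartic coefficient)."""
    cubic = m_h**2 / (2.0 * v)
    quartic = m_h**2 / (8.0 * v**2)
    return cubic, quartic


cubic, quartic = self_coupling_coefficients()
print(f"H^3 coefficient: {cubic:.2f} GeV")
print(f"H^4 coefficient: {quartic:.4f}")
```

Both coefficients are fixed once m_H and v are known, which is why a direct measurement of the trilinear term tests the SM form of the potential.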
At a proton-proton collider like the LHC, gluon-gluon fusion via a top quark loop
is the dominant mode for both single Higgs boson and Higgs pair production. Vector
boson fusion (VBF) and associated production with either a vector boson (VH) or tt̄
pair (tt̄H) also contribute with smaller cross sections but with additional
tagging particles that can be exploited to increase the signal-to-background ratio. All
four production modes were exploited for the Higgs discovery at √s = 7 and 8 TeV.
There are two main Feynman diagrams for Higgs pair production, shown in Figure 2-1. The right diagram shows a highly off-shell Higgs boson produced via gluon-gluon
fusion that decays to a pair of less massive Higgs bosons. However, the dependence
of the overall production cross section on λ_HHH is diluted by the left diagram, which
also produces Higgs pairs without an HHH vertex. The overall production cross
section is reduced because the two diagrams interfere destructively.
Figure 2-1: Feynman diagrams contributing to gluon fusion Higgs pair production.
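At leading order in QCD, the SM partonic cross section for gluon-gluon fusion
Higgs pair production is

\[
\hat{\sigma}_{\mathrm{LO}}(gg \to HH) = \int_{\hat{t}_-}^{\hat{t}_+} d\hat{t}\, \frac{G_\mu^2 \alpha_s^2(\mu_R)}{256 (2\pi)^3} \left[ \left| \frac{\lambda_{HHH}\, v}{\hat{s} - M_H^2 + i M_H \Gamma_H} F_\triangle + F_\square \right|^2 + \left| G_\square \right|^2 \right] \tag{2.2}
\]

where Γ_H is the Higgs decay width, G_μ is the Fermi constant, α_s(μ_R) is the strong
coupling at the renormalization scale, F_△, F_□, and G_□ are form factors, and the limits
of integration are

\[
\hat{t}_\pm = -\frac{\hat{s}}{2} \left( 1 - \frac{2 M_H^2}{\hat{s}} \mp \sqrt{1 - \frac{4 M_H^2}{\hat{s}}} \right)
\]

where ŝ and t̂ are the partonic Mandelstam variables [17]. In the limit of an infinitely
massive top quark, the form factors reduce to F_△ = 2/3, F_□ = -2/3,
and G_□ = 0.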
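The destructive interference between the two diagrams can be made concrete in the heavy-top limit quoted above. The sketch below is an illustration rather than the calculation used in this analysis: it assumes the convention λ_HHH = 3 M_H²/v, neglects the Higgs width, drops the overall prefactor, and introduces a hypothetical multiplier `kappa` on the SM trilinear coupling.

```python
# Illustrative sketch of the triangle/box interference in the heavy-top limit.
# Assumptions (not from the thesis): lambda_HHH = 3 M_H^2 / v, zero Higgs
# width, and a hypothetical multiplier `kappa` on the SM trilinear coupling.
M_H = 125.0              # GeV, assumed Higgs mass
F_TRIANGLE = 2.0 / 3.0   # heavy-top limit of the triangle form factor
F_BOX = -2.0 / 3.0       # heavy-top limit of the box form factor


def amplitude_squared(s_hat, kappa=1.0, m_h=M_H):
    """|C_tri * F_tri + F_box|^2 with C_tri = kappa * 3 M_H^2 / (s_hat - M_H^2)."""
    c_triangle = kappa * 3.0 * m_h**2 / (s_hat - m_h**2)
    return (c_triangle * F_TRIANGLE + F_BOX) ** 2


threshold = (2.0 * M_H) ** 2  # s_hat at the HH production threshold
for kappa in (1.0, 0.0, -1.0, 5.0):
    print(kappa, amplitude_squared(1.5 * threshold, kappa))
```

Under these assumptions the two contributions cancel exactly at threshold for `kappa = 1`, and at the sample point above threshold the SM value gives the smallest of the four printed results, consistent with the statement that the SM cross section is nearly minimal.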
Evaluated at √s = 14 TeV, the inclusive Higgs pair production cross section is
17.8 fb at leading order and 40.2 fb at NNLO [18]. Because SM Higgs pair
production is nearly minimal, many beyond the SM models for the Higgs sector predict
an increased cross section, including the Minimal Supersymmetric Standard Model
(MSSM) and Higgs portal scenarios [19, 20]. For example, a Higgs portal scenario
motivated by electroweak baryogenesis predicts yields of up to twenty times the SM
expectation [21].
Chapter 3
Signal Process
In the 3000 fb⁻¹ of data collected at the HL-LHC, approximately one hundred and
twenty thousand gluon-gluon fusion Higgs pair events are expected. However, these
events are distributed among a large number of final states because of the large
number of SM Higgs boson decay modes. Like the single Higgs searches, Higgs pair
searches rely on minimizing all reducible backgrounds and precisely reconstructing
the two Higgs mass peaks.
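The quoted event count follows from multiplying the cross section by the integrated luminosity. A minimal sketch of that arithmetic, using the NNLO cross section of 40.2 fb and 3000 fb⁻¹ from the text (the function name and the optional branching ratio argument are illustrative choices):

```python
# Expected event count N = sigma * integrated luminosity.
SIGMA_HH_FB = 40.2          # fb, NNLO gluon fusion HH cross section at 14 TeV
LUMINOSITY_FB_INV = 3000.0  # fb^-1, HL-LHC integrated luminosity


def expected_events(cross_section_fb, luminosity_fb_inv, branching_ratio=1.0):
    """Return the expected number of events before any selection."""
    return cross_section_fb * luminosity_fb_inv * branching_ratio


n_hh = expected_events(SIGMA_HH_FB, LUMINOSITY_FB_INV)
print(f"Expected HH events: {n_hh:.0f}")
```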
Much of the signal strength needed for the initial Higgs discovery lay in the
H → γγ and H → ZZ → 4ℓ channels, which have quite small branching ratios of 0.23%
and 0.0125% respectively. Because of the excellent lepton and photon resolutions
achieved by the CMS and ATLAS detectors, the lack of neutrinos in the final states,
the Higgs mass, and careful analysis efforts, these two channels outperformed the
more likely channels bb, ττ, and WW. The single Higgs search in the bb final state,
with a branching ratio of 57%, was complicated by overwhelming backgrounds and
relatively poor jet resolution. The ττ and WW searches had the additional challenge
of reconstructing neutrinos in the final state, which escape the detector.
While the Higgs mass resolution is still an important concern for Higgs pair
searches, the much smaller inclusive cross section means large branching ratios are
more important than in the single Higgs search. The large branching ratio for the
H → bb process makes it a very attractive channel, especially when paired with a
channel with more discriminating power. The branching ratio for the H → ZZ → 4ℓ
process is so small it is generally not a good candidate for these searches. The bbbb,
bbγγ, bbττ, and bbWW final states have been considered in theoretical studies with
mixed results [17, 22-24]. From these studies, the bbγγ and bbττ channels seem the
most promising.
3.1 The bbττ Final State
This analysis focuses on the HH → bbττ channel because of its relatively large branching
ratio of 7.29% compared to the cleaner bbγγ channel and its markedly lower backgrounds
than the dominant bbbb channel. It is also an excellent standard candle for overall
detector performance, as discussed further in the next chapter. The most promising
theoretical work on this channel reported a possible measurement of the trilinear
coupling constant λ_HHH with 30% uncertainty at the HL-LHC [23]. While promising,
that study and similar ones neglect detector effects that must be taken into account
and will likely reduce the significance of any result.
In 3000 fb⁻¹ of data, the expected yield in the bbττ final state is 8792 events.
However, because τ leptons themselves decay before detection, we must consider each
di-τ final state separately. The majority of τ leptons (64.8%) decay into a ν_τ with
some combination of neutral and charged hadrons, classified by the number of charged
hadrons into one-, three-, and five-prong decay modes. The remaining τ leptons decay
almost equally into e ν_e ν_τ (17.8%) and μ ν_μ ν_τ (17.4%) modes [25].
Table 3.1: Expected number of SM HH → bbττ events separated by ττ final state
at √s = 14 TeV in 3000 fb⁻¹ of data.

di-τ Final State     Notation  Yield    di-τ Final State     Notation  Yield
Electron-electron    τeτe      279      Electron-muon        τeτμ      545
Muon-muon            τμτμ      266      Muon-hadronic        τμτh      1983
Hadronic-hadronic    τhτh      3692     Electron-hadronic    τeτh      2028

Table 3.1 lists the number of expected HH → bbττ events for each ττ final state.
The doubly hadronic final state, τhτh, dominates, while the τeτe and τμτμ final states
are least likely. Because the τeτe and τμτμ final states are completely overwhelmed
by the SM Z → ee and Z → μμ processes respectively, they are not considered.
While the τeτμ channel was initially considered, the Higgs mass reconstruction suffered
from the four neutrinos in the final state. This, combined with overwhelming SM
backgrounds, led to work on this channel being abandoned as well.
This analysis considers three separate di-τ final states: τhτh, τμτh, and τeτh.
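The yields in Table 3.1 can be cross-checked from the total bbττ yield and the τ branching fractions quoted above. The sketch below assumes the two τ decays are independent, so that a mixed final state picks up a factor of two for the two possible orderings; the dictionary keys and function name are illustrative.

```python
# Reproduce the per-channel split of the bb-tautau yield from the tau
# branching fractions quoted in the text. Assumes the two tau decays are
# independent.
from itertools import combinations_with_replacement

TOTAL_BBTAUTAU = 8792  # expected HH -> bb tautau events in 3000 fb^-1
TAU_BR = {"h": 0.648, "e": 0.178, "mu": 0.174}  # hadronic, electron, muon


def ditau_yields(total=TOTAL_BBTAUTAU, br=TAU_BR):
    """Return {(mode1, mode2): yield}, rounded to the nearest event."""
    yields = {}
    for a, b in combinations_with_replacement(br, 2):
        # Mixed final states can occur in two orderings.
        multiplicity = 1 if a == b else 2
        yields[(a, b)] = round(total * multiplicity * br[a] * br[b])
    return yields


for channel, n in ditau_yields().items():
    print(channel, n)
```

The three channels kept in this analysis, ("h", "h"), ("h", "mu"), and ("h", "e"), together account for most of the expected bbττ events.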
Chapter 4
The Compact Muon Solenoid
The CMS experiment [26] is one of two general-purpose physics detectors at the LHC.
The detector itself is composed of tracker detectors and calorimeters in a 3.8 T magnetic field produced by a superconducting solenoid, surrounded by muon detectors.
The CMS detector coordinates are given in terms of (η, φ), where

\[
\eta = -\ln\left[\tan\left(\theta/2\right)\right]
\]

is the pseudorapidity, θ is the polar angle measured from the anticlockwise beam
direction, and φ is the azimuthal angle.
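A minimal sketch of the pseudorapidity definition and its inverse may help build intuition for how quickly η grows toward the beam line. The function names are illustrative, and the value 4.0 is used only as an example input.

```python
import math


def pseudorapidity(theta):
    """Pseudorapidity eta = -ln(tan(theta / 2)) for a polar angle in radians."""
    if not 0.0 < theta < math.pi:
        raise ValueError("theta must lie strictly between 0 and pi")
    return -math.log(math.tan(theta / 2.0))


def polar_angle(eta):
    """Inverse mapping: theta = 2 * atan(exp(-eta))."""
    return 2.0 * math.atan(math.exp(-eta))


# |eta| = 4.0 corresponds to a polar angle of about 2.1 degrees from the beam.
print(math.degrees(polar_angle(4.0)))
```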
The proposed HL-LHC program will provide 3000 fb⁻¹ of collisions at √s = 14
TeV starting in 2025 with an instantaneous luminosity of 5 × 10³⁴ cm⁻² s⁻¹. This
corresponds to an average pileup (<PU>) of 140 interactions per bunch crossing,
compared to the upcoming Run II with an expected <PU> = 25. Between the dramatically increased <PU> and projected aging effects, a non-upgraded CMS detector
would struggle to produce physics results in this harsher environment.
In order to preserve or improve the detector performance achieved at √s = 7
and 8 TeV, the proposed Phase II CMS detector includes a new silicon tracker with
coverage to |η| = 4.0 and new electromagnetic and hadronic calorimeters in the
forward region, 1.6 < |η| < 3.0. The forward regions and regions closest to the beam
line are most affected by both aging effects and increased pileup because the total
energy deposition per unit and particle number density are highest in those regions.
4.1 Object Reconstruction
Events in the CMS detector are analyzed using a particle flow algorithm [27-29] which
considers information from all sub-detectors to identify and reconstruct individual
particles in the event. It combines tracks from the inner tracker detectors, energy
depositions in the various calorimeters, and muon segments in the muon detectors to
create particle candidates based on track and muon segment extrapolation and the
locations of energy depositions.
The Delphes fast simulation [30] is used to model the Phase II detector at <PU> =
140. The parameterized efficiencies used as input for Delphes are derived using a
GEANT-based [31] full simulation of the proposed detector geometry.
Muons are identified by matching muon segments with tracks. The average muon
identification efficiency is 98% for pT > 30 GeV. The Delphes simulation does not
include a muon fake rate.
Electrons are reconstructed from energy depositions in the electromagnetic calorimeter and compatible tracks in the tracker detector. The parameterized electron identification efficiency as a function of pT and η is greater than 90% for electrons with
pT > 30 GeV. The electron fake rate from photon conversions in the tracker or other
sources is not included in Delphes.
Electrons and muons from H → ττ decays are not expected to be near large
numbers of hadrons, unlike leptons originating from jets. The lepton relative isolation
variable R is a measure of how much unrelated activity is near the lepton candidate.
It is calculated in Delphes as

\[
R = \frac{\sum p_T^{\text{charged}} + \max\left( \sum p_T^{\text{neutral}} - \pi \rho \Delta R^2,\ 0 \right)}{p_T(\text{lep})} \tag{4.1}
\]

where pT(lep) is the transverse momentum of the lepton, both sums are over objects
within a cone of radius ΔR around the lepton, the first over charged hadrons and the second
over neutral hadrons, and ρ is the average energy density per unit area for the event. The
πρΔR² term is subtracted to correct for the expected energy deposition from pileup
within the cone around the lepton, allowing for better discrimination between leptons
inside and outside jets.
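A minimal sketch of Equation 4.1 follows, assuming the pileup-corrected neutral sum is clipped at zero and that the caller has already selected the hadrons inside the cone. The function and argument names are illustrative and do not correspond to the Delphes interface.

```python
import math


def relative_isolation(lepton_pt, charged_pts, neutral_pts, rho, cone_radius):
    """Pileup-corrected relative isolation R.

    charged_pts / neutral_pts: pT of charged / neutral hadrons already
    selected to lie inside the cone of radius `cone_radius` around the lepton.
    rho: average energy density per unit area for the event.
    """
    if lepton_pt <= 0.0:
        raise ValueError("lepton pT must be positive")
    pileup = math.pi * rho * cone_radius**2
    # The neutral component is clipped at zero after the pileup subtraction.
    neutral = max(sum(neutral_pts) - pileup, 0.0)
    return (sum(charged_pts) + neutral) / lepton_pt


# Hypothetical inputs: the neutral sum is smaller than the pileup estimate,
# so only the charged hadrons contribute.
r = relative_isolation(40.0, [1.0, 0.5], [3.0], rho=10.0, cone_radius=0.4)
```

Clipping at zero keeps a large pileup estimate from cancelling genuine charged activity near the lepton.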
Charged hadrons are reconstructed from energy depositions in the hadronic calorimeter and compatible tracks in the tracker. Neutral hadrons and photons are identified
as energy depositions in the relevant calorimeters without matching tracks.
Jets are reconstructed using the anti-kt algorithm [32] with cone radius D = 0.4.
The FastJet technique [33] is used to correct for pileup effects. Pileup jets are
rejected using cuts on track-related and jet shape variables following previous pileup
jet identification work within CMS [34], corresponding to 95% non-pileup jet efficiency
and 20% pileup acceptance.
The CMS Combined Secondary Vertex algorithm (CSV) [35-37] is used to identify
b-jets based on a likelihood discriminant which considers track impact parameters and
the identification of displaced vertices from the relatively long-lived b-hadron. Jets
originating from b-quarks are identified using the CSV medium working point with
on average 68% efficiency, a 10% fake rate from c-quarks, and a 1% fake rate from
light quarks.
In the CMS particle flow algorithm, hadronic τ decays are reconstructed from
jets with an identified π⁰ decay and charged hadrons matching a hadronic τ decay
mode [38]. However, the Delphes process for hadronic τ identification is greatly
simplified and does not take into account the particular hadronic τ decay mode. Jets
originating from hadronic τ decays are tagged with 65% efficiency and a fake rate
of 1%. Both the efficiency and fake rate are flat in pT and η, which is likely also a
simplification.
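To get a feel for what these flat tagging rates imply, the sketch below multiplies the quoted b-jet and hadronic τ efficiencies under the simplifying assumption that each object is tagged independently. It ignores kinematic acceptance, lepton identification, and isolation, so it is an upper bound on the tagging contribution only, not an estimate of the analysis efficiency.

```python
# Flat tagging efficiencies quoted in the text (Delphes parameterization).
B_TAG_EFF = 0.68      # CSV medium working point, average b-jet efficiency
TAU_H_TAG_EFF = 0.65  # hadronic tau tagging efficiency


def all_tagged_probability(n_b_jets, n_hadronic_taus):
    """Probability that every b-jet and hadronic tau is tagged, assuming
    independent, flat efficiencies (a simplification)."""
    return B_TAG_EFF**n_b_jets * TAU_H_TAG_EFF**n_hadronic_taus


print(all_tagged_probability(2, 2))  # doubly hadronic channel
print(all_tagged_probability(2, 1))  # semi-leptonic channels, lepton ID not included
```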
In data and the CMS full simulation, particular types of hadronic activity are much
more likely to be incorrectly identified as hadronic τ decays than others. Because the
Delphes fake rate is applied to all jets without consideration of the underlying physics,
its predictions could differ significantly from reality. In order to improve upon the
Delphes hadronic τ-tagger, this analysis requires that a reconstructed hadronic τ candidate
contain at least one Delphes isolated track, mimicking a full simulation restriction
to one-prong τ decays.
The missing transverse energy, E_T^miss, is calculated as the negative vector sum of the
transverse momenta of all particle flow candidate objects,

\[
\vec{E}_T^{\text{miss}} = -\sum_i \vec{p}_{T,i}.
\]

Neutrinos are only reconstructed by the E_T^miss they create, but E_T^miss is also created by
mis-reconstructing the momenta of objects in the event. Thus the E_T^miss resolution is degraded
by increased pileup, as more objects give more opportunities for mis-reconstruction.
After rejecting jets identified as pileup jets as described above, the E_T^miss resolution is
on average 20 GeV for <PU> = 140.
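The negative vector sum can be sketched in a few lines. The representation of candidates as (pT, φ) pairs and the function name are assumptions made for illustration.

```python
import math


def missing_transverse_energy(candidates):
    """Return (magnitude, phi) of the negative vector sum of candidate pT.

    candidates: iterable of (pt, phi) pairs for all particle flow candidates.
    """
    candidates = list(candidates)
    px = -sum(pt * math.cos(phi) for pt, phi in candidates)
    py = -sum(pt * math.sin(phi) for pt, phi in candidates)
    return math.hypot(px, py), math.atan2(py, px)


# Two back-to-back candidates with unequal pT leave 10 GeV of imbalance
# pointing along the softer one.
met, met_phi = missing_transverse_energy([(50.0, 0.0), (40.0, math.pi)])
```

Because every candidate enters the sum, any mis-measured momentum shifts the result, which is why additional pileup objects degrade the resolution.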
The di-τ mass m_ττ is reconstructed using the SVFIT algorithm [39], which optimizes the mass resolution of the di-τ final state by performing a maximum likelihood
fit that takes into account both the visible τ decay products and the E_T^miss, which
includes contributions from 2-4 neutrinos. At 8 TeV, the SVFIT mass resolution was
estimated to range between 10 and 20% depending on final state and category.
Chapter 5
Background Processes
There are a number of other SM processes with much larger cross sections than the
signal process that also have two τ leptons and two b-jets in their final state. These
include tt̄ events where the W bosons decay leptonically, and ZH and ZZ events, among
others. Additionally, because there are non-negligible fake rates for both hadronic τ
decays and b-jets, the analysis must consider processes that could fake this final state.
QCD multijet processes could be a major source of fake hadronic τ decays or b-jets
because of their overwhelmingly high production cross section, as could any other
process with multiple jets.
One way to discriminate against backgrounds is to consider the di-τ mass m_ττ and
di-b mass m_bb distributions. For the HH signal, both of these distributions should
peak near M_H = 125 GeV. For the ZH background, one would expect one peak near M_H
and one peak near M_Z. For the ZZ background, both distributions peak near M_Z. For tt̄ and QCD
multijet events, no peak is expected because the pairs of same-flavor objects do not
come from a resonant decay.
We use one additional discriminant against the tt̄ background, the stransverse
mass mT2. It was proposed for this purpose in [23], but originally designed for SUSY
searches where a pair of equal-mass particles decay into one invisible daughter and at
least one visible daughter particle with unknown parent momenta. For the purposes
of this analysis, it is defined as

\[
m_{T2}(m_B, m_B', \mathbf{b}_T, \mathbf{b}_T', \mathbf{p}_T^{\Sigma}, m_C, m_C') = \min_{\mathbf{c}_T + \mathbf{c}_T' = \mathbf{p}_T^{\Sigma}} \left\{ \max\left( m_T, m_T' \right) \right\} \tag{5.1}
\]

where b_T and b_T' are the b-jet transverse momenta, m_B and m_B' are the b-jet masses,
c_T and c_T' are the visible τ lepton candidate transverse momenta, m_C and m_C' are
the visible τ lepton candidate masses, and

\[
\mathbf{p}_T^{\Sigma} = \mathbf{p}_T^{\text{miss}} + \mathbf{p}_T(\tau) + \mathbf{p}_T(\tau') = \mathbf{p}_T(W) + \mathbf{p}_T(W') \tag{5.2}
\]

is the vector sum of the missing transverse momentum, presumably from neutrinos,
and the transverse momenta of the visible τ decay products. For tt̄ events, the mT2
variable is bounded above by the top mass. In contrast, the di-Higgs signal
distribution is bounded only by √s/2.
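The minimization in Equation 5.1 can be sketched by brute force: scan trial values of c_T on a grid, set c_T' = p_T^Σ - c_T, and keep the smallest value of max(m_T, m_T'). The transverse mass below is the standard definition, assumed here because the text does not spell it out. The grid range, grid spacing, and all example inputs are arbitrary illustrative choices, and a real analysis would use a dedicated minimizer rather than a grid scan.

```python
import math


def transverse_mass(m_vis, vis_pt, m_inv, inv_pt):
    """Standard transverse mass of a b-jet and a trial invisible-side system.

    vis_pt and inv_pt are (px, py) tuples; masses use the same units.
    """
    et_vis = math.sqrt(m_vis**2 + vis_pt[0]**2 + vis_pt[1]**2)
    et_inv = math.sqrt(m_inv**2 + inv_pt[0]**2 + inv_pt[1]**2)
    dot = vis_pt[0] * inv_pt[0] + vis_pt[1] * inv_pt[1]
    mt_sq = m_vis**2 + m_inv**2 + 2.0 * (et_vis * et_inv - dot)
    return math.sqrt(max(mt_sq, 0.0))


def mt2_grid(m_b, m_b2, b_pt, b2_pt, p_sum, m_c, m_c2,
             half_width=500.0, steps=201):
    """Crude mT2: scan trial splittings c + c' = p_sum on a square grid."""
    best = float("inf")
    for i in range(steps):
        cx = -half_width + 2.0 * half_width * i / (steps - 1)
        for j in range(steps):
            cy = -half_width + 2.0 * half_width * j / (steps - 1)
            c2 = (p_sum[0] - cx, p_sum[1] - cy)
            mt_a = transverse_mass(m_b, b_pt, m_c, (cx, cy))
            mt_b = transverse_mass(m_b2, b2_pt, m_c2, c2)
            best = min(best, max(mt_a, mt_b))
    return best


# Placeholder inputs with no transverse momentum anywhere: the minimum sits
# at c = c' = 0 and equals m_B + m_C = 85.0.
print(mt2_grid(5.0, 5.0, (0.0, 0.0), (0.0, 0.0), (0.0, 0.0), 80.0, 80.0))
```

The grid result can only overestimate the true minimum, so the resolution of the scan should be small compared with the mT2 bin width in any practical use.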
In the following chapters, the background processes shown in plots are divided
into five major categories: tt̄, SM H → ττ, Z → ττ, electroweak, and QCD. The
tt̄ background is tt̄ events with no restriction on the final state, but in the signal
region it is overwhelmingly composed of real τ decays and b-jets. The SM H → ττ
background includes gluon-gluon fusion, VBF, VH, and tt̄H production modes to the
ττ final state. In the signal region, it is dominated by [illegible], with
some contribution from VBF jets and vector boson jets faking b-jets. The Z → ττ
background includes events with any number of Z bosons and jets, but no W or H
bosons. The electroweak background includes di-boson, tri-boson, single top, and
W+jets processes. The QCD background is not shown in most plots, but is discussed
separately in Section 9.1.
Chapter 6
Monte Carlo Samples
Monte Carlo (MC) samples for all signal and background processes were generated
using the MC generation strategy developed for the Snowmass 2013 conference [40,41],
with the underlying physics processes simulated in Madgraph 5 [42], parton showering
and fragmentation performed in PYTHIA 6 [43], and the simulation of τ lepton decays
done using TAUOLA [44]. Detector simulation was performed using the Delphes fast
simulator.
One million signal events were generated with the signal yield normalized to the
NNLO Higgs pair production cross section at 14 TeV of 40.2 fb. The same number
of events were also generated with the Higgs trilinear coupling constant λ_HHH set to
5×, 0×, -1×, and -5× λ_SM. Single Higgs samples constrained to the ττ final state
for all four major production modes were also produced to provide increased statistics
using the same generator level work flow.
The background processes were generated centrally for the CMS Phase II upgrade
studies and are organized into five object categories at the generator level:
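\[
J = \{g, u, \bar{u}, d, \bar{d}, s, \bar{s}, c, \bar{c}, b, \bar{b}\}, \quad
L = \{e^+, e^-, \mu^+, \mu^-, \tau^+, \tau^-, \nu_e, \nu_\mu, \nu_\tau\},
\]
\[
B = \{W^+, W^-, Z^0, \gamma\}, \quad T = \{t, \bar{t}\}, \quad H = \{h^0\}.
\]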
Main Processes                               Final States                  Order
vector boson + jets                          B + nJ                        O(α_s^n α_w)
divector + jets                              BB + nJ                       O(α_s^n α_w^2)
top pair + jets                              TT + nJ                       O(α_s^(n+2))
top pair, off-shell T* → Wj + jets           TB + nJ                       O(α_s^(n+1) α_w)
single top (s- and t-channel) + jets         T + nJ                        O(α_s^(n-1) α_w^2)
off-shell B* → LL + jets                     LL + nJ [m_ll > 20 GeV]       O(α_s^n α_w^2)
top pair + boson                             TTB + nJ, TTH + nJ            O(α_s^(n+2) α_w)
off-shell divector B* → LL + jets            BLL + nJ [m_ll > 20 GeV]      O(α_s^n α_w^3)
tri-vector + jets, Higgs associated + jets   BBB + nJ, VH + nJ             O(α_s^n α_w^3)
gluon fusion + jets                          H + nJ                        O(α_s^n α_h)
vector boson fusion + jets                   B + nJ, H + nJ [n ≥ 2]        O(α_s^(n-2) α_w^3)

Table 6.1: List of SM background categories generated for CMS upgrade studies,
including the main processes for each category, generator-level final states, and order
in each coupling.
The included background samples are summarized in Table 6.1. Each sample
was produced in orthogonal bins of the variable S_T*, the scalar sum of the transverse
momentum of all generator level particles, with the process cross section computed
separately for each bin. The cross section of each event is computed at LO, with
the branching ratio of each final state reweighted to enrich rare decay modes. NLO
K-factors calculated using MCFM [45] and branching ratio scale factors are applied
at the event level to produce the final event weight.
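One common way to combine these ingredients into a per-event weight is sketched below. The normalization (LO cross section times K-factor times branching ratio scale factor times luminosity, divided by the number of generated events in the bin) is an assumption about how the pieces fit together, not a formula taken from the sample production, and every input value is a placeholder.

```python
def event_weight(xsec_lo_fb, k_factor, br_scale_factor,
                 luminosity_fb_inv, n_generated):
    """Per-event weight so that weighted MC yields match the expected count.

    Hypothetical normalization: sigma_LO * K * BR scale factor * L / N_generated.
    """
    if n_generated <= 0:
        raise ValueError("n_generated must be positive")
    return (xsec_lo_fb * k_factor * br_scale_factor
            * luminosity_fb_inv / n_generated)


# Placeholder inputs for one hypothetical bin (not values from the thesis).
w = event_weight(xsec_lo_fb=100.0, k_factor=1.5, br_scale_factor=1.0,
                 luminosity_fb_inv=3000.0, n_generated=1_000_000)
```

Because each bin carries its own cross section and generated event count, the weight is computed per bin and the bins are then summed to form the full distribution.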
Because of the overwhelmingly large QCD multijet cross section and the relatively
low probability for four QCD jets in a single event to fake exactly two τ lepton candidates
and two b-jets, no QCD samples were produced. Instead, an 8 TeV data-driven
estimate from same-sign ττ events, described in Section 9.1, was used to confirm that
the QCD background is likely much smaller than the expected background contribution from MC-simulated processes.
Chapter 7
Event Selection
The baseline selection criteria for the pT, |η|, and isolation of each object were
established based on previous work on the H → ττ channel at CMS, the projected HL-LHC
environment and detector capabilities, and physics properties of the HH → bbττ signal.
The object-level selection criteria for each channel are summarized in Table 7.1,
along with the 8 TeV H → ττ analysis requirements [39].
The pT thresholds for the τμτh and τeτh channels were raised because, compared to
the CMS Run I data, the HL-LHC data will have many more low energy objects due
to increased pileup, and it is unlikely the available bandwidth for the trigger menu
will increase proportionally. The τhτh channel pT threshold was not raised because
the 8 TeV thresholds already limit the selection acceptance significantly.
This analysis is restricted to essentially the same η regions as the Run I detector
because the generator-level η distributions for signal events were predicted to be
mostly central, leaving little motivation to consider the relatively poorly understood
performance of the new forward detectors. The relative isolation criteria for leptons
and hadronic τ candidates were relaxed because the isolated track requirement itself
already greatly reduces the number of jets faking hadronic τ decays in the signal
region.
The signal and background yields after applying the baseline selection criteria are
shown in Table 7.2. The harsh requirements on hadronic τ decay kinematics result
in a baseline signal yield for the τhτh channel of about 0.5% of the expected production
cross section. The τeτh and τμτh channels fare better, retaining around 1-2% of their
predicted cross sections. As expected, the tt̄ background completely overwhelms
the signal in all channels, but the Z → ττ and electroweak backgrounds also make
significant contributions to the τeτh and τμτh channels.
As previously discussed, the most powerful discriminators against non-resonant
backgrounds and resonant Z → ττ events are mττ and mbb. The baseline mττ and mbb
distributions are shown in Figures 7-1 (for τhτh), 7-2 (for τμτh), and 7-3 (for τeτh).
In all figures, the signal distribution is scaled by a factor of five hundred to
one thousand for visibility. Even at this baseline selection level, it is apparent the
background MC statistics are lacking in the τhτh channel.
There is a clear peak in the signal distributions for all three channels. For the mbb
distributions in all channels and the mττ distributions for the τμτh and τeτh channels,
this peak is near 120 GeV as expected for Higgs decays. The mττ distribution for the
τhτh channel peaks much lower, near 90 GeV, suggesting that further optimization of
the SVFIT algorithm for <PU> = 140 is possible. However, the aim of this analysis
is to demonstrate feasibility, not complete optimization.
The mττ and mbb window requirements were established separately for each channel
to strike a balance between signal acceptance and background rejection. For the
τhτh final state, the requirements are 90 < mbb < 140 GeV and 90 < mττ < 120 GeV.
For the τμτh and τeτh channels, the requirements are 90 < mbb < 130 GeV and
100 < mττ < 150 GeV.
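The mass window requirements quoted above can be written as a small lookup, shown here as a sketch. The window edges are the ones stated in the text; the channel keys, the function name, and the choice of strict inequalities at the edges are illustrative assumptions.

```python
# Mass windows in GeV, as quoted in the text: (lower edge, upper edge).
MASS_WINDOWS = {
    "tau_h tau_h":  {"mbb": (90.0, 140.0), "mtautau": (90.0, 120.0)},
    "tau_mu tau_h": {"mbb": (90.0, 130.0), "mtautau": (100.0, 150.0)},
    "tau_e tau_h":  {"mbb": (90.0, 130.0), "mtautau": (100.0, 150.0)},
}


def passes_mass_window(channel, mbb, mtautau):
    """Return True if both reconstructed masses fall inside the channel's windows."""
    window = MASS_WINDOWS[channel]
    lo_b, hi_b = window["mbb"]
    lo_t, hi_t = window["mtautau"]
    return lo_b < mbb < hi_b and lo_t < mtautau < hi_t
```

For example, an event with mbb = 125 GeV and mττ = 130 GeV would pass in the two semi-leptonic channels but fail the tighter mττ window of the fully hadronic channel.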
Table 7.3 shows the signal and background yields after applying mass window
requirements. The background contributions in the τhτh channel are greatly reduced,
but again it is apparent that the MC statistics limit the predictive power for that
channel. For the τμτh channel, the tt̄ yield is reduced by a factor of ten after the mass
window cuts, but still dwarfs the signal yield.
Table 7.1: Summary of object-level selection criteria for each di-τ channel. The
absolute isolation variable I is properly defined in the 8 TeV H → ττ analysis but
not used here. The relative isolation variable R, as previously defined, is used in its
place.
di-τ Final State   Object   8 TeV Req.          14 TeV Req.
τhτh               τh       pT > 45 GeV         pT > 45 GeV
                            |η| < 2.1           |η| < 2.1
                            I < 1.0 GeV         R < 0.4
τμτh               μ        pT > 17 - 20 GeV    pT > 30 GeV
                            |η| < 2.1           |η| < 2.5
                            R < 0.1             R < 0.4
                   τh       pT > 30 GeV         pT > 30 GeV
                            |η| < 2.4           |η| < 2.1
                            I < 1.5 GeV         R < 0.4
τeτh               e        pT > 20 - 24 GeV    pT > 30 GeV
                            |η| < 2.1           |η| < 2.5
                            R < 0.1             R < 0.4
                   τh       pT > 30 GeV         pT > 30 GeV
                            |η| < 2.4           |η| < 2.1
                            I < 1.5 GeV         R < 0.4
All                b        -                   pT > 30 GeV
                                                |η| < 2.5
Table 7.2: Expected yields in each channel for 3000 fb^-1 of integrated luminosity
after baseline selection requirements.
Process     τhτh                  τμτh                  τeτh
HH          23.6 ± 0.5            34.0 ± 0.6            30.6 ± 0.5
tt̄          (1.4 ± 0.1) × 10^4    (5.1 ± 0.1) × 10^5    (4.8 ± 0.1) × 10^5
Z → ττ      2300 ± 600            (1.7 ± 0.1) × 10^4    (1.0 ± 0.5) × 10^4
EWK         1500 ± 100            (3.9 ± 0.8) × 10^4    (2.7 ± 0.0) × 10^4
Single H    240 ± 20              960 ± 40              1000 ± 40
Table 7.3: Expected signal yields in each channel for 3000 fb^-1 of integrated luminosity after mass window requirements.
Process     τhτh          τμτh                  τeτh
HH          7.7 ± 0.3     14.4 ± 0.4            18.6 ± 0.4
tt̄          700 ± 300     (3.6 ± 0.2) × 10^4    (1.3 ± 0.0) × 10^5
Z → ττ      0.6 ± 0.6     83 ± 83               400 ± 300
EWK         60 ± 30       1600 ± 100            7300 ± 200
Single H    30 ± 10       80 ± 10               350 ± 30
Figure 7-1: Predicted mττ (top) and mbb (bottom) distributions in the τhτh channel.
The background yields are expected contributions from SM processes, while the signal
yield is the expected contribution from SM Higgs pair production scaled by a factor
of five hundred.
Figure 7-2: Predicted mττ (top) and mbb (bottom) distributions in the τμτh channel.
The background yields are expected contributions from SM processes, while the signal
yield is the expected contribution from SM Higgs pair production scaled by a factor
of one thousand.
Figure 7-3: Predicted mττ (top) and mbb (bottom) distributions in the τeτh channel.
The background yields are expected contributions from SM processes, while the signal
yield is the expected contribution from SM Higgs pair production scaled by a factor
of one thousand.
Chapter 8
Signal Extraction
The goal of this analysis is to estimate the expected significance of a Higgs pair
production cross section measurement at the Phase II CMS detector.
Because the
signal yield is expected to be quite small compared to the backgrounds, a one-variable
shape analysis for each channel is considered to exploit the differences between signal
and background distributions in key variables. The CMS combine tool [46] was used
to perform all statistical analysis.
For all three channels, a number of event-level variables were considered for signal
extraction, though only the most promising are discussed below. The stransverse
mass mT2 variable was defined in a previous chapter. The transverse momenta of
the di-b system pT(bb), the visible di-τ system pT^vis(ττ), and the overall di-H system
pT(HH) were also considered, as well as opening angles between the various selected
objects. Many of these variables are correlated because, for example, a boosted di-b
system with high pT(bb) will have a small opening angle.
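The correlation between boost and opening angle can be made concrete with a common rule of thumb that is not taken from this analysis: for a two-body decay of a parent with mass m and transverse momentum pT much larger than m, the angular separation of the decay products is roughly ΔR ≈ 2m/pT. The sketch below only evaluates that approximation; the function name is hypothetical, and the approximation ignores asymmetric decays and detector effects.

```python
def approx_opening_angle(mass_gev, pt_gev):
    """Rule-of-thumb separation Delta R ~ 2 m / pT for a boosted two-body decay."""
    if pt_gev <= 0.0:
        raise ValueError("pt_gev must be positive")
    return 2.0 * mass_gev / pt_gev
```

Under this approximation, doubling pT(bb) halves the expected opening angle of the two b-jets, which is why pT(bb) and ΔR(bb) carry overlapping information.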
8.1 Fully Hadronic Channel
The fully hadronic τhτh channel is already limited by the tt̄ MC statistics after the
mass window cuts described in the previous chapter. Figure 8-1 shows the pT(bb)
and mT2 distributions for this channel. The signal is scaled by a factor of one hundred
for visibility. Both show good separation between the signal and background
distributions, but the mT2 variable is chosen for shape analysis.
Figure 8-1: Predicted distribution of the pT(bb) (top) and mT2 (bottom) variables
in the τhτh channel after mass window cuts. The background yields are expected
contributions from SM processes, while the signal yield is the expected contribution
from SM Higgs pair production scaled by a factor of one hundred.
8.2 Semi-Leptonic Channels
Unlike the fully hadronic channel, the two semi-leptonic channels τμτh and τeτh have
ample MC statistics after the mass window cuts described in the previous chapter.
There are still overwhelming SM backgrounds, so further cuts on event-level variables
are considered. The mT2 and pT(bb) distributions for the τμτh and τeτh channels are
shown in Figures 8-2 and 8-3 respectively. In the τμτh channel, the signal distributions
are scaled by a factor of one thousand for visibility, and in the τeτh channel by a factor of
five thousand.
Based on these distributions, a further requirement that mT2 be greater than
100 GeV is applied to the τμτh and τeτh channels. Table 8.1 gives the expected signal
and background yields after the mT2 cut. While it reduces the background yields by
half in the τμτh channel and by a factor of ten in the τeτh channel without a significant
reduction in signal yield, the signal-to-background ratio is still quite poor in both channels.
Table 8.1: Expected signal yields in each channel for 3000 fb^-1 of integrated luminosity after requiring that the mT2 variable is greater than 100 GeV.
Process     τμτh                  τeτh
HH          12.7 ± 0.3            12.5 ± 0.3
tt̄          (1.2 ± 0.1) × 10^4    (1.0 ± 0.1) × 10^4
Z → ττ      83 ± 83               0.6 ± 0.6
EWK         540 ± 50              570 ± 60
Single H    34 ± 4                40 ± 10
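To put a number on how poor the signal-to-background ratio remains, the central values of Table 8.1 can be combined into S/B and the simple counting figure of merit S/√B. This is only a rough gauge under the assumption of a single-bin counting experiment with no uncertainties; it is not the shape analysis used for the results, and the dictionary layout and function name are illustrative.

```python
import math

# Central values from Table 8.1 (events in 3000 fb^-1 after mT2 > 100 GeV).
YIELDS = {
    "tau_mu tau_h": {"HH": 12.7, "ttbar": 1.2e4, "Ztautau": 83.0, "EWK": 540.0, "singleH": 34.0},
    "tau_e tau_h":  {"HH": 12.5, "ttbar": 1.0e4, "Ztautau": 0.6,  "EWK": 570.0, "singleH": 40.0},
}


def counting_figures(channel):
    """Return (S/B, S/sqrt(B)) using every non-HH process as background."""
    yields = YIELDS[channel]
    signal = yields["HH"]
    background = sum(v for name, v in yields.items() if name != "HH")
    return signal / background, signal / math.sqrt(background)
```

Both channels come out at roughly one signal event per thousand background events, which motivates using the full shape of a multivariate discriminant rather than a further one-dimensional cut.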
To improve the discrimination between signal and background samples, a boosted
decision tree (BDT) is trained to separate the Higgs pair signal and tt̄ backgrounds
after baseline selection. While in principle all descriptive variables can be used together
to create excellent separation in the BDT training sample, in practice too
many variables lead to overtraining, and the success in the training sample will not
carry over into other samples. The two BDTs for this analysis are described in detail
in Appendix A. Both are trained on the mT2 variable and on the masses, transverse
momenta, and opening angles of the di-τ, di-b, and di-H systems after baseline selection
cuts.
The BDT discriminants for each channel after all selection cuts are applied are
shown in Figure 8-4 and are used for signal extraction. Both channels show good
separation between signal and background, though the signal distributions are scaled
by a factor of five hundred for visibility.
Figure 8-2: Predicted distributions for the pT(bb) (top) and mT2 (bottom) variables
in the τμτh channel after mass window cuts. The background yields are expected
contributions from SM processes, while the signal yield is the expected contribution
from SM Higgs pair production scaled by a factor of one thousand.
Figure 8-3: Predicted distributions for the pT(bb) (top) and mT2 (bottom) variables
in the τeτh channel after mass window cuts. The background yields are expected
contributions from SM processes, while the signal yield is the expected contribution
from SM Higgs pair production scaled by a factor of five thousand.
Figure 8-4: Predicted distribution of the BDT discriminant in the τμτh (top) and
τeτh (bottom) channels for the signal region. The background yields are expected
contributions from SM processes, while the signal yield is the expected contribution
from SM Higgs pair production scaled by a factor of five hundred.
8.3 Statistical Interpretation
Two methods are used to extract the statistical significance of the potential measurement. The first is the asymptotic CLs method [47], which reports an upper limit on
the cross section that would still be consistent with the background-only hypothesis to
a specified confidence level. The second is a maximum likelihood fit using the Asimov
dataset, which estimates the expected precision on a cross section measurement.
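The idea behind the Asimov dataset can be illustrated with the single-bin median discovery significance from the asymptotic formulae of [47], Z = sqrt(2((s + b) ln(1 + s/b) − s)). The sketch below evaluates only this counting-experiment expression; it is a simplification offered for orientation, not the binned shape fit performed with the combine tool, and the function name is hypothetical.

```python
import math


def asimov_significance(s, b):
    """Median expected significance for s signal on b background (single bin, no systematics)."""
    if b <= 0.0:
        raise ValueError("background must be positive")
    if s <= 0.0:
        return 0.0
    return math.sqrt(2.0 * ((s + b) * math.log(1.0 + s / b) - s))
```

For s much smaller than b the expression reduces to the familiar s/√b, so with yields like those of the semi-leptonic channels a single counting bin gives well under one standard deviation, and the sensitivity has to come from the shape of the discriminant.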
8.4 Uncertainties
Both statistical uncertainties on the overall MC scale factor from the limited number
of MC events and systematic uncertainties are taken into account by the maximum
likelihood fit, though bin-by-bin statistical uncertainties are not. The MC statistical
uncertainties dominate for all three channels. The leading systematic uncertainties
are a 20% uncertainty from the QCD scale at NLO for the signal process, and a 9%
PDF uncertainty. The systematic uncertainty on integrated luminosity is taken to be
2.6%, the same as in the 8 TeV data. Uncertainties on the jet, lepton, and missing
energy scales are also included.
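As a rough gauge of the size of the quoted systematic uncertainties, the three numerical values given above can be added in quadrature. This assumes the sources are uncorrelated and ignores the scale uncertainties for which no numbers are quoted; the fit itself takes each source into account separately, so the sum below is illustrative only.

```python
import math

# Relative uncertainties quoted in the text.
QUOTED_SYSTEMATICS = {
    "QCD scale (signal, NLO)": 0.20,
    "PDF": 0.09,
    "integrated luminosity": 0.026,
}


def quadrature_sum(relative_uncertainties):
    """Combine uncorrelated relative uncertainties in quadrature."""
    return math.sqrt(sum(u * u for u in relative_uncertainties))


total = quadrature_sum(QUOTED_SYSTEMATICS.values())
```

The total is about 22%, dominated by the QCD scale uncertainty on the signal process.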
Chapter 9
Results
Before discussing the results of this analysis, a comparison of the 14 TeV Delphes
and 8 TeV full simulation event yields and distributions is presented to motivate the
reasonableness of those results.
9.1 Cross Check Using 8 TeV Data Sets
Because there were a number of simplifying assumptions made about detector performance
to create the Delphes fast simulation, we compare to the very well understood
8 TeV detector using H → ττ full simulation samples. The major concerns are the
QCD multijet backgrounds and hadronic τ performance.
The background yields from the Delphes samples at 14 TeV for the Phase II
detector in the two most sensitive channels, τhτh and τμτh, were compared to full
simulation samples at 8 TeV produced for the CMS H → ττ analysis [39]. A signal
sample at 8 TeV was also produced in full simulation using the same production
framework. The 8 TeV QCD contribution is estimated from data in same-sign di-τ
events that otherwise pass the selection requirements. All 8 TeV samples have the
relevant di-τ triggers applied.
Tables 9.1 and 9.2 show a comparison between the expected yields for
21 fb^-1 of integrated luminosity at √s = 8 TeV and 3000 fb^-1 of integrated luminosity
at √s = 14 TeV at two stages of the cut-based selection in the τhτh and τμτh channels
respectively. The Higgs pair and tt̄ yields from 8 TeV full simulation are scaled to
3000 fb^-1 and √s = 14 TeV. For the signal process, this is done using the 8 TeV
NNLO Higgs pair production cross section of 9.8 fb [18]. For the tt̄ process, scale
factors derived from MCFM are used.
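One plausible reading of this scaling, sketched here under stated assumptions, is that an 8 TeV yield is multiplied by the ratio of cross sections and by the ratio of integrated luminosities. The function name and arguments are hypothetical; the 14 TeV cross section is left as an input because its value is not quoted in this chapter, and the sketch assumes the selection acceptance is the same at both energies.

```python
def scale_yield(n_8tev, sigma_8tev_fb, sigma_14tev_fb, lumi_8tev_fb=21.0, lumi_14tev_fb=3000.0):
    """Scale an 8 TeV yield to 14 TeV and a larger dataset (illustrative only)."""
    if sigma_8tev_fb <= 0.0 or lumi_8tev_fb <= 0.0:
        raise ValueError("reference cross section and luminosity must be positive")
    cross_section_ratio = sigma_14tev_fb / sigma_8tev_fb
    luminosity_ratio = lumi_14tev_fb / lumi_8tev_fb
    return n_8tev * cross_section_ratio * luminosity_ratio
```

With the default luminosities the luminosity ratio alone is about 143, so most of the growth in the starred yields of Tables 9.1 and 9.2 would come from the dataset size rather than from the rise in cross section.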
Overall, the full simulation and Delphes yields are consistent. There is some
discrepancy between the signal and tt̄ yields after the mass window requirements are
applied, but this is not surprising given the expected differences in missing transverse
energy and jet pT resolution and response between <PU> = 21 and <PU> = 140.
Additionally, this comparison demonstrates that the QCD contribution to the
background, at least for 8 TeV and <PU> = 21, is negligible or zero when compared
to the dominant tt̄ background. After the mT2 > 100 GeV selection cut in the τμτh
channel, the QCD contribution is 7 ± 4 events, only slightly more than 10% of the
61 ± 2 tt̄ events.
Table 9.1: Cross check with 8 TeV full simulation at two stages of the τhτh cut-based
selection. The 8 TeV columns are from full simulation MC and are scaled to 8 TeV
with 21 fb^-1 integrated luminosity, except (*) the signal and tt̄ yields which are scaled
to 14 TeV and 3000 fb^-1 integrated luminosity. The 14 TeV columns are from Delphes
and are scaled to 3000 fb^-1 integrated luminosity.
            Baseline selection                            Mass window
Process     8 TeV                     14 TeV              8 TeV             14 TeV
HH          23.4 ± 1.2 (*)            23.6 ± 0.5          14.6 ± 1.2 (*)    7.7 ± 0.3
tt̄          (1.8 ± 0.1) × 10^4 (*)    (1.4 ± 0.1) × 10^4  530 ± 130 (*)     700 ± 300
Z → ττ      5.2 ± 0.8                 2300 ± 600          0                 0.6 ± 0.6
EWK         3.4 ± 1.1                 1500 ± 100          0.03 ± 0.03       60 ± 30
Single H    0.19 ± 0.04               240 ± 20            0.04 ± 0.02       30 ± 10
QCD         17.3 ± 4.8                -                   0                 -
The mT2 distributions for 8 TeV data and MC are also considered to demonstrate
agreement between the two and to further consider the possible QCD multijet
contributions to the background distributions at 14 TeV. Figure 9-1 shows the mT2
distributions at the baseline and mass window levels for the τhτh channel with the
signal distribution scaled by a factor of one thousand (baseline) or one hundred (mass
window). At the mass window level, the data points above mT2 = 100 GeV are not
shown in order to blind the analysis. There are no QCD multijet events remaining
after the mass window cuts.
Table 9.2: Cross check with 8 TeV full simulation at two stages of the τμτh cut-based
selection. The 8 TeV columns are from full simulation MC and are scaled to 8 TeV
with 21 fb^-1 integrated luminosity, except (*) the signal and tt̄ yields which are scaled
to 14 TeV and 3000 fb^-1 integrated luminosity. The 14 TeV columns are from Delphes
and are scaled to 3000 fb^-1 integrated luminosity.
            Baseline selection                            Mass window
Process     8 TeV                     14 TeV              8 TeV                     14 TeV
HH          30.0 ± 1.6 (*)            34.0 ± 0.6          14.6 ± 1.1 (*)            19.2 ± 0.4
tt̄          (6.6 ± 0.0) × 10^5 (*)    (5.1 ± 0.1) × 10^5  (3.4 ± 0.1) × 10^4 (*)    (4.4 ± 0.2) × 10^4
Z → ττ      12.8 ± 1.3                (1.7 ± 0.1) × 10^4  0.43 ± 0.22               (2.3 ± 0.1) × 10^4
EWK         44.7 ± 4.4                (3.9 ± 0.8) × 10^4  3.4 ± 1.3                 1893 ± 113
Single H    1.1 ± 0.1                 960 ± 40            0.2 ± 0.1                 1308 ± 45
QCD         29.9 ± 11.5               -                   6.9 ± 3.6                 -
Figure 9-2 shows the same mT2 distributions for the τμτh channel with the signal
distribution scaled by a factor of ten thousand (baseline) or one thousand (mass window).
In both the τμτh and τhτh channels, there is excellent agreement between data
and MC at the baseline selection level.
Figure 9-1: Cross check with 8 TeV data for the τhτh channel. Predicted distributions
for the mT2 variable before (top) and after (bottom) mass window cuts. The
background yields are expected contributions from SM processes, while the signal
yield is the expected contribution from SM Higgs pair production scaled by one thousand
(top) or one hundred (bottom). In the bottom figure, the data is blinded for
mT2 > 100 GeV.
Figure 9-2: Cross check with 8 TeV data for the τμτh channel. Predicted distributions
for the mT2 variable before (top) and after (bottom) mass window cuts. The background
yields are expected contributions from SM processes, while the signal yield
is the expected contribution from SM Higgs pair production scaled by ten thousand
(top) or one thousand (bottom).
9.2 14 TeV Results
The expected 95% CL upper limit on the branching ratio times cross section measurement
and the expected 1σ uncertainty on the branching ratio times cross section
measurement are computed for each channel separately, then combined. Both metrics
are listed in Table 9.3 for the separate and combined scenarios.
The main result is that the expected upper 1σ uncertainty on the branching ratio
times cross section measurement for SM Higgs pair production is 67%. This is less
precise than the result achieved by [23], as expected due to the detector effects taken
into account with our simulation. However, this result is still comparable, which is
encouraging overall for the bbττ final state. The τhτh channel appears to perform the
best, followed by τμτh and finally τeτh. Note that the lower 1σ uncertainty in the τeτh
channel is 100%, so the cross section measured in that channel alone is consistent with
zero. However, the doubly hadronic channel performance is
much less definitive than the other two because it may be the result of some artifact
in the distributions created by the limited statistics in that channel.
Table 9.3: Statistical results of the analysis, showing the asymptotic 95% CL upper
limit on the expected cross section, in units of the SM value, and the expected 1σ
uncertainties on the cross section measurement.
Channel     95% CL upper limit   +1σ           −1σ
τhτh        3.35                 +119%         −89%
τμτh        [illegible]          [illegible]   [illegible]
τeτh        5.66                 +231%         −100%
Combined    2.19                 +67%          −57%
Chapter 10
Conclusions
This analysis indicates that a measurement of Higgs pair production in bbττ final
states is feasible at the CMS detector during the HL-LHC run. The doubly hadronic
and semi-leptonic di-τ channels are considered separately to establish selection cuts,
before a shape-based signal extraction is performed. The doubly hadronic channel
uses the mT2 distribution, while the two semi-leptonic channels are analyzed using a
BDT discriminant trained to separate the tt̄ background from the signal. Additionally,
the results from 14 TeV HL-LHC fast simulation samples were found to be consistent
with 8 TeV full simulation samples, validating these projections. The expected 95%
CL upper limit on the cross section times branching ratio from a combination of all
three channels is 2.2 times the SM value, with an expected +1σ uncertainty on the
measured cross section of 67%. While much of the analysis is based on predictions
that will be updated as the detector is built and commissioned, these results indicate
that the bbττ final state could yield powerful constraints on the SM and non-SM
Higgs sectors.
Appendix A
Semi-Leptonic BDT
The TMVA [48] package was used to train two BDTs for signal versus tt̄ discrimination
in the τμτh and τeτh channels. Half of the MC events for each channel that pass the
baseline selection requirements are used for training, while the other half comprise
the testing sample. The variables used in these BDTs are:
• visible di-τ mass m_vis,
• visible di-τ transverse momentum pT^vis(ττ),
• di-τ opening angle $\Delta R(\tau\tau) = \sqrt{(\eta_{\tau_1} - \eta_{\tau_2})^2 + (\phi_{\tau_1} - \phi_{\tau_2})^2}$,
• di-b mass mbb,
• di-b transverse momentum pT(bb),
• di-b opening angle $\Delta R(bb) = \sqrt{(\eta_{b_1} - \eta_{b_2})^2 + (\phi_{b_1} - \phi_{b_2})^2}$,
• di-H mass mHH,
• di-H transverse momentum pT(HH),
• di-H opening angle $\Delta R(HH) = \sqrt{(\eta_{bb} - \eta_{\tau\tau})^2 + (\phi_{bb} - \phi_{\tau\tau})^2}$, and
• stransverse mass mT2.
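The opening angles in the list above can be computed as in the following sketch. The formula is the one given for ΔR; wrapping the azimuthal difference into [−π, π] is a standard step that is assumed here rather than stated in the text, and the function names are illustrative.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Opening angle Delta R = sqrt(d_eta^2 + d_phi^2) between two objects."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))
```

Without the wrapping, two objects on either side of the φ = ±π boundary would appear almost back to back even when they are nearly collinear.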
Figure A-1: Input variable distributions for τμτh channel BDT. The signal is shown
in blue, while the background is red.
Figure A-2: Input variable distributions for τeτh channel BDT. The signal is shown
in blue, while the background is red.
Figure A-3: Correlation matrices for the (top) signal and (bottom) background samples in τμτh channel.
Figure A-4: Overtraining check for BDT classifier in the τμτh channel. The signal is
shown in blue, while the background is red.
Figure A-5: ROC curve (background rejection versus signal efficiency) for BDT classifier in the τμτh channel.
Figure A-6: Correlation matrices for the (top) signal and (bottom) background samples in τeτh channel.
Figure A-7: Overtraining check for BDT classifier in the τeτh channel. The signal is
shown in blue, while the background is red.
Figure A-8: ROC curve (background rejection versus signal efficiency) for BDT classifier in the τeτh channel.
Bibliography
[1] S. Chatrchyan et al. Observation of a new boson at a mass of 125 GeV with the
CMS experiment at the LHC. Phys.Lett., B716:30-61, 2012.
[2] S. Chatrchyan et al. Observation of a new boson with mass near 125 GeV in pp
collisions at sqrt(s) = 7 and 8 TeV. JHEP, 06:081, 2013.
[3] G. Aad et al. Observation of a new particle in the search for the Standard Model
Higgs boson with the ATLAS detector at the LHC. Phys.Lett., B716:1-29, 2012.
[4] S. Chatrchyan et al. Study of the mass and spin-parity of the higgs boson
candidate via its decays to z boson pairs. Phys. Rev. Lett., 110:081803, Feb
2013.
[5] G. Aad et al. Evidence for the spin-0 nature of the Higgs boson using ATLAS
data. Phys.Lett., B726:120-144, 2013.
[6] V. Khachatryan et al. Precise determination of the mass of the higgs boson and
tests of compatibility of its couplings with the standard model predictions using
proton collisions at 7 and 8 tev. Technical Report arXiv:1412.8662. CERN-PH-EP-2014-288.
CMS-HIG-14-009, CERN, Geneva, Dec 2014. Comments: Submitted to Eur. Phys. J. C.
[7] F. Englert and R. Brout. Broken Symmetry and the Mass of Gauge Vector
Mesons. Phys.Rev.Lett., 13:321-323, 1964.
[8] P.W. Higgs. Broken symmetries, massless particles and gauge fields. Phys.Lett.,
12:132-133, 1964.
[9] P.W. Higgs. Broken Symmetries and the Masses of Gauge Bosons.
Phys.Rev.Lett., 13:508-509, 1964.
[10] G.S. Guralnik, C.R. Hagen, and T.W.B. Kibble. Global Conservation Laws and
Massless Particles. Phys.Rev.Lett., 13:585-587, 1964.
[11] P.W. Higgs. Spontaneous Symmetry Breakdown without Massless Bosons.
Phys.Rev., 145:1156-1163, 1966.
[12] T.W.B. Kibble. Symmetry Breaking in Non-Abelian Gauge Theories. Phys.Rev.,
155:1554-1561, 1967.
[13] S.L. Glashow. Partial-symmetries of weak interactions. Nucl.Phys., 22:579-588,
1961.
[14] S. Weinberg. A Model of Leptons. Phys.Rev.Lett., 19:1264-1266, 1967.
[15] A. Salam. Elementary Particle Physics: Relativistic Groups and Analyticity.
page 367, 1968. Proceedings of the eighth Nobel symposium.
[16] M. Carena, C. Grojean, M. Kado, and V. Sharma. Status of Higgs Boson Physics.
Chin.Phys., C38, 2014.
[17] J. Baglio, A. Djouadi, R. Grober, M.M. Muhlleitner, J. Quevillon, et al. The
measurement of the Higgs self-coupling at the LHC: theoretical status. JHEP,
1304:151, 2013.
[18] D. de Florian and J. Mazzitelli. Higgs boson pair production at next-to-next-to-leading order in qcd. Phys. Rev. Lett., 111:201801, Nov 2013.
[19] T. Plehn, M. Spira, and P.M. Zerwas. Pair production of neutral Higgs particles
in gluon-gluon collisions. Nucl.Phys., B479:46-64, 1996.
[20] M.J. Dolan, C. Englert, and M. Spannowsky. New physics in lhc higgs boson
pair production. Phys. Rev. D, 87:055002, Mar 2013.
[21] J.M. No and M. Ramsey-Musolf. Probing the higgs portal at the lhc through
resonant di-higgs production. Phys. Rev. D, 89:095031, May 2014.
[22] M. J. Dolan, C. Englert, and M. Spannowsky. Higgs self-coupling measurements
at the LHC. JHEP, 1210:112, 2012.
[23] A.J. Barr, M.J. Dolan, C. Englert, and M. Spannowsky. Di-higgs final states
augMT2ed: Selecting hh events at the high luminosity lhc. Phys.Lett., B728:308-313, 2014.
[24] D.E.F. de Lima, A. Papaefstathiou, and M. Spannowsky. Standard model higgs
boson pair production in the (bbbb) final state. Journal of High Energy Physics,
2014(8), 2014.
[25] K. Olive et al. Tau Branching Fractions. Chin.Phys., C38, 2014.
[26] S. Chatrchyan et al. The CMS experiment at the CERN LHC. Journal of
Instrumentation, 3(08):S08004, 2008.
[27] Particle-Flow Event Reconstruction in CMS and Performance for Jets, Taus,
and MET. Technical Report CMS-PAS-PFT-09-001, CERN, Geneva, Apr 2009.
[28] Commissioning of the Particle-Flow reconstruction in Minimum-Bias and Jet
Events from pp Collisions at 7 TeV. Technical Report CMS-PAS-PFT-10-002,
CERN, Geneva, 2010.
[29] Particle-flow commissioning with muons and electrons from J/Psi and W events
at 7 TeV. Technical Report CMS-PAS-PFT-10-003, CERN, 2010. Geneva, 2010.
[30] J. de Favereau, C. Delaere, P. Demin, A. Giammanco, V. Lemaitre, et al.
DELPHES 3, A modular framework for fast simulation of a generic collider
experiment. 2013.
[31] S. Agostinelli et al. GEANT4: a simulation toolkit. Nucl. Instrum. Meth.,
A506:250, 2003.
[32] M. Cacciari, G.P. Salam, and G. Soyez. The anti-kt jet clustering algorithm.
Journal of High Energy Physics, 2008(04):063, 2008.
[33] M. Cacciari, G.P. Salam, and G. Soyez. FastJet User Manual. Eur.Phys.J.,
C72:1896, 2012.
[34] Pileup jet identification. Technical Report CMS PAS JME-13-005, CERN, 2013.
[35] S. Chatrchyan et al. Identification of b-quark jets with the CMS experiment.
JINST, 8:P04013, 2013.
[36] Performance of b tagging at sqrt(s)=8 tev in multijet, ttbar and boosted topology
events. Technical Report CMS-PAS-BTV-13-001, CERN, Geneva, 2013.
[37] Results on b-tagging identification in 8 tev pp collisions. Technical Report CMS-
DP-2013-005, CERN, 2013.
[38] S. Chatrchyan et al. Performance of τ-lepton reconstruction and identification in
CMS. J. Instrum., 7(arXiv:1109.6034. CMS-TAU-11-001. CERN-PH-EP-2011-137):P01001. 33 p, Sep 2011.
[39] S. Chatrchyan et al. Evidence for the 125 gev higgs boson decaying to a pair of
tau leptons. Journal of High Energy Physics, 2014(5), 2014.
[40] A. Avetisyan, J.M. Campbell, T. Cohen, N. Dhingra, J. Hirschauer, et al. Methods and Results for Standard Model Event Generation at √s = 14 TeV, 33 TeV
and 100 TeV Proton Colliders (A Snowmass Whitepaper). Technical report,
2013.
[41] J. Anderson, A. Avetisyan, R. Brock, S. Chekanov, T. Cohen, et al. Snowmass
Energy Frontier Simulations. 2013.
[42] J. Alwall, M. Herquet, F. Maltoni, O. Mattelaer, and T. Stelzer. MadGraph 5:
Going Beyond. JHEP, 1106:128, 2011.
[43] T. Sjöstrand, S. Mrenna, and P. Z. Skands. PYTHIA 6.4 Physics and Manual.
JHEP, 05:026, 2006.
[44] Stanislaw Jadach, Johann H. Kühn, and Zbigniew Wąs.
Tauola - a library of monte carlo programs to simulate decays of polarized tau leptons.
Comp.Phys. Com., 64(2):275-299, 1991.
[45] J.M Campbell and R.K. Ellis. Mcfm for the tevatron and the lhc. Nuc.Phys. B
(Proc. Suppl.), 205-206:10 - 15, 2010. Loops and Legs in Quantum Field Theory
Proceedings of the 10th DESY Workshop on Elementary Particle Theory.
[46] Procedure for the LHC Higgs boson search combination in Summer 2011. Tech-
nical Report CMS-NOTE-2011-005. ATL-PHYS-PUB-2011-11, CERN, Geneva,
Aug 2011.
[47] G. Cowan, K. Cranmer, E. Gross, and O. Vitells. Asymptotic formulae for
likelihood-based tests of new physics. Eur.Phys.J., C71:1554, 2011.
[48] A. Hoecker, P. Speckmayer, J. Stelzer, J. Therhaag, E. von Toerne, and H. Voss.
TMVA: Toolkit for Multivariate Data Analysis. PoS, ACAT:040, 2007.