

CMS Simulated Dataset Names

Documentation Guide


What is the “simulated dataset” name?

It is the part of the full dataset name between the first and the second slash, i.e. for the dataset

/GluGluToHToZZTo4L_M-550_7TeV-minloHJJ-pythia6-tauola/Summer11LegDR-PU_S13_START53_LV6-v1/AODSIM

the simulated dataset name is

GluGluToHToZZTo4L_M-550_7TeV-minloHJJ-pythia6-tauola.
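As an illustration, the extraction described above can be sketched in a few lines of Python. The function name `simulated_dataset_name` is hypothetical and not part of any CMS tool; the sketch assumes the full dataset name always has exactly three slash-separated fields.

```python
def simulated_dataset_name(dataset_path: str) -> str:
    """Return the field between the first and second slash of a dataset path."""
    parts = dataset_path.split("/")
    # Assumption: a full name looks like /<name>/<processing>/<tier>, so
    # splitting on "/" gives an empty string followed by three fields.
    if len(parts) != 4 or parts[0] != "" or not all(parts[1:]):
        raise ValueError(f"unexpected dataset path: {dataset_path!r}")
    return parts[1]


path = (
    "/GluGluToHToZZTo4L_M-550_7TeV-minloHJJ-pythia6-tauola"
    "/Summer11LegDR-PU_S13_START53_LV6-v1/AODSIM"
)
print(simulated_dataset_name(path))
```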

Convention

CMS uses the following convention for this name:

PROCESS_RANGETYPE-RANGELOWtoRANGEHIGH_FILTER_TUNE_COMMENT_COMENERGY-GENERATOR,

where

  • PROCESS is the physics process in the sample, e.g. QCD, TT, W, DY
  • RANGETYPE is the variable the sample is binned in, e.g. Pt, M
  • RANGELOW is the lower bound of that range in GeV, e.g. 0, 10, 200
  • RANGEHIGH is the upper bound
  • FILTER denotes information on additional filters applied
  • TUNE is the underlying-event tune (e.g. TuneZ2Star)
  • COMMENT for additional comments
  • COMENERGY is the centre-of-mass collision energy, e.g. 7TeV, 10TeV
  • GENERATOR is the generator used, e.g. pythia6, herwigpp, sherpa

Some details on all these parts are below.

PROCESS

To unify this part CMS uses the following conventions:

  • all ‘particles’ start with a capital letter, followed by lowercase letters, e.g. W, Z, Mu, Tau, E, Nu, Wplus, H, Jets, Tbar, B, Bbar
  • if a specific decay is simulated, this is specified using the keyword To, e.g. WToENu, HToWWTo2L2Nu
  • initial-state particles are only specified if needed to distinguish between other processes, e.g. GluGluToWW with respect to WW
  • charge for a particle is only specified if relevant, i.e. Wplus is used if only W+ is in the sample, and Tbar if an anti-top is present; WplusWminus is not used for W-pair production and a top-antitop pair is represented with TT
  • if there is (a) more than one particle of the same kind and (b) more than two particles in total, a count prefix is used, e.g. 2E2Nu rather than EENuNu
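The To keyword and the counting rule above can be sketched as follows. Both function names are hypothetical. The sketch assumes that "To" never occurs inside a particle keyword, and that a particle appearing only once gets no count prefix; the text above does not state the second point, so treat it as an assumption.

```python
from itertools import groupby


def split_decay_chain(process: str) -> list[str]:
    """Split a PROCESS string into its stages at the keyword To."""
    # Assumption: "To" never appears inside a particle keyword.
    return process.split("To")


def collapse_particles(particles: list[str]) -> str:
    """Join particle keywords, using count prefixes where the rule applies."""
    # Count runs of adjacent identical particles, e.g. E, E, Nu, Nu.
    groups = [(name, sum(1 for _ in grp)) for name, grp in groupby(particles)]
    has_repeat = any(count > 1 for _, count in groups)
    # Counts are used only with a repeated particle AND more than two in total.
    if not (has_repeat and len(particles) > 2):
        return "".join(particles)
    # Assumption: a particle that appears once is written without a "1".
    return "".join(f"{count if count > 1 else ''}{name}" for name, count in groups)


print(split_decay_chain("HToWWTo2L2Nu"))
print(collapse_particles(["E", "E", "Nu", "Nu"]))
print(collapse_particles(["W", "W"]))
```

With these rules a W pair stays WW, because there are only two particles in total, while two electrons plus two neutrinos become 2E2Nu.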

The following conventions are applied for the most common particles:

| Particle | Keyword | Modifications if needed |
| --- | --- | --- |
| electron/positron | `E` | `Eminus`, `Eplus` |
| muon | `Mu` | `Muplus`, `Muminus` |
| tau | `Tau` | `Tauplus`, `Tauminus` |
| charged lepton | `L` | |
| neutrino | `Nu` | `Nue`, `Numu`, `Nutau`, `Nuebar`, … |
| W | `W` | `Wplus`, `Wminus`, `Wprime` |
| Z (no photon) | `Z` | `Zprime` |
| Z/photon (Drell-Yan) | `DY` | |
| photon | `G` | |
| gluon | `Glu` | |
| top | `T` | `Tbar`, `Tprime` |
| bottom | `B` | `Bbar`, `Bprime` |
| quark | `Q` | `Qbar` |
| jet | `Jet` | `Jets` for more than one (inclusive); `1Jet`, `2Jets`, … for exclusive |
| jet | `J` | for decay products, if both quarks and gluons are produced |

RANGETYPE, RANGELOW, RANGEHIGH

This part denotes the variable, if any, in which the sample is binned, together with the range for that variable. Typical variables are Pt (pthat) and M (invariant mass). With M, this part may also specify the mass of a particle that is used as a parameter (e.g. Higgs mass, Z′ mass, …). Examples are

  • Pt-100To200 for binning in pthat from 100 GeV to 200 GeV
  • Pt-50 if there is only a lower cut on pthat
  • M-30 as a lower inv. mass cut (e.g. in Drell-Yan)
  • M-160 for Higgs mass of 160 GeV

If needed, the variable may be accompanied by specification of the particle in the process on which the cut is applied:

  • PtW-300 if there is a lower cut on the W pthat

This part of the dataset name may be dropped if there is no binning, inv. mass cuts etc.
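A minimal sketch of how such a range field could be parsed is given below. The `Range` type, the function `parse_range` and the regular expression are illustrative assumptions covering only the Pt and M examples listed above; real dataset names may use other variables or formats.

```python
import re
from typing import NamedTuple, Optional


class Range(NamedTuple):
    variable: str            # "Pt" or "M"
    particle: Optional[str]  # particle the cut applies to, e.g. "W" in PtW-300
    low: int                 # lower bound or mass parameter, in GeV
    high: Optional[int]      # upper bound in GeV, None if there is none


# Assumption: only Pt and M are handled, with integer bounds.
_RANGE = re.compile(r"^(Pt|M)([A-Za-z]*)-(\d+)(?:To(\d+))?$")


def parse_range(field: str) -> Range:
    """Parse a field such as Pt-100To200, Pt-50, M-30 or PtW-300."""
    match = _RANGE.match(field)
    if match is None:
        raise ValueError(f"not a recognised range field: {field!r}")
    variable, particle, low, high = match.groups()
    return Range(variable, particle or None, int(low), int(high) if high else None)


for field in ("Pt-100To200", "Pt-50", "M-30", "PtW-300"):
    print(field, parse_range(field))
```

Note that the field alone does not say whether M-160 is a lower mass cut or a mass parameter; that depends on the process.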

FILTER

This part specifies a GEN-level filter for the production, if present. It is, however, not standardised; here are a few examples:

  • EMEnriched: a filter enriching the sample with electrons/photons is applied
  • MuEnriched: GEN-level filter on muons
  • MuEnrichedPt5: Pt cut of 5 GeV on GEN-level muons

COMMENT

Used only if necessary information could not be put in any of the other fields, e.g. only QCD or only EWK production, special selections, etc.

COMENERGY and GENERATOR

Units are appended to the integer value in COMENERGY, e.g. 7TeV, 10TeV, 900GeV, 2360GeV and so on.

The generators used are:

| Generator | Keyword |
| --- | --- |
| Pythia6 | `pythia6` |
| Pythia8 | `pythia8` |
| Herwig6 | `herwig6` |
| Herwig++ | `herwigpp` |
| Sherpa | `sherpa` |
| MadGraph | `madgraph` |
| MadGraph **not showered with Pythia6** | `madgraph-herwigpp`, … |
| Alpgen | `alpgen` |
| Alpgen **not showered with Pythia6** | `alpgen-herwig6`, … |
| MC@NLO | `mcatnlo` |
| MC@NLO **not showered with Herwig6** | `mcatnlo-pythia6`, … |
| POWHEG | `powheg` |
| POWHEG **not showered with Pythia6** | `powheg-pythia8`, … |
| HARDCOL | `hardcol` |
| BCVEGPY 2 | `bcvegpy2` |
| … | … |

If a specialised decay tool was used, it was appended to the name, e.g. if EvtGen was used after Pythia6, the keyword would be …-pythia6-evtgen.
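The COMENERGY-GENERATOR part can be read back with a short sketch like the one below. The function names are hypothetical. The sketch assumes that the last underscore-separated field of the name follows the convention above, i.e. an integer energy with a TeV or GeV unit, followed by hyphen-separated generator and decay-tool keywords.

```python
import re

_ENERGY = re.compile(r"^(\d+)(TeV|GeV)$")


def energy_in_gev(comenergy: str) -> int:
    """Convert a COMENERGY keyword such as 7TeV or 900GeV to GeV."""
    match = _ENERGY.match(comenergy)
    if match is None:
        raise ValueError(f"not a recognised COMENERGY keyword: {comenergy!r}")
    value, unit = int(match.group(1)), match.group(2)
    return value * 1000 if unit == "TeV" else value


def split_energy_and_generator(name: str) -> tuple[int, list[str]]:
    """Return (energy in GeV, generator/tool keywords) for a dataset name."""
    # Assumption: the last "_"-separated field is COMENERGY-GENERATOR.
    tail = name.rsplit("_", 1)[-1]
    energy, *tools = tail.split("-")
    return energy_in_gev(energy), tools


name = "GluGluToHToZZTo4L_M-550_7TeV-minloHJJ-pythia6-tauola"
print(split_energy_and_generator(name))
```

The keywords are returned in the order they appear in the name, so an appended decay tool such as tauola or evtgen comes last.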
