CMS Public tutorial example

CMS is one of the Large Hadron Collider (LHC) experiments at CERN. It is a general-purpose detector that was used to discover the Higgs boson and is used to search for new physics beyond the Standard Model.

The CMS Public Tutorial can be performed in many ways, depending on the physics you are interested in. For simplicity, we will focus on the Z boson in this example. We aim to reproduce the outputs shown in the FAST-HEP example repository.

To perform this analysis, we will need to:

  1. Create variables not present in the data: muon transverse momentum, muon isolation, and number of isolated muons.

  2. Select isolated muon pairs with opposite charge and calculate the invariant mass of the pair.

  3. Create histograms of the number of muons and number of isolated muons per event.

  4. Select events based on number of isolated muons, a trigger decision, and the transverse momentum of the muons.

  5. Create histograms of the invariant mass of the muon pair.

  6. Present the results in a publication-ready plot.

In the following sections, we will go through each of these steps, and show how to define them in a fasthep-flow workflow.

Setup

Preparing the data

The data for this tutorial are available on CERNBOX:

{
  "data.root": "https://cernbox.cern.ch/index.php/s/9QU53dsR2AQPxz8/download",
  "dy.root": "https://cernbox.cern.ch/index.php/s/x4cRGGXNvQ2ZnDy/download",
  "qcd.root": "https://cernbox.cern.ch/index.php/s/IYznddwu1oX9zpc/download",
  "single_top.root": "https://cernbox.cern.ch/index.php/s/8cwxZYSAz1QV83w/download",
  "ttbar.root": "https://cernbox.cern.ch/index.php/s/3s6Haj3SLGqPuAK/download",
  "wjets.root": "https://cernbox.cern.ch/index.php/s/tGNjygyJFvSs2Dc/download",
  "ww.root": "https://cernbox.cern.ch/index.php/s/dFaiOi8JJVzCN8L/download",
  "wz.root": "https://cernbox.cern.ch/index.php/s/W7hNNy47F7D8X80/download",
  "zz.root": "https://cernbox.cern.ch/index.php/s/CRlo8JP0Htvg4Dm/download"
}

You can download manually or using the fasthep-cli:

fasthep download --json /path/to/json --destination /path/to/data

Note

While you can automate the data download and curator steps, we will do them manually in this example. Both could be added as stages of the type fasthep_flow.operators.bash.BashOperator.
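For a manual download without the CLI, the JSON mapping above can be walked with the Python standard library. This is a minimal sketch, not the implementation of fasthep download: the function names plan_downloads and download_all are hypothetical, and the JSON file is assumed to map file names to URLs exactly as shown above.

```python
import json
import urllib.request
from pathlib import Path


def plan_downloads(mapping, destination):
    """Pair each URL with the local path it should be saved to."""
    dest = Path(destination)
    return [(url, dest / filename) for filename, url in sorted(mapping.items())]


def download_all(json_path, destination):
    """Download every file listed in a JSON mapping of file name -> URL."""
    mapping = json.loads(Path(json_path).read_text())
    Path(destination).mkdir(parents=True, exist_ok=True)
    for url, target in plan_downloads(mapping, destination):
        if target.exists():
            continue  # skip files that are already present
        urllib.request.urlretrieve(url, str(target))


# Offline check of the planning step only (no network access needed).
example = {"dy.root": "https://cernbox.cern.ch/index.php/s/x4cRGGXNvQ2ZnDy/download"}
plan = plan_downloads(example, "/path/to/data")
```

Calling download_all("/path/to/json", "/path/to/data") would then fetch each file in turn; the paths are placeholders, as in the command above.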

Putting together the workflow

Input data

The first step is to define the input data. In this case, we will use the output from the fasthep-curator step and pass it to the first stage of the workflow.

stages:
  - name: Input data
    type: fasthep_carpenter.operators.InputDataOperator
    kwargs:
      curator_config: "/path/to/curator.yaml"
      split_strategy: "file"
      split_kwargs:
        n: 1
      method: uproot5

Typically, only the name, type, and curator_config are needed here, since the other values are the defaults. We have included them for completeness.
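One plausible reading of split_strategy: "file" with n: 1 is that the input files are grouped into chunks of n files, each processed as a separate unit of work. The text does not confirm this, so the sketch below is an assumption about the idea, not a description of how fasthep-carpenter splits its input; split_by_file is a hypothetical helper.

```python
def split_by_file(files, n=1):
    """Group a list of input files into chunks of at most n files each."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [files[i:i + n] for i in range(0, len(files), n)]


# With n=1 (assumed meaning of the default), every file becomes its own chunk.
files = ["data.root", "dy.root", "ttbar.root"]
chunks = split_by_file(files, n=1)
```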

Creating variables

Adding onto the stages:

- name: Create variables
  type: fasthep_carpenter.operators.CreateVariablesOperator
  kwargs:
    variables:
      - name: Muon_Pt
        type: float32
        expr: "sqrt(Muon_Px ** 2 + Muon_Py ** 2)"
      - name: IsoMuon_Idx
        type: bool
        expr: "(Muon_Iso / Muon_Pt) < 0.10"
      - name: NIsoMuon
        type: int32
        expr: "count(IsoMuon_Idx)"

OK, there is a lot going on here. Let’s break it down.

First, we use fasthep_carpenter.operators.CreateVariablesOperator and give it a list of variables to create. Each variable is defined by a name, a type, and an expr: the name under which the variable is stored, its data type, and the expression used to calculate it. The expr is a string that is evaluated with numexpr for simple expressions and with fasthep_expr for more complex ones. It can use any variable in the input data, any variable defined earlier in the workflow (NIsoMuon, for example, is built from IsoMuon_Idx), and any function provided by fasthep-carpenter.
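To make the three expressions concrete, here is what they compute for a single event, written in plain Python. This is only an illustration of the arithmetic, not how the operator evaluates expressions; the function name create_variables and the input numbers are made up for the example.

```python
import math


def create_variables(muon_px, muon_py, muon_iso, max_rel_iso=0.10):
    """Compute Muon_Pt, the isolation mask, and NIsoMuon for one event."""
    # Muon_Pt: sqrt(Muon_Px ** 2 + Muon_Py ** 2), one value per muon
    muon_pt = [math.sqrt(px ** 2 + py ** 2) for px, py in zip(muon_px, muon_py)]
    # IsoMuon_Idx: (Muon_Iso / Muon_Pt) < 0.10, one flag per muon
    # (assumes every muon has non-zero transverse momentum)
    iso_muon_idx = [iso / pt < max_rel_iso for iso, pt in zip(muon_iso, muon_pt)]
    # NIsoMuon: count(IsoMuon_Idx), one value per event
    n_iso_muon = sum(iso_muon_idx)
    return muon_pt, iso_muon_idx, n_iso_muon


# One event with two muons (illustrative numbers, not from the tutorial files).
pt, is_iso, n_iso = create_variables([30.0, -3.0], [40.0, 4.0], [2.0, 1.0])
```

Here the first muon has a relative isolation of 2.0 / 50.0 = 0.04 and passes, while the second has 1.0 / 5.0 = 0.2 and fails, so the event contains one isolated muon.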

Selecting muon pairs

Next, we want to select muon pairs and calculate their invariant mass. We will use the fasthep_carpenter.operators.DiObjectMass for this:

- name: Muon Invariant Mass
  type: fasthep_carpenter.operators.DiObjectMass
  kwargs:
    four_momenta: ["Muon_Px", "Muon_Py", "Muon_Pz", "Muon_E"]
    output: "DiMuonMass"
    when:
      all:
        - "NIsoMuon >= 2"
        - "Muon_Charge[0] == -Muon_Charge[1]"

Note

There is also a more general fasthep_carpenter.operators.InvariantMassOperator that can be used to calculate the invariant mass of more than two objects.
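For reference, the invariant mass of a system of particles follows from the summed four-momenta as m = sqrt((ΣE)² − |Σp|²), in units where c = 1. The sketch below implements that formula for any number of objects; it illustrates the physics only and is not the code behind either operator. The function name invariant_mass and the example momenta are chosen for illustration.

```python
import math


def invariant_mass(four_momenta):
    """Invariant mass of particles given as (px, py, pz, E) tuples, with c = 1."""
    px = sum(p[0] for p in four_momenta)
    py = sum(p[1] for p in four_momenta)
    pz = sum(p[2] for p in four_momenta)
    energy = sum(p[3] for p in four_momenta)
    mass_squared = energy ** 2 - (px ** 2 + py ** 2 + pz ** 2)
    # Rounding can push mass_squared slightly below zero; clamp before the root.
    return math.sqrt(max(mass_squared, 0.0))


# Two back-to-back, effectively massless muons (illustrative values).
mu_minus = (0.0, 0.0, 45.6, 45.6)
mu_plus = (0.0, 0.0, -45.6, 45.6)
mass = invariant_mass([mu_minus, mu_plus])
```

In this example the momenta cancel, so the mass equals the total energy, approximately 91.2 in the units of the inputs.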

Creating histograms

There are two places in this analysis example where we want to create histograms: before the selection and after. Let’s start with the first histogram: the number of muons and number of isolated muons per event. We already have the definitions of these variables, so we can use them directly:

- name: Histograms before selection
  type: fasthep_carpenter.operators.HistogramOperator
  kwargs:
    histograms:
      - name: NMuon
        input: "NMuon"
        edges: [0, 1, 2, 3, 4, 5]
      - name: NIsoMuon
        input: "NIsoMuon"
        edges: [0, 1, 2, 3, 4, 5]
    weights: ["EventWeight"]

The fasthep_carpenter.operators.HistogramOperator takes a list of histograms to create. Each histogram is defined by a name, an input, and either edges or bins. The name identifies the histogram, the input is the variable to fill it with, and the edges are the bin edges. The input can be any variable defined in the workflow, and the edges can be any list of increasing numbers. The weights are optional and can be any list of variables defined in the workflow.
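The following sketch shows what filling a weighted histogram with explicit edges amounts to. It is a plain-Python illustration under stated assumptions, not the operator's implementation: bins are taken to be half-open intervals [low, high), values outside the edges are dropped, and a single weight per event is used. The function name fill_histogram and the sample values are hypothetical.

```python
from bisect import bisect_right


def fill_histogram(values, edges, weights=None):
    """Sum the weights of the values that fall into each [low, high) bin."""
    if weights is None:
        weights = [1.0] * len(values)
    counts = [0.0] * (len(edges) - 1)
    for value, weight in zip(values, weights):
        if value < edges[0] or value >= edges[-1]:
            continue  # assumption: out-of-range values are ignored
        counts[bisect_right(edges, value) - 1] += weight
    return counts


# Five events with their muon multiplicity and event weight (made-up numbers).
n_muon = [2, 2, 1, 3, 0]
event_weight = [1.0, 0.5, 1.0, 2.0, 1.0]
counts = fill_histogram(n_muon, [0, 1, 2, 3, 4, 5], event_weight)
```

The two events with two muons contribute weights 1.0 and 0.5 to the third bin, which therefore holds 1.5.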

Note

All histograms will be created in a folder named after the stage, with spaces in the stage name replaced by _. All histogram names will be prefixed with hist_. This behaviour can be changed by setting histogram['folder_rule'] and histogram['prefix'] in the global section of the fasthep-flow configuration file.
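As a quick illustration of the default naming rule described in this note, the hypothetical helper below derives the folder and histogram name for a stage. It mirrors only the rule as stated here and is not taken from the fasthep-flow code.

```python
def histogram_location(stage_name, histogram_name, prefix="hist_"):
    """Return (folder, name) following the default rule described above."""
    folder = stage_name.replace(" ", "_")
    return folder, prefix + histogram_name


folder, name = histogram_location("Histograms before selection", "NMuon")
```

For the stage defined above, this gives the folder Histograms_before_selection and the histogram name hist_NMuon.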

Selecting events

For this example the selection is very simple: require that a High Level Trigger (HLT) path fired, that there are at least two isolated muons in the event, and that the first muon in the event has a transverse momentum above 25 GeV. We can use the fasthep_carpenter.operators.SelectorOperator for this:

- name: Select events
  type: fasthep_carpenter.operators.SelectorOperator
  kwargs:
    when:
      all:
        - "triggerIsoMu24 == 1"
        - "NIsoMuon >= 2"
        - "first(Muon_Pt) > 25"

The fasthep_carpenter.operators.SelectorOperator takes its selection conditions under the when key. The conditions are listed beneath either all (every condition must pass) or any (at least one must pass); the combination mode is optional and defaults to all. Each condition is an expression that can use any variable defined in the workflow. Selection stages are special, since they also keep track of the number of events before and after the selection.
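The logic of such a selection, including the before and after event counts, can be sketched in a few lines of Python. This is a conceptual illustration only: events are represented as dictionaries, conditions as Python callables rather than expression strings, and select_events is a hypothetical name.

```python
def select_events(events, conditions, mode="all"):
    """Keep events passing all (or any) conditions and record the event counts."""
    if mode not in ("all", "any"):
        raise ValueError("mode must be 'all' or 'any'")
    combine = all if mode == "all" else any
    selected = [ev for ev in events if combine(cond(ev) for cond in conditions)]
    cutflow = {"before": len(events), "after": len(selected)}
    return selected, cutflow


# The three conditions from the stage above, written as callables.
conditions = [
    lambda ev: ev["triggerIsoMu24"] == 1,
    lambda ev: ev["NIsoMuon"] >= 2,
    lambda ev: len(ev["Muon_Pt"]) > 0 and ev["Muon_Pt"][0] > 25,
]

# Three made-up events: only the first passes every condition.
events = [
    {"triggerIsoMu24": 1, "NIsoMuon": 2, "Muon_Pt": [40.0, 30.0]},
    {"triggerIsoMu24": 0, "NIsoMuon": 2, "Muon_Pt": [50.0, 28.0]},
    {"triggerIsoMu24": 1, "NIsoMuon": 2, "Muon_Pt": [20.0, 15.0]},
]
selected, cutflow = select_events(events, conditions)
```

Switching to mode="any" would keep all three events here, since each of them passes at least one condition.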

Creating histograms after selection

It is now time to create histograms of the invariant mass of the muons after the selection. We can use the fasthep_carpenter.operators.HistogramOperator again:

- name: Histograms after selection
  type: fasthep_carpenter.operators.HistogramOperator
  kwargs:
    histograms:
      - name: DiMuonMass
        input: "DiMuonMass"
        bins: { low: 60, high: 120, nbins: 60 }
    weights: ["EventWeight"]
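The bins form used for DiMuonMass can be related to an explicit list of edges. The sketch below assumes that low, high, and nbins describe equal-width bins, which the text does not state explicitly; bins_to_edges is a hypothetical helper, not part of fasthep-carpenter.

```python
def bins_to_edges(low, high, nbins):
    """Expand a (low, high, nbins) specification into equal-width bin edges."""
    if nbins < 1 or high <= low:
        raise ValueError("need nbins >= 1 and high > low")
    width = (high - low) / nbins
    return [low + i * width for i in range(nbins + 1)]


# The DiMuonMass binning from the stage above.
edges = bins_to_edges(low=60, high=120, nbins=60)
```

Under this assumption the DiMuonMass histogram has 61 edges from 60 to 120, that is, bins one unit wide around the Z boson mass.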

Output data

Finally, we want to save the output of the workflow. We can use the fasthep_carpenter.operators.OutputDataOperator for this:

- name: Output data
  type: fasthep_carpenter.operators.OutputDataOperator
  kwargs:
    path: "/path/to/output"
    method: uproot4

The fasthep_carpenter.operators.OutputDataOperator takes a path and a method: the path to the output file(s) and the method used to write them. The method can be any method supported by fasthep-carpenter.

Making paper-ready plots

The final step is to make a paper-ready plot. We will use the fasthep_flow.operators.bash.BashOperator for this:

- name: Make paper-ready plot
  type: fasthep_flow.operators.bash.BashOperator
  kwargs:
    bash_command: |
      fasthep plotter \
        --input /path/to/output \
        --output /path/to/output/plots/
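Conceptually, a stage like this hands a command string to a shell. The sketch below shows that idea with the Python standard library; it is not how fasthep-flow implements BashOperator, the helper name run_bash is hypothetical, and a harmless echo stands in for the plotting command so the example runs anywhere bash is available.

```python
import subprocess


def run_bash(bash_command):
    """Run a command string in bash and return its standard output."""
    result = subprocess.run(
        ["bash", "-c", bash_command],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


# Placeholder command; the real stage would pass the plotting command instead.
output = run_bash("echo plotting step placeholder")
```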

Putting it all together