France-Berkeley PHYSTAT Conference on Unfolding

Europe/Paris
Anja Butter (Centre National de la Recherche Scientifique (FR)), Ben Nachman (Lawrence Berkeley National Lab. (US)), Lydia Brenner (Nikhef National institute for subatomic physics (NL))
Description

A central task in differential cross section measurements in particle-, nuclear-, and astrophysics is unfolding: the removal of detector distortions, also called deblurring or deconvolution.  Unfolding is a challenging inverse, simulation-based inference task.  

The goal of this conference is to bring together method developers and practitioners to discuss the state of the art in unfolding. One key aspect of the conference will be machine learning-based unfolding methods, which have enabled new possibilities (e.g. unbinned and high-dimensional measurements).

The conference will be held at the LPNHE in Paris from June 10 to 13, 2024. A Zoom connection will also be available for remote participation.

Organizing Committee:

Olaf Behnke
Lydia Brenner
Anja Butter
Louis Lyons
Bogdan Malaescu
Ben Nachman
 

Acknowledgements: We are grateful to the France-Berkeley Fund for sponsorship and to PHYSTAT for logistical support.

   

Registration
Participation on-line (no fee)
Registration for in-person participation (€100 fee)
Participants
  • Alessandro Tarabini
  • Andrea Giammanco
  • Andres Daniel Perez
  • Andrew Fowlie
  • Anja Butter
  • Baptiste Ravina
  • Ben Nachman
  • Bertrand Laforge
  • Biao Wang
  • Bogdan Malaescu
  • Caio Cesar Daumann
  • Carlos Mana
  • Carsten Burgard
  • Dimitri Bourilkov
  • Elzbieta Richter-Was
  • Fernando Torales Acosta
  • Gianluca Bianco
  • Humberto Reyes-Gonzalez
  • Jad Mathieu Sardain
  • Javier Mariño Villadamigo
  • Judita Mamuzic
  • Krish Desai
  • Kyle Cormier
  • Laura Brittany Havener
  • Laurent Lellouch
  • Louis Lyons
  • Lucas Kang
  • Lydia Brenner
  • Maja Mackowiak-Pawlowska
  • Marcelo Gameiro Munhoz
  • Michael Dolce
  • Molly Park
  • Nan Lu
  • Nathan Hütsch
  • Olaf Behnke
  • Oleksandr Zenaiev
  • Rafal Maselek
  • Rahul Balasubramanian
  • Ricardo Barrué
  • Shilpi Jain
  • Simone Gasperini
  • Sneh Shuchi
  • Stefan Katsarov
  • Syed Anwar Ul Hasan
  • Tim Adye
  • Tom Cavaliere
  • Yingjie Wei
  • +19
    • 11:30 13:30
      Registration 2h
    • 13:30 13:45
      Welcome and Logistics 15m
      Speakers: Anja Butter (Centre National de la Recherche Scientifique (FR)), Ben Nachman (Lawrence Berkeley National Lab. (US)), Lydia Brenner (Nikhef National institute for subatomic physics (NL))
    • 13:45 14:30
      HEP overview (30'+15') 45m
    • 14:30 15:15
      Statistics overview (30'+15') 45m
      Speaker: Mikael Kuusela (Carnegie Mellon University (US))
    • 15:15 16:00
      ML overview (30'+15') 45m
      Speaker: Tilman Plehn (Heidelberg University)
    • 16:00 17:30
      Welcome reception 1h 30m
    • 09:30 10:10
      Binned ML methods overview (20'+20') 40m
    • 10:10 10:50
      Unbinned Discriminative ML methods overview (20'+20') 40m
    • 10:50 11:20
      Coffee 30m
    • 11:20 12:00
      Unbinned Generative ML methods overview (20'+20') 40m
      Speakers: Javier Mariño, Nathan Hütsch
    • 12:00 13:30
      Lunch 1h 30m
    • 13:30 14:10
      Performance / benchmarking (20'+20') 40m
      Speakers: Dr Carsten Burgard (Technische Universitaet Dortmund (DE)), Lydia Brenner (Nikhef National institute for subatomic physics (NL))
    • 14:10 14:50
      How to pick the regularization (20'+20') 40m
      Speaker: Lydia Brenner (Nikhef National institute for subatomic physics (NL))
    • 14:50 15:20
      Coffee 30m
    • 15:20 16:50
      Contributed talks 1h 30m
    • 09:30 10:50
      Challenges
      • 09:30
        Dealing with Uncertainties 20m
        Speaker: Kyle Cormier (University of Zurich (CH))
      • 10:10
        Response Matrix Estimation in Unfolding Differential Cross Sections 20m

        In the unfolding problem, the response matrix is the forward operator that models the detector response. In practice, the response matrix is not known analytically. Instead, it must be estimated from Monte Carlo simulation, which introduces statistical uncertainty into the unfolding procedure. This raises the question of how to estimate the response matrix in a sensible way. In most analyses at the LHC, this is done by binning the events and counting how many events migrate from each truth-level bin to each detector-level bin. However, this approach can suffer from undersmoothing, especially with a small sample size. To address this issue, we propose a two-step approach to response matrix estimation. First, we estimate the response kernel on the unbinned space. Second, we propagate the estimated response kernel into an integral equation to obtain an estimate of the response matrix.
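
        As an illustration of the conventional binned estimate described in this abstract (not the proposed two-step method), the following minimal sketch counts toy Monte Carlo events per bin pair. The exponential truth spectrum, Gaussian smearing, binning, and all variable names are assumptions made for the example only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Monte Carlo sample (assumed): truth values and smeared detector-level values.
n_events = 10_000
truth = rng.exponential(scale=1.0, size=n_events)
reco = truth + rng.normal(loc=0.0, scale=0.3, size=n_events)

edges = np.linspace(0.0, 4.0, 9)  # 8 bins, used for both truth and reco

# counts[i, j] = number of MC events with reco in bin i and truth in bin j
counts, _, _ = np.histogram2d(reco, truth, bins=[edges, edges])

# Events generated in each truth bin, including those reconstructed out of range
n_truth, _ = np.histogram(truth, bins=edges)

# R[i, j] estimates P(reco in bin i | truth in bin j)
response = np.divide(
    counts, n_truth, out=np.zeros_like(counts), where=n_truth > 0
)

# Column sums are per-truth-bin efficiencies (at most 1)
efficiency = response.sum(axis=0)

# Binomial standard error of each element, reflecting the finite MC sample
stat_err = np.sqrt(
    np.divide(
        response * (1.0 - response),
        n_truth,
        out=np.zeros_like(response),
        where=n_truth > 0,
    )
)
```

        Sparsely populated bins have large values in `stat_err`, which is one way to see the statistical uncertainty that the Monte Carlo estimate brings into the unfolding.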

        Speaker: Richard Zhu (Carnegie Mellon University)
    • 10:50 11:20
      Coffee 30m
    • 11:20 12:00
      Challenges
      • 11:20
        Open discussion on challenges and possible solutions in unfolding 40m
        Speaker: Bogdan Malaescu (LPNHE-Paris CNRS/IN2P3 (FR))
    • 12:00 13:30
      Lunch 1h 30m
    • 13:30 17:30
      Tutorials
      Conveners: Dr Carsten Burgard (Technische Universitaet Dortmund (DE)), Javier Mariño, Lydia Brenner (Nikhef National institute for subatomic physics (NL)), Nathan Hütsch, Vincent Alexander Croft (Nikhef National institute for subatomic physics (NL))
    • 18:30 20:30
      Conference dinner 2h
    • 09:30 10:50
      New/related methods
      • 09:30
        Simplified Template Cross Sections (STXS) 20m
        Speaker: Rahul Balasubramanian (Centre National de la Recherche Scientifique (FR))
      • 10:10
        QUnfold: Quantum Annealing for Distributions Unfolding in High-Energy Physics 20m

        In High-Energy Physics (HEP) experiments, each measurement apparatus exhibits a unique signature in terms of detection efficiency, resolution, and geometric acceptance. The overall effect is that the measured distribution of each observable in a given physical process can be smeared and biased. Unfolding is the statistical technique employed to correct for this distortion and restore the original distribution. This process is essential for meaningful comparisons between the results of different experiments and theoretical predictions.
        The emerging technology of Quantum Computing represents an enticing opportunity to enhance the unfolding performance and potentially yield more accurate results.
        This work introduces QUnfold, a simple Python module designed to address the unfolding challenge by harnessing the capabilities of quantum annealing. In particular, the regularized log-likelihood minimization formulation of the unfolding problem is translated into a Quadratic Unconstrained Binary Optimization (QUBO) problem, which can be solved with quantum annealing systems. The algorithm is validated on a simulated sample of particle collision data generated by combining the MadGraph Monte Carlo event generator with the Delphes simulation software to model the detector response. A variety of fundamental kinematic distributions are unfolded and the results are compared with conventional unfolding algorithms commonly adopted in precision measurements at the Large Hadron Collider (LHC) at CERN.
        The implementation of the quantum unfolding model relies on the D-Wave Ocean software, and the algorithm is run on heuristic classical solvers as well as on the physical D-Wave Advantage quantum annealer, which has more than 5000 qubits.
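
        To make the translation to a binary optimization problem concrete, here is a minimal sketch that is not taken from QUnfold itself. It assumes a least-squares objective with a curvature penalty, ||R x - d||^2 + lam ||D x||^2, and a plain binary encoding of each bin content; the function names, the toy response matrix, and the brute-force solver standing in for an annealer are all hypothetical.

```python
import itertools
import numpy as np


def build_qubo(response, data, lam, n_bits):
    """Return (Q, E, offset) such that, for a bit vector b,
    b @ Q @ b + offset == ||R x - d||^2 + lam * ||D x||^2 with x = E @ b."""
    n_bins = response.shape[1]
    # Second-difference operator used as an assumed curvature regularizer
    D = np.diff(np.eye(n_bins), n=2, axis=0)
    A = response.T @ response + lam * D.T @ D
    # Binary encoding: bin j holds sum_k 2^k * b[j * n_bits + k]
    weights = (2.0 ** np.arange(n_bits))[np.newaxis, :]
    E = np.kron(np.eye(n_bins), weights)
    Q = E.T @ A @ E
    # Since b_i^2 == b_i, the linear term can be placed on the diagonal
    linear = -2.0 * E.T @ response.T @ data
    Q[np.diag_indices_from(Q)] += linear
    offset = float(data @ data)
    return Q, E, offset


def solve_brute_force(Q):
    """Exhaustive search over all bit strings; feasible only for tiny problems."""
    best_bits, best_energy = None, np.inf
    for bits in itertools.product([0, 1], repeat=Q.shape[0]):
        b = np.array(bits, dtype=float)
        energy = b @ Q @ b
        if energy < best_energy:
            best_bits, best_energy = b, energy
    return best_bits, best_energy


# Toy closure check: 3 bins, 3 bits per bin (contents 0..7), no regularization
response = np.array([[0.8, 0.1, 0.0],
                     [0.2, 0.8, 0.2],
                     [0.0, 0.1, 0.8]])
x_true = np.array([5.0, 3.0, 1.0])
data = response @ x_true  # noise-free detector-level histogram

Q, E, offset = build_qubo(response, data, lam=0.0, n_bits=3)
best_bits, best_energy = solve_brute_force(Q)
x_hat = E @ best_bits  # decode the bit string back to bin contents
```

        In this noise-free toy with `lam=0.0`, the decoded `x_hat` matches `x_true`. On realistic problems the exhaustive search would be replaced by a classical heuristic or an annealer, and the number of bits per bin sets the resolution of the result.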

        Speakers: Dr Gianluca Bianco (Universita e INFN, Bologna (IT)), Simone Gasperini (Universita e INFN, Bologna (IT))
    • 10:50 11:20
      Coffee 30m
    • 11:20 12:00
      New/related methods
      • 11:20
        Full Event Particle-Level Unfolding with Variable Length Latent Variational Diffusion 20m

        Collisions at the Large Hadron Collider (LHC) provide information about the values of parameters in theories of fundamental physics. Extracting measurements of these parameters requires accounting for effects introduced by the particle detector used to observe the collisions. The typical approach is to use a high-fidelity simulation of the detector to generate synthetic datasets that can then be compared directly with experimental data. However, these simulations are often proprietary and computationally expensive. An alternative approach, unfolding, statistically adjusts the experimental data for detector effects. Traditional unfolding algorithms require binning data in a small set of pre-selected dimensions. Recent methods using generative machine learning models have shown promise for performing unbinned unfolding in high dimensions, allowing later computation of many observables. However, all current generative approaches are limited to unfolding a fixed set of observables, making them unable to perform full-event unfolding in the variable-dimensional environment of collider data. A novel modification to the variational latent diffusion model (VLD) approach to generative unfolding is presented, which allows for unfolding of high- and variable-dimensional feature spaces. The performance of this method is evaluated in the context of semi-leptonic $t\bar{t}$ production at the LHC. Additionally, the dependence of the unfolding on the training data prior is assessed by evaluating the model on datasets with alternative priors.

        Speakers: Alexander Shmakov (University of California Irvine (US)), Kevin Thomas Greif (University of California Irvine (US))
    • 12:00 13:30
      Lunch 1h 30m
    • 13:30 14:00
      HEP summary 30m
    • 14:00 14:30
      Statistics summary 30m
    • 14:30 15:00
      ML summary 30m
    • 15:00 15:30
      Closing of the meeting 30m