This C++ version of BAT is still being maintained, but addition of new features is unlikely. Check out our new incarnation, BAT.jl, the Bayesian analysis toolkit in Julia. In addition to Metropolis-Hastings sampling, BAT.jl supports Hamiltonian Monte Carlo (HMC) with automatic differentiation, automatic prior-based parameter space transformations, and much more. See the BAT.jl documentation.

Signal search in the presence of background - tutorial for BAT 0.9

Physics motivation

The search for a weak signal in the presence of background is a common situation in particle physics. In particular, searches for neutrino-less double-beta-decay are limited by the background level reached by experiments. In this case, the signal process - if present - would cause a sharp line with known shape and position in the energy spectrum measured in the experiments. The expected number of signal and background events can reach a level of the order of 10 in the region of interest resulting in count-rates of less than one event per bin. Approximations valid for large statistics do not apply in this case.

Tutorial

This tutorial shows how to judge the validity of models and how to compare those models. Different goodness-of-fit tests are compared. The base line for the analysis presented is the following: the data is assumed to be composed of events from different contributions, e.g., various background sources and a signal process. The shapes of the resulting distributions are assumed to be known and stacked on top of each other. A binned Likelihood fit, assuming independent Poissonian fluctuations in each bin, is performed by scaling the distributions, or templates. The scale factors of the contributions are the parameters of the model and named N1, N2, etc.

The tutorial is split into six steps:


Step 1 - Getting started

Create the HypothesisTest project. Compile and run the code. You can clean up the runHypothesisTest.cxx file from the automatically generated code that are not needed. Modify the file runHypothesisTest.cxx such that the histogram containing the data, hist_data, is read in from the file histograms.root and plot it. The ROOT file can be found here.

Solution


Step 2 - Fitting a background-only model

The first model we test considers only background. The shape of the distribution is assumed to be flat in the region of interest.

  1. Create a new instance of BCTemplateFitter.

  2. Read in the background template (histogram) hist_bkg1 from the ROOT file.

  3. Add the histogram to the BCTemplateFitter using AddTemplate (TH1D hist, const char *name, double Nmin=0, double Nmax=0) where Nmin and Nmax are the minimum and maximum values of the normalization. The normalization is the integral of all contributions over the whole range.

  4. Run the Markov chains and estimate the global mode.

  5. Print the stack plot using the method BCTemplateFitter::PrintStack(const char * filename).

Solution


Step 3 - Introduction to the goodness-of-fit tests

Does the model describe the data? Perform the following three tests for the goodness of fit and compare the results with an inspection by eye:

  1. χ2-test using the methods double BCTemplateFitter::CalculateChi2(std::vector<double> parameters), int BCTemplateFitter::GetNDF(), and double BCTemplateFitter::GetChi2Prob(std::vector<double> parameters);

  2. Kolmogorov-Smirnov test using the method double BCTemplateFitter::CalculateKSProb();

  3. p-value test using the method double BCTemplateFitter::CalculatePValue().

Solution


Step 4 - Goodness-of-fit tests for signal and background models

After the goodness-fit-test for the background-only model, a second model is introduced which contains a signal contribution. The shape of the signal is a Gaussian centered around 2,040 keV with a width of σ = 2.5 keV.

  1. Create a second BCTemplateFitter and read in the signal template, hist_sgn, from the ROOT file.

  2. Estimate the signal contribution (95% limit) and print out the marginalized distribution.

  3. Perform the goodness-of-fit tests introduced in the previous step for the signal model. Does the signal model describe the data?

Solution


Step 5 - Comparing signal and background models

Both models describe the data. Compare the models directly by using the BCModelManager.

  1. Create an instance of the BCModelManager.

  2. Add the two models.

  3. Calculate the Bayes Factor for the two models using the method BCModelManager::BayesFactor(modelindex1,modelindex2). The method can also perform a calculation of other useful quantities like p-values, posterior probabilities.

Solution


Step 6 - Testing more complex models

A better model for the background could be a composition of two distributions: a flat one and an expotentially decaying one.

  1. Create two new BCTemplateFitter and read in the second background template, hist_bkg2, from the ROOT file.
  2. Perform the same tests as in the previous step and compare the results. Which of the four models describes the data best? Which model is preferred over which?
  3. Compare the two signal models (with one and two sources of background, respectively) and calculate the limit on the signal contribution. How does the background model change the limits?

Solution



Further material and documentation

A sensitivity study for the GERDA experiment can be found here:




Kevin Kröninger
Last modified: Fri Feb 25 14:17:23 CET 2011