Analysis and CP Tools
Attila Krasznahorkay

Overview
• This is meant as a description of the tools that you'll be using in your data analysis outside of Athena
  • You can find a tutorial on how to do realistic data analysis in Athena here
• You will be able to try out all of them in the hands-on part in the afternoon
• As yesterday, this is mostly to convey the basic design ideas

RootCore
• Is a lightweight build system used for simple analysis packages
  • Used to be merely a smart Makefile at first
  • These days it's a bunch of Python/shell code that parses the description file in your package and creates a Makefile based on that
• Full documentation is available here
• You'll see much more in the hands-on part, but simple operation is usually like the following:

    # rc checkout_pkg atlas-krasznaa/AODUpgrade/xAODPerformance/trunk
        (check out this package from SVN)
    # rc find_packages
        (find all packages in the work area)
    # rc compile
        (compile all local packages)
    # rc version
        (show the version of all used packages)

rcSetup
• Just as we have helper code for setting up the environment for Athena (AtlasSetup), we have helper code for setting up analysis releases as well
• The purpose is the same: set up the runtime environment such that RootCore finds both the local and the centrally provided packages, and is able to compile them against each other
• Full documentation is available here
• But today we will just use it very simply, like the following:

    # setupATLAS
    # rcSetup Base,2.0.22

xAODRootAccess
• The package holding all the code for accessing xAOD files outside of Athena
  • Available under Control/xAODRootAccess
• Combines design ideas from:
  • Athena (StoreGateSvc)
  • D3PDReader
• A few main interfaces/functions in the package, and a bunch of helper classes
  • xAOD::Init(): Function setting up the "environment" for xAOD access
  • xAOD::TEvent: The main class for interacting with xAOD files
  • xAOD::TStore: A generic transient object store
  • xAOD::MakeTransientTree(): Function(s) creating a TTree for interactive use

xAOD::TEvent - A Lightweight Event Store
• A combination of the offline software's StoreGateSvc and the D3PDReader code
  • The interface is that of StoreGateSvc, while the underlying implementation comes mostly from the D3PDReader code
• Can be connected to one or no input file and to one or no output file
  • In any combination

xAOD::TStore - A Lightweight Transient Store
• Provides the same interface as TEvent, but for storing transient information during the analysis
• Can be a convenient way of passing information around between parts of the analysis code
• Used by dual-use tools to record information meant for the user
  • Who can then decide whether or not to put the objects into TEvent (and the output file)

Analysis Tools in Run 1
• Written to work using the D3PD variables
  • Using them from Athena was in many cases near impossible (many D3PD variables were calculated in non-trivial ways from the AOD information)
• Simple C++ classes, mostly written absolutely from scratch
  • Receiving configuration in their own way
  • Printing messages in their own way
  • Receiving information about objects in their own way
  • Receiving setup for systematics application in their own way
• You usually had to get familiar with the implementation of each tool in order to use it correctly

Dual-Use Tools
• Allow us to write a single source code that can work in Athena and ROOT
  • Documented on: https://cds.cern.ch/record/1639568
• Doesn't mean binary compatibility between Athena and ROOT!
  • Only means source compatibility, as long as you follow the recommendations…
• The interface for the tool developers is very similar to that of the "regular" Athena tools
  • For things like accessing event data and metadata, printing messages, etc.

[Shown on the slide: front page of the ATLAS note "Dual-use tools in ATLAS", draft version 0.051, January 17, 2014; not reviewed, for internal circulation only]
Authors: D. Adams, P.A. Delsart, M. Elsing, K. Köneke, E. Lançon, W. Lavrijsen, S. Strandberg, W. Verkerke, I. Vivarelli, M. Woudstra
Abstract: This note provides recommendations and guidelines for the implementation and interfaces of dual-use tools. With the advent of xAODs, analysis and combined performance software packages should be available through both Athena and ROOT frameworks. Standard interfaces for these dual-use tools have to be defined. Current use cases are listed and requirements for the tools are presented in this note together with recommendations for the implementations. Naming and coding conventions are also proposed. Examples and class diagrams are provided in the appendix.
Copyright 2014 CERN for the benefit of the ATLAS Collaboration. Reproduction of this article or parts of it is allowed as specified in the CC-BY-3.0 license.

Tool Design Guidelines
• In order to provide the same, or very similar, interfaces to all the tools, a set of guidelines was written up
  • https://cds.cern.ch/record/1667206
• These were extensively discussed in tutorials to the tool developers
  • Check here if you need to work on such a tool

[Shown on the slide: first page of "Design Guidelines for Combined Performance tools intended to be used in Physics analysis", AMSG Task Force 3, Draft 0.9, June 30, 2014; not reviewed, for internal circulation only]
Introduction: This document describes a set of design guidelines for analysis tools provided by ATLAS (combined) performance groups that are intended for physics analysis. The goal of this document is to arrive at a homogeneous collection of CP analysis tools where tools that perform similar tasks have a similar look-and-feel, and where routine physics analysis tasks, such as the evaluation of systematic uncertainties, are straightforward to implement by physics users. This document describes the interface for the implementation of the core functionality of analysis tools. The basic interfaces for ROOT/Athena dual-use tools to access configuration data, event store data etc. are described in [1].
Version History
• v0.5 - LaTeX version of slides discussed in the TF3 meeting [2] on Feb 17, and ASG plenary session in the S&C week [3] on Feb 25.
• v0.8 - Incorporates feedback from ASG plenary session, following TF3 discussion [4] and CDS comments up to March 21.
• v0.9 - Adds the recommendation of storing a pointer to the original object when making a copy in the tools. Finalises recommendations about systematic uncertainties.
1 Naming and types of tools
Almost all tools provided by CP groups to physics users that operate on single objects can be classified in four types; the excerpt on the slide lists the first three:
• Object Calibration Tools - These tools adjust the kinematics of objects in a deterministic way.
• Object Smearing Tools - These tools adjust the kinematics of objects in a stochastic way, i.e. there is a randomized process that applies a different correction to each object with the intent to smear kinematic resolutions.
• Object Efficiency Correction Tools - These tools do not modify the kinematics of an object, but adjust for selection efficiencies that occur in the triggering, reconstruction and/or identification of the object. The output is an object weight that should be propagated into subsequent analysis steps (and ultimately into the statistical analysis).

Design Guidelines
1. Naming of tools
  • CP tools falling under one of these categories need to carry Calibration, Smearing, Efficiency or Selection in their name
2. Calibration and smearing tool interfaces
  • Tools need to provide this sort of function for applying their calibration/smearing
3. Efficiency tool interfaces
  • Tools need to provide functions like:
4. A number of additional rules that are not worth going into detail about
  • Please read the CDS document for details

CP Tools and Systematics
• What we did with tools in the past:
  • Every developer implemented a slightly different way to tell the tool what sort of systematic variation to apply at any given moment
  • Mostly done through extra function parameter values
  • Required the user to write quite a bit of extra/smart code around the tools in order to iterate over all the systematics
• The dual-use tools help us with this as well!
• All tools that implement systematic variations need to implement the CP::ISystematicsTool interface

CP::ISystematicsTool

CP::SystematicRegistry
• Your (singleton) interface to configure the behaviour of all the instantiated CP tools
• Currently it provides you with the systematic variations that you are advised to take into account with the tools that you're currently using
• Later on I also want to teach it to talk with the CP tools themselves
  • So that analysis code would only ever need to talk to this one interface to configure which systematic variation its tools should apply at any given moment

Systematics Coding Guidelines
• A full example and a helper framework is in development (see here for more details)
  • But the proper usage of the tools will not depend on using this new framework (QuickAna)
• It is hard to give a recipe for the exact way to incorporate systematics into a given analysis
  • The best method may differ wildly based on the base code
  • But a few better developed examples will definitely be provided
• All in all, we didn't want to re-invent the wheel here, just to avoid forcing everybody to write the exact same technical code over and over again

EventLoop
• Is a lightweight implementation of the most important part of Athena: the event loop
  • It just provides the ability to schedule algorithms in a sequence, and to provide ntuple/xAOD files as input to these algorithms
  • No specific support for generic services and tools is given
    • In order to keep the code simple
• The tutorial is mostly about introducing you to writing code for EventLoop
  • Writing an EventLoop algorithm class, and running it
• The full documentation is available here

SampleHandler
• In a full physics analysis you need to treat possibly hundreds of datasets
  • And bookkeep how you produced output files / histograms from each one of them
• SampleHandler is code meant to help with this
  • It simplifies handling locally downloaded datasets and DQ2/Rucio datasets
  • It keeps metadata on your output files for future reference
• In the tutorial only relatively simple use cases are shown, but it should be possible to extrapolate from these to realistic setups
• The full documentation is available here

Summary
• Once again, this was just a run through the more important aspects of the analysis tools
  • The hands-on tutorial will show practically all of these tools in action…
• In general, if you need some tool for your analysis that seems like it should exist already, it probably does
• But if it doesn't, and there's agreement that it should be made, there are two options:
  • You make a request, and one of the analysis tool developers implements it
  • You implement it yourself, and get (OTP) credit for the development and maintenance
• Finally, the place to discuss all analysis related issues is atlas-sw-analysis-forum@cern.ch

For the Adventurous…
• After installing an analysis release, it is possible to start writing your own code using smart IDEs
  • Mostly tested with MacOS X / Xcode, but KDevelop should work as well
  • Project still in a very early stage
  • If you're interested, I can show you how to set it up

Backup

Writing Code From Scratch
• You can always just write a simple executable "from scratch" when you want to test simple things
  • You still need to use RootCore for the compilation, unless you want to cause yourself a lot of issues
• Things you need to do to access a file:
  • Call xAOD::Init() before doing basically anything else
  • Create an xAOD::TEvent object
  • Either create a TChain with the files you will be processing, or open the files one by one in a loop
  • Call xAOD::TEvent::readFrom(…) on either the opened TFile or the configured TChain pointer
  • Call xAOD::TEvent::getEntry(…) to select which event to look at
  • Start retrieving objects/containers using xAOD::TEvent::retrieve<…>(…)

Writing Code in a TSelector
• While the recommendation is to use EventLoop if you want to write new analysis code, it is possible to adapt your old TSelector based analysis to xAOD input
  • Note that PROOF-Lite is working fine, but full-blown PROOF support is largely untested at the moment
• Short instructions:
  • Forward declare xAOD::TEvent in the TSelector class's header, so as not to let rootcint see the xAOD code. Create the object in the constructor with new.
  • Remember the TTree pointer in Init(TTree*)
  • Use this pointer in Notify() to get the file to call xAOD::TEvent::readFrom(TFile*) with
  • In Process(Long64_t) call xAOD::TEvent::getEntry(…) with the provided event number

Writing Code in PyROOT
• Since using xAOD::TEvent from interactive ROOT or PyROOT would be impractical (dictionaries for template functions…), the way of accessing xAOD files from such environments is using "transient trees"
  • Concept taken directly from AthenaROOTAccess
  • xAOD::MakeTransientTree creates a TTree object that is owned by the function, and provides a functional view of the xAOD interface containers of the file
    • The returned object is actually of type xAOD::TTree, inheriting from ROOT's TTree
• To be used only from interactive ROOT / PyROOT
  • The transient tree code is not particularly efficient. It's not terrible, but in compiled code we can do better by using xAOD::TEvent directly.
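
For comparison with the transient tree approach, the compiled access sequence listed under "Writing Code From Scratch" (initialise, create the event object, connect it to an input, select an entry, retrieve a container) can be sketched roughly as below. This is only an illustration of the call order: everything in the `access_sketch` namespace is a hypothetical stand-in written for this example, it does not read ROOT files, and the real xAOD::Init() and xAOD::TEvent signatures should be taken from the xAODRootAccess package itself.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins: they mimic the *shape* of the access pattern only
// and are NOT the real xAOD::Init / xAOD::TEvent code.
namespace access_sketch {

struct Electron { double pt; };                          // stand-in for an xAOD object
using ElectronContainer = std::vector<Electron>;         // stand-in for a container
using Event = std::map<std::string, ElectronContainer>;  // containers by key
using File  = std::vector<Event>;                        // stand-in for an input file

// Step 1: set up the "environment" (nothing to do in this sketch).
inline bool Init() { return true; }

class TEvent {
public:
   // Step 4: connect the event object to an already opened input.
   bool readFrom(const File* file) {
      m_file = file;
      m_entry = -1;
      return file != nullptr;
   }
   // Number of events available in the connected input.
   long getEntries() const {
      return m_file ? static_cast<long>(m_file->size()) : 0;
   }
   // Step 5: select the event to look at; negative return value on failure.
   int getEntry(long entry) {
      if (!m_file || entry < 0 || entry >= getEntries()) return -1;
      m_entry = entry;
      return 1;
   }
   // Step 6: StoreGate-style retrieve; fills the pointer and reports success.
   bool retrieve(const ElectronContainer*& out, const std::string& key) const {
      out = nullptr;
      if (!m_file || m_entry < 0) return false;
      const Event& ev = (*m_file)[static_cast<std::size_t>(m_entry)];
      const auto it = ev.find(key);
      if (it == ev.end()) return false;
      out = &it->second;
      return true;
   }
private:
   const File* m_file = nullptr;
   long m_entry = -1;
};

// Runs the whole sequence and returns the summed pT of all electrons.
inline double sumElectronPt(const File& file, const std::string& key) {
   if (!Init()) throw std::runtime_error("Init failed");
   TEvent event;                                          // step 2
   if (!event.readFrom(&file)) throw std::runtime_error("readFrom failed");
   double sum = 0.0;
   for (long i = 0; i < event.getEntries(); ++i) {
      if (event.getEntry(i) < 0) throw std::runtime_error("getEntry failed");
      const ElectronContainer* electrons = nullptr;
      if (!event.retrieve(electrons, key)) continue;      // key missing in this event
      for (const Electron& el : *electrons) sum += el.pt;
   }
   return sum;
}

} // namespace access_sketch
```

Calling `access_sketch::sumElectronPt(file, "Electrons")` on a hand-built `File` walks through all entries once; in a real executable the stand-in `File` would be replaced by a TFile or TChain, and the executable would still be compiled through RootCore.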
xAOD Access with PyROOT
• Implemented in a similar way to how AthenaROOTAccess (ARA) works
• You create a "transient tree" in memory that represents the data in the file, and interact with this tree instead of the one that's in the file itself

Finding Examples
• Many standalone examples (or test codes) exist already. More advanced users should feel free to browse them for inspiration.
  • https://svnweb.cern.ch/trac/atlasgroups/browser/PAT/AODUpgrade/CPToolTests/trunk
  • https://svnweb.cern.ch/trac/atlasgroups/browser/PAT/AODUpgrade/xAODSelector/trunk
  • https://svnweb.cern.ch/cern/wsvn/atlas-krasznaa/AODUpgrade/xAODTest/trunk/
  • https://svnweb.cern.ch/cern/wsvn/atlas-krasznaa/AODUpgrade/xAODCopyTest/trunk/
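
Before digging into the packages above, it may help to see the pattern that the "CP Tools and Systematics" slides describe in miniature: every tool exposes the same systematics interface, so the analysis can loop over variations without tool-specific code. The sketch below is an assumption-laden illustration, not the ATLAS code: the `syst_sketch` types, the method names, the variation names and the exaggerated scale factors are all hypothetical stand-ins loosely modelled on the idea of CP::ISystematicsTool.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins illustrating the common-interface idea only.
namespace syst_sketch {

using SystematicSet = std::set<std::string>;  // the empty set means "nominal"

struct Muon { double pt; };                   // stand-in for an xAOD object

// Common interface that every systematics-aware tool implements.
class ISystematicsTool {
public:
   virtual ~ISystematicsTool() = default;
   // Variations the tool advises the user to evaluate.
   virtual SystematicSet recommendedSystematics() const = 0;
   // Configure the tool for one variation; false if a name is unknown.
   virtual bool applySystematicVariation(const SystematicSet& syst) = 0;
};

// Toy calibration tool; the scale factors are exaggerated on purpose.
class MuonCalibrationTool : public ISystematicsTool {
public:
   SystematicSet recommendedSystematics() const override {
      return SystematicSet{"MUON_SCALE__1down", "MUON_SCALE__1up"};
   }
   bool applySystematicVariation(const SystematicSet& syst) override {
      m_scale = 1.0;  // start from the nominal configuration
      for (const std::string& name : syst) {
         if (name == "MUON_SCALE__1up") m_scale = 1.5;
         else if (name == "MUON_SCALE__1down") m_scale = 0.5;
         else return false;
      }
      return true;
   }
   // Calibrate the object in place using the current configuration.
   void applyCorrection(Muon& mu) const { mu.pt *= m_scale; }
private:
   double m_scale = 1.0;
};

// Applies the tool once for nominal and once per recommended variation,
// always starting from a fresh copy of the input object.
inline std::vector<double> ptPerVariation(MuonCalibrationTool& tool,
                                          const Muon& input) {
   std::vector<SystematicSet> variations;
   variations.emplace_back();  // nominal first
   for (const std::string& name : tool.recommendedSystematics()) {
      variations.push_back(SystematicSet{name});
   }
   std::vector<double> result;
   for (const SystematicSet& syst : variations) {
      if (!tool.applySystematicVariation(syst)) continue;
      Muon copy = input;
      tool.applyCorrection(copy);
      result.push_back(copy.pt);
   }
   return result;
}

} // namespace syst_sketch
```

The point of the design is in `ptPerVariation`: the loop only talks to the shared interface for configuration, so adding a second tool to the analysis would not require a second, differently shaped systematics loop.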