Related Publications

Marco Foco, Max Rietmann, Vassil Vassilev, Michael Wong, et al., P2072R0: Differentiable programming for C++ (2020).

  • Derivatives are vital in a wide variety of computing applications, including numerical optimization, solution of nonlinear equations, sensitivity analysis, and nonlinear inverse problems. Virtually every process can be described with a mathematical function, which can be thought of as an association between elements of different sets. Derivatives track how one varying quantity depends on another, for example how the position of a planet changes as time varies. Derivatives and gradients allow us to explore the properties of a function and thus of the described process as a whole. Gradients are also an essential component of gradient-based optimization methods, which have become more and more important in recent years, in particular through their application to the training of (deep) neural networks. Derivatives can be computed numerically, but finite-difference methods are problematic because of the approximation involved and the finite precision of the floating-point values used: the implementation faces precision and round-off problems that can affect the overall accuracy of the computation. This problem becomes worse with higher-order derivatives.
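The round-off problem described in this abstract can be illustrated with a minimal sketch. It is not taken from the paper; the helper names `forward_difference` and `sin_derivative_error` are hypothetical and chosen only for illustration. The sketch estimates the derivative of sin(x) with a forward difference and measures the error against the analytic derivative cos(x).

```cpp
#include <cassert>
#include <cmath>

// Forward finite difference: (f(x + h) - f(x)) / h.
// Hypothetical helper, not part of any library mentioned in this list.
template <typename F>
double forward_difference(F f, double x, double h) {
    assert(h > 0.0 && "step size must be positive");
    return (f(x + h) - f(x)) / h;
}

// Absolute error of the forward-difference estimate of d/dx sin(x),
// measured against the analytic derivative cos(x).
inline double sin_derivative_error(double x, double h) {
    double estimate = forward_difference(
        [](double t) { return std::sin(t); }, x, h);
    return std::fabs(estimate - std::cos(x));
}
```

Calling `sin_derivative_error(1.0, 1e-8)` and `sin_derivative_error(1.0, 1e-15)` shows the trade-off in double precision: shrinking the step reduces the truncation error, but once the step approaches the floating-point resolution, cancellation in `f(x + h) - f(x)` dominates and the estimate gets worse rather than better.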

Javier López-Gómez, Javier Fernández, David del Rio Astorga, Vassil Vassilev, et al., Relaxing the one definition rule in interpreted C++ (2020).

  • Most implementations of the C++ programming language generate binary executable code. However, interpreted execution of C++ sources has its own use cases, as the Cling interpreter from CERN’s ROOT project has shown. Some limitations derive from the ODR (One Definition Rule), which rules out multiple definitions of entities within a single translation unit (TU). The ODR is there to ensure a uniform view of a given C++ entity across translation units, which helps when producing ABI-compatible binaries. Interpreting C++ presumes a single ever-growing translation unit, which defines away some of the ODR use cases. Therefore, it may well be desirable to relax the ODR and, consequently, to support the ability of developers to override any existing definition for a given declaration. This approach is especially well suited for iterative prototyping. In this paper, we extend Cling, a Clang/LLVM …

Martin Vassilev, Vassil Vassilev, Alexander Penev, IDD – A Platform Enabling Differential Debugging Cybernetics and Information Technologies 20 53-67 (2020).

  • Debugging is a very time-consuming task that is not well supported by existing tools. Existing methods do not provide tools that enable optimal developer productivity when debugging regressions in complex systems. In this paper we describe a possible solution that aids differential debugging. The differential debugging technique analyzes the regressed system and identifies the cause of the unexpected behavior by comparing it to a previous version of the same system. The prototype, idd, inspects two versions of the executable: a baseline and a regressed version. The interactive debugging session runs both executables side by side and allows various internal states to be examined and compared. The architecture can work with multiple information sources, comparing data from different tools. We also show how idd can detect performance regressions using information from third-party performance facilities. We illustrate how, in practice, regressions can be discovered quickly in large systems such as the clang compiler.

Yuka Takahashi, Oksana Shadura, Vassil Vassilev, Migrating large codebases to C++ Modules Journal of Physics: Conference Series 1525 012051 (2020).

  • ROOT has several features which interact with libraries and require implicit header inclusion. This can be triggered by reading or writing data on disk, or by user actions at the prompt. Often, the headers are immutable, and reparsing is redundant. C++ Modules are designed to minimize the reparsing of the same header content by providing an efficient on-disk representation of C++ code. ROOT has released a C++ Modules-aware technology preview, which is intended to become the default for the ROOT 6.20 release.

Vassil Vassilev, David Lange, Malik Shahzad Muzzafar, Mircho Rodozov, et al., C++ Modules in ROOT and Beyond arXiv preprint arXiv:2004.06507 (2020).

  • C++ Modules come in C++20 to fix the long-standing build scalability problems in the language. They provide an I/O-efficient, on-disk representation capable of reducing build times and peak memory usage. ROOT employs the C++ Modules technology further in the ROOT dictionary system to improve its performance and reduce the memory footprint. ROOT with C++ Modules was released as a technology preview in fall 2018, after intensive development during the last few years. The current state is ready for production; however, there is still room for performance optimizations. In this talk, we show the roadmap for making the technology the default in ROOT. We demonstrate a global module indexing optimization which reduces the memory footprint dramatically for many workflows. We will report user feedback on the migration to ROOT with C++ Modules.

Kim Albertsson Brann, Sitong An, Vassil Vassilev, Enric Tejedor Saavedra, et al., Software Challenges For HL-LHC Data Analysis arXiv preprint (2020).

  • The high energy physics community is discussing where investment is needed to prepare software for the HL-LHC and its unprecedented challenges. The ROOT project has been one of the central software players in high energy physics for decades. From its experience and expectations, the ROOT team has distilled a comprehensive set of areas that should see research and development in the context of data analysis software, to make the best use of HL-LHC’s physics potential. This work shows what these areas could be, why the ROOT team believes investing in them is needed, which gains are expected, and where related work is ongoing. It can serve as an indication for future research proposals and cooperations.

Vassil Vassilev, Aleksandr Efremov, Oksana Shadura, Automatic Differentiation in ROOT arXiv preprint arXiv:2004.04435 (2020).

  • In mathematics and computer algebra, automatic differentiation (AD) is a set of techniques to evaluate the derivative of a function specified by a computer program. AD exploits the fact that every computer program, no matter how complicated, executes a sequence of elementary arithmetic operations (addition, subtraction, multiplication, division, etc.), elementary functions (exp, log, sin, cos, etc.) and control flow statements. AD takes source code of a function as input and produces source code of the derived function. By applying the chain rule repeatedly to these operations, derivatives of arbitrary order can be computed automatically, accurately to working precision, and using at most a small constant factor more arithmetic operations than the original program. This paper presents AD techniques available in ROOT, supported by Cling, to produce derivatives of arbitrary C/C++ functions through implementing source code transformation and employing the chain rule of differential calculus in both forward mode and reverse mode. We explain its current integration for gradient computation in TFormula. We demonstrate the correctness and performance improvements in ROOT’s fitting algorithms.
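The repeated application of the chain rule that this abstract describes can be sketched in forward mode with dual numbers. Note the swap in technique: the paper describes source code transformation, whereas this sketch uses operator overloading, which is easier to show in a few lines. The `Dual` type, the sample function `f`, and `derivative_of_f` are hypothetical names introduced for illustration and are not part of ROOT, Cling, or Clad.

```cpp
#include <cassert>
#include <cmath>

// A dual number carries a value together with its derivative
// with respect to the chosen input variable.
struct Dual {
    double value;
    double deriv;
};

// Sum rule: (a + b)' = a' + b'.
inline Dual operator+(Dual a, Dual b) {
    return {a.value + b.value, a.deriv + b.deriv};
}

// Product rule: (a * b)' = a' * b + a * b'.
inline Dual operator*(Dual a, Dual b) {
    return {a.value * b.value, a.deriv * b.value + a.value * b.deriv};
}

// Chain rule for an elementary function: sin(a)' = cos(a) * a'.
inline Dual sin(Dual a) {
    return {std::sin(a.value), std::cos(a.value) * a.deriv};
}

// Sample function f(x) = x * x * sin(x), written once and usable
// with both double and Dual arguments.
template <typename T>
T f(T x) {
    using std::sin;  // picks std::sin for double, sin(Dual) via ADL
    return x * x * sin(x);
}

// Seed the input with derivative 1 and read off df/dx.
inline double derivative_of_f(double x) {
    assert(std::isfinite(x) && "input must be finite");
    Dual result = f(Dual{x, 1.0});
    return result.deriv;
}
```

Under these assumptions, `derivative_of_f(x)` agrees with the analytic derivative 2x sin(x) + x² cos(x) up to floating-point rounding, with no step size to tune. A source-transformation tool would instead generate a separate derivative function from the code of `f`.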

Oksana Shadura, Brian Paul Bockelman, Vassil Vassilev, Extending ROOT through Modules EPJ Web of Conferences 214 05011 (2019).

  • The ROOT software framework is foundational for the HEP ecosystem, providing multiple capabilities such as I/O, a C++ interpreter, GUI, and math libraries. It uses object-oriented concepts and build-time components to layer between them. We believe that a new layering formalism will benefit the ROOT user community. We present the modularization strategy for ROOT, which aims to build upon the existing source components, make the dependencies and other metadata available outside of the build system, and allow post-install additions on top of an existing installation as well as in the ROOT runtime environment. Components can be grouped into packages and made available from repositories in order to provide a post-install step for missing packages. This feature implements a mechanism for a more comprehensive software ecosystem and makes it available even from a minimal ROOT installation. As part of …

Oksana Shadura, Vassil Vassilev, Brian Paul Bockelman, Continuous Performance Benchmarking Framework for ROOT EPJ Web of Conferences 214 05003 (2019).

  • Foundational software libraries such as ROOT are under intense pressure to avoid software regressions, including performance regressions. Continuous performance benchmarking, as a part of continuous integration and other code quality testing, is an industry best practice for understanding how the performance of a software product evolves. We present a framework, built from industry best practices and tools, to help understand ROOT code performance and monitor the efficiency of the code on several processor architectures. It additionally allows historical performance measurements for the ROOT I/O, vectorization and parallelization subsystems.

Y Takahashi, V Vassilev, O Shadura, R Isemann, Optimizing Frameworks’ Performance Using C++ Modules Aware ROOT EPJ Web of Conferences 214 02011 (2019).

  • ROOT is a data analysis framework broadly used in and outside of High Energy Physics (HEP). Since HEP software frameworks always strive for performance improvements, ROOT was extended with experimental support for runtime C++ Modules. C++ Modules are designed to improve the performance of C++ code parsing. They offer a promising way to improve ROOT’s runtime performance by saving the time spent parsing C++ headers at runtime. This paper presents the results and challenges of integrating C++ Modules into ROOT.

V Vassilev, Optimizing ROOT’s Performance Using C++ Modules Journal of Physics: Conference Series 898 1742-6596 (2016).

  • ROOT comes with a C++-compliant interpreter, Cling. Cling needs to understand the content of the libraries in order to interact with them. Exposing the full shared library descriptors to the interpreter at runtime translates into an increased memory footprint. ROOT’s exploratory programming concepts allow implicit and explicit runtime shared library loading, which requires the interpreter to load the library descriptor. Re-parsing of the descriptors’ content has a noticeable effect on runtime performance. The present state-of-the-art lazy parsing technique brings the runtime performance to reasonable levels but proves to be fragile and can introduce correctness issues. An elegant solution is to load information from the descriptor lazily and in a non-recursive way. The LLVM community advances its C++ Modules technology, providing an I/O-efficient, on-disk representation capable of reducing build times and peak memory usage. The feature …

V Vassilev, Native Language Integrated Queries with CppLINQ in C++ Journal of Physics: Conference Series 608 012030 (2015).

  • Programming language evolution brought us domain-specific languages (DSLs). They have proved to be very useful for expressing specific concepts, becoming a vital ingredient even of general-purpose frameworks. Declarative DSLs (such as SQL) can be supported in imperative languages (such as C++) in the manner of language integrated query (LINQ).

V Vassilev, M Vassilev, A Penev, L Moneta, et al., Clad - automatic differentiation using Clang and LLVM Journal of Physics: Conference Series 608 012055 (2015).

  • Differentiation is ubiquitous in high energy physics, for instance in minimization algorithms and statistical analysis, in detector alignment and calibration, and in theory. Automatic differentiation (AD) avoids the well-known limitations in round-off and speed from which symbolic and numerical differentiation suffer, by transforming the source code of functions. We will present how AD can be used to compute the gradient of multivariate functions and functor objects. We will explain approaches to implementing an AD tool. We will show how LLVM, Clang and Cling (ROOT’s C++11 interpreter) simplify the creation of such a tool. We describe how the tool could be integrated within any framework. We will demonstrate a simple proof-of-concept prototype, called Clad, which is able to generate n-th order derivatives of C++ functions and other language constructs. We also demonstrate how Clad can offload laborious computations …

Violeta Ilieva, Vassil Vassilev, clad - Automatic Differentiation with Clang (2013).

  • Differentiation is the process of finding a derivative, which measures how a function changes as its input changes. Derivatives can be evaluated with machine precision accuracy through a method called automatic differentiation. Unlike other methods for differentiation, including numerical and symbolic, automatic differentiation yields exact derivatives even of complicated functions at relatively low processing and storage costs.

V Vassilev, Ph Canal, A Naumann, P Russo, Cling–the new interactive interpreter for ROOT 6 (2012).

  • Cling is an interactive C++ interpreter, built on top of the Clang and LLVM compiler infrastructure. Like its predecessor CINT, Cling realizes the read-evaluate-print loop (REPL) concept in order to support rapid application development. Implemented as a small extension to LLVM and Clang, the interpreter reuses their strengths, such as the praised concise and expressive compiler diagnostics. We show how to match the interpreter concept to the compiler library and generalize a common set of requirements for building an interactive interpreter. We justify the design and implementation decisions as a solution to the challenge of implementing interpreter behaviour as an extension of the compiler library. We present the new features, e.g., how C++11 will come to Cling and how CINT-specific extensions are being adopted. We clarify the state of integration in the ROOT framework and the induced change set. We explain how ROOT …