The project meetings are a place to learn about the ongoing activities of our group, which explores the intersection of compiler research and data science. The meetings are virtual and typically take place on the first Thursday of the month. While our current emphasis is on automatic differentiation, interpretative C/C++/CUDA, and C++ language interoperability with Python, we are equally dedicated to promoting the research and applications of programming language infrastructure in various domains. The meetings also often feature invited speakers, who contribute diverse perspectives and foster a collaborative environment for the exchange of knowledge and ideas.
If you are interested in our work, you can join our compiler-research-announce Google Groups forum or follow us on LinkedIn.
CaaS Monthly Meeting – 12 December 2024 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
2024 was a busy year for our compiler research team. We have made substantial progress in automatic differentiation with Clad. We enabled large physics workflows and observed substantial performance gains in minimization of likelihoods. We also progressed in advancing foundational tools such as ClangRepl, BioDynaMo, Jupyter, LLVM and ROOT.
This talk provides a brief overview of what our team has done in 2024.
CaaS Monthly Meeting – 07 November 2024 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
The Cling C++ interpreter has transformed language bindings by enabling incremental compilation at runtime. This allows Python to interact with C++ on demand and lazily construct bindings between the two. The emergence of Clang-REPL as a potential alternative to Cling within the LLVM compiler framework highlights the need for a unified framework for interactive C++ technologies. This talk presents CppInterOp, a C++ interoperability library that leverages Cling and LLVM’s Clang-REPL to provide a minimalist and backward-compatible API for language interoperability. It provides downstream interactive C++ tools with the compiler as a service by embedding Clang and LLVM as libraries in their codebases. By enabling dynamic Python interactions with static C++ codebases, CppInterOp enhances computational efficiency and rapid development in high-energy physics. The library offers primitives that enable cppyy (PyROOT), an automatic, run-time Python-C++ bindings generator. We showcase how CppInterOp optimizes cross-language execution and computational tasks in high-energy physics, making it a valuable tool for researchers and developers.
CaaS Monthly Meeting – 05 September 2024 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
CaaS Monthly Meeting – 11 July 2024 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
RooFit is a widely used data analysis library in high-energy physics experiments at CERN, where it is used to minimize likelihoods with thousands of free parameters spread over hundreds of likelihood components. To iteratively find the minimum of a likelihood, one generally has to know the gradient of the likelihood with respect to all free parameters. Previously, these gradients were calculated using numerical differentiation, which requires varying one parameter at a time and reevaluating the full likelihood. Although complete reevaluation can be prevented by a caching mechanism, it incurs a significant overhead in bookkeeping. This talk discusses the recent integration of automatic differentiation (AD) in RooFit for these gradient computations using Clad, a compile-time AD library for C++ codebases. We also present the efficiency improvements with the recently released ATLAS and CMS Higgs models.
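To make the cost argument concrete, here is a minimal Python sketch of forward-difference numerical differentiation. It is not RooFit or Clad code: `toy_nll` is a hypothetical stand-in for a likelihood, and `toy_nll_gradient` is a hand-written exact gradient that only plays the role an AD-generated gradient would play. The point is that the numerical gradient needs one extra full evaluation per free parameter.

```python
from typing import Callable, List


class CountingFunction:
    """Wraps a function and counts how often it is evaluated."""

    def __init__(self, f: Callable[[List[float]], float]) -> None:
        self.f = f
        self.calls = 0

    def __call__(self, params: List[float]) -> float:
        self.calls += 1
        return self.f(params)


def numerical_gradient(f: Callable[[List[float]], float],
                       params: List[float],
                       h: float = 1e-6) -> List[float]:
    """Forward differences: vary one parameter at a time, reevaluate f."""
    base = f(params)
    grad = []
    for i in range(len(params)):
        shifted = list(params)
        shifted[i] += h
        grad.append((f(shifted) - base) / h)
    return grad


def toy_nll(params: List[float]) -> float:
    # Hypothetical stand-in for a likelihood: a sum of squares.
    return sum(p * p for p in params)


def toy_nll_gradient(params: List[float]) -> List[float]:
    # Hand-written exact gradient, standing in for AD-generated code.
    return [2.0 * p for p in params]


params = [0.5, -1.0, 2.0, 3.0]
counted = CountingFunction(toy_nll)
approx = numerical_gradient(counted, params)
exact = toy_nll_gradient(params)

print("full evaluations for numerical gradient:", counted.calls)
print("numerical:", approx)
print("exact:    ", exact)
```

With n free parameters the forward-difference gradient costs n + 1 full evaluations, which is what becomes expensive for likelihoods with thousands of parameters.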
CaaS Monthly Meeting – 06 June 2024 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
CaaS Monthly Meeting – 04 April 2024 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
CaaS Monthly Meeting – 07 March 2024 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
Debugging is a very time-consuming task that is not well supported by existing tools. Existing methods do not give developers optimal productivity when debugging regressions in complex systems. This presentation describes a possible solution that aids differential debugging. The differential debugging technique analyzes the regressed system and identifies the cause of the unexpected behavior by comparing it to a previous version of the same system. The prototype, idd, inspects two versions of the executable: a baseline and a regressed version. The interactive debugging session runs both executables side by side and allows the user to examine and compare various internal states. The architecture can work with multiple information sources, comparing data from different tools. The talk will also show how idd can detect performance regressions using information from third-party performance facilities.
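The core idea of comparing internal states of a baseline and a regressed version can be sketched in a few lines of Python. This is only an in-process illustration and not how idd is implemented: the two functions, the recorded traces, and the helper `first_divergence` are all hypothetical names introduced for this example.

```python
from typing import Dict, List, Optional, Tuple

State = Dict[str, float]


def baseline_mean(values: List[float], trace: List[State]) -> float:
    """Baseline version: records its internal state after every step."""
    total = 0.0
    for v in values:
        total += v
        trace.append({"total": total})
    mean = total / len(values)
    trace.append({"mean": mean})
    return mean


def regressed_mean(values: List[float], trace: List[State]) -> float:
    """Regressed version: a deliberate bug truncates each value."""
    total = 0.0
    for v in values:
        total += int(v)  # bug: drops the fractional part
        trace.append({"total": total})
    mean = total / len(values)
    trace.append({"mean": mean})
    return mean


def first_divergence(a: List[State],
                     b: List[State]) -> Optional[Tuple[int, State, State]]:
    """Return the first step at which the two traces differ, if any."""
    for i, (sa, sb) in enumerate(zip(a, b)):
        if sa != sb:
            return i, sa, sb
    if len(a) != len(b):
        # One run recorded more steps than the other.
        return min(len(a), len(b)), {}, {}
    return None


values = [1.0, 2.0, 2.5, 4.0]
base_trace: List[State] = []
regr_trace: List[State] = []
baseline_mean(values, base_trace)
regressed_mean(values, regr_trace)

divergence = first_divergence(base_trace, regr_trace)
print(divergence)
```

The first two steps agree because the inputs have no fractional part; the comparison points at the third step, where the truncation first changes the running total.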
CaaS Monthly Meeting – 01 February 2024 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
Single-core performance has long been saturated, and it is critical to exploit the peak performance of heterogeneous accelerators and specialized instructions in many applications. Because compilers are difficult to extend to support diverse and rapidly evolving hardware targets, and automatic optimization is often insufficient to guarantee state-of-the-art performance, high-performance libraries are commonly still coded and optimized by hand, at great expense, in low-level C and assembly. User-schedulable languages, or USLs for short, have been proposed to address the challenge by decoupling algorithms and scheduling. In this talk, I will focus on one such USL, Exo, based on the principle of exocompilation: externalizing target-specific code generation support and optimization policies to user-level code. Exo allows custom hardware instructions, specialized memories, and accelerator configuration states to be defined in user libraries. I will also talk about other projects that borrow the idea from USLs and lessons we learned from the industry adoption of Exo.
CaaS Monthly Meeting – 11 January 2024 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
This talk will cover notable OSS achievements, especially those that are in contrast to big tech, striving to protect the consumers and their data.
We’ll start with GrapheneOS, a hardened fork of the Android Open Source Project (AOSP). Notably, GrapheneOS makes upstream contributions to AOSP and other upstream projects, and many of its features were implemented in AOSP, Linux, LLVM, and other projects that GrapheneOS is based on.
GrapheneOS substantially expands the standard mitigations for memory corruption vulnerabilities, and some of these features are designed to directly catch the memory corruption bugs either via an explicit check or memory protection and abort the program to prevent them from being exploited.
Slides
CaaS Monthly Meeting – 07 December 2023 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
We will describe some of the past, ongoing and future activities of the compiler-research group.
Slides
Monte-Carlo simulators like Geant4 are heavily used in High Energy Physics to simulate the interaction between particles and the detector matter. Integrating algorithmic differentiation (AD) into such codes, e.g. in order to obtain gradients for design optimization, comes with a couple of technical and mathematical challenges.
On the technical side, most AD tools are not fully automatic for the entire language standard, but require certain manual interventions or exclude certain language constructs. The size and complexity of Geant4, with about one million lines of C++ code, can turn seemingly minor restrictions into a massive amount of effort required before being able to obtain algorithmic derivatives. In the first part of this talk, we will give a brief description of our novel AD tool Derivgrind, which brings the development effort close to the possible minimum because it operates on the machine code of the compiled primal program. We will then demonstrate how it allows us to compute correct algorithmic derivatives in a very artificial setup using Geant4 code.
On the mathematical side, stochasticity and the frequent use of non-differentiable operations like comparisons constitute severe challenges. While AD allows us to obtain floating-point-accurate derivatives of the precise sequence of real-arithmetic operations conducted by the primal program, they may come with very large variances, and their expected values may disagree with the derivative of the function represented by the Monte-Carlo simulation in the limit where the number of samples goes to infinity. These challenges need to be assessed and attacked based on a mathematical understanding of the computations performed by the Monte-Carlo simulation. In the second part of this talk, we will report on recent progress in this area.
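The mismatch between the derivative of the executed operations and the derivative of the expected value can be seen in a toy example. The following Python sketch is not Derivgrind or Geant4 code; `sample` and `estimate` are hypothetical helpers. Each sample is 1 if a uniform random number falls below `theta` and 0 otherwise, so the expected value is `theta` and its true derivative is 1. Differentiating the operations actually executed, however, gives 0 for every sample, because both branches of the comparison return constants.

```python
import random
from typing import Tuple


def sample(theta: float, u: float) -> Tuple[float, float]:
    """Return one sample and its pathwise derivative w.r.t. theta.

    The comparison selects between two constants, so the derivative of
    the executed arithmetic is 0.0 on either branch.
    """
    if u < theta:
        return 1.0, 0.0
    return 0.0, 0.0


def estimate(theta: float, n: int, seed: int = 0) -> Tuple[float, float]:
    """Average the sample values and pathwise derivatives over n draws."""
    rng = random.Random(seed)
    total_value = 0.0
    total_derivative = 0.0
    for _ in range(n):
        value, derivative = sample(theta, rng.random())
        total_value += value
        total_derivative += derivative
    return total_value / n, total_derivative / n


theta = 0.3
mean_value, mean_pathwise_derivative = estimate(theta, 100_000)

# For U ~ Uniform(0, 1), E[1(U < theta)] = theta, so d/dtheta = 1.
true_derivative = 1.0

print("estimated expectation:      ", mean_value)
print("mean pathwise derivative:   ", mean_pathwise_derivative)
print("derivative of expectation:  ", true_derivative)
```

The estimate of the expectation converges to `theta`, while the averaged pathwise derivative stays at zero regardless of the number of samples.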
CaaS Monthly Meeting – 20 September 2023 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
The C++ programming language is used for many numerically intensive scientific applications. A combination of performance and solid backward compatibility has led to its use for many research software codes over the past 20 years. Despite its power, C++ is often seen as difficult to learn and inconsistent with rapid application development. Exploration and prototyping are slowed down by the long edit-compile-run cycles during development.
In this talk we show how to leverage our experience in interactive C++, just-in-time (JIT) compilation technology, dynamic optimizations, and large-scale software development to greatly reduce the impedance mismatch between C++ and Python. We show how clang-repl generalizes Cling in LLVM upstream to offer a robust, sustainable and omnidisciplinary solution for C++ language interoperability. We demonstrate how we have:
The presentation includes an interactive session where we demonstrate some of the capabilities of our system via the Jupyter interactive environment.
Slides Video
CaaS Monthly Meeting – 06 July 2023 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
CaaS Monthly Meeting – 01 June 2023 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
CaaS Monthly Meeting – 04 May 2023 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
CaaS Monthly Meeting – 06 April 2023 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
The Numba project engineers are undertaking research into some new technology in 2023; this talk presents two of the larger areas of this research. One part of this research aims to alter the typical view of the compilation space from a choice between JIT and AOT to a user-defined continuum of options. A new AOT output format, PIXIE, is being developed; it helps to provide higher-performance runtime execution and also allows for cross-library/whole-program optimisation, along with AOT re-specialisation. This talk will present the Numba 2023 target MVP, which will be able to combine AOT-compiled C/C++ libraries from Clang-based toolchains with JIT-compiled Python code to present a continuum of compilation options to users based on their desired compilation and runtime performance characteristics.
Another part of the research being undertaken is into Regionalized Value State Dependence Graphs (RVSDGs). Their initial use will be to isolate the Numba compiler frontend from the ever-changing semantics of Python bytecode (Numba uses this as its input source; it is a CPython implementation detail). This is done by transforming the CFGs derived from the bytecode into a structured form, that of the RVSDG. This talk will note some of the numerous research opportunities anticipated as a result of using this form, particularly with respect to IR design and the expected ease with which new compiler passes can be written.
Slides
CaaS Monthly Meeting – 02 March 2023 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
CaaS Monthly Meeting – 02 February 2023 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
Efficient multiline input support in clang-repl has been worked on over the last months. We formulated an RFC in the LLVM community to extend the Clang lexer to receive incremental input, and implemented it taking the received feedback into account. The Clang lexer is now capable of lexing a buffer that grows line by line while handling all complicated preprocessor expressions correctly. Based on this groundwork, efficient incremental input support was added to clang-repl.
I will demonstrate these new additions to clang-repl with demos, talk about how they work under the hood, and talk about the future direction. I will also briefly show demos of enhanced Windows support in clang-repl featuring native static library and SEH exception handling.
Video
CaaS Monthly Meeting – 12 January 2023 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
Abstract: RISC-V is a new computer architecture. It currently receives a lot of attention from open source developers because the ISA itself is royalty free and vendors can create hardware without paying licensing fees. As with any new architecture, the software ecosystem has to catch up once actual hardware is available. In this talk, I will present the current status of just-in-time compilation support for RISC-V in LLVM, how I made clang-repl work on RISC-V to interpret C++ code, and how I eventually brought up the whole of Cling and ROOT to run a first physics analysis on this new architecture.
Slides Video
CaaS Monthly Meeting – 03 November 2022 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
JITLink is a new JIT linker in LLVM developed to eliminate limitations in LLVM’s JIT implementation. With JITLink, it is not required to use special compilation flags or workarounds to load code into the JIT, since most of the object file features, including small code model and thread local storage, are fully implemented. This tutorial will explain how to use JITLink by working on a Windows JIT application that just-in-time links to third-party static libraries. The tutorial will also dig into the internals of JITLink by working on a JITLink plugin managing SEH exception tables.
Slides Video
CaaS Monthly Meeting – 06 October 2022 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
Following years of Lisp development, Clojure has entered the scene as a practical Lisp for the JVM. It takes a functional-first approach to designs and has extensive support for purely transforming persistent, immutable data structures. The interactive programming experience Clojure brings to the JVM is unmatched, built on both a JIT and technology such as nREPL. Clojure developers use this interactive programming to build their software from empty source files to working servers, GUI applications, and more without restarting the process.
In the spirit of Clojure, I would like to present jank, a research programming language which is a Clojure dialect built on native C++, rather than the JVM, using Cling as its JIT compiler. jank aims to be strongly compatible with Clojure while offering the same interactive programming experience. Like Clojure, jank aims to provide seamless interop with its host (meaning C++), allowing for easy consumption of third-party libraries. On top of a native runtime, jank also aims to provide gradual, structural typing, which will not be covered in this presentation.
Slides Video
CaaS Monthly Meeting – 01 September 2022 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
CaaS Monthly Meeting – 04 August 2022 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
CaaS Monthly Meeting – 07 July 2022 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
CaaS Monthly Meeting – 09 June 2022 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
The artistic live coding community has been growing steadily since around the year 2000. The Temporary Organisation for the Permanence of Live Art Programming (TOPLAP) has been around since 2004, Algorave (algorithmic rave parties) recently celebrated its tenth birthday, and six editions of the International Conference on Live Coding (ICLC) have been held. A great many live coding systems have been developed during this time, many of them exhibiting exotic and culturally specific features that professional software developers are mostly unaware of. This talk will introduce the artistic live coding context and community, and then describe attempts by this community to appropriate the Cling C++ interpreter for artistic practice. In one such example, Cling has been used as the basis for a C++ based live coding synthesiser [1]. In another example, Cling has been installed on a BeagleBoard to bring live coding to the Bela interactive audio platform [2]. Following these examples, I will offer some reflections on the potential mutual benefits for increased engagement between the Cling community and the artistic live coding community.
[1] tiny spectral synthesizer with livecoding support
[2] Using the Cling C++ Interpreter on the Bela Platform
CaaS Monthly Meeting – 05 May 2022 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
CaaS Monthly Meeting – 07 April 2022 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
CaaS Monthly Meeting – 10 March 2022 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
In 2021 a new remote-JIT layer “RemoteEPC” landed in LLVM’s OrcJIT library [0]. It separates serialization from transport, streamlines error behavior and integrates well with ExecutorProcessControl. Combined with the clean design and extensibility of ORCv2 and JITLink, it lowers the bar for building exotic out-of-process LLVM bitcode JITs.
The ez-clang project [1] makes use of all these to build a pure out-of-process REPL specifically designed for very low-resource embedded devices. The executor endpoint on the device is very simple and fits into a few Kilobytes of memory. All heavy lifting happens in the JIT process on the host.
I want to present my proof-of-concept implementation, which is based on a hacked-up version of cling [2] (bringing in Clang-integration, a command line and the concept of transaction-based incremental compilation). Right now, it supports a small selection of Cortex-M development boards. The smallest is the TeensyLC [3] with an ARMv6-M instruction set, 62kb Flash memory and 8kb RAM.
[0] https://github.com/llvm/llvm-project/commit/bb27e4564355
[1] https://echtzeit.dev/ez-clang
[2] https://github.com/root-project/cling
[3] https://www.pjrc.com/teensy/teensyLC.html
CaaS Monthly Meeting – 03 February 2022 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
CaaS Monthly Meeting – 13 January 2022 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
CaaS Monthly Meeting – 09 December 2021 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
CaaS Monthly Meeting – 11 November 2021 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
CaaS Monthly Meeting – 07 October 2021 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
The C++ programming language is used for many numerically intensive scientific applications. A combination of performance and solid backward compatibility has led to its use for many research software codes over the past 20 years. Despite its power, C++ is often seen as difficult to learn and inconsistent with rapid application development. Exploration and prototyping are slowed down by the long edit-compile-run cycles during development. Exploratory programming is an effective way to gain a deeper understanding of a project’s requirements, reduce the complexity of a problem, and provide an early validation of the system’s design and implementation. This is amongst the strengths of Python and a major design goal of new languages such as Julia, D and Swift.
Two of the most widely used languages by researchers are C++ and Python. Python has grown steadily as a language of choice for data science and application control. The interactive nature of Python and its many available libraries make it an excellent choice for scripting tasks and code prototyping. However, native computational performance of Python is mediocre. Python includes functionality for replacing the most critical components of a processing kernel with implementations in C. This functionality is insufficient to fully cover many scientific use cases because crossing the language boundary is expensive due to limitations in current tools.
This talk describes key aspects of language interoperability for C++ using an automated binding approach. The primary initial focus is to support automatic binding to and from Python. Furthermore, the approach is generic enough to fit other languages such as D and Julia.
Slides
CaaS Monthly Meeting – 02 September 2021 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
cppyy provides automatic Python bindings to C++ code, at runtime, through Cling, the C++ interpreter. Python is itself a dynamic language executed by an interpreter, thus the interaction with C++ code becomes more natural when intermediated by Cling. Examples include runtime template instantiations, callbacks, cross-language inheritance, automatic downcasting, and exception mapping. Many advanced C++ features such as placement new, multiple virtual inheritance, variadic templates, etc., are also naturally handled.
cppyy achieves high performance through an all-lazy approach and specialization of common cases through runtime reflection. As such, it has a much lower call overhead than other binders, notably in its implementation for PyPy, a fully compatible Python interpreter sporting a tracing JIT. Furthermore, cppyy makes maintaining a large software stack simpler: except for cppyy’s own Python-interpreter binding, it does not have any compiled code that is Python-dependent. That is, cppyy-based extension modules require no recompilation when switching Python interpreters.
In this presentation I’ll show the benefits of runtime Python-C++ bindings and give a bird’s eye overview of the implementation underpinning cppyy.
Slides Video
CaaS Monthly Meeting – 05 August 2021 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
Cxx.jl is one of the oldest Julia packages and provides extremely tight integration between Julia and C++. With advanced features including a C++ REPL environment and the ability to perform cross-language template instantiation as well as cross-language object implementation, it is a powerful tool for developers seeking to integrate the two languages. However, despite these capabilities, the package never fully became mainstream due to a number of technical limitations. In this talk, I will explore the design and feature set of Cxx.jl, and what limitations in Clang and Julia prevented its full adoption in the Julia world.
Slides Video
Julia can directly call C++ via the Cxx.jl package. This enables us to use basically all of ROOT’s capabilities directly from Julia, including reading/writing TFiles and using ROOT’s GUI features. I’ll present some past experiences and practical examples of Julia/ROOT interaction.
Slides
CaaS Monthly Meeting – 01 July 2021 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
LLVM’s On Request Compilation (ORC) APIs provide a foundation for building in-memory and just-in-time compilers by re-using existing static compilers. This re-use is enabled by appending a new linking step to the standard compiler pipeline to patch the compiler output into the executing process. The ORC APIs support concurrent compilation, lazy compilation, and linking of code across process boundaries and even across architectures. The ORC APIs are already used in several open source projects including PostgreSQL, Cling, Julia, and the Swift interpreter; and they remain under active development with new features added frequently. This talk will provide an overview of the ORC APIs and recent developments, as well as demos and pointers to example code.
Slides
CaaS Monthly Meeting – 03 June 2021 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
CaaS Monthly Meeting – 06 May 2021 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
CaaS Monthly Meeting – 25 March 2021 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
Sylvain Corlay from QuantStack talks about C++ in Jupyter Notebooks using Xeus-Cling. Xeus-Cling is a Cling-based notebook kernel which delivers interactive C++. Sylvain takes a deep dive into the topic, outlining some of the specific challenges and requirements.
Slides Video
CaaS Monthly Meeting – 04 March 2021 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
Simeon Ehrig from Helmholtz-Zentrum Dresden-Rossendorf (HZDR) shared his work with us recently. In his talk he gives insights about interactive CUDA using the C++ interpreter Cling. He shows several exciting examples in the area of dynamic execution without loss of state where we can “checkpoint” the execution state, add specific data analysis and reuse the previous computations.
Slides Video
CaaS Monthly Meeting – 04 February 2021 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
Alexandru Militaru shared his work with us recently. In his talk he gives insights about C++/D interoperability on the fly using the interactive C++ interpreter Cling and cppyy.
Slides Video
CaaS Monthly Meeting – 14 January 2021 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
CaaS Monthly Meeting – 03 December 2020 at 17:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
CaaS Monthly Meeting – 22 October 2020 at 18:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda:
CaaS Monthly Meeting – 17 September 2020 at 18:00 Geneva (CH) Time
Connection information: Link to zoom
Agenda: