Open Projects

Agent-Based Simulation of CAR-T Cell Therapy Using BioDynaMo

Chimeric Antigen Receptor T-cell (CAR-T) therapy has revolutionized cancer treatment by harnessing the immune system to target and destroy tumor cells. While CAR-T has demonstrated success in blood cancers, its effectiveness in solid tumors remains limited due to challenges such as poor tumor infiltration, immune suppression, and T-cell exhaustion. To improve therapy outcomes, computational modeling is essential for optimizing treatment parameters, predicting failures, and testing novel interventions. However, existing models of CAR-T behavior are often overly simplistic or computationally expensive, making them impractical for large-scale simulations.

This project aims to develop a scalable agent-based simulation of CAR-T therapy using BioDynaMo, an open-source high-performance biological simulation platform. By modeling T-cell migration, tumor engagement, and microenvironmental factors, we will investigate key treatment variables such as dosage, administration timing, and combination therapies. The simulation will allow researchers to explore how tumor microenvironment suppression (e.g., regulatory T-cells, hypoxia, immunosuppressive cytokines) affects CAR-T efficacy and what strategies such as checkpoint inhibitors or cytokine support can improve outcomes.

The final deliverable will be a fully documented, reproducible BioDynaMo simulation, along with analysis tools for visualizing treatment dynamics. The model will provide insights into the optimal CAR-T cell dosing, tumor penetration efficiency, and factors influencing therapy resistance. This project will serve as a foundation for in silico testing of immunotherapies, reducing the need for costly and time-consuming laboratory experiments while accelerating the development of more effective cancer treatments.
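
As a flavor of the platform, here is a minimal sketch following BioDynaMo's hello-world simulation pattern: create agents, register them with the resource manager, and step the scheduler. The CAR-T specific migration and engagement behaviors are placeholders that this project would develop.

  #include "biodynamo.h"

  namespace bdm {

  // Minimal sketch; the tumor/T-cell agents here are plain Cells standing in
  // for the CAR-T behaviors (migration, engagement) to be developed.
  inline int Simulate(int argc, const char** argv) {
    Simulation simulation(argc, argv);
    auto* rm = simulation.GetResourceManager();

    // Seed one tumor cell and one T-cell agent (diameters in micrometers).
    auto* tumor_cell = new Cell(10);
    auto* t_cell = new Cell(7);
    rm->AddAgent(tumor_cell);
    rm->AddAgent(t_cell);

    // Run the simulation for 100 timesteps.
    simulation.GetScheduler()->Simulate(100);
    return 0;
  }

  }  // namespace bdm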

Task ideas and expected results

  • Expected plan of work:

      • Phase 1: Initial Setup & Simple T-cell Dynamics
      • Phase 2: Advanced CAR-T Cell Behavior & Tumor Interaction
      • Phase 3: Integration of Immunosuppressive Factors & Data Visualization

  • Expected deliverables:

      • A fully documented BioDynaMo simulation of CAR-T therapy.
      • Analysis scripts for visualizing tumor reduction and CAR-T efficacy.
      • Performance benchmarks comparing different treatment strategies.
      • A research-style report summarizing findings.

Enable GPU support and Python Interoperability via a Plugin System

Xeus-Cpp integrates Clang-Repl with the Xeus protocol via CppInterOp, providing a powerful platform for C++ development within Jupyter Notebooks.

This project aims to introduce a plugin system for magic commands (cell, line, etc.), enabling a more modular and maintainable approach to extend Xeus-Cpp. Traditionally, magic commands introduce additional code and dependencies directly into the Xeus-Cpp kernel, increasing its complexity and maintenance burden. By offloading this functionality to a dedicated plugin library, we can keep the core kernel minimal while ensuring extensibility. This approach allows new magic commands to be developed, packaged, and deployed independently—eliminating the need to rebuild and release Xeus-Cpp for each new addition.

Initial groundwork has already been laid with the Xplugin library, and this project will build upon that foundation. The goal is to clearly define magic command compatibility across different platforms while ensuring seamless integration. A key objective is to reimplement existing features, such as the LLM cell magic and the in-development Python magic, as plugins. This will not only improve modularity within Xeus-Cpp but also enable these features to be used in other Jupyter kernels.
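
As a rough sketch of the intended architecture, a cell magic shipped as a plugin might look like the code below. The base class follows the style of xeus-cling's magic interface; the registration macro is a hypothetical stand-in, since wiring into xplugin's actual registry machinery is part of the project.

  #include <iostream>
  #include <string>

  // Base interface in the style of xeus-cling's xmagic_cell; the exact type
  // and the registration macro below are illustrative, not the final API.
  struct xmagic_cell {
    virtual ~xmagic_cell() = default;
    virtual void operator()(const std::string& line, const std::string& cell) = 0;
  };

  // A %%echo cell magic packaged as a plugin, outside the xeus-cpp kernel.
  struct echo_magic : xmagic_cell {
    void operator()(const std::string& line, const std::string& cell) override {
      std::cout << "args: " << line << "\n" << cell << std::endl;
    }
  };

  // Hypothetical xplugin-style registration; the real library provides the
  // factory/registry machinery that the kernel discovers at runtime.
  // XPLUGIN_REGISTER(xmagic_cell, echo_magic, "echo")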

As an extended goal, we aim to develop a new plugin for GPU execution, leveraging CUDA or OpenMP to support high-performance computing workflows within Jupyter.

Task ideas and expected results

  • Move the currently implemented magics into plugins built on xplugin
  • Complete the ongoing work on the Python interoperability magic
  • Implement a test suite for the plugins
  • Extended: To be able to execute on GPU using CUDA or OpenMP
  • Optional: Extend the magics for the wasm use case (xeus-cpp-lite)
  • Present the work at the relevant meetings and conferences

Enhancing LLM Training with Clad for efficient differentiation

This project aims to leverage Clad, an automatic differentiation (AD) plugin for Clang, to optimize large language model (LLM) training primarily in C++. Automatic differentiation is a crucial component of deep learning training, enabling efficient computation of gradients for optimization algorithms such as stochastic gradient descent (SGD). While most modern LLM frameworks rely on Python-based ecosystems, their heavy reliance on interpreted code and dynamic computation graphs can introduce performance bottlenecks. By integrating Clad into C++-based deep learning pipelines, we can enable high-performance differentiation at the compiler level, reducing computational overhead and improving memory efficiency. This will allow developers to build more optimized training workflows without sacrificing flexibility or precision.

Beyond performance improvements, integrating Clad with LLM training in C++ opens new possibilities for deploying AI models in resource-constrained environments, such as embedded systems and HPC clusters, where minimizing memory footprint and maximizing computational efficiency are critical. Additionally, this work will bridge the gap between modern deep learning research and traditional scientific computing by providing a more robust and scalable AD solution for physics-informed machine learning models. By optimizing the differentiation process at the compiler level, this project has the potential to enhance both research and production-level AI applications, aligning with compiler-research.org’s broader goal of advancing computational techniques for scientific discovery.
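
For a flavor of compiler-level AD, here is a minimal sketch of Clad's reverse mode on a one-neuron squared-error loss, assuming the code is compiled with Clang and the Clad plugin; a real training loop would apply the same pattern to layer and loss functions.

  #include "clad/Differentiator/Differentiator.h"

  // Squared-error loss for a single linear neuron: (w*x + b - t)^2.
  double loss(double w, double b, double x, double t) {
    double y = w * x + b;
    return (y - t) * (y - t);
  }

  int main() {
    // Reverse-mode AD: Clad generates the gradient code at compile time,
    // differentiating only with respect to the parameters w and b.
    auto grad = clad::gradient(loss, "w,b");
    double dw = 0, db = 0;
    grad.execute(/*w=*/1.0, /*b=*/0.0, /*x=*/2.0, /*t=*/5.0, &dw, &db);
    // dw == 2*(w*x+b-t)*x == -12, db == 2*(w*x+b-t) == -6
  }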

Task ideas and expected results

  • Develop a simplified LLM setup in C++
  • Apply Clad to compute gradients for selected layers and loss functions
  • Enhance Clad to support it where necessary, and prepare performance benchmarks
  • Enhance the LLM complexity to cover larger projects such as llama
  • Iterate on bug fixes and benchmarking
  • Develop tests to ensure correctness, numerical stability, and efficiency
  • Document the approach, implementation details, and performance gains
  • Present progress and findings at relevant meetings and conferences

Integrate Clad in PyTorch and compare the gradient execution times

PyTorch is a popular machine learning framework that includes its own automatic differentiation engine, while Clad is a Clang plugin for automatic differentiation that performs source-to-source transformation to generate functions capable of computing derivatives at compile time.

This project aims to integrate Clad-generated functions into PyTorch using its C++ API and expose them to a Python workflow. The goal is to compare the execution times of gradients computed by Clad with those computed by PyTorch’s native autograd system. Special attention will be given to CUDA-enabled gradient computations, as PyTorch also offers GPU acceleration capabilities.
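
A minimal sketch of the intended wiring, assuming a PyTorch C++ extension built with Clang and the Clad plugin; the function and module names are illustrative, not part of the project's deliverables.

  #include <torch/extension.h>
  #include <vector>
  #include "clad/Differentiator/Differentiator.h"

  // Example scalar function whose Clad gradient we compare against autograd.
  double rosenbrock(double x, double y) {
    return (1 - x) * (1 - x) + 100 * (y - x * x) * (y - x * x);
  }

  // Expose the Clad-generated gradient to Python for benchmarking.
  std::vector<double> rosenbrock_grad(double x, double y) {
    auto grad = clad::gradient(rosenbrock);  // source-to-source, compile time
    double dx = 0, dy = 0;
    grad.execute(x, y, &dx, &dy);
    return {dx, dy};
  }

  PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
    m.def("rosenbrock_grad", &rosenbrock_grad,
          "Gradient computed by a Clad-generated function");
  }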

Task ideas and expected results

  • Incorporate Clad’s API components (such as clad::array and clad::tape) into PyTorch using its C++ API
  • Pass Clad-generated derivative functions to PyTorch and expose them to Python
  • Perform benchmarks comparing the execution times and performance of Clad-derived gradients versus PyTorch’s autograd
  • Automate the integration process
  • Thoroughly document the integration process and the benchmark results, and identify potential bottlenecks in Clad’s execution
  • Present the work at the relevant meetings and conferences.

Support usage of Thrust API in Clad

The rise of machine learning has shed light on the power of GPUs, and researchers are looking for ways to incorporate them into their projects as a lightweight parallelization method. Consequently, general-purpose GPU programming is becoming a very popular way to speed up execution time.

Clad is a clang plugin for automatic differentiation that performs source-to-source transformation and produces a function capable of computing the derivatives of a given function at compile time. This project aims to enhance Clad by adding support for Thrust, a parallel algorithms library designed for GPUs and other accelerators. By supporting Thrust, Clad will be able to differentiate algorithms that rely on Thrust’s parallel computing primitives, unlocking new possibilities for GPU-based machine learning, scientific computing, and numerical optimization.
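
To make the shape of the work concrete, here is a hand-written sketch of what a pullback for a sum via thrust::reduce could look like. The namespace and naming conventions follow Clad's custom-derivatives mechanism, but the exact signatures for Thrust support are precisely what the project would research.

  #include <thrust/device_vector.h>
  #include <thrust/functional.h>
  #include <thrust/reduce.h>
  #include <thrust/transform.h>

  // y = sum(x) implemented with thrust::reduce.
  double sum(const thrust::device_vector<double>& x) {
    return thrust::reduce(x.begin(), x.end(), 0.0);
  }

  namespace clad {
  namespace custom_derivatives {
  // Sketch of a reverse-mode pullback for `sum`: since dy/dx[i] == 1, each
  // element's adjoint accumulates the incoming adjoint _d_y.
  inline void sum_pullback(const thrust::device_vector<double>& x, double _d_y,
                           thrust::device_vector<double>* _d_x) {
    using namespace thrust::placeholders;
    thrust::transform(_d_x->begin(), _d_x->end(), _d_x->begin(), _1 + _d_y);
  }
  }  // namespace custom_derivatives
  }  // namespace clad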

Task ideas and expected results

  • Research and decide on the most valuable Thrust functions to support in Clad
  • Create pushforward and pullback functions for these Thrust functions
  • Write tests that cover the additions
  • Include demos of using Clad on open source code examples that call Thrust functions
  • Write documentation on which Thrust functions are supported in Clad
  • Present the work at the relevant meetings and conferences.

Enable automatic differentiation of C++ STL concurrency primitives in Clad

Clad is an automatic differentiation (AD) clang plugin for C++. Given the C++ source code of a mathematical function, it can automatically generate C++ code for computing derivatives of the function. This project focuses on enabling automatic differentiation of code that uses C++ concurrency features such as std::thread, std::mutex, and atomic operations. This will allow users to fully utilize their CPU resources.
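
An illustrative example of the kind of user code in scope (not something Clad handles today): a dot product whose accumulation is split across two std::thread workers. Reverse-mode AD would need to reason correctly about the per-thread partial accumulations.

  #include <cstddef>
  #include <thread>
  #include <vector>

  // Illustrative target for differentiation: a two-thread dot product.
  double parallel_dot(const std::vector<double>& a,
                      const std::vector<double>& b) {
    double partial[2] = {0.0, 0.0};
    auto worker = [&](int id, std::size_t begin, std::size_t end) {
      for (std::size_t i = begin; i < end; ++i)
        partial[id] += a[i] * b[i];
    };
    std::size_t mid = a.size() / 2;
    std::thread t0(worker, 0, std::size_t{0}, mid);
    std::thread t1(worker, 1, mid, a.size());
    t0.join();
    t1.join();
    return partial[0] + partial[1];
  }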

Task ideas and expected results

  • Explore C++ concurrency primitives and prepare a report detailing the associated challenges involved and the features that can be feasibly supported within the given timeframe.
  • Add concurrency primitives support in Clad’s forward-mode automatic differentiation.
  • Add concurrency primitives support in Clad’s reverse-mode automatic differentiation.
  • Add proper tests and documentation.
  • Present the work at the relevant meetings and conferences.

Implementing Debugging Support in Xeus-Cpp

xeus-cpp is an interactive execution environment for C++ in Jupyter notebooks, built on the Clang-Repl C++ interpreter provided by CppInterOp. While xeus-cpp enables a seamless workflow for running C++ code interactively, the lack of an integrated debugging experience remains a gap, especially when dealing with code that is dynamically compiled and executed through LLVM’s JIT (Just-In-Time) infrastructure.

Jupyter’s debugging system follows the Debug Adapter Protocol (DAP), enabling seamless integration of debuggers into interactive kernels. Existing Jupyter kernels, such as IPython and xeus-python, have successfully implemented debugging workflows that support breakpoints, variable inspection, and execution control, even in dynamically executed environments. These implementations address challenges such as symbol resolution and source mapping for dynamically generated code, ensuring that debugging within Jupyter remains intuitive and user-friendly.

However, debugging C++ inside an interactive environment presents unique challenges, particularly due to Clang-Repl’s use of LLVM’s ORC JIT to compile and execute code dynamically. To integrate debugging into xeus-cpp, the project will explore existing DAP implementations such as lldb-dap and debuggers such as lldb that can interface with Jupyter while effectively supporting the execution model of Clang-Repl.

Task ideas and expected results

  • Seamless debugging integration, establishing reliable interactions between xeus-cpp, a Debug Adapter Protocol (DAP) implementation, and a debugger.
  • Implement a testing framework through xeus-zmq to thoroughly test the debugger. This can be inspired by an existing implementation in xeus-python.
  • Present the work at the relevant meetings and conferences.

Interactive Differential Debugging - Intelligent Auto-Stepping and Tab-Completion

Differential debugging is a time-consuming task that is not well supported by existing tools. Existing state-of-the-art tools do not consider a baseline (working) version while debugging regressions in complex systems, often forcing developers to do manually what could be automated.

The differential debugging technique analyzes a regressed system and identifies the cause of unexpected behaviors by comparing it to a previous version of the same system. The IDD tool inspects two versions of the executable: a baseline and a regressed version. The interactive debugging session runs both executables side by side, allowing the user to inspect and compare various internal states.

This project aims to implement intelligent stepping (debugging) and tab completions of commands. IDD should be able to execute until a stack frame or variable diverges between the two versions of the system, then drop to the debugger. This may be achieved by introducing new IDD-specific commands. IDD should be able to tab complete the underlying GDB/LLDB commands. The contributor is also expected to set up the necessary CI infrastructure to automate the testing process of IDD.
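
A hypothetical session illustrating the proposed workflow; the command names and output format are illustrative, not IDD's current interface.

  # Launch both versions side-by-side under IDD (illustrative invocation).
  $ idd ./app-baseline ./app-regressed
  (idd) run-until-diverge      # proposed command: step both debuggers until
                               # a stack frame or variable value differs
  Divergence in frame #3: compute()
    baseline:  result = 42
    regressed: result = 0
  (idd) bt                     # tab completion forwards to GDB/LLDB commands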

Task ideas and expected results

  • Enable stream capture
  • Enable IDD-specific commands to execute until diverging stack or variable value.
  • Enable tab completion of commands.
  • Set up CI infrastructure to automate testing IDD.
  • Present the work at the relevant meetings and conferences.

Using ROOT in the field of genome sequencing

ROOT is a framework for data processing, born at CERN, at the heart of the research on high-energy physics. Every day, thousands of physicists use ROOT applications to analyze their data or to perform simulations. The ROOT software framework is foundational for the HEP ecosystem, providing capabilities such as IO, a C++ interpreter, GUI, and math libraries. It uses object-oriented concepts and build-time modules to layer between components. We believe additional layering formalisms will benefit ROOT and its users.

ROOT has broader scientific uses than high-energy physics. Several studies have shown promising applications of the ROOT I/O system in the field of genome sequencing. This project is about extending the capabilities developed in GeneROOT and better understanding the requirements of the field.

Task ideas and expected results

  • Reproduce the results based on previous comparisons against ROOT master
  • Investigate and compare the latest compression strategies used by Samtools for conversions to BAM against RAM (ROOT Alignment Maps).
  • Explore ROOT’s RNTuple format to efficiently store RAMs in place of the previously used TTree (see the sketch after this list).
  • Investigate different ROOT file splitting techniques
  • Produce a comparison report
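
For the RNTuple task above, a minimal sketch of writing alignment records with ROOT's (still evolving) RNTuple API; the field names are illustrative, not the actual GeneROOT/RAM schema.

  #include <ROOT/RNTupleModel.hxx>
  #include <ROOT/RNTupleWriter.hxx>
  #include <cstdint>
  #include <string>
  #include <utility>

  // Minimal sketch: an RNTuple holding one entry per aligned read.
  void WriteReads() {
    auto model = ROOT::Experimental::RNTupleModel::Create();
    auto pos = model->MakeField<std::uint64_t>("pos");  // mapped position
    auto seq = model->MakeField<std::string>("seq");    // read bases
    auto writer = ROOT::Experimental::RNTupleWriter::Recreate(
        std::move(model), "reads", "ram.root");
    *pos = 123456;
    *seq = "ACGT";
    writer->Fill();  // commit one alignment record
  }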

Implement CppInterOp API exposing memory, ownership and thread safety information

Incremental compilation pipelines process code chunk-by-chunk by building an ever-growing translation unit. Code is then lowered into the LLVM IR and subsequently run by the LLVM JIT. Such a pipeline allows creation of efficient interpreters. The interpreter enables interactive exploration and makes the C++ language more user friendly. The incremental compilation mode is used by the interactive C++ interpreter, Cling, initially developed to enable interactive high-energy physics analysis in a C++ environment.

Clang and LLVM provide access to C++ from other programming languages, but they currently expose only the declared public interfaces of such C++ code, even when they have parsed implementation details directly. Both the high-level and the low-level program representations have enough information to capture and expose more of such details to improve language interoperability. Examples include details of memory management, ownership transfer, thread safety, externalized side effects, etc. For example, if memory is allocated and returned, the caller needs to take ownership; if a function is pure, it can be elided; if a call provides access to a data member, it can be reduced to an address lookup.

The goal of this project is to develop APIs for CppInterOp capable of extracting and exposing such information from the AST or from JIT-ed code, and to use them in cppyy (Python-C++ language bindings) as an exemplar. If time permits, the work can be extended to persist this information across translation units and use it on code compiled with Clang.
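
The shape of such an API might be along these lines. This is a sketch only: the enum, function names, and granularity are design questions for the project; TCppFunction_t is CppInterOp's opaque function handle.

  // Hypothetical additions to CppInterOp's Cpp namespace (illustrative).
  namespace Cpp {
    enum class Ownership { Unknown, CallerOwns, CalleeOwns, Shared };

    // Does ownership of the returned object transfer to the caller?
    Ownership GetReturnOwnership(TCppFunction_t func);

    // Is the function free of externalized side effects (and thus elidable)?
    bool IsPure(TCppFunction_t func);

    // Is it safe to call the function concurrently from multiple threads?
    bool IsThreadSafe(TCppFunction_t func);
  }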

Task ideas and expected results

  • Collect and categorize possible exposed interop information kinds
  • Write one or more facilities to extract necessary implementation details
  • Design a language-independent interface to expose this information
  • Integrate the work in clang-repl and Cling
  • Implement and demonstrate its use in cppyy as an exemplar
  • Present the work at the relevant meetings and conferences.

Implement and improve an efficient, layered tape with prefetching capabilities

In mathematics and computer algebra, automatic differentiation (AD) is a set of techniques to numerically evaluate the derivative of a function specified by a computer program. Automatic differentiation is an alternative to symbolic differentiation and numerical differentiation (the method of finite differences). Clad is based on Clang, which provides the necessary facilities for code transformation. The AD library can differentiate non-trivial functions, find partial derivatives for trivial cases, and has good unit test coverage.

The most heavily used entity in AD is a stack-like data structure called a tape. Its first-in last-out access pattern, which naturally occurs in the storage of intermediate values for reverse-mode AD, lends itself to asynchronous storage. Asynchronous prefetching of values during the reverse pass allows checkpoints deeper in the stack to be stored furthest away in the memory hierarchy. Checkpointing provides a mechanism to parallelize segments of a function that can be executed on independent cores. Inserting checkpoints in these segments using separate tapes keeps memory local and avoids sharing memory between cores.

We will research techniques for local parallelization of the gradient reverse pass, and extend them to achieve better scalability and/or lower constant overheads on CPUs and potentially accelerators. We will evaluate techniques for efficient memory use, such as multi-level checkpointing support. Combining already-developed techniques will allow executing gradient segments across different cores or in heterogeneous computing systems. These techniques must be robust and user-friendly, and minimize required application code and build system changes.

This project aims to improve the efficiency of the clad tape and generalize it into a tool-agnostic facility that could be used outside of clad as well.
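
A minimal sketch of the slab idea from the first task below: growing the tape allocates a new fixed-size slab instead of reallocating and copying, so existing entries never move (which also simplifies later thread safety and offload work). This is an illustration of the data layout, not clad's actual tape implementation.

  #include <cstddef>
  #include <memory>
  #include <vector>

  // Slab-based tape sketch: growth appends a slab; entries are stable.
  template <typename T, std::size_t SlabSize = 4096>
  class SlabTape {
    std::vector<std::unique_ptr<T[]>> slabs_;
    std::size_t size_ = 0;

  public:
    void push(const T& value) {
      if (size_ == slabs_.size() * SlabSize)
        slabs_.push_back(std::make_unique<T[]>(SlabSize));
      slabs_[size_ / SlabSize][size_ % SlabSize] = value;
      ++size_;
    }
    T pop() {  // first-in last-out access used by the reverse pass
      --size_;
      return slabs_[size_ / SlabSize][size_ % SlabSize];
    }
    std::size_t size() const { return size_; }
  };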

Task ideas and expected results

  • Optimize the current tape by avoiding reallocation on resize, in favor of connected slabs (chunked arrays)
  • Enhance existing benchmarks demonstrating the efficiency of the new tape
  • Add thread safety to the tape
  • Implement multilayer tape being stored in memory and on disk
  • [Stretch goal] Support cpu-gpu transfer of the tape
  • [Stretch goal] Add infrastructure to enable checkpointing offload to the new tape
  • [Stretch goal] Performance benchmarks

Enabling CUDA compilation on Cppyy-Numba generated IR

Cppyy is an automatic, run-time, Python-C++ bindings generator, for calling C++ from Python and Python from C++. Initial support has been added that allows Cppyy to hook into the high-performance Python compiler Numba, which compiles looped code containing C++ objects/methods/functions defined via Cppyy into fast machine code. Since Numba compiles the code in loops into machine code, it crosses the language barrier just once, avoiding the large slowdowns that accumulate from repeated calls between the two languages. Numba uses its own lightweight version of the LLVM compiler toolkit (llvmlite) that generates an intermediate code representation (LLVM IR), which is also supported by the Clang compiler capable of compiling CUDA C++ code.

The project aims to demonstrate Cppyy’s capability to provide CUDA paradigms to Python users without any compromise in performance. Upon successful completion, a possible proof of concept would look like the code snippet below:

import cppyy
import cppyy.numba_ext  # enables the Numba extension for Cppyy-bound code
import numba

cppyy.cppdef('''
__global__ void MatrixMul(float* A, float* B, float* out) {
    // kernel logic for matrix multiplication
}
''')

@numba.njit
def run_cuda_mul(A, B, out):
    # Allocate memory for input and output arrays on GPU (d_A, d_B, d_out)
    # Define grid and block dimensions (griddim, blockdim)
    # Launch the kernel
    MatrixMul[griddim, blockdim](d_A, d_B, d_out)

Task ideas and expected results

  • Add support for declaration and parsing of Cppyy-defined CUDA code on the Numba extension.
  • Design and develop a CUDA compilation and execution mechanism.
  • Prepare proper tests and documentation.

Cppyy STL/Eigen - Automatic conversion and plugins for Python based ML-backends

Cppyy is an automatic, run-time, Python-C++ bindings generator, for calling C++ from Python and Python from C++. Cppyy uses pythonized wrappers of useful classes from libraries like STL and Eigen that allow the user to utilize them on the Python side. Current support covers STL container types like std::vector, std::map, and std::tuple, and the Matrix-based classes in Eigen/Dense. These Cppyy objects can be plugged into idiomatic expressions that expect Python built-in types. This behaviour is achieved by growing Pythonic methods like __len__ while also retaining the C++ methods like size.

Efficient and automatic conversion between C++ and Python is essential for high-performance cross-language support. This approach eliminates overheads arising from iterative initialization, such as comma-initialization in Eigen. It opens up new avenues for utilizing Cppyy’s bindings in tools that perform numerical operations for transformations or optimization.

The on-demand C++ infrastructure wrapped by idiomatic Python enables new techniques in ML tools like JAX/CUTLASS. This project allows the C++ infrastructure to be plugged in to serve users seeking high-performance library primitives that are unavailable in Python.

Task ideas and expected results

  • Extend STL support for std::vectors of arbitrary dimensions
  • Improve the initialization approach for Eigen classes
  • Develop a streamlined interconversion mechanism between Python builtin-types, numpy.ndarray, and STL/Eigen data structures
  • Implement experimental plugins that perform basic computational
    operations in frameworks like JAX
  • Work on integrating these plugins with toolkits like CUTLASS that utilise the bindings to provide a Python API

On Demand Parsing in Clang

Clang, like any C++ compiler, parses a sequence of characters as they appear, linearly. The linear character sequence is then turned into tokens and an AST before being lowered to machine code. In many cases the end-user code uses only a small portion of the C++ entities from the entire translation unit, but the user still pays the price of compiling all of the redundancies.

This project proposes to process the heavy compiling C++ entities upon using them rather than eagerly. This approach is already adopted in Clang’s CodeGen where it allows Clang to produce code only for what is being used. On demand compilation is expected to significantly reduce the compilation peak memory and improve the compile time for translation units which sparsely use their contents. In addition, that would have a significant impact on interactive C++ where header inclusion essentially becomes a no-op and entities will be only parsed on demand.

The Cling interpreter implements a very naive but efficient cross-translation unit lazy compilation optimization which scales across hundreds of libraries in the field of high-energy physics.

  // A.h
  #include <string>
  #include <vector>
  template <class T, class U = int> struct AStruct {
    void doIt() { /*...*/ }
    const char* data;
    // ...
  };

  template<class T, class U = AStruct<T>>
  inline void freeFunction() { /* ... */ }
  inline void doit(unsigned N = 1) { /* ... */ }

  // Main.cpp
  #include "A.h"
  int main() {
    doit();
    return 0;
  }

This pathological example expands to 37253 lines of code to process. Cling builds an index (which it calls an autoloading map) containing only forward declarations of these C++ entities, amounting to 3000 lines of code.

The index looks like:

  // A.h.index
  namespace std{inline namespace __1{template <class _Tp, class _Allocator> class __attribute__((annotate("$clingAutoload$vector")))  __attribute__((annotate("$clingAutoload$A.h")))  __vector_base;
    }}
  ...
  template <class T, class U = int> struct __attribute__((annotate("$clingAutoload$A.h"))) AStruct;

Upon requiring the complete type of an entity, Cling includes the relevant header file to get it. There are several trivial workarounds to deal with default arguments and default template arguments, as they now appear on both the forward declaration and the definition.

Although the implementation could not be called a reference implementation, it shows that the Parser and the Preprocessor of Clang are relatively stateless and can be used to process character sequences which are not linear in their nature. In particular namespace-scope definitions are relatively easy to handle and it is not very difficult to return to namespace-scope when we lazily parse something. For other contexts such as local classes we will have lost some essential information such as name lookup tables for local entities. However, these cases are probably not very interesting as the lazy parsing granularity is probably worth doing only for top-level entities.

Such implementation can help with already existing issues in the standard such as CWG2335, under which the delayed portions of classes get parsed immediately when they’re first needed, if that first usage precedes the end of the class. That should give good motivation to upstream all the operations needed to return to an enclosing scope and parse something.

Implementation approach:

Upon seeing a tag definition during parsing we could create a forward declaration, record the token sequence and mark it as a lazy definition. Later upon complete type request, we could re-position the parser to parse the definition body. We already skip some of the template specializations in a similar way [commit, commit].

Another approach is for every lazily parsed entity to record its token stream and to change the Toks stored in LateParsedDeclarations to optionally refer to a subsequence of an externally stored token sequence instead of owning its own copy (or perhaps change CachedTokens so it can do that transparently). One challenge is that we currently modify the cached tokens list to append an “eof” token, but it should be possible to handle that in a different way.

In some cases, a class definition can affect its surrounding context in ways that need careful handling:

1) struct X appearing inside the class can introduce the name X into the enclosing context.

2) static inline declarations can introduce global variables with non-constant initializers that may have arbitrary side-effects.

For point (2), there is a more general problem: parsing any expression can trigger a template instantiation of a class template that has a static data member with an initializer that has side effects. Unlike the two cases above, such cases cannot be correctly detected and handled by a simple analysis of the token stream; actual semantic analysis is required. Perhaps, if they happen only in code that is itself unused, it would be acceptable for Clang to have a language mode that does not guarantee that such instantiations actually happen.

An alternative, possibly more efficient, implementation could make the lookup tables range-based, but we do not yet have even a prototype proving that this is a feasible approach.

Task ideas and expected results

  • Design and implementation of on-demand compilation for non-templated functions
  • Support non-templated structs and classes
  • Run performance benchmarks on relevant codebases and prepare report
  • Prepare a community RFC document
  • [Stretch goal] Support templates

The successful candidate should commit to regular participation in weekly meetings, deliver presentations, and contribute blog posts as requested. Additionally, they should demonstrate the ability to navigate the community process with patience and understanding.

Improve automatic differentiation of object-oriented paradigms using Clad

Clad is an automatic differentiation (AD) clang plugin for C++. Given the C++ source code of a mathematical function, it can automatically generate C++ code for computing derivatives of the function. Clad has found uses in statistical analysis and uncertainty assessment applications.

Object-oriented programming (OOP) provides a structured approach for complex use cases, allowing for modular components that can be reused and extended. OOP also allows for abstraction, which makes code easier to reason about and maintain. Full OOP support is an open research area for automatic differentiation tools.

This project focuses on improving support for differentiating object-oriented constructs in Clad. This will allow users to seamlessly compute derivatives to the algorithms in their projects which use an object-oriented model. C++ object-oriented constructs include but are not limited to: classes, inheritance, polymorphism, and related features such as operator overloading.
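
As a sketch of the kind of usage in scope, here is forward-mode differentiation of a member function, following Clad's documented pattern of passing the object as the first argument to execute(); this project would extend such support to constructors, operator overloads, and polymorphic calls.

  #include "clad/Differentiator/Differentiator.h"

  class Sphere {
    double r_;
  public:
    Sphere(double r) : r_(r) {}
    double volume(double scale) {
      return 4.0 / 3.0 * 3.141592653589793 * r_ * r_ * r_ * scale;
    }
  };

  int main() {
    // Forward-mode derivative of a member function w.r.t. 'scale'; the
    // object instance is supplied as the first argument of execute().
    auto dvol = clad::differentiate(&Sphere::volume, "scale");
    Sphere s(2.0);
    double dv = dvol.execute(s, /*scale=*/1.0);
  }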

Task ideas and expected results

There are several foreseen tasks:

  • Study the current object-oriented differentiable programming support in Clad. Prepare a report on the missing constructs that should be added to support automatic differentiation of object-oriented paradigms in both forward-mode and reverse-mode AD. Known gaps include: differentiation of constructors, limited support for differentiating operator overloads, reference class members, and the lack of a way to specify custom derivatives for constructors.
  • Add support for the missing constructs.
  • Add proper tests and documentation.

Broaden the Scope for the Floating-Point Error Estimation Framework in Clad

In mathematics and computer algebra, automatic differentiation (AD) is a set of techniques to numerically evaluate the derivative of a function specified by a computer program. Automatic differentiation is an alternative to symbolic differentiation and numerical differentiation (the method of finite differences). Clad is based on Clang, which provides the necessary facilities for code transformation. The AD library can differentiate non-trivial functions, find partial derivatives for trivial cases, and has good unit test coverage.

Clad also possesses the capability of annotating given source code with floating-point error estimation code. This allows Clad to compute any floating-point-related errors in the given function on the fly, to reason about the numerical stability of the function, and to analyze the sensitivity of the variables involved.

The idea behind this project is to develop benchmarks and improve the floating-point error estimation framework as necessary. Moreover, find compelling real-world use-cases of the tool and investigate the possibility of performing lossy compression with it.

On successful completion of the project, the framework should have a sufficiently large set of benchmarks and example usages. Moreover, the framework should be able to run the following code as expected:

#include <iostream>
#include "clad/Differentiator/Differentiator.h"

// Some complicated function made up of doubles.
double someFunc(double F1[], double F2[], double V3[], double COUP1, double COUP2)
{
  double cI = 1;
  double TMP3;
  double TMP4;
  TMP3 = (F1[2] * (F2[4] * (V3[2] + V3[5]) + F2[5] * (V3[3] + cI * (V3[4]))) +
          F1[3] * (F2[4] * (V3[3] - cI * (V3[4])) + F2[5] * (V3[2] - V3[5])));
  TMP4 = (F1[4] * (F2[2] * (V3[2] - V3[5]) - F2[3] * (V3[3] + cI * (V3[4]))) +
          F1[5] * (F2[2] * (-V3[3] + cI * (V3[4])) + F2[3] * (V3[2] + V3[5])));
  return (-1.) * (COUP2 * (+cI * (TMP3) + 2. * cI * (TMP4)) +
                  cI * (TMP3 * COUP1));
}

int main() {
  auto df = clad::estimate_error(someFunc);
  // This call should generate a report to decide
  // which variables can be downcast to a float.
  df.execute(args...);  // 'args...' stands for the function's actual arguments
}

Task ideas and expected results

The project consists of the following tasks:

  • Add at least 5 benchmarks and compare the framework’s correctness and performance against them.
  • Compile at least 3 real-world examples that are complex enough to demonstrate the capabilities of the framework.
  • Solve any general-purpose issues that come up with Clad during the process.
  • Prepare demos and carry out development needed for lossy compression.

Improve robustness of dictionary to module lookups in ROOT

The LHC smashes groups of protons together at close to the speed of light: 40 million times per second and with seven times the energy of the most powerful accelerators built up to now. Many of these collisions will be just glancing blows, but some will be head-on and very energetic. When this happens, some of the energy of the collision is turned into mass, and previously unobserved, short-lived particles, which could give clues about how Nature behaves at a fundamental level, fly out into the detector. Our work includes the experimental discovery of the Higgs boson, which led to the award of a Nobel Prize for the underlying theory that predicted the Higgs boson as an important piece of the standard model of particle physics.

CMS is a particle detector designed to see a wide range of particles and phenomena produced in high-energy collisions in the LHC. Like a cylindrical onion, different layers of detectors measure the different particles, and this key data is used to build up a picture of events at the heart of the collision. CMSSW is a collection of software for the CMS experiment, responsible for the collection and processing of information about the particle collisions at the detector. CMSSW uses the ROOT framework to provide support for data storage and processing. ROOT relies on Cling, Clang, and LLVM for automatically building an efficient I/O representation of the necessary C++ objects. The I/O properties of each object are described in a compileable C++ file called a dictionary. ROOT’s I/O dictionary system relies on C++ modules to improve the overall memory footprint when being used.

The few runtime failures in the modules integration builds of CMSSW are due to dictionaries that cannot be found by the modules system; these dictionaries exist, as the mainstream system is able to find them using a broader search. The modules setup in ROOT needs to be extended with a dictionary extension to track dictionary<->module mappings for C++ entities that introduce synonyms rather than declarations (e.g., using MyVector = std::vector<A<B>>; where the dictionaries of A and B are elsewhere).

Task ideas and expected results

The project consists of the following tasks:

  • For an alias declaration of the kind using MyVector = std::vector<A<B>>;, store its ODRHash in the respective dictionary file as a number attached to a special variable that can be retrieved at symbol-scanning time.
  • Track down the test failures of CMSSW and check if the proposed implementation works.
  • Develop tutorials and documentation.
  • Present the work at the relevant meetings and conferences.

Enhance the incremental compilation error recovery in clang and clang-repl

The Clang compiler is part of the LLVM compiler infrastructure and supports various languages such as C, C++, ObjC and ObjC++. The design of LLVM and Clang enables them to be used as libraries, and has led to the creation of an entire compiler-assisted ecosystem of tools. The relatively friendly codebase of Clang and advancements in the JIT infrastructure in LLVM further enable research into different methods for processing C++ by blurring the boundary between compile time and runtime. Challenges include incremental compilation and fitting compile/link time optimizations into a more dynamic environment.

Incremental compilation pipelines process code chunk-by-chunk by building an ever-growing translation unit. Code is then lowered into the LLVM IR and subsequently run by the LLVM JIT. Such a pipeline allows creation of efficient interpreters. The interpreter enables interactive exploration and makes the C++ language more user friendly. The incremental compilation mode is used by the interactive C++ interpreter, Cling, initially developed to enable interactive high-energy physics analysis in a C++ environment.

Our group works on incorporating and possibly redesigning parts of Cling in mainline Clang through a new tool, clang-repl. This project aims at enhancing the error recovery when users type C++ at the prompt of clang-repl.
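
A short session showing the kind of behavior in scope: after an erroneous input, the partially formed declarations must be rolled back so subsequent inputs can reuse the same names (the session below is illustrative).

  clang-repl> int x = undeclared_name;  // error: use of undeclared identifier
  clang-repl> int x = 42;               // should succeed: error recovery must
                                        // undo the partial declaration of 'x'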

Task ideas and expected results

There are several tasks to improve the current rudimentary state of the error recovery:

  • Extend the test coverage for error recovery
  • Find and fix cases where there are bugs
  • Implement template instantiation error recovery support
  • Implement argument-dependent lookup (ADL) recovery support