Google Summer of Code 2025 Contributor
email: adityapand3y666@gmail.com
Education: Bachelor of Technology, Birla Institute of Technology, Mesra, India
Ongoing project:
Using ROOT in the field of genome sequencing
This project aims to advance genomic data management by implementing ROOT’s
next-generation RNTuple format for sequence alignment storage. We will first
validate previous GeneROOT benchmarks, which showed 4x performance gains
with TTree, and then extend these capabilities with RNTuple technology.
Genomic sequencing data volumes are growing exponentially, creating performance
bottlenecks in traditional formats. RNTuple’s improved memory mapping, type
safety through templated interfaces, and parallelization capabilities position
it as an ideal solution.
We will systematically compare compression algorithms, implement file splitting
strategies, and benchmark against established formats. Additionally, I will
analyze the latest compression strategies implemented by samtools and incorporate
these techniques into the project to ensure comprehensive coverage of state-of-the-art
genomic data compression approaches.
The project will deliver optimized tools for handling rapidly growing genomic datasets,
potentially establishing a new standard for high-performance genomic data analysis.
Project Proposal: URL
Mentors: Martin Vassilev, Jonas Rembser, Fons Rademakers, Vassil Vassilev