BioHackathon Europe 2025, Berlin, Germany, 2025

BioHackEU25

2025-11-03 - 2025-11-07
https://biohackathon-europe.org/
https://biohackathon-europe.org/about/biohackrxiv/

BioHackathon Europe is an annual event that brings together bioinformaticians and computational biologists from around the world. It’s organised by ELIXIR Europe, and offers an intense week of hacking, with participants working on diverse and exciting projects. BioHackathon is a community-driven event, which provides an opportunity for members of the life sciences community to meet and work together on topics of common interest. The goal is to create code that addresses challenges in bioinformatics research.

Source

Previous BioHackathon Europe preprints

YAML instructions

biohackathon_name: "BioHackathon Europe 2025"
biohackathon_url: "https://biohackathon-europe.org/"
biohackathon_location: "Berlin, Germany, 2025"
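
As a minimal sketch, the three fields above could be generated programmatically before being pasted into a report's YAML header. The helper name `biohackathon_front_matter` is hypothetical and not part of any BioHackrXiv tooling. The placement of these lines in the report's header is an assumption.

```python
def biohackathon_front_matter(name: str, url: str, location: str) -> str:
    """Return the three event fields as YAML lines with double-quoted values."""
    fields = {
        "biohackathon_name": name,
        "biohackathon_url": url,
        "biohackathon_location": location,
    }
    lines = []
    for key, value in fields.items():
        # Escape backslashes and double quotes so the value stays valid
        # inside a double-quoted YAML scalar.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{key}: "{escaped}"')
    return "\n".join(lines)


snippet = biohackathon_front_matter(
    "BioHackathon Europe 2025",
    "https://biohackathon-europe.org/",
    "Berlin, Germany, 2025",
)
print(snippet)
```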

Preprints

Jan 22, 2026
https://doi.org/10.37044/osf.io/2jgk4_v1

METRICS - Monitoring of Key Performance Indicators for ELIXIR Services

Key Performance Indicators (KPIs) are increasingly requested by a diverse range of stakeholders across the research ecosystem. Funders want to measure the impact of projects and related services they fund, while research organisations want to track service use for informed decision making. Service providers themselves are also interested in monitoring their services to gather feedback and improve service quality. KPIs are a simple but powerful tool for these purposes. As part of the BioHackathon Europe 2025, we report on the activities of the METRICS project, which addresses the need for consistent and transparent evaluation of services across ELIXIR and related initiatives using KPIs. The project brings together experts from multiple ELIXIR Nodes and scientific domains to identify, harmonise, and semantically model KPIs that reflect service quality, usage, sustainability, and impact. By exploring existing evaluation frameworks and processes, the team aims to design a flexible yet coherent foundation for KPI monitoring of ELIXIR services. This report summarises the project’s motivation, current landscape analysis, and initial steps toward developing an ontology-driven framework for KPI representation, fostering interoperability and supporting evidence-based management of life science infrastructures.

Dec 31, 2025
https://doi.org/10.37044/osf.io/jfpsx_v1

BioHackEU25 Report Project 16: MiCoReCa (Microbiome Community Resource Catalogue) - Towards Centralized Curation And Integration Of Microbiome Bioinformatics Resources

The rapid growth of microbiome research has led to the development of numerous bioinformatics tools and databases, but information about them remains fragmented across disparate, often outdated cataloging efforts, hindering resource discovery and utilization. To address this critical gap, the ELIXIR Microbiome Community proposes the development of MiCoReCa (Microbiome Community Resource Catalogue), a comprehensive, dynamic, open-access catalogue of microbiome-related bioinformatics resources (tools, workflows, training, standards, and databases). Leveraging our community’s expertise, this initiative will utilize standardized ontologies like EDAM and cross-reference established platforms like bio.tools and WorkflowHub to create a centralized, findable inventory. A key feature is the community-driven process for identifying and curating missing ontological terms and metadata, ensuring MiCoReCa’s accuracy and relevance in collaboration with partner platforms. Furthermore, the catalogue will integrate links to training materials from TeSS to support appropriate tool usage, and connect with OpenEBench for benchmarking capabilities. This project will not only provide a vital resource for the microbiome field, enhancing research efficiency and reproducibility, but will also establish a sustainable, adaptable infrastructure potentially applicable to other ELIXIR Communities. This effort represents a significant contribution by the ELIXIR Microbiome Community to streamline microbiome bioinformatics.

Dec 17, 2025
https://doi.org/10.37044/osf.io/xhkc3_v1

Decoding Complex Genotype-Phenotype Interactions by Discretizing the Genome

Background: Despite the ease and affordability of genome sequencing in biomedical research, the genetic causes of many diseases or their subtypes remain unknown due to diverse biological mechanisms that complicate genotype-phenotype relationships. Most previous studies have focused on single variants or sets of variants presumed to be directly causal for the disease. However, incomplete penetrance, in which some individuals carry disease-associated variants yet exhibit no phenotype, suggests that these variants, the genomic background and other secondary factors combine to shape the susceptibility to the disease.

Dec 16, 2025
https://doi.org/10.37044/osf.io/zah28_v1

BioHackEU25 report: Towards a Robust Validation Service for Data and Metadata in ARC RO-Crates

Robust validation of both research data and its accompanying metadata is essential for ensuring adherence to FAIR principles. Current approaches often handle these aspects separately, hindering a holistic quality assessment. Building upon previous BioHackathon work establishing ARCs (Annotated Research Context) as RO-Crates (ARC RO-Crate), we aim to develop and demonstrate an integrated validation strategy for FAIR digital objects. It distinguishes between validating the metadata descriptor and the payload data files. For the metadata descriptor, validation will ensure structural and semantic compliance with the base RO-Crate specification and the ARC-ISA family of RO-Crate profiles, using and extending the RO-Crate validator tool. For the payload data files, validation targets the actual content, since data files often require domain-specific structural and value constraints, which in turn require explicit schema definitions. For this, we will integrate Frictionless for checking data content against community standards (e.g. MIAPPE, as demonstrated in the HORIZON project AGENT). Crucially, this project will also explore mechanisms for specifying expected data structures’ requirements within the ARC RO-Crate itself. This aims to provide a more self-contained description of data, investigating how such internal requirements can be linked to data validation frameworks, complementing the crate’s metadata validation. The overall goal is to provide a powerful, holistic validation mechanism for ARC RO-Crates, enhancing their reliability, trustworthiness, and FAIRness. A MIAPPE-compliant plant phenomics dataset will serve as a use case. This integrated validation approach aims to streamline quality control for researchers and will be packaged as a deployable microservice, offering broad applicability across diverse research workflows.
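
To make the idea of payload validation against an explicit schema more concrete, here is a minimal sketch using only the Python standard library. It does not use the Frictionless API or the RO-Crate validator, and it is not the project's implementation. The column names, the `SCHEMA` layout, and the `validate_table` helper are hypothetical and chosen purely for illustration.

```python
import csv
import io

# Hypothetical schema: column name -> (type caster, required flag).
SCHEMA = {
    "plant_id": (str, True),
    "height_cm": (float, True),
    "treatment": (str, False),
}


def validate_table(text, schema):
    """Return (data_row_number, column, message) tuples for problems in CSV text.

    Row number 0 is used for problems with the header itself.
    """
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    missing = [col for col in schema if col not in (reader.fieldnames or [])]
    for col in missing:
        problems.append((0, col, "column missing"))
    for row_number, row in enumerate(reader, start=1):
        for col, (caster, required) in schema.items():
            if col in missing:
                continue
            value = (row[col] or "").strip()
            if not value:
                if required:
                    problems.append((row_number, col, "required value is empty"))
                continue
            try:
                caster(value)  # value constraint: must parse as the declared type
            except ValueError:
                problems.append(
                    (row_number, col, f"cannot parse {value!r} as {caster.__name__}")
                )
    return problems


# Made-up sample data with one bad number and one missing identifier.
sample = "plant_id,height_cm,treatment\nP1,12.5,control\nP2,tall,\n,9.0,drought\n"
problems = validate_table(sample, SCHEMA)
for problem in problems:
    print(problem)
```

A real service would presumably delegate these checks to a dedicated validation framework and read the expected structure from the crate rather than from a hard-coded dictionary.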

Nov 29, 2025
https://doi.org/10.37044/osf.io/gv2ac_v1

Mining the potential of knowledge graphs for metadata on training

Training metadata in the life‑science community is increasingly standardized through Bioschemas, yet remains fragmented and under‑utilized. In this work we harvested training records from ELIXIR’s TeSS platform and the Galaxy Training Network, converting them into a unified knowledge graph. A dedicated pipeline parses RDF/Turtle dumps, deduplicates entries, and builds rich indexes (keyword, provider, location, date, topic) that power a Model Context Protocol (MCP) server. The MCP offers live and offline search tools (including keyword, provider, location, date, topic, and SPARQL queries), enabling natural‑language access to training resources via LLM‑driven clients. User‑story driven evaluations demonstrate the system’s ability to generate custom learning paths, assemble trainer profiles, and link training data to external repositories. Findings highlight gaps in persistent identifiers (ORCID, ROR) and location granularity, informing recommendations for metadata providers. The project showcases how knowledge‑graph‑backed metadata can enhance discoverability, interoperability, and AI‑assisted exploration of scientific training materials.
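
The deduplicate-then-index step described above can be illustrated with a small, standard-library-only sketch. It is not the project's pipeline: the records are hand-written stand-ins for harvested metadata, RDF/Turtle parsing is skipped entirely, and the choice of a normalised URL as the deduplication key is an assumption made for the example.

```python
from collections import defaultdict

# Hypothetical records standing in for harvested training metadata.
records = [
    {"url": "https://example.org/t/1", "name": "Intro to RNA-seq",
     "provider": "Provider A", "keywords": ["RNA-seq", "Galaxy"]},
    {"url": "https://example.org/t/1/", "name": "Intro to RNA-seq",
     "provider": "Provider A", "keywords": ["rna-seq"]},
    {"url": "https://example.org/t/2", "name": "SPARQL basics",
     "provider": "Provider B", "keywords": ["SPARQL", "RDF"]},
]


def normalise_url(url):
    """Trim whitespace and trailing slashes and lower-case the URL."""
    return url.strip().rstrip("/").lower()


def deduplicate(items):
    """Keep the first record per normalised URL, merging keywords from duplicates."""
    merged = {}
    for item in items:
        key = normalise_url(item["url"])
        if key not in merged:
            merged[key] = {**item, "keywords": list(item["keywords"])}
        else:
            merged[key]["keywords"].extend(item["keywords"])
    return list(merged.values())


def build_index(items, field):
    """Map each lower-cased value of `field` to the set of record URLs carrying it."""
    index = defaultdict(set)
    for item in items:
        values = item[field] if isinstance(item[field], list) else [item[field]]
        for value in values:
            index[value.lower()].add(normalise_url(item["url"]))
    return index


unique = deduplicate(records)
keyword_index = build_index(unique, "keywords")
provider_index = build_index(unique, "provider")
print(len(unique), sorted(keyword_index["rna-seq"]))
```

Indexes of this shape make keyword or provider lookups a single dictionary access, which is one plausible reason to precompute them before exposing search tools.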

Nov 26, 2025
https://doi.org/10.37044/osf.io/xvrud_v1

BioHackEU25 report: Scop3PTM Next - Interactive visualization of PTM data across sequence, structure and interactions

Scop3PTM Next was developed during BioHackathon Europe 2025 to address the need for integrated visualization of protein-centric data across sequence (and modification), interaction and structural contexts. The project delivers an open-source library of modular JavaScript components, built with Vue.js and documented with Storybook, enabling reusable and interoperable visualizations. The framework provides 1D sequence tracks, contact-map networks and interactive 3D structural renders, using MolSpecView for rendering 3D structures linked to Nightingale 1D tracks. Together, these components offer a unified interface for exploring PTM features across multiple representational layers. This work establishes the basis for a community-oriented visualization library to support proteomics analysis.
