Meetings
BioHackSWAT4HCLS 2025
BioHackathon Europe 2024
3rd BioHackathon Germany
DBCLS BioHackathon 2024
ELIXIR INTOXICOM
Recent preprints
-
Secure Processing Environments as a Service in the de.NBI Cloud
Sensitive human data is crucial for biomedical research, enabling faster drug development and better understanding of diseases. The Biohackathon Germany project utilized ELIXIR Europe’s services and external tools to create Secure Processing Environments, ensuring high protection of sensitive data while facilitating research across Germany and Europe. -
Report: Workshop on connecting Knowledge Graphs with BioChatter
The workshop on connecting Knowledge Graphs (KGs) with BioChatter convened experts from biology, computer science, and bioinformatics to tackle challenges in integrating and accessing dispersed datasets in plant sciences. The goal was to create user-friendly interfaces for querying these datasets using natural language, bypassing the need for expertise in semantic technologies or query languages like SPARQL or Cypher.
Key use cases included the BrAPI project, which aimed to simplify data retrieval from plant research datasets. While BioChatter effectively generated simple API queries, complex multi-step queries posed challenges, suggesting that programmatic approaches are better suited for such tasks. Integrating BrAPI with BioCypher enabled successful querying of KGs for questions like identifying studies involving specific plant varieties.
The RDF adapter use case focused on enhancing the Plant Phenotyping Experiment Ontology (PPEO) by converting it into a BioCypher-compatible KG, thereby improving data interoperability and enabling LLMs to generate context-aware responses. The Mobile Element Knowledge Graph use case explored the relationship between transposable elements and gene regulation networks, utilizing BioChatter to assist users unfamiliar with Cypher.
The Stress Knowledge Map (SKM) use case integrated a highly curated model of plant stress signaling with BioCypher and BioChatter, allowing natural language queries and improving access to complex biological data. The Chem and Plant KG use case aimed to integrate diverse scientific resources into a unified KG, enhancing data interoperability and accessibility.
Challenges included the need for human-readable concepts within KGs to improve LLM interaction and aligning LLMs with user demands. Future work will focus on refining KG schemas, improving LLM integration, and expanding documentation to support broader adoption and utility in scientific research.
The workshop highlighted the potential of combining KGs with LLMs to enhance data accessibility and drive new insights in biological and agricultural sciences. -
Simplifying and Standardizing the Creation of Data Use Agreements for Life Sciences and Beyond - BH Germany2024
The primary goal of this project is to develop a web application for creating data use agreements (DUAs) in a way that allows automated evaluation of access permissions. Specifically, we want to adhere to the Open Digital Rights Language (ODRL) standard [1] and model permissions and prohibitions for the use of digital objects. ODRL is a policy expression language developed and adopted by the W3C. It provides a flexible and interoperable data model and vocabulary to enable fine-grained statements about the use of digital content and services. Recently, the Data Governance Act (DGA) was published as an implementation of the EU Data Act and defined roles for data intermediaries such as data trustees with certain prohibitions and obligations. The high expectations placed on data trustees require that they have technical measures in place to facilitate the negotiation and enforcement of data use agreements. The “Ethical, Legal & Social Aspects” section of the NFDI (ELSA) has also issued a statement on the DGA [2], demonstrating the importance of this issue.
Usually, DUAs are negotiated individually between parties and are not stored in a machine-readable format, which prevents automated modeling and verification of access rights for digital objects. Our web application will allow users to create a DUA step by step via a configurable graphical user interface that uses the ODRL data model in the background. This enables legal laypersons to create data use agreements without much effort. The use of ODRL makes it possible to query the data use agreements programmatically and to answer access requests automatically. To this end, the resulting DUAs can be persisted and queried through an API, e.g. according to the GAIA-X specifications of Eclipse Dataspace Components (EDC) [3, 4], to exchange data in compliance with rules and policies. Additionally, we want to address the integration of ODRL policies in FDOs, such as the ARC-RO-Crate of the DataPlant consortium, and discuss extensions of the RO-Crate profiles.
Finally, for legal review and formal signing, negotiated DUAs can be rendered as PDFs.
In summary, we will simplify the process of creating DUAs by adhering to international standards and will contribute to efforts to harmonize technical solutions, as the EOSC describes ODRL as a core metadata schema for legal interoperability [5]. DUAs can serve as a platform to gain the trust of data owners with protected, sensitive data and thus enable access to such resources. The project is in line with ELIXIR-DE/de.NBI’s objective to improve the accessibility of resources and to ensure efficient, interoperable and secure resource sharing. It also aligns with the goals of the NFDI by handling sensitive data and enabling data protection, especially when dealing with data from the health sector, but also when handling agronomic data such as land survey data or data from breeding programs.
The project is a joint activity of the Leibniz IPK in Gatersleben (ELIXIR-DE/de.NBI Service Center GCBN) and the Justus-Liebig-University Giessen (ELIXIR-DE/de.NBI Service Center BiGi) and contributes to NFDI4Biodiversity, FAIRAgro, NFDI4Microbiota, DataPlant and FAIR-DS as well as to European initiatives such as EOSC and Gaia-X. -
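To illustrate what answering an access request automatically could look like, here is a minimal Python sketch. It is not the project's implementation. The agreement is a hand-written dictionary that loosely follows the ODRL JSON-LD shape, and all IRIs, the `AGREEMENT` constant and the `is_allowed` helper are hypothetical. The sketch also assumes that a matching prohibition always overrides a permission, and it ignores ODRL constraints, duties and the action hierarchy.

```python
from typing import Any

# Hypothetical agreement, loosely following the ODRL JSON-LD shape.
# Every IRI below is a made-up placeholder, not a real resource.
DATASET = "https://example.org/dataset/42"
OWNER = "https://example.org/party/data-owner"
LAB_A = "https://example.org/party/lab-a"

AGREEMENT: dict[str, Any] = {
    "@context": "http://www.w3.org/ns/odrl.jsonld",
    "@type": "Agreement",
    "uid": "https://example.org/dua/1",
    "permission": [
        {"target": DATASET, "assigner": OWNER, "assignee": LAB_A, "action": "read"},
    ],
    "prohibition": [
        {"target": DATASET, "assigner": OWNER, "assignee": LAB_A, "action": "distribute"},
    ],
}


def _matches(rule: dict[str, Any], assignee: str, action: str, target: str) -> bool:
    """Return True if the rule covers exactly this assignee, action and target."""
    return (
        rule.get("assignee") == assignee
        and rule.get("action") == action
        and rule.get("target") == target
    )


def is_allowed(policy: dict[str, Any], assignee: str, action: str, target: str) -> bool:
    """Evaluate an access request against the policy.

    Assumption for this sketch: prohibitions take precedence, and anything
    that is not explicitly permitted is denied.
    """
    if any(_matches(r, assignee, action, target) for r in policy.get("prohibition", [])):
        return False
    return any(_matches(r, assignee, action, target) for r in policy.get("permission", []))


if __name__ == "__main__":
    print(is_allowed(AGREEMENT, LAB_A, "read", DATASET))        # permitted by the agreement
    print(is_allowed(AGREEMENT, LAB_A, "distribute", DATASET))  # explicitly prohibited
```

A real evaluator would also need to handle constraints (for example time limits or purpose restrictions) and the conflict strategy declared in the policy itself.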
BioHackEU24 report: ORCID and ROR identifiers in BioHackrXiv reports
The first BioHackrXiv preprint was published in 2020, on a platform built around Markdown, and just weeks ago, BioHackrXiv published its 100th preprint. The machine-readable metadata added to the Markdown includes the title, keywords, the author names, their affiliations, and details about the BioHackathon event the preprint is related to. In 2020, the metadata already supported listing the ORCID identifiers of the authors, but these were not added to the author list in the generated PDF. This report describes two improvements of the platform: visualization of the ORCID identifiers in the preprint PDF and support for Research Organization Registry (ROR) identifiers of the affiliations. -
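As a small illustration of working with such identifiers, the following Python sketch checks whether an ORCID iD given in author metadata is well formed. It is not taken from the BioHackrXiv platform. It relies on the publicly documented ORCID check digit, which uses the ISO 7064 MOD 11-2 scheme, and the function names are hypothetical.

```python
import re

# Four hyphen-separated groups of four characters; the last one may be "X".
ORCID_PATTERN = re.compile(r"^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the string has the ORCID layout and a correct check character."""
    if not ORCID_PATTERN.match(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]


if __name__ == "__main__":
    for candidate in ("0000-0002-1825-0097", "0000-0002-1825-0098"):
        print(candidate, is_valid_orcid(candidate))
```

A check like this only confirms that the identifier is syntactically valid; it does not confirm that the iD is registered or belongs to the named author.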
BioHackGermany24 report: User-friendly ISA-based metadata annotation of life science experiments with ISA Wizard
As a contribution to BioHackathon Germany 2024, the following report details the work conducted during that event for project 8: User-friendly ISA-based metadata annotation of life science experiments with ISA Wizard. -
BioHackJP24 report: Running a WikiBlitz
During BioHackathon 24 in Fukushima, we organized a WikiBlitz, a collaborative effort to integrate biodiversity observations from iNaturalist into the Wikimedia ecosystem. A WikiBlitz is inspired by the concept of a BioBlitz, where participants document as many species as possible within a limited time frame, while also contributing structured data to Wikidata, Wikimedia Commons, and Wikipedia. In this report, we describe the methodology and outcomes of this event, including the collection of 109 biodiversity observations and their subsequent verification and integration into Wikimedia platforms. We highlight the challenges and best practices for running a WikiBlitz, particularly around licensing and data quality, and demonstrate how tools such as iNaturalist2Commons and Wikidata queries can enhance the reuse of citizen science data. Finally, we provide a step-by-step tutorial to support future WikiBlitz events, ensuring broader participation and sustainable knowledge-sharing across platforms. -
BioHackEU24 report: Bioschemas for Mortals
We report here on the progress of project #10: “Bioschemas for Mortals” from BioHackathon Europe 2024. The goal of this project is to reimagine, reframe and supplement the existing Bioschemas guidance. We identified patterns of use, commonly undertaken tasks, and user personas and roles. This information will be used to identify what less technical users need, ultimately providing specific code examples that can be copied and pasted, documented examples for different web setups, and customised guidance for different personas, while also addressing usability and content accessibility. We will also use the learnings from the Bioschemas hackathons to progress more quickly on the domain-agnostic schemas.sci site.