Meetings

Recent preprints

  • How to increase the findability, visibility, and impact of Galaxy tools for your scientific community

    The scale and diversity of available software options in the Galaxy ecosystem can make domain- or community-specific discovery of software challenging. Here, we present a semi-automated and reusable pipeline for creating tailored interactive tables that list the identity and metadata (e.g. bio.tools, EDAM) available for Galaxy tools in a specific community (e.g. microGalaxy, imaging). We also describe an annotation framework to improve the quality of the table contents, and training material to support the reuse of both the pipeline and table by additional communities. Together, these contributions are expected to make it easier for Galaxy users to discover and understand the software within their research area, improve the annotation of these software resources, and allow other domains to enable equivalent discovery processes for their community. This work is the outcome of a BioHackathon Europe 2023 project.
  • BioHackEU23: FAIR Workflow Execution with WfExS and Workflow Run Crate

    FAIR Computational Workflows argues that workflows should be FAIR scholarly community research objects in their own right as a kind of FAIR Research Software. In this project we go one step further, and argue that workflow executions should also be published with sufficient traces and structured metadata. Workflow Run RO-Crate is a set of profiles of RO-Crate that capture workflow provenance in a lightweight FAIR data package based on existing standards, in order to support traceability, reproducibility and interoperable description of diverse computational analyses. This use of RO-Crate allows the contextualization of a computational workflow and its execution, e.g. relating to people, organisations, projects, funding, data sources and wider research questions and studies. We have implemented the profile in multiple workflow systems, including Galaxy, COMPSs, StreamFlow, WfExS, Sapporo and Autosubmit. The command line tool runcrate can convert from the precursor CWLProv and display or validate crates according to the profiles. The crates are compatible with ELIXIR’s WorkflowHub and support increasing levels of detail, including documenting ad-hoc scripts without a workflow engine. WfExS is a workflow orchestrator designed for reproducible and secure workflow executions in isolated environments (like HPC). Every input, workflow and container being used in an execution must have either a public or permanent identifier, or at least a resolvable URI, so the execution scenario can be materialised. The execution scenario before and/or after the execution can be saved to RO-Crate. Here we bring together FAIR Computational Execution practitioners to mature and generalise this approach using Workflow Run Crate.
  • Building Towards a Machine-Actionable Software Management Plan: A BioHackathon Europe 2023 Report

    This report provides an overview of our activities and accomplishments concerning machine-actionable Software Management Plans (SMPs) and the Software Management Wizard (SMW) during the ELIXIR BioHackathon Europe 2023. ELIXIR acknowledges the critical role of effective software management in facilitating sustainable and reproducible research outcomes. The Software Best Practices group is actively committed to establishing a robust framework for SMP creation. In this project, our primary focus is on streamlining the SMP creation process for research software within ELIXIR. To achieve this, we are developing essential integrators and identifying and reviewing the relevant metadata schema. This effort is closely aligned with related initiatives such as OpenEBench, FAIR4RS, RDA and maSMPs, among others. The outcomes of the BioHackathon project are now available for immediate use and can be further refined in the future based on community feedback and advancements in research software best practices.
  • Metadata handling for BioHackathon publications through BioHackrXiv

    This paper presents the work executed on BioHackrXiv during the international ELIXIR BioHackathon Europe in Paris, France, 2022. BioHackrXiv is a scholarly publication service for BioHackathons and codefests that target biology and the biomedical sciences in the spirit of pre-publishing platforms.
  • Genome Annotation and Other Post-Assembly Workflows for the Tree of Life

    Rapid advances in genome sequencing technologies have resulted in an explosion of reference-quality genome assemblies across the tree of life. While these resources will be invaluable towards goals of species and biodiversity conservation, their application is limited when they lack accurate annotations of their functional elements. The European Reference Genome Atlas (ERGA) is the European node of the Earth Biogenome Project (EBP) and aims to share resources and knowledge to create fully-annotated reference genomes. ERGA strives to do this in a distributed manner, bringing together researchers from across the world, with common goals and understandings. At BioHackathon Europe 2023, we came together to construct and test tools, pipelines and workflows for annotating protein-coding regions in assembled genomes. We specifically aimed to evaluate (a) the performance in a wide variety of non-model organisms and (b) the “usability” of pipelines for newcomers to annotation. This work required installing and implementing tools in a number of computational environments and infrastructures, sharing both genomic resources and expertise between researchers from a range of institutes, and evaluating the performance of annotation workflows and the input data required to achieve a high-quality genome annotation. Here we present the results of over 20 researchers in 8 time zones working towards a robust implementation of genome annotation workflows in eukaryotic organisms.
  • Improving Bioschemas creation and community adoption through process improvements, tool development, and advancing compliance to FAIR standards

    Scientists in many communities now produce diverse datasets at scale and need to combine them to answer new scientific questions. To do so, these diverse computational resources must first be findable by search engines. Bioschemas provides a simple and lightweight mechanism to annotate online resources in a standardized way and expose key metadata. To improve the accessibility and value of Bioschemas to existing and emerging communities, we aim to develop an automated system to assess the adoption of Bioschemas, work with identified groups that have specific needs addressable by Bioschemas, address usability issues in the Bioschemas profile and type development process, and extend the reach of Bioschemas by making it available in a domain-agnostic manner.
  • Bioschemas Resource Index for Chem and Plants

    As part of the BioHackathon Europe 2023, we report here on the progress of the hacking team preparing a resource index and knowledge graph based on the JSON-LD Bioschemas markup from several resources in the life and natural sciences, predominantly from the fields of plant and (bio)chemistry research. This preliminary analysis will allow us to better understand how Bioschemas markup is currently used in these two communities, so we can take actions to improve guidelines and validation on the Bioschemas markup side and the data providers' side. The lessons learnt will be useful for other communities as well. The ultimate goal is facilitating and improving interoperability across resources.
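
As a rough illustration of what building a resource index from JSON-LD Bioschemas markup could involve, the Python sketch below extracts `application/ld+json` script blocks from HTML pages and groups the marked-up items by their `@type`. It is not the hacking team's pipeline: the class `JsonLdExtractor`, the function `build_index`, the example.org URLs and the sample markup are all hypothetical and exist only for illustration, and the pages are inlined rather than fetched so the sketch runs without network access.

```python
import json
from collections import defaultdict
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect parsed contents of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed markup instead of failing the whole page


def build_index(pages):
    """Map each @type to a list of (page URL, item name) pairs.

    `pages` is a dict of URL -> HTML text (hypothetical input shape).
    """
    index = defaultdict(list)
    for url, html in pages.items():
        parser = JsonLdExtractor()
        parser.feed(html)
        parser.close()
        for doc in parser.documents:
            items = doc if isinstance(doc, list) else [doc]
            for item in items:
                if not isinstance(item, dict):
                    continue
                types = item.get("@type", "Unknown")
                if isinstance(types, str):
                    types = [types]
                for type_name in types:
                    index[type_name].append((url, item.get("name")))
    return dict(index)


# Hypothetical sample pages standing in for real data providers.
SAMPLE_PAGES = {
    "https://example.org/compound/1": (
        '<html><head><script type="application/ld+json">'
        '{"@context": "https://schema.org", "@type": "MolecularEntity", '
        '"name": "caffeine"}'
        "</script></head><body></body></html>"
    ),
    "https://example.org/dataset/7": (
        '<html><head><script type="application/ld+json">'
        '{"@context": "https://schema.org", "@type": "Dataset", '
        '"name": "Hypothetical plant phenotyping set"}'
        "</script></head><body></body></html>"
    ),
}

for type_name, entries in sorted(build_index(SAMPLE_PAGES).items()):
    print(type_name, entries)
```

Grouping by `@type` is only one possible way to organise such an index; a real analysis would likely also record which properties each provider fills in, since that is what guideline and validation improvements would act on.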