Meetings

Recent preprints

Jul 4, 2023

RDF Data integration using Shape Expressions
This paper reports on the activities carried out during the Biohackathon 2023 in Shodoshima, Japan, in a project on RDF data integration using Shape Expressions. It describes several approaches that were discussed for creating RDF data subsets, along with preliminary results from applying some of those technologies. It also describes the work done to compare RDF data modeling approaches such as ShEx, LinkML and the YAML files from rdfconfig.
Jul 1, 2023

Evaluating Oxigraph Server as a triple store for small and medium-sized datasets
With the escalating complexity and volume of bioinformatics data, there is a growing demand for efficient and versatile triplestore technologies. Contemporary programming languages, such as Rust, address constraints identified in traditional languages by emphasizing safety, performance, and developer experience. An example of this modern approach is Oxigraph, a Rust-based graph database that manages graph data proficiently and predominantly targets single-node use cases. Despite its genesis as a hobby project, Oxigraph yields competitive performance on straightforward Online Transaction Processing (OLTP) workloads and shows considerable potential for future refinement. This study is a comprehensive appraisal of the Oxigraph server's efficacy in distinct use cases, going beyond typical SPARQL performance. The evaluation examines various operational aspects, including data loading, backup procedures, deployment strategies, maintenance protocols, and overall server usability. The evaluation used a subset of the PDB/RDF archive and the complete chem_comp/RDF archive, totaling around 0.5 billion triples.
Jun 15, 2023

BioHackathon Europe 2022 Paper for Project 3: Bioinforming
Short courses are optimal formats to inform and engage young students in novel biology-related fields. Training schools, e.g. those lasting for five days, can provide enough content to give students an extensive overview of bioinformatics and scientific career opportunities. In this work, we define a five-day training school format tailored to three target groups of young students: high school students, undergraduate students in biology-related fields and undergraduate students in computational fields. We structure the content and sessions around learning areas consisting of learning topics, detailing the dependencies between them. For each learning topic, we define learning outcomes and learning activities. Moreover, we conceptualize a teaching platform to manage FAIRyfied (Findable, Accessible, Interoperable, Reusable) training materials that anyone will be able to use to design a new training school in bioinformatics.
Jun 15, 2023

BioHackEU22 Report for Project 31: The What & How in data management: Improving connectivity between RDMkit and FAIR Cookbook
This report describes the work completed during the ELIXIR Biohackathon 2022 for project 31: The What & How in data management: Improving connectivity between RDMkit and FAIR Cookbook. The project covered three subjects: the technical connectivity between the two primary resources, an editorial alignment and gap analysis of their content, and the creation of user journeys incorporating the wider ELIXIR Research Data Management (RDM) ecosystem.
Apr 6, 2023

Operator dashboard for controlling the NeIC Sensitive Data Archive
Human genome and phenome data are classified as special category data under the EU GDPR (Art. 9 GDPR). This requires special care to be taken when processing and reusing these data for research. To enable this in a compliant way, a federated approach was applied to the existing European Genome-phenome Archive (EGA, https://ega-archive.org/) (Freeberg et al., 2022), creating the Federated EGA (FEGA, https://ega-archive.github.io/FEGA-onboarding/#what-is-federated-ega) (EGA Consortium, n.d.) in 2022. The Nordic countries Norway, Finland and Sweden, together with Spain and Germany, represent the first federated partners. In the Nordics we have collaborated on our own implementation for our federated EGA nodes. We have done this under the umbrella of the Nordic e-Infrastructure Collaboration (NeIC, https://neic.no/) (NeIC, n.d.), where we have had three projects over the last 7 years: Tryggve1 (NeIC, 2014-2017), Tryggve2 (NeIC, 2017-2020) and now Heilsa (NeIC, 2021-2024). As we in the Nordics now move into production, both system administrators and helpdesk staff need to be able to control and inspect the system. We need to answer questions related to operations and identify errors in order to better manage the services and infrastructure. To standardize this workflow and make the system easier to use, we decided to build a Minimum Viable Product (MVP) for such an “Operator Dashboard” during the ELIXIR Biohackathon 2022.
Apr 6, 2023

Improving Metadata Collection and Aggregation in Plant Phenotyping Experiments with MIAPPE Wizard and DataPLANT
As part of the BioHackathon Germany 2022, we report on the success of the two projects “MIAPPE Wizard: Enabling easy creation of MIAPPE-compliant ISA metadata for Plant Phenotyping Experiments” and “DataPLANT - Facilitating Research Data Management to combat the reproducibility crisis”. Shortly before the hackathon, it became apparent to the participants that close coordination between the projects would be very beneficial. Both projects aimed to improve the process of collecting and aggregating metadata on plant experiments, but with different approaches.
Apr 6, 2023

Onboarding suite for Federated EGA nodes
The European Genome-phenome Archive (EGA) (Freeberg et al., 2022), also known as Central EGA (cEGA), is a service for permanent archiving and sharing of personally identifiable genetic and phenotypic data resulting from biomedical research projects. The Federated EGA (EGA Consortium, n.d.), consisting of the Central and Federated EGA nodes, will be a distributed network of repositories for sharing human -omics data and phenotypes. Each node of the federation is responsible for its own infrastructure and its connection to the Central EGA. Currently, the adoption and deployment of a new federated node is challenging due to the complexity of the project and the diversity of technological solutions used to ensure the secure archiving of the data and the transfer of information between the nodes. The goal of this project was to develop an onboarding suite consisting of simple scripts, supplemented by documentation, that would help newcomers to the EGA federation understand the main concepts in depth, while enabling them to get involved in the development of the technology as quickly as possible. At the same time, we aimed to identify existing technologies and standards across FEGA nodes that can be used as a reference for upcoming nodes.
