Meetings

Recent preprints

  • BioHackEU23 report: Extending interoperability of experimental data using modular queries across biomedical resources

    This report provides an overview of the significant accomplishments achieved during the ELIXIR Biohackathon 2023 under Project 17: “Extending interoperability of experimental data using modular queries across biomedical resources”. The project diligently addressed four key aspects: the expansion of data resources, the creation of knowledge graphs, advancements in data visualization, and the development of a use-case-driven pipeline. The collective efforts during the Biohackathon aimed to enhance the integration and accessibility of experimental data across diverse biomedical resources by developing a tool named BioDataFuse.
  • The fourth annual Carnegie Mellon Libraries hackathon for biomedical data management, knowledge graphs, and deep learning

    In October 2023, a group of 44 scientists hailing from several U.S. states, Canada, Poland, and Switzerland came together for a hybrid in-person and virtual hackathon. The event was jointly hosted by Carnegie Mellon University Libraries and DNAnexus, a California-based cloud computing and bioinformatics company. This collaborative effort revolved around the theme of “Data Management and Graph Extraction for Large Transformer Models in the Biomedical Space.” In the spirit of fostering collaboration, participants organized themselves into five teams, which ultimately resulted in the successful completion of four hackathon projects. These projects encompassed a wide range of topics, from detecting features contributing to virus susceptibility to validating models using knowledge graphs. Repositories for the hackathon projects are available at https://github.com/collaborativebioinformatics. We hope that the insights and experiences shared by these teams, as detailed in the following manuscript, will prove valuable to the broader scientific community.
  • Rendering co-author graphs using linked-open-data from Wikidata

    Wikidata is the linked-open-data graph of the Wikimedia Foundation, whose best-known sibling project is Wikipedia (Vrandečić, 2012). What Wikipedia is to text, Wikidata is to data. As in Wikipedia, linked data can be added by anyone, for everyone. This makes Wikidata a very rich source of data. A substantial part of the data on Wikidata is about scientific publications and their authors (Taraborelli et al., 2016). Scholia is a tool that uses this data to create profile pages for authors and publications (Nielsen et al., 2017). This report describes a workflow to create co-author graphs using the data from Scholia.
  • Exploring the landscape of the genomic wastewater surveillance ecosystem: a roadmap towards standardization

    The landscape of genomic wastewater surveillance in the context of infectious disease monitoring is rapidly evolving, and this came into sharp focus during the COVID-19 pandemic. Here we highlight the significance of wastewater surveillance as a passive monitoring system complementary to clinical genomic surveillance activities. Emphasizing the need for coordination, standardization, and the development of a unified catalog of software tools and services, we aim to streamline the implementation of end-to-end genomic wastewater surveillance pipelines. Key considerations such as defining variants, understanding antimicrobial resistance, and assessing viral fitness within the framework of wastewater surveillance are explored, linking to examples of respective tools and existing pipelines. The challenges of wastewater data analysis, the need for specialized tools and bioinformatics workflows, and the significance of integrated pipelines are also discussed in detail. The article presents case studies, including the V-pipe integrated bioinformatics workflow and the integration of tools into the Galaxy platform, underscoring their role in enhancing data analysis efficiency and standardization within the field. Overall, the review highlights the critical importance of continued research efforts to advance understanding and implementation of bioinformatic approaches in wastewater surveillance for the effective monitoring and management of infectious diseases.
  • Efforts to analyze pathways in non-model organisms

    In addition to the functional annotation of genes, assigning genes to pathways is important in current molecular biology. However, pathway diagrams are required in order to assign genes to their nodes. It is therefore important to draw pathway diagrams with genes and metabolites assigned. Existing metabolic pathway databases focus on generic pathways, whereas secondary metabolism matters most in organisms that produce useful substances. Moreover, these databases cannot accept third-party annotation of their data. A practical system for pathway analysis is therefore needed. Following on from the previous BioHackathon (BH23), we first discussed how to create a database of pathway information for non-model species at a domestic version of the BioHackathon called BH23.9, held in Shirahama, Wakayama, Japan (25-29 September 2023). We then gave a tutorial on how to draw a pathway diagram using PathVisio, a free, open-source software tool for drawing, editing, and analyzing biological pathways. Finally, we tried to establish a system for converting text data to Graphical Pathway Markup Language (GPML), called txt2gpml. txt2gpml will drastically reduce the time and effort required to create pathway diagrams. Through stimulating discussions at BH23 and BH23.9, we were able to clarify the current issues in pathway analysis for non-model organisms.
  • BioHackJP 2023 Report R3: Plant data integration for findability across multiple databases

    Plant research generates vast amounts of heterogeneous data that are available in dispersed repositories. Accessing, integrating, and analyzing these datasets is therefore a challenge, owing to their low findability and to the variability of formats and standards. Several solutions, including data standards (MIAPPE, BrAPI) and portals (FAIDARE), are recommended by the ELIXIR plant community through the RDMkit plant pages. The BioHackathon Japan 2023 was an ideal event to promote those solutions to Japanese researchers and bioinformaticians, in order to increase the visibility of Japanese databases in the plant research data discovery portal FAIDARE and to explore the use of the Breeding API for knowledge graphs.
  • BioHackEU22 Report: Enhancing Research Data Management in Galaxy and Data Stewardship Wizard by utilising RO-Crates

    This report describes the integration of RO-Crates into Data Stewardship Wizard and Galaxy during the BioHackathon Europe 2023, aiming to improve data management and sharing in scientific research. By utilizing RO-Crates, researchers can easily create machine-readable metadata for their datasets, ensuring long-term discoverability, accessibility, and reusability. The seamless integration of RO-Crates in these platforms enhances collaboration between researchers and institutions, facilitating data sharing and reuse across projects and domains. Future efforts may focus on enhancing RO-Crate’s interoperability with other standards and platforms, as well as promoting wider adoption through outreach and education initiatives to meet the evolving needs of researchers and institutions in data stewardship.
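
As an illustration of the machine-readable metadata mentioned in the RO-Crate report above, the following is a minimal sketch of an `ro-crate-metadata.json` document built with the Python standard library. It follows the general shape of the RO-Crate 1.1 specification (a metadata descriptor, a root dataset, and one entity per file). It is not the export code used by Galaxy or Data Stewardship Wizard; the function name `build_ro_crate_metadata` and all example values are hypothetical.

```python
import json


def build_ro_crate_metadata(name, description, date_published, license_url, files):
    """Return a minimal RO-Crate 1.1 style metadata document as a dict.

    `files` maps a relative file path to a human-readable name.
    This helper is a hypothetical illustration, not part of any library.
    """
    graph = [
        # Metadata descriptor: identifies this JSON file and the root dataset.
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        # Root dataset: describes the crate as a whole and lists its parts.
        {
            "@id": "./",
            "@type": "Dataset",
            "name": name,
            "description": description,
            "datePublished": date_published,
            "license": {"@id": license_url},
            "hasPart": [{"@id": path} for path in files],
        },
    ]
    # One File entity per data file in the crate.
    for path, label in files.items():
        graph.append({"@id": path, "@type": "File", "name": label})
    return {"@context": "https://w3id.org/ro/crate/1.1/context", "@graph": graph}


if __name__ == "__main__":
    # Example values are made up for demonstration only.
    crate = build_ro_crate_metadata(
        name="Example workflow results",
        description="Outputs of a hypothetical analysis run.",
        date_published="2023-11-01",
        license_url="https://creativecommons.org/licenses/by/4.0/",
        files={"results/table.csv": "Summary table"},
    )
    print(json.dumps(crate, indent=2))
```

Saving the printed JSON as `ro-crate-metadata.json` next to the listed files would give a directory that tools can inspect without knowing how the data was produced. A real crate would normally carry richer context, such as authors and workflow provenance.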