
From Data to Discovery: Managing Data and Knowledge in Biomedicine

Join the conversation on data & knowledge management in biomedicine and discover new ideas and practical tools for analyzing, sharing, and publishing your research.

2024 April 10 | Training for Rigor: Designing Online Educational Resources for the NIH CENTER/METER Initiative

Taralyn Tan, Ph.D., Assistant Dean for Educational Innovation and Scholarship & Ella Batty, Ph.D., Assistant Dean for Educational Programs at Kempner Institute

0:00 Introduction | 2:44 Main Presentation | 49:00 Q&A with Audience

2024 March 13 | Data Management for CryoEM

Shaun Rawson, Ph.D., CryoEM Computational Specialist | Cryo-EM@Harvard Medical School

0:00 Introduction | 1:15 Main Presentation | 41:10 Q&A with audience

Advances in cryoEM technologies have resulted in an increasing rate of data acquisition. With new detectors generating multiple terabytes per instrument in a single evening, this deluge has outstripped the field's ability to organise and handle the data. As a field, we are grappling with how to handle this data at the user and institutional level, and with how to organise and share information more widely. This presentation will cover data handling at the instrument from the facility perspective before exploring the wider topic of downstream management of cryoEM data. We will discuss integration with HMS-IT storage solutions, along with metadata handling and the unsolved challenges we still face.

2024 February 14 | Managing Data by Managing Metadata

Stuart Levine, Ph.D., Director of BioMicro Center | Massachusetts Institute of Technology

0:00 Introduction | 0:52 Main Presentation | 38:25 Q&A with audience

Data management is a critical challenge that must be addressed to improve the rigor and reproducibility of large projects. Adhering to Findable, Accessible, Interoperable, and Reusable (FAIR) standards provides a baseline for meeting these requirements. Although many existing repositories handle data in a FAIR-compliant manner, connecting these datasets coherently is a growing challenge in an increasingly multi-omic and multi-institutional environment. We have developed NExtSEEK, a data management platform for creating highly structured, warehoused metadata that is compatible with deposition in the public repository fairdomhub.org. This metadata management platform is currently used by the IMPAcTB program, the MIT superfund research program, and the MIT Metastasis Network program.

2023 November 15 | Analysis & Management Pipelines for Large-Scale Neuroscience Imaging & Electrophysiology Data

Christopher D Harvey, Ph.D., Professor of Neurobiology | Harvard Medical School

Cindy Yuan, graduate student | Harvard Medical School

0:00 Introduction | 1:10 Main Presentation | 43:26 Q&A with audience

2023 September 20 | Managing the image-data life cycle for the real world

Caterina Strambio De Castillia, Ph.D. | CZI Imaging Scientist, Assistant Professor of Molecular Medicine, UMass Chan Medical School

0:00 Introduction | 2:00 Main Presentation | 47:12 Q&A with audience

Rigorous and quantitative cell science crucially depends on the generation of high-quality datasets in which all relevant information (i.e., metadata) about a microscopy experiment is reported using FAIR (Findable, Accessible, Interoperable, Reusable) principles. Significant advances in spatiotemporal resolution have led to ever-expanding microscopy datasets which, without agreed-upon community guidelines, are challenging to analyze quantitatively (including with AI-assisted strategies), reproduce, and re-use. To overcome this hurdle, it is essential to integrate community-specified image documentation and quality-control guidelines within easy-to-use Research Data Management (RDM) software tools and pipelines to support the streamlined execution, tracking, and documentation of the full life-cycle of image data from sample preparation, image acquisition and analysis to publication and sharing (i.e., data provenance).

2023 June 14 | Collaborative tools and approaches for FAIR data sharing and team science

Adam Taylor, Ph.D. | Senior Research Scientist, Sage Bionetworks

0:00 Introduction | 2:30 Main presentation | 37:45 Q&A with audience

As biological research has grown increasingly data-intensive, collaboration among researchers with diverse expertise and resources has become essential. At Sage Bionetworks, we work with funders and researchers to coordinate data distribution under FAIR principles and to help “teams of teams” balance incentives and achieve research goals. Our interdisciplinary team of data curators, scientists, engineers, designers, and governance experts builds tools and systems to enable this, including our NIH-recognized data repository Synapse. Our flexible approach ensures secure and adaptive stewardship, curation, and sharing of data and metadata, meeting the unique needs of each research community. We aim to make biomedical data widely available and usable, directly engaging research communities and leveraging team science-based strategies to support collaborative science. In this seminar, we will share our approach to accelerating collaborative research; our work with large consortia such as the Human Tumor Atlas Network; how you can use Synapse today; and ways of working with us to implement and enhance your data management and sharing plans.

2023 May 17 | Promoting rigor & reproducibility in fluorescence microscopy through accessible reporting tools & resources

Paula Montero Llopis, Ph.D. | Director of MicRoN Core, Harvard Medical School

0:00 Introduction | 3:02 Main presentation | 47:40 Q&A with audience

Over the past decade, biomedical research has become more quantitative and interdisciplinary. The development and advancement of new tools in light microscopy and data analysis, especially open-source methods, have played a significant role in this shift, enabling breakthroughs in biomedicine. As a result, researchers can tackle more challenging questions and obtain a deeper understanding of complex biological systems than ever before. However, this rapid development presents new challenges for researchers, as an in-depth knowledge of each technology is needed to appreciate its impacts on bias and reproducibility. In this seminar, we discuss the factors that affect microscopy data and conclusions, and we provide tools and resources for designing rigorous, reproducible microscopy experiments and for reporting microscopy methods appropriately.

2023 April 12 | Accelerating discovery in biomedicine with machine-assisted data annotation and knowledge assembly

Benjamin M. Gyori, Ph.D. | Assistant Professor, jointly appointed in Khoury College of Computer Sciences & Bioengineering

0:00 Introduction | 6:38 Main presentation | 50:41 Q&A with audience

Making novel scientific discoveries requires integrating biomedical data and knowledge from diverse sources. However, merging disparate sets of information is time-consuming and error-prone due to challenges like inconsistent naming conventions and the use of incompatible identifier resources. To address this, we introduce the Biopragmatics project, a new set of community standards and software tools to annotate data sets and make them easier to integrate. Then, we discuss the INDRA software system, which automatically assembles data and knowledge from large-scale automated processing of literature and pathway databases. We demonstrate how the knowledge assembled by INDRA can be used to generate mechanistic models and networks of biological systems, and to inform novel hypotheses that can advance the field of biomedicine.
