A FAIR-Based Approach to Enhancing the Discovery and Re-Use of Transcriptomic Data Assets for Nuclear Receptor Signaling Pathways

Authors

  • Scott A. Ochsner, Nuclear Receptor Signaling Atlas (NURSA) Informatics, One Baylor Plaza, Houston, Texas, 77030, US; Department of Molecular and Cellular Biology, Baylor College of Medicine, One Baylor Plaza, Houston, Texas, 77030, US
  • Yolanda F. Darlington, Nuclear Receptor Signaling Atlas (NURSA) Informatics, One Baylor Plaza, Houston, Texas, 77030, US; Dan L. Duncan Cancer Center Biomedical Informatics Group, Baylor College of Medicine, One Baylor Plaza, Houston, Texas, 77030, US
  • Apollo McOwiti, Nuclear Receptor Signaling Atlas (NURSA) Informatics, One Baylor Plaza, Houston, Texas, 77030, US; Dan L. Duncan Cancer Center Biomedical Informatics Group, Baylor College of Medicine, One Baylor Plaza, Houston, Texas, 77030, US
  • Wasula H. Kankanamge, Nuclear Receptor Signaling Atlas (NURSA) Informatics, One Baylor Plaza, Houston, Texas, 77030, US; Dan L. Duncan Cancer Center Biomedical Informatics Group, Baylor College of Medicine, One Baylor Plaza, Houston, Texas, 77030, US
  • Alexey Naumov, Nuclear Receptor Signaling Atlas (NURSA) Informatics, One Baylor Plaza, Houston, Texas, 77030, US; Dan L. Duncan Cancer Center Biomedical Informatics Group, Baylor College of Medicine, One Baylor Plaza, Houston, Texas, 77030, US
  • Lauren B. Becnel, Nuclear Receptor Signaling Atlas (NURSA) Informatics, One Baylor Plaza, Houston, Texas, 77030, US; Dan L. Duncan Cancer Center Biomedical Informatics Group, Baylor College of Medicine, One Baylor Plaza, Houston, Texas, 77030, US
  • Neil J. McKenna, Nuclear Receptor Signaling Atlas (NURSA) Informatics, One Baylor Plaza, Houston, Texas, 77030, US; Department of Molecular and Cellular Biology, Baylor College of Medicine, One Baylor Plaza, Houston, Texas, 77030, US; ORCID: https://orcid.org/0000-0001-6689-0104

DOI:

https://doi.org/10.5334/dsj-2017-011

Keywords:

Transcriptomics, datasets, findability, accessibility, interoperability, re-use

Abstract

Public transcriptomic assets in the nuclear receptor (NR) signaling field hold considerable collective potential for exposing underappreciated aspects of NR regulation of gene expression. This potential is undermined, however, by a series of enduring informatic pain points that impede the routine re-use of these datasets. Here we describe a coordinated biocuration and web development approach to redress this situation that is closely aligned with the ideals articulated in the FAIR (findable, accessible, interoperable, re-usable) principles of data stewardship. To improve findability, biocurators engage authors of studies in collaborating journals to secure datasets for deposition in public archives. Annotated derivatives of the archived datasets are assigned digital object identifiers and regulatory molecule identifiers that support persistent linkages between datasets and their associated research articles, integration in relevant records in gene and small molecule knowledgebases, and indexing by dataset search engines. To enhance their accessibility and interoperability, datasets can be visualized in responsively designed web pages and retrieved either as machine-readable spreadsheets or through an application programming interface. Re-use of the datasets is supported by their interrogation as a universe of data points through the Transcriptomine search engine, highlighting transcriptional intersections between NR signaling pathways, physiological processes and disease states. We illustrate the value of our approach in connecting disparate research communities through a use case of persistent interoperability between the Nuclear Receptor Signaling Atlas and the Pharmacogenomics Knowledgebase. Our FAIR-aligned model demonstrates the enduring value of discovery-scale datasets that accrues from their systematic compilation, biocuration and distribution across the digital biomedical research enterprise.

Published

2017-03-23

Section

Practice Papers
