A FAIR-Based Approach to Enhancing the Discovery and Re-Use of Transcriptomic Data Assets for Nuclear Receptor Signaling Pathways
DOI:
https://doi.org/10.5334/dsj-2017-011Keywords:
Transcriptomics, datasets, findability, accessibility, interoperability, re-useAbstract
Public transcriptomic assets in the nuclear receptor (NR) signaling field hold considerable collective potential for exposing underappreciated aspects of NR regulation of gene expression. This potential is undermined however by a series of enduring informatic pain points that retard the routine re-use of these datasets. Here we describe a coordinated biocuration and web development approach to redress this situation that is closely aligned with ideals articulated in the FAIR (findable, accessible, interoperable, re-usable) principles on data stewardship. To improve findability, biocurators engage authors of studies in collaborating journals to secure datasets for deposition in public archives. Annotated derivatives of the archived datasets are assigned digital object identifiers and regulatory molecule identifiers that support persistent linkages between datasets and their associated research articles, integration in relevant records in gene and small molecule knowledgebases, and indexing by dataset search engines. To enhance their accessibility and interoperability, datasets are visualizable in responsively designed web pages, retrievable in machine-readable spreadsheets, or through an application programming interface. Re-use of the datasets is supported by their interrogation as a universe of data points through the Transcriptomine search engine, highlighting transcriptional intersections between NR signaling pathways, physiological processes and disease states. We illustrate the value of our approach in connecting disparate research communities using a use case of persistent interoperability between the Nuclear Receptor Signaling Atlas and the Pharmacogenomics Knowledgebase. Our FAIR-aligned model demonstrates the enduring value of discovery-scale datasets that accrues from their systematic compilation, biocuration and distribution across the digital biomedical research enterprise.Published
License
Copyright (c) 2017 The Author(s)
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms. If a submission is rejected or withdrawn prior to publication, all rights return to the author(s):
-
Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
-
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
-
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
Submitting to the journal implicitly confirms that all named authors and rights holders have agreed to the above terms of publication. It is the submitting author's responsibility to ensure all authors and relevant institutional bodies have given their agreement at the point of submission.
Note: some institutions require authors to seek written approval in relation to the terms of publication. Should this be required, authors can request a separate licence agreement document from the editorial team (e.g. authors who are Crown employees).