Curating Scientific Information in Knowledge Infrastructures

Authors

Markus Stocker, TIB Leibniz Information Centre for Science and Technology, Welfengarten 1 B, 30167 Hannover; MARUM Center for Marine Environmental Sciences, PANGAEA Data Publisher for Earth & Environmental Science, Leobener Strasse 8, 28359 Bremen, https://orcid.org/0000-0001-5492-3212
Pauli Paasonen, Institute for Atmospheric and Earth System Research/Physics, 00014 University of Helsinki, https://orcid.org/0000-0002-4625-9590
Markus Fiebig, NILU – Norsk Institutt for Luftforskning, Dept. Atmospheric and Climate Research, Instituttveien 18, 2007 Kjeller, https://orcid.org/0000-0002-3380-3470
Martha A. Zaidan, Institute for Atmospheric and Earth System Research/Physics, 00014 University of Helsinki, https://orcid.org/0000-0002-6348-1230
Alex Hardisty, School of Computer Science and Informatics, Cardiff University, Queens Buildings, 5 The Parade, Cardiff CF24 3AA, https://orcid.org/0000-0002-0767-4310

DOI:

https://doi.org/10.5334/dsj-2018-021

Keywords:

Data Use, Data Interpretation, Linked Data, Semantic Information, Environmental Research Infrastructures, Environmental Knowledge Infrastructures, Informatics, Data Science

Abstract

Interpreting observational data is a fundamental task in the sciences, specifically in earth and environmental science, where observational data are increasingly acquired, curated, and published systematically by environmental research infrastructures. Typically subject to substantial processing, observational data are used by research communities, their research groups, and individual scientists, who interpret such primary data for their meaning in the context of research investigations. The result of interpretation is information (meaningful secondary or derived data) about the observed environment. Research infrastructures and research communities are thus essential to evolving uninterpreted observational data into information. In digital form, the classical bearers of information are the commonly known “(elaborated) data products,” for instance maps. In such form, meaning is generally implicit, e.g. in map colour coding, and thus largely inaccessible to machines. The systematic acquisition, curation, possible publishing, and further processing of information gained in observational data interpretation, as machine-readable data together with their machine-readable meaning, is not common practice among environmental research infrastructures. For a use case in aerosol science, we elucidate these problems and present a Jupyter-based prototype infrastructure that exploits a machine learning approach to interpretation and could support a research community in interpreting observational data and, more importantly, in curating and further using the resulting information about a studied natural phenomenon.

Published

2018-09-20

Issue

Vol. 17 (2018)

Section

Research Papers

Collections

Special Collection on the Work of CODATA

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

If a submission is rejected or withdrawn prior to publication, all rights return to the author(s). Authors who publish with this journal agree to the following terms:

Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.

Submitting to the journal implicitly confirms that all named authors and rights holders have agreed to the above terms of publication. It is the submitting author's responsibility to ensure all authors and relevant institutional bodies have given their agreement at the point of submission.

Note: some institutions require authors (e.g. authors who are Crown employees) to seek written approval in relation to the terms of publication. Should this be required, authors can request a separate licence agreement document from the editorial team.