<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20120330//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<!--<?xml-stylesheet type="text/xsl" href="article.xsl"?>-->
<article article-type="research-article" dtd-version="1.1" xml:lang="en" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id journal-id-type="issn">1683-1470</journal-id>
<journal-title-group>
<journal-title>Data Science Journal</journal-title>
</journal-title-group>
<issn pub-type="epub">1683-1470</issn>
<publisher>
<publisher-name>Ubiquity Press</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5334/dsj-2020-050</article-id>
<article-categories>
<subj-group>
<subject>Practice paper</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Incorporating RDA Outputs in the Design of a European Research Infrastructure for Natural Science Collections</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Islam</surname>
<given-names>Sharif</given-names>
</name>
<email>sharif.islam@naturalis.nl</email>
<xref ref-type="aff" rid="aff-1">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Hardisty</surname>
<given-names>Alex</given-names>
</name>
<xref ref-type="aff" rid="aff-2">2</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Addink</surname>
<given-names>Wouter</given-names>
</name>
<xref ref-type="aff" rid="aff-1">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Weiland</surname>
<given-names>Claus</given-names>
</name>
<xref ref-type="aff" rid="aff-3">3</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Gl&#246;ckler</surname>
<given-names>Falko</given-names>
</name>
<xref ref-type="aff" rid="aff-4">4</xref>
</contrib>
</contrib-group>
<aff id="aff-1"><label>1</label>Naturalis Biodiversity Center, Leiden, NL</aff>
<aff id="aff-2"><label>2</label>Cardiff University, Cardiff, UK</aff>
<aff id="aff-3"><label>3</label>Senckenberg Biodiversity and Climate Research Centre, Frankfurt, DE</aff>
<aff id="aff-4"><label>4</label>Museum of Natural History, Berlin, DE</aff>
<pub-date publication-format="electronic" date-type="pub" iso-8601-date="2020-12-14">
<day>14</day>
<month>12</month>
<year>2020</year>
</pub-date>
<pub-date pub-type="collection">
<year>2020</year>
</pub-date>
<volume>19</volume>
<elocation-id>50</elocation-id>
<history>
<date date-type="received" iso-8601-date="2020-07-17">
<day>17</day>
<month>07</month>
<year>2020</year>
</date>
<date date-type="accepted" iso-8601-date="2020-11-17">
<day>17</day>
<month>11</month>
<year>2020</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright: &#x00A9; 2020 The Author(s)</copyright-statement>
<copyright-year>2020</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See <uri xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</uri>.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://datascience.codata.org/articles/10.5334/dsj-2020-050/"/>
<abstract>
<p>To support future research based on natural science collection data, DiSSCo (Distributed System of Scientific Collections) &#8211; the European Research Infrastructure for Natural Science Collections &#8211; adopts Digital Object Architecture as the basis for its planned data infrastructure. Using the outputs of one Research Data Alliance (RDA) interest group (IG) and five working groups (WGs), we show how RDA recommendations and supporting documents have been applied to the various stages of the DiSSCo data lifecycle.</p>
</abstract>
<kwd-group>
<kwd>Distributed System of Scientific Collections</kwd>
<kwd>DiSSCo</kwd>
<kwd>Research Data Alliance</kwd>
<kwd>RDA</kwd>
<kwd>Digital Object Architecture</kwd>
<kwd>Digital Specimen</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec>
<title>Introduction</title>
<p>In this paper, we describe how the outputs of one Research Data Alliance (RDA) interest group (IG) and five working groups (WGs) have shaped the core concepts of DiSSCo (Distributed System of Scientific Collections)<xref ref-type="fn" rid="n1">1</xref> &#8211; the European research infrastructure for Natural Science Collections. Designing, building and operating a research infrastructure like DiSSCo, which depends heavily on information and communication technologies (ICT) and data management best practices, brings together expertise from multiple domains (museum curators, taxonomists and other scientists, biodiversity informaticians and data managers, computing and software engineers, administrative management). The complex design decisions involve interrelated technical components spanning five data lifecycle phases, from data acquisition through data curation, data publishing and data processing to data use (<xref ref-type="bibr" rid="B32">Martin et al., 2017</xref>; <xref ref-type="bibr" rid="B36">Nieva de la Hidalga et al., 2020: 66&#8211;67</xref>). The collective expertise of RDA and its published recommendations provide the DiSSCo community with useful guidance for creating and supporting a sustainable, long-lived research infrastructure that can enhance the overall capacity of users to find, retrieve, and use relevant information. The way this community has used RDA recommendations to shape the DiSSCo approach is generic enough to be of interest to readers from other fields.</p>
<p>The paper is organized as follows. We begin with the background on Natural Science Collections (NSCs) in the context of recent advances in digitization and data sharing, and explain how a research infrastructure such as DiSSCo can address the challenges ahead. The background also introduces the Digital Specimen concept (a particular type of FAIR Digital Object) and the DiSSCo data lifecycle. With the DiSSCo data lifecycle in the foreground, we then describe how selected outputs of RDA are applied in the design of the DiSSCo data infrastructure. We conclude the paper with an overview of the future core DiSSCo services that these design decisions will enable.</p>
<p>The RDA outputs cover the following aspects:</p>
<list list-type="order">
<list-item><p>RDA output dealing with the adoption of Digital Object Architecture, based on the work of the Data Fabric IG and the Data Foundation and Terminology WG (<xref ref-type="bibr" rid="B40">RDA DF&amp;T 2015</xref>);</p></list-item>
<list-item><p>RDA output dealing with the usage of persistent identifiers and kernel information in the context of machine actionable services and programmatic decisions for digital objects from the PID Kernel WG (<xref ref-type="bibr" rid="B43">RDA PID KI 2019</xref>);</p></list-item>
<list-item><p>RDA output dealing with the aggregation of digital objects in the context of meaningful entities and serving the data from the Research Data Collections WG (<xref ref-type="bibr" rid="B45">RDA Research Data Collections 2017</xref>);</p></list-item>
<list-item><p>RDA output dealing with curation and maintenance of digital objects from the RDA/TDWG Metadata attribution WG (<xref ref-type="bibr" rid="B46">RDA/TDWG Attribution Metadata 2018</xref>); and</p></list-item>
<list-item><p>RDA output covering guidelines and specifications to assess the DiSSCo FAIR implementation plan (<xref ref-type="bibr" rid="B42">RDA FAIR Data Maturity Model 2020</xref>).</p></list-item>
</list>
</sec>
<sec>
<title>Background</title>
<p>Natural Science Collections (NSCs) hosted in natural history museums, botanic gardens, universities and other research centres around the world contain data that are critical for many scientific endeavours (<xref ref-type="bibr" rid="B21">Hedrick et al. 2020</xref>). Over the years, various large-scale digitization projects (<xref ref-type="bibr" rid="B5">Blagoderov et al. 2012</xref>), the mobilization of biodiversity data (<xref ref-type="bibr" rid="B35">Nelson and Ellis 2019</xref>) and the use of museum specimens to study genetic diversity (<xref ref-type="bibr" rid="B34">Nachman 2013</xref>) have provided novel ways of doing science (<xref ref-type="bibr" rid="B47">Schindel and Cook 2018</xref>). Within the context of COVID-19 pathogen discovery research, Cook et al. (<xref ref-type="bibr" rid="B9">2020 p. 2</xref>) highlight the crucial role of the information system related to collections that hold specimens:</p>
<disp-quote>
<p><italic>&#8220;In the past few decades, museums have become hubs of biodiversity informatics, serving as the critical nexus between biological samples and sample-derived data (e.g., genomics, geographic information, isotope chemistry, CT scans). The current pandemic reminds us that natural history specimens are important but underappreciated reservoirs for studying the hosts and distributions of animal and human pathogens (see Harmon et al. 2019) and that the data connected to these specimens increase our understanding not only of the host organism but of the pathogens as well. Enhanced support of both physical and cyberinfrastructure for biodiversity collections would yield an information system to enable prediction and mitigation of future outbreaks and pandemics.&#8221;</italic></p>
</disp-quote>
<p>To support data infrastructures for collections-based research in the future (and this includes their initial design and implementation), we need to understand the challenges and urgency ushered in by new types of data collection, curation, and sharing (e.g., <xref ref-type="bibr" rid="B24">Kays et al. 2020</xref>; <xref ref-type="bibr" rid="B25">Kays, McShea and Wikelski 2020</xref>), along with the need to maintain and provide access to historical data (e.g., <xref ref-type="bibr" rid="B4">Besnard et al. 2016</xref>; <xref ref-type="bibr" rid="B30">Lister et al. 2011</xref>). The physical materials (samples and specimens stored in natural history museums, seed banks, cryo banks, etc.) are crucial elements for scientific inquiry. However, physical access limits reuse, because materials can be depleted and the distribution of traits and phenotypes in species populations in living collections varies over time (<xref ref-type="bibr" rid="B12">Diaz et al. 2016</xref>). Therefore, digitized data act as an essential reference point that anchors the relationship between the digital and the physical world. This anchoring of the different kinds of data derived from physical specimens has been explored and described as the notion of the Extended Specimen (<xref ref-type="bibr" rid="B55">Webster 2017</xref>). It represents the integrative and interdisciplinary next generation of NSCs (<xref ref-type="bibr" rid="B47">Schindel and Cook 2018</xref>). We use the term &#8216;Digital Specimen&#8217; (explained below) in an analogous manner.<xref ref-type="fn" rid="n2">2</xref></p>
<p>Existing systems for exploiting material stored in NSCs are inefficient and not cost-effective (<xref ref-type="bibr" rid="B50">Smith et al. 2019</xref>). Despite significant work by global data infrastructures such as the Global Biodiversity Information Facility (GBIF),<xref ref-type="fn" rid="n3">3</xref> Biodiversity Heritage Library,<xref ref-type="fn" rid="n4">4</xref> and Plazi TreatmentBank,<xref ref-type="fn" rid="n5">5</xref> there remain systematic gaps in linking specimen data to other data classes such as DNA sequences, literature, functional traits, habitat and conservation data and ecological models (<xref ref-type="bibr" rid="B37">Page 2016</xref>; <xref ref-type="bibr" rid="B48">Senderov et al. 2018</xref>). We are noticing increased use of digitized data from NSCs (<xref ref-type="bibr" rid="B5">Blagoderov et al. 2012</xref>). However, at the same time, for many projects, these data are organized and managed in a manner that makes data linking, sharing, and future reuse problematic (<xref ref-type="bibr" rid="B29">Lewis et al. 2018</xref>).</p>
<p>Over the past several years, the community around NSCs has recognized the gaps in our understanding of bio- and geo-diversity due to loosely coordinated data infrastructures (<xref ref-type="bibr" rid="B20">Hardisty and Roberts 2013</xref>). This has led to increased efforts towards creating shared global roadmaps for biodiversity informatics (<xref ref-type="bibr" rid="B22">Hobern et al. 2019</xref>), developing standards for improved data quality (<xref ref-type="bibr" rid="B7">Chapman et al. 2020</xref>), adopting FAIR principles (<xref ref-type="bibr" rid="B2">Agosti et al. 2019</xref>; <xref ref-type="bibr" rid="B15">Grobe et al. 2019</xref>) and creating building blocks for a data landscape in which component systems can exchange and understand information in a standard form using open protocols and metadata (<xref ref-type="bibr" rid="B26">Lannom et al. 2020</xref>). The Distributed System of Scientific Collections (DiSSCo), along with several global partners, is working towards such a data landscape by building a pan-European Research Infrastructure (RI) that aims to mobilize and unify bio- and geo-diversity information connected to the specimens held in natural science collections. In February 2020, DiSSCo entered its preparation phase, in which key design decisions and best practices are influenced by five selected outputs from the Research Data Alliance (RDA) (summarised in Table <xref ref-type="table" rid="T1">1</xref>), along with the FAIR data principles (Findability, Accessibility, Interoperability, and Reusability) (<xref ref-type="bibr" rid="B57">Wilkinson et al. 2016</xref>; <xref ref-type="bibr" rid="B33">Mons et al. 2017</xref>) and the concept of FAIR Digital Objects (FAIR-DO) (<xref ref-type="bibr" rid="B11">De Smedt et al. 2020</xref>; <xref ref-type="bibr" rid="B59">Wittenburg and Strawn 2019</xref>; <xref ref-type="bibr" rid="B14">European Commission 2018</xref>).</p>
<table-wrap id="T1">
<label>Table 1</label>
<caption>
<p>RDA outputs applied to the management of the DiSSCo data lifecycle.</p>
</caption>
<table>
<tr>
<th align="left" valign="top">RDA output</th>
<th align="left" valign="top">RDA IG/WG</th>
<th align="left" valign="top">DiSSCo Element</th>
<th align="left" valign="top">Purpose</th>
<th align="left" valign="top">Workflow/Data phase</th>
</tr>
<tr>
<td colspan="5"><hr/></td>
</tr>
<tr>
<td align="left" valign="top">1. Adoption of Digital Object Architecture</td>
<td align="left" valign="top">Data Foundation and Terminology WG; Data Fabric IG</td>
<td align="left" valign="top">DiSSCo Digital Specimen Architecture</td>
<td align="left" valign="top">Define the FAIR Digital Object Architecture of DiSSCo, including the Digital Specimen Object Model</td>
<td align="left" valign="top">Creation and management of digital objects/All phases of the data life cycle</td>
</tr>
<tr>
<td align="left" valign="top">2. Persistent Identifiers and Kernel Information</td>
<td align="left" valign="top">PID Kernel WG</td>
<td align="left" valign="top">Meta-information about a digital object and DiSSCo (data) type registry</td>
<td align="left" valign="top">Allowing smart programmatic decisions and inspection of the object&#8217;s PID record</td>
<td align="left" valign="top">Data acquisition, curation, publishing, use</td>
</tr>
<tr>
<td align="left" valign="top">3. Aggregation of digital objects</td>
<td align="left" valign="top">Research Data Collections WG</td>
<td align="left" valign="top">DiSSCo data repository/portal/API</td>
<td align="left" valign="top">Provide meaningful entities and serve the data</td>
<td align="left" valign="top">Data publishing and use (share, download)</td>
</tr>
<tr>
<td align="left" valign="top">4. Metadata attribution and use of PROV entities</td>
<td align="left" valign="top">RDA/TDWG Metadata attribution working group</td>
<td align="left" valign="top">Digital Specimen and collection objects</td>
<td align="left" valign="top">Correctly attribute sources of data and work carried out</td>
<td align="left" valign="top">Digitization, curation and maintenance of digital objects (for example collection objects or specimens)</td>
</tr>
<tr>
<td align="left" valign="top">5. FAIR data maturity model</td>
<td align="left" valign="top">RDA FAIR data maturity model working group</td>
<td align="left" valign="top">DiSSCo Digital Specimen Architecture</td>
<td align="left" valign="top">Develop guidelines and specifications to assess FAIR implementation plan</td>
<td align="left" valign="top">DiSSCo data lifecycle</td>
</tr>
</table>
</table-wrap>
<p>DiSSCo&#8217;s vision is to transform a landscape of disconnected individual natural science collection providers into a coherent research infrastructure. A variety of e-services will enable this: 1) the European Loans and Visits System (ELViS),<xref ref-type="fn" rid="n6">6</xref> a one-stop shop for access to the collections, providing both physical access and virtual access by digitization on demand; 2) the European Curation and Annotation System (ECAS) for community curation of the digitized specimen data; 3) the Specimen Data Refinery (SDR), providing digitization services to extract, enhance and annotate data from digital images of specimens; 4) Collections Monitoring Dashboards (CMD) showing the digitization status and usage of the collections; and 5) a knowledge base providing protocols, digitization resources, manuals and other documents as FAIR-DOs for direct integration with the other e-services. The RDA outputs mentioned here provide essential building blocks for envisioning these services.</p>
<p>One of the critical elements in DiSSCo is the &#8216;Digital Specimen&#8217;, a FAIR-DO acting as a digital twin on the Internet for a specific physical specimen in a museum collection. The digital information derived from the specimens will enable FAIR data and services where various data classes can be linked to provide seamless unified access to information. These ideas were explored in the EU-funded ICEDIG<xref ref-type="fn" rid="n7">7</xref> project (2018&#8211;2020), and one of the core architecture outcomes was the decision to adopt FAIR Digital Objects (<xref ref-type="bibr" rid="B19">Hardisty et al. 2020</xref>). In particular, this choice enables the creation of machine-actionable digital twins that, by design, ensure FAIRness<xref ref-type="fn" rid="n8">8</xref> of the data and provide other features such as unambiguous identification, data typing enforcement, attribution and provenance tracking.</p>
<p>The following sections explain how five selected RDA outputs, summarised in Table <xref ref-type="table" rid="T1">1</xref>, are applied to the management of the data lifecycle in DiSSCo, illustrated in Figures <xref ref-type="fig" rid="F1">1</xref> and <xref ref-type="fig" rid="F2">2</xref>.</p>
<fig id="F1">
<label>Figure 1</label>
<caption>
<p>Lifecycle of Digital Specimen research data in the DiSSCo data infrastructure, from acquisition through curation, publishing, processing and use, which can create new data that can be iteratively acquired, curated, etc.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-19-1250-g1.png"/>
</fig>
<fig id="F2">
<label>Figure 2</label>
<caption>
<p>Contributions of RDA outputs to the design of data management in the DiSSCo Digital Specimen data infrastructure. The FAIR Digital Object Framework and the Recommendation on PID Kernel Information contribute to the architecture as a whole, while the Recommendations on Research Data Collections and Attribution Metadata contribute more explicitly to specific phases of the data lifecycle.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-19-1250-g2.png"/>
</fig>
<p>This lifecycle begins with the digitization and acquisition of data from physical specimens &#8211; the creation of the Digital Specimens (DS) and Digital Collections, which are specific object types with persistent identifiers and attributes. This is the data acquisition phase. These objects are then registered and curated within a repository platform (curation phase). Curated data are published to DiSSCo users and parties external to the infrastructure, as well as directly to other services (publishing phase). DiSSCo will provide services for further processing of data (data processing phase) that can produce new data to be stored within the infrastructure. Finally, the broader research community can use DiSSCo data and can design experiments and analyses acting on the published Digital Specimen and Collection data that produce results (derived data), which in turn can be passed back into DiSSCo for curation, publishing and processing, thus restarting the lifecycle (<xref ref-type="bibr" rid="B13">DiSSCo DMP, 2019</xref>).</p>
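<p>As an illustration only, the cyclic ordering of the five phases described above can be sketched in Python. The names <monospace>Phase</monospace>, <monospace>next_phase</monospace> and <monospace>walk</monospace> are assumptions made for this sketch and do not correspond to any DiSSCo software component; the wrap-around from use back to acquisition reflects the statement that derived data can re-enter the infrastructure.</p>
<preformat><![CDATA[
```python
from enum import Enum


class Phase(Enum):
    """The five data lifecycle phases named in the text, in order."""
    ACQUISITION = "data acquisition"
    CURATION = "data curation"
    PUBLISHING = "data publishing"
    PROCESSING = "data processing"
    USE = "data use"


def next_phase(phase):
    """Return the phase that follows `phase`.

    USE wraps around to ACQUISITION, because derived data produced
    during use can be passed back into the infrastructure.
    """
    order = list(Phase)  # Enum members iterate in definition order
    return order[(order.index(phase) + 1) % len(order)]


def walk(start, steps):
    """Return the starting phase followed by the next `steps` phases."""
    visited = [start]
    for _ in range(steps):
        visited.append(next_phase(visited[-1]))
    return visited


if __name__ == "__main__":
    # Follow a published dataset through use and back into acquisition.
    for phase in walk(Phase.PUBLISHING, 3):
        print(phase.value)
```
]]></preformat>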
</sec>
<sec>
<title>Adoption of Digital Object Architecture</title>
<p><bold>RDA outputs from the Data Fabric IG on virtual layer recommendations (<xref ref-type="bibr" rid="B41">RDA DFIG 2018</xref>) and the Data Foundation and Terminology WG on the basic vocabulary to apply a standard core data model (<xref ref-type="bibr" rid="B40">RDA DF&amp;T 2015</xref>) provide the structure for DiSSCo&#8217;s data organization model.</bold></p>
<p>Even though the history behind digital objects goes back to the early days of the Internet (<xref ref-type="bibr" rid="B23">Kahn and Wilensky 2006</xref>), the recent rendition has its origin in the RDA&#8217;s Data Foundation and Terminology (DF&amp;T) WG. From there the discussion has been taken up by the members of the Data Fabric IG (DFIG) together with the C2CAMP<xref ref-type="fn" rid="n9">9</xref> initiative, the RDA-Europe Group of European Data Experts (GEDE)<xref ref-type="fn" rid="n10">10</xref> and the GOFAIR<xref ref-type="fn" rid="n11">11</xref> initiative. These discussions have shaped the current principles of Digital Object Architecture (<xref ref-type="bibr" rid="B49">Sharp 2016</xref>) and most recently FAIR Digital Objects (FAIR-DO) (Figure <xref ref-type="fig" rid="F3a">3a</xref>) and the FAIR-DO Framework (<xref ref-type="bibr" rid="B11">De Smedt et al. 2020</xref>; <xref ref-type="bibr" rid="B59">Wittenburg and Strawn 2019</xref>; <xref ref-type="bibr" rid="B14">European Commission 2018</xref>). DiSSCo adopts the Digital Object Architecture and the FAIR-DO framework to achieve FAIRness and meet the requirements of the FAIR Guiding Principles for scientific data management (<xref ref-type="bibr" rid="B57">Wilkinson et al. 2016</xref>; <xref ref-type="bibr" rid="B26">Lannom et al. 2020</xref>; <xref ref-type="bibr" rid="B19">Hardisty et al. 2020</xref>).</p>
<fig id="F3a">
<label>Figure 3a</label>
<caption>
<p>Main components of a digital object (DO). The core of the DO is a bit sequence that encodes content (data, metadata, software, etc.). This is described by metadata to enable access and correct interpretation. A persistent identifier uniquely identifies the DO, and operations permit the content and metadata to be manipulated. Reproduced with permission (<xref ref-type="bibr" rid="B58">Wittenburg et al. 2019</xref>).</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-19-1250-g3.png"/>
</fig>
<fig id="F3b">
<label>Figure 3b</label>
<caption>
<p>Basic structure of a Digital Specimen (DS). A DS acts as a container for pointers, metadata and embedded content, i.e., information about and derived from the corresponding physical specimen, including, for example, necessary information about the specimen, image(s), molecular data, genetic sequence data, and morphological measurements.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-19-1250-g4.png"/>
</fig>
<p>The main impetus behind adopting FAIR-DOs for Digital Specimens (Figure <xref ref-type="fig" rid="F3b">3b</xref>) is to treat the digital representations of physical specimens as atomic items that need individual identification to avoid ambiguity<xref ref-type="fn" rid="n12">12</xref> and to collect and anchor core information about the specimens in one place. The Digital Specimens act as the mutable space for the curation of all data derived from and relating to the corresponding physical specimens. Unambiguous persistent identification allows tracking of Digital Specimens in the face of changing location, as well as organization into collections for specific purposes. The data derived from and linked to physical specimens must be easily findable and accessible. They must adhere to open standards with rich machine-comprehensible semantics and convey context (<xref ref-type="bibr" rid="B52">Stocker 2018</xref>), so that they are interoperable and widely reusable by both humans and machines. Being machine-readable alone (i.e., by linking to ontologies and encoding as RDF or JSON-LD) is insufficient to achieve reusability; reproducibility of science in particular also requires provenance, data quality, credit and attribution (<xref ref-type="bibr" rid="B3">Bechhofer et al. 2013</xref>).</p>
<p>DiSSCo envisions that persistently and unambiguously identifying these Digital Specimens creates a digital doorway that allows researchers to do more than just find specimens and provides a means for institutions to widen access to the data stored within the NSCs (<xref ref-type="bibr" rid="B55">Webster 2017</xref>; <xref ref-type="bibr" rid="B27">Lendemer et al. 2020</xref>; <xref ref-type="bibr" rid="B47">Schindel and Cook 2018</xref>). DiSSCo expects that adopting the Digital Object Architecture recommended by RDA, and treating Digital Specimens as first-class citizens in that architecture, can lead to transformations in the working practices of collections-based science and the value chains founded in natural science collections.</p>
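<p>To make the container idea concrete, the following Python sketch models a Digital Specimen as a persistent identifier plus metadata, embedded content and pointers to related objects. It is a minimal sketch under stated assumptions: the class name, field names, data class labels and the linked PIDs are hypothetical and are not taken from openDS or any other DiSSCo specification.</p>
<preformat><![CDATA[
```python
from dataclasses import dataclass, field


@dataclass
class DigitalSpecimen:
    """Illustrative container: one PID, metadata, embedded content, pointers."""
    pid: str                    # persistent identifier of the digital object
    physical_specimen_id: str   # identifier of the physical specimen it mirrors
    metadata: dict = field(default_factory=dict)   # descriptive attributes
    embedded: dict = field(default_factory=dict)   # content stored in the object
    links: dict = field(default_factory=dict)      # data class -> list of target PIDs

    def add_link(self, data_class, target_pid):
        """Anchor a related object (image, sequence, ...) to this specimen."""
        self.links.setdefault(data_class, []).append(target_pid)

    def linked(self, data_class):
        """Return a copy of the PIDs linked under `data_class` (may be empty)."""
        return list(self.links.get(data_class, []))


# Hypothetical identifiers, used purely for illustration.
specimen = DigitalSpecimen(
    pid="123prefix/uuid-27a9edf63",
    physical_specimen_id="BMNH:1905.5.30.352",
    metadata={"objectType": "DigitalSpecimen"},
)
specimen.add_link("image", "123prefix/img-0001")
specimen.add_link("image", "123prefix/img-0002")
specimen.add_link("sequence", "123prefix/seq-0001")
```
]]></preformat>
<p>Because every related item is reached through the one specimen PID, curation of images, sequences and measurements stays anchored in a single place, even when the linked objects live in different repositories.</p>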
</sec>
<sec>
<title>Persistent Identifiers and Kernel Information</title>
<p><bold>RDA output from the PID Kernel WG on PID Kernel Information (<xref ref-type="bibr" rid="B43">RDA PID KI 2019</xref>) provides the capability to elevate a small number of essential attributes of Digital Specimens to the PID record level to enable new machine-actionable services without requiring access to or retrieval of the Digital Specimen objects themselves</bold>.</p>
<p>Identifiers are used in NSCs to identify physical specimens (<xref ref-type="bibr" rid="B17">G&#252;ntsch et al. 2017</xref>), and organizations such as GBIF are at the forefront of using Digital Object Identifiers (DOIs) for datasets, queries and download records (<xref ref-type="bibr" rid="B10">Copas et al. 2019</xref>). At the moment, though, it is not possible to refer unambiguously and persistently to the digital equivalent of a physical specimen. The combination of Digital Specimen and persistent identifier (PID) scheme proposed by DiSSCo fills this gap. DiSSCo, with extensive consultation and support from several international stakeholders such as GBIF, the Corporation for National Research Initiatives (CNRI),<xref ref-type="fn" rid="n13">13</xref> and the International DOI Foundation (IDF),<xref ref-type="fn" rid="n14">14</xref> is working towards adopting a Handle-based system (<xref ref-type="bibr" rid="B53">Sun et al. 2003</xref>) tuned to the needs of the natural science collections community. Scalability in the tens to hundreds of billions of PIDs (i.e., supporting a huge address space), trust (i.e., accurately maintained by a dedicated and reliable team), persistence over a very long term (i.e., 100-year target) and community governance (i.e., transparent and sustainable business model) are essential requirements to be accommodated. Besides specimens, other things have to be persistently identified. GRID<xref ref-type="fn" rid="n15">15</xref> and ROR<xref ref-type="fn" rid="n16">16</xref> are used as unique identifiers for institutions and ORCID<xref ref-type="fn" rid="n17">17</xref> is used for people. These make it possible to link specimens unambiguously to the collection-holding institutions and to the researchers and curators, respectively.</p>
<p>Assigning identifiers is the first step towards FAIR data services and ensuring machine actionability of FAIR-DOs. The next steps are to define and describe the metadata attributes of the specific digital object and to link them persistently to it. The DiSSCo Data Management Plan recognizes this and thus references the RDA Recommendation on PID Kernel Information (<xref ref-type="bibr" rid="B43">RDA PID KI 2019</xref>): &#8220;Specific PID Kernel Information profiles and object type definitions must be registered for the Digital Specimen object type and other object types in the well-known Kernel Information profile and Data Type registries&#8221; (<xref ref-type="bibr" rid="B13">DiSSCo DMP 2019</xref>).</p>
<p>It is clear that a minimal set of information associated with each Digital Specimen should be available to facilitate machine-actionable services and programmatic decisions, and that delivery of these attributes must work with low latency and in a scalable fashion (<xref ref-type="bibr" rid="B56">Weigel et al. 2020</xref>). What is less clear is which attributes these should be, how many are needed, and what makes an optimal kernel information profile. This needs further study.</p>
<p>One use case that can exploit kernel information is submitting a large number (millions) of specimen images held in long-term storage to a workflow for optical character/text recognition (OCR), making the results findable with a full-text search (<xref ref-type="bibr" rid="B6">Cazenave et al. 2019</xref>). These images and OCR&#8217;d label texts will reside in an ecosystem with millions of other digital objects (including research artefacts from other domains). Fully resolving each PID might not be feasible in such cases, so appropriate kernel information will be vital for quick machine interpretation and processing. A simple kernel information profile example to support this is given in Table <xref ref-type="table" rid="T2">2</xref>.</p>
<table-wrap id="T2">
<label>Table 2</label>
<caption>
<p>Simple example of PID Kernel Information for a Digital Specimen. Example PID: 123prefix/uuid-27a9edf63.</p>
</caption>
<table>
<tr>
<th align="left" valign="top">Attribute</th>
<th align="left" valign="top">Value Type</th>
<th align="left" valign="top">Example Value</th>
</tr>
<tr>
<td colspan="3"><hr/></td>
</tr>
<tr>
<td align="left" valign="top">Location</td>
<td align="left" valign="top">url</td>
<td align="left" valign="top"><ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://example-dissco-repo/uuid-27a9edf63">http://example-dissco-repo/uuid-27a9edf63</ext-link></td>
</tr>
<tr>
<td align="left" valign="top">Created</td>
<td align="left" valign="top">date and time</td>
<td align="left" valign="top">2019-04-24T11:07:11.771Z</td>
</tr>
<tr>
<td align="left" valign="top">Type</td>
<td align="left" valign="top">type definition</td>
<td align="left" valign="top">typedef123/DigitalSpecimen</td>
</tr>
<tr>
<td align="left" valign="top">PhysicalSpecimenId</td>
<td align="left" valign="top">string</td>
<td align="left" valign="top">BMNH:1905.5.30.352</td>
</tr>
</table>
</table-wrap>
<p>In this example, &#8220;123prefix/uuid-27a9edf63&#8221; is the PID of a digital object with several attributes in its particular kernel information profile:</p>
<list list-type="order">
<list-item><p>Location: URL redirecting to the location of the Digital Specimen object. This URL can resolve to a digital object repository or another landing page and can also provide data serialisation options like JSON-LD.</p></list-item>
<list-item><p>Created: The timestamp when this object was created.</p></list-item>
<list-item><p>Type: A Digital Specimen. Instead of storing the string &#8220;Digital Specimen&#8221;, we refer to a PID in a Data Type Registry, which is a resolvable entity with other metadata attached. It tells us the structure of the Digital Specimen object, thus enabling us to parse it.</p></list-item>
<list-item><p>PhysicalSpecimenId: Digital Specimen is a digital twin of a physical specimen, so the identifier of the physical specimen is an important and special attribute for this particular type of digital object. The value here contains the physical specimen identifier as a string.</p></list-item>
</list>
<p>From a simple machine-actionable point of view, this digital object provides the persistent identifier, points to a type declaration, and provides the physical specimen identifier. Other attributes (such as scientific names, physical location, version, digitization level/definition, digital object policy, etc.), currently under discussion in DiSSCo Prepare WP 6 (Technical Architecture and Services provision),<xref ref-type="fn" rid="n18">18</xref> can also be included in such a profile. These can help to decide whether a Digital Specimen is suitable for the intended operation. For example, an update operation on a Digital Specimen can add missing records or fix incorrect georeference and locality data. This update would be preceded by a search operation that retrieves the relevant incomplete records.</p>
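<p>As an illustration only, the profile in Table 2 can be sketched as a small typed record that a service inspects without dereferencing the Location URL. The class name, field names and helper functions below are assumptions made for this sketch and are not part of any DiSSCo or RDA specification; the values are copied from Table 2.</p>
<preformat>
```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class KernelInfo:
    """Hypothetical in-memory form of the kernel information profile in Table 2."""
    pid: str
    location: str
    created: datetime
    type_pid: str
    physical_specimen_id: str


# Values copied from Table 2; "+00:00" is the UTC offset that "Z" stands for.
EXAMPLE = KernelInfo(
    pid="123prefix/uuid-27a9edf63",
    location="http://example-dissco-repo/uuid-27a9edf63",
    created=datetime.fromisoformat("2019-04-24T11:07:11.771+00:00"),
    type_pid="typedef123/DigitalSpecimen",
    physical_specimen_id="BMNH:1905.5.30.352",
)

# PID of the type definition, as given in the Type row of Table 2.
DIGITAL_SPECIMEN_TYPE = "typedef123/DigitalSpecimen"


def is_digital_specimen(record: KernelInfo) -> bool:
    """Decide from kernel information alone whether a PID names a Digital Specimen."""
    return record.type_pid == DIGITAL_SPECIMEN_TYPE


def select_for_ocr(records) -> list:
    """Return the PIDs of Digital Specimens without resolving each Location URL."""
    return [r.pid for r in records if is_digital_specimen(r)]
```
</preformat>
<p>A workflow such as the OCR use case could apply a filter of this kind to millions of kernel records and fully resolve only the objects it actually needs.</p>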
<p>These operations will be part of services envisioned in DiSSCo such as the digitization workflow. At the moment, digitization activities vary from one specimen category to another and between institutions (<xref ref-type="bibr" rid="B8">Cocks et al. 2020</xref>). We are addressing these challenges within the context of developing openDS (an open specification of Digital Specimen and other related object type definitions essential to mass digitization) and MIDS (Minimum Information about a Digital Specimen) to establish data standards and common practices.<xref ref-type="fn" rid="n19">19</xref> A common understanding of these processes will help us refine how and when the minimum set of metadata that does not change during the lifetime of the object needs to be created and maintained.</p>
<p>Digital Specimens can now become part of a FAIR infrastructure implementation because, with kernel information and other metadata, they are findable and accessible. Repository and application services can be built in conjunction with these digital twins as the basis of a Digital Object Architecture. Digital Specimens can be accessed, retrieved and interacted with using standardized communication protocols such as the Digital Object Interface Protocol (DOIP<xref ref-type="fn" rid="n20">20</xref>) or the Hypertext Transfer Protocol (HTTP). Digital Specimens are interoperable because services and systems can determine the attributes that are tied semantically to FAIR vocabularies, and perform operations on them. Lastly, the kernel information profile and other attributes provide the accurate and relevant data needed for reproducibility and reusability (for example, publishing the digitized data in a different format or running an experiment using data linked to a specimen).</p>
</sec>
<sec>
<title>Aggregation of digital objects</title>
<p><bold>RDA output from the Research Data Collection WG on actionable collections and a technical interface specification to enable client-server interaction provides guidelines for how to create meaningful services around the DiSSCo-specific digital objects</bold>.</p>
<p>Building on the essential components of the Digital Object Architecture and PID scheme, the next step considers how to go beyond single data objects. &#8220;Collection&#8221; in the RDA sense means grouping objects together without demanding particular semantics or formats. This grouping should have a unique identifier and well-defined actions, such as CRUD (Create, Read, Update, Delete), that act on all objects in the group equally.</p>
<p>The NSC community has focused on creating a standard for Collection Descriptions.<xref ref-type="fn" rid="n21">21</xref> Furthermore, in NSC and DiSSCo terms, &#8220;Collection&#8221; has a specific meaning &#8211; &#8220;A collection is any set of physical things (material/natural objects) or image, audio and video recordings (either analogue or digital) treated together for curative purposes&#8221; (<xref ref-type="bibr" rid="B1">Addink et al. 2020</xref>). So we need to investigate further to see how commensurate this is with the RDA &#8220;Research Data Collection&#8221; recommendation (<xref ref-type="bibr" rid="B45">RDA Research Data Collections 2017</xref>).</p>
<p>In DiSSCo, a &#8220;Digital Collection&#8221; is a specific type of digital object acting as a twin for a real-world natural science collection. It is a collection of Digital Specimens, mirroring the physical-world practice of organizing specimens into specific kinds of collection (zoology, botany, etc.). In the digital world, however, the notion of a collection is far more flexible: objects can be members of multiple collections simultaneously, even without specific criteria defining membership.</p>
<p>Collections as digital objects will be consumed by services like ELViS (European Loans and Visits System) to facilitate loans and visits transactions and digitization on demand. A Collection Monitoring Dashboard can provide comprehensive overviews of collections across different institutions and disciplines. An extensive set of user stories<xref ref-type="fn" rid="n22">22</xref> maps user journeys for activities such as searching for collections and specimens, requesting loans, reviewing loan requests, generating reports on loans and visits and collection usage. For each of these steps and services, the role of collection as a digital object is crucial (Figure <xref ref-type="fig" rid="F4">4</xref>).</p>
<fig id="F4">
<label>Figure 4</label>
<caption>
<p>Building blocks of DiSSCo e-services start with individual objects (represented digitally through Digital Specimens), collections and collections overview.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-19-1250-g5.png"/>
</fig>
<p>Following the RDA recommendation (<xref ref-type="bibr" rid="B45">RDA Research Data Collections 2017</xref>), the digital collection as an entity with a persistent identifier (e.g., the digital object for the Mammal Collection at a museum) will support different operations (such as retrieving an ordered or filtered list), specific properties (essential information such as which museum holds it and how it is stored), and membership information (e.g., the specimens that are in the mammal collection). Besides ELViS, other e-services are envisaged for collections of specific data types; for example, services involving automated machine learning and computer vision (<xref ref-type="bibr" rid="B31">Livermore and Cubey 2019</xref>).</p>
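<p>The three ingredients named above (operations, properties and membership) can be sketched as follows. This is a minimal illustration under stated assumptions: the class, its method names and all PIDs other than the one from Table 2 are hypothetical, and the sketch does not reproduce the interface defined in the RDA recommendation.</p>
<preformat>
```python
from dataclasses import dataclass, field


@dataclass
class DigitalCollection:
    """Hypothetical collection object: a PID, properties and member PIDs."""
    pid: str
    properties: dict
    member_pids: list = field(default_factory=list)

    def add_member(self, member_pid: str) -> None:
        """Add a member once; membership is held by PID, not by copying the object."""
        if member_pid not in self.member_pids:
            self.member_pids.append(member_pid)

    def list_members(self, predicate=None, ordered=False) -> list:
        """Return member PIDs, optionally filtered by a predicate and sorted."""
        members = [m for m in self.member_pids if predicate is None or predicate(m)]
        return sorted(members) if ordered else members


# Illustrative collections; names and PIDs are assumptions for this sketch.
mammals = DigitalCollection(
    pid="123prefix/mammal-collection",
    properties={"institution": "Example Museum", "discipline": "zoology"},
)
type_specimens = DigitalCollection(pid="123prefix/type-specimens", properties={})

# The same Digital Specimen can belong to several collections at once.
for collection in (mammals, type_specimens):
    collection.add_member("123prefix/uuid-27a9edf63")
mammals.add_member("123prefix/uuid-0001")

ordered_members = mammals.list_members(ordered=True)
```
</preformat>
<p>Holding members by PID rather than by value is what lets one Digital Specimen appear in several collections without duplication.</p>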
<p>One of the challenges the DiSSCo architecture design work needs to address is the difficulty of understanding how data as a bundled package moves and is used across different practical situations in science, industry, policy making and public discourse. Data can be decontextualized and then recontextualized in novel situations to become meaningful beyond their original context of production (<xref ref-type="bibr" rid="B28">Leonelli 2016</xref>). Digital Specimens as atomic entities, together with the flexibility to organize them and other object types into different kinds of machine-actionable digital object containers (i.e., &#8216;collections&#8217; in the RDA sense), will help to facilitate yet-to-be-imagined uses.</p>
</sec>
<sec>
<title>Metadata attribution and use of PROV entities</title>
<p><bold>Output from the joint RDA/TDWG metadata attribution WG on standardized metadata for attributing work and tracking provenance in the curation and maintenance of research collections guides how attribution details can be preserved in DiSSCo digital objects</bold>.</p>
<p>DiSSCo&#8217;s Data Management Plan (<xref ref-type="bibr" rid="B13">DiSSCo DMP 2019</xref>) highlights the importance of the provenance of specimens, their digitization and change history, annotation, curation, and usage. These histories must be maintained consistently through the lifetime of the DiSSCo infrastructure. Recording activities of human and machine agents during data curation and processing phases is essential to FAIR implementation. Groom et al. (<xref ref-type="bibr" rid="B16">2019:12</xref>) highlight the importance of attributing people behind the specimens in a standard fashion: &#8220;<italic>Many people can be associated with a specimen: the collector, curator, determiner, annotator, mounter, transcriber, digitizer, imager and georeferencer. For many reasons, these people are important to science. Knowing the person gives a degree of credibility to the specimen and its identity. The biographical data of the people can not only help validate data, but also credit the people for the work they have done</italic>&#8221;.</p>
<p>Three important information elements capture this detail and can be used as the means for attributing work: the agent performing the activity, the activity (or action) they perform, and the digital or physical object (entity) they are curating/processing. The joint RDA/TDWG Attribution Metadata WG recommendation (<xref ref-type="bibr" rid="B46">RDA/TDWG Attribution Metadata 2018</xref>), which uses W3C&#8217;s PROV data model, makes it easy to implement this in the FAIR-DO context. In our design, we are planning to store provenance as another digital object type linked to the object on which work is performed. This will create a common curation space linked to different types of digital objects, other data classes and services. One current proposal to materialize this is the European Curation and Annotation System (ECAS) e-service, which would serve as a community curation service.</p>
<p>One of the example use cases in the RDA recommendation is relevant:</p>
<disp-quote>
<p>&#8220;<italic>Sergey (a museum curator) recurates a jar containing multiple specimens. Each specimen is removed from the jar and individually mounted. Sergey then examines the specimen and jar label, and enters a new record into the collections management database. He uses the data in the new record to generate a new label to attach to the physical specimen</italic>.</p>
<p><italic>Sergey also, in the process of recurating one of the specimens, discovers a new species</italic>.</p>
<p><italic>He describes the new species, and uses the species description to publish a journal paper</italic>.</p>
<p><italic>Sergey should receive attribution for</italic>:</p>
<list list-type="bullet">
<list-item><p><italic>recurating the physical specimens</italic></p></list-item>
<list-item><p><italic>describing the new species</italic></p></list-item>
<list-item><p><italic>authoring the journal article</italic></p></list-item>
<list-item><p><italic>entering the specimen into the collections management database</italic></p></list-item>
<list-item><p><italic>generating a label for the re-curated specimen</italic>.&#8221;</p></list-item>
</list>
</disp-quote>
<p>As is evident, even within a single workflow, data can travel from a collection management database to a journal, with different systems, standards, and application programming interfaces (APIs) involved. These five attributions need to be captured in a standard way to be part of the Digital Specimen data when different operations are performed in multiple contexts.</p>
<p>Leonelli (<xref ref-type="bibr" rid="B28">2016: 188&#8211;189</xref>), using the example of model organism biology, makes the point that to support the scientists, we must understand the processes behind successful empirical research. Often, policymakers and funders predominantly understand research as products instead of processes. Metadata attribution and use of PROV entities provide a technical foundation to bring these processes to the forefront of supporting and sustaining a research infrastructure.</p>
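<p>The five attributions in the use case can be written down as agent/activity/entity records, following the three core classes of the PROV data model. The sketch below is illustrative only: the record structure, the activity labels and the entity labels are assumptions chosen for readability and are not the serialization defined by the RDA/TDWG recommendation.</p>
<preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribution:
    """One unit of credit: who (agent) did what (activity) to which thing (entity)."""
    agent: str
    activity: str
    entity: str


# The five attributions listed in the quoted use case; labels are illustrative.
ATTRIBUTIONS = [
    Attribution("Sergey", "recurating", "physical specimens"),
    Attribution("Sergey", "describing", "new species"),
    Attribution("Sergey", "authoring", "journal article"),
    Attribution("Sergey", "entering", "collections management database record"),
    Attribution("Sergey", "generating", "label for the re-curated specimen"),
]


def credits_for(agent: str, records) -> list:
    """List the (activity, entity) pairs for which an agent should receive credit."""
    return [(r.activity, r.entity) for r in records if r.agent == agent]


sergey_credits = credits_for("Sergey", ATTRIBUTIONS)
```
</preformat>
<p>In a FAIR-DO setting, the agent and entity strings would typically be replaced by persistent identifiers so that the same credit can be followed across systems.</p>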
</sec>
<sec>
<title>FAIR Data Maturity Model</title>
<p><bold>Output from the RDA FAIR data maturity model WG (<xref ref-type="bibr" rid="B42">RDA FAIR Data Maturity Model 2020</xref>) provides guidelines and specifications to assess the DiSSCo FAIR implementation plan</bold>.</p>
<p>DiSSCo&#8217;s Data Management Plan (<xref ref-type="bibr" rid="B13">DiSSCo DMP 2019, Appendix E</xref>) provides a summary statement of DiSSCo&#8217;s implementation of the FAIR guiding principles. The indicators, priority levels and evaluation methods described by the FAIR Data Maturity Model (DMM) WG (<xref ref-type="bibr" rid="B42">RDA FAIR Data Maturity Model 2020</xref>) were not available during the preparation of the DiSSCo DMP. However, the output is an essential tool for future periodic evaluation of the DMP and FAIR implementation.</p>
<p>As the DiSSCo data infrastructure is FAIR by design, the essential indicators in the DMM are addressed. At the time of writing this article, DiSSCo is at maturity level &#8220;2&#8221; (&#8220;under consideration or in planning phase&#8221;) for all the essential indicators. The DMM also decomposes the text of the FAIR principles to provide further granularity. For instance, the RDA output provides two indicators for FAIR principle F1<xref ref-type="fn" rid="n23">23</xref> (one for persistent identifier and one for globally unique identifier). The DiSSCo DMP addresses F1 as such: &#8220;<italic>A handle is issued to each object published in or by DiSSCo, allowing the object data to be found regardless of its location</italic>&#8221;. Due to our design choice of FAIR Digital Object, DiSSCo addresses both the persistency and uniqueness aspects of F1. For FAIR principle R1, the DMM indicator is: &#8220;<italic>Plurality of accurate and relevant attributes are provided to allow reuse</italic>&#8221;, which is based on &#8220;R1: <italic>(Meta)data are richly described with a plurality of accurate and relevant attributes</italic>&#8221;. For R1, the DiSSCo DMP states:</p>
<list list-type="bullet">
<list-item><p>Each object contains a minimum of mandatory terms consistent with its formal object type definition, with the possibility to include optional additional terms and enrichments as necessary.</p></list-item>
<list-item><p>In the case of Digital Specimen and Digital Collection object types, the minimum of mandatory terms corresponds to the object&#8217;s classification as representing a specific level of digitization according to (respectively) the Minimum Information standard for Digital Specimens (MIDS) and the Minimum Information standard for Digital Collections (MICS).</p></list-item>
</list>
<p>Implementation of MIDS in the digitization process will ensure that enough data are captured, curated and published to make them reusable, thus creating a &#8220;plurality of accurate and relevant attributes&#8221;. As we progress from the design phase to pilot and then implementation, the DMM indicators and evaluation methods can help DiSSCo to create a tailored assessment while keeping the focus on FAIR convergence for cross-disciplinary interoperability (<xref ref-type="bibr" rid="B54">Sustokova et al. 2020</xref>). Similarly, the other indicators mentioned in the DMP are commensurable with the RDA DMM framework.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>In this paper, we have presented how the RDA outputs can be used to create building blocks for research infrastructure architectural design decisions towards FAIR compliance. For DiSSCo, e-services such as ELViS, designed around the concept of Digital Specimens, are planned to improve access to natural science collections across Europe. Aggregation of these Digital Specimens through Digital Collections will enable monitoring tools like the Collection Monitoring Dashboard (CMD) to provide collection overviews and reports that are immensely beneficial for tracking and assessing scientific usage of the collections. The RDA outputs are not just for the access/use part of the data lifecycle. Data enhancement, annotation (using the planned Specimen Data Refinery) and community curation (using the European Curation and Annotation System) are building blocks for the research infrastructure vision of DiSSCo that all depend on these recommendations. Along with the different building blocks, the outputs also highlight the importance of data standards and common practices, which have already been discussed in the ICEDIG project (2018&#8211;2020) and are currently being further studied in DiSSCo Prepare (2020&#8211;2023).</p>
<p>The ideas expressed here are still in the design and/or conception stage and need to be further fleshed out to support the DiSSCo implementation and construction phase. Some of the RDA outputs are similarly conceptual in nature, so organizing workshops and technical hackathons through RDA can help DiSSCo to further clarify, test and refine the concepts. DiSSCo experts regularly participate in RDA, and collaboration with other disciplines through RDA can also provide learning opportunities and help us identify potential issues and risks in our concepts. There are other outputs &#8211; such as the outputs of the PID Information Types WG (<xref ref-type="bibr" rid="B44">RDA PID Information Types 2015</xref>) and the Data Type Registries WG (<xref ref-type="bibr" rid="B39">RDA DTR 2015</xref>) &#8211; that we are still exploring.</p>
<p>The RDA recommendations and the broader global expertise represented therein enable us to design and build a robust, FAIR Digital Object based data infrastructure. We envision that this new infrastructure will be essential in supporting the next phase in the digital transformation of collections-based science to widen access and better enable the production of data and knowledge about the 1.5 billion physical specimens in the European natural science collections.</p>
</sec>
</body>
<back>
<fn-group>
<fn id="n1"><p>Distributed System of Scientific Collections (DiSSCo): <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://dissco.eu">https://dissco.eu</ext-link>.</p></fn>
<fn id="n2"><p>At the TDWG 2020 conference (<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.tdwg.org/conferences/2020/working-sessions/#bof01">https://www.tdwg.org/conferences/2020/working-sessions/#bof01</ext-link>) the similarities and differences between Digital Specimen and Extended Specimen concepts were explored and plans for further global collaboration towards a global standard were discussed.</p></fn>
<fn id="n3"><p><ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.gbif.org/">https://www.gbif.org/</ext-link>.</p></fn>
<fn id="n4"><p><ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.biodiversitylibrary.org/">https://www.biodiversitylibrary.org/</ext-link>.</p></fn>
<fn id="n5"><p><ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://plazi.org/">http://plazi.org/</ext-link>.</p></fn>
<fn id="n6"><p>Current ELViS development activities can be found in GitHub: <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://github.com/DiSSCo/ELViS/wiki">https://github.com/DiSSCo/ELViS/wiki</ext-link>.</p></fn>
<fn id="n7"><p><ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="ICEDIG">ICEDIG</ext-link> &#8211; &#8220;Innovation and consolidation for large scale digitisation of natural heritage&#8221; &#8211; is an EU-funded project (grant agreement number 777483) that aims at supporting the implementation phase of the new Research Infrastructure DiSSCo (&#8220;Distributed System of Scientific Collections&#8221;) by designing and addressing the technical, financial, policy and governance aspects necessary to operate such a large distributed initiative for natural sciences collections across Europe.</p></fn>
<fn id="n8"><p>We define &#8216;FAIRness&#8217; as a characteristic exhibited by an infrastructure (or component thereof) when it achieves and maintains compliance with the FAIR Guiding Principles.</p></fn>
<fn id="n9"><p>C2CAMP: (Cross-continental Collection Access and Management Pilot) <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.go-fair.org/implementation-networks/overview/c2camp/">https://www.go-fair.org/implementation-networks/overview/c2camp/</ext-link>.</p></fn>
<fn id="n10"><p>Group of European Data Experts in RDA: <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.rd-alliance.org/groups/gede-group-european-data-experts-rda">https://www.rd-alliance.org/groups/gede-group-european-data-experts-rda</ext-link>.</p></fn>
<fn id="n11"><p><ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.go-fair.org/">https://www.go-fair.org/</ext-link>.</p></fn>
<fn id="n12"><p>The subject of ambiguity looms large when it comes to referring to things by species name (each plant, animal, fossil or rock/mineral specimen typically is labelled with its scientific name if known). Scientific names are immensely useful for taxonomic research and in particular for describing the object. However, they are not usable for disambiguation or by machine-actionable services (see <xref ref-type="bibr" rid="B38">Patterson et al. 2016</xref>, <xref ref-type="bibr" rid="B18">Guralnick et al. 2015</xref>, <xref ref-type="bibr" rid="B51">Sterner and Franz 2017</xref>).</p></fn>
<fn id="n13"><p><ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.cnri.reston.va.us/">https://www.cnri.reston.va.us/</ext-link>.</p></fn>
<fn id="n14"><p><ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.doi.org/doi_handbook/7_IDF.html">https://www.doi.org/doi_handbook/7_IDF.html</ext-link>.</p></fn>
<fn id="n15"><p><ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.grid.ac/">https://www.grid.ac/</ext-link>.</p></fn>
<fn id="n16"><p><ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://ror.org/">https://ror.org/</ext-link>.</p></fn>
<fn id="n17"><p><ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://orcid.org/">https://orcid.org/</ext-link>.</p></fn>
<fn id="n18"><p><ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.dissco.eu/dissco-prepare-work-programme/">https://www.dissco.eu/dissco-prepare-work-programme/</ext-link>.</p></fn>
<fn id="n19"><p>Work on the first of these specifications, openDS, is at an early stage, while work on the MIDS standard is more mature, having recently (September 2020) been taken up by TDWG &#8211; the community group responsible for Biodiversity Information Standards &#8211; in a new Task Group on Minimum Information about a Digital Specimen (MIDS), <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.tdwg.org/community/cd/mids/">https://www.tdwg.org/community/cd/mids/</ext-link>.</p></fn>
<fn id="n20"><p><ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.dona.net/sites/default/files/2018-11/DOIPv2Spec_1.pdf">https://www.dona.net/sites/default/files/2018-11/DOIPv2Spec_1.pdf</ext-link>.</p></fn>
<fn id="n21"><p>Collection Descriptions Interest Group in Biodiversity Information Standards (TDWG): <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://github.com/tdwg/cd">https://github.com/tdwg/cd</ext-link>.</p></fn>
<fn id="n22"><p>DiSSCo user stories in GitHub: <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://github.com/DiSSCo/user-stories/">https://github.com/DiSSCo/user-stories/</ext-link>.</p></fn>
<fn id="n23"><p>F1: (Meta)data are assigned a globally unique and persistent identifier.</p></fn>
</fn-group>
<ack>
<title>Acknowledgements</title>
<p>This paper was supported by the RDA Europe 4.0 project that has received funding from the European Union&#8217;s Horizon 2020 research and innovation programme under grant agreement No 777388. The DiSSCo Prepare project has received funding from the European Union&#8217;s Horizon 2020 research and innovation programme under grant agreement No 871043.</p>
</ack>
<sec>
<title>Competing Interests</title>
<p>The authors have no competing interests to declare.</p>
</sec>
<ref-list>
<ref id="B1"><label>1</label><mixed-citation publication-type="journal"><string-name><surname>Addink</surname>, <given-names>W</given-names></string-name>, et al. <year>2020</year>. <article-title>Advancing the Catalogue of the World&#8217;s Natural History Collections &#8211; 10 recommendations from DiSSCo</article-title>. DOI: <pub-id pub-id-type="doi">10.5281/zenodo.3949839</pub-id></mixed-citation></ref>
<ref id="B2"><label>2</label><mixed-citation publication-type="journal"><string-name><surname>Agosti</surname>, <given-names>D</given-names></string-name>, et al. <year>2019</year>. <article-title>Biodiversity Literature Repository (BLR), a repository for FAIR data and publications</article-title>. <source>Biodiversity Information Science and Standards</source>. DOI: <pub-id pub-id-type="doi">10.3897/biss.3.37197</pub-id></mixed-citation></ref>
<ref id="B3"><label>3</label><mixed-citation publication-type="journal"><string-name><surname>Bechhofer</surname>, <given-names>S</given-names></string-name>, et al. <year>2013</year>. <article-title>Why linked data is not enough for scientists</article-title>. <source>Future Generation Computer Systems</source>, <volume>29</volume>(<issue>2</issue>): <fpage>599</fpage>&#8211;<lpage>611</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.future.2011.08.004</pub-id></mixed-citation></ref>
<ref id="B4"><label>4</label><mixed-citation publication-type="journal"><string-name><surname>Besnard</surname>, <given-names>G</given-names></string-name>, et al. <year>2016</year>. <article-title>Valuing museum specimens: high-throughput DNA sequencing on historical collections of New Guinea crowned pigeons (Goura)</article-title>. <source>Biological Journal of the Linnean Society</source>, <volume>117</volume>(<issue>1</issue>): <fpage>71</fpage>&#8211;<lpage>82</lpage>. DOI: <pub-id pub-id-type="doi">10.1111/bij.12494</pub-id></mixed-citation></ref>
<ref id="B5"><label>5</label><mixed-citation publication-type="journal"><string-name><surname>Blagoderov</surname>, <given-names>V</given-names></string-name>, <string-name><surname>Kitching</surname>, <given-names>IJ</given-names></string-name>, <string-name><surname>Livermore</surname>, <given-names>L</given-names></string-name>, <string-name><surname>Simonsen</surname>, <given-names>TJ</given-names></string-name> and <string-name><surname>Smith</surname>, <given-names>VS</given-names></string-name>. <year>2012</year>. <article-title>No specimen left behind: industrial scale digitization of natural history collections</article-title>. <source>ZooKeys</source>, <volume>209</volume>: <fpage>133</fpage>. DOI: <pub-id pub-id-type="doi">10.3897/zookeys.209.3178</pub-id></mixed-citation></ref>
<ref id="B6"><label>6</label><mixed-citation publication-type="journal"><string-name><surname>Cazenave</surname>, <given-names>N</given-names></string-name>, <string-name><surname>B&#233;chard</surname>, <given-names>L</given-names></string-name> and <string-name><surname>Rouchon</surname>, <given-names>O</given-names></string-name>. <year>2019</year>. <article-title>Digitisation infrastructure design for EUDAT/CINES. ICEDIG Deliverable 6.2</article-title>. DOI: <pub-id pub-id-type="doi">10.5281/zenodo.3364533</pub-id></mixed-citation></ref>
<ref id="B7"><label>7</label><mixed-citation publication-type="journal"><string-name><surname>Chapman</surname>, <given-names>A</given-names></string-name>, et al. <year>2020</year>. <article-title>Developing standards for improved data quality and for selecting fit for use biodiversity data</article-title>. <source>Biodiversity Information Science and Standards</source>, <volume>4</volume>: <fpage>e50889</fpage>. DOI: <pub-id pub-id-type="doi">10.3897/biss.4.50889</pub-id></mixed-citation></ref>
<ref id="B8"><label>8</label><mixed-citation publication-type="journal"><string-name><surname>Cocks</surname>, <given-names>N</given-names></string-name>, <string-name><surname>Livermore</surname>, <given-names>L</given-names></string-name>, <string-name><surname>Smith</surname>, <given-names>V</given-names></string-name> and <string-name><surname>Woodburn</surname>, <given-names>M</given-names></string-name>. <year>2020</year>. <article-title>Technical capacities of digitisation centres within ICEDIG participating institutions</article-title>. <source>Research Ideas and Outcomes</source>, <volume>6</volume>: <fpage>e55522</fpage>. DOI: <pub-id pub-id-type="doi">10.3897/rio.6.e55522</pub-id></mixed-citation></ref>
<ref id="B9"><label>9</label><mixed-citation publication-type="journal"><string-name><surname>Cook</surname>, <given-names>JA</given-names></string-name>, et al. <year>2020</year>. <article-title>Integrating Biodiversity Infrastructure into Pathogen Discovery and Mitigation of Emerging Infectious Diseases</article-title>. <source>BioScience</source>, <volume>70</volume>(<issue>7</issue>): <fpage>531</fpage>&#8211;<lpage>534</lpage>. DOI: <pub-id pub-id-type="doi">10.1093/biosci/biaa064</pub-id></mixed-citation></ref>
<ref id="B10"><label>10</label><mixed-citation publication-type="journal"><string-name><surname>Copas</surname>, <given-names>K</given-names></string-name>, <string-name><surname>Noesgaard</surname>, <given-names>D</given-names></string-name> and <string-name><surname>Schigel</surname>, <given-names>D</given-names></string-name>. <year>2019</year>. <article-title>Crediting the reuse and impact of free, FAIR and open biodiversity data through DOI citations and event tracking</article-title>. <source>AGUFM</source>, 2019: <fpage>IN21A</fpage>&#8211;<lpage>06</lpage>.</mixed-citation></ref>
<ref id="B11"><label>11</label><mixed-citation publication-type="journal"><string-name><surname>De Smedt</surname>, <given-names>K</given-names></string-name>, <string-name><surname>Koureas</surname>, <given-names>D</given-names></string-name> and <string-name><surname>Wittenburg</surname>, <given-names>P</given-names></string-name>. <year>2020</year>. <article-title>FAIR Digital Objects for Science: From Data Pieces to Actionable Knowledge Units</article-title>. <source>Publications</source>, <volume>8</volume>(<issue>2</issue>): <fpage>21</fpage>. DOI: <pub-id pub-id-type="doi">10.3390/publications8020021</pub-id></mixed-citation></ref>
<ref id="B12"><label>12</label><mixed-citation publication-type="journal"><string-name><surname>D&#237;az</surname>, <given-names>S</given-names></string-name>, et al. <year>2016</year>. <article-title>The global spectrum of plant form and function</article-title>. <source>Nature</source>, <volume>529</volume>(<issue>7585</issue>): <fpage>167</fpage>&#8211;<lpage>171</lpage>. DOI: <pub-id pub-id-type="doi">10.1038/nature16489</pub-id></mixed-citation></ref>
<ref id="B13"><label>13</label><mixed-citation publication-type="journal"><collab>DiSSCo DMP</collab>. <year>2019</year>. <article-title>Provisional Data Management Plan for the DiSSCo infrastructure</article-title>. DOI: <pub-id pub-id-type="doi">10.5281/zenodo.3532937</pub-id></mixed-citation></ref>
<ref id="B14"><label>14</label><mixed-citation publication-type="journal"><collab>European Commission</collab>. <year>2018</year>. <article-title>Turning FAIR into reality. Final Report and Action Plan from the European Commission Expert Group on FAIR data</article-title>. <source>Luxembourg Publication Office of the European Union, Luxembourg</source>, <volume>78</volume>. DOI: <pub-id pub-id-type="doi">10.2777/1524</pub-id></mixed-citation></ref>
<ref id="B15"><label>15</label><mixed-citation publication-type="journal"><string-name><surname>Grobe</surname>, <given-names>P</given-names></string-name>, et al. <year>2019</year>. <article-title>From Data to Knowledge: A semantic knowledge graph application for curating specimen data</article-title>. <source>Biodiversity Information Science and Standards</source>. DOI: <pub-id pub-id-type="doi">10.3897/biss.3.37412</pub-id></mixed-citation></ref>
<ref id="B16"><label>16</label><mixed-citation publication-type="journal"><string-name><surname>Groom</surname>, <given-names>Q</given-names></string-name>, <string-name><surname>Dillen</surname>, <given-names>M</given-names></string-name>, <string-name><surname>Hardy</surname>, <given-names>H</given-names></string-name>, <string-name><surname>Phillips</surname>, <given-names>S</given-names></string-name>, <string-name><surname>Willemse</surname>, <given-names>L</given-names></string-name> and <string-name><surname>Wu</surname>, <given-names>Z</given-names></string-name>. <year>2019</year>. <article-title>Improved standardization of transcribed digital specimen data</article-title>. <source>Database</source>, 2019. DOI: <pub-id pub-id-type="doi">10.1093/database/baz129</pub-id></mixed-citation></ref>
<ref id="B17"><label>17</label><mixed-citation publication-type="journal"><string-name><surname>G&#252;ntsch</surname>, <given-names>A</given-names></string-name>, et al. <year>2017</year>. <article-title>Actionable, long-term stable and semantic web compatible identifiers for access to biological collection objects</article-title>. <source>Database</source>, 2017. DOI: <pub-id pub-id-type="doi">10.1093/database/bax003</pub-id></mixed-citation></ref>
<ref id="B18"><label>18</label><mixed-citation publication-type="journal"><string-name><surname>Guralnick</surname>, <given-names>RP</given-names></string-name>, et al. <year>2015</year>. <article-title>Community next steps for making globally unique identifiers work for biocollections data</article-title>. <source>ZooKeys</source>, <volume>494</volume>: <fpage>133</fpage>. DOI: <pub-id pub-id-type="doi">10.3897/zookeys.494.9352</pub-id></mixed-citation></ref>
<ref id="B19"><label>19</label><mixed-citation publication-type="journal"><string-name><surname>Hardisty</surname>, <given-names>A</given-names></string-name>, et al. <year>2020</year>. <article-title>Conceptual design blueprint for the DiSSCo digitization infrastructure - DELIVERABLE D8.1</article-title>. <source>Research Ideas and Outcomes</source>, <volume>6</volume>. DOI: <pub-id pub-id-type="doi">10.3897/rio.6.e54280</pub-id></mixed-citation></ref>
<ref id="B20"><label>20</label><mixed-citation publication-type="journal"><string-name><surname>Hardisty</surname>, <given-names>A</given-names></string-name> and <string-name><surname>Roberts</surname>, <given-names>D</given-names></string-name>. <year>2013</year>. <article-title>A decadal view of biodiversity informatics: challenges and priorities</article-title>. <source>BMC Ecology</source>, <volume>13</volume>(<issue>1</issue>): <fpage>16</fpage>. DOI: <pub-id pub-id-type="doi">10.1186/1472-6785-13-16</pub-id></mixed-citation></ref>
<ref id="B21"><label>21</label><mixed-citation publication-type="journal"><string-name><surname>Hedrick</surname>, <given-names>BP</given-names></string-name>, et al. <year>2020</year>. <article-title>Digitization and the future of natural history collections</article-title>. <source>BioScience</source>, <volume>70</volume>(<issue>3</issue>): <fpage>243</fpage>&#8211;<lpage>251</lpage>. DOI: <pub-id pub-id-type="doi">10.1093/biosci/biz163</pub-id></mixed-citation></ref>
<ref id="B22"><label>22</label><mixed-citation publication-type="journal"><string-name><surname>Hobern</surname>, <given-names>D</given-names></string-name>, et al. <year>2019</year>. <article-title>Connecting data and expertise: a new alliance for biodiversity knowledge</article-title>. <source>Biodiversity Data Journal</source>, <volume>7</volume>. DOI: <pub-id pub-id-type="doi">10.3897/BDJ.7.e33679.suppl10</pub-id></mixed-citation></ref>
<ref id="B23"><label>23</label><mixed-citation publication-type="journal"><string-name><surname>Kahn</surname>, <given-names>R</given-names></string-name> and <string-name><surname>Wilensky</surname>, <given-names>R</given-names></string-name>. <year>2006</year>. <article-title>A framework for distributed digital object services</article-title>. <source>International Journal on Digital Libraries</source>, <volume>6</volume>(<issue>2</issue>): <fpage>115</fpage>&#8211;<lpage>123</lpage>. DOI: <pub-id pub-id-type="doi">10.1007/s00799-005-0128-x</pub-id></mixed-citation></ref>
<ref id="B24"><label>24</label><mixed-citation publication-type="journal"><string-name><surname>Kays</surname>, <given-names>R</given-names></string-name>, et al. <year>2020</year>. <article-title>An empirical evaluation of camera trap study design: How many, how long and when?</article-title> <source>Methods in Ecology and Evolution</source>. DOI: <pub-id pub-id-type="doi">10.1111/2041-210X.13370</pub-id></mixed-citation></ref>
<ref id="B25"><label>25</label><mixed-citation publication-type="journal"><string-name><surname>Kays</surname>, <given-names>R</given-names></string-name>, <string-name><surname>McShea</surname>, <given-names>WJ</given-names></string-name> and <string-name><surname>Wikelski</surname>, <given-names>M</given-names></string-name>. <year>2020</year>. <article-title>Born-digital biodiversity data: Millions and billions</article-title>. <source>Diversity and Distributions</source>, <volume>26</volume>(<issue>5</issue>): <fpage>644</fpage>&#8211;<lpage>648</lpage>. DOI: <pub-id pub-id-type="doi">10.1111/ddi.12993</pub-id></mixed-citation></ref>
<ref id="B26"><label>26</label><mixed-citation publication-type="journal"><string-name><surname>Lannom</surname>, <given-names>L</given-names></string-name>, <string-name><surname>Koureas</surname>, <given-names>D</given-names></string-name> and <string-name><surname>Hardisty</surname>, <given-names>AR</given-names></string-name>. <year>2020</year>. <article-title>FAIR data and services in biodiversity science and geoscience</article-title>. <source>Data Intelligence</source>, <volume>2</volume>(<issue>1&#8211;2</issue>): <fpage>122</fpage>&#8211;<lpage>130</lpage>. DOI: <pub-id pub-id-type="doi">10.1162/dint_a_00034</pub-id></mixed-citation></ref>
<ref id="B27"><label>27</label><mixed-citation publication-type="journal"><string-name><surname>Lendemer</surname>, <given-names>J</given-names></string-name>, et al. <year>2020</year>. <article-title>The extended specimen network: A strategy to enhance US biodiversity collections, promote research and education</article-title>. <source>BioScience</source>, <volume>70</volume>(<issue>1</issue>): <fpage>23</fpage>&#8211;<lpage>30</lpage>. DOI: <pub-id pub-id-type="doi">10.1093/biosci/biz165</pub-id></mixed-citation></ref>
<ref id="B28"><label>28</label><mixed-citation publication-type="book"><string-name><surname>Leonelli</surname>, <given-names>S</given-names></string-name>. <year>2016</year>. <source>Data-centric biology: A philosophical study</source>. <publisher-name>University of Chicago Press</publisher-name>. DOI: <pub-id pub-id-type="doi">10.7208/chicago/9780226416502.001.0001</pub-id></mixed-citation></ref>
<ref id="B29"><label>29</label><mixed-citation publication-type="journal"><string-name><surname>Lewis</surname>, <given-names>KP</given-names></string-name>, <string-name><surname>Vander Wal</surname>, <given-names>E</given-names></string-name> and <string-name><surname>Fifield</surname>, <given-names>DA</given-names></string-name>. <year>2018</year>. <article-title>Wildlife biology, big data, and reproducible research</article-title>. <source>Wildlife Society Bulletin</source>, <volume>42</volume>(<issue>1</issue>): <fpage>172</fpage>&#8211;<lpage>179</lpage>. DOI: <pub-id pub-id-type="doi">10.1002/wsb.847</pub-id></mixed-citation></ref>
<ref id="B30"><label>30</label><mixed-citation publication-type="journal"><string-name><surname>Lister</surname>, <given-names>AM</given-names></string-name> and <collab>Climate Change Research Group</collab>. <year>2011</year>. <article-title>Natural history collections as sources of long-term datasets</article-title>. <source>Trends in Ecology &amp; Evolution</source>, <volume>26</volume>(<issue>4</issue>): <fpage>153</fpage>&#8211;<lpage>154</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.tree.2010.12.009</pub-id></mixed-citation></ref>
<ref id="B31"><label>31</label><mixed-citation publication-type="journal"><string-name><surname>Livermore</surname>, <given-names>L</given-names></string-name> and <string-name><surname>Cubey</surname>, <given-names>R</given-names></string-name>. <year>2019</year>. <article-title>Specimen Data Refinery: A landscape analysis on machine learning, computer vision and automated approaches to capture specimen metadata</article-title>. <source>Biodiversity Information Science and Standards</source>. DOI: <pub-id pub-id-type="doi">10.3897/biss.3.37647</pub-id></mixed-citation></ref>
<ref id="B32"><label>32</label><mixed-citation publication-type="book"><string-name><surname>Martin</surname>, <given-names>P</given-names></string-name>, <string-name><surname>Chen</surname>, <given-names>Y</given-names></string-name>, <string-name><surname>Hardisty</surname>, <given-names>A</given-names></string-name>, <string-name><surname>Jeffery</surname>, <given-names>K</given-names></string-name> and <string-name><surname>Zhao</surname>, <given-names>Z</given-names></string-name>. <year>2017</year>. <chapter-title>Computational Challenges in Global Environmental Research Infrastructures</chapter-title>. In: <source>Terrestrial Ecosystem Research Infrastructures: Challenges and Opportunities</source>, <string-name><surname>Chabbi</surname>, <given-names>A</given-names></string-name> and <string-name><surname>Loescher</surname>, <given-names>HW</given-names></string-name> (eds.). <publisher-name>CRC Press</publisher-name> ISBN 9781498751315. DOI: <pub-id pub-id-type="doi">10.1201/9781315368252</pub-id></mixed-citation></ref>
<ref id="B33"><label>33</label><mixed-citation publication-type="journal"><string-name><surname>Mons</surname>, <given-names>B</given-names></string-name>, et al. <year>2017</year>. <article-title>Cloudy, increasingly FAIR; revisiting the FAIR Data guiding principles for the European Open Science Cloud</article-title>. <source>Information Services &amp; Use</source>, <volume>37</volume>(<issue>1</issue>): <fpage>49</fpage>&#8211;<lpage>56</lpage>. DOI: <pub-id pub-id-type="doi">10.3233/ISU-170824</pub-id></mixed-citation></ref>
<ref id="B34"><label>34</label><mixed-citation publication-type="journal"><string-name><surname>Nachman</surname>, <given-names>MW</given-names></string-name>. <year>2013</year>. <article-title>Genomics and museum specimens</article-title>. <source>Molecular Ecology</source>, <volume>22</volume>(<issue>24</issue>): <fpage>5966</fpage>&#8211;<lpage>5968</lpage>. DOI: <pub-id pub-id-type="doi">10.1111/mec.12563</pub-id></mixed-citation></ref>
<ref id="B35"><label>35</label><mixed-citation publication-type="journal"><string-name><surname>Nelson</surname>, <given-names>G</given-names></string-name> and <string-name><surname>Ellis</surname>, <given-names>S</given-names></string-name>. <year>2019</year>. <article-title>The history and impact of digitization and digital data mobilization on biodiversity research</article-title>. <source>Philosophical Transactions of the Royal Society B</source>, <volume>374</volume>(<issue>1763</issue>): <elocation-id>20170391</elocation-id>. DOI: <pub-id pub-id-type="doi">10.1098/rstb.2017.0391</pub-id></mixed-citation></ref>
<ref id="B36"><label>36</label><mixed-citation publication-type="book"><string-name><surname>Nieva de la Hidalga</surname>, <given-names>A</given-names></string-name>, <string-name><surname>Hardisty</surname>, <given-names>A</given-names></string-name>, <string-name><surname>Martin</surname>, <given-names>P</given-names></string-name>, <string-name><surname>Magagna</surname>, <given-names>B</given-names></string-name> and <string-name><surname>Zhao</surname>, <given-names>Z</given-names></string-name>. <year>2020</year>. <chapter-title>The ENVRI Reference Model</chapter-title>. In: <source>Towards Interoperable research infrastructures for environmental and earth sciences &#8211; A reference model guided approach for common challenges</source>, <string-name><surname>Zhao</surname>, <given-names>Z</given-names></string-name> and <string-name><surname>Hellstrom</surname>, <given-names>M</given-names></string-name> (eds.). <publisher-name>LNCS</publisher-name> 12003, <fpage>61</fpage>&#8211;<lpage>81</lpage>. DOI: <pub-id pub-id-type="doi">10.1007/978-3-030-52829-4_4</pub-id></mixed-citation></ref>
<ref id="B37"><label>37</label><mixed-citation publication-type="journal"><string-name><surname>Page</surname>, <given-names>R</given-names></string-name>. <year>2016</year>. <article-title>Towards a biodiversity knowledge graph</article-title>. <source>Research Ideas and Outcomes</source>, <volume>2</volume>. DOI: <pub-id pub-id-type="doi">10.3897/rio.2.e8767</pub-id></mixed-citation></ref>
<ref id="B38"><label>38</label><mixed-citation publication-type="journal"><string-name><surname>Patterson</surname>, <given-names>D</given-names></string-name>, <string-name><surname>Mozzherin</surname>, <given-names>D</given-names></string-name>, <string-name><surname>Shorthouse</surname>, <given-names>DP</given-names></string-name> and <string-name><surname>Thessen</surname>, <given-names>A</given-names></string-name>. <year>2016</year>. <article-title>Challenges with using names to link digital biodiversity information</article-title>. <source>Biodiversity Data Journal</source>, <volume>4</volume>. DOI: <pub-id pub-id-type="doi">10.3897/BDJ.4.e8080</pub-id></mixed-citation></ref>
<ref id="B39"><label>39</label><mixed-citation publication-type="journal"><collab>RDA DTR</collab>. <year>2015</year>. <article-title>Data Type Registries working group output</article-title>. DOI: <pub-id pub-id-type="doi">10.15497/A5BCD108-ECC4-41BE-91A7-20112FF77458</pub-id></mixed-citation></ref>
<ref id="B40"><label>40</label><mixed-citation publication-type="journal"><collab>RDA DF&amp;T</collab>. <year>2015</year>. <article-title>Data Foundation and Terminology Work Group Products</article-title>. <source>Research Data Alliance</source>. DOI: <pub-id pub-id-type="doi">10.15497/06825049-8CA4-40BD-BCAF-DE9F0EA2FADF</pub-id></mixed-citation></ref>
<ref id="B41"><label>41</label><mixed-citation publication-type="journal"><collab>RDA DFIG</collab>. <year>2018</year>. <article-title>Data Fabric Interest Group; Summary of Virtual Layer Recommendations</article-title>. <source>Research Data Alliance</source>. DOI: <pub-id pub-id-type="doi">10.15497/RDA00026</pub-id></mixed-citation></ref>
<ref id="B42"><label>42</label><mixed-citation publication-type="journal"><collab>RDA FAIR Data Maturity Model</collab>. <year>2020</year>. <article-title>FAIR Data Maturity Model: specification and guidelines</article-title>. <source>Research Data Alliance</source>. DOI: <pub-id pub-id-type="doi">10.15497/RDA00050</pub-id></mixed-citation></ref>
<ref id="B43"><label>43</label><mixed-citation publication-type="journal"><collab>RDA PID KI</collab>. <year>2019</year>. <article-title>RDA Recommendation on PID Kernel Information</article-title>. <source>Research Data Alliance</source>. DOI: <pub-id pub-id-type="doi">10.15497/RDA00031</pub-id></mixed-citation></ref>
<ref id="B44"><label>44</label><mixed-citation publication-type="journal"><collab>RDA PID Information Types</collab>. <year>2015</year>. <article-title>Final deliverable</article-title>. <source>Research Data Alliance</source>. DOI: <pub-id pub-id-type="doi">10.15497/FDAA09D5-5ED0-403D-B97A-2675E1EBE786</pub-id></mixed-citation></ref>
<ref id="B45"><label>45</label><mixed-citation publication-type="journal"><collab>RDA Research Data Collections</collab>. <year>2017</year>. <article-title>Recommendation on Research Data Collections</article-title>. <source>Research Data Alliance</source>. DOI: <pub-id pub-id-type="doi">10.15497/RDA00022</pub-id></mixed-citation></ref>
<ref id="B46"><label>46</label><mixed-citation publication-type="journal"><collab>RDA/TDWG Attribution Metadata</collab>. <year>2018</year>. <article-title>Final Recommendations</article-title>. <source>Research Data Alliance</source>. DOI: <pub-id pub-id-type="doi">10.15497/RDA00029</pub-id></mixed-citation></ref>
<ref id="B47"><label>47</label><mixed-citation publication-type="journal"><string-name><surname>Schindel</surname>, <given-names>DE</given-names></string-name> and <string-name><surname>Cook</surname>, <given-names>JA</given-names></string-name>. <year>2018</year>. <article-title>The next generation of natural history collections</article-title>. <source>PLoS Biology</source>, <volume>16</volume>(<issue>7</issue>): <fpage>e2006125</fpage>. DOI: <pub-id pub-id-type="doi">10.1371/journal.pbio.2006125</pub-id></mixed-citation></ref>
<ref id="B48"><label>48</label><mixed-citation publication-type="journal"><string-name><surname>Senderov</surname>, <given-names>V</given-names></string-name>, et al. <year>2018</year>. <article-title>OpenBiodiv-O: ontology of the OpenBiodiv knowledge management system</article-title>. <source>Journal of Biomedical Semantics</source>, <volume>9</volume>(<issue>1</issue>): <fpage>5</fpage>. DOI: <pub-id pub-id-type="doi">10.1186/s13326-017-0174-5</pub-id></mixed-citation></ref>
<ref id="B49"><label>49</label><mixed-citation publication-type="webpage"><string-name><surname>Sharp</surname>, <given-names>C</given-names></string-name>. <year>2016</year>. <article-title>Overview of the digital object architecture (DOA)</article-title>. <source>An Internet Society Information Paper, Internet Society</source>. Retrieved from <uri>https://www.internetsociety.org/resources/doc/2016/overview-of-the-digital-object-architecture-doa/</uri>.</mixed-citation></ref>
<ref id="B50"><label>50</label><mixed-citation publication-type="journal"><string-name><surname>Smith</surname>, <given-names>V</given-names></string-name>, et al. <year>2019</year>. <article-title>SYNTHESYS+ Abridged Grant Proposal</article-title>. <source>Research Ideas and Outcomes</source>, <volume>5</volume>: <fpage>e46404</fpage>. DOI: <pub-id pub-id-type="doi">10.3897/rio.5.e46404</pub-id></mixed-citation></ref>
<ref id="B51"><label>51</label><mixed-citation publication-type="journal"><string-name><surname>Sterner</surname>, <given-names>B</given-names></string-name> and <string-name><surname>Franz</surname>, <given-names>NM</given-names></string-name>. <year>2017</year>. <article-title>Taxonomy for humans or computers? Cognitive pragmatics for big data</article-title>. <source>Biological Theory</source>, <volume>12</volume>(<issue>2</issue>): <fpage>99</fpage>&#8211;<lpage>111</lpage>. DOI: <pub-id pub-id-type="doi">10.1007/s13752-017-0259-5</pub-id></mixed-citation></ref>
<ref id="B52"><label>52</label><mixed-citation publication-type="journal"><string-name><surname>Stocker</surname>, <given-names>M</given-names></string-name>, et al. <year>2018</year>. <article-title>Curating Scientific Information in Knowledge Infrastructures</article-title>. <source>Data Science Journal</source>, <volume>17</volume>(<issue>21</issue>): <fpage>1</fpage>&#8211;<lpage>16</lpage>. DOI: <pub-id pub-id-type="doi">10.5334/dsj-2018-021</pub-id></mixed-citation></ref>
<ref id="B53"><label>53</label><mixed-citation publication-type="journal"><string-name><surname>Sun</surname>, <given-names>S</given-names></string-name>, <string-name><surname>Lannom</surname>, <given-names>L</given-names></string-name> and <string-name><surname>Boesch</surname>, <given-names>B</given-names></string-name>. <year>2003</year>. <article-title>Handle System Overview, RFC 3650</article-title>. DOI: <pub-id pub-id-type="doi">10.17487/rfc3650</pub-id></mixed-citation></ref>
<ref id="B54"><label>54</label><mixed-citation publication-type="journal"><string-name><surname>Sustkova</surname>, <given-names>HP</given-names></string-name>, <string-name><surname>Hettne</surname>, <given-names>KM</given-names></string-name>, <string-name><surname>Wittenburg</surname>, <given-names>P</given-names></string-name>, <string-name><surname>Jacobsen</surname>, <given-names>A</given-names></string-name>, <string-name><surname>Kuhn</surname>, <given-names>T</given-names></string-name>, <string-name><surname>Pergl</surname>, <given-names>R</given-names></string-name>, <string-name><surname>Slifka</surname>, <given-names>J</given-names></string-name>, <string-name><surname>McQuilton</surname>, <given-names>P</given-names></string-name>, <string-name><surname>Magagna</surname>, <given-names>B</given-names></string-name>, <string-name><surname>Sansone</surname>, <given-names>SA</given-names></string-name> and <string-name><surname>Stocker</surname>, <given-names>M</given-names></string-name>. <year>2020</year>. <article-title>FAIR convergence matrix: Optimizing the reuse of existing FAIR-related resources</article-title>. <source>Data Intelligence</source>, <volume>2</volume>(<issue>1&#8211;2</issue>): <fpage>158</fpage>&#8211;<lpage>170</lpage>. DOI: <pub-id pub-id-type="doi">10.1162/dint_a_00038</pub-id></mixed-citation></ref>
<ref id="B55"><label>55</label><mixed-citation publication-type="book"><string-name><surname>Webster</surname>, <given-names>MS</given-names></string-name>. (ed.) <year>2017</year>. <source>The extended specimen: emerging frontiers in collections-based ornithological research</source>. <publisher-name>CRC Press</publisher-name>.</mixed-citation></ref>
<ref id="B56"><label>56</label><mixed-citation publication-type="journal"><string-name><surname>Weigel</surname>, <given-names>T</given-names></string-name>, et al. <year>2020</year>. <article-title>Making data and workflows findable for machines</article-title>. <source>Data Intelligence</source>, <volume>2</volume>(<issue>1&#8211;2</issue>): <fpage>40</fpage>&#8211;<lpage>46</lpage>. DOI: <pub-id pub-id-type="doi">10.1162/dint_a_00026</pub-id></mixed-citation></ref>
<ref id="B57"><label>57</label><mixed-citation publication-type="journal"><string-name><surname>Wilkinson</surname>, <given-names>M</given-names></string-name>, <string-name><surname>Dumontier</surname>, <given-names>M</given-names></string-name>, <string-name><surname>Aalbersberg</surname>, <given-names>I</given-names></string-name>, et al. <year>2016</year>. <article-title>The FAIR Guiding Principles for scientific data management and stewardship</article-title>. <source>Scientific Data</source>, <volume>3</volume>(<issue>1</issue>): <fpage>1</fpage>&#8211;<lpage>9</lpage>. DOI: <pub-id pub-id-type="doi">10.1038/sdata.2016.18</pub-id></mixed-citation></ref>
<ref id="B58"><label>58</label><mixed-citation publication-type="journal"><string-name><surname>Wittenburg</surname>, <given-names>P</given-names></string-name>, et al. <year>2019</year>. <article-title>Digital objects as drivers towards convergence in data infrastructures</article-title>. <source>Technical paper</source>. DOI: <pub-id pub-id-type="doi">10.23728/b2share.b605d85809ca45679b110719b6c6cb11</pub-id></mixed-citation></ref>
<ref id="B59"><label>59</label><mixed-citation publication-type="journal"><string-name><surname>Wittenburg</surname>, <given-names>P</given-names></string-name> and <string-name><surname>Strawn</surname>, <given-names>G</given-names></string-name>. <year>2019</year>. <article-title>Commenting on &#8220;Digital Object&#8221; Aspects</article-title>. DOI: <pub-id pub-id-type="doi">10.23728/b2share.2317b12321764f669c92ebbcf7518164</pub-id></mixed-citation></ref>
</ref-list>
</back>
</article>