<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "http://jats.nlm.nih.gov/publishing/1.0/JATS-journalpublishing1.dtd">
<!--<?xml-stylesheet type="text/xsl" href="article.xsl"?>-->
<article article-type="research-article" dtd-version="1.0" xml:lang="en" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id journal-id-type="issn">1683-1470</journal-id>
<journal-title-group>
<journal-title>Data Science Journal</journal-title>
</journal-title-group>
<issn pub-type="epub">1683-1470</issn>
<publisher>
<publisher-name>Ubiquity Press</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5334/dsj-2017-002</article-id>
<article-categories>
<subj-group>
<subject>Practice paper</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Utilizing the International Geo Sample Number Concept in Continental Scientific Drilling During ICDP Expedition COSC-1</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0002-8209-6290</contrib-id>
<name>
<surname>Conze</surname>
<given-names>Ronald</given-names>
</name>
<email>ronald.conze@gfz-potsdam.de</email>
<xref ref-type="aff" rid="aff-1"/>
</contrib>
<contrib contrib-type="author">
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0001-6095-2941</contrib-id>
<name>
<surname>Lorenz</surname>
<given-names>Henning</given-names>
</name>
<xref ref-type="aff" rid="aff-2"/>
</contrib>
<contrib contrib-type="author">
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0002-6298-758X</contrib-id>
<name>
<surname>Ulbricht</surname>
<given-names>Damian</given-names>
</name>
<xref ref-type="aff" rid="aff-1"/>
</contrib>
<contrib contrib-type="author">
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0001-5140-8602</contrib-id>
<name>
<surname>Elger</surname>
<given-names>Kirsten</given-names>
</name>
<xref ref-type="aff" rid="aff-1"/>
</contrib>
<contrib contrib-type="author">
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0001-6973-5628</contrib-id>
<name>
<surname>Gorgas</surname>
<given-names>Thomas</given-names>
</name>
<xref ref-type="aff" rid="aff-1"/>
</contrib>
</contrib-group>
<aff id="aff-1">GFZ German Research Centre for Geosciences, Potsdam, Germany</aff>
<aff id="aff-2">Uppsala University, Department of Earth Sciences, Uppsala, Sweden</aff>
<pub-date publication-format="electronic" date-type="pub" iso-8601-date="2017-01-25">
<day>25</day>
<month>01</month>
<year>2017</year>
</pub-date>
<volume>16</volume>
<elocation-id>2</elocation-id>
<history>
<date date-type="received" iso-8601-date="2016-11-24">
<day>24</day>
<month>11</month>
<year>2016</year>
</date>
 <date date-type="accepted" iso-8601-date="2017-01-04">
<day>04</day>
<month>01</month>
<year>2017</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright: &#x00A9; 2017 The Author(s)</copyright-statement>
<copyright-year>2017</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See <uri xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</uri>.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://datascience.codata.org/articles/10.5334/dsj-2017-002/"/>
<abstract>
<p>The International Geo Sample Number (IGSN) is a globally unique persistent identifier (PID) for physical samples that makes digital sample descriptions discoverable via the internet. In this article we describe the implementation of an IGSN registration service at the Helmholtz Centre Potsdam &#8211; GFZ German Research Centre for Geosciences. This includes the adaptation of the metadata schema developed within the context of the System for Earth Sample Registration (SESAR<xref ref-type="fn" rid="n1">1</xref>) to better describe the complex sample hierarchy of drill cores, core sections and samples of scientific drilling projects.</p>
<p>Our case study is the COSC-1 expedition<xref ref-type="fn" rid="n2">2</xref> (Collisional Orogeny in the Scandinavian Caledonides) supported by the International Continental Scientific Drilling Program<xref ref-type="fn" rid="n3">3</xref> (ICDP). COSC-1 was the first project in ICDP&#8217;s history to assign and register IGSNs during an ongoing drilling campaign while preserving the original parent-child relationships of the sample objects.</p>
<p>IGSN-associated data and metadata are shared with the worldwide community through novel web portals, one of which is currently evolving as part of a collaboration between GFZ Potsdam and researchers from ICDP&#8217;s COSC clientele. COSC-1 can thus be considered a prime example for ICDP projects of how to further improve the quality of scientific research output through a transparent process of producing and managing the large quantities of data typically acquired during a scientific drilling operation. The IGSN is an important new element in the publication landscape: it can be cited in scholarly literature and cross-referenced in DOI-bearing scholarly and data publications.</p>
</abstract>
<kwd-group>
<kwd>sample curation</kwd>
<kwd>persistent identifiers</kwd>
<kwd>IGSN</kwd>
<kwd>geosciences</kwd>
<kwd>infrastructure</kwd>
<kwd>open data</kwd>
<kwd>metadata</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec>
<title>Introduction</title>
<sec>
<title>IGSN &#8211; The International Geo Sample Number</title>
<p>The International Geo Sample Number (IGSN) is a globally unique and persistent identifier (PID) for physical samples that reduces (and perhaps even eliminates) problems associated with ambiguous naming of samples. It has been developed to (1) address requirements for reproducibility of sample-based data, (2) ensure discovery, access, and re-usability of samples and data derived from them, (3) recognize sample collection and curation as a scholarly contribution to the scientific community, and (4) improve data integration. The last point especially pertains to scientific drilling projects, where samples and subsamples of the same core or core section are analysed in different laboratories and over long periods of time.</p>
<p>IGSN is governed by an international non-profit organisation (IGSN e.V.<xref ref-type="fn" rid="n4">4</xref>), which operates the central registration system based on the <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://handle.net/">Handle.Net</ext-link> System (<xref ref-type="bibr" rid="B1">CNRI 2010</xref>). The IGSN e.V. aims to develop standard methods for locating, identifying and citing physical samples (<xref ref-type="bibr" rid="B6">Devaraju et al. 2016</xref>). Similar to the digital object identifier (DOI), IGSNs resolve to a persistent link on the internet to IGSN landing pages with a virtual sample description, managed by federated IGSN Allocating Agents (e.g. <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://hdl.handle.net/10273/ICDP5054EHW1001">IGSN:ICDP5054EHW1001</ext-link>). The largest collection of registered IGSNs is accessible via the inventor&#8217;s web portal &#8216;System for Earth Sample Registration&#8217;, SESAR, at Lamont-Doherty Earth Observatory (<xref ref-type="bibr" rid="B9">Lehnert and Klump 2008</xref>). Furthermore, the IGSN provides means for sample citation in the literature and for establishing direct links from specimen to research results and interpretations.</p>
<p>Each member of the IGSN e.V. may become an IGSN Allocating Agent and develop an IGSN registration service for its community. Allocating agents are free to design the syntax of their IGSN identifiers and to develop individual metadata schemata for the disciplines that use their service. In addition to these disciplinary metadata schemata, each allocating agent has to provide metadata in the IGSN Description Schema (<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://schema.igsn.org/description/">http://schema.igsn.org/description/</ext-link>). The IGSN Description Schema contains persistent information about registered samples, such as temporal and spatial coordinates of sample acquisition, metadata about involved institutions and sample-requesting scientists, information about the sample material, collection methods, and alternate or related identifiers. This metadata kernel is aligned with the DataCite Metadata Schema 4.0 and will be harvested by the central IGSN metadata catalogue that is currently in development. Where possible, the descriptive metadata schema makes use of vocabulary lists, e.g. those of the Observations Data Model 2 (ODM2: <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.odm2.org">http://www.odm2.org</ext-link>) and a list of collection methods.</p>
</sec>
<sec>
<title>Drilling Samples and the Drilling Information System (DIS)</title>
<p>The necessity for meticulous acquisition and documentation of scientific drilling project data and associated samples in a structured and hierarchical way was already described for the German Continental Deep Drilling Program (KTB) (<xref ref-type="bibr" rid="B14">W&#228;chter et al. 1989</xref>; <xref ref-type="bibr" rid="B4">Conze et al. 1993</xref>; <xref ref-type="bibr" rid="B2">Conze 1995</xref>). As one outcome of the KTB project, the development of a dedicated IT system, the DIS, was initiated. The DIS (Conze et al. 2007; <xref ref-type="bibr" rid="B3">Conze 2016</xref>) is based on a relational database with data verification routines and desktop input forms. It is designed for the full documentation of the on-site drilling operations, including the sample material and the acquired primary data. Scientific drilling projects generate a huge amount of sample material. Most prominent are drill cores, for which a number of hierarchical relationships have to be tracked: all material of a drilling project originates from a drill hole, which yields core runs (a drill core reaching the earth surface), which are sub-divided into core sections (segments of manageable length), from which samples are taken for further analyses. Primary data include core images, multi-sensor core logging data, borehole logging data, lithological descriptions, and so forth.</p>
<p>The data management system <bold>ExpeditionDIS</bold> is dedicated to individual projects and is used in field expeditions, during drilling operations, and for post-drilling examinations of core and sample material in project-associated laboratories. The broad range of research topics in each ICDP project calls for a DIS that is tailored to each scientific drilling expedition (such as COSC-1) and that covers the inventory of the recovered sample material and of the core samples extracted from the drill cores for research.</p>
<p>In contrast, the <bold>CurationDIS</bold> is used in storage facilities (i.e., core repositories, such as MARUM in Bremen, or the &#8216;Nationales Bohrkernlager f&#252;r kontinentale Forschungsbohrungen&#8217; of the Federal Institute for Geosciences and Natural Resources (BGR) in Berlin-Spandau, Germany<xref ref-type="fn" rid="n5">5</xref>). There, sample material from many different expeditions is collected, managed, and distributed to the scientific community. The lifetime of an ExpeditionDIS is limited to the period of the expedition, whereas the CurationDIS lasts for the operational lifetime of the sample storage facility.</p>
<p>Both DIS systems received an update recently that automatically creates an IGSN identifier from the parent-child relationships of the sample material, a prefix individual to the project, and the sample type stored in the DIS relational database. We describe the syntax of the ICDP IGSNs later in this article.</p>
</sec>
</sec>
<sec>
<title>The COSC-1 Project</title>
<p>The COSC-1 project studies mountain-building processes by drilling a continuous cored section through the thrust sheets of the Caledonian foreland in Sweden (Lorenz et al. 2015). The Caledonides are an approximately 400-million-year-old mountain belt that originally had Himalayan dimensions.</p>
<p>Research in scientific drilling projects is complex and not limited to the primary goal, which often results in intricate and comprehensive science and sampling programmes. During COSC-1 operations, approximately 2.4 km of drill core were retrieved (the &#8216;Sample Material&#8217;), drill mud and mud gases were sampled, and diverse screening techniques for microbiology were employed. During the project field campaign and the subsequent sampling party, the inventory of all drill cores was kept routinely using the ExpeditionDIS. Primary core metadata, core scan image files and on-site analytical data (Lorenz et al. 2015) were entered at several DIS client stations immediately after the core was retrieved on the drill deck. Core pieces that could be unambiguously identified as belonging together were treated as a single object and logged in the DIS as an individual &#8216;Core Run&#8217;. The core run is a &#8216;Child&#8217; object of the borehole &#8216;Parent&#8217;, and does not carry any additional information. IGSNs were assigned immediately to all eligible objects. Off-line IGSN assignment worked flawlessly and did not interfere with the researchers&#8217; workflow. Analytical data, such as geophysical logs and XRF geochemical analyses, were imported into the DIS using adapted data pumps and linked to the respective core sections. Backups of the database were taken off-site and transferred to ICDP via secure file transfer (SFTP).</p>
<p>After the field campaign was completed, all sample material was shipped to the core repository in Berlin-Spandau. The project data were subsequently transferred to the BGR CurationDIS. Sampling can continue there in accordance with the sampling policies of the project and the storage facility. In the case of COSC-1, researchers with accepted, DIS-generated sample requests met for a sampling party to visually inspect the core and mark and list their sampling spots and intervals. Samples were first documented in the DIS, whereby IGSNs were assigned automatically, and then physically taken by the BGR curator. The properly DIS-labelled samples and sample lists were then shipped world-wide to the respective researchers. This workflow proved successful and efficient: the scientists&#8217; time was spent entirely on science, while sampling was performed by the assigned curator in an orderly manner. It also generated proper documentation of all sample material and sampling procedures.</p>
</sec>
<sec>
<title>IGSN Implementation, Generation and Registration</title>
<sec>
<title>Inherent IGSN Properties</title>
<p>The most conspicuous requirement for a persistent identifier (PID) in a drilling project context is that it reflects the hierarchy in the sample material. If this prerequisite is not met, the provenance of a referenced piece of drill core can only be tracked by a user who has direct access to the data logged in the DIS.</p>
<p>Persistent identifiers commonly use prefixes, which separate the domains of responsibility into different namespaces. For example, the IGSN e.V. assigned the namespace &#8216;ICDP&#8217; to ICDP and &#8216;BGRB&#8217; to the core archive at BGR. The namespace is followed by an &#8216;Expedition ID&#8217; and a report prefix (see Table <xref ref-type="table" rid="T1">1</xref>). Together they create several independent sub-namespaces, so that individual DIS instances can acquire data independently of one another at remote sites without internet access. An object tag allows for a quick, human-readable identification of the object type in question, such as &#8216;Hole&#8217;, &#8216;Core Run&#8217;, &#8216;Core Section&#8217; or &#8216;Sample&#8217;. The IGSN ends with a coded pattern that is directly derived from primary key values generated by the DIS and thereby guarantees the uniqueness of the IGSN identifier. Figure <xref ref-type="fig" rid="F1">1</xref> illustrates the IGSN syntax for ICDP and shows links to the ICDP naming convention.</p>
<table-wrap id="T1">
<label>Table 1</label>
<caption>
<p>Structure of the extended IGSN for ICDP sample material. The coded pattern is directly derived from the internal object ID in the DIS, and therefore guarantees the uniqueness of the identifier.</p>
</caption>
<table>
<tr>
<th align="left" valign="top">Namespace</th>
<th align="left" valign="top">Expedition ID</th>
<th align="left" valign="top">Report Prefix</th>
<th align="left" valign="top">Object Tag</th>
<th align="left" valign="top">Coded Pattern</th>
</tr>
<tr>
<td colspan="5">
<hr/></td>
</tr>
<tr>
<td align="left" valign="top">ICDP</td>
<td align="left" valign="top">5054</td>
<td align="left" valign="top">E = Expedition<break/>
R = Repository</td>
<td align="left" valign="top">H = hole<break/>
C = core run<break/>
S = core section<break/>
X = sample</td>
<td align="left" valign="top">W1001</td>
</tr>
</table>
</table-wrap>
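<p>The structure in Table 1 can be illustrated with a short Python sketch that splits an ICDP-style IGSN into its five parts. The field widths in the regular expression (a four-letter namespace and a four-digit Expedition ID) are assumptions inferred from the example identifiers quoted in this article, not a published grammar, and the class and function names are hypothetical.</p>
<preformat>
```python
import re
from typing import NamedTuple

# Code letters as listed in Table 1.
REPORT_PREFIXES = {"E": "Expedition", "R": "Repository"}
OBJECT_TAGS = {"H": "hole", "C": "core run", "S": "core section", "X": "sample"}

# Assumption: 4-letter namespace and 4-digit Expedition ID, as in the
# examples ICDP5054EHW1001 and ICDP5054EXF4601. The real rules may differ.
IGSN_PATTERN = re.compile(r"^([A-Z]{4})(\d{4})([ER])([HCSX])([A-Z0-9]+)$")


class IcdpIgsn(NamedTuple):
    namespace: str
    expedition_id: str
    report_prefix: str
    object_tag: str
    coded_pattern: str


def parse_icdp_igsn(igsn: str) -> IcdpIgsn:
    """Split an ICDP-style IGSN into the five parts of Table 1."""
    match = IGSN_PATTERN.match(igsn.strip().upper())
    if match is None:
        raise ValueError(f"not an ICDP-style IGSN: {igsn!r}")
    return IcdpIgsn(*match.groups())


def handle_url(igsn: str) -> str:
    """Build the resolver URL using the handle prefix seen in this article."""
    return "http://hdl.handle.net/10273/" + igsn


hole = parse_icdp_igsn("ICDP5054EHW1001")
print(hole, OBJECT_TAGS[hole.object_tag], REPORT_PREFIXES[hole.report_prefix])
print(handle_url("ICDP5054EXF4601"))
```
</preformat>
<p>Keeping the code letters in lookup tables rather than in the pattern logic means that a further object tag could be added in one place, should the convention be extended.</p>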
<fig id="F1">
<label>Figure 1</label>
<caption>
<p>Mapping of sample objects to IGSNs. Example of the data structure and IGSN assignment for drill holes, core runs, core sections and core samples by the ICDP DIS as assigned during the COSC-1 project. The <bold>Expedition</bold> ID (here &#8216;5054&#8217; for COSC) is defined as a unique key value in the ICDP naming convention and is accompanied by a character string that includes indicators for drill <bold>Site</bold> and <bold>Hole</bold> (several boreholes per site and several sites per expedition can exist). Each lower level of the sample hierarchy (<bold>Core</bold> runs, <bold>Sections</bold> and <bold>Samples</bold>) is symbolized by additional characters or numbers appended to the name of the higher level. The <bold>Hole</bold> is the top level for the IGSN. Each derived object (core runs, core sections, samples, etc.) is related to the borehole through the parent-child relation of each subordinated object/sample.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-689-g1.jpg"/>
</fig>
</sec>
<sec>
<title>The ICDP-IGSN Metadata Schema</title>
<p>As discussed before, the IGSN allocating agents have to provide metadata in the IGSN Description Schema. In addition, IGSN allocating agents typically provide services to their respective topical and geographical communities and often must address the specific and variable needs of these communities. The allocating agent GFZ Potsdam decided to develop a new IGSN metadata schema for the description of scientific drilling projects in the framework of ICDP (ICDP-IGSN Schema). This community-specific schema is loosely based on the original universal SESAR metadata schema, enriched by specific metadata fields representing the drilling process and instruments, analysis methods, and extended descriptive elements for the geological and lithological descriptions. In addition, individual metadata fields for the corrected depths in boreholes exist (e.g., <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://hdl.handle.net/10273/ICDP5054EXF4601">IGSN:ICDP5054EXF4601</ext-link>). Furthermore, we put all information necessary for disseminating the IGSN Description Schema into our ICDP-IGSN Schema. The metadata are exported directly from the DIS into XML, and thus prepared for the IGSN registration. In this way, we store IGSN metadata in XML format at GFZ Potsdam, which allows metadata to be retrieved via an Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) interface.</p>
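<p>As a minimal sketch of how a client might address such an OAI-PMH interface, the following Python function builds a ListRecords request URL. The base URL is a placeholder, because the real endpoint is not given here. The metadata prefix oai_dc is the Dublin Core format that every OAI-PMH repository must support; the prefix under which ICDP-IGSN or IGSN Description Schema records are exposed is not stated in this article and would have to be looked up with a ListMetadataFormats request.</p>
<preformat>
```python
from typing import Optional
from urllib.parse import urlencode

# Placeholder endpoint (assumption): the article does not state the real base URL.
BASE_URL = "https://example.org/oai"


def list_records_url(base_url: str,
                     metadata_prefix: str = "oai_dc",
                     from_date: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        # Selective harvesting: only records created or changed since this date.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


print(list_records_url(BASE_URL, from_date="2017-01-01"))
```
</preformat>
<p>The function only assembles the request; fetching and parsing the XML response is left out to keep the sketch independent of any particular endpoint.</p>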
</sec>
<sec>
<title>IGSN Registration</title>
<p>The main technical tasks of IGSN allocating agents are to allow registration of IGSNs for physical samples, guarantee the uniqueness of IGSNs by issuing namespaces to clients, and collect and disseminate IGSN metadata (Figure <xref ref-type="fig" rid="F2">2</xref>). These tasks are similar to those that DataCite solves with its DOI registration infrastructure. As we had already modified the DataCite registration software for our data publication activities at GFZ Potsdam (<xref ref-type="bibr" rid="B13">Ulbricht et al. 2016</xref>), we drew on this experience and modified the source code of the DataCite Metadata Store to serve as the GFZ IGSN registry.</p>
<fig id="F2">
<label>Figure 2</label>
<caption>
<p>Visualisation of the registration infrastructure. The associated XML metadata flows (dotted lines and arrows) at the GFZ IGSN Allocating Agent during registration of ICDP samples are shown. Building blocks of the IGSN allocating agent are highlighted in grey boxes. In addition, the steps of an exemplary sample discovery through a web portal or a research article are shown as black arrows.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-689-g2.png"/>
</fig>
<p>In Figure <xref ref-type="fig" rid="F2">2</xref> we outline how the GFZ IGSN registry fits into the federated structure of the IGSN e.V. and its allocating agents. While an internet browser interface to the registry software exists, metadata and URLs to IGSN landing pages (Figure <xref ref-type="fig" rid="F3">3</xref>) are expected to be registered by machines through web-service APIs, which feed the corresponding data from existing databases into the web portal. The GFZ IGSN registry is designed as a proxy to the global IGSN registry, which mints handles that resolve to landing pages.</p>
<fig id="F3">
<label>Figure 3</label>
<caption>
<p>Example of an IGSN Landing Page for a Core Sample of the COSC-1 project (<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://hdl.handle.net/10273/ICDP5054EXF4601">IGSN:ICDP5054EXF4601</ext-link>). The left part contains the full sample description, thematically grouped into General Identifiers (for the drilling project and the IGSN hierarchy), Sampling Location, Geology, Methods (used to produce primary borehole data), Drilling (details on the drilling method, instrumentation, PIs, and drilling dates) and the location of the sample (Repositories). The top right box allows browsing through the sample hierarchy. Different icons indicate the type of sample (Hole, Core, Core Section, Core Sample). The map below shows the geographical location, whereas the lower right part highlights publications that are related to the sample (here the initial scientific publication in Scientific Drilling Journal<xref ref-type="fn" rid="n6">6</xref> and the Operational Report of COSC-1, <xref ref-type="bibr" rid="B11">Lorenz et al. 2015a</xref> and <xref ref-type="bibr" rid="B12">2015b</xref>, and the master thesis and the data publication of <xref ref-type="bibr" rid="B7">Hierold 2016</xref> and <xref ref-type="bibr" rid="B8">Hierold et al. 2016</xref>).</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="dsj-16-689-g3.png"/>
</fig>
<p>With over 4460 IGSNs produced so far, COSC-1 made full use of the programming interface of the GFZ IGSN registry software. The registration of landing-page URLs and metadata was accomplished with a set of shell and Python scripts. Since the IGSN Description Schema makes use of vocabulary lists, this information had to be adapted to the ICDP-IGSN schema. In particular, we had to ensure that the metadata included the appropriate ODM2 terms for material, specimen and feature type, and the correct term for the collection method, which originates from SESAR. However, the material (rock), the resource type (hole, core, core section, core sample), and the collection method (rock corer) could be easily integrated into a conversion stylesheet.</p>
<p>The ICDP-IGSN schema for samples is designed as a superset of the IGSN Description Schema, which has to be available for data dissemination through OAI-PMH. Furthermore, the ICDP-IGSN schema is used to generate the web portal landing pages through <italic>Extensible Stylesheet Language Transformations</italic> (XSLT).</p>
</sec>
<sec>
<title>IGSN Landing Pages</title>
<p>To provide easy web access to information about IGSN-registered scientific drilling sample material and samples, a specific landing page (Figure <xref ref-type="fig" rid="F3">3</xref>) was developed for the COSC-1 drill hole A (5054_1_A), whose top-level IGSN &#8216;ICDP5054EHW1001&#8217; can be resolved using the URL <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://hdl.handle.net/10273/ICDP5054EHW1001">http://hdl.handle.net/10273/ICDP5054EHW1001</ext-link>. Landing pages comprise the complete descriptive data of the IGSN-registered sample material and allow navigation through the hierarchical &#8216;Parent-Child&#8217; data structure. Additionally, related publications and data sets are listed as DOI-referenced sources on this landing page and in the metadata. As soon as an IGSN is registered by the allocating agent, its sample description (metadata) is made publicly available via this web portal system.</p>
</sec>
</sec>
<sec>
<title>Discussion and Outlook</title>
<p>The IGSN is an important tool for increasing the visibility and accessibility of physical samples. It is complementary to other text and data publication formats, including journal articles, data publications, reports, etc. For a full overview it is essential to carefully cross-reference all related publications via the metadata. The DataCite metadata schema offers a broad range of &#8216;relation types&#8217; in its &#8216;related identifier&#8217; package, not only to tie different publications to a dataset or a report, but also to classify the related material as (dataset) documentation, a supplement to a journal article or material for further reading. The relevance of the IGSN as a PID for physical samples is also reflected in the newly released DataCite Metadata Schema 4.0, where IGSN was added as a new &#8216;related identifier type&#8217; option (<xref ref-type="bibr" rid="B5">DataCite 2016</xref>). In addition, it is already possible to cite IGSNs (including the link to the landing pages) in articles of certain journals (e.g. <xref ref-type="bibr" rid="B10">Lloyd et al. 2014</xref>; this paper).</p>
<p>For COSC we made an effort to present the whole sample family. However, it is difficult to control what happens after a sample leaves the core repository. Scientists can generate their own subsamples and hand them over to colleagues, at which point they are out of reach of the core repository manager. Such subsamples are therefore impossible to integrate into the CurationDIS. This may often set the lower limit of the IGSN hierarchy for drilling projects.</p>
<p>The content and provision of descriptive data of drilling projects is highly relevant beyond continental scientific drilling, although it is not directly coupled to the IGSN. Similar problems persist in the ocean research drilling community and at any other organisation that holds relevant information about boreholes and drill cores that should be made accessible to the entire scientific community. A relevant example is the on-going work within the EU Horizon 2020 project European Plate Observing System (EPOS).<xref ref-type="fn" rid="n7">7</xref> EPOS is a multi-disciplinary e-Infrastructure for Solid Earth Science in Europe. It integrates the distributed national Research Infrastructures (RIs) by harmonising existing service and component interfaces, which are co-developed by IT specialists and the geoscientific communities. Since no disciplinary metadata standards exist for the drilling community, available practices, such as the metadata schema developed here for the IGSN registration of a scientific borehole, can be considered first steps towards such standards.</p>
</sec>
<sec>
<title>Conclusion</title>
<p>This article summarizes state-of-the-art sample and data curation in a successful drilling project, exemplified by COSC-1, which was co-sponsored by ICDP in a consortium with multiple additional academic and industry partner institutions and agencies. COSC-1 provides a &#8216;Blueprint&#8217; for future ICDP projects, i.e., for how to plan, organize, conduct and finalize a typical ICDP drilling project from start to finish. COSC-1 has demonstrated the feasibility and applicability of modern geoscientific technologies by merging old and well-proven techniques and methods (e.g., core image scanning as part of acquiring &#8216;primary data&#8217;) with novel developments in database management, data publication and dissemination. This article highlights ICDP&#8217;s DIS in conjunction with the rapidly evolving IGSN system and associated web portals for providing open access to drilling-generated data and metadata.</p>
</sec>
</body>
<back>
<fn-group>
<fn id="n1">
<p>SESAR: <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.geosamples.org/igsnabout">http://www.geosamples.org/igsnabout</ext-link>.</p>
</fn>
<fn id="n2">
<p>The ICDP expedition COSC (Collisional Orogeny in the Scandinavian Caledonides): <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://cosc.icdp-online.org">http://cosc.icdp-online.org</ext-link>.</p>
</fn>
<fn id="n3">
<p>International Continental Scientific Drilling Program (ICDP): <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.icdp-online.org">http://www.icdp-online.org</ext-link>.</p>
</fn>
<fn id="n4">
<p>IGSN e.V.: <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://schema.igsn.org">http://schema.igsn.org</ext-link>.</p>
</fn>
<fn id="n5">
<p>&#8216;Nationales Bohrkernlager f&#252;r kontinentale Forschungsbohrungen&#8217; in Berlin-Spandau, Germany: <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.bohrkernlager.de">http://www.bohrkernlager.de</ext-link>.</p>
</fn>
<fn id="n6">
<p>Scientific Drilling Journal: <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.scientific-drilling.net/">http://www.scientific-drilling.net/</ext-link>.</p>
</fn>
<fn id="n7">
<p>European Plate Observing System (EPOS): <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.epos-ip.org/">https://www.epos-ip.org/</ext-link>.</p>
</fn>
<fn id="n8">
<p>smartcube GmbH, Berlin, Germany: <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.smartcube.de">http://www.smartcube.de</ext-link>.</p>
</fn>
</fn-group>
<ack>
<title>Acknowledgements</title>
<p>The development of the ICDP Drilling Information System was funded by the GFZ German Research Centre for Geosciences, Potsdam, the German Research Foundation and ICDP. ExpeditionDIS and CurationDIS were financed by the European Consortium for Ocean Research Drilling (ECORD) and ICDP. The database and software development was carried out by smartcube GmbH, Berlin, Germany.<xref ref-type="fn" rid="n8">8</xref></p>
</ack>
<sec>
<title>Competing Interests</title>
<p>The authors have no competing interests to declare.</p>
</sec>
<ref-list>
<ref id="B1">
<label>1</label>
<element-citation publication-type="webpage">
<person-group person-group-type="author">
<collab>CNRI</collab>
</person-group>
<chapter-title>Technical Manual, <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://handle.net/">Handle.Net</ext-link> Version 8.1</chapter-title>
<year iso-8601-date="2010">2010</year>
<publisher-name>Corporation for National Research Initiatives CNRI</publisher-name>
<month>December</month>
<comment>Retrieved from: <uri>http://handle.net/tech_manual/HandleTool_UserManual.pdf</uri></comment>
</element-citation>
</ref>
<ref id="B2">
<label>2</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Conze</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>Electronic Data Processing at KTB</article-title>
<year iso-8601-date="1995">1995</year>
<fpage>F1</fpage>
<lpage>F13</lpage>
<comment>KTB Report 95-2</comment>
</element-citation>
</ref>
<ref id="B3">
<label>3</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Conze</surname>
<given-names>R</given-names>
</name>
</person-group>
<chapter-title>Drilling Information System DIS and Core Scanner</chapter-title>
<year iso-8601-date="2016">2016</year>
<publisher-name>JLSRF</publisher-name>
<volume>2</volume>
<fpage>A63</fpage>
<pub-id pub-id-type="doi">10.17815/jlsrf-2-130</pub-id>
</element-citation>
</ref>
<ref id="B4">
<label>4</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Conze</surname>
<given-names>R</given-names>
</name>
<name>
<surname>H&#228;ner</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Yazici</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Part C: Technical aspects and future developments &#8211; The KTB Information System</article-title>
<year iso-8601-date="1993">1993</year>
<fpage>535</fpage>
<lpage>541</lpage>
<comment>KTB Report 93-2</comment>
</element-citation>
</ref>
<ref id="B5">
<label>5</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<collab>DataCite Metadata Working Group</collab>
</person-group>
<article-title>DataCite Metadata Schema Documentation for the Publication and Citation of Research Data</article-title>
<year iso-8601-date="2016">2016</year>
<pub-id pub-id-type="doi">10.5438/0012</pub-id>
<comment>Version 4.0</comment>
</element-citation>
</ref>
<ref id="B6">
<label>6</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Devaraju</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Klump</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Cox</surname>
<given-names>S J D</given-names>
</name>
<name>
<surname>Golodoniuc</surname>
<given-names>P</given-names>
</name>
</person-group>
<article-title>Representing and publishing physical sample descriptions</article-title>
<source>Computers &amp; Geosciences</source>
<year iso-8601-date="2016">2016</year>
<volume>96</volume>
<fpage>1</fpage>
<lpage>10</lpage>
<pub-id pub-id-type="doi">10.1016/j.cageo.2016.07.018</pub-id>
</element-citation>
</ref>
<ref id="B7">
<label>7</label>
<element-citation publication-type="thesis">
<person-group person-group-type="author">
<name>
<surname>Hierold</surname>
<given-names>J</given-names>
</name>
</person-group>
<source>Analysis of element behavior in mylonites of the Seve Nappe of the Scandinavian Caledonides using different core scanning methods. Master Thesis</source>
<year iso-8601-date="2016">2016</year>
<publisher-name>Scientific Technical Report STR; 16/07</publisher-name>
<pub-id pub-id-type="doi">10.2312/GFZ.b103-16070</pub-id>
</element-citation>
</ref>
<ref id="B8">
<label>8</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hierold</surname>
<given-names>J</given-names>
</name>
<name>
<surname>K&#246;rting</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Kollaske</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Rogass</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Harms</surname>
<given-names>U</given-names>
</name>
</person-group>
<article-title>Analysis of element behavior in mylonites of the Seve Nappe of the Scandinavian Caledonides using different core scanning methods (Datasets)</article-title>
<source>GFZ Data Services</source>
<year iso-8601-date="2016">2016</year>
<pub-id pub-id-type="doi">10.5880/ICDP.5054.001</pub-id>
</element-citation>
</ref>
<ref id="B9">
<label>9</label>
<element-citation publication-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Lehnert</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Klump</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Facilitating Research in Mantle Petrology with Geoinformatics</article-title>
<conf-name>9th International Kimberlite Conference, Extended Abstract No. 9IKCA00250</conf-name>
<year iso-8601-date="2008">2008</year>
</element-citation>
</ref>
<ref id="B10">
<label>10</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lloyd</surname>
<given-names>A S</given-names>
</name>
<etal/>
</person-group>
<article-title>NanoSIMS results from olivine-hosted melt embayments: Magma ascent rate during explosive basaltic eruptions</article-title>
<source>Journal of Volcanology and Geothermal Research</source>
<year iso-8601-date="2014">2014</year>
<volume>283</volume>
<fpage>1</fpage>
<lpage>18</lpage>
<pub-id pub-id-type="doi">10.1016/j.jvolgeores.2014.06.002</pub-id>
</element-citation>
</ref>
<ref id="B11">
<label>11</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lorenz</surname>
<given-names>H</given-names>
</name>
<etal/>
</person-group>
<article-title>COSC-1 operational report &#8211; Scientific data sets</article-title>
<year iso-8601-date="2015">2015a</year>
<pub-id pub-id-type="doi">10.1594/GFZ.SDDB.ICDP.5054.2015</pub-id>
</element-citation>
</ref>
<ref id="B12">
<label>12</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lorenz</surname>
<given-names>H</given-names>
</name>
<etal/>
</person-group>
<article-title>COSC-1 &#8211; drilling of a subduction-related allochthon in the Palaeozoic Caledonide orogen of Scandinavia</article-title>
<source>Scientific Drilling</source>
<year iso-8601-date="2015">2015b</year>
<volume>19</volume>
<fpage>1</fpage>
<lpage>11</lpage>
<pub-id pub-id-type="doi">10.5194/sd-19-1-2015</pub-id>
</element-citation>
</ref>
<ref id="B13">
<label>13</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ulbricht</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Elger</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Bertelmann</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Klump</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>PanMetaDocs, eSciDoc and DOIDB &#8211; An Infrastructure for the Curation and Publication of File-Based Datasets for GFZ Data Services</article-title>
<source>ISPRS International Journal of Geo-Information</source>
<year iso-8601-date="2016">2016</year>
<volume>5</volume>
<issue>3</issue>
<fpage>25</fpage>
<pub-id pub-id-type="doi">10.3390/ijgi5030025</pub-id>
</element-citation>
</ref>
<ref id="B14">
<label>14</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>W&#228;chter</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Friese-Haug</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>I. KTB Oberpfalz VB &#8211; Datenverarbeitung. KTBase KTB database &#8211; der Kern eines wissenschaftlich/technischen Informationssystems: Hardware-Konfiguration und Struktur der Datenbank</article-title>
<year iso-8601-date="1989">1989</year>
<fpage>I6</fpage>
<lpage>I20</lpage>
<comment>KTB Report 89-2</comment>
</element-citation>
</ref>
</ref-list>
</back>
</article>