Persistent Identifier Practice for Big Data Management at NCI

Authors

Jingbo Wang National Computational Infrastructure, Canberra https://orcid.org/0000-0002-3594-1893
Nicolas Car Geoscience Australia, Canberra
Ben Evans National Computational Infrastructure, Canberra
Kashif Gohar National Computational Infrastructure, Canberra
Claire Trenham National Computational Infrastructure, Canberra
Lesley Wyborn National Computational Infrastructure, Canberra

DOI:

https://doi.org/10.5334/dsj-2017-020

Keywords:

Persistent Identifier, Big Data, URI, Data catalogue, Data Management

Abstract

The National Computational Infrastructure (NCI) manages over 10 PB of research data, which is co-located with the high performance computer (Raijin) and an HPC-class 3000-core OpenStack cloud system (Tenjin). In support of this integrated High Performance Computing/High Performance Data (HPC/HPD) infrastructure, NCI’s data management practices include building catalogues, DOI minting, data curation, data publishing, and data delivery through a variety of data services. The metadata catalogues, DOIs, THREDDS, and Vocabularies all use different Uniform Resource Locator (URL) styles. A Persistent IDentifier (PID) service provides an important utility to manage URLs in a consistent, controlled and monitored manner to support the robustness of our national ‘Big Data’ infrastructure. In this paper we demonstrate NCI’s approach of utilising its PID Service to consistently manage persistent identifiers across various applications.
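
As a rough illustration of the idea of managing differently styled URLs through one PID layer, the following minimal Python sketch maps stable identifier paths to current service locations using pattern rules. It is not NCI's implementation: the rule table, the `resolve` function, and every host name and path (`catalogue.example.org`, `thredds.example.org`, `vocab.example.org`) are hypothetical placeholders chosen for the example.

```python
import re
from typing import Optional

# Hypothetical rules: each maps a persistent path pattern to a template
# for the current target URL. Hosts and paths are placeholders only.
RULES = [
    (re.compile(r"^/catalogue/(?P<id>[\w-]+)$"),
     "https://catalogue.example.org/records/{id}"),
    (re.compile(r"^/thredds/(?P<id>[\w/.-]+)$"),
     "https://thredds.example.org/catalog/{id}"),
    (re.compile(r"^/vocab/(?P<id>[\w-]+)$"),
     "https://vocab.example.org/def/{id}"),
]


def resolve(pid_path: str) -> Optional[str]:
    """Return the current target URL for a persistent path, or None."""
    for pattern, template in RULES:
        match = pattern.match(pid_path)
        if match:
            # Fill the template with the named groups captured from the path.
            return template.format(**match.groupdict())
    return None  # no rule matched; a real service might log this or return 404
```

In a design like this, a backend service can move to a new address by changing a single template, while the identifiers that users cite stay the same.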

Published

2017-04-18

Issue

Vol. 16 (2017)

Section

Practice Papers

Collections

20 Years of Persistent Identifiers: Applications and Future Directions

License

This work is licensed under a Creative Commons Attribution 4.0 International License.
