Presentation Type

Lightning Talk

Location

Zoom. Recording Coming Soon!

Start Date

15-4-2024 3:00 PM

End Date

15-4-2024 3:20 PM

Description

Institutional requirements and changing scholarly cultures have increased the amount of research data that is openly available. However, academic institutions do not always know where their researchers share data and may be unable to support them in doing so effectively. When research datasets are poorly described or lack key metadata, they are difficult to discover and therefore unlikely to be reused. Persistent identifiers such as digital object identifiers (DOIs) for research outputs, ROR IDs for institutions, and ORCID iDs for individuals are particularly important for promoting findability and accessibility. For example, DataCite, a common minter of DOIs for datasets, provides public dashboards that aggregate datasets at the institutional level based on their attached ROR IDs; however, datasets shared without affiliated ROR IDs are not captured.

This lightning talk will report on an in-progress project exploring a method for more fully capturing datasets shared by researchers at a given institution using the DataCite API. It will also cover how to use Python code to extract relevant metadata, including the repositories where researchers share their data and the completeness of the associated metadata. This data can inform services by giving librarians a robust understanding of the repositories of interest at their institutions, as well as gaps in data sharing practices that they may seek to address so that research data can be more findable and accessible.
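As a rough illustration of the kind of analysis described above, the following is a minimal sketch and not the code used in the project. It assumes the public DataCite REST API endpoint at https://api.datacite.org/dois, and it assumes that a free-text query on the `creators.affiliation.name` field is a reasonable way to find datasets that lack a ROR ID. The function names (`fetch_datasets`, `publisher_name`, `has_ror`, `has_orcid`, `summarize`) are hypothetical, the publisher field is used as a stand-in for the repository, and only a single page of results is requested.

```python
import json
import urllib.parse
import urllib.request
from collections import Counter

# Assumed endpoint of the public DataCite REST API.
API_URL = "https://api.datacite.org/dois"


def fetch_datasets(affiliation_name, page_size=100):
    """Fetch one page of dataset records whose creators list the given
    affiliation name. The query field is an assumption; pagination is omitted."""
    params = {
        "query": f'creators.affiliation.name:"{affiliation_name}"',
        "resource-type-id": "dataset",
        "affiliation": "true",  # ask for affiliation details, not just names
        "page[size]": page_size,
    }
    url = f"{API_URL}?{urllib.parse.urlencode(params)}"
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)["data"]


def publisher_name(record):
    """Return the publisher, treated here as the hosting repository."""
    publisher = record["attributes"].get("publisher")
    if isinstance(publisher, dict):  # some responses return an object
        publisher = publisher.get("name")
    return publisher or "(unknown)"


def has_ror(record):
    """True if any creator affiliation carries a ROR identifier."""
    for creator in record["attributes"].get("creators") or []:
        for aff in creator.get("affiliation") or []:
            if isinstance(aff, dict) and aff.get("affiliationIdentifierScheme") == "ROR":
                return True
    return False


def has_orcid(record):
    """True if any creator carries an ORCID iD."""
    for creator in record["attributes"].get("creators") or []:
        for ident in creator.get("nameIdentifiers") or []:
            if ident.get("nameIdentifierScheme") == "ORCID":
                return True
    return False


def summarize(records):
    """Count datasets per repository and how many carry ROR IDs or ORCID iDs."""
    return {
        "total": len(records),
        "by_repository": Counter(publisher_name(r) for r in records),
        "with_ror": sum(has_ror(r) for r in records),
        "with_orcid": sum(has_orcid(r) for r in records),
    }
```

Calling `summarize(fetch_datasets("University of Kentucky"))` would then give a first look at which repositories appear most often and how many records are missing ROR IDs or ORCID iDs. A fuller analysis would need to follow the API's pagination and account for variant spellings of the institution's name.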

Author Bios

Isaac Wink is the Research Data Librarian at the University of Kentucky. He is a recent graduate of the University of Illinois at Urbana-Champaign, where he received his MS in Library and Information Science and MA in History. His research interests include data sharing practices and barriers to productive data reuse.

Comments

Although it is still in development, I hope to provide participants with access to a Google Colab notebook that will allow them to recreate my analysis for their own institutions without having to change the underlying Python code. It would be similar to the analysis I provide here: https://github.com/igwink/DOI_Metadata_Analysis/blob/main/DataCite_DOI_Analysis.ipynb

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Investigating Researcher Data Sharing Practices Using the DataCite API
