
RDM - Share & Reuse: Data Citation



Research data is an independent scholarly work in its own right. Data citation is the practice of referencing datasets in the same way as books, articles, or other sources in academic and professional work. Proper data citation matters because it acknowledges the original creators of the data, enables others to locate and access the data, and supports the reproducibility and verification of research findings.


Citation Elements

A proper data citation should typically include the following elements:

  1. Author(s): The individual(s) or organization(s) responsible for creating the dataset.

  2. Title: The name of the dataset, which should be descriptive and specific.

  3. Year of Publication: The year when the dataset was published or made available.

  4. Version: If applicable, the version of the dataset being cited, as datasets can be updated over time.

  5. Publisher/Distributor: The entity responsible for distributing or hosting the dataset, such as a data repository or archive.

  6. Identifier: A unique identifier for the dataset, such as a DOI (Digital Object Identifier), which provides a persistent link to the dataset.

  7. Access Information: Information on how to access the dataset, which may include a URL or other location details.

  8. Date Accessed: The date when the dataset was accessed, especially if the dataset is subject to change over time.

Proper data citation ensures that datasets are credited appropriately and that others can find and use the data in their own research.


 Example of a data citation in APA style:

Smith, J. A., & Doe, R. L. (2020). Global temperature data (Version 3.2) [Data set]. National Climate Data Center. https://doi.org/10.1234/ncdc.tempdata.v3.2
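
For anyone who manages citations in scripts or notebooks, the citation elements can also be assembled programmatically. The following is a minimal sketch in Python, not an official tool: the names DatasetCitation, join_authors, and format_apa are hypothetical, author names are assumed to be supplied already in "Surname, Initials" form, and the output follows only the simple APA pattern shown in the example above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetCitation:
    """Hypothetical container for the core data citation elements."""
    authors: List[str]             # assumed pre-formatted, e.g. "Smith, J. A."
    year: int                      # year of publication
    title: str                     # dataset title
    publisher: str                 # repository or archive hosting the data
    identifier: str                # DOI link or other persistent URL
    version: Optional[str] = None  # omitted from the output when None


def join_authors(authors: List[str]) -> str:
    """Join author names with commas and an ampersand before the last one."""
    if not authors:
        raise ValueError("at least one author is required")
    if len(authors) == 1:
        return authors[0]
    return ", ".join(authors[:-1]) + ", & " + authors[-1]


def format_apa(c: DatasetCitation) -> str:
    """Build an APA-style dataset citation string (simplified sketch)."""
    version = f" (Version {c.version})" if c.version else ""
    return (
        f"{join_authors(c.authors)} ({c.year}). "
        f"{c.title}{version} [Data set]. "
        f"{c.publisher}. {c.identifier}"
    )


# Illustrative values taken from the example citation above.
example = DatasetCitation(
    authors=["Smith, J. A.", "Doe, R. L."],
    year=2020,
    title="Global temperature data",
    publisher="National Climate Data Center",
    identifier="https://doi.org/10.1234/ncdc.tempdata.v3.2",
    version="3.2",
)
print(format_apa(example))
```

Keeping the elements in a structured record rather than a single string makes it easier to re-render the same dataset in another citation style later. Always check the result against the style guide required by the publisher or journal.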

  For enquiries, please contact the Library's Research Data Management Services of the Research Support and Scholarly Communication Section at lbrdms@cityu.edu.hk