How to cite datasets used in your published works
To comply with ethics norms and good research practice, you must cite primary data sources in your published papers, just as you cite other sources.
The term “dataset” refers to the data files used as research evidence, as well as data collection documents and metadata (information on the provenance of the data and how to access and use the data). Here are some tips on correct data citations.
- Identify the data early in your paper, preferably in the abstract.
- Include a dedicated "data" section so that readers can immediately identify the data that underlies your work.
- Reference the data in your data tables.
- Cite data in your references. References are more frequently indexed than full papers, so including the citation there makes it more visible.
- Cite the exact version of the data used in your research, to support data discovery.
- Include the unique identifier of the dataset in your citation, such as a Digital Object Identifier (DOI). A DOI provides a permanent link to the data, so the data can still be accessed even if its URL changes.
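The last tip works because a DOI can be turned into a stable link through the doi.org resolver. The sketch below shows one way to normalise a DOI into such a link before placing it in a citation. The function name `doi_to_url` and the DOI used in the example are hypothetical and serve only as illustration.

```python
def doi_to_url(doi: str) -> str:
    """Return a https://doi.org/ link for a DOI.

    Accepts a bare DOI, a 'doi:' prefixed DOI, or an existing resolver URL.
    """
    cleaned = doi.strip()
    known_prefixes = (
        "https://doi.org/",
        "http://doi.org/",
        "https://dx.doi.org/",
        "http://dx.doi.org/",
        "doi:",
    )
    for prefix in known_prefixes:
        if cleaned.lower().startswith(prefix):
            # Drop the prefix so only the DOI itself remains.
            cleaned = cleaned[len(prefix):].strip()
            break
    # Every DOI begins with the directory indicator "10."
    if not cleaned.startswith("10."):
        raise ValueError(f"Not a DOI: {doi!r}")
    return "https://doi.org/" + cleaned


# Hypothetical DOI, used only to show the transformation.
link = doi_to_url("doi:10.1234/example-dataset")
print(link)
```

Using the resolver link rather than the landing page URL means the citation keeps working if the distributor reorganises its website.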
DataFirst uses the international data citation format recommended by DataCite, which is as follows:
Name of producer. Survey name and date [dataset]. Version number. Place of production: Producer [producer], date of production. Place of distribution: Distributor [distributor]. URL or DOI
For example:
Statistics South Africa. General Household Survey 2010 [dataset]. Version 2. Pretoria: Statistics South Africa [producer], 2011. Cape Town: DataFirst [distributor], 2011. https://www.datafirst.uct.ac.za/dataportal/index.php/catalog/192
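If you manage many data citations, the format above can be assembled from its parts in a small script. The following is a minimal sketch, assuming you hold the citation details as separate fields. The class name `DatasetCitation` and its field names are hypothetical, not part of any DataFirst or DataCite tool, and the optional distribution year is an assumption based on the General Household Survey example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetCitation:
    """Fields of a data citation (hypothetical names, for illustration)."""
    producer: str
    title: str                 # survey name and date
    version: str
    production_place: str
    production_year: str
    distribution_place: str
    distributor: str
    link: str                  # URL or DOI
    distribution_year: Optional[str] = None

    def format(self) -> str:
        """Assemble the fields in the order given by the citation format."""
        distribution = f"{self.distribution_place}: {self.distributor} [distributor]"
        if self.distribution_year:
            distribution += f", {self.distribution_year}"
        return (
            f"{self.producer}. {self.title} [dataset]. "
            f"Version {self.version}. "
            f"{self.production_place}: {self.producer} [producer], "
            f"{self.production_year}. "
            f"{distribution}. {self.link}"
        )


# Values taken from the General Household Survey example.
citation = DatasetCitation(
    producer="Statistics South Africa",
    title="General Household Survey 2010",
    version="2",
    production_place="Pretoria",
    production_year="2011",
    distribution_place="Cape Town",
    distributor="DataFirst",
    link="https://www.datafirst.uct.ac.za/dataportal/index.php/catalog/192",
    distribution_year="2011",
).format()
print(citation)
```

Keeping the fields separate makes it easy to update the version number or link when a new release of the dataset is published.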
Contact our helpdesk at support[at]data1st.org for help with citing data in your published research.
Based on: Ball, A. & Duke, M. (2011). ‘How to Cite Datasets and Link to Publications’. DCC How-to Guides. Edinburgh: Digital Curation Centre. Available online: http://www.dcc.ac.uk/resources/how-guides/cite-datasets.