Research data

A guide to help you find data for your research

Data

Finding data for your research may involve looking at a number of different sources. Some major sources of published research data include:

  • Government websites
  • Data catalogues and data directories
  • Library databases
  • Subject or discipline based repositories or archives
  • Institutional repositories
  • University websites, including research centres or research group websites
  • Internet search engines, e.g. Google or Google Scholar

 

Once you have found your data, you will need to manage it appropriately and attribute it to its source. Check the Research Data Management guide for more information.

Evaluating a data source

Once you’ve chosen a data set that you believe will work, evaluate it carefully before you start using it.

  • Is it appropriate for your research? Is it complete? Is it accurate?
  • Does it come from an authoritative source?
  • Who owns the data and is it accessible?
  • Is it in a format you can work with?
  • Does it cover your Where, When, and Who or What requirements?
  • Are you willing to compromise your requirements or manipulate the data to fit your needs?

Always read the supporting documentation or codebooks to ensure that the analysis you are planning to do really measures what you want it to. You can find more ideas on choosing data.

Searching for data

Before you start searching for research data, make sure you define the data you are looking for in order to focus your search and the sources you use.

1.  Identify your topic – From your research question, identify the key concepts and be specific.

2.  Unit of analysis – This is what your variables will describe. Units may be individuals, flora, fauna, organisations, or products.

3.  Geographical area – Is the topic to be restricted to a particular location? For example, the koala population in South East Queensland, or schools in Vietnam.

4.  Time frame – Is the topic to be restricted to a particular period? For example, recent data from the last 10 years, or data from 1991-1995.

5.  Frequency of data collection – Do you require a series of observations, made at regular intervals (annually, monthly or daily)?

6.  Type of data collection – Do you need cross-sectional data (collected by observing many subjects, the unit of analysis, at the same point in time) or longitudinal data (repeated observations of the same subjects over a period of time)?
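If you like to keep your search organised, the points above can be written down as a simple checklist and compared against each dataset you find. The following Python sketch is only an illustration: the class names, field names and the example dataset are all hypothetical, and exact text matching on region and frequency is a simplifying assumption rather than a recommended method.

```python
from dataclasses import dataclass


@dataclass
class SearchCriteria:
    """What you are looking for (hypothetical structure for illustration)."""
    topic: str
    unit_of_analysis: str
    region: str
    start_year: int
    end_year: int
    frequency: str        # e.g. "annual", "monthly", "daily"
    collection_type: str  # "cross-sectional" or "longitudinal"


@dataclass
class DatasetDescription:
    """What a candidate dataset says about itself in its documentation."""
    title: str
    region: str
    start_year: int
    end_year: int
    frequency: str
    collection_type: str


def unmet_requirements(criteria, dataset):
    """Return the names of the requirements the dataset does not meet."""
    problems = []
    if dataset.region != criteria.region:
        problems.append("geographical area")
    # The dataset must span the whole period you need.
    if dataset.start_year > criteria.start_year or dataset.end_year < criteria.end_year:
        problems.append("time frame")
    if dataset.frequency != criteria.frequency:
        problems.append("frequency")
    if dataset.collection_type != criteria.collection_type:
        problems.append("type of data collection")
    return problems


# Made-up example values, not a real dataset.
criteria = SearchCriteria(
    topic="koala population",
    unit_of_analysis="individual koalas",
    region="South East Queensland",
    start_year=1991,
    end_year=1995,
    frequency="annual",
    collection_type="longitudinal",
)
candidate = DatasetDescription(
    title="Hypothetical koala survey",
    region="South East Queensland",
    start_year=1993,
    end_year=2000,
    frequency="annual",
    collection_type="longitudinal",
)

print(unmet_requirements(criteria, candidate))
```

In this made-up case the candidate starts in 1993, so it does not cover the full 1991-1995 period and only the time frame is reported as unmet. You would then decide whether to keep searching or to adjust your requirements.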

You should also consider:

  • What analysis do you plan to do?
  • Will you be using data analysis software to analyse the data (R, MATLAB, ATLAS.ti, SPSS)? Is the data in a format that is compatible with this software?
  • Where is the data coming from? Is there a contact person? Are there organisational affiliations or credentials associated with the person responsible for the data collection?
  • Where might the data be held; which organisations might collect that type of data?
  • Will you need to reframe your approach to accommodate, or work with, the dataset that is available?

Keep in mind that the data you are looking for may be a subset of a larger dataset.         

Data citation

Just as researchers routinely provide a bibliographic reference to sources such as journal articles, reports and conference papers, data citation is the practice of providing a reference to datasets. Like traditional bibliographic references, data citations acknowledge the original author or creator and help other researchers find the dataset. When you find and use data, make sure you acknowledge the originator with a proper data citation.

DataCite is an international organisation that aims to establish easier access to research data.  DataCite's recommended format for a data citation is:

Creator (PublicationYear): Title. Publisher. Identifier
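If you keep dataset details in a spreadsheet or script, a citation in this pattern can be assembled automatically. The short Python sketch below only illustrates the pattern shown above: the function name is hypothetical, and the creator, title, publisher and identifier are made-up placeholder values, not a real dataset.

```python
def format_data_citation(creator, publication_year, title, publisher, identifier):
    """Build a citation as: Creator (PublicationYear): Title. Publisher. Identifier"""
    return f"{creator} ({publication_year}): {title}. {publisher}. {identifier}"


# Placeholder values for illustration only.
citation = format_data_citation(
    creator="Example, A.",
    publication_year=2020,
    title="Hypothetical koala survey dataset",
    publisher="Example Data Repository",
    identifier="https://doi.org/10.0000/example",
)
print(citation)
```

Always check the citation style required by your publisher or faculty, as it may differ from this pattern.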

 

The Australian National Data Service (ANDS) provides more information on how to cite data, and the benefits of data citation.