Data documentation
Why document your data?
It is important that your data is organised and well documented so that you and others can later find, understand and interpret it. Documentation has many advantages; it:
- ensures the data remains meaningful over time
- clearly identifies the data
- provides a context for your data
- provides a link between your data and associated research publications and authors
- enables your data to be more easily discovered
- enables your data to be more easily interpreted
- enables your data to be validated by others
- enables other researchers to replicate your findings
- enables other researchers to enhance or enrich your data
- enables you and others to compare your data with similar data sets
- increases the likelihood that your data will be cited by others.
Data that has been poorly documented will be difficult to find and interpret, and is more likely to be misunderstood or taken out of context if re-used by others. Data that cannot be readily identified is also more likely to be accidentally lost or deleted.
What to document
Document the following:
- when the data was collected (e.g. the date range of data collection)
- why the data was collected (e.g. what were the goals and objectives of the research)
- where the data was collected
- how the data was collected
- how the data was de-identified (if applicable).
You should also ensure there is a description of each data element (e.g. explanation of labels, coding systems used, etc.) and provide any other information that may assist readers to interpret and analyse your data.
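One way to describe each data element is a small data dictionary kept alongside the data. The sketch below is illustrative only: the element names, codes and column headings are hypothetical, and the helper functions are assumptions rather than part of any standard. It writes one row per data element and reports any column in the data that has no description yet.

```python
import csv
import io

# Hypothetical data elements for an illustrative survey data set.
DATA_DICTIONARY = [
    {
        "name": "participant_id",
        "description": "De-identified participant code",
        "type": "string",
        "allowed_values": "",
    },
    {
        "name": "age_group",
        "description": "Age bracket at time of collection",
        "type": "integer code",
        "allowed_values": "1 = 18-29; 2 = 30-49; 3 = 50+",
    },
    {
        "name": "consent",
        "description": "Whether consent for re-use was given",
        "type": "string",
        "allowed_values": "Y = yes; N = no",
    },
]

FIELDNAMES = ["name", "description", "type", "allowed_values"]


def data_dictionary_to_csv(entries):
    """Return the data dictionary as CSV text, one row per data element."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


def undocumented_columns(data_columns, entries):
    """Return the columns in the data that have no data dictionary entry."""
    documented = {entry["name"] for entry in entries}
    return [column for column in data_columns if column not in documented]


if __name__ == "__main__":
    print(data_dictionary_to_csv(DATA_DICTIONARY))
    print(undocumented_columns(["participant_id", "height_cm"], DATA_DICTIONARY))
```

Keeping the dictionary in a plain format such as CSV means it can be opened without specialist software, and the check for undocumented columns can be re-run whenever the data changes.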
Other documentation
During the course of your research project, you should create and maintain clear records so that all authorised persons can appropriately access and retrieve your data. These records should include:
- the location of data and primary materials
- the location of physical keys, passwords, or other devices necessary to access them
- information on indexes, catalogues or other finding tools necessary to access them
- conditions of access.
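These records could be kept in a simple structured form. The following is a minimal sketch, assuming a JSON record with hypothetical field names that mirror the four items above; every value is a placeholder to be replaced with your project's details. Note that it records where keys or passwords are held, not the passwords themselves.

```python
import json

# Hypothetical field names mirroring the items listed above.
REQUIRED_FIELDS = (
    "data_location",
    "primary_materials_location",
    "access_devices_location",
    "finding_tools",
    "conditions_of_access",
)

# Placeholder values only; replace with the details of your own project.
access_record = {
    "data_location": "<storage system and folder holding the data>",
    "primary_materials_location": "<room or archive holding primary materials>",
    "access_devices_location": "<where physical keys or passwords are held>",
    "finding_tools": ["<index, catalogue or other finding tool>"],
    "conditions_of_access": "<who may access the data and on what terms>",
}


def missing_fields(record):
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


if __name__ == "__main__":
    print(json.dumps(access_record, indent=2))
    print("Missing fields:", missing_fields(access_record))
```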
How to document your data
There is no single correct way of documenting your data. Some methods commonly used include:
- a single text file that holds all documentation in one place
- a set of contextual files explaining the data contained within a given folder. Such files should be stored in the same folder as the data they accompany, and be given an appropriate name such as “Readme.txt”
- documentation stored within the data files themselves, often presented as a separate page in a text file or a separate worksheet within a spreadsheet.
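As an illustration of the contextual-file approach, the sketch below writes a Readme.txt into the folder that holds the data, using the when, why, where and how items from "What to document" as headings. The layout, the function names and the "TODO" marker for unanswered items are assumptions made for this example, not a prescribed format.

```python
import tempfile
from pathlib import Path

# Headings and prompts taken from the "What to document" list.
README_FIELDS = [
    ("When", "date range of data collection"),
    ("Why", "goals and objectives of the research"),
    ("Where", "location of data collection"),
    ("How", "collection method"),
    ("De-identification", "how the data was de-identified, if applicable"),
]


def render_readme(title, answers):
    """Build the Readme text; unanswered items are marked TODO."""
    lines = [title, "=" * len(title), ""]
    for heading, prompt in README_FIELDS:
        lines.append(f"{heading} ({prompt}):")
        lines.append(f"  {answers.get(heading, 'TODO')}")
        lines.append("")
    return "\n".join(lines)


def write_readme(folder, title, answers):
    """Write Readme.txt into the same folder as the data it describes."""
    path = Path(folder) / "Readme.txt"
    path.write_text(render_readme(title, answers), encoding="utf-8")
    return path


if __name__ == "__main__":
    # A temporary folder stands in for a real data folder.
    with tempfile.TemporaryDirectory() as data_folder:
        readme = write_readme(
            data_folder,
            "Example data set",
            {"When": "2020-01 to 2020-06", "How": "Online questionnaire"},
        )
        print(readme.read_text(encoding="utf-8"))
```

Marking unanswered items explicitly makes gaps in the documentation visible to whoever opens the folder next.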