
Research Data Management Toolkit: Documentation

Best practices in Research Data Management promote research integrity and collaborative opportunities. A Data Management Plan ensures data security, accessibility and validation of results.

Documentation and Metadata

 

Metadata can be explained simply as 'data about data'. Metadata describes the related research data and details its location to enable efficient retrieval and reuse throughout the research data lifecycle.

It is an essential component of research data management and should incorporate file naming and organisation protocols that are used by all researchers working on a project.

Metadata:

Storing your documentation and metadata

Correct storage of documentation and metadata is just as important as the storage of the research data itself, as the metadata provides a descriptive meaning to raw research data. Researchers are encouraged to use the same guidelines to store all documentation and metadata as those used for research data storage and backup.

Metadata essentials

File organisation

Researchers should develop an organised electronic filing system in which everyone involved in data collection, analysis, reuse and storage understands the file naming protocols. Folders must follow an agreed standard in which each project, assay method, laboratory experiment or sample group is placed logically in a hierarchy.

Examples include:

  • Top level folders should reflect the year or project number.
  • Lower level folders should consist of a subsection of folders relevant to the project e.g. experiment types and sample results; sample number and various experimental results etc.
  • Different versions of the same data are named and saved according to the file status e.g. final/master or draft.
  • Master files are held in a suitable format and the owner has write access with all changes tracked.
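The folder hierarchy described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name build_project_tree and the folder names used in the usage note are hypothetical, and the real layout should follow whatever standard the project team has agreed.

```python
from pathlib import Path


def build_project_tree(root, project, subfolders):
    """Create <root>/<project>/<subfolder> for each subfolder and return the paths.

    `project` is the top-level folder (for example a year or project number);
    `subfolders` are the lower-level folders, which may themselves be nested.
    """
    created = []
    for name in subfolders:
        folder = Path(root) / project / name
        # parents=True builds intermediate folders; exist_ok=True makes reruns safe.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

For example, build_project_tree("data", "2024_project_101", ["pcr/sample_001", "microarray/sample_001"]) would create one top-level project folder containing a folder per experiment type, each holding a folder per sample.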

File naming guide

Researchers should adhere to disciplinary standards and maintain consistency in file naming and version control to ensure ease of reuse for all.

Naming Standard | Description of Naming Standard
Numbering Standards | Specification of the number of digits to ensure a consecutive listing of files
Date Standards | Specification of date formats to ensure a consecutive listing of files e.g. 'YYYY-MM-DD'
Punctuation Standards | Do not use any punctuation or spaces except for underscores or hyphens to partition words. The period sign should only precede the file extension e.g. ‘project_101_sample_001.xls’ or ‘project-101-sample-001.xls’
Vocabulary Standards | Maintain disciplinary standards in vocabulary, language and abbreviations e.g. ‘project-101-pcr-sample-001’ or ‘project-101-microarray-sample-101’
File Version Numbering | Label the file versions in numerical terms e.g. '1.0, 1.1, 1.2 etc.'
File Version Description | Complete the file name with descriptive terms at the end of the document name e.g. 'draft_1, draft_2, final_1 etc.'
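As a rough sketch of how these naming standards could be applied consistently, the Python below builds a file name from its parts and checks an existing name against the punctuation standard. The function names, the order of the parts, the lowercase-only rule and the choice to write version 1.0 as "v1-0" (so that the only period precedes the extension) are assumptions made for illustration, not requirements stated in this guide.

```python
import re
from datetime import date

# Assumed rule: lowercase letters and digits, separated by single underscores
# or hyphens, followed by exactly one period and the file extension.
VALID_NAME = re.compile(r"^[a-z0-9]+(?:[_-][a-z0-9]+)*\.[a-z0-9]+$")


def build_file_name(project, method, sample, collected, version, extension):
    """Assemble a hyphen-separated file name from its descriptive parts."""
    parts = [
        f"project-{project}",
        method.lower(),                    # vocabulary: agreed abbreviation, e.g. "pcr"
        f"sample-{sample:03d}",            # numbering: fixed three-digit width
        collected.isoformat(),             # date: YYYY-MM-DD
        f"v{version[0]}-{version[1]}",     # version 1.0 written as v1-0
    ]
    return "-".join(parts) + "." + extension


def is_valid_name(name):
    """Return True if `name` follows the assumed punctuation rule above."""
    return VALID_NAME.fullmatch(name) is not None
```

With these definitions, build_file_name(101, "PCR", 1, date(2024, 3, 5), (1, 0), "csv") gives "project-101-pcr-sample-001-2024-03-05-v1-0.csv", and is_valid_name rejects names that contain spaces or extra periods.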

 

To rename a large number of files at once, see Microsoft Support’s guide Rename Multiple Files in Windows XP with Windows Explorer.
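Bulk renaming can also be scripted. The sketch below is one possible approach in Python, not part of the Microsoft guide: it first builds a list of proposed renames so they can be reviewed, then applies them only if no target name already exists. The function names plan_renames and apply_renames are hypothetical.

```python
from pathlib import Path


def plan_renames(folder, old, new):
    """Return (source, target) pairs for files in `folder` whose name contains `old`.

    Only the part of the name before the extension is changed.
    """
    plan = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and old in path.stem:
            target = path.with_name(path.stem.replace(old, new) + path.suffix)
            plan.append((path, target))
    return plan


def apply_renames(plan):
    """Apply a rename plan, refusing to overwrite any existing file."""
    clashes = [target for _, target in plan if target.exists()]
    if clashes:
        raise FileExistsError(f"Targets already exist: {clashes}")
    for source, target in plan:
        source.rename(target)
```

For example, plan_renames("results", " ", "_") lists every file in a hypothetical "results" folder whose name contains a space, paired with the same name using underscores. Reviewing the plan before calling apply_renames reduces the risk of an unintended rename.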

Version control refers to the management of file revisions. It supports best practice in research data management during a project in which numerous researchers are constantly redrafting and revising files.

Throughout the research data lifecycle, multiple versions of documents or files can be created, and mechanisms must be put in place to distinguish between the different versions.

Version control management can be achieved through:

  • research data access and editing privilege control;
  • selecting one individual to handle all manual editing of data; and
  • the use of software such as Git.

Version control ensures maintenance of a master file which documents all versions and all changes that are made to the research data.

Researchers must also consider the longevity of the software/hardware required to create and analyse research data. It is important to include documentation relating to software/hardware requirements in the Research Data Management Plan. It may be necessary to include a copy of the software version including any related metadata together with the research data (depending on software licensing conditions).

Manage Version Control through file sharing services

There are numerous free or commercial storage tools available online which can aid researchers with version control management.

  • Pawsey Supercomputing Centre

The Pawsey Supercomputing Centre facilitates the uptake of supercomputing, large scale data storage and visualisation in Western Australia. It is an unincorporated joint venture between CSIRO, Curtin University, Edith Cowan University, Murdoch University and The University of Western Australia and is supported by the Western Australian Government. For details of services see https://pawsey.org.au/.

  • Australian Data Archive (ADA)

UWA is also contributing datasets to the Australian Data Archive (ADA) for further analysis by researchers. ADA Data Access information and forms are available. Those interested in contributing their research to ADA can find information on how to do this on the ADA website.

Through the use of metadata standards, computer software is able to retrieve and combine metadata from several sources.

The most commonly used descriptive standard is Dublin Core, as it is flexible across disciplines and data formats (including non-digital). It includes elements such as Title, Creator, Subject, Date and Type. The UWA Profiles and Research Repository uses Dublin Core and MARC as its metadata standards.
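To show what a Dublin Core description can look like in practice, the sketch below writes a handful of elements to XML using Python's standard library. The namespace URI is the published Dublin Core elements namespace; the wrapping "metadata" element, the function name and the sample values in the usage note are assumptions for illustration and do not represent the format used by any particular repository.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Serialise a dict of Dublin Core element names and values to an XML string.

    Keys are lowercase element names such as "title", "creator" or "date".
    """
    root = ET.Element("metadata")  # hypothetical wrapper element
    for element, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return ET.tostring(root, encoding="unicode")
```

Calling dublin_core_record({"title": "Example dataset", "creator": "A. Researcher", "date": "2024-03-05", "type": "Dataset"}) returns a single XML string in which each value appears inside an element such as dc:title or dc:creator.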

The Registry Interchange Format – Collections and Services (RIF-CS) schema is used to describe collections, parties, activities and services related to research data collections. UWA uses the RIF-CS schema to describe local research data collections which are then harvested into Research Data Australia (RDA). For more information about RIF-CS, please refer to the Australian National Data Service (ANDS) website.

Persistent identifiers

An identifier is a label or reference number given to a data object and is integral to research data documentation and metadata. It is the responsibility of the researcher to ensure that the location information of the research data is kept current. Identifiers should be both persistent and unique. A unique identifier that is not persistent, such as a Uniform Resource Locator (URL), may result in a broken link if the dataset is relocated. Persistent identifiers (PIDs) are kept current or redirected by the host over specified time periods. Digital Object Identifiers (DOIs), Persistent Uniform Resource Locators (PURLs) and the Handle System embed an identifier in the URL so that the PID is kept up to date.
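As a small illustration of how a persistent identifier is embedded in a URL, the Python sketch below turns a DOI into a link through the doi.org resolver, which redirects to wherever the host currently keeps the dataset. The function name and the DOI in the usage note are hypothetical, and the validity check is deliberately simple (it only confirms the "10." prefix and the presence of a slash).

```python
from urllib.parse import quote

RESOLVER = "https://doi.org/"


def doi_to_url(doi):
    """Return a resolver URL for a DOI given as '10.x/y', 'doi:10.x/y' or a doi.org URL."""
    doi = doi.strip()
    # Strip common prefixes so that only the bare DOI remains.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"Not a recognisable DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL, keeping the slash.
    return RESOLVER + quote(doi, safe="/")
```

For example, doi_to_url("doi:10.1234/example-dataset") returns "https://doi.org/10.1234/example-dataset". Recording the DOI rather than the dataset's current web address means the reference continues to work if the dataset is moved.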

The Australian National Data Service's (ANDS) guide on Persistent Identifiers awareness is a useful resource for researchers. 

Metadata should include an explanatory description of the research data incorporating:

UWA’s Research Data Management Plan is available to researchers for download to aid in the creation of metadata descriptions for their research data.

Discipline-specific metadata standards 

Discipline

Metadata Standard

Humanities Data

Geospatial Data

Social Sciences Data

Scientific Data

 

Section 6 of the Research Integrity Policy refers to the Responsible Conduct of Research.

The Australian Code for the Responsible Conduct of Research states:

2.6.1 Keep clear and accurate records of the research methods and data sources, including any approvals granted, during and after the research process.

Good quality documentation ensures that research data is:

  • discoverable;
  • usable and understandable; 
  • described to attain relevance for future use;
  • linked to results;
  • easily retrieved;
  • validated and reproducible;
  • protected from incidental destruction; and
  • used appropriately and accurately.

Depending on the research discipline, documentation will have different requirements but as a general rule should include comprehensive metadata.

What is Metadata? video

Edina Data Centre has developed a video which describes metadata and its benefits.

CONTENT LICENCE

Unless otherwise stated, content in this guide is licensed under a Creative Commons Attribution-ShareAlike 4.0 International Licence.