Before commencing your research project, you will need to decide how you and your team will work with your data. These decisions need to be made at the outset and documented in your data management plan.
Metadata is an essential part of your project documentation.
Researchers need to ensure that their research data is secure and retrievable for long-term use. The longevity of research data formats and software versions must be accounted for at the data storage stage.
When selecting file formats consideration must be given to:
During data collection and analysis, researchers may select specific data formats. Conversion of data into standard, interchangeable formats may be necessary for preservation purposes. Because proprietary formats can limit future access to and reuse of data, it is advisable to preserve data in open or widely supported formats such as Rich Text Format (RTF) or OpenDocument Format (ODF). The Library of Congress Recommended Formats Statement (2021-2022) includes recommendations for datasets and associated metadata.
Researchers should document all data capture and storage formats, as well as any analysis software used. Because software may change or become unavailable in future, it is also advisable to store a copy of the software along with the data.
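One way to keep such a record is a small manifest file stored with the data. The following is a minimal sketch only: the function names `build_manifest` and `write_manifest`, the manifest field names and the `software` argument are all assumptions made for illustration, not part of any standard or of this guide's requirements.

```python
import hashlib
import json
import platform
import sys
from pathlib import Path


def build_manifest(data_dir, software=None):
    """Describe every file under data_dir and the software used with it.

    `software` is a hypothetical mapping of tool name to version string,
    for example {"R": "4.3.1"}; it is recorded as given, not detected.
    """
    root = Path(data_dir)
    files = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        files.append({
            "path": path.relative_to(root).as_posix(),
            # Format is inferred from the file extension only.
            "format": path.suffix.lower().lstrip(".") or "unknown",
            "bytes": path.stat().st_size,
            # A checksum lets you confirm later that the file is unchanged.
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
        })
    return {
        "python_version": platform.python_version(),
        "platform": sys.platform,
        "analysis_software": software or {},
        "files": files,
    }


def write_manifest(data_dir, out_path, software=None):
    """Build the manifest and save it as JSON at out_path."""
    manifest = build_manifest(data_dir, software)
    Path(out_path).write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest
```

Calling `write_manifest("data", "manifest.json", {"R": "4.3.1"})` would list each file under `data` with its extension, size and checksum. Writing the manifest outside the data folder keeps it from appearing in its own file list on later runs.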
Establishing a logical, consistent way to organise your folders and the files within them will make it much easier to manage your data throughout the project and to describe your project in future publications. There are many ways of doing this, such as by date or project. The University of Cambridge has guidance on organising your data, suggestions for file names, versioning, documentation and metadata, as well as managing emails. Some commonly used techniques include:
For guidance on data versioning, see the ANDS site: https://www.ands.org.au/working-with-data/data-management/data-versioning
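A consistent naming scheme is easier to keep to when file names are generated rather than typed by hand. The sketch below assumes one possible convention (ISO date, project, description and a zero-padded version number, joined by underscores); the convention and the helper names `slugify` and `build_filename` are illustrative assumptions, not a recommendation taken from the sources above.

```python
import re
from datetime import date


def slugify(text):
    """Lower-case text and replace runs of other characters with a hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(project, description, created, version, extension):
    """Build a name in the assumed pattern date_project_description_vNN.ext."""
    if version < 1:
        raise ValueError("version numbers start at 1")
    parts = [
        created.isoformat(),    # ISO 8601 dates sort chronologically as text
        slugify(project),
        slugify(description),
        f"v{version:02d}",      # zero-padded so v02 sorts before v10
    ]
    return "_".join(parts) + "." + extension.lower().lstrip(".")


name = build_filename("ReefSurvey", "Site A counts", date(2024, 3, 5), 2, ".CSV")
print(name)  # 2024-03-05_reefsurvey_site-a-counts_v02.csv
```

Avoiding spaces and special characters, as `slugify` does here, keeps names portable across operating systems and scripts.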
Quality assurance comprises the steps taken during the data collection process to ensure that the data is complete and of high quality. As explained by the US Geological Survey (USGS), the key steps are:
Quality control is the subsequent process of checking that the data meets overall quality goals and criteria. Quality control should be conducted regularly throughout the project and may lead to improvements in quality assurance processes.
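Routine quality control checks can be partly automated. The following is a minimal sketch, assuming tabular data read as dictionaries (for example with `csv.DictReader`); the function name `check_rows`, the column names and the acceptable temperature range are hypothetical and would need to reflect your own project's quality criteria.

```python
import csv
import io


def check_rows(rows, required, ranges):
    """Return (row_number, field, problem) tuples for rows that fail a check.

    Row numbers count data rows from 1 and exclude the header line.
    """
    problems = []
    for number, row in enumerate(rows, start=1):
        for field in required:
            if not (row.get(field) or "").strip():
                problems.append((number, field, "missing value"))
        for field, (low, high) in ranges.items():
            raw = (row.get(field) or "").strip()
            if not raw:
                continue  # empty values are reported by the required check
            try:
                value = float(raw)
            except ValueError:
                problems.append((number, field, "not a number"))
                continue
            if not low <= value <= high:
                problems.append((number, field, "out of range"))
    return problems


# Hypothetical sample data with three deliberate problems.
sample = io.StringIO(
    "site,temperature_c\n"
    "A,21.5\n"
    ",19.0\n"
    "C,210\n"
    "D,n/a\n"
)
report = check_rows(
    csv.DictReader(sample),
    required=["site", "temperature_c"],
    ranges={"temperature_c": (-5.0, 45.0)},
)
for problem in report:
    print(problem)
```

Running such a check at regular intervals, rather than once at the end, makes it more likely that a recurring problem is traced back to the collection step that caused it.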
Metadata can be explained simply as 'data about data'. Metadata describes the related research data and details its location to enable efficient retrieval and reuse throughout the research data lifecycle.
It is an essential component of research data management and should incorporate file naming and organisation protocols which are used by all researchers working on a project.
Metadata:
Correct storage of documentation and metadata is just as important as the storage of the research data itself, because the metadata gives raw research data its descriptive meaning. Researchers are encouraged to store all documentation and metadata according to the same guidelines as those used for research data storage and backup.
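One simple way to keep metadata with the data it describes is a "sidecar" file saved next to each data file. The sketch below is an illustration under stated assumptions: the field names in `REQUIRED_FIELDS`, the `.metadata.json` suffix and the helper names are hypothetical and do not follow any particular metadata schema, so a real project would substitute the elements of its chosen standard.

```python
import json
from pathlib import Path

# Hypothetical minimum set of fields; replace with your schema's elements.
REQUIRED_FIELDS = ("title", "creator", "date_created", "description", "file_format")


def missing_fields(record):
    """List required fields that are absent or blank in the record."""
    return [name for name in REQUIRED_FIELDS
            if not str(record.get(name) or "").strip()]


def write_sidecar(data_path, record):
    """Write <data file name>.metadata.json beside the data file."""
    missing = missing_fields(record)
    if missing:
        raise ValueError("metadata record is missing: " + ", ".join(missing))
    data_path = Path(data_path)
    sidecar = data_path.with_name(data_path.name + ".metadata.json")
    # Record which file the metadata describes so the two can be re-matched.
    payload = dict(record, location=data_path.name)
    sidecar.write_text(json.dumps(payload, indent=2, ensure_ascii=False),
                       encoding="utf-8")
    return sidecar
```

Because the sidecar sits in the same folder as the data file, it is covered by the same storage and backup arrangements without extra steps.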
Many disciplines have their own way of structuring metadata, known as schemas. They list the information you need to provide about your data and how the information should be structured. Some examples include:
Discipline | Metadata Standard |
---|---|
General | Metadata Object Description Schema (MODS) |
Humanities | |
Social Sciences | Data Documentation Initiative (DDI) |
Sciences | |
Geospatial | Content Standard for Digital Geospatial Metadata (CSDGM) |
There are several tools available to make creating and managing metadata easier, as listed by Stanford Libraries.
Except for logos, Canva designs or where otherwise indicated, content in this guide is licensed under a Creative Commons Attribution-ShareAlike 4.0 International Licence.