
RDM - Collect & Organize: Documentation and Metadata



Documentation and metadata are the detailed information that describes and contextualizes data so that others can understand and use it.

Data documentation provides all the information required to discover, interpret, understand, access, and reuse the data. It includes protocols, methodologies, and data collection processes, providing a comprehensive understanding of the research. In contrast, metadata consists of structured information such as titles, dates, authors, and keywords that facilitate data discovery and retrieval. Together, they enhance data transparency, reproducibility, and sharing, enabling researchers to effectively manage, interpret, and utilize data throughout the research lifecycle.

 Metadata

Metadata provides context and information about the data set, including details such as the creator, date of creation, file format, and any relevant methodology. Proper and comprehensive metadata ensures that data can be understood, interpreted, and reused by others in the future. Generally, it includes descriptive, structural, and administrative information. 

Descriptive Metadata
Description: Provides information about the content and context of the data. It helps users understand what the data is about, who created it, and when it was created.
Examples:
  • Title
  • Creator
  • Date
  • Description
  • Subject
  • Keywords
  • DOI, etc.

Structural Metadata
Description: Describes the organization and relationships within a data set. It provides information on how data is structured and how different components of the data relate to each other.
Examples:
  • File format
  • File names
  • Data schema
  • Table of contents
  • Versions, etc.

Administrative Metadata
Description: Provides information needed to manage a data set. This type of metadata ensures that data is properly maintained and accessible over time.
Examples:
  • Rights & License
  • Access rights, such as an embargo period
  • Retention Policy
  • Contact Information
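
The three metadata types above can be kept together in one machine-readable record. The following is a minimal sketch in Python, not a formal metadata standard: the field names, the `REQUIRED` list, the `missing_fields` helper, and every value (title, creator, date, license, contact address) are hypothetical placeholders chosen to mirror the examples listed above.

```python
import json

# Hypothetical metadata record; every value below is a placeholder.
metadata = {
    "descriptive": {
        "title": "Customer Satisfaction Survey",
        "creator": "Example Research Group",
        "date": "2024-01-15",
        "description": "Responses to a customer satisfaction survey.",
        "subject": "Customer satisfaction",
        "keywords": ["survey", "satisfaction"],
    },
    "structural": {
        "file_format": "CSV",
        "file_names": ["survey_responses.csv"],
        "version": "1.0",
    },
    "administrative": {
        "license": "CC BY 4.0",
        "embargo_until": None,  # None means no embargo period
        "retention_policy": "Retain for 10 years",
        "contact": "data-contact@example.org",
    },
}

# Assumed minimum set of fields; adjust to the funder's or discipline's schema.
REQUIRED = {
    "descriptive": ["title", "creator", "date"],
    "structural": ["file_format"],
    "administrative": ["license", "contact"],
}


def missing_fields(record, required=REQUIRED):
    """Return 'section.field' names that are absent or empty in the record."""
    missing = []
    for section, fields in required.items():
        for field in fields:
            if not record.get(section, {}).get(field):
                missing.append(f"{section}.{field}")
    return missing


# Serialize the record so it can be stored alongside the data files.
print(json.dumps(metadata, indent=2))
```

Calling `missing_fields(metadata)` before depositing the data gives a quick check that no assumed-required field was left blank.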

Some funders require the metadata standards to be detailed in the Data Management Plan (DMP), while others stipulate that associated data from funded projects must be shared openly and described with comprehensive metadata that follows best practices of the respective discipline. Researchers should carefully observe the metadata requirements specified by their funders.

Below are some useful resources for exploring metadata schemas in your research area:

 

  For enquiries, please contact the Library's Research Data Management Services of the Research Support and Scholarly Communication Section at lbrdms@cityu.edu.hk

Documentation of metadata

Data Dictionary

A data dictionary is a detailed description of the data within a dataset. The data dictionary helps users understand the variables, their definitions, data types, and any specific coding or classification used.

An example: Survey responses

Variable Name | Description | Data Type
Respondent ID | Unique identifier for each respondent | Integer
Age | Age of the respondent in years | Integer
Gender | Gender of the respondent | String
Score | Customer satisfaction score | Integer
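
A data dictionary can also be stored in machine-readable form and used to check records against it. The following is a minimal sketch in Python under stated assumptions: the column names (`respondent_id`, `age`, `gender`, `score`) are hypothetical machine-friendly versions of the variable names in the survey example, age is assumed to be recorded as a whole number of years, and `validate_row` is an illustrative helper rather than part of any library.

```python
# Hypothetical data dictionary for the survey example.
DATA_DICTIONARY = {
    "respondent_id": {
        "description": "Unique identifier for each respondent",
        "type": int,
    },
    "age": {
        "description": "Age of the respondent in years (assumed integer)",
        "type": int,
    },
    "gender": {
        "description": "Gender of the respondent",
        "type": str,
    },
    "score": {
        "description": "Customer satisfaction score",
        "type": int,
    },
}


def validate_row(row, dictionary=DATA_DICTIONARY):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    # Every documented variable must be present with the documented type.
    for name, spec in dictionary.items():
        if name not in row:
            problems.append(f"missing variable: {name}")
        elif not isinstance(row[name], spec["type"]):
            expected = spec["type"].__name__
            actual = type(row[name]).__name__
            problems.append(f"{name}: expected {expected}, got {actual}")
    # Every variable in the record must be documented.
    for name in row:
        if name not in dictionary:
            problems.append(f"undocumented variable: {name}")
    return problems


good = {"respondent_id": 1, "age": 34, "gender": "F", "score": 4}
bad = {"respondent_id": 2, "age": "34", "score": 5, "notes": "late reply"}
print(validate_row(good))
print(validate_row(bad))
```

Keeping the dictionary and the check in one place means an undocumented or mistyped variable is flagged as soon as it appears in the data.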

 

README file

Similar to a data dictionary, a README file is a text document that provides an overview of the dataset, including essential information about its purpose, structure, and usage. It typically includes details on how to navigate the dataset, any pre-processing steps, and instructions for using the data.
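
A README can be written by hand, or assembled from a few fields so that every dataset in a project follows the same outline. The following is a minimal sketch in Python: the section headings (Purpose, Files, Pre-processing, Usage) follow the elements mentioned above, while the `build_readme` function, the file names, and all sample text are hypothetical.

```python
def build_readme(title, purpose, files, preprocessing, usage):
    """Assemble a plain-text README from the given pieces."""
    lines = [title, "=" * len(title), "", "Purpose", purpose, "", "Files"]
    # One line per file: name followed by a short description.
    lines += [f"- {name}: {desc}" for name, desc in files.items()]
    lines += ["", "Pre-processing"]
    # Number the pre-processing steps in the order they were applied.
    lines += [f"{i}. {step}" for i, step in enumerate(preprocessing, start=1)]
    lines += ["", "Usage", usage]
    return "\n".join(lines) + "\n"


# Hypothetical content for the survey example.
readme_text = build_readme(
    title="Customer Satisfaction Survey",
    purpose="Survey responses collected to measure customer satisfaction.",
    files={
        "survey_responses.csv": "one row per respondent",
        "data_dictionary.csv": "variable names, descriptions, and data types",
    },
    preprocessing=[
        "Removed incomplete responses.",
        "Replaced names with respondent IDs.",
    ],
    usage="Open survey_responses.csv with any spreadsheet or statistics tool.",
)
print(readme_text)
```

The resulting text can be saved as `README.txt` in the top-level folder of the dataset.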

Useful guidelines for creating a README file

 
