
RDM - Collect & Organize: Data Collection



Data collection is the foundation of a study: it ensures that relevant and accurate information is gathered for analysis. The main types and sources of research data and some common data collection methods are listed below.

 Main Types and Sources of Research Data

Each type is listed with a description and example sources.

Primary Data: Original, firsthand data collected directly by researchers for a specific research purpose.
Examples:
  • Experiments and trials
  • Surveys / Interviews / Focus Groups
  • Direct observations
  • Case and Longitudinal Studies
  • ......

Secondary Data: Data that has already been collected, analyzed, and published by someone else.
Examples:
  • Published research articles and reports
  • Government and organizational databases
  • Online databases and repositories
  • Historical records and archives
  • Media content (e.g., newspapers, television broadcasts)
  • ......

Qualitative Data: Descriptive data, often collected through methods such as interviews, focus groups, and observations.
Examples:
  • Textual data (e.g., interview transcripts, open-ended survey responses)
  • Visual data (e.g., photographs, videos)
  • Audio recordings
  • ......

Quantitative Data: Numerical data that can be measured and analyzed statistically.
Examples:
  • Experimental measurements
  • Statistical datasets
  • Numerical survey responses
  • ......

Sensor and Instrument Data: Data collected using various sensors and instruments, often in scientific and engineering research.
Examples:
  • Environmental data from weather stations
  • Geospatial data from GPS devices
  • Biological data from laboratory instruments
  • ......

Administrative Data: Data collected by organizations or institutions as part of their routine operations.
Examples:
  • Employment data
  • Health records
  • Educational records
  • ......

Big Data: Large and complex datasets that require advanced methods for storage, processing, and analysis.
Examples:
  • Social media platforms
  • E-commerce transactions
  • Internet of Things (IoT) devices

 

Common Data Collection Methods

Experiments: This involves manipulating one or more variables to determine their effect on other variables. Experiments are often conducted in controlled environments and are common in scientific research.
Case Studies: This method involves an in-depth analysis of a single case or a small number of cases. It is useful for exploring complex phenomena in detail.
Observations: Researchers collect data by observing subjects in their natural environment. This method is often used in ethnographic studies and can be either participant or non-participant observation.
Surveys and Questionnaires: These are used to gather data from a large number of respondents. They can be administered in person, over the phone, via mail, or online. Surveys can include open-ended questions, closed-ended questions, or a mix of both.
Interviews: This method involves direct, one-on-one interaction between the researcher and the participant. Interviews can be structured, semi-structured, or unstructured, depending on the level of flexibility desired.
Focus Groups: This involves guided discussions with a group of people to gather diverse perspectives on a particular topic. It is useful for exploring complex issues and generating ideas.
Document and Content Analysis: This involves analyzing existing documents, texts, or media to extract relevant information. It is often used in historical research or media studies.
Secondary Data Analysis: Researchers use existing data collected by other researchers or organizations. This can include datasets from government agencies, research institutions, or commercial entities.

 Handling Sensitive Data

Before sharing data publicly or storing it long-term, ensure data privacy by anonymizing personal information, encrypting sensitive data, and implementing strict access controls. Doing so protects individuals' privacy and helps meet legal and ethical requirements.

Data Classification

City University of Hong Kong has released its latest Information Classification and Handling Standard to the public. The standard classifies all of the University's information assets into appropriate levels that indicate the need, priority and degree of protection required. For details, please refer to the Information Classification and Handling Standard, created by the Computing Service Centre, City University of Hong Kong.

Data Encryption

Data encryption is a critical component in ensuring that sensitive information remains secure and inaccessible to unauthorized users. It transforms readable data (plaintext) into an unreadable format (ciphertext) using algorithms and keys. Details of these technologies can be found on the Encryption for Information Protection webpage, created by the Computing Service Centre, City University of Hong Kong.
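To make the plaintext, key and ciphertext relationship concrete, the toy sketch below encrypts a short string with a one-time pad (XOR with a random key of the same length) using only the Python standard library. The sample text and the helper name xor_bytes are hypothetical, and this is an illustration of the concept only, not a recommended tool: for real research data, use the vetted encryption tools described by the Computing Service Centre.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Combine data and key byte by byte with XOR (one-time pad)."""
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


# Hypothetical sample record, used for illustration only.
plaintext = "participant 17: interview notes".encode("utf-8")

# The key is random, as long as the message, and must never be reused.
key = secrets.token_bytes(len(plaintext))

ciphertext = xor_bytes(plaintext, key)   # unreadable without the key
recovered = xor_bytes(ciphertext, key)   # applying the same key decrypts
```

Anyone holding only the ciphertext cannot read the record; anyone holding both the ciphertext and the key can. This is why the key has to be stored and shared separately from the encrypted data.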

 

Beyond these security technologies, the following practices help protect individual identities without relying on technical security measures.

Remove or alter personal identifiers (e.g., names, addresses) to prevent the identification of individuals.
Aggregate data or use broader categories to reduce the specificity of information (e.g., age ranges instead of exact ages).
Obtain explicit consent from individuals before collecting, using, or sharing their data, ensuring they are fully informed about how their data will be used.
Collect only the data that is necessary for the research purpose, avoiding the collection of excessive or irrelevant information.
Replace personal identifiers with pseudonyms or codes that can be linked to the original data only through a separate, secure key.
Require all personnel handling sensitive data to sign confidentiality agreements, committing to protect the privacy of the data.
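
Several of the practices above (removing identifiers, using broader categories, and replacing identifiers with codes linked through a separate key) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the field names (name, address, age, score), the sample records, and the helper names age_range and pseudonymize are all hypothetical, and real datasets may need additional checks for indirect identifiers.

```python
import secrets


def age_range(age, width=10):
    """Generalize an exact age into a range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def pseudonymize(records, id_field="name", drop_fields=("address",)):
    """Return (cleaned_records, key_table).

    key_table maps each code back to the original identifier and should
    be stored separately and securely, away from the cleaned records.
    """
    codes = {}    # original identifier -> random code
    cleaned = []
    for record in records:
        original = record[id_field]
        if original not in codes:
            code = "P-" + secrets.token_hex(4)
            while code in codes.values():   # avoid reusing a code
                code = "P-" + secrets.token_hex(4)
            codes[original] = code
        # Drop direct identifiers and the exact age.
        row = {
            field: value
            for field, value in record.items()
            if field not in (id_field, "age", *drop_fields)
        }
        row["participant"] = codes[original]
        if "age" in record:
            row["age_range"] = age_range(record["age"])
        cleaned.append(row)
    key_table = {code: original for original, code in codes.items()}
    return cleaned, key_table


# Hypothetical sample data, for illustration only.
records = [
    {"name": "Alice Chan", "address": "1 Example Road", "age": 34, "score": 7},
    {"name": "Bob Lee", "address": "2 Example Road", "age": 58, "score": 4},
    {"name": "Alice Chan", "address": "1 Example Road", "age": 34, "score": 9},
]
cleaned, key_table = pseudonymize(records)
```

The cleaned records can be shared or analyzed, while key_table stays with the few people authorized to re-identify participants. Random codes are used rather than codes derived from the names, so the codes themselves reveal nothing about the individuals.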

  For enquiries, please contact the Library's Research Data Management Services of the Research Support and Scholarly Communication Section at lbrdms@cityu.edu.hk