After collecting raw data, researchers carry out several key processes to ensure that the data is accurate, reliable, and appropriate for analysis. Below is a guide to data processing, from data cleansing, transformation, and integration to data analysis, visualization, and interpretation.
The 6 Key Processes
1. Data Cleansing
Data cleansing, also known as data cleaning or data scrubbing, is the process of identifying and correcting errors, inconsistencies, and inaccuracies in a dataset to improve its quality and reliability. The process involves removing duplicate records, filling in missing values, correcting corrupted and erroneous data entries, and standardizing data formats.
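As an illustration only, the cleansing steps named above (removing duplicates, filling in missing values, standardizing formats) might be sketched as follows. The records, field names, the two accepted date formats, and the choice of mean imputation are all assumptions made for this example, not a prescribed method.

```python
from datetime import datetime

# Hypothetical raw survey records; field names and values are made up.
raw_records = [
    {"id": "001", "age": "34", "joined": "2023-01-15"},
    {"id": "002", "age": "", "joined": "15/02/2023"},    # missing age, other date format
    {"id": "001", "age": "34", "joined": "2023-01-15"},  # duplicate record
    {"id": "003", "age": "42", "joined": "2023-03-01"},
]

def standardize_date(text):
    """Convert a date in either assumed input format to ISO 8601 (YYYY-MM-DD)."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {text!r}")

def cleanse(records):
    # 1. Remove duplicate records, keeping the first occurrence of each id.
    seen, unique = set(), []
    for rec in records:
        if rec["id"] not in seen:
            seen.add(rec["id"])
            unique.append(dict(rec))

    # 2. Fill missing ages with the mean of the known ages (one simple strategy).
    known = [int(r["age"]) for r in unique if r["age"]]
    mean_age = round(sum(known) / len(known))
    for rec in unique:
        rec["age"] = int(rec["age"]) if rec["age"] else mean_age
        # 3. Standardize date formats.
        rec["joined"] = standardize_date(rec["joined"])
    return unique

cleaned = cleanse(raw_records)
```

Whether mean imputation is appropriate depends on the dataset and research design; in some studies it is better to leave missing values flagged rather than filled.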
A video created by NetCom Learning (1:50).
2. Data Transformation
Data transformation converts data into a format or structure suitable for analysis. This may involve normalization, deduplication, aggregation, or discretization, as well as creating new variables or converting data types. Transformation prepares data in a target format that can be fed into operational systems, a data lake, a data warehouse, or other repositories for use in business intelligence and analytics applications.
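A minimal sketch of three of the transformations mentioned above (normalization, discretization, and aggregation) is given below. The records, the field names, and the cut-off values for the score bands are hypothetical and chosen only to keep the example small.

```python
from collections import defaultdict

# Hypothetical cleaned records (names and values are illustrative only).
records = [
    {"dept": "A", "score": 50.0},
    {"dept": "A", "score": 70.0},
    {"dept": "B", "score": 90.0},
    {"dept": "B", "score": 100.0},
]

def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def discretize(score):
    """Map a continuous score to a category, using assumed cut-offs."""
    if score < 60:
        return "low"
    if score < 90:
        return "medium"
    return "high"

scores = [r["score"] for r in records]
for rec, norm in zip(records, min_max_normalize(scores)):
    rec["score_norm"] = norm                # normalization
    rec["band"] = discretize(rec["score"])  # discretization (a new variable)

# Aggregation: mean score per department.
by_dept = defaultdict(list)
for rec in records:
    by_dept[rec["dept"]].append(rec["score"])
dept_means = {dept: sum(vals) / len(vals) for dept, vals in by_dept.items()}
```

Min-max scaling is only one of several normalization schemes; standardizing to zero mean and unit variance is a common alternative when the data contains outliers.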
A video created by the University of Liverpool (3:41).
3. Data Integration
Data integration is the process of combining and harmonizing data from multiple sources into a unified and coherent format. The data from various systems and databases is transformed into a consistent structure and made accessible for analysis and decision making. Common data integration approaches include ELT (Extract, Load, and Transform), real-time data integration, application integration (API), and data virtualization.
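To make the idea of harmonizing sources concrete, the sketch below merges two hypothetical sources that describe the same participants with different field names, identifier casing, and units. The source layouts and the unified schema (`id`, `age`, `weight_kg`) are assumptions for illustration; real integration pipelines usually rely on dedicated tooling.

```python
# Two hypothetical sources describing the same participants with different schemas.
survey_source = [
    {"participant_id": "P1", "Age": "34"},
    {"participant_id": "P2", "Age": "42"},
]
lab_source = [
    {"pid": "p1", "weight_lb": 154.0},
    {"pid": "p3", "weight_lb": 176.0},
]

LB_TO_KG = 0.45359237  # definition of the international pound in kilograms

def from_survey(row):
    """Map a survey row onto the unified schema."""
    return {"id": row["participant_id"].upper(), "age": int(row["Age"])}

def from_lab(row):
    """Map a lab row onto the unified schema, converting pounds to kilograms."""
    return {"id": row["pid"].upper(), "weight_kg": round(row["weight_lb"] * LB_TO_KG, 1)}

def integrate(*sources):
    """Merge harmonized rows by id; fields absent from every source stay None."""
    unified = {}
    for rows in sources:
        for row in rows:
            empty = {"id": row["id"], "age": None, "weight_kg": None}
            unified.setdefault(row["id"], empty).update(row)
    return sorted(unified.values(), key=lambda r: r["id"])

unified = integrate(
    [from_survey(r) for r in survey_source],
    [from_lab(r) for r in lab_source],
)
```

Participants that appear in only one source are kept, with the unavailable fields left as `None`, so that gaps remain visible to later analysis rather than being silently dropped.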
A video created by Qlik (2:04).
4. Data Analysis
Data analysis is the process of systematically applying statistical and logical techniques to evaluate and interpret data, uncovering patterns, trends, and relationships. It involves cleaning, transforming, and modeling data, whether structured or unstructured, to extract meaningful insights, predict trends, and support decision-making. Data analysis can be qualitative or quantitative, depending on the nature of the data and the research objectives.
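For the quantitative case, a small sketch of what "applying statistical techniques" can look like is shown below: descriptive statistics for one variable and a Pearson correlation between two. The paired observations are invented for the example, and `pearson_r` is a helper written here rather than a library function.

```python
import math
import statistics

# Hypothetical paired observations (illustrative values only).
study_hours = [2.0, 4.0, 6.0, 8.0, 10.0]
exam_scores = [55.0, 60.0, 70.0, 80.0, 85.0]

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Descriptive statistics summarize a single variable.
summary = {
    "mean_score": statistics.mean(exam_scores),
    "median_score": statistics.median(exam_scores),
    "stdev_score": statistics.stdev(exam_scores),  # sample standard deviation
}

# A correlation coefficient quantifies the linear relationship between two variables.
r = pearson_r(study_hours, exam_scores)
```

With these made-up values, `r` comes out at roughly 0.99, indicating a strong positive linear relationship. A correlation alone does not establish causation.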
A video created by DataWrangler (2:46).
5. Data Visualization
Data visualization is the process of presenting data in a visually appealing and understandable manner, which helps convey complex information and insights to stakeholders effectively. Commonly adopted graphics include charts, plots, infographics, heat maps, and even animations. There are two primary classes of visualization: information visualization and scientific visualization.
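As a very small illustration of turning numbers into a chart, the sketch below renders a horizontal bar chart as plain text. In practice a plotting library would normally be used; the text rendering, the `text_bar_chart` helper, and the category counts are assumptions chosen so the example runs with the Python standard library alone.

```python
# Hypothetical category counts (illustrative values only).
counts = {"low": 3, "medium": 7, "high": 5}

def text_bar_chart(data, width=20, mark="#"):
    """Render a horizontal bar chart as text, scaling the largest value to `width`."""
    largest = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        # Bar length is proportional to the value, rounded to whole characters.
        bar = mark * round(value / largest * width)
        lines.append(f"{label.ljust(label_width)} | {bar} {value}")
    return "\n".join(lines)

chart = text_bar_chart(counts)
print(chart)
```

Printing the raw value next to each bar keeps the chart readable even though rounding makes the bar lengths approximate.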
A video created by the University of British Columbia (2:40).
6. Data Interpretation
Data interpretation involves analyzing collected data to draw meaningful conclusions. This process includes identifying patterns, trends, and relationships within the data. Researchers use statistical methods, visualizations, and contextual knowledge to understand the data's implications. Effective data interpretation helps validate hypotheses, inform decision-making, and generate insights that contribute to research objectives.
A video created by the University of Melbourne (8:24).
For enquiries, please contact the Library's Research Data Management Services of the Research Support and Scholarly Communication Section at lbrdms@cityu.edu.hk