Removing Duplicates and Transforming Data Using Bold Data Hub

In this article, we will demonstrate how to import tables from a CSV file, remove duplicate records using transformations, and migrate the cleaned data into the destination database using Bold Data Hub. Follow the step-by-step process below.

Sample Data Source:

Sample CSC Data

Creating Pipeline

Learn about [Pipeline Creation] (https://help.boldbi.com/working-with-data-sources/working-with-bold-data-hub/working-with-pipelines/)

Applying Transformation

Go to the Transform tab and click Add Table.
Enter the table name to create a transform table for customer satisfaction summary.

Tranformation Use Case

Note: The data will initially be transferred to the DuckDB database within the designated {pipeline_name} schema before undergoing transformation for integration into the target databases. As an illustration, in the case of a pipeline named “customer_service_data”, the data will be relocated to the customer_service_data table schema.

Learn more about transformation here

Removing Duplicate Records

Overview

Duplicate records in a dataset can lead to inconsistencies and inaccurate analysis. To ensure data integrity, we can remove duplicates based on unique identifiers such as Customer_ID and Ticket_ID while retaining the most recent entry.

Approach

We use the DISTINCT ON clause to retain only one record per Customer_ID and Ticket_ID, prioritizing the latest entry based on Ticket_Creation_Date DESC.

SQL Query for Removing Duplicates

WITH Unique_Tickets AS (
    SELECT DISTINCT ON (Customer_ID, Ticket_ID) * 
    FROM {pipeline_name}.sample_csc_data 
    ORDER BY Customer_ID, Ticket_ID, Ticket_Creation_Date DESC
) 
SELECT * FROM Unique_Tickets;

Tranformation Use Case

Contents

Creating Pipeline
Applying Transformation
Removing Duplicate Records
Overview
Approach
SQL Query for Removing Duplicates

Contents

Creating Pipeline
Applying Transformation
Removing Duplicate Records
Overview
Approach
SQL Query for Removing Duplicates

Please provide additional information

Thank you for your feedback and comments.We will rectify this as soon as possible!