Data Smoothing and Transforming Data Using Bold Data Hub

In this article, we will demonstrate how to import tables from a CSV file, smooth data through transformations, and move the cleaned data into the destination database using Bold Data Hub. Follow the step-by-step process below.

Sample Data Source:
Sample CSC Data


Step-by-Step Process in Bold Data Hub

Step 1: Open Bold Data Hub

  • Click on Bold Data Hub.

Transformation Use Case

Step 2: Create a New Pipeline

  • Click Add Pipeline in the left-side panel.
  • Enter the pipeline name and click the tick icon.

Transformation Use Case

Step 3: Choose the Connector

  • Select the newly created pipeline and choose the CSV connector. Either double-click the connector or click the Add Template option to add it as a template.

Transformation Use Case

Step 4: Upload Your CSV File

  • Click the “Upload File” button to select and upload your CSV file.

Transformation Use Case

Step 5: Set the Properties

  • Copy the file path and paste it into the filePath property field.

Transformation Use Case

Step 6: Save and Choose the Destination

  • Click Save, choose the destination, and confirm by clicking the Yes button.

Transformation Use Case

Note: On-Demand Refresh will be triggered when the pipeline is saved. If needed, the pipeline can be scheduled in the Schedules tab.

Step 7: View Logs and Outputs

  • Click the pipeline name in the left-side panel and switch to the Logs tab to view logs.

Transformation Use Case

Step 8: Apply Transformations

  • Go to the Transform tab and click Add Table.

  • Enter the table name to create a transform table for the customer satisfaction summary (a sample transform query is sketched after the note below).

Transformation Use Case

Note: The data will first be moved into the DuckDB database under the designated {pipeline_name} schema before being transformed and loaded into the target database. For example, for a pipeline named “customer_service_data”, the data will be staged under the customer_service_data schema.
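
As a minimal sketch of such a transform table, the query below builds a per-agent customer satisfaction summary from the staged data. It assumes the staged table is {pipeline_name}.sample_csc_data and that the file includes a Customer_Satisfaction score column (an assumption; the remaining columns appear in the anomaly query later in this article). Adjust the names to match your data.

-- Hypothetical transform: per-agent customer satisfaction summary.
-- Customer_Satisfaction is an assumed column name; replace it with the
-- actual satisfaction field in your CSV.
SELECT
    Agent_ID,
    Agent_Name,
    COUNT(Ticket_ID) AS Total_Tickets,
    AVG(Customer_Satisfaction) AS Avg_Satisfaction
FROM {pipeline_name}.sample_csc_data
GROUP BY Agent_ID, Agent_Name;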


Learn more about transformation here

Data Smoothing

Overview

Ticket volumes may exhibit seasonal spikes, making it difficult to analyze long-term trends. By applying rolling averages, we can smooth fluctuations and gain clearer insights into underlying patterns.

Approach

We aggregate resolution times per agent to establish a baseline and detect anomalies: if a ticket’s resolution time exceeds the agent’s average by more than 2 standard deviations, it is flagged as an anomaly. A time-based rolling average (sketched after the main query below) can further smooth seasonal fluctuations.

SQL Query for Data Smoothing

WITH Agent_Resolution AS (
    -- Per-agent baseline: average resolution time and its standard deviation
    SELECT
        Agent_ID,
        Agent_Name,
        COUNT(Ticket_ID) AS Resolved_Tickets,
        AVG(Resolution_Time) AS Avg_Resolution_Time,
        STDDEV(Resolution_Time) AS Std_Dev_Resolution
    FROM {pipeline_name}.sample_csc_data
    WHERE Ticket_Status = 'Resolved' AND Resolution_Time IS NOT NULL
    GROUP BY Agent_ID, Agent_Name
)
SELECT
    t.Ticket_ID,
    t.Agent_ID,
    t.Agent_Name,
    t.Resolution_Time,
    -- Flag tickets more than 2 standard deviations above the agent's average
    CASE
        WHEN t.Resolution_Time > (a.Avg_Resolution_Time + 2 * a.Std_Dev_Resolution)
        THEN 'Anomaly'
        ELSE 'Normal'
    END AS Resolution_Anomaly
FROM {pipeline_name}.sample_csc_data t
JOIN Agent_Resolution a ON t.Agent_ID = a.Agent_ID
WHERE t.Ticket_Status = 'Resolved';
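
The query above compares each ticket against a global per-agent average. For a genuinely rolling (time-windowed) average, a window function can be used instead. The sketch below assumes a ticket date column such as Created_Date, which does not appear in the query above and may be named differently in your data:

-- Minimal rolling-average sketch (window function syntax supported by DuckDB).
-- Created_Date is an assumed column name; substitute your ticket date field.
SELECT
    Ticket_ID,
    Agent_ID,
    Agent_Name,
    Resolution_Time,
    -- Average of the current ticket and the 6 preceding tickets per agent
    AVG(Resolution_Time) OVER (
        PARTITION BY Agent_ID
        ORDER BY Created_Date
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS Rolling_Avg_Resolution_Time
FROM {pipeline_name}.sample_csc_data
WHERE Ticket_Status = 'Resolved' AND Resolution_Time IS NOT NULL;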

Transformation Use Case