In this article, we will demonstrate how to import tables from a CSV file, flag suspicious data through transformations, and move the cleaned data into the destination database using Bold Data Hub. Follow the step-by-step process below.
Sample Data Source:
Learn about Pipeline Creation
Learn more about transformation here
To maintain data accuracy, flag records whose fields contradict each other. For example, an “Open” ticket should not have a resolution time, and a “Resolved” ticket should have a valid resolution time, meaning one that is present and greater than zero.
We use a CASE statement to identify and flag suspicious records:
SELECT
    Ticket_ID,
    Ticket_Status,
    Resolution_Time,
    CASE
        -- An open ticket should not have a resolution time yet
        WHEN Ticket_Status = 'Open' AND Resolution_Time IS NOT NULL THEN 'Conflict'
        -- A resolved ticket needs a resolution time that is present and positive
        WHEN Ticket_Status = 'Resolved' AND (Resolution_Time IS NULL OR Resolution_Time <= 0) THEN 'Invalid Resolution Time'
        ELSE 'Valid'
    END AS Suspicious_Flag
FROM {pipeline_name}.sample_csc_data;
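To check the flagging logic before running it in a pipeline, the same CASE expression can be tried against a handful of hand-made rows. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in engine, and the table name sample_tickets, the sample rows, and the helper flag_sample_rows are hypothetical names that do not come from Bold Data Hub or the sample CSV.

```python
import sqlite3

# Hypothetical sample rows: (Ticket_ID, Ticket_Status, Resolution_Time)
SAMPLE_ROWS = [
    (1, "Open", None),      # consistent: open, no resolution time
    (2, "Open", 12.5),      # open ticket that already has a resolution time
    (3, "Resolved", 8.0),   # consistent: resolved with a positive time
    (4, "Resolved", None),  # resolved but the resolution time is missing
    (5, "Resolved", 0),     # resolved but the resolution time is not positive
]

# Same CASE expression as in the article, run against a local stand-in table
FLAG_QUERY = """
SELECT
    Ticket_ID,
    CASE
        WHEN Ticket_Status = 'Open' AND Resolution_Time IS NOT NULL THEN 'Conflict'
        WHEN Ticket_Status = 'Resolved' AND (Resolution_Time IS NULL OR Resolution_Time <= 0) THEN 'Invalid Resolution Time'
        ELSE 'Valid'
    END AS Suspicious_Flag
FROM sample_tickets
ORDER BY Ticket_ID;
"""


def flag_sample_rows(rows=SAMPLE_ROWS):
    """Load rows into an in-memory SQLite table and return {Ticket_ID: flag}."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE sample_tickets ("
            "Ticket_ID INTEGER, Ticket_Status TEXT, Resolution_Time REAL)"
        )
        conn.executemany("INSERT INTO sample_tickets VALUES (?, ?, ?)", rows)
        return dict(conn.execute(FLAG_QUERY).fetchall())
    finally:
        conn.close()


if __name__ == "__main__":
    for ticket_id, flag in flag_sample_rows().items():
        print(ticket_id, flag)
```

Because the first matching WHEN branch wins, the order of the conditions matters only if a row could satisfy both; here the two conditions test different Ticket_Status values, so they cannot overlap.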