Why Should We Use SCD Type 2 Architecture in Data Warehouse Projects?
What are Slowly Changing Dimensions (SCD)?
In corporate databases and operational systems (ERP, CRM), data undergoes changes over time. The department an employee works in may change, the sales price of a product may be updated, or the marital status of a customer may differ. In operational systems, such changes are usually made by overwriting the existing record (Update). This method is called SCD Type 1 (Slowly Changing Dimension Type 1) in Data Warehouse terminology.
If you manage an e-commerce site, knowing the customer's current address is sufficient for shipping. However, in the world of Business Intelligence (BI) and Reporting, overwriting old data is a complete disaster. For example, if a sales representative makes record sales in the 'Izmir' branch in 2024 and is assigned to the 'Istanbul' branch in 2025, and you overwrite the data, in the retrospective 2024 sales reports, that personnel's Izmir sales will suddenly appear as if they were the revenue of the Istanbul branch!
The Savior Power of SCD Type 2
To prevent these historical losses, the SCD Type 2 architecture was developed. When this architecture is implemented, changed data is never overwritten.
The system works as follows:
- It goes to the row containing the old information, the end date of validity (EndDate) of this record is updated to the current date, and the active status (IsActive) is marked as 'False' (0).
- The newly updated information is added to the table as a brand new row (Insert). The start date (StartDate) of this new row becomes the current date, the end date is generally assigned an infinite date such as '9999-12-31', and the active status is set to 'True' (1).
Thus, in your Power BI report, when you look at the branch profitability 2 years ago, you can exactly and accurately calculate the valid branch assignment of that day.
Automated SCD2 Integration with Microto ETL
Building this architecture from scratch, writing start/end date (Effective Date) algorithms for hundreds of tables, and creating Surrogate Key logic requires massive T-SQL and SSIS coding effort. Most software teams give up on doing Type 2 to avoid this burden.
However, DVision Technology's own product, Microto ETL Studio, allows you to select SCD2 implementation on your Target table with a single Checkbox. Microto itself writes all the complex 'Date Range Comparison' codes and MERGE queries that need to run in the background in seconds! As DVision, we always secure historical structures and audit trails in DWH infrastructures.
