Slowly Changing Dimensions (SCDs) play a pivotal role in maintaining historical data integrity within a data warehouse. They are crucial for preserving the context of data changes over time, which is essential for accurate reporting and analysis. SCDs are categorized into three main types: Type 1, Type 2, and Type 3. Each type offers a distinct approach to handling changes in dimension attributes. Let’s delve into each type with examples and outputs:
Type 1: Overwrite
In Type 1 SCD, the new data overwrites the existing data without preserving any historical information. This method is suitable when historical changes are irrelevant or when storage space is limited. However, it lacks the ability to track historical changes.
Example: Consider a customer dimension table:
CustomerID | Name | Age | City |
---|---|---|---|
101 | John Doe | 35 | New York |
102 | Jane Smith | 28 | Los Angeles |
Now, suppose John Doe moves to Boston. With Type 1 SCD, the table would be updated as follows:
CustomerID | Name | Age | City |
---|---|---|---|
101 | John Doe | 35 | Boston |
102 | Jane Smith | 28 | Los Angeles |
Type 2: Add New Row
In Type 2 SCD, a new row is added to the dimension table to accommodate the changed data, while retaining the original row to preserve historical information. Each row is assigned a surrogate key for identification.
Example: Expanding upon the previous example, if we use Type 2 SCD:
SurrogateKey | CustomerID | Name | Age | City | StartDate | EndDate |
---|---|---|---|---|---|---|
1 | 101 | John Doe | 35 | New York | 01/01/2022 | 02/15/2024 |
2 | 101 | John Doe | 35 | Boston | 02/16/2024 | NULL |
3 | 102 | Jane Smith | 28 | Los Angeles | 01/01/2022 | NULL |
Type 3: Add New Column
Type 3 SCD involves adding new columns to the dimension table to store both the current and previous values of specific attributes, providing a compromise between Types 1 and 2.
Example: Extending the previous scenario with a Type 3 SCD approach:
CustomerID | Name | Age | City_Current | City_Previous |
---|---|---|---|---|
101 | John Doe | 35 | Boston | New York |
102 | Jane Smith | 28 | Los Angeles | Los Angeles |
Each SCD type has its advantages and limitations, and choosing the appropriate type depends on the specific requirements of the data warehousing project. By understanding the nuances of Type 1, Type 2, and Type 3 SCDs, data professionals can effectively design and manage data warehouses to support accurate analysis and decision-making.
Read more on