Exploring Snowflake Schema and Galaxy Schema in Data Modeling

Learn Datawarehouse @ Freshers.in

While Star Schema is a widely adopted schema design in data warehousing, more complex data modeling scenarios may necessitate alternative approaches such as Snowflake Schema and Galaxy Schema. In this article, we explore these advanced data modeling techniques, providing detailed explanations, examples, and outputs to broaden your understanding.

1. Snowflake Schema

Snowflake Schema extends the Star Schema concept by normalizing dimension tables, thereby reducing redundancy and improving data integrity. In a Snowflake Schema, dimension tables are organized into multiple levels of hierarchy, with each level linked through primary and foreign key relationships.

Example:

Consider the retail sales data warehouse scenario. In a Snowflake Schema, the dimension tables may be normalized to represent hierarchical relationships. For instance, the “Product” dimension could have separate tables for “Product Category,” “Subcategory,” and “Product Details.”

CREATE TABLE DimProduct (
    ProductID INT PRIMARY KEY,
    CategoryID INT,
    SubcategoryID INT,
    ProductName VARCHAR(100),
    ...
);

CREATE TABLE DimCategory (
    CategoryID INT PRIMARY KEY,
    CategoryName VARCHAR(50)
    ...
);

CREATE TABLE DimSubcategory (
    SubcategoryID INT PRIMARY KEY,
    SubcategoryName VARCHAR(50),
    CategoryID INT,
    ...
);

2. Galaxy Schema

Galaxy Schema, also known as Constellation Schema, extends the Snowflake Schema further by introducing multiple fact tables, each associated with its set of dimension tables. This allows for more complex analytical queries and accommodates diverse business scenarios requiring multiple independent sets of measures.

Example:

Expanding on the retail sales data warehouse, suppose we want to analyze both sales data and inventory data independently. In a Galaxy Schema, we would have separate fact tables for sales and inventory, each connected to their respective dimension tables.

CREATE TABLE FactSales (
    SalesID INT PRIMARY KEY,
    ProductID INT,
    StoreID INT,
    TimeID INT,
    CustomerID INT,
    SalesAmount DECIMAL(10, 2),
    QuantitySold INT
);

CREATE TABLE FactInventory (
    InventoryID INT PRIMARY KEY,
    ProductID INT,
    StoreID INT,
    TimeID INT,
    QuantityOnHand INT,
    ...
);
Snowflake Schema and Galaxy Schema represent advanced data modeling techniques that offer flexibility and scalability in accommodating complex data relationships and analytical requirements.
Author: user