Data Warehousing – Schemas



Star Schema:-

The star schema is the simplest data warehouse schema. It is called star schema because the structure of star schema resembles a  star, with points radiating from the center. The center of the star consists of one or more fact tables and the point of the stars are the dimension or look up  tables.

Usually the fact table which contains the primary information in the data warehouse,  is surrounding by the much smaller dimension lookup tables which contains the information about the entries for a particular attribute in the fact table. The primary key in each dimension table is related to a foreign key in the fact table.

Star schema

For an example, suppose our data warehouse keeps store sales data, and the different dimensions are store, product, time and region. In this case the above diagram represent the star schema. The line between two table represents the primary key/ foreign key relationship between two tables. You can see that dimension tables are not related to each other.

Fact Tables- A typical fact table contains two types of attributes: foreign keys to dimension tables and measures. For example sales fact table may contain sales quantity, sales cost etc.

Dimension tables – Dimension tables contain descriptive attribute that are mainly textual data. Dimension table rows are uniquely identified by a single key field. In our above example region dimension attributes can be region_id, region_name. Here each region will be uniquely identified by the region_id.

Snowflake Schema:-

Snowflake schema is an extension of star schema means it is more complex than star schema. It is called as a snowflake schema the diagram resembles a snowflake. In star schema each dimension is represented by a single dimension table whereas in snowflake schema each dimension is grouped into multiple lookup table to eliminate the redundancy.

Snowflake schema

For example, The product dimension can be normalized into a product table, a product category table in snowflake schema. Time dimension can be normalized into a quarter and a month table.

The main advantage of the snowflake schema is the improvement in query performance due to minimized disk storage requirements and joining smaller lookup tables. The main disadvantage of the snowflake schema is the additional maintenance efforts needed due to the increase number of lookup tables.

Fact Constellation Schema:-

A fact constellation schema has more than one fact table. For each star schema, it is possible to create fact constellation schema by splitting the original star schema into more star schemes, each of them can have few common lookup tables.

fact Constellaton schema

The above diagram shows two fact tables namely sales and delivery.

The sales fact table same as in the star schema. The delivery fact has three dimension, namely product, supplier and time.

It is also possible to share dimension tables between fact tables. for example product dimension is shared between sales and delivery fact table.