Tuesday, February 4, 2014

ER Modelling for Star Schema

ER Modelling for Star Schema

I would first compare two approaches of ER modelling for Star along with deficiencies in each of them. Then I would present with another way of modelling star schema.

First Approach: The first approach is taken from paper "From ER Models to Dimensional Models." by Dr. Moody and Dr. Kortink.
They suggested to create 3 new entities:
1.) Transaction entity: it is used to represent Fact Tables
2.) Component Entity: it is used to represent Dimension Table
3.) Classification Entity: it is used to represent tables related to Dimensional table.
They created ER Diagram using these 3 entities and used notation similar to crow foot.
Shortcoming of this approach: A completely new model for ER diagram.

Second Approach: This approach is taken from paper"The Relation of Star Schema into Entity-Relation Diagrams" by Dr. Krippendorf and Dr. Song.
It uses the existing ER framework to model star schema.
Dimensional tables are converted into entities and Fact tables are either represented as relations or as weak entity.
Shortcoming of this approach: 1.) Dr. Teorey in paper "A Logical Design Methodology for Relational Databases - Using the Extended Entity" proves that , only entities can have attributes. So, we Fact Tables cannot be represented as relations because Fact Tables have attributes.
2.) It does not give us the detail of dimension. For ex: a dimension market with attributes Market Description, district and region would be depicted as single entity. This loses the detail that district and region were essentially different entities.

My Approach: Now, I would present my approach which aims at solving problems we saw in above two approaches:
Broadly speaking ER model of star schema can be represented as n-ary relationship with many dimension connected to single fact using this relationship.

ER representation of Fact Table:
1.) Fact Table should always be converted to a weak entity.
2.) If fact table has own primary key, it should be shown as partial identifier in ER Diagram.
3.) Foreign Keys from dimensional tables are not shown in the Fact entity.
For ex:

In figure shown on left, SKU_ID and Store_id are foreign key from dimensional table which will be shown as primary key of dimension entity.
Date, receipt_nbr and Receipt_line_Item are primary key for fact table.





                                                                                                                           



ER Representation:

 As, seen in the ER diagram Receipt Item, Receipt Number and date are shown as partial identifier while keys from dimensional table are not shown.






Before modelling dimensional table, I will discuss super class, subclass and classification tables.

Super Class and Sub Class: In ER diagram, sub class inherits all the attributes from super class. For ex:

In figure on left manager and associates will inherit employee's attributes like: name and address.












Classification Tables: Tables that form dimension tables are called classification tables. For ex: in figure shown below, district and region would be classification tables and market would be dimension table:                                                                                                                                                                                                                                        

ER Modelling for Dimensional Tables:
I would propose new construct called Dm (Dimension) to model dimensional tables. While creating ER diagram for dimension tables, classification tables are linked using the construct Dm to dimension tables. This can be compared to how super class was linked to subclass using P.

Classification tables transfers all the non-key attributes to dimension table in the same way as super class passed it's attributes to subclass.

Below is ER Representation of Market Dimension:


In this example Region name and district name
will get transferred to market table.


The arrow points to the dimension table.









As, seen the only new construct introduced in my approach called Dm. Apart from that my approach to model ER diagram uses traditional ER modelling technique.

In this article I just showed basic ER modelling. I might be covering this in more details in coming blogs with some examples.

References:
"The Relation of Star Schema into Entity-Relation Diagrams" by Dr. Krippendorf and Dr. Song.
"From ER Models to Dimensional Models." by Dr. Moody and Dr. Kortink
"A Logical Design Methodology for Relational Databases - Using the Extended Entity" by Dr. Teorey, Dr. Yang and Dr. Fry
The Data warehouse toolkit by Dr. Kimball and Dr. Ross
ER Diagram Slides by Dr. Currim


1 comment: