The star/snowflake schema is a way to implement multi-dimensional database (MDDB) functionality using a mainstream relational database.
Star schema resembles a star, one or more fact tables are surrounded by single level dimension tables (one completely de-normalized table per relationship). Dimension tables aren't normalized - that means even if you have repeating fields such as name or category no extra table is added to remove the redundancy.
Snowflake schema resembles a snowflake because comparing with Star Schema the dimension tables are further normalized or have parent tables.
Either way the fact tables are usually designed in third normal form because all data depends on either one dimension or all of them, not on combinations of a few dimensions.
Star Schema vs. Snowflake Schema
The reason for using a star schema is its simplicity from the users' point of view: queries are never complex because the only joins and conditions involve a fact table and a single level of dimension tables, without the indirect dependencies to other tables that are possible in a better normalized snowflake schema.
Snowflake schema generates more joins than a star schema during, which translates into longer queries. Therefore it is normally recommended to choose the star schema design over the snowflake schema for optimal performance.
Snowflake schema does have an advantage of providing more flexibility.
Disadvantage of Star Schema
A well designed database has to be in 3NF whereas the star schema's dimensions are de-normalized. Normally a huge number of redundant data existed in dimension tables.
For improving query performance many pre-processes (sum, grouping, resort etc) have been done on the dimension’s purpose and saved in the dimension table. When business logic changed to modify the related pre-processes are always hard.
When create new dimension table and build relationship with fact table, it can cause geometrically fact table size increasing.
How to Index Star/Snowflake Schema Data
Declare a primary key in each dimension table.
For star dimensions, declare foreign key (FK) relationships between each dimension table and the corresponding fact table. Additionally, for snowflake dimensions, declare FK relationships between each secondary dimension table and the primary dimension table that it augments.
Make sure that there is a primary key associated with the fact table(s), even if you have to use an IDENTITY field.
Declare indexes on each of the primary keys in the dimension tables and in the fact table(s).
Declare indexes on each of the foreign keys in the fact table(s).
Monday, 12 April 2010
Subscribe to:
Comments (Atom)