DW Concepts

Data Warehouse Includes

1.     DATA MODELING (Conceptual, Logical and Physical)
2.     ETL - EXTRACT TRANSFORM LOAD ( Normalization and Slowly Changing Dimensions )
3.     OLAP ( ROLAP,MOLAP ,HOLAP, Hierarchies, Drill Down Drill Up and Drill through)
4.     REPORTING
  

DATA  MODELING


  • This is a very important step in the data warehousing project. Indeed, it is fair to say that the foundation of the data warehousing system is the data model. A good data model will allow the data warehousing system to grow easily, as well as allowing for good performance.
  • In data warehousing project, the logical data model is built based on user requirements, and then it is translated into the physical data model.
  •   Data Modeling Includes
  Conceptual Data Modeling,
  Logical Data Modeling and
  Physical Data Modeling.


Conceptual Data Model

      A conceptual data model identifies the highest-level relationships between the different entities. Features of conceptual data model include:
1.       Includes the important entities and the relationships among them.
2.      No attribute is specified.
3.      No primary key is specified. 


Example for Conceptual Data Model

Logical Data Model

      A logical data model describes the data in as much detail as possible, without regard to how they will be physical implemented in the database. Features of a logical data model include:
1.       All entities and relationships among them.
2.      All attributes for each entity are specified.
3.      The primary key for each entity is specified. Foreign keys (keys identifying the relationship between different entities) are specified.
4.      Normalization occurs at this level. 

      The steps for designing the logical data model are as follows:
1.       Specify primary keys for all entities.
2.       Find the relationships between different entities.
3.      Find all attributes for each entity.
4.      Resolve many-to-many relationships.
5.      Normalization.

Example for Logical Data Model


Physical Data Model

      Physical data model represents how the model will be built in the database. A physical database model shows all table structures, including column name, column data type, column constraints, primary key, foreign key, and relationships between tables. Features of a physical data model include:
1.       Specification all tables and columns.
2.      Foreign keys are used to identify relationships between tables.
3.      De-normalization may occur based on user requirements.
4.      Physical considerations may cause the physical data model to be quite different from the logical data model.
5.      Physical data model will be different for different RDBMS.
6.      For example, data type for a column may be different between MySQL and SQL Server.

         The steps for physical data model design are as follows:
1.       Convert entities into tables.
2.      Convert relationships into foreign keys.
3.      Convert attributes into columns.
4.      Modify the physical data model based on physical constraints / requirements.



NORMALIZATION

         Normalization is the process of efficiently organizing data in a database. There are two goals of the normalization process: eliminating redundant data (for example, storing the same data in more than one table) and ensuring data dependencies make sense (only storing related data in a table). Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored. 


 The Normal Forms:

         The database community has developed a series of guidelines for ensuring that databases are normalized. These are referred to as normal forms and are numbered from one (the lowest form of normalization, referred to as first normal form or 1NF) through five (fifth normal form or 5NF). In practical applications, you'll often see 1NF, 2NF, and 3NF along with the occasional 4NF. Fifth normal form is very rarely seen and won't be discussed in this article.
         Before we begin our discussion of the normal forms, it's important to point out that they are guidelines and guidelines only. Occasionally, it becomes necessary to stray from them to meet practical business requirements. However, when variations take place, it's extremely important to evaluate any possible ramifications they could have on your system and account for possible inconsistencies. That said, let's explore the normal forms. 

No comments: