Tuesday, 21 February 2012

Normalization



The logical design of database, including the tables and the relationship between them, is the core of an optimized relational database. A good logical database design can lay the foundation for optimal database and application performance. A poor logical database design can reduce the performance of the entire system.
                                Normalization improves performance by reading redundancy. Redundancy can lead to
·         Inconsistencies -                              Errors are more likely to occur when facts are repeated.
·         Updated Anomalies -                    Inserting, modifying & deleting data may cause inconsistencies
                                     Normalization or logical database design involves using formal methods to separate the data into multiple, related tables. A greater number of smaller tables (with fewer columns) is characteristic of a normalized database.
Benefit of Normalization:
        I.            Faster sorting and index creation.
      II.            A larger number of cluster indexes.
    III.            Narrow and more compact indexes.
    IV.            Fewer indexes per table, which improves the performance of inset, update and delete statement.
      V.            Fewer null values and less opportunity for inconsistencies which increase database compactness.
Normal Forms:
                     Normalization result in the formation of tables that’s satisfied certain specify constrains and represent certain normal form. The normal forms are used to ensure that various type of anomalies and inconsistencies are not introduced in the database. Normal’s forms table structure with minimum redundancy.
First Normal Form (1NF):
A relation R (you may recall that a table also called a relation) is said to be in 1NF, if the domains of all attributes of R are atomic.
Consider the following project:
Ecode
Dept
Proj  code
Hours
E101
Systems
P27
P51
P20
90
101
60
E505
Sales
P27
P22
109
98
E508
Admin
P51
P27
Null
72

The data in the table is not normalized because a cell in project code and hours contains non-atomic value.
                                                        By applying 1NF definition to the project table we arrive at the following table.
Project:  
Ecode
Dept
Proj code
Hours
E101
Systems
P27
90
E101
Systems
P51
101
E101
Systems
P20
60
E305
Sales
P27
109
E305
Sales
P22
98
E508
Admin
P51
NULL
E508
Admin
P27
72

Functional Dependency:
Give a relation (we may recall that a table is also called a relation) R, attribute A is functionally dependent on attribute B if each value of A in R is associated with preciously one value of B.
In word, attribute A is functionally dependent on B if only if for each value of B, there is exactly one value of A. If attribute A & Bare functionally dependent on each other then they are call determinant.
Consider the following table of Employee
Employee:
Code
Name
City
E1
Mac
Delhi
E2
Sandra
CA
E3
Henry
France

Given a particular value of code, there is preciously one corresponding value for Name. For example- corresponding Code E1, there is exactly one value of Name Mac. Hence Name is functionally dependent on code, similarly there is exactly one value of city for each value of code id the determinant. You can also say that code determines city & Name.
2nd Normal Form:
A table is said to be 2NF when it is in 1NF & every attribute in the row is functionally dependent upon the whole key and not just part of the key.
Consider the project table:
Project
Ecode
Proj Code
Dept
Hours

The table has the following rows:
Ecode
Proj Code
Dept
Hours
E101
P27
Systems
90
E305
P27
Finance
10
E508
P51
Admin
NULL
E101
P51
Systems
101
E101
P20
Systems
60
E508
P27
Admin
72

For a given example, the employee code and department code are repeated served times. Hence if an employee is transferred to another dept this change will have to be recorded in every row of the employee table. Any omission will lead to inconsistencies.
If an employee completes works on a project, the employee’s record will be deleted. The information regarding the dept to which the employee belongs will also be lost.
The primary key here is composite (Ecode+Proj Code). The table satisfies the definition of 1NF. We need to one check if it satisfies 2NF.
In the table for each value of Ecode, there is more than one value of hours. For example Ecode, E101, there are values hours: 90,101 &60. Hence hours are not functionally dependent on Ecode. Similarly for each value of Proj Code, there is more than one value of hours.
For example, Projcode , P27 , there are three values of hours:90,10 & 72. However for a combination of Ecode,projcode values. Thre is exactly one values of hours. Hence hours of functionally dependent on the whole key Ecode+Proj Code.
Noe we must check if Dept is functionally dependent on whole key Ecode +Proj Code. For each value of Ecode , there is exactly one value of Dept. For example, forEcode, 101, there is exactly one value the system department. Hence Dept is functionally dependent on Ecode. However for each value of Proj Code there is more than one value of dept. For example for Proj ode P27, there are three values of Dept, System and Finance and Admin. Hence Dept is not functionally dependent on Proj Code. Dept is therefore functionally dependent on part of the key (which is Ecode) & not functionally dependent on the whole key (Ecode+Proj Code). Therefore the table is not in 2NF. For the table to be in 2NF, the non key attributes must be fully functionally dependent on the whole key and not part of the key.
Converting a table to 2NF:
ü  Find and remove attributes that are functionally dependent on only a part of the key and not on the whole key. Place them in a different table.
ü  Group the remaining attributes:                               To convert the table project into 2NF. We must remove the attributes that are not fully functionally dependent on the whole key and place them in a different table along with attribute that it is functionally depend on. In the above example, since Dept is not fully functionally dependent on the whole key Ecode+Proj Code, we place Dept along with Ecode in a separate table called Employee Dept.
Now the table project will contain Ecode, Proj Code and Hours.
Ecode
Dept
E101
Systems
E305
Sales
E508
Admin

Ecode
Proj Code
Hours
E101
P27
90
E101
P51
101
E101
P20
60
E305
P27
10
E508
P51
Null
E508
P27
72

3rd Normal Form:
                A relation is said to be in 3NF when it is in 2NF and every non key attribute is functionally dependent only on the primary key.
Consider the table employee;
Ecode
Dept
Dept Head
E101
Systems
E901
E305
Finance
E909
E402
Sales
E906
E508
Admin
E908
E607
Finance
E909
E608
Finance
E909

                We must check if the table 3NF since each cell in the table has a single value. The table in 1NF.
The primary key in Employee table is Ecode. For each value of Ecode, there is exactly one value of dept. Hence the attribute Dept is functionally dependent on the primary key, Ecode. Similarly, for each value of Ecode, there is exactly one value of Dept Head. Hence Dept Head is functionally dependent on the primary key Ecode. Hence, all the attributes are functionally dependent on the whole key, Ecode. Hence the table is in 2NF.
                                However the attribute Dept Head is dependent on the attribute Dept also. As per 3NF, all non-key attributes have to be functionally dependent only on the primary key. This table is not in 3NF since Dept Head is functionally dependent on Dept which is not on primary key.

Converting Table to 3NF:
ü  Find and remove non-key attribute that are functionally dependent on attributes that are not the primary key. Place them in different table.
ü  Group the remaining attributes:   To convert the table into 3NF, we must remove the column Dept Head. Since it is not functionally dependent on only the primary key, Ecode and place it in another table called Department along with the attribute Dept on which it is functionally dependent.
Employee:
Ecode
Dept
E101
System
E305
Finance
E402
Sales
E508
Admin
E607
Finance
E608
Finance
Dept
Dept Head
System
E901
Sales
E906
Admin
E908
Finance
E909

 

No comments:

Post a Comment