“Normalization is a design technique that is widely used as a guide in designing relational databases. Normalization is essentially a two step process that puts data into tabular form by removing repeating groups and then removes duplicated data from the relational tables.”
Goal of Normalization The goal of normalization is to create a set of relational tables that are free of redundant data and that can be consistently and correctly modified. This means that all tables in a relational database should be in the third normal form (3NF). 1)A relational table is in 3NF if and only if all non-key columns are (a)mutually independent and (b) fully dependent upon the primary key. C.J Date [Date90] . -A company obtains parts from a number of suppliers. Each supplier is located in one city. A city can have more than one supplier located there and each city has a status code associated with it. Each supplier may provide many parts. The company creates a simple relational table to store this information that can be expressed in relational notation as: FIRST (s, status, city, p, qty) where s supplier identifcation number (this is the primary key) status status code assigned to city city name of city where supplier is located p part number of part supplied qty> quantity of parts supplied to date In order to uniquely associate quantity supplied (qty) with part (p) and supplier (s), a composite primary key composed of s and p is used.
1) First Normal Form ( 1 NF)-: “A relational table, by definition, is in first normal form. All values of the columns are atomic. That is, they contain no repeating values. Update anomalies are problems that arise when information is inserted, deleted, or updated. Ã¢Â?Â¢INSERT. The fact that a certain supplier (s5) is located in a particular city (Mumbai) cannot be added until they supplied a part. Ã¢Â?Â¢DELETE. If a row is deleted, then not only is the information about quantity and part lost but also information about the supplier. Ã¢Â?Â¢UPDATE. If supplier s1 moved from London to New York, then six rows would have to be updated with this new information.
2) Second Normal Form (2 NF) -“The definition of second normal form states that only tables with composite primary keys can be in 1NF but not in 2NF. A relational table is in second normal form 2NF if it is in 1NF and every non-key column is fully dependent upon the primary key. That is, every non-key column must be dependent upon the entire primary key . “
FIRST is in 1NF but not in 2NF because status and city are functionally dependent upon only on the column s of the composite key (s, p). This can be illustrated by listing the functional dependencies in the table: s -> city, status city -> status (s,p) ->qty
The process for transforming a 1NF table to 2NF is:
1.) Identify any determinants other than the composite key, and the columns they determine. 2. Create and name a new table for each determinant and the unique columns it determines 3. Move the determined columns from the original table to the new table. The determinate becomes the primary key of the new table. 4. Delete the columns you just moved from the original table except for the determinate which will serve as a foreign key. 5. The original table may be renamed to maintain semantic meaning. To transform FIRST into 2NF we move the columns s, status, and city to a new table called SECOND. The column s becomes the primary key of this new table. 3) Third Normal Form
The third normal form requires that all columns in a relational table are dependent only upon the primary key.
Definition : “A relational table is in third normal form (3NF) if it is already in 2NF and every non-key column is non transitively dependent upon its primary key.”
In other words, all nonkey attributes are functionally dependent only upon the primary key. Table PARTS is already in 3NF. The non-key column, qty, is fully dependent upon the primary key (s, p). SUPPLIER is in 2NF but not in 3NF because it contains a transitive dependency. A transitive dependency is occurs when a non-key column that is a determinant of the primary key is the determinate of other columns.
The process of transforming a table into 3NF is:
1. Identify any determinants, other the primary key, and the columns they determine.
2. Create and name a new table for each determinant and the unique columns it determines.
3. Move the determined columns from the original table to the new table. The determinate becomes the primary key of the new table.
4. Delete the columns you just moved from the original table except for the determinate which will serve as a foreign key. 5. The original table may be renamed to maintain semantic meaning. To transform SUPPLIER into 3NF, we create a new table called CITY_STATUS and move the columns city and status into it. Status is deleted from the original table, city is left behind to serve as a foreign key to CITY_STATUS, and the original table is renamed to SUPPLIER_CITY to reflect its semantic meaning.