The Database Design Resource Center



The Five Normal Forms - formal definitions

Introduction

If your database design doesn’t conform to (at least) the 3rd Normal Form (3NF), chances are high that you will find it hard to achieve the performance need for a successful application.

Furthermore, you will find that writing good SQL-statements (SELECT, UPDATE, INSERT or DELETE) will be difficult, and sometimes actually impossible without using a lot of procedural coding (PL/SQL in Oracle, VB/C# in Microsoft products).

BTW: For a more practical, down-to-Earth English-oriented explanation of the 5 Normal forms, visit my Database Normalization eBook page.

Definition of terms

Now, first a few definitions about database normalization: I cite a few as I have found them in the Hyperdictionary on the Internet, though you may find a number of differing definitions around.

We are dealing with the "Relational Model" as the basis for our relational databases. The relational model as I see it is

Definition: A data model introduced by E.F. Codd in 1959/1970, particularly well suited for business data management. In this model, data are organized in tables. The set of names of the columns is called the "schema" of the table.". The work of E. F. Codd, and also Chris Date, is based on relational algebra, which is well outside the scope of this article.

database model: The product (outcome) of the database design process which aims to identify and organize the required data conceptually and logically. A database model tells you what information is to be contained in a particular database, how the information will be used, and how the items in the database will be related to each other. A well thought-out database model reduces the need for changes. Future systems may re-use parts of existing models, which should lower development costs.

Database Normalization: A series of steps followed to obtain a database design that allows for efficient access and storage of data in a relational database. These steps reduce (should eliminate) data redundancy and the chances of data becoming inconsistent.

A table in a relational database is said to be normalized if it satisfies 1st normal form. The higher normal forms deal with different levels of redundancy elimination. Codd's original work defined three such forms, but there are now five (and a 6th coming up) generally accepted steps of normalization.

The output of the first step is called First Normal Form (1NF), the output of the second step is Second Normal Form (2NF), etc.

Another profitability from the normalization process is that a normalized database is much easier to write code against. This first lecture will step through the 5 different normal forms which are generally accepted as useful to comply with in database design.

The five Normal Forms; Formal Definitions

These formal definitions are taken from the book by Chris J. Date: An Introduction to Database Systems Volume 1 4th edition, © 1996, Addison-Wesley Publishing Co., Inc., Reading, Massachusetts.

First Normal Form

‘A relation R is in first normal form (1NF) if and only if all underlying domains contain atomic values only.’

Second Normal Form

‘A relation R is in second normal form (2NF) if and only if it is in 1NF and every nonkey attribute is fully dependent on the primary key.’

Third Normal Form

‘A relation R is in third normal form (3NF) if and only if it is in 2NF and every nonkey attribute is nontransitively dependent on the primary key.’

Boyce/Codd Normal Form

‘A relation R is in Boyce/Codd normal form (BCNF) if and only if every determinant is a candidate key.’

Fourth Normal Form

‘A relation R is in fourth normal form (4NF) if and only if, wherever there exists an MVD in R, say A -> -> B, then all attributes of R are also functionally dependent on A. In other words, the only dependencies (FDs or MVDs) in R are of the form K -> X (i.e. a functional dependency from a candidate key K to some other attribute X). Equivalently: R is in 4NF if it is in BCNF and all MVD’s in R are in fact FDs.’

Fifth Normal Form

‘A relation R is in fifth normal form (5NF) - also called projection-join normal form (PJ/NF) if and only if every join dependency in R is a consequence of the candidate keys of R.’

For every normal form it is assumed that every occurrence of R can be uniquely identified by a primary key using one or more attributes in R.

FD = Functional Dependency
MVD = Multi-Valued Dependency

Return to Normalization


Exclusive interviews with:
Steven Feuerstein, PLSQL expert
Donald Burleson, Top IT consultant


Free eBook

Subscribe to my newsletter and get my ebook on Entity Relationship Modeling Principles as a free gift:


What visitors say...

"I just stumbled accross your site looking for some normalization theory and I have to say it is fantastic.

I have been in the database field for 10+ years and I have never before come across such a useful site. Thank you for taking the time to put this site together."

Mike, USA

Read more Testimonials



Database Normalization eBook:


Database Normalization eBook




Copyright © www.databasedesign-resource.com /
All rights reserved.
All information contained on this website is for informational purposes only.
Disclaimer: www.databasedesign-resource.com does not warrant any company, product, service or any content contained herein.

Return to top

Copyright acknowledgement note:

The name Oracle is a trademark of Oracle Corporation.
The names MS Access/MS SQL Server are trademarks of Microsoft Corporation.
Any other names used on this website may be trademarks of their respective owners, which I fully respect.