Functional Dependency

What is Functional Dependency?
A functional dependency is a constraint between two sets of attributes in a relation from a database.
An attribute or set of attributes X is said to functionally determine another attribute Y (written X à Y) if and only if  each X value is associated with at most one Y value. Customarily we call X determinant set and Y a dependent set.
So if we are given the value of X we can determine the value of Y.

Why is Functional Dependency Important?
The determination of functional dependencies is an important part of designing databases in the relational model, and in database normalization and denormalization

The functional dependencies, along with the attribute domains, are selected so as to generate constraints that would exclude as much data inappropriate to the user domain from the system as possible.
Functional Dependency Inference Rules
Reflexivity:
    If B is a subset of A then A functionally determines B
        For example:
{name, location} à {name}
Augmentation:
    If B is a subset of A and C functionally determines D then A and C functionally determine B and D
        For example:
{name, location} and {birthdate, time} à {name} and {age}
            (as {name} is a subset of {name, location} and {birthdate, time}
            functionally determines {age})
Transitivity:
    If A functionally determines B and B functionally determines C then A functionally determines C
        For example:
            {name, location} à {initials}
            (as {name, location} functionally determines
           {name} and {name} functionally determines {initials})
Pseudo transitivity:
    If A functionally determines B and B and C functionally determine D then A and C functionally determine D 
Union:
    If A functionally determines B and A functionally determines C then A functionally determines B and C
        For example:
{name, location, birthdate, time} à{initials, age}
 (as {name, location, birthdate, time} à{initials} and
{name, location, birthdate, time} à{age})
Decomposition:
    If A functionally determines B and C then A functionally determines B and A functionally determines C
        For example:
{name, location, birthdate, time} à{initials, age} implies that
{name, location, birthdate, time} à {initials} and
{name, location, birthdate, time} à{age}

Trivial Functional Dependencies
Some functional dependencies are said to be trivial because they are satisfied by all relation.
     For example:
à A
à Y    if Y is a subset of X

Keys and Functional Dependencies
Keys and, more generally, functional dependencies, are constraints on the database that require relations to satisfy certain properties. Relations that satisfy all such constraints are legal relations.

What is Superkey?(review)
A superkey is defined in the relational model as a set of attributes of a relation for which it holds that
 in all instances of the relation there are no two distinct tuples that have the same values for the
attributes in this set.
Equivalently a superkey can also be defined as those sets of attributes of a relation upon which all

attributes of the relation are functionally dependent.

Functional dependencies allow us to express constraints that we cannot express with superkeys.
Let's consider the schema of the example in the Figure 7.2.
Figure 7.2    Partial list of tuples in relations loan, borrower, and bor_loan




Figure 7.2, we consider the schema
bor_loan = (customer_id, loan_number, amount)
in which the functional dependency loan_number à amount holds because for each loan (identified by loan_number) there is a unique amount. We denote the fact that the pair of attributes(customer_id, loan_amount) forms a superkey for bor_loan by writing:
customer_id, loan_number à customer_id, loan_number, amount
or, equivalently, customer_id, loan_number à bor_loan

We shall use functional dependencies in two ways:     
1. To test relations to see whether they are legal under a given set of functional dependencies. If a relation r  is a legal under a set F of functional dependencies, we say that r satisfies F.    
2. To specify constraints on the set of legal relations. We shall thus concern ourselves with only those
    relations that satisfy a given set of functional dependencies. If we wish to constrain ourselves to relations on schema R that satisfy a set F of functional dependencies, we say that F holds on R.  

Database Normalization
Database normalization relates to the level of redundancy in a relational database's structure. The key idea is to reduce the chance of having multiple different versions of the same data, like an address, by storing all potentially duplicated data in different tables and linking to them instead of using a copy. Then updating the address in one place will instantly change all the places where the
address is used.
Well-normalized databases have a schema that reflects the true dependencies between tracked quantities. This means that updates can be quickly performed with little risk of data becoming inconsistent.

Normal Forms
In the relational model, formal methods exist for quantifying "how normalized" a database is.
These classifications are called normal formsand there are algorithms for converting a given
database between them.
Edgar F. Codd originally established three normal forms: 1NF, 2NF and 3NFThere are now others that are generally accepted, but 3NF is widely considered to be sufficient for many practical applications. Most tables when reaching 3NF are also in BCNF.

1NF(First Normal Form)
In the relational model, we formalize this idea that attributes do not have any substructure. A domain
is atomic if elements of the domain are considered to be indivisible units. We say that a relation schema R is in first normal form if the domains of all attributes of R are atomic. In other words, a relation schema R is in 1NF if there are no muntivalued attributes.

To understand first normal form (1NF), consider these two examples of things you might know:
    "What is your favorite color?"
    "What food will you not eat?"
A difference between these two examples is that, you can have only one favorite color; but, there is very little limitation on the number of foods you might not eat.
Data that has a single value such as "person's favorite color" is inherently in first normal form. Storing such data has not much changed since Codd wrote and needs no further explanation here. Data that has multiple values must be stored differently.

2NF(Second Normal Form)
Second normal form (2NF) prescribes full functional dependancy on the primary key. It most commonly applies to tables that have composite primary keyswhere two or more attributes comprise the primary key. It requires that there are no non-trivial functional dependencies of a non-key attribute on a part (subset) of a candidate key.
A table is said to be in the 2NF if and only if it is in the 1NF and every non-key attribute is irreducibly dependent on the primary key.
How about this example?
    Example:
Machine_parts(part_number, supplier_name, price, supplier_address)
In this case, price is fully dependent on the primary key(different suppliers may charge different price
on the same part).

However, supplier_address is partially dependent because it only depends on the supplier_name.
Therefore, this table is not in 2NF.

3NF(Third Normal Form)
Third normal form (3NF) requires that there are no non-trivial functional dependencies of non-key attributes on something other than a superset of a candidate key.
A table is in 3NF if it is in 2NF, and none of the non-primary key attributes is a fact about any other non-primary key attribute.
In summary, all non-key attributes are mutually independent(i.e. there should not be transitive dependencies).
How about this example?
    Example:
Machine_parts(part_number, supplier_name, supplier_address)

In this case, supplier_address is dependent on supplier_name. Therefore, there is a transitive dependency in the table. It means that it is not in 3NF.

BCNF (Boyce-Codd Normal Form)
Boyce-Codd normal form (or BCNF) requires that there are no non-trivial functional dependencies of attributes on something else than a superset of a candidate key. At this stage, all attributes are dependent on a key, a whole key and nothing but a key (excluding trivial dependencies, like AàA).
A table is said to be in the BCNF if and only if it is in the 3NF and every non-trivial, left-irreducible functional dependency has a candidate key as its determinant.
In more informal terms, a table is in BCNF if it is in 3NF and the only determinants are the candidate keys.

























Comments

Popular posts from this blog