#### Desirable Properties of a Schema

- Eliminate anomalies
- Preserve dependencies
- Ensure good query performance

#### Armstrongs Axioms

wlog: $X=\{x_1, x_2,\ldots,x_n\}$

Reflexivity:

$X\rightarrow $ some subset of $X$

Augmentation

if $X\rightarrow Y$, then $X,Z\rightarrow Y,Z$

Transitivity

if $X\rightarrow Y\wedge Y\rightarrow Z$, then $X\rightarrow Z$

Union

if $X\rightarrow Y\wedge X\rightarrow Z$, then $X\rightarrow Y,Z$

Decomposition

if $X\rightarrow Y,Z$, then $X\rightarrow Y \wedge X\rightarrow Z$

Pseudo-Transitivity

if $X\rightarrow Y\wedge YZ\rightarrow U$, then $XZ\rightarrow U$

#### Incomplete List of Normal Forms

## Normal Forms:

- First Normal Form – 1NF
- Second Normal Form – 2NF
- Third Normal Form – 3NF
- Boyce-Codd Normal Form (BCNF)
- Fourth Normal Form (4NF)

there are others…

#### First Normal Form

All attributes must be atomic. No lists of things in a cell.

#### Second Normal Form

No partial dependencies.

Given a relation: $R(abcd)$, $S=\{ab\rightarrow c, b\rightarrow d\}$

Compute the Keys:

$X$ | a | b | c | d | a,b | a,c | a,d | b,c | b,d | c,d | a,b,c | a,c,d | a,b,d | b,c,d | a,b,c,d |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

$X^+$ | a | b | c | d | abcd | ac | ad | bcd | bd | cd | abcd | acd | abcd | bcd | abcd |

– | – | – | – | Key | – | – | – | – | – | SK | – | SK | – | SK |

**Candidate Key: **a minimal superkey. If more than one, then you’ve got to pick one to be the key (later). **Prime attributes**: attributes that are in a candidate key. a and b are prime

**Non-prime**

**attributes**: attributes that are not in a candidate key: c, d are non-prime

**Partial dependency: **a non-prime attribute that depends on part of a candidate key

**2NF means no partial dependency.**

Is this in 2NF? No. $b\rightarrow d$ is a partial dependency.

So what do we do?

## Decomposition

There are lots of ways to decompose – we will cover only the *easy *one.

To decompose we need to find the dependency $X\rightarrow Y$ that violates the normal form condition. Then we split as follows:

Create $R_1=X\cup$(others) $R_2=X\cup Y$

Note: Try to make the split as large as possible (heuristic to minimize number of times you have to split)

Then we check $R_1$ and $R_2$ for further violations and re-decompose if necessary.

#### Back to the example:

Violation is $b\rightarrow d$

So we decompose into:

$R_1(abc)$ $S=\{ab\rightarrow c\}$

$R_2(bd)$ $S_2=\{b\rightarrow d\}$.

Note that its important that $X$ is in both $R_1$ and $R_2$ so we may rejoin later. But this can lose information sometimes.

We check again for for 2NF; they both pass so we’re done.

## Third Normal Form

In 2NF and a non-prime attribute cannot determine another non-prime attribute: *i.e.*, no transitive dependencies.

**Definition:**

Whenever there is a nontrivial FD $X\rightarrow Y$ in $R$, then $X$ is a superkey for $R$ or $Y$ is a part of a candidate key (*i.e.*, is prime).

In English: “Everything is determined by the key or is in the key.”

3NF is not a hard condition to achieve.

#### Example:

$R(abcd)$ $S=\{ab\rightarrow c,c\rightarrow d\}$

Is it in 3NF? Nope

What is the violating dependency? $c\rightarrow d$.

So we decompose:

$R_1(c,d)$ $S_1(c\rightarrow d)$

$R_2(c,a,b)$ $S_1(a,b\rightarrow c)$

Is $R_1$ and $R_2$ in 3NF? Yes

Are dependencies preserved? Yes

#### Example:

$R(abcde)$ $S=\{ab\rightarrow c, c\rightarrow b, a\rightarrow d\}$

Compute the keys:

a | b | c | d | e | ab | ac | ad | ae | bc | bd | be | cd | ce | abc | abd | abe | acd | ace | ade |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

ad | b | cb | d | e | abcd | abcd | ad | ade | bc | bd | be | cbd | cbe | abcd | abcd | abcde | abcd | abcde | ade |

– | – | – | – | – | – | – | – | – | – | – | – | – | – | – | – | key | – | key | – |

Is it in 3NF? No, not in 2NF either

What is the violating dependency? $a\rightarrow d$ Because $a$ is not a superkey

So we need to decompose $R$ based on $a\rightarrow d$

$R_1(abce)$ $S=\{ab\rightarrow c, c\rightarrow b\}$

$R_2(ad)$ $S=\{a\rightarrow d\}$

Are $R_1$ and $R_2$ this in 3NF?

For $R_1$ what are the keys? abe, ace (others too)

$ab$ is not a superkey, but $c$ is prime

$c$ is not a superkey, but $b$ is prime

Are $R_1$ and $R_2$ in 2NF? Yes

Are $R_1$ and $R_2$ in 3NF? Yes