HW4 Answer Key

1. Exercise 7.2: List all functional dependencies satisfied by the relation of Figure 7.18.

Answer:
Non-trivial functional dependencies:

A -> B
C -> B

2. Exrercise 7.6: Compute the closure of the following set F of functional dependencies for relation schema R = {A, B, C, D, E}.

A -> BC
CD -> E
B -> D
E -> A

List the candidate keys for R.

Answer:
A -> BC, B -> D so A -> D so A -> DC -> E
therefore A -> ABCDE

E -> A, A -> ABCDE, so E -> ABCDE
CD -> E, so CD -> ABCDE
B -> D, BC -> CD, so BC -> ABCDE

Attribute closure:
A -> ABCDE
B -> BD
C -> C
D -> D
E -> ABCDE
AB -> ABCDE
AC -> ABCDE
AD -> ABCDE
AE -> ABCDE
BC -> ABCDE
BD -> BD
BE -> ABCDE
CD -> ABCDE
CE -> ABCDE
DE -> ABCDE
ABC -> ABCDE
ABD -> ABCDE
ABE -> ABCDE
ACD -> ABCDE
ACE -> ABCDE
ADE -> ABCDE
BCD -> ABCDE
BDE -> ABCDE
CDE -> ABCDE
ABCD -> ABCDE
ABCE -> ABCDE
ABDE -> ABCDE
ACDE -> ABCDE
BCDE -> ABCDE

The candidate keys are A, E, CD, and BC

Any combination of attributes that includes those is a superkey.

3. Exercise 7.18: Why are certain functional dependencies called trivial functional dependencies?

Answer:
Because the right hand side is a subset of the left hand side. Therefore it is obvious that the right hand side is dependent on the left hand side.

More opaque book definition: An FD is trivial if it is satisfied by all instances of a relation.

4. Exercise 7.17: Explain what is meant by repetition of information and inability to represent information. Explain why each of these properties may indicate a bad relational database design.

Answer:
- Repetition of Information is a condition in a relational database where the values of one attribute are determined by the values of another attribute in the same relation, and both values are repeated throughout the relation. This is a bad relational database design because it increases the storage required for the relation and it makes updating the relation more difficult.

- Inability to represent information is a condition where a relationship exists amoung only a proper subset of the attributes in a relation. This is bad relational database design because all the unrelated attributes must be filled with null values otherwise a tuple without the unrelated information cannot be inserted into the relation.

5. Consider a relation R(A,B,C,D,E) with the following dependencies: 
{AB-> C, CD -> E, DE -> B} 
Is AB a candidate key of this relation? If not, is ABD? Explain your answer. 

No. The closure of AB does not give you all of the attributes of the relation.

If not, is ABD? Explain your answer. 

A -> A
B -> B
C -> C
D -> D
E -> E
AB -> ABC
AC -> AC
AD -> AD
AE -> AE
BC -> BC
BD -> BD
BE -> BE
CD -> BCDE
CE -> CE
DE -> BDE

ABD -> ABCDE

Yes, ABD is a candidate key. No subset of its attributes is a key.

6. Consider a relation with schema R(A,B,C,D) and FDs {AB -> C, C -> D, D -> A}. 
a. What are some of the nontrivial FDs that can be inferred from the given FDs? 

Some examples:

C -> ACD
D -> AD
AB -> ABCD
AC -> ACD
BC -> ABCD
BD -> ABCD
CD -> ACD
ABC -> ABCD
ABD -> ABCD
BCD -> ABCD

b. What are all candidate keys of R? 

By calculating an attribute closure we can see the candidate keys are:
AB, BC, and BD

Attribute closure:
A -> A
B -> B
C -> ACD
D -> AD
AB -> ABCD
AC -> ACD
AD -> AD
BC -> ABCD
BD -> ABCD
CD -> ACD
ABC -> ABCD
ABD -> ABCD
ACD -> ACD
BCD -> ABCD

c. Indicate all BCNF violations for R. 

C->D and D->A

d. Decompose the relations into collections of relations that are in BCNF. 

(ABCD)
|           \
C->D         |
|            |
(CD)      (ABC)
             |          \
             C->A        |
             |           |
             (CA)     (AC)

So you get: R1(CD), R2(AC), and R3(BC)

If we split on D->A

(ABCD)
|           \
D->A         |
|            |
(AD)      (BCD)
             |         \
             C->D        |
             |           |
             (CD)     (BC)

So you get: R1(AD), R2(CD), and R3(BC)

e. Indicate which dependencies if any are not preserved by the BCNF decomposition. 

If we start to decompose on C->D then D->A and AB->C
If we start to decompose on D->A then AB->C

7. Consider a relation R(A,B,C,D,E) with FDs {AB -> C, DE ->C, and B -> D} 
a. Indicate all BCNF violations for R. 

Logically, since C and D are the only attributes that can be determined via other attributes, we can deduce that the keys will contain the other attributes, thus we prefrom a smaller attribute closure:

ABE -> ABCDE
ABCD -> ABCDE
ABCE -> ABCDE

Candidate keys: ABE

Violations:

B->D, AB->C, DE->C

b. Decompose the relations into collections of relations that are in BCNF. 

(ABCDE) 

Break down (ABCDE) with AB->C
(ABC) and (ABDE)

Break down (ABDE) with B->D
(BD) And (ABE)

So we get R1(ABC), R2(BD) and R3(ABE) 

c. Indicate which dependencies if any are not preserved by the BCNF decomposition. 

DE->C

8. Prove or disprove the following inference rules for functional dependencies. 
Note: Read "|=" as implies 
a. {X->Y, Z->W} |= XZ ->YW 

XZ -> XZ
XZ -> XW	(Z -> W)
XZ -> W		(decomposition rule)

XZ -> XZ
XZ -> YZ	(X -> Y)
XZ -> Y 		(decomposition rule)

XZ -> YW	(union rule)

b. {X->Y, XY -> Z} |= X -> Z 

Y -> Z		(pseudotransitivity rule)
X -> Z		(transitivity)

c. {XY -> Z, Y->W} |= XW->Z 

W -> W
X -> X
Y -> YW
Z -> Z
WX -> WX
WY -> WY
WZ -> WZ
XY -> WXYZ
XZ -> XZ
YZ -> WYZ

Therefore WX -> Z is not true 
You can also find the attribute closure for WX and show that closure set does not contain Z.

Use Armstrong's Axioms or Attribute closure to prove or disprove. 

9. Consider a relation R(A,B,C,D) with FDs {A ->B, B ->C, C-> D} 
a. Indicate all BCNF violations for R. 

Logically, since B, C, and D are the only attributes that can be determined via other attributes, we can deduce that the keys will contain the other attributes, thus we prefrom a smaller attribute closure:

A -> ABCD
AB -> ABCD
AC -> ABCD
AD -> ABCD
ABC -> ABCD
ABD -> ABCD
ACD -> ABCD

Violations:

B->C, C->D

b. Decompose the relations into collections of relations that are in BCNF. 

Breakdown based on B->C
(BC), (ABD)
Breakdown based on B->D
(AB), (BD)

So we get R1(BC), R2(AB), R3(BD)

c. Indicate which dependencies if any are not preserved by the BCNF decomposition. 

C->D

10. For the same example relation R with the two tuples as in the notes above, decompose it as R1(A,B) and R2(A,C).  Try and merge them back using natural join and see if the resulting relation is the same as R. Do you think this decomposition is a lossless join decomposition? 

Suppose R contains two tuples (1, 2, 3) and (2, 2, 4) 

R1 contains (1, 2), (2, 2)

R2 contains (1, 3), (2, 4)

Natural Join on A and we have:

(1, 2, 3), (2, 2, 4)

As you can see, we have gotten the original relations back.

Yes, it is lossless because the dependency A->B is not broken.