SQL Series
Database Learning 1: Introduction to Databases: https://blog.yexca.net/archives/86
Database Learning 2: Relational Model: https://blog.yexca.net/archives/87
Database Learning 3: SQL Language: https://blog.yexca.net/archives/88
Database Learning 4: Relational Database Theory: This Article
Database Learning 5: Normalization: https://blog.yexca.net/archives/90
Database Learning 6: Database Design: https://blog.yexca.net/archives/91
Database Learning 7: Database Control Functions: https://blog.yexca.net/archives/92
Relational Schema
A relational schema is formally defined as a 5-tuple (including the relation name):
R<U, D, dom, F>
- R: Relation name (the symbolic semantic of the tuple).
- U: A set of attributes.
- D: The domains from which the attributes in U originate.
- dom: The mapping from attributes to domains.
- F: A set of data dependencies (functional dependencies) on the attribute set U.
Since points 3 and 4 have little impact on schema design, we usually simplify the relational schema to a triple: R<U, F>. A relation $r$ on $U$ is a relation of schema R<U, F> if and only if $r$ satisfies $F$.
Example: R is a grade table, U = (StudentID, Name, CourseID, Grade), F = {StudentID â Name, CourseID â CourseName, (StudentID, CourseID) â Grade}.
Functional Dependency (FD)
Functional dependency is the most fundamental type of data dependency.
Let R(U) be a relational schema on attribute set U, and X, Y be subsets of U. If for every possible relation $r$ of R(U), it is impossible for two tuples to have the same values for attributes in X but different values for attributes in Y, then X functionally determines Y (or Y is functionally dependent on X). This is denoted as X â Y.
Non-trivial Functional Dependency If X â Y but Y â X, it is a non-trivial FD. Usually, we only care about non-trivial FDs.
Trivial Functional Dependency If X â Y and Y â X, it is a trivial FD.
Full Functional Dependency In R(U), if X â Y and for any proper subset X’ of X, X’ does not determine Y, then Y is fully functionally dependent on X. Denoted as X -F-> Y.
Partial Functional Dependency If X â Y but Y is not fully functionally dependent on X, it is a partial FD. Denoted as X -P-> Y.
Transitive Dependency In R(U, F), if X â Y, Y â X, and Y â Z, then Z is transitively dependent on X.
Reference: Untitled Document (pop0726.github.io)
Keys
Let K be a combination of attributes in R(U, F). If K â U and for any proper subset K’ of K, K’ does not determine U, then K is a Candidate Key. If there are multiple candidate keys, pick one to be the Primary Key.
Primary Attribute vs. Non-primary Attribute Attributes contained in any candidate key are primary attributes. Others are non-primary attributes.
Foreign Key If an attribute (or group) X in R(U) is not a key for R but is a key for another relation, X is a foreign key.
Superkey Any set of attributes that can determine all attributes in the relation. A candidate key is a minimal superkey.
All-key A case where all attributes combined form the primary key.
Armstrong’s Axioms
For a relational schema R(U, F), the following inference rules apply:
- Reflexivity: If Y â X â U, then X â Y is implied by F.
- Augmentation: If X â Y is implied by F and Z â U, then XZ â YZ is implied by F.
- Transitivity: If X â Y and Y â Z are implied by F, then X â Z is implied by F.
Derived rules:
- Union: If X â Y and X â Z, then X â YZ.
- Pseudo-transitivity: If X â Y and WY â Z, then XW â Z.
- Decomposition: If X â Y and Z â Y, then X â Z.
Attribute Closure Calculation
Closure calculation is used to find candidate keys. Here is the logic:
- Attributes appearing only on the left side of FDs must be part of the candidate key.
- Attributes appearing only on the right side cannot be part of a candidate key.
- Attributes appearing on both sides or neither side need to be checked.
- Attributes appearing on neither side must be part of the candidate key.
- Calculate the closure of the “must-be” attributes. If the closure includes all attributes, they form the candidate key. Otherwise, add other potential attributes one by one and re-calculate the closure.
Example: R<U, F>, U(A, B, C, D, E, G) F = {AB â C, CD â E, E â A, A â G}. Find candidate keys.
- Only on left: B, D (Must be in key)
- Only on right: G (Cannot be in key)
- Both sides: A, C, E
- Calculate closure: (BD)âē: Nothing extra. We need to add attributes from {A, C, E}. (BDA)âē: â C, E, A, G. Results in ABCDEG. (Candidate Key) (BDC)âē: â E, A, G. Results in ABCDEG. (Candidate Key) (BDE)âē: â A, G, C. Results in ABCDEG. (Candidate Key)
Candidate keys: {(BDA), (BDC), (BDE)}
Finding the Minimal Functional Dependency Set
How to find the minimal set?
- Decompose the right side (e.g., A â BC becomes A â B and A â C).
- Remove redundant dependencies: For each FD, check if the right side can still be determined by the left side using the other remaining FDs.
- Minimalize the left side: For FDs with multiple attributes on the left (e.g., BCD â A), check if a subset of the left side (like CD) can determine the right side.
Example: R<U, F> U{A, B, C, D, E, F, G} F = {BCD â A, BC â E, A â F, F â G, C â D, A â G}. Find the minimal set.
Solution:
// (1) Decompose RHS (Already mostly done, but A â G is redundant if we look ahead)
(BCD)âē = BCDED…
(BC)âē = BCD
(A)âē = AG
(F)âē = F
(C)âē = C
(A)âē = AFG (Wait, check if A â G can be derived from A â F and F â G. Yes. So remove A â G).
// (2) Minimalize LHS
In BCD â A, we have C â D. So BCD â A can be simplified to BC â A.
Final Minimal Set:
BC â A
BC â E
A â F
F â G
C â D