SQL Series
Database Learning 1: Introduction: https://blog.yexca.net/archives/86
Database Learning 2: Relational Model: This article
Database Learning 3: SQL Language: https://blog.yexca.net/archives/88
Database Learning 4: Relational Database Theory: https://blog.yexca.net/archives/89
Database Learning 5: Normal Forms: https://blog.yexca.net/archives/90
Database Learning 6: Database Design: https://blog.yexca.net/archives/91
Database Learning 7: Database Control Functions: https://blog.yexca.net/archives/92
Basic Terms in the Relational Model
A relational database system is a database system that supports the relational model. Here are some basic terms:
- Relation: A relation is a two-dimensional table. Each relation has a name.
- Tuple: A row in the table is a tuple, corresponding to a record value in a storage file.
- Attribute: Columns in a table are called attributes. Each column has an attribute name. Attribute values correspond to data items or field values in a record.
- Domain: The range of values an attribute can take.
- Relation Schema: A description of the relation. It consists of the relation name and its set of attributes. The format is:
RelationName(Attribute 1, ..., Attribute n).
Usually, one relation schema corresponds to the structure of one relation file.
- Candidate Key: An attribute or set of attributes that uniquely identifies a tuple.
- Primary Key: If a relation has multiple candidate keys, one is chosen as the primary key.
- Prime Attribute: Attributes included in any candidate key are called prime attributes. Others are non-prime attributes.
- Foreign Key: If an attribute (or group of attributes) in a relation is not a key for that relation but is a key for another relation, it is a foreign key.
- All-key: When all attributes of a relation schema form the candidate key.
- Superkey: A set of attributes that contains a key.
Definition of the Relational Model
A relation schema can be formally represented as: R(U, D, dom, F)
Where R is the relation name, U is the set of attributes, D is the domain of the attributes, dom is the mapping from attributes to domains, and F is the set of data dependencies between attributes.
Commonly abbreviated as R(U) or R($A_1$, $A_2$, …), where R is the relation name and $A_1$, $A_2$ are attribute names.
A relational model consists of three parts: relational data structure, relational operations, and relational integrity constraints.
Relational Data Structure
The logical data structure in a relational model is a simple 2D table.
Relational Operations
The relational model defines a set of operations for data manipulation.
Common operations include: Select, Project, Join, Divide, Union, Intersection, Difference (queries), as well as Insert, Delete, and Update.
Relational Integrity Constraints
There are three main integrity rules:
- Entity Integrity: The primary key cannot be null or partially null.
- Referential Integrity: If a foreign key X in relation $R_2$ matches the primary key of relation $R_1$, then every value of X in $R_2$ must either exist in the primary key of $R_1$ or be null.
- User-defined Integrity: Constraints defined by the user for specific data requirements.
Relational Algebra Operators
| Operator Category | Symbol | Meaning |
|---|---|---|
| Set Operators | ∪ – ∩ × | Union Difference Intersection Cartesian Product |
| Specialized Relational Operators | σ π ⋈ ÷ | Select Project Join Divide |
| Comparison Operators | > ≥ < ≤ = ≠ | Greater than Greater or equal Less than Less or equal Equal Not equal |
| Logical Operators | ¬ ∧ ∨ | Not And Or |
Traditional Set Operations

Specialized Relational Operators
- Join
Joins include θ-join, Equijoin, and Natural join.
- θ-join: Join condition uses any comparison operator (not just equals).
- Equijoin: Join condition uses the equality operator.
- Natural Join: An equijoin that removes duplicate attributes. This is the most common join.
- Project
Projection is a vertical operation. It selects specific attribute columns A from relation R to form a new relation, denoted as $π_A(R)$.
- Select
Selection is a horizontal operation. It selects tuples from relation R that satisfy a given condition F, denoted as $σ_F(R)$.

- Outer Join
An extension of the join operation to handle missing information.
- Left Outer Join: Includes all tuples from the left relation. If no match is found in the right relation, attributes from the right are filled with NULL.
- Right Outer Join: Includes all tuples from the right relation. If no match is found in the left relation, attributes from the left are filled with NULL.
- Full Outer Join: Combines both Left and Right outer joins.
- Divide

Converting Symbols to SQL
Relation R:
| A | B | C |
|---|---|---|
| 1 | 2 | 3 |
| 4 | 5 | 6 |
| 7 | 8 | 9 |
Project to SQL
$π_?(?)$
| |
$π_{A, B, C}(R)$ or $π_{1, 2, 3}(R)$
| |
Note: SQL does not support column indices (ordinal numbers) in the select clause like relational algebra.
Select to SQL
$π_?(σ_?(?))$
| |
$π_{A, B, C}(σ_{B >= '5'}(R))$ or $π_{1, 2, 3}(σ_{2 >= '5'}(R))$
| |
Cartesian Product to SQL
$π_{R.A, R.B, R.C, S.A, S.B, S.D}(R×S)$
| |
Natural Join to SQL
Relations R(A, B, C) and S(B, C, D)
$π_{R.A, R.B, R.C, S.D}(σ_{R.B = S.B ∧ R.C = S.C}(R×S))$
| |