Database Learning 2: The Relational Model

SQL Series
Database Learning 1: Introduction: https://blog.yexca.net/en/archives/86
Database Learning 2: Relational Model: This article
Database Learning 3: SQL Language: https://blog.yexca.net/en/archives/88
Database Learning 4: Relational Database Theory: https://blog.yexca.net/en/archives/89
Database Learning 5: Normal Forms: https://blog.yexca.net/en/archives/90
Database Learning 6: Database Design: https://blog.yexca.net/en/archives/91
Database Learning 7: Database Control Functions: https://blog.yexca.net/en/archives/92

📢 This article was translated by gemini-3-flash-preview

Basic Terms in the Relational Model

A relational database system is a database system that supports the relational model. Here are some basic terms:

Relation: A relation is a two-dimensional table. Each relation has a name.
Tuple: A row in the table is a tuple, corresponding to a record value in a storage file.
Attribute: Columns in a table are called attributes. Each column has an attribute name. Attribute values correspond to data items or field values in a record.
Domain: The range of values an attribute can take.
Relation Schema: A description of the relation. It consists of the relation name and its set of attributes. The format is: RelationName(Attribute 1, ..., Attribute n).

Usually, one relation schema corresponds to the structure of one relation file.

Candidate Key: An attribute or set of attributes that uniquely identifies a tuple.
Primary Key: If a relation has multiple candidate keys, one is chosen as the primary key.
Prime Attribute: Attributes included in any candidate key are called prime attributes. Others are non-prime attributes.
Foreign Key: If an attribute (or group of attributes) in a relation is not a key for that relation but is a key for another relation, it is a foreign key.
All-key: When all attributes of a relation schema form the candidate key.
Superkey: A set of attributes that contains a key.

Definition of the Relational Model

A relation schema can be formally represented as: R(U, D, dom, F)

Where R is the relation name, U is the set of attributes, D is the domain of the attributes, dom is the mapping from attributes to domains, and F is the set of data dependencies between attributes.

Commonly abbreviated as R(U) or R($A_1$, $A_2$, …), where R is the relation name and $A_1$, $A_2$ are attribute names.

A relational model consists of three parts: relational data structure, relational operations, and relational integrity constraints.

Relational Data Structure

The logical data structure in a relational model is a simple 2D table.

Relational Operations

The relational model defines a set of operations for data manipulation.

Common operations include: Select, Project, Join, Divide, Union, Intersection, Difference (queries), as well as Insert, Delete, and Update.

Relational Integrity Constraints

There are three main integrity rules:

Entity Integrity: The primary key cannot be null or partially null.
Referential Integrity: If a foreign key X in relation $R_2$ matches the primary key of relation $R_1$, then every value of X in $R_2$ must either exist in the primary key of $R_1$ or be null.
User-defined Integrity: Constraints defined by the user for specific data requirements.

Relational Algebra Operators

Operator Category	Symbol	Meaning
Set Operators	∪ – ∩ ×	Union Difference Intersection Cartesian Product
Specialized Relational Operators	σ π ⋈ ÷	Select Project Join Divide
Comparison Operators	＞ ≥ ＜ ≤ ＝ ≠	Greater than Greater or equal Less than Less or equal Equal Not equal
Logical Operators	¬ ∧ ∨	Not And Or

Traditional Set Operations

Specialized Relational Operators

Join

Joins include θ-join, Equijoin, and Natural join.

θ-join: Join condition uses any comparison operator (not just equals).
Equijoin: Join condition uses the equality operator.
Natural Join: An equijoin that removes duplicate attributes. This is the most common join.

Project

Projection is a vertical operation. It selects specific attribute columns A from relation R to form a new relation, denoted as $π_A(R)$.

Select

Selection is a horizontal operation. It selects tuples from relation R that satisfy a given condition F, denoted as $σ_F(R)$.

Outer Join

An extension of the join operation to handle missing information.

Left Outer Join: Includes all tuples from the left relation. If no match is found in the right relation, attributes from the right are filled with NULL.
Right Outer Join: Includes all tuples from the right relation. If no match is found in the left relation, attributes from the left are filled with NULL.
Full Outer Join: Combines both Left and Right outer joins.

Divide

Converting Symbols to SQL

Relation R:

A	B	C
1	2	3
4	5	6
7	8	9

Project to SQL

$π_?(?)$

1
select ? from ?

$π_{A, B, C}(R)$ or $π_{1, 2, 3}(R)$

1
select A, B, C from R

Note: SQL does not support column indices (ordinal numbers) in the select clause like relational algebra.

Select to SQL

$π_?(σ_?(?))$

1
select ? from ? where ?

$π_{A, B, C}(σ_{B >= '5'}(R))$ or $π_{1, 2, 3}(σ_{2 >= '5'}(R))$

1
2
3
select A, B, C from R where B >= '5'
-- or
select A, B, C from R where B >= 5

Cartesian Product to SQL

$π_{R.A, R.B, R.C, S.A, S.B, S.D}(R×S)$

1
2
3
4
select R.A, R.B, R.C, S.A, S.B, S.D
from R, S
-- Ascending order
order by R.A asc

Natural Join to SQL

Relations R(A, B, C) and S(B, C, D)

$π_{R.A, R.B, R.C, S.D}(σ_{R.B = S.B ∧ R.C = S.C}(R×S))$

1
2
3
select R.A, R.B, R.C, S.D
from R, S
where R.B = S.B and R.C = S.C