Database Learning 2: The Relational Model

SQL Series

Database Learning 1: Introduction: https://blog.yexca.net/archives/86
Database Learning 2: Relational Model: This article
Database Learning 3: SQL Language: https://blog.yexca.net/archives/88
Database Learning 4: Relational Database Theory: https://blog.yexca.net/archives/89
Database Learning 5: Normal Forms: https://blog.yexca.net/archives/90
Database Learning 6: Database Design: https://blog.yexca.net/archives/91
Database Learning 7: Database Control Functions: https://blog.yexca.net/archives/92

📢 This article was translated by gemini-3-flash-preview

Basic Terms in the Relational Model

A relational database system is a database system that supports the relational model. Here are some basic terms:

  • Relation: A relation is a two-dimensional table. Each relation has a name.
  • Tuple: A row in the table is a tuple, corresponding to a record value in a storage file.
  • Attribute: Columns in a table are called attributes. Each column has an attribute name. Attribute values correspond to data items or field values in a record.
  • Domain: The range of values an attribute can take.
  • Relation Schema: A description of the relation. It consists of the relation name and its set of attributes. The format is: RelationName(Attribute 1, ..., Attribute n).

Usually, one relation schema corresponds to the structure of one relation file.

  • Candidate Key: An attribute or set of attributes that uniquely identifies a tuple.
  • Primary Key: If a relation has multiple candidate keys, one is chosen as the primary key.
  • Prime Attribute: Attributes included in any candidate key are called prime attributes. Others are non-prime attributes.
  • Foreign Key: If an attribute (or group of attributes) in a relation is not a key for that relation but is a key for another relation, it is a foreign key.
  • All-key: When all attributes of a relation schema form the candidate key.
  • Superkey: A set of attributes that contains a key.

Definition of the Relational Model

A relation schema can be formally represented as: R(U, D, dom, F)

Where R is the relation name, U is the set of attributes, D is the domain of the attributes, dom is the mapping from attributes to domains, and F is the set of data dependencies between attributes.

Commonly abbreviated as R(U) or R($A_1$, $A_2$, …), where R is the relation name and $A_1$, $A_2$ are attribute names.

A relational model consists of three parts: relational data structure, relational operations, and relational integrity constraints.

Relational Data Structure

The logical data structure in a relational model is a simple 2D table.

Relational Operations

The relational model defines a set of operations for data manipulation.

Common operations include: Select, Project, Join, Divide, Union, Intersection, Difference (queries), as well as Insert, Delete, and Update.

Relational Integrity Constraints

There are three main integrity rules:

  • Entity Integrity: The primary key cannot be null or partially null.
  • Referential Integrity: If a foreign key X in relation $R_2$ matches the primary key of relation $R_1$, then every value of X in $R_2$ must either exist in the primary key of $R_1$ or be null.
  • User-defined Integrity: Constraints defined by the user for specific data requirements.

Relational Algebra Operators

Operator CategorySymbolMeaning
Set Operators


×
Union
Difference
Intersection
Cartesian Product
Specialized Relational Operatorsσ
π

÷
Select
Project
Join
Divide
Comparison Operators




Greater than
Greater or equal
Less than
Less or equal
Equal
Not equal
Logical Operators¬

Not
And
Or

Traditional Set Operations

img

Specialized Relational Operators

  1. Join

Joins include θ-join, Equijoin, and Natural join.

  • θ-join: Join condition uses any comparison operator (not just equals).
  • Equijoin: Join condition uses the equality operator.
  • Natural Join: An equijoin that removes duplicate attributes. This is the most common join.
  1. Project

Projection is a vertical operation. It selects specific attribute columns A from relation R to form a new relation, denoted as $π_A(R)$.

  1. Select

Selection is a horizontal operation. It selects tuples from relation R that satisfy a given condition F, denoted as $σ_F(R)$.

img

  1. Outer Join

An extension of the join operation to handle missing information.

  • Left Outer Join: Includes all tuples from the left relation. If no match is found in the right relation, attributes from the right are filled with NULL.
  • Right Outer Join: Includes all tuples from the right relation. If no match is found in the left relation, attributes from the left are filled with NULL.
  • Full Outer Join: Combines both Left and Right outer joins.
  1. Divide

img

Converting Symbols to SQL

Relation R:

ABC
123
456
789

Project to SQL

$π_?(?)$

1
select ? from ?

$π_{A, B, C}(R)$ or $π_{1, 2, 3}(R)$

1
select A, B, C from R

Note: SQL does not support column indices (ordinal numbers) in the select clause like relational algebra.

Select to SQL

$π_?(σ_?(?))$

1
select ? from ? where ?

$π_{A, B, C}(σ_{B >= '5'}(R))$ or $π_{1, 2, 3}(σ_{2 >= '5'}(R))$

1
2
3
select A, B, C from R where B >= '5'
-- or
select A, B, C from R where B >= 5

Cartesian Product to SQL

$π_{R.A, R.B, R.C, S.A, S.B, S.D}(R×S)$

1
2
3
4
select R.A, R.B, R.C, S.A, S.B, S.D
from R, S
-- Ascending order
order by R.A asc

Natural Join to SQL

Relations R(A, B, C) and S(B, C, D)

$π_{R.A, R.B, R.C, S.D}(σ_{R.B = S.B ∧ R.C = S.C}(R×S))$

1
2
3
select R.A, R.B, R.C, S.D
from R, S
where R.B = S.B and R.C = S.C
This post is licensed under CC BY-NC-SA 4.0 by the author.