Introduction
Sometimes it’s the people who no one imagines anything of who do the things that no one can imagine
Recently, I learned from the comments on Bitecho’s video about the documentary about Turing, The Imitation Game. I watched it and was deeply moved by this quote from the film, using it as an epigraph for this article on Formal Languages and Automata.
If you ever feel a bit down, also check out Amy’s thoughts on this quote .
Basic Concepts
Alphabet: A non-empty, finite set of symbols (characters).
$$ {\textstyle \sum_1} = \{0, 1\} $$String: A finite sequence of symbols from some alphabet.
$$ \text{If } \sum_1 = \{0, 1\}, \text{ then } 0, 1, 00, 101011 \text{ are strings over } \sum_1. $$Empty String: Denoted as $\varepsilon$, a string with 0 characters. For any alphabet $\sum$, $\varepsilon \notin \sum$.
Symbol Conventions
Alphabets: $\sum, \Gamma, \cdots$
Characters: $a, b, c, \cdots$
Strings: $\cdots,w,x,y,z$
Sets: $A,B,C,\cdots$
Length of a String: The number of symbol positions in the string.
$$ \text{For } \sum_1 = \{0, 1\}, |0010|=4. $$Concatenation of Strings $x$ and $y$: An operation that joins them end-to-end to form a new string, denoted as $x \cdot y$ or $xy$.
$$ x=01, y=ab, \text{ then } xy=01ab $$$n$-th Power of String $x$ ($n \ge 0$), recursively defined as:
$$ x^n= \left \{ \begin{matrix} \varepsilon & n=0 \\ x^{n-1}x & n \gt 0 \end{matrix} \right . $$For $\sum_1 = \{a, b\}$, $(ba)^2=baba, ba^2=baa$.
Concatenation of Sets $A$ and $B$, denoted as $A\cdot B$ or $AB$, defined as:
$$ A \cdot B = \{ w |w=x \cdot y, x \in A \space and \space y \in B \} $$If $A = \{0, 11\}$ and $B = \{ab, ba\}$, then $AB = \{ 0ab, 0ba, 11ab, 11ba \}, BA= \{ ab0, ab11, ba0, ba11 \}$
$n$-th Power of Set $A$ ($n \ge 0$), recursively defined as:
$$ A^n= \left \{ \begin{matrix} \{ \varepsilon \} & n=0\\ A^{n-1}A & n \ge 1 \end{matrix} \right . $$For $\sum_1 = \{0, 1\}$, we have $\sum^0=\{\varepsilon\}, \sum^1=\{0,1\}, \sum^2=\{00,01,10,11\}, \cdots$.
Kleene Closure: The union of all powers (including zero-th power) of an alphabet.
$$ {\textstyle \sum^*} = \bigcup_{i=0}^{\infty}{\textstyle \sum^i} $$Positive Closure: The union of all powers (excluding zero-th power) of an alphabet.
$$ {\textstyle \sum^+} = \bigcup_{i=1}^{\infty}{\textstyle \sum^i} $$Clearly,
$$ {\textstyle \sum^*} = {\textstyle \sum^+} \cup \{ \varepsilon \} $$Languages
Definition: If $\sum$ is an alphabet and $\forall L \subseteq \sum^*$, then $L$ is called a language over the alphabet $\sum$.
- Natural languages, programming languages, etc.
- $ \{ 0^n 1^n \mid n \ge 0 \} $
- The set of strings of 0’s and 1’s with an equal number of each
- $\emptyset$, $\{\varepsilon\}$, and $\sum^*$ are all languages over any alphabet $\sum$, but $\emptyset \ne \{\varepsilon\}$.
The only constraint for a language is that all alphabets must be finite.