---
created: 2026-04-08 08:52
course: "[[29593850 - Automationtheory]]"
topic: "#languages #string #character #kleene #regularExpressions"
related: "[[29593850 - Automationtheory]]"
type: lecture
status: 🔴
tags:
  - university
---

## 📌 Summary

> [!abstract]
> Overview of lecture 1 on `Wednesday, 2026/Apr/08`

---

## 📝 Content

### Alphabets
An _alphabet_ is a non-empty, finite set of characters (also called _letters_ or _symbols_). Alphabets are denoted by $Sigma$.

$Sigma = {a, b}$
> Alphabet $Sigma$ contains the characters $a$ and $b$.

$Sigma = {a, ..., z, A, ..., Z, 0, ..., 9}$
> The usual alphabet for writing text.

### Strings
A word (or _string_) over $Sigma$ is a finite sequence $w = a_1 a_2 ... a_n$ of characters from $Sigma$.

> [!CONVENTION]
> We will use lowercase letters to denote strings that are part of a language.

> [!EXAMPLE]
> $"aa", "ab", "bba"$ and $"baab"$ are strings over $Sigma = {a, b}$.

#### Length of a string
The _length_ $abs(x)$ of a string $x = a_1 ... a_n$ is the number of its characters: $abs(x) = n$.

#### Empty String
The empty string is denoted by $epsilon$. It is the neutral element of concatenation.
-> $abs(epsilon) = 0$

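The definitions of alphabet, string, length and empty string can be tried out directly in Python. This is only a minimal sketch: the names `SIGMA`, `EPSILON` and `is_string_over` are made up for illustration and do not come from the lecture.

```python
# Hypothetical names for illustration; not from the lecture.
SIGMA = {"a", "b"}   # the example alphabet Sigma = {a, b}
EPSILON = ""         # the empty string epsilon


def is_string_over(word: str, alphabet: set) -> bool:
    """Return True if every character of `word` belongs to `alphabet`."""
    return all(ch in alphabet for ch in word)


# The example strings from above are strings over Sigma.
for w in ["aa", "ab", "bba", "baab"]:
    assert is_string_over(w, SIGMA)

assert not is_string_over("abc", SIGMA)  # 'c' is not a character of Sigma
assert is_string_over(EPSILON, SIGMA)    # epsilon uses no characters at all
assert len("baab") == 4                  # |baab| = 4
assert len(EPSILON) == 0                 # |epsilon| = 0
```

Python's built-in `len` plays the role of $abs(x)$ here.
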
### Concatenation
Strings can be concatenated: one string is appended to another.
For strings $x = a_1 ... a_n$ and $y = b_1 ... b_m$ over alphabets $Sigma_x$ and $Sigma_y$, their _concatenation_ over the alphabet $Sigma = Sigma_x union Sigma_y$ is the string
$$x circle.small y = x y = a_1 a_2 ... a_n b_1 b_2 ... b_m$$
> This string has length $abs(x y) = n + m$.

> [!EXAMPLE]
> $x = "apple"$
> $y = "pie"$
> $x circle.small y = "applepie"$

Brackets do _not matter_: concatenation is associative.
> $(x circle.small y) circle.small z = x circle.small (y circle.small z)$

The order of the operands does matter: concatenation is **not** commutative, so in general $x y eq.not y x$.

Concatenating any string with the empty string $epsilon$ yields the string itself.
> $x circle.small epsilon = x = epsilon circle.small x$

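The properties of concatenation listed above can be checked on concrete strings. The following is a small sketch using Python's `+` operator on `str` as a stand-in for $circle.small$. The third string `z` is an arbitrary choice for illustration.

```python
x, y = "apple", "pie"
z = "s"          # arbitrary third string, chosen only for this check
EPSILON = ""     # the empty string

concat = x + y
assert concat == "applepie"

# Length is additive: |xy| = |x| + |y|.
assert len(concat) == len(x) + len(y)

# Associative: brackets do not matter.
assert (x + y) + z == x + (y + z)

# Not commutative: the order of the operands matters.
assert x + y != y + x

# epsilon is the neutral element.
assert x + EPSILON == x == EPSILON + x
```

These checks only test single examples, of course. They illustrate the laws but do not prove them.
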
### Exponentiation

The $n^"th"$ power $x^n$ of a string $x$ is the concatenation of $n$ copies of $x$. It is defined recursively:
> $x^0 := epsilon$
> $x^n := x^(n-1) circle.small x$ for $n in NN$ with $n >= 1$

> [!Example]
> $x^4 = x x x x$
> $(a b)^3 = a b a b a b$

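The recursive definition translates almost literally into code. A minimal sketch, where the function name `power` is made up for illustration:

```python
def power(x: str, n: int) -> str:
    """n-th power of the string x, following the recursive definition."""
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    if n == 0:
        return ""                 # x^0 := epsilon
    return power(x, n - 1) + x    # x^n := x^(n-1) . x


assert power("x", 4) == "xxxx"
assert power("ab", 3) == "ababab"
assert power("ab", 0) == ""

# Python's built-in string repetition computes the same thing.
assert power("ab", 3) == "ab" * 3
```

In practice one would simply write `x * n`. The recursive version is only there to mirror the definition.
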
### Reversing / Mirroring
For a string $x = a_1 a_2 ... a_(n-1) a_n$ of length $n$, its _mirrored string_ is given by
$$ x^("Rev") = a_n a_(n-1)...a_2 a_1$$

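In Python, mirroring a string is a one-liner using slice notation with step `-1`. A short sketch, with a made-up function name `reverse`:

```python
def reverse(x: str) -> str:
    """Return the mirrored string: the characters of x in reverse order."""
    return x[::-1]


assert reverse("bba") == "abb"
assert reverse("") == ""                  # epsilon mirrored is epsilon
assert len(reverse("bba")) == len("bba")  # mirroring keeps the length
assert reverse(reverse("bba")) == "bba"   # mirroring twice gives x back
```
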
### Substrings
A string $x$ is a _substring_ of a string $y$ if $y = u x v$, where $u$ and $v$ are arbitrary (possibly empty) strings.
- If $u = epsilon$ then $x$ is a _prefix_ of $y$.
- If $v = epsilon$ then $x$ is a _suffix_ of $y$.

For strings $x$ and $y$, the quantity $abs(y)_x$ is the number of occurrences of $x$ as a substring of $y$.

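The following sketch expresses these notions in Python. The function names are made up for illustration. One assumption is worth flagging: $abs(y)_x$ is read here as the number of positions at which $x$ starts inside $y$, i.e. the number of decompositions $y = u x v$, so overlapping occurrences are counted separately. The lecture may have intended a different convention.

```python
def is_substring(x: str, y: str) -> bool:
    return x in y             # y = u x v for some u, v


def is_prefix(x: str, y: str) -> bool:
    return y.startswith(x)    # y = x v  (u = epsilon)


def is_suffix(x: str, y: str) -> bool:
    return y.endswith(x)      # y = u x  (v = epsilon)


def count_occurrences(x: str, y: str) -> int:
    """Number of start positions of x in y (assumption: overlaps count)."""
    return sum(1 for i in range(len(y) - len(x) + 1) if y[i:i + len(x)] == x)


assert is_substring("pp", "apple")
assert is_prefix("app", "apple") and not is_suffix("app", "apple")
assert is_suffix("le", "apple")
assert count_occurrences("ab", "abab") == 2
assert count_occurrences("aa", "aaaa") == 3   # overlapping occurrences
assert "aaaa".count("aa") == 2                # str.count skips overlaps
```

The last two lines show why the built-in `str.count` is not used: it counts non-overlapping occurrences only.
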
### Kleene Star
Denoted by $Sigma^*$. The Kleene Star (or _Kleene operator_ or _Kleene Closure_) of an alphabet $Sigma$ is the set of all finite strings made up of the characters of $Sigma$.
In other words, $Sigma^*$ contains every string that can be generated by concatenating arbitrarily many characters of $Sigma$.
> $Sigma^* := union.big_(n>=0) A_n$
> where $A_n$ is the set of all strings of length $n$ over $Sigma$

#### Remarks
- The same character can be used multiple times.
- The empty string $epsilon$ is also part of $Sigma^*$.

> [!Example]
> For $Sigma = {a, b}$: $Sigma^* = {epsilon, a, b, "aa", "ab", "ba", "bb", "aaa", "aab", ...}$

> [!FACT]
> - The set $Sigma^*$ is infinite, since we defined $Sigma$ to be non-empty.
> - It is _countable_ and has the same cardinality as the set $NN$ of natural numbers.

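Since $Sigma^*$ is infinite, it cannot be stored as a whole, but it can be enumerated lazily: first $A_0$, then $A_1$, then $A_2$, and so on. The sketch below does this with a generator. The function names are made up for illustration. Listing the strings one after another in this way is also the idea behind countability: every string is reached after finitely many steps.

```python
from itertools import count, islice, product


def strings_of_length(alphabet: set, n: int):
    """Yield A_n: all strings of length n over `alphabet`, in sorted order."""
    for chars in product(sorted(alphabet), repeat=n):
        yield "".join(chars)


def kleene_star(alphabet: set):
    """Lazily yield Sigma^* as the union of A_0, A_1, A_2, ... (never ends)."""
    for n in count(0):
        yield from strings_of_length(alphabet, n)


# Take only the first nine strings of the infinite enumeration.
first = list(islice(kleene_star({"a", "b"}), 9))
assert first == ["", "a", "b", "aa", "ab", "ba", "bb", "aaa", "aab"]

# A_3 over a two-character alphabet has 2^3 = 8 elements.
assert len(list(strings_of_length({"a", "b"}, 3))) == 8
```

The empty string `""` comes first because $A_0 = {epsilon}$.
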
#### Kleene Plus
The _Kleene Plus_ of an alphabet $Sigma$ is given by $Sigma^+ = Sigma^* backslash {epsilon}$.

#### Lemma: monoid structure
The structure $(Sigma^*, circle.small)$ induced by the Kleene star is a monoid, that is, a semigroup with a neutral element.

> [!PROOF]
> - Associativity of $circle.small$ has been shown.
> - Existence of a neutral element has been shown: $epsilon in Sigma^*$ satisfies $x circle.small epsilon = x = epsilon circle.small x$.
> - Closure under $circle.small$: Let $x in Sigma^*$ and $y in Sigma^*$ be two strings over the alphabet $Sigma$. Then $x circle.small y = x y in Sigma^*$.

### Formal Languages
A formal _language_ over the alphabet $Sigma$ is a subset $L$ of $Sigma^*$.

### Finite representation of languages
**Goal:** Represent a language using _finite_ information.

#### Using set notation
$S = {a^n b^m bar n, m >= 0} = {epsilon, a, b, "aa", "ab", ...}$
> This is very inefficient.

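To make the set $S$ more tangible, here is a sketch of a membership test and a lazy enumeration for it. Both function names are made up for illustration, and the order of enumeration (by total length, more $a$'s first) is an arbitrary choice.

```python
from itertools import count, islice


def in_S(w: str) -> bool:
    """True if w = a^n b^m for some n, m >= 0."""
    n = len(w) - len(w.lstrip("a"))   # number of leading a's
    return set(w[n:]) <= {"b"}        # the rest may only contain b's


def enumerate_S():
    """Lazily yield the elements of S, ordered by total length n + m."""
    for total in count(0):
        for n in range(total, -1, -1):
            yield "a" * n + "b" * (total - n)


assert in_S("") and in_S("aab") and in_S("bbb")
assert not in_S("ba") and not in_S("abc")

assert list(islice(enumerate_S(), 6)) == ["", "a", "b", "aa", "ab", "bb"]
```

The membership test is a finite piece of information that decides an infinite set, which is the kind of representation the goal above asks for.
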
#### Using regular expressions
A _regular expression_ $r$ over an alphabet $Sigma$ is defined recursively:
- $emptyset, epsilon$ and each $a in Sigma$ are regular expressions, which represent the languages $L(emptyset) = emptyset, L(epsilon) = {epsilon}$ and $L(a) = {a}$
- If $r$ and $s$ are regular expressions then
    - $(r+s)$
<mark style="background: #FF5582A6;"> - !! Complete from lecture notes</mark>

### Regular languages
A language $L$ that can be described by a regular expression $r$ (i.e. $L(r) = L$) is called _regular_.
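
As an illustration, the language $S = {a^n b^m bar n, m >= 0}$ from the set-notation example can be tested with Python's `re` module. This is only a sketch under assumptions: it presumes that the regular expressions of the lecture include a star operator on expressions (that part of the definition is still incomplete above), and Python's pattern syntax is a practical dialect that differs from the formal notation, for example union is written `|` rather than `+`. The name `in_language` is made up for illustration.

```python
import re

# Pattern for "any number of a's followed by any number of b's".
PATTERN = re.compile(r"a*b*")


def in_language(w: str) -> bool:
    """True if the whole string w matches the pattern a*b*."""
    return PATTERN.fullmatch(w) is not None


assert in_language("")        # epsilon: n = m = 0
assert in_language("aab")
assert in_language("bbb")
assert not in_language("ba")  # an a after a b is not allowed
assert not in_language("abc") # 'c' is not part of the pattern
```

`fullmatch` is used instead of `match` or `search` because membership in a language is a statement about the whole string, not about some part of it.
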