vault backup: 2026-04-08 10:35:59
@@ -1,8 +1,8 @@
---
created: 2026-04-08 08:52
course: "[[29593850 - Automationtheory]]"
topic: "#languages #string #character #kleene #regularExpressions"
related: "[[29593850 - Automationtheory]]"
topic: strings
related: "[[29593929 - Alphabets]]"
type: lecture
status: 🔴
tags:
@@ -17,16 +17,6 @@ tags:

## 📝 Content

### Alphabets
An alphabet is a non-empty, finite set of characters (also called _letters_ or _symbols_). Alphabets are denoted by $Sigma$.

$Sigma = {a, b}$
> Alphabet $Sigma$ contains the characters $a$ and $b$.

$Sigma = {a, ..., z, A, ..., Z, 0, ..., 9}$
> The usual alphabet for writing text.

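
As a rough illustration, the two example alphabets can be modelled as finite sets of one-character strings. This is only a sketch: the names `SIGMA_AB`, `SIGMA_TEXT` and `is_alphabet` are made up for this example and do not come from the lecture.

```python
import string

# Assumption: a character is modelled as a Python string of length 1.
SIGMA_AB = frozenset("ab")
SIGMA_TEXT = frozenset(string.ascii_letters + string.digits)


def is_alphabet(sigma):
    """Check the definition: a non-empty, finite set of single characters."""
    return len(sigma) > 0 and all(len(c) == 1 for c in sigma)


print(is_alphabet(SIGMA_AB))     # the alphabet {a, b}
print(is_alphabet(frozenset()))  # the empty set is not an alphabet
print(len(SIGMA_TEXT))           # 26 + 26 + 10 characters
```

A `frozenset` is used because an alphabet is a set: order and repetition of characters carry no meaning.
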
### Strings
A word (or _string_) is a finite sequence $w = a_1 a_2 ... a_n$ of characters from $Sigma$.

> [!CONVENTION]
@@ -40,6 +30,8 @@ The _length_ $abs(x)$ of a string $x = a_1 ... a_n$ is its number $abs(x) = n$ o
#### Empty String
The empty string is denoted by $epsilon$; it is the neutral element of concatenation.
-> $abs(epsilon) = 0$

## String Operations
### Concatenation
Strings can be concatenated: one string is appended to another.
For strings $x = a_1 ... a_n$ and $y = b_1 ... b_m$ over alphabets $Sigma_x$ and $Sigma_y$, their _concatenation_ over the alphabet $Sigma = Sigma_x union Sigma_y$ is the string
@@ -56,6 +48,7 @@ Order of operations / Brackets do _not matter_. (Concatenation is associative bu

Concatenating any string with the empty string $epsilon$ yields the string itself.
> $x circle.small epsilon = x = epsilon circle.small x$

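
The properties of concatenation can be tried out with ordinary Python strings, where `+` plays the role of $circle.small$ and `""` the role of $epsilon$. This is a minimal sketch; the function name `concat` and the sample strings are chosen purely for illustration.

```python
def concat(x, y):
    """Concatenation: append y to x."""
    return x + y


EPSILON = ""  # the empty string

x, y, z = "ab", "ba", "a"

# Brackets do not matter (associativity).
left = concat(concat(x, y), z)
right = concat(x, concat(y, z))
print(left == right)

# The empty string is neutral on both sides.
print(concat(x, EPSILON) == x == concat(EPSILON, x))

# The order of the operands does matter in general.
print(concat(x, y), concat(y, x))

# Lengths add up under concatenation.
print(len(concat(x, y)) == len(x) + len(y))
```
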
### Exponentiation

The $n^"th"$ power $x^n$ of a string $x$ is the $(n-1)$-fold concatenation of $x$ with itself, i.e. $n$ copies of $x$ in a row; in particular $x^1 = x$, and by convention $x^0 = epsilon$.

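
A small sketch of string powers, assuming the usual convention $x^0 = epsilon$. The helper name `power` is hypothetical; Python's built-in `x * n` computes the same thing.

```python
def power(x, n):
    """Return x^n: n copies of x concatenated, with x^0 the empty string."""
    if n < 0:
        raise ValueError("the exponent must be a non-negative integer")
    result = ""  # x^0 = epsilon
    for _ in range(n):
        result = result + x  # x^(k+1) = x^k followed by x
    return result


print(power("ab", 3))
print(power("ab", 0) == "")
print(power("ab", 3) == "ab" * 3)
```
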
@@ -69,55 +62,10 @@ The $n^"th"$ power $x^n$ of a string $x$ is the $(n-1)$-fold concatenation of $x
### Reversing / Mirroring
For a string $x = a_1 a_2 ... a_(n-1) a_n$ of length $n$, its _mirrored string_ is given by
$$ x^("Rev") = a_n a_(n-1)...a_2 a_1$$

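
Mirroring can be sketched in Python with a reversed slice. The function name `reverse` is an illustrative choice. The last check uses the standard identity that mirroring a concatenation swaps the order of its parts, which is not stated in these notes.

```python
def reverse(x):
    """Return the mirrored string: a_n a_(n-1) ... a_2 a_1."""
    return x[::-1]


x, y = "abc", "de"

print(reverse(x))
print(reverse(reverse(x)) == x)                     # mirroring twice gives x back
print(reverse(x + y) == reverse(y) + reverse(x))    # (xy)^Rev = y^Rev x^Rev
```
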
### Substrings
## Substrings
A string $x$ is a _substring_ of a string $y$ if $y = u x v$, where $u$ and $v$ are arbitrary (possibly empty) strings.
- If $u = epsilon$ then $x$ is a _prefix_ of $y$.
- If $v = epsilon$ then $x$ is a _suffix_ of $y$.

For strings $x$ and $y$, the quantity $abs(y)_x$ is the number of occurrences of $x$ as a substring of $y$.

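
The definitions above can be sketched with Python strings. All function names here are hypothetical. One assumption needs flagging: `count_occurrences` counts overlapping occurrences, i.e. every decomposition $y = u x v$, whereas Python's built-in `str.count` skips overlaps. Whether the lecture intends overlapping counts is not stated in these notes.

```python
def is_substring(x, y):
    """x is a substring of y if y = u x v for some strings u, v."""
    return x in y


def is_prefix(x, y):
    """Case u = epsilon: y = x v."""
    return y.startswith(x)


def is_suffix(x, y):
    """Case v = epsilon: y = u x."""
    return y.endswith(x)


def count_occurrences(x, y):
    """Number of positions in y where x occurs (overlaps included)."""
    if x == "":
        raise ValueError("counting is only sketched for non-empty x")
    return sum(1 for i in range(len(y) - len(x) + 1) if y[i:i + len(x)] == x)


print(is_prefix("ab", "abc"), is_suffix("bc", "abc"), is_substring("b", "abc"))
print(count_occurrences("aa", "aaaa"))  # overlapping occurrences
print("aaaa".count("aa"))               # built-in: non-overlapping only
```
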
### Kleene Star
The Kleene Star (or _Kleene operator_ or _Kleene Closure_) of an alphabet $Sigma$ is denoted by $Sigma^*$. It is the infinite set of all strings made up of characters of $Sigma$.
More precisely, $Sigma^*$ is the set of all strings that can be generated by concatenating finitely many characters of $Sigma$ in arbitrary order.
> $Sigma^* := union.big_(n>=0) A_n$
> where $A_n$ is the set of all strings of length $n$ over $Sigma$.

#### Remarks
- The same character can be used multiple times.
- The empty string $epsilon$ is also part of $Sigma^*$.

> [!Example]
> For $Sigma = {a, b}$: $Sigma^* = {epsilon, a, b, "aa", "ab", "ba", "bb", "aaa", "aab", ...}$

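
Since $Sigma^*$ is infinite, it can only be enumerated lazily. The sketch below follows the union over $A_n$ directly: it yields all strings of length 0, then length 1, and so on. The generator name `kleene_star` is an illustrative choice, and sorting the alphabet is just one way to fix an output order.

```python
from itertools import count, islice, product


def kleene_star(sigma):
    """Lazily yield every string over sigma, grouped by length n = 0, 1, 2, ..."""
    letters = sorted(sigma)
    for n in count(0):
        # A_n: all strings of length n (for n = 0 this is just the empty string)
        for chars in product(letters, repeat=n):
            yield "".join(chars)


first_nine = list(islice(kleene_star({"a", "b"}), 9))
print(first_nine)
# ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb', 'aaa', 'aab']
```

Every string appears at some finite position in this enumeration, which is one way to see that the set can be listed one element at a time.
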
> [!FACT]
> - The set $Sigma^*$ is infinite, since we defined $Sigma$ to be non-empty.
> - It is _countable_ and has the same cardinality as the set $NN$ of natural numbers.

#### Kleene Plus
The _Kleene Plus_ of an alphabet $Sigma$ is given by $Sigma^+ = Sigma^* backslash {epsilon}$, i.e. the set of all non-empty strings over $Sigma$.

#### Lemma group structure
_Lemma:_ The structure $(Sigma^*, circle.small)$ induced by the Kleene star is a monoid, that is, a semigroup with a neutral element.

> [!PROOF]
> - Associativity has been shown.
> - Existence of a neutral element has been shown.
> - Closure under $circle.small$: Let $x in Sigma^*$ and $y in Sigma^*$ be two strings over the alphabet $Sigma$. Then $x circle.small y = x y in Sigma^*$.

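
The three monoid properties can be spot-checked on a finite sample of strings. This is a sanity check under assumed names (`SIGMA`, `words_up_to`, `SAMPLE`), not a proof, since $Sigma^*$ is infinite.

```python
from itertools import product

SIGMA = ("a", "b")
EPSILON = ""


def words_up_to(max_len):
    """All strings over SIGMA of length 0..max_len."""
    return ["".join(t) for n in range(max_len + 1) for t in product(SIGMA, repeat=n)]


SAMPLE = words_up_to(2)  # 1 + 2 + 4 = 7 strings

# Closure: the concatenation uses only characters of SIGMA.
closure_ok = all(set(x + y) <= set(SIGMA) for x in SAMPLE for y in SAMPLE)

# Associativity: brackets do not matter.
assoc_ok = all(
    (x + y) + z == x + (y + z) for x in SAMPLE for y in SAMPLE for z in SAMPLE
)

# Neutral element: the empty string on either side changes nothing.
neutral_ok = all(x + EPSILON == x == EPSILON + x for x in SAMPLE)

print(closure_ok, assoc_ok, neutral_ok)
```
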
### Formal Languages
A formal _language_ over the alphabet $Sigma$ is a subset $L$ of $Sigma^*$.

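
To make the definition concrete: a finite language can be written down as a set of strings, while an infinite one is more conveniently given as a membership test. Both languages below are hypothetical examples chosen for illustration and do not come from the lecture.

```python
SIGMA = frozenset("ab")


def over_sigma(w):
    """True if w is a string over SIGMA, i.e. an element of SIGMA^*."""
    return all(c in SIGMA for c in w)


# A finite language: simply a set of strings over SIGMA.
L_FINITE = {"", "ab", "ba"}


def in_L_even(w):
    """Hypothetical infinite language: all strings over SIGMA of even length."""
    return over_sigma(w) and len(w) % 2 == 0


print(all(over_sigma(w) for w in L_FINITE))  # L_FINITE is a subset of SIGMA^*
print(in_L_even("abba"), in_L_even("aba"), in_L_even("cc"))
```
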
### Finite representation of languages
**Goal:** Represent a language using _finite_ information
#### Using set notation
$S = {a^n b^m bar n, m >= 0} = {epsilon, a, b, "aa", "ab", ...}$
> This is very inefficient.

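
The set-builder description of $S$ is already a finite representation, and it translates directly into a membership test: skip all leading $a$s, then all following $b$s, and check that nothing is left. The function name `in_S` is an assumption made for this sketch.

```python
def in_S(w):
    """Membership test for S = { a^n b^m | n, m >= 0 }."""
    i = 0
    while i < len(w) and w[i] == "a":  # consume the block a^n
        i += 1
    while i < len(w) and w[i] == "b":  # consume the block b^m
        i += 1
    return i == len(w)                 # nothing may remain


for w in ["", "a", "b", "aab", "ba", "abc"]:
    print(repr(w), in_S(w))
```
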
#### Using regular expressions
A _regular expression_ $r$ over an alphabet $Sigma$ is defined recursively:
- $emptyset, epsilon$ and each $a in Sigma$ are regular expressions, which represent the languages $L(emptyset) = emptyset, L(epsilon) = {epsilon}$ and $L(a) = {a}$.
- If $r$ and $s$ are regular expressions then
- $(r+s)$
<mark style="background: #FF5582A6;"> - !! Complete from lecture notes</mark>

### Regular languages
A language $L$ that can be described by a regular expression $r$ (i.e. $L(r) = L$) is called _regular_.

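
As a hedged illustration, the language $S = {a^n b^m bar n, m >= 0}$ from the set-notation example is regular, and Python's `re` module can test membership with the pattern `a*b*`. Note that Python's regex syntax is a programming dialect rather than the formal notation of these notes: for instance, it writes alternatives with `|` where the notes write $(r+s)$. The names `PATTERN` and `in_language` are made up for this sketch.

```python
import re

# Assumption: the pattern a*b* describes the language { a^n b^m | n, m >= 0 }.
PATTERN = re.compile(r"a*b*")


def in_language(w):
    """True if the whole string w is matched by the pattern."""
    return PATTERN.fullmatch(w) is not None


for w in ["", "aab", "bbb", "ba", "abc"]:
    print(repr(w), in_language(w))
```

`fullmatch` is used instead of `match` or `search` because language membership concerns the entire string, not just a part of it.
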