uni_notes/00 Inbox/29593852 - Strings.md
2026-04-08 10:11:50 +02:00

created: 2026-04-08 08:52
course: 29593850 - Automationtheory
topic: #languages #string #character #kleene #regularExpressions
related: 29593850 - Automationtheory
type: lecture
status: 🔴
tags: university

📌 Summary

[!abstract] Overview of lecture 1 on Wednesday, 2026/Apr/08


📝 Content

Alphabets

An alphabet is a non-empty, finite set of characters (also called letters or symbols). Alphabets are denoted by Sigma.

Sigma = {a, b}

Alphabet Sigma contains the characters a and b.

Sigma = {a, ..., z, A, ..., Z, 0, ..., 9}

usual alphabet for writing text

Strings

A word (or string) is a finite sequence w = a_1 a_2 ... a_n of characters from Sigma.

[!CONVENTION] We will use lowercase letters to denote strings that are part of a language.

[!EXAMPLE] "aa", "ab", "bba" and "baab" are strings over Sigma = {a, b}.
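The definition above can be checked mechanically. The following is a minimal Python sketch, assuming an alphabet is modelled as a set of one-character strings; the helper name `is_string_over` is made up for illustration and does not come from the lecture.

```python
def is_string_over(w: str, sigma: set[str]) -> bool:
    """Return True if every character of w belongs to the alphabet sigma."""
    return all(ch in sigma for ch in w)


sigma = {"a", "b"}

# The four example strings are all strings over sigma.
print(all(is_string_over(w, sigma) for w in ["aa", "ab", "bba", "baab"]))  # True

# "abc" is not, because "c" is not a character of sigma.
print(is_string_over("abc", sigma))  # False
```

Note that the empty Python string passes the check for every alphabet, since it contains no characters at all.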

Length of a string

The length abs(x) of a string x = a_1 ... a_n is the number of characters it contains: abs(x) = n.

Empty String

The empty string is denoted by epsilon. It contains no characters, so abs(epsilon) = 0, and it is the neutral element of concatenation.

Concatenation

Strings can be concatenated, i.e. one string is appended to another. For strings x = a_1 ... a_n and y = b_1 ... b_m over alphabets Sigma_x and Sigma_y, their concatenation over the alphabet Sigma = Sigma_x union Sigma_y is the string

x circle.small y = x y = a_1 a_2 ... a_n b_1 b_2 ... b_m

This string has length abs(x y) = n + m.

[!EXAMPLE] For x = "apple" and y = "pie": x circle.small y = "applepie"

The order of evaluation does not matter, so brackets can be omitted: concatenation is associative. It is not commutative, however: in general x y eq.not y x.

(x circle.small y) circle.small z = x circle.small (y circle.small z)

Concatenating any string with the empty string epsilon yields the string itself.

x circle.small epsilon = x = epsilon circle.small x
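The properties of concatenation listed above can be tried out on concrete strings. This is only a sketch using Python's built-in `+` on `str` as a stand-in for circle.small; the names `concat` and `EPSILON` and the third string "chart" are illustrative choices, not part of the lecture.

```python
EPSILON = ""  # the empty string


def concat(x: str, y: str) -> str:
    """Concatenate x and y, i.e. append y to x."""
    return x + y


x, y, z = "apple", "pie", "chart"

print(concat(x, y))  # applepie

# The lengths add up: abs(x y) = n + m.
print(len(concat(x, y)) == len(x) + len(y))

# Associativity: brackets do not matter.
print(concat(concat(x, y), z) == concat(x, concat(y, z)))

# No commutativity: swapping the operands gives a different string here.
print(concat(x, y) != concat(y, x))

# EPSILON is neutral on both sides.
print(concat(x, EPSILON) == x == concat(EPSILON, x))
```

Checking a few examples does not prove the properties, of course; it only illustrates them.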

Exponentiation

The n-th power x^n of a string x is the concatenation of n copies of x. It is defined recursively:

x^0 := epsilon
x^n := x^(n-1) circle.small x for n in NN, n >= 1

[!Example] x^4 = x x x x and (a b)^3 = a b a b a b
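The recursive definition translates almost literally into code. The following Python sketch assumes n is a non-negative integer; the function name `power` is made up for illustration.

```python
def power(x: str, n: int) -> str:
    """Return x^n using x^0 = epsilon and x^n = x^(n-1) followed by x."""
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    if n == 0:
        return ""  # base case: the empty string epsilon
    return power(x, n - 1) + x  # recursive case


print(power("ab", 3))  # ababab
```

For non-negative n, Python's built-in repetition `x * n` produces the same string, so `power("ab", 3) == "ab" * 3`.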

Reversing / Mirroring

For a string x = a_1 a_2 ... a_(n-1) a_n of length n, its mirrored (reversed) string is given by

x^("Rev") = a_n a_(n-1)...a_2 a_1
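In Python, mirroring a string can be sketched with a slice of step -1. The function name `reverse` is an illustrative choice. The last check uses the identity (x y)^Rev = y^Rev x^Rev, which is a standard fact but is not stated in these notes.

```python
def reverse(x: str) -> str:
    """Return the characters of x in reverse order."""
    return x[::-1]


print(reverse("bba"))  # abb

# Mirroring twice gives back the original string.
print(reverse(reverse("apple")) == "apple")

# Standard identity (not from the notes): reversing a concatenation
# reverses each part and swaps their order.
x, y = "apple", "pie"
print(reverse(x + y) == reverse(y) + reverse(x))
```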

Substrings

A string x is a substring of a string y if y = u x v, where u and v can be arbitrary strings.

  • If u = epsilon then x is a prefix of y.
  • If v = epsilon then x is a suffix of y.

For strings x and y, the quantity abs(y)_x is the number of occurrences of x as a substring of y.
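The notions of substring, prefix, suffix and abs(y)_x can be sketched in Python as follows. All function names are made up for illustration. One assumption to flag: `count_occurrences` counts every start position at which x occurs in y, so overlapping occurrences are counted separately. The notes do not say how overlaps are treated, and Python's built-in `str.count` would skip them.

```python
def is_substring(x: str, y: str) -> bool:
    """True if y = u x v for some strings u and v."""
    return x in y


def is_prefix(x: str, y: str) -> bool:
    """True if y = x v for some string v (the case u = epsilon)."""
    return y.startswith(x)


def is_suffix(x: str, y: str) -> bool:
    """True if y = u x for some string u (the case v = epsilon)."""
    return y.endswith(x)


def count_occurrences(x: str, y: str) -> int:
    """Count the start positions i with y[i:i+len(x)] == x (overlaps included)."""
    return sum(1 for i in range(len(y) - len(x) + 1) if y[i:i + len(x)] == x)


print(is_prefix("app", "applepie"), is_suffix("pie", "applepie"))  # True True
print(count_occurrences("aa", "aaaa"))  # 3
```

With this convention abs("aaaa")_"aa" is 3 (start positions 0, 1 and 2), whereas `"aaaa".count("aa")` returns 2.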

Kleene Star

The Kleene star (also Kleene operator or Kleene closure) of an alphabet Sigma is denoted by Sigma^*. It is the set of all strings that can be generated by concatenating arbitrarily many characters of Sigma, and it contains infinitely many strings.

Sigma^* := union.big_(n>=0) A_n, where A_n is the set of all strings of length n over Sigma (in particular A_0 = {epsilon}).

Remarks

  • The same character can be used multiple times.
  • The empty string epsilon is also part of Sigma^*.

[!Example] For Sigma = {a, b}: Sigma^* = {epsilon, a, b, "aa", "ab", "ba", "bb", "aaa", "aab", ...}

[!FACT]

  • The set Sigma^* is infinite, since we defined Sigma to be non-empty.
  • It is countably infinite, i.e. it has the same cardinality as the set NN of natural numbers.
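Since Sigma^* is infinite, it cannot be stored as a whole, but it can be enumerated length by length, following the union over the sets A_n. The generator below is a minimal Python sketch; the name `kleene_star` is illustrative, and the alphabet is passed as a list so that the output order is fixed. It assumes a non-empty alphabet, as in the definition.

```python
from itertools import count, islice, product
from typing import Iterator


def kleene_star(sigma: list[str]) -> Iterator[str]:
    """Yield the strings of Sigma^* ordered by length, then by the order of sigma."""
    for n in count(0):  # n = 0, 1, 2, ...
        for letters in product(sigma, repeat=n):  # all strings of length n
            yield "".join(letters)


print(list(islice(kleene_star(["a", "b"]), 9)))
# ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb', 'aaa', 'aab']
```

The enumeration also illustrates countability: every string appears at exactly one finite position in the sequence, which gives a one-to-one correspondence with the natural numbers.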

Kleene Plus

The Kleene Plus of an alphabet Sigma is given by Sigma^+ = Sigma^* backslash {epsilon}

Lemma group structure

The structure (Sigma^*, circle.small) induced by the Kleene star is a monoid, that is, a semigroup with a neutral element.

[!PROOF]

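  • Associativity of circle.small has been shown in the section on concatenation.
  • Existence of a neutral element has been shown: epsilon is in Sigma^* and x circle.small epsilon = x = epsilon circle.small x.
  • Closure under circle.small: Let x in Sigma^* and y in Sigma^* be two strings over the alphabet Sigma. Then x circle.small y = x y is again a finite sequence of characters from Sigma, so x y in Sigma^*.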
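The three monoid properties can be spot-checked on a finite sample of Sigma^*. The Python sketch below tests them on all strings of length at most 2 over {a, b}; the function name `monoid_axioms_hold` is hypothetical. A finite check like this is only a sanity test and does not replace the proof.

```python
from itertools import product


def monoid_axioms_hold(strings: list[str], sigma: set[str]) -> bool:
    """Check closure, associativity and the neutral element on a finite sample."""
    epsilon = ""
    # Closure: every concatenation uses only characters from sigma.
    closure = all(set(x + y) <= sigma for x in strings for y in strings)
    # Associativity: brackets do not matter.
    associative = all(
        (x + y) + z == x + (y + z)
        for x in strings for y in strings for z in strings
    )
    # Neutral element: epsilon changes nothing on either side.
    neutral = all(x + epsilon == x == epsilon + x for x in strings)
    return closure and associative and neutral


sigma = {"a", "b"}
# All strings over sigma of length at most 2: a finite sample of Sigma^*.
sample = ["".join(t) for n in range(3) for t in product(sorted(sigma), repeat=n)]
print(monoid_axioms_hold(sample, sigma))  # True
```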

Formal Languages

A formal language over the alphabet Sigma is a subset L of Sigma^*.

Finite representation of languages

Goal: Represent a language using finite information

Using set notation

S = {a^n b^m bar n, m >= 0} = {epsilon, a, b, "aa", "ab", ...}

This is very inefficient.
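A finite representation can also take the form of a membership test: a short rule that decides for any string whether it belongs to the language. The following Python sketch does this for S = {a^n b^m | n, m >= 0}; the function name `in_S` is made up for illustration.

```python
def in_S(w: str) -> bool:
    """Return True if w = a^n b^m for some n, m >= 0."""
    n = len(w) - len(w.lstrip("a"))  # number of leading a's
    rest = w[n:]                     # whatever follows the leading a's
    return all(ch == "b" for ch in rest)


print([w for w in ["", "a", "b", "aa", "ab", "ba", "aabbb", "aba"] if in_S(w)])
# ['', 'a', 'b', 'aa', 'ab', 'aabbb']
```

The function is a finite description of an infinite set: it never lists the elements of S, it only checks the defining pattern.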

Using regular expressions

A regular expression r over an alphabet Sigma is defined recursively:

  • emptyset, epsilon and each a in Sigma are regular expressions, which represent the languages L(emptyset) = emptyset, L(epsilon) = {epsilon} and L(a) = {a}
  • If r and s are regular expressions then
    • (r+s) - !! Complete from lecture notes

Regular languages

A language L that can be described by a regular expression r (i.e. L(r) = L) is called regular.
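As an illustration, the language S = {a^n b^m | n, m >= 0} from the set-notation example can be described by the expression a*b*, so it is regular. This assumes the star operator from the standard definition of regular expressions, which these notes have yet to complete. The sketch below uses Python's `re` module, whose syntax differs from the lecture notation (for example, it writes `|` where the notes write `+`) and which supports features beyond regular languages, so it is only an approximation of the formal concept. The names `PATTERN` and `in_language` are illustrative.

```python
import re

# a*b* : any number of a's followed by any number of b's.
PATTERN = re.compile(r"a*b*")


def in_language(w: str) -> bool:
    """Return True if the whole string w matches a*b*."""
    return PATTERN.fullmatch(w) is not None


print(in_language("aabbb"))  # True
print(in_language("ba"))     # False
```

`fullmatch` is used instead of `match` or `search` because membership in L(r) concerns the entire string, not just a part of it.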