Operations on Languages - Models of Computation

When designing an algorithm for a problem, it is often helpful to decompose the problem into simpler sub-problems. We can do something similar when designing algorithms for languages as well. In particular, we will now consider operations on languages that allow us to combine languages, similar to how propositional connectives ( $\neg, \vee, \wedge$ ) let us combine propositions.

These operations are:

complement
intersection
union
concatenation
Kleene star (aka Kleene closure)

The last three (union, concatenation and Kleene star) are called regular operations.

Given an alphabet $\Sigma$, we define $\Sigma^*$ to be the set of all finite strings over $\Sigma$, including the empty string $\epsilon$. For example, when $\Sigma = \{0,1\}$, then $\Sigma^* = \{\epsilon, 0, 1, 00, 01, 10, 11, \ldots \}$. A language $L$ over the alphabet $\Sigma$ is a subset of $\Sigma^*$.
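As a small illustration of the definition, the following sketch lists the strings of $\Sigma^*$ in order of increasing length. The function name `sigma_star` is an assumption made for this example, and Python's empty string `""` stands in for $\epsilon$.

```python
from itertools import count, islice, product

def sigma_star(alphabet):
    """Yield every finite string over `alphabet`, shortest strings first."""
    for length in count(0):  # lengths 0, 1, 2, ...
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)

# The generator never terminates on its own, so take only a finite prefix.
first_seven = list(islice(sigma_star("01"), 7))
print(first_seven)  # ['', '0', '1', '00', '01', '10', '11']
```

Because $\Sigma^*$ is infinite, the generator is only ever consumed up to a finite prefix; every string nevertheless appears at some finite position.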

4.2.1 Concatenation

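Given two strings $x$ and $y$, we denote their concatenation by $xy$. Given two languages $A$ and $B$, their concatenation $A \circ B$ is the set of all strings $xy$ with $x \in A$ and $y \in B$.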

Alternative Definition Attempt 1

Let’s first try to design an algorithm that constructs $A \circ B$ . To be more precise, we want an algorithm that prints the strings in $A \circ B$ , one by one.

If $A$ and $B$ are finite, it is easy to do it using the following algorithm:

for x in A:
  for y in B:
    print(x + y)

Note that the pseudocode uses the Python notation for string concatenation.

What about when $A \circ B$ is infinite (e.g. if $B = \Sigma^*$ )? In this case, we say that the algorithm succeeds if every string in $A \circ B$ is eventually printed by the algorithm.

Taking a closer look, we realize that when $B$ is finite, the algorithm succeeds even when $A$ is infinite. Unfortunately, it is unclear at this point how to modify the algorithm to be successful if both $A$ and $B$ are infinite. We leave this as an advanced exercise for now.
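To see why a finite $B$ is enough, here is a minimal sketch of the same nested loops with an infinite $A$. The particular languages (all strings of 0s for $A$, and $\{1, 11\}$ for $B$) and the function names are assumptions chosen for illustration; the loops yield strings instead of printing them so that a finite prefix can be inspected.

```python
from itertools import count, islice

def enumerate_A():
    """Hypothetical infinite language A: all strings consisting only of 0s."""
    for n in count(0):
        yield "0" * n

B = ["1", "11"]  # a finite language

def concat_outer_infinite(gen_A, finite_B):
    """The nested loops from the text, yielding instead of printing."""
    for x in gen_A:
        for y in finite_B:  # always terminates because B is finite
            yield x + y

first_six = list(islice(concat_outer_infinite(enumerate_A(), B), 6))
print(first_six)  # ['1', '11', '01', '011', '001', '0011']
```

Each $x \in A$ is reached after finitely many steps because the inner loop always finishes. If the roles were swapped, with the infinite language in the inner loop, the outer loop would never move past its first string.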

Exercise 1 (Advanced exercise)

Given a language $L$, an algorithm $M$ is said to enumerate $L$ if $M$ prints only strings in $L$ and every string in $L$ is eventually printed by $M$.

Suppose that there are algorithms $M_A$ and $M_B$ that enumerate $A$ and $B$, respectively. Design an algorithm that enumerates $A \circ B$. Your algorithm should work even when both $A$ and $B$ are infinite.

Alternative Definition Attempt 2

Let’s instead try to build on cases of $A \circ B$ that are easier to understand.

Suppose $A$ consists of a single string $a$, i.e. $A = \{a\}$. Then $A \circ B$ is easy to describe: it is the set of strings obtained by prepending $a$ to a string in $B$.

We can then define $A \circ B$ iteratively as follows: $A \circ B$ is the union of the sets $\{a\} \circ B$ over all $a \in A$ .^[1]
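For finite languages, this definition translates almost directly into code. The sketch below is illustrative only: the names `concat_single` and `concat` and the example sets are assumptions, and Python sets stand in for languages.

```python
def concat_single(a, B):
    """The language {a} concatenated with B: prepend a to every string in B."""
    return {a + b for b in B}

def concat(A, B):
    """A concatenated with B, built as the union of {a} concatenated with B
    over all a in A. Works for finite sets only."""
    result = set()
    for a in A:
        result |= concat_single(a, B)
    return result

A = {"0", "01"}
B = {"", "1"}
print(sorted(concat(A, B)))  # ['0', '01', '011']
```

Note that the result has three strings rather than four: the string 01 arises both as 0 followed by 1 and as 01 followed by the empty string, and a set keeps only one copy.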

4.2.2 Kleene star

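The Kleene star $L^*$ of a language $L$ is the set of strings obtained by concatenating zero or more strings from $L$. Consider the following algorithm that, roughly speaking, generates $L^*$, given a subroutine "concat" that computes the concatenation of two languages: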

k = 0
initialise Lstar to contain only the empty string
while(true):
  initialise Lk to contain only the empty string
  for j = 1 up to k:
    Lk = concat(Lk, L)
  add every string in Lk to Lstar
  k = k + 1
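Since $L^*$ is infinite whenever $L$ contains a non-empty string, a runnable version has to stop after finitely many rounds. The following sketch computes the union of $L^0, L^1, \ldots, L^k$ for a chosen bound; the function name `kleene_star_up_to` and the bound parameter `max_k` are assumptions introduced for this example.

```python
def concat(A, B):
    """Concatenation of two finite languages, represented as Python sets."""
    return {a + b for a in A for b in B}

def kleene_star_up_to(L, max_k):
    """Union of L^0, L^1, ..., L^max_k: a finite approximation of L*."""
    Lstar = {""}  # L^0 contains only the empty string
    Lk = {""}
    for _ in range(max_k):
        Lk = concat(Lk, L)  # Lk is now the next power of L
        Lstar |= Lk
    return Lstar

approx = kleene_star_up_to({"a", "bb"}, 2)
print(sorted(approx, key=lambda s: (len(s), s)))
# ['', 'a', 'aa', 'bb', 'abb', 'bba', 'bbbb']
```

Increasing `max_k` only ever adds strings, so every string of $L^*$ appears once the bound is large enough.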

Closure

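The claim to prove is that the class of languages recognised by finite automata is closed under the operations above: if $L$, $A$ and $B$ have finite automata, then so do $L^c$, $A \cap B$, $A \cup B$, $A \circ B$ and $L^*$. The proof follows by taking the finite automata for $L$, $A$ and $B$, and modifying them appropriately.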

For complement, this is easy: let $M$ be a deterministic finite automaton for $L$. Then a finite automaton $N$ for $L^c$ is obtained by swapping the accepting and non-accepting states of $M$. The other operations seem more difficult. We will need the power of non-determinism to deal with them.
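The complement construction can be sketched in a few lines. The `DFA` class below is a hypothetical representation chosen for this example (a total transition table stored in a dictionary), not a standard library type, and the example automaton, which accepts binary strings with an even number of 1s, is likewise an assumption.

```python
from dataclasses import dataclass

@dataclass
class DFA:
    states: frozenset
    delta: dict          # maps (state, symbol) to a state; assumed total
    start: str
    accepting: frozenset

    def accepts(self, w):
        """Run the automaton on the string w and report acceptance."""
        q = self.start
        for symbol in w:
            q = self.delta[(q, symbol)]
        return q in self.accepting

def complement(M):
    """Same states and transitions; accepting and non-accepting states swapped."""
    return DFA(M.states, M.delta, M.start, M.states - M.accepting)

# Example automaton: accepts strings over {0, 1} with an even number of 1s.
M = DFA(
    states=frozenset({"even", "odd"}),
    delta={("even", "0"): "even", ("even", "1"): "odd",
           ("odd", "0"): "odd", ("odd", "1"): "even"},
    start="even",
    accepting=frozenset({"even"}),
)
N = complement(M)
print(M.accepts("0110"), N.accepts("0110"))  # True False
```

The swap relies on the automaton being deterministic with a transition defined for every state and symbol, so that each input string ends in exactly one state.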

Footnotes

[1] If you are comfortable with set notation, $A \circ B = \bigcup_{a \in A} (\{a\} \circ B)$.
