
4.6 Proving Nonregularity


In this section, we get a first glimpse at proving impossibility results. In particular, we will discuss methods for proving that a given language $L$ is nonregular, i.e. that there is no deterministic finite automaton that can recognize $L$.

We begin with the fooling set technique, and then we show how to use closure properties to leverage the fact that some other language is already known to be nonregular.

4.6.1 Technique 1: Fooling Set Technique

At an intuitive level, to prove a language $L$ is not regular, we need to exploit the fact that a finite automaton has a fixed amount of memory, independent of the length of the input string. The fooling set technique allows us to prove lower bounds on the amount of memory that is needed to recognize $L$.

In more detail, the technique lets us prove lower bounds on the number of states a DFA needs to recognize $L$, i.e. it allows us to prove statements of the form “no DFA with fewer than 5 states can recognize $L$”. Since a DFA has a fixed number of states, to show that no DFA can recognize $L$, we will prove that for every non-negative integer $k$, no DFA with fewer than $k$ states can recognize $L$.

What does it mean to prove nonregularity?

Recall that a language $L$ is regular if and only if there is a DFA $M$ that recognizes it.

Recall also that a DFA $M$ recognizes a language $L$ if and only if for every input $x$:

  1. if $x \in L$, then $M$ accepts $x$, and

  2. if $x \notin L$, then $M$ rejects $x$.

Thus, we get the following definition of nonrecognition.

This then leads to the following definition of nonregularity.

Memorylessness of DFAs, formalised

Let $M$ be a DFA. Recall the definition of the state transition function $\delta$: given the current state $q$ and the next input symbol $a$, the DFA $M$ moves to state $\delta(q,a)$; that is, the transition function $\delta$ tells us the next state given the current state and the next input symbol. Recall also that the resulting state of an input string $w$ is the state that $M$ ends up in after processing $w$, and is denoted by $q(w)$.

We now define the extended transition function $\delta^*$, which tells us the resulting state after processing a string starting from a given state. See the following for a formal definition.
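
For reference, the standard recursive definition of $\delta^*$ (consistent with the identity in footnote 3) is

$$
\delta^*(q, \varepsilon) = q, \qquad \delta^*(q, xa) = \delta(\delta^*(q, x), a),
$$

for every state $q$, string $x$, and input symbol $a$, where $\varepsilon$ denotes the empty string.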

Next, we obtain some important consequences of the definition of $\delta^*$. Let $w$ be an input string. It immediately follows from the definition that the resulting state $q(w)$ is equal to $\delta^*(q_0,w)$.

The below observation also immediately follows from the definition but will be important later on.

Looking at the above examples more closely, we see a pattern: if two strings $x$ and $y$ have the same resulting state, then appending the same string $z$ to both $x$ and $y$ yields strings $xz$ and $yz$ with the same resulting state. Moreover, it does not matter what $z$ is, i.e. the statement holds true for every $z$.

In fact, this is not a coincidence and holds for all DFAs. We state it more formally in the following lemma.[7]
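
Informally, the lemma says exactly what the pattern above suggests: if $\delta^*(q_0, x) = \delta^*(q_0, y)$, then $\delta^*(q_0, xz) = \delta^*(q_0, yz)$ for every string $z$. As a quick sanity check, here is a small Python sketch (our own illustration, not part of the notes) of a two-state DFA that tracks the parity of the number of 1s; it implements $\delta^*$ directly and verifies the lemma’s conclusion on a few concrete strings.

```python
# A small DFA over {0, 1} that tracks the parity of the number of 1s seen so far.
# (Illustrative example only; not the DFA used in the notes.)
delta = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}
start = "even"

def delta_star(q, w):
    """Extended transition function: the state reached from q after reading w."""
    for a in w:
        q = delta[(q, a)]
    return q

# x and y have the same resulting state (both contain two 1s) ...
x, y = "11", "0101"
assert delta_star(start, x) == delta_star(start, y)

# ... so appending any z keeps their resulting states equal, as the lemma predicts.
for z in ["", "1", "10", "111", "0010"]:
    assert delta_star(start, x + z) == delta_star(start, y + z)
print("Lemma 1 verified on these examples.")
```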

Warm up: Ruling out single-state DFAs

Lemma 1 is central to the fooling set technique. Before developing the technique further, let’s see how analyzing resulting states is useful. In particular, we will prove the following characterization of the languages recognized by DFAs with a single state. The characterization allows us to rule out single-state DFAs for some languages $L$: no DFA with 1 state can recognize $L$ if $L$ is not the empty language or the language of all strings.

Distinguishable pairs and distinguishing suffixes

Lemma 1 captures the essence of the memorylessness of finite automata. To use it to show that certain DFAs cannot recognize a language $L$, we will need the following definition.
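
In symbols (restating the standard definition, consistent with how the terms are used below): strings $x$ and $y$ are distinguishable with respect to $L$ if there exists a string $z$ such that

$$
(xz \in L \text{ and } yz \notin L) \quad \text{or} \quad (xz \notin L \text{ and } yz \in L),
$$

and any such $z$ is called a distinguishing suffix for $x$ and $y$.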

We also sometimes say that $x$ and $y$ are distinguishable with respect to $L$, and when the language $L$ is clear from context, we also say $x$ and $y$ are distinguishable.

Let us take a look at some examples and exercises to consolidate our understanding of the definition.

Examples of distinguishable pairs and distinguishing suffixes
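
As one concrete illustration (an example of our own choosing), consider $L = \{0^n 1^n : n \ge 0\}$. The strings $x = 0$ and $y = 00$ are distinguished by the suffix $z = 1$, since $xz = 01 \in L$ but $yz = 001 \notin L$. A quick Python check:

```python
import re

def in_L(w):
    """Membership test for the example language L = { 0^n 1^n : n >= 0 }."""
    m = re.fullmatch(r"(0*)(1*)", w)
    return m is not None and len(m.group(1)) == len(m.group(2))

x, y, z = "0", "00", "1"
print(in_L(x + z), in_L(y + z))  # True False, so z distinguishes x and y w.r.t. L
```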

Distinguishable strings have different resulting states

Next, we connect the notion of distinguishable strings and Lemma 1 to show that every DFA that recognizes $L$ must satisfy a certain condition.
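
Concretely, the condition is the one announced in the heading above: distinguishable strings must have different resulting states. Informally restated, for every DFA $M$ (with start state $q_0$) that recognizes $L$,

$$
x \text{ and } y \text{ are distinguishable with respect to } L \;\Longrightarrow\; \delta^*(q_0, x) \neq \delta^*(q_0, y).
$$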

An immediate consequence of Lemma 3 is that any language with at least one distinguishable pair (such as the ones in the example and exercises above) cannot be recognized by a single-state DFA. Indeed, there are only 2 languages without any distinguishable pair: the empty language and the language of all strings. Observe that this gives a different proof of the characterization of the languages accepted by single-state DFAs (Lemma 2).

Fooling sets

Next, we show how to use Lemma 3 to rule out not just single-state DFAs, but DFAs with any fixed number of states.
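
For reference, the two parts of the Fooling Set Lemma, restated informally here (the second part is also reflected in footnote 10):

  1. If $F$ is a set of $k$ strings that are pairwise distinguishable with respect to $L$ (a fooling set of size $k$), then every DFA that recognizes $L$ has at least $k$ states.

  2. If $L$ has fooling sets of arbitrarily large size, then $L$ is not regular.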

More explicit proofs of Fooling Set Lemma

Here’s an alternative proof of the first part of the lemma that is more explicit. See also the last page of the Week 8 Lecture 2 slides for a visual illustration of the argument for $k = 3$.

Next, we give a more direct proof of the second part of the Fooling Set Lemma, without relying on the first part.

Examples of fooling sets
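
As one illustration (an example of our own choosing), take $L = \{0^n 1^n : n \ge 0\}$ and $F_k = \{0^i : 0 \le i < k\}$. For $i < j$, the suffix $z = 1^i$ distinguishes $0^i$ from $0^j$, since $0^i 1^i \in L$ but $0^j 1^i \notin L$. So $F_k$ is a fooling set of size $k$, and since this works for every $k$, $L$ is not regular. A brute-force check in Python for a small $k$:

```python
from itertools import combinations

def in_L(w):
    """Membership test for the example language L = { 0^n 1^n : n >= 0 }."""
    n = len(w) // 2
    return len(w) % 2 == 0 and w == "0" * n + "1" * n

def distinguishes(z, x, y):
    """True if the suffix z distinguishes x and y with respect to L."""
    return in_L(x + z) != in_L(y + z)

k = 5
F = ["0" * i for i in range(k)]  # F_k = { epsilon, 0, 00, ..., 0^(k-1) }
for x, y in combinations(F, 2):  # here len(x) < len(y)
    z = "1" * len(x)             # the suffix 1^i works for x = 0^i
    assert distinguishes(z, x, y), (x, y, z)
print(f"F_{k} is a fooling set of size {k} for L.")
```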

Strategies for constructing fooling sets

There is no sure-fire way of constructing fooling sets for a language $L$. Here are some general heuristics to try. Each of these was used for the examples and exercises above.

  1. Construct the fooling set using prefixes of strings in $L$.

  2. To construct $F_k$, consider strings for which it seems a counter that can count up to at least $k$ is needed to distinguish between them.

  3. Often, it is possible to construct a fooling set $F_k$ such that for every string $x \in F_k$, there is a string $z$ such that $z$ distinguishes $x$ from the other strings in $F_k$, i.e. either $xz$ is in $L$ but $yz$ is not in $L$ for every other $y$ in $F_k$, or vice versa.

More examples of fooling sets

Fooling sets and minimization

The fooling set technique can also be used to prove that a DFA $M$ is minimal for a language $L$: if there exists a fooling set $F$ for $L$ of size exactly equal to the number of states of $M$, then $M$ is minimal, since every DFA recognizing $L$ needs at least $|F|$ states. This is interesting as the fooling set $F$ is a witness to the minimality of $M$.
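
For a small worked example (our own, continuing the parity DFA sketched earlier): let $L$ be the language of binary strings with an even number of 1s. The set $F = \{\varepsilon, 1\}$ is a fooling set of size 2, since the empty suffix $z = \varepsilon$ already distinguishes its two elements ($\varepsilon \in L$ but $1 \notin L$). The two-state parity DFA recognizes $L$, so by the Fooling Set Lemma no DFA with fewer states can, and the parity DFA is minimal.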

4.6.2 Technique 2: Closure Properties

Now that we have shown several languages to be nonregular, we switch to a different technique that can often result in a shorter proof but can be trickier to apply.

Recall that regular languages are closed under several operations such as:

  1. complement

  2. intersection

These operations let us take regular languages and create new regular languages:

  1. if $L$ is regular, then $L^c$ is also regular.

  2. if $A$ and $B$ are regular, then so is $A \cap B$.

We can also use them to prove nonregularity as follows:

  1. if $L^c$ is not regular, then $L$ cannot be regular.

  2. if $A$ is regular and $A \cap B$ is not regular, then $B$ cannot be regular.

Example applications
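
One typical application (an example of our own choosing): let $B$ be the language of binary strings containing equally many 0s and 1s, and let $A = L(0^* 1^*)$, which is regular. Then

$$
A \cap B = \{\, 0^n 1^n : n \ge 0 \,\},
$$

which is nonregular by the fooling set argument above. If $B$ were regular, then $A \cap B$ would be regular by closure under intersection, a contradiction; hence $B$ is not regular.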

Other operations

In general, one can use any operation that the regular languages are closed under. Other operations include:

  1. union

  2. concatenation

  3. Kleene star (aka Kleene closure)

4.6.3 Fooling sets vs closure properties

Here are the benefits and drawbacks to the two techniques:

  1. If a language $L$ is not regular, one can always prove its nonregularity using fooling sets.[10] This is not always the case for the closure property technique.[11]

  2. Even if a language can be proved nonregular using closure properties, it can be quite tricky to find the right languages and operations.

  3. Proofs via closure properties tend to be much shorter.

One can also combine the two. For example, suppose you want to show $B$ is nonregular but are having difficulty finding fooling sets for $B$. Then, you can try to find a regular language $A$ such that it is easier to find fooling sets for $A \cap B$. In other words, closure under intersection lets you reduce the task of finding fooling sets for $B$ to finding fooling sets for $A \cap B$. More generally, it lets you reduce the task of proving nonregularity of $B$ to proving nonregularity of $A \cap B$.

Footnotes
  1. One main difference between these and my lecture notes is that I have tried to avoid proof by contradiction as much as possible as students who have not encountered them before find them very confusing, and I have tried to minimize use of mathematical notation and jargon.

  2. Chandra also discusses the Pumping Lemma approach and the differences between the two approaches.

  3. For the mathematically inclined, we can express the second case more succinctly as $\delta^*(q,w) = \delta(\delta^*(q,x),a)$.

  4. For a more concrete example, think back to the tally counter from Section 4.1.1. The counter does not remember how many times it has been reset. Pressing the increment button 5 times when it is in state 0000 always results in 0005.

  5. If you are familiar with Markov chains, this should remind you of a similar property of Markov chains.

  6. For the mathematically inclined, we can express this more succinctly as $\delta^*(q_0,w) = \delta^*(\delta^*(q_0,x),y)$.

  7. The name of the lemma is something I came up with, not a standard name used by others.

  8. The pair can depend on the exact specification of $M$.

  9. What is a tacocat? It’s a 🐈 in a 🌮, of course! Here’s mine.

  10. In fact, the Myhill-Nerode Theorem says that $L$ is not regular if and only if for every $k$, there is a fooling set of size at least $k$.

  11. One might say the fooling set technique is fool-proof.