
4.6 Proving Nonregularity


In this section, we get a first glimpse at proving impossibility results. In particular, we will discuss methods for proving that a given language $L$ is nonregular, i.e. that there is no deterministic finite automaton that can recognize $L$.

We begin with the fooling set technique, and then we show how to use closure properties to leverage the fact that some other language is already known to be nonregular.

4.6.1 Technique 1: Fooling Set Technique

At an intuitive level, to prove a language $L$ is not regular, we need to exploit the fact that a finite automaton has a fixed amount of memory, independent of the length of the input string. The fooling set technique allows us to prove lower bounds on the amount of memory that is needed to recognize $L$.

In more detail, the technique lets us prove lower bounds on the number of states a DFA needs to recognize $L$, i.e. it allows us to prove statements of the form “no DFA with fewer than 5 states can recognize $L$”. Since a DFA has a fixed number of states, to show that no DFA can recognize $L$, we will prove that for every non-negative integer $k$, no DFA with fewer than $k$ states can recognize $L$.

What does it mean to prove nonregularity?

Recall that a language $L$ is regular if and only if there is a DFA $M$ that recognizes it.

Recall also that a DFA $M$ recognizes a language $L$ if and only if for every input $x$:

  1. if $x \in L$, then $M$ accepts $x$, and

  2. if $x \notin L$, then $M$ rejects $x$.

Thus, we get the following definition of nonrecognition.

This then leads to the following definition of nonregularity.

Memorylessness of DFAs, formalised

Let $M$ be a DFA. Recall the definition of the state transition function $\delta$: given the current state $q$ and the next input symbol $a$, the DFA $M$ moves to state $\delta(q,a)$; that is, the transition function $\delta$ tells us the next state given the current state and the next input symbol. Recall also that the resulting state of an input string $w$ is the state that $M$ ends up in after processing $w$, and is denoted by $q(w)$.

We now define the extended transition function $\delta^*$, which tells us the resulting state after processing a string starting from a given state. See the following for a formal definition.
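
For reference, the standard recursive definition of $\delta^*$ (consistent with the identity in footnote 3) is

$$
\delta^*(q, \varepsilon) = q, \qquad \delta^*(q, xa) = \delta(\delta^*(q, x), a),
$$

for every state $q$, string $x$, and input symbol $a$, where $\varepsilon$ denotes the empty string.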

Next, we obtain some important consequences of the definition of $\delta^*$. Let $w$ be an input string. It immediately follows from the definition that the resulting state $q(w)$ is equal to $\delta^*(q_0,w)$.

The below observation also immediately follows from the definition but will be important later on.

Looking at the above examples more closely, we see a pattern: if two strings $x$ and $y$ have the same resulting state, then appending the same string $z$ to both $x$ and $y$ yields strings $xz$ and $yz$ with the same resulting state. Moreover, it does not matter what $z$ is, i.e. the statement holds true for every $z$.

In fact, this is not a coincidence and holds for all DFAs. We state it more formally in the following lemma.[7]
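
Informally, the lemma says exactly what the pattern above suggests: if $\delta^*(q_0, x) = \delta^*(q_0, y)$, then $\delta^*(q_0, xz) = \delta^*(q_0, yz)$ for every string $z$. As a quick sanity check, here is a small Python sketch (our own illustration, not part of the notes) of a two-state DFA that tracks the parity of the number of 1s; it implements $\delta^*$ directly and verifies the lemma’s conclusion on a few concrete strings.

```python
# A small DFA over {0, 1} that tracks the parity of the number of 1s seen so far.
# (Illustrative example only; not the DFA used in the notes.)
delta = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}
start = "even"

def delta_star(q, w):
    """Extended transition function: the state reached from q after reading w."""
    for a in w:
        q = delta[(q, a)]
    return q

# x and y have the same resulting state (both contain two 1s) ...
x, y = "11", "0101"
assert delta_star(start, x) == delta_star(start, y)

# ... so appending any z keeps their resulting states equal, as the lemma predicts.
for z in ["", "1", "10", "111", "0010"]:
    assert delta_star(start, x + z) == delta_star(start, y + z)
print("Lemma 1 verified on these examples.")
```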

Warm up: Ruling out single-state DFAs

Lemma 1 is central to the fooling set technique. Before developing the technique further, let’s see how analyzing resulting states is useful. In particular, we will prove the following characterization of the languages recognized by DFAs with a single state. The characterization allows us to rule out single-state DFAs for some languages $L$: no DFA with 1 state can recognize $L$ if $L$ is not the empty language or the language of all strings.

Distinguishable pairs and distinguishing suffixes

Lemma 1 captures the essence of the memorylessness of finite automata. To use it to show that certain DFAs cannot recognize a language $L$, we will need the following definition.
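
In symbols (restating the standard definition, consistent with how the terms are used below): strings $x$ and $y$ are distinguishable with respect to $L$ if there exists a string $z$ such that

$$
(xz \in L \text{ and } yz \notin L) \quad \text{or} \quad (xz \notin L \text{ and } yz \in L),
$$

and any such $z$ is called a distinguishing suffix for $x$ and $y$.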

We also sometimes say that $x$ and $y$ are distinguishable with respect to $L$, and when the language $L$ is clear from context, we also say $x$ and $y$ are distinguishable.

Let us take a look at some examples and exercises to consolidate our understanding of the definition.

Examples of distinguishable pairs and distinguishing suffixes
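
As one concrete illustration (an example of our own choosing), consider $L = \{0^n 1^n : n \ge 0\}$. The strings $x = 0$ and $y = 00$ are distinguished by the suffix $z = 1$, since $xz = 01 \in L$ but $yz = 001 \notin L$. A quick Python check:

```python
import re

def in_L(w):
    """Membership test for the example language L = { 0^n 1^n : n >= 0 }."""
    m = re.fullmatch(r"(0*)(1*)", w)
    return m is not None and len(m.group(1)) == len(m.group(2))

x, y, z = "0", "00", "1"
print(in_L(x + z), in_L(y + z))  # True False, so z distinguishes x and y w.r.t. L
```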

Distinguishable strings have different resulting states

Next, we connect the notion of distinguishable strings and Lemma 1 to show that every DFA that recognizes $L$ must satisfy a certain condition.
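
Concretely, the condition is the one announced in the heading above: distinguishable strings must have different resulting states. Informally restated, for every DFA $M$ (with start state $q_0$) that recognizes $L$,

$$
x \text{ and } y \text{ are distinguishable with respect to } L \;\Longrightarrow\; \delta^*(q_0, x) \neq \delta^*(q_0, y).
$$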

An immediate consequence of Lemma 3 is that any language with at least one distinguishable pair (such as the ones in the example and exercises above) cannot be recognized by a single-state DFA. Indeed, there are only 2 languages without any distinguishable pair: the empty language and the language of all strings. Observe that this gives a different proof of the characterization of the languages accepted by single-state DFAs (Lemma 2).

Fooling sets

Next, we show how to use Lemma 3 to rule out not just single-state DFAs, but DFAs with any fixed number of states.
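
For reference, the two parts of the Fooling Set Lemma, restated informally here (the second part is also reflected in footnote 10):

  1. If $F$ is a set of $k$ strings that are pairwise distinguishable with respect to $L$ (a fooling set of size $k$), then every DFA that recognizes $L$ has at least $k$ states.

  2. If $L$ has fooling sets of arbitrarily large size, then $L$ is not regular.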

More explicit proofs of Fooling Set Lemma

Here’s an alternative proof of the first part of the lemma that is more explicit. See also the last page of the Week 8 Lecture 2 slides for a visual illustration of the argument for $k = 3$.

Next, we give a more direct proof of the second part of the Fooling Set Lemma, without relying on the first part.

Examples of fooling sets
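
As one illustration (an example of our own choosing), take $L = \{0^n 1^n : n \ge 0\}$ and $F_k = \{0^i : 0 \le i < k\}$. For $i < j$, the suffix $z = 1^i$ distinguishes $0^i$ from $0^j$, since $0^i 1^i \in L$ but $0^j 1^i \notin L$. So $F_k$ is a fooling set of size $k$, and since this works for every $k$, $L$ is not regular. A brute-force check in Python for a small $k$:

```python
from itertools import combinations

def in_L(w):
    """Membership test for the example language L = { 0^n 1^n : n >= 0 }."""
    n = len(w) // 2
    return len(w) % 2 == 0 and w == "0" * n + "1" * n

def distinguishes(z, x, y):
    """True if the suffix z distinguishes x and y with respect to L."""
    return in_L(x + z) != in_L(y + z)

k = 5
F = ["0" * i for i in range(k)]  # F_k = { epsilon, 0, 00, ..., 0^(k-1) }
for x, y in combinations(F, 2):  # here len(x) < len(y)
    z = "1" * len(x)             # the suffix 1^i works for x = 0^i
    assert distinguishes(z, x, y), (x, y, z)
print(f"F_{k} is a fooling set of size {k} for L.")
```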

Strategies for constructing fooling sets

There is no sure-fire way of constructing fooling sets for a language $L$. Here are some general heuristics to try. Each of these was used for the examples and exercises above.

  1. Construct the fooling set using prefixes of strings in $L$.

  2. To construct $F_k$, consider strings for which it seems a counter that can count up to at least $k$ is needed to distinguish between them.

  3. Often, it is possible to construct a fooling set $F_k$ such that for every string $x \in F_k$, there is a string $z$ such that $z$ distinguishes $x$ from the other strings in $F_k$, i.e. either $xz$ is in $L$ but $yz$ is not in $L$ for every other $y$ in $F_k$, or vice versa.

More examples of fooling sets

Fooling sets and minimization

The fooling set technique can also be used to prove that a DFA $M$ is minimal for a language $L$: if there exists a fooling set $F$ for $L$ of size exactly equal to the number of states of $M$, then $M$ is minimal, since every DFA recognizing $L$ needs at least $|F|$ states. This is interesting as the fooling set $F$ is a witness to the minimality of $M$.
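
For a small worked example (our own, continuing the parity DFA sketched earlier): let $L$ be the language of binary strings with an even number of 1s. The set $F = \{\varepsilon, 1\}$ is a fooling set of size 2, since the empty suffix $z = \varepsilon$ already distinguishes its two elements ($\varepsilon \in L$ but $1 \notin L$). The two-state parity DFA recognizes $L$, so by the Fooling Set Lemma no DFA with fewer states can, and the parity DFA is minimal.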

4.6.2 Technique 2: Closure Properties

Now that we have shown several languages to be nonregular, we switch to a different technique that can often result in a shorter proof but can be trickier to apply.

Recall that regular languages are closed under several operations such as:

  1. complement

  2. intersection

These operations let us take regular languages and create new regular languages:

  1. if $L$ is regular, then $L^c$ is also regular.

  2. if $A$ and $B$ are regular, then so is $A \cap B$.

We can also use them to prove nonregularity as follows:

  1. if $L^c$ is not regular, then $L$ cannot be regular.

  2. if $A$ is regular and $A \cap B$ is not regular, then $B$ cannot be regular.

Example applications
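
One typical application (an example of our own choosing): let $B$ be the language of binary strings containing equally many 0s and 1s, and let $A = L(0^* 1^*)$, which is regular. Then

$$
A \cap B = \{\, 0^n 1^n : n \ge 0 \,\},
$$

which is nonregular by the fooling set argument above. If $B$ were regular, then $A \cap B$ would be regular by closure under intersection, a contradiction; hence $B$ is not regular.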

Other operations

In general, one can use any operation that the regular languages are closed under. Other operations include:

  1. union

  2. concatenation

  3. Kleene star (aka Kleene closure)

4.6.3 Fooling sets vs closure properties

Here are the benefits and drawbacks to the two techniques:

  1. If a language $L$ is not regular, one can always prove its nonregularity using fooling sets.[10] This is not always the case for the closure property technique.[11]

  2. Even if a language can be proved nonregular using closure properties, it can be quite tricky to find the right languages and operations.

  3. Proofs via closure properties tend to be much shorter.

One can also combine the two. For example, suppose you want to show $B$ is nonregular but are having difficulty finding fooling sets for $B$. Then, you can try to find a regular language $A$ such that it is easier to find fooling sets for $A \cap B$. In other words, closure under intersection lets you reduce the task of finding fooling sets for $B$ to finding fooling sets for $A \cap B$. More generally, it lets you reduce the task of proving nonregularity of $B$ to proving nonregularity of $A \cap B$.

Footnotes
  1. One main difference between these and my lecture notes is that I have tried to avoid proof by contradiction as much as possible as students who have not encountered them before find them very confusing, and I have tried to minimize use of mathematical notation and jargon.

  2. Chandra also discusses the Pumping Lemma approach and the differences between the two approaches.

  3. For the mathematically inclined, we can express the second case more succinctly as $\delta^*(q,w) = \delta(\delta^*(q,x),a)$.

  4. For a more concrete example, think back to the tally counter from Section 4.1.1. The counter does not remember how many times it has been reset. Pressing the increment button 5 times when it is in state 0000 always results in 0005.

  5. If you are familiar with Markov chains, this should remind you of a similar property of Markov chains.

  6. For the mathematically inclined, we can express this more succinctly as $\delta^*(q_0,w) = \delta^*(\delta^*(q_0,x),y)$.

  7. The name of the lemma is something I came up with, not a standard name used by others.

  8. The pair can depend on the exact specification of $M$.

  9. What is a tacocat? It’s a 🐈 in a 🌮, of course! Here’s mine.

  10. In fact, the Myhill-Nerode Theorem says that $L$ is not regular if and only if for every $k$, there is a fooling set of size at least $k$.

  11. One might say the fooling set technique is fool-proof.