In this section, we get a first glimpse at proving impossibility
results. In particular, we will discuss methods for proving a given
language L is nonregular, i.e. that there is no possible deterministic
finite automata that can recognise L.
We begin with the fooling set technique, and then we show how to use
closure properties to leverage the fact that some other language is
already known to be nonregular.
Fooling sets vs Pumping Lemma
Previous offerings of this subject and the majority of resources out
there use a different approach based on the so-called Pumping Lemma
instead of the fooling set technique. However, past experience and
student feedback has shown that it is highly challenging for students
who are not accustomed to proofs. Courses that assume similar levels of
mathematical background (e.g. Stanford CS
103) use
the fooling set technique as it’s a more direct and less abstract. In my
opinion, the fooling set technique also has the benefit that it is
easier to provide intermediate stepping stones, and thus partial marks,
in assessments.
At an intuitive level, to prove a language L is not regular, we need
to exploit the fact that a finite automata has a fixed amount of memory,
independent of the length of the input string. The fooling set technique
allows us to prove lower bounds on the amount of memory that is needed
to recognize L.
In more detail, the technique lets us prove lower bounds on the number
of states a DFA needs to recognize $L$, i.e. it allows us to prove
statements of the form “no DFA with fewer than 5 states can recognize
L”. Since a DFA has a fixed number of states, to show that no DFA can
recognize L, we will prove that for every non-negative integer k,
no DFA with fewer than k states can recognize L.
Let M be a DFA. Recall the definition of the state transition function
δ: given current state q and after processing the next input
symbol a, the DFA M moves to state δ(q,a) that the transition
function δ tells us the next state given its current state and
the next input symbol. Recall also that the resulting state of an
input string w is the state that M ends up in after processing w,
and is denoted by q(w).
We now define the extended transition function δ∗ which tells us
the resulting state after processing a string starting from a given
state. See the following for a formal definition.
Next, we obtain some important consequences of the definition of
δ∗. Let w be an input string. It immediately follows from the
definition that the resulting state q(w) is equal to
δ∗(q0,w).
The below observation also immediately follows from the definition but
will be important later on.
Looking at the above examples more closely, we see a pattern: if two
strings x and y have the same resulting state, then appending the
same string z to both x and y yield strings xz and yz with the
same resulting state. Moreover, it does not matter what z is, i.e. the
statement holds true for every z.
In fact, this is not a coincidence and holds for all DFA. We state it
more formally in the following lemma.[7]
Lemma 1 is central to the fooling set technique. Before we proceed
with the fooling set technique, let’s see how analyzing resulting states
is useful. In particular, we will prove the following characterization
of the languages recognized by DFAs with a single state. The
characterization allows us to rule out single-state DFAs for some
languages L: no DFA with 1 state can recongize L if L is not the
empty language or the language of all strings.
Distinguishable pairs and distinguishing suffixes¶
Lemma 1 captures the essence of the memorylessness of finite
automata. To use it to show that certain DFA cannot recognize a language
L, we will need the following definition.
We also sometimes say that x and y are distinguishable with respect
to L, and when the language L is clear from context, we also say x
and y are distinguishable.
Let us take a look at some examples and exercises to consolidate our
understanding of the definition.
Examples of distinguishable pairs and distinguishing suffixes¶
Distinguishable strings have different resulting states¶
Next, we connect the notion of distinguishable strings and Lemma 1 to
show that every DFA that recognizes L must satisfy a certain
condition.
An immediate consequence of Lemma 3 is that any language with at
least one distinguishable pair (such as the ones in the example and
exercises above) cannot be recognized by a single-state DFA. Indeed,
there are only 2 languages without any distinguishable pair: the empty
language and the language of all strings. Observe that this gives a
different proof of the characterization of the languages accepted by
single-state DFA (Lemma 2).
Here’s an alternative proof of the first part of the lemma that is more
explicit. See also the last page of the Week 8 Lecture 2
slides for a visual illustration of the argument for
k=3.
Next, we give a more direct proof of the second part of the Fooling Set
Lemma, without relying on the first part.
There is no sure-fire way of constructing fooling sets for a language
L. Here are some general heuristics to try. Each of these were used
for the examples and exercises above.
Construct fooling set using prefixes of strings in L.
To construct Fk, consider strings for which it seems a counter
that can count up to at least k is needed to distinguish between
them.
Often, it is possible to construct a fooling set Fk such that for
every string x∈Fk, there is a string z such that z
distinguishes x from the other strings in Fk, i.e. either xz
is in L but yz is not in L for every other y in Fk, or
vice versa.
The fooling set technique can also be used to prove that a DFA M is
minimal for a language L: if there exists a fooling set F for L of
size exactly equal to the number of states of M, then M is minimal.
This is interesting as the fooling set F is a witness to the
minimality of M.
Now that we have several languages that we have shown to be nonregular,
we switch to a different technique that can often result in a shorter
proof but can be trickier to apply.
Recall that regular languages are closed under several operations such
as:
complement
intersection
These operations let us take regular languages and create new regular
languages:
if L is regular, then Lc is also regular.
if A and B are regular, then so is A∩B.
We can also use them to prove nonregularity as follows:
if Lc is not regular, then L cannot be regular.
if A is regular and A∩B is not regular, then B cannot be
regular.
Here are the benefits and drawbacks to the two techniques:
If a language L is not regular, one can always prove its
nonregularity using fooling sets.[10] This is not always the case
for the closure property technique.[11]
Even if a language can be proved nonregular using closure
properties, it can be quite tricky to find the right languages and
operations.
Proofs via closure properties tend to be much shorter.
One can also combine the two. For example, suppose you want to show B
is nonregular but are having difficulty finding fooling sets for B.
Then, you can try to find a regular language A such that it is easier
to find fooling sets for A∩B. In other words, closure of
intersection lets you reduce the task of finding fooling sets for B
to finding fooling sets for A∩L. More generality, it lets you
reduce the task of proving nonregularity of B to proving nonregularity
of A∩B.
One main difference between these and my lecture notes is that I
have tried to avoid proof by contradiction as much as possible as
students who have not encountered them before find them very
confusing, and I have tried to minimize use of mathematical notation
and jargon.
For a more concrete example, think back to the tally counter from
Section 4.1.1. The counter does not remember how many times it has been
reset. Pressing the increment counter 5 times when it is in state
0000 always results in 0005.