Friday, November 21, 2008

Happy Birthday, Magritte!

At the end of the lecture we were left with a tantalizing little puzzle to think about (which is great for me, because it gives me something to think about during the walk home - I feel much more efficient that way). Prof. Heap, near the end of class, told us that the language with an equal number of zeros and ones couldn't be expressed by any regular expression, so I took it upon myself to think of a proof on the way home.

Let L be said language, and let Q be a DFSA that expresses L. Let n be the number of states in Q (there must be a finite number, by the 'F' in DFSA).

Let x be the string consisting of n+1 0's concatenated.
I assert that there must be at least n+1 states between the state at x, and the accepting state. This string is able to reach the accepting state, we know, if n+1 1's are appended to it.

We can prove this result by contradiction. Let j and k be strings consisting entirely of 1's, both of whose lengths are greater than 0 and less than n+1, and such that |j| < |k|, and sigma(xj1) = sigma(xk1) = q0. Then let q1 = sigma(q0, 1), q2 = sigma(q1, 1), q(m+1) = sigma(qm, 1)

let p (running out of letters here) equal (n+1) - |k|. then qp must be the accepting state, since it represents n+1 0's, concatenated with |k| + [(n+1) - |k|] = n+1 1's. However, this state also accepts the string that is n+1 0's concatenated with |j| + [(n+1) - |k|] < n+1 1's. Here, the number of 1's and 0's are obviously not equal, so we've arrived at a contradiction.

Then the DFSA passes through at least n+1 states. However, this contradicts our assumption that the DFSA has only n states. Therefore, L cannot be represented by a DFSA, and by extension, cannot be represented by a regular expression.

Man, that was a kinda ugly proof, in no small part because of the alphabet soup. Maybe it would just be easier to explain my insights informally. My idea was simply that, to represent the language would require an infinite number of states (I found it easier to envisage as a DFSA than a regex), since there would need to be at least one state for each difference in 0's and 1's (intuitively, anyways).

If there were such a thing as a deterministic infinite state automata (hey, it doesn't sound as crazy as being nondeterministic), it seems like it would be pretty easy to write. It could just be a ladder climbing infinitely in both directions with one symbol taking you up and another taking you down, and the accepting state being right in the middle.

No comments: