Pumping Lemma For Context Free Languages - CFL
Pumping Lemma for Context-Free Languages
Let be an infinite context-free language.
Then there exists some positive integer such that any
can be decomposed as
such that
-
, Pumping portion is small
-
, Length of middle section is bounded
-
for all
Intuitive Explanation
-
The lemma tells us that long enough strings in any context-free language must contain a portion that can be "pumped" (repeated or deleted) and still produce valid strings in the same language.
-
This happens because context-free grammars have finite sets of variables but can generate infinite strings.
-
So, when we derive a very long string using a parse tree, some variable must repeat along some path in the tree.
-
This repeated variable allows us to repeat or remove the corresponding part of the string — giving the “pumping” property.
Proof
Consider the language , and let be a context-free grammar for without λ-productions or unit-productions.
Let the maximum length of the right-hand side of any production in be .
That is, for each production
Since is infinite, it has derivations of arbitrarily long length and hence derivation trees of arbitrarily large height.
Step 1: Repetition of a variable along a path
Let the number of variables in be .
In any derivation tree of height greater than ,
there must be some variable that repeats on a single path from the root to a leaf (by the pigeonhole principle).
Suppose a variable occurs twice along a path,once near the top and once lower down.
Schematically, the derivation has the form:
where are strings of terminals.
Step 2: Structure of the repeated variable
From this derivation we see:
and
Thus, the variable can derive a string containing itself (surrounded by and ),
and it can also derive a string consisting only of terminals.
This gives the pattern that allows “pumping” of the substrings and .
Step 3: Pumping mechanism
Replacing the lower occurrence of with its derivation repeatedly yields:
Finally, replacing the last by gives the sequence of strings:
Each of these strings is generated by , and therefore belongs to .
Step 4: Bounds on the substring lengths
Since the grammar has a finite number of variables and each production is of finite length, the size of the section between the two occurrences of in the derivation tree can be bounded by some integer .
Thus, we can ensure that:
and, because has no λ-productions or unit-productions, both and cannot be empty simultaneously:
Step 5: Conclusion
Hence, for every with ,
we can decompose it as satisfying:
-
-
-
, for all
This completes the proof.
🧠 Key Takeaways
Property | Explanation |
---|---|
Purpose | The pumping lemma for CFLs is used to prove that a language is not context-free. |
Based on | Repetition of variables in long derivations (parse tree argument). |
Limitations | It gives a necessary, but not sufficient, condition — i.e., some non-CFLs may still satisfy it. |
Main idea | Strings in a CFL can be “pumped” in two places (v,y) while remaining valid. |
Comments
Post a Comment