Removing ε-productions

 

1. What are ε-productions?

  • An ε-production is a production rule of the form:

    AϵA \rightarrow \epsilon

    where AA is a variable (non-terminal).

  • It means that the non-terminal can vanish during derivations.

Example:

SaSb    ϵS \rightarrow aSb \;|\; \epsilon

Here, SS can disappear, allowing strings like "ab", "aabb", etc.


2. Why are ε-productions problematic?

ε-productions are not “wrong” but they make grammars messy and hard to analyze:

  1. Ambiguity in parsing:

    Since ε can appear “out of nowhere,” parsers need to handle “empty moves,” complicating parsing algorithms.

  2. Conversion to Normal Forms (CNF, GNF):

    • Chomsky Normal Form (CNF) does not allow ε-productions, except possibly SϵS \rightarrow \epsilon if the language contains the empty string.

    • To transform CFGs into CNF (needed for many theoretical proofs and parsing algorithms like CYK), ε-productions must be eliminated.

  3. Cleaner Grammar:

    Eliminating ε-productions gives a simpler grammar where every derivation step is “visible” (produces at least one symbol).


3. Method 

Step 1: Identify nullable variables

  • A variable AA is nullable if AϵA \Rightarrow^* \epsilon

  • Directly nullable: if AϵA \rightarrow \epsilon

  • Indirectly nullable: if ABCA \rightarrow BC and both BB and CC are nullable.

Step 2: Modify productions

  • For every production that contains a nullable variable, add new productions where that variable is “dropped.”

  • Keep doing this for all nullable variables.

Step 3: Remove the original ε-productions

  • Except keep SϵS \rightarrow \epsilon if the language requires ε.


4. Example

Grammar with ε-productions:

SAB    b,Aa    ϵ,Bc    ϵS \rightarrow AB \;|\; b, \quad A \rightarrow a \;|\; \epsilon, \quad B \rightarrow c \;|\; \epsilon

Step 1: Find nullable variables

  • AϵA \Rightarrow \epsilon, so A is nullable.

  • BϵB \Rightarrow \epsilon, so B is nullable.

  • SABϵS \rightarrow AB \Rightarrow \epsilon so S is nullable.

Step 2: Add alternatives by removing nullable symbols

  • From SABS \rightarrow AB:

    • Drop A: SBS \rightarrow B

    • Drop B: SAS \rightarrow A

    • Drop both: SϵS \rightarrow \epsilon

  • From A → a | ε → A → a

  • From B → c | ε → B → c

Step 3: Remove ε-productions (except S if needed)

Final grammar (without ε-productions, but still equivalent):

SAB    A    B    b    ϵ      Aa   BcS \rightarrow AB \;|\; A \;|\; B \;|\; b \;|\; \epsilon \\ A \rightarrow a \\ B \rightarrow c

5. Significance of eliminating ε-productions

  • Essential for CNF and GNF conversion (core proofs in theory of computation).

  • Simplifies parsing (no need to handle “disappearing” symbols).

  • Makes grammar more deterministic and easier to reason about.

  • Still preserves the original language (possibly keeping ε if it belongs to the language).


In summary: Eliminating ε-productions makes CFGs cleaner, easier to parse, and transformable into standard normal forms, which are crucial for proving results like decidability, parsing algorithms, and equivalence between CFGs and PDAs.

Example

We have the grammar:

SaS1b,S1aS1bϵS \to aS_1 b, \quad S_1 \to aS_1 b \mid \epsilon


Step 1: Identify nullable variables

  • S1ϵS_1 \to \epsilon, so S1S_1 is nullable.

  • Therefore, anywhere S1S_1 appears, we must consider the option of omitting it.


Step 2: Modify productions

From SaS1bS \to a S_1 b

  • If S1S_1 remains: SaS1bS \to a S_1 b

  • If S1S_1 vanishes: SabS \to ab

So now we have:

SaS1babS \to a S_1 b \mid ab


From S1aS1bϵS_1 \to a S_1 b \mid \epsilon

  • First rule: S1aS1bS_1 \to a S_1 b

    • If the middle S1S_1: S1abS_1 \to ab

    • If it remains: keep S1aS1bS_1 \to a S_1 b

  • Second rule S1ϵS_1 \to \epsilon: this is the ε-production we will remove later.

So S1productions become:

S1aS1babS_1 \to a S_1 b \mid ab

Step 3: Remove ε-productions

We delete S1ϵS_1 \to \epsilon, since we’ve already accounted for it by adding alternatives.


✅ Final Grammar (without ε-productions)

SaS1b    abS \to a S_1 b \;\mid\; ab
S1aS1b    abS_1 \to a S_1 b \;\mid\; ab


Check

  • Original grammar generates: anbn,n1a^n b^n, n \ge 1.

  • New grammar still generates:

    • abab (for n=1n=1)

    • aabba a b b (via SaS1b,S1abS \to a S_1 b, S_1 \to ab)

    • aaabbba a a b b b (via nested S1S_1)

    • and so on. ✅

So the language is preserved.

Example

Original grammar

S → A B a C 
A → B C 
B → b | ε 
C → D | ε 
D → d

1. Find nullable variables

A variable XX is nullable if XεX \Rightarrow^* \varepsilon

  • BεB \to \varepsilon ⇒ B nullable.

  • CεC \to \varepsilon ⇒ C nullable.

  • ABCA \to B C and B,C nullable ⇒ A nullable.

  • DdD \to d ⇒ D not nullable.

  • SS contains terminal a so S not nullable.

Nullable set: {A, B, C}.


2. Remove ε-productions — add alternatives by omitting nullable variables

Handle S → A B a C
A, B, C are nullable. Consider all ways to keep/omit them but keep the terminal a. This yields:

S → A B a C 
S → B a C 
S → A a C 
S → A B a 
S → a C 
S → B a 
S → A a 
S → a

Handle A → B C
Both B and C nullable, so include:

A → B C 
A → B 
A → C

(We do not re-introduce A → ε because A is not the start symbol; only S→ε would be preserved if originally S were nullable.)

Handle B and C
Remove their ε-rules and keep terminal rules:

B → b
C → D
D → d

3. Resulting grammar with no ε-productions

  • S → A B a C | B a C |A a C | A B a | a C | B a |A a | a 

  •  A → B C | B |C 

  •  B → b 

  •  C → D 

  •  D → d

    No ε-productions remain.

  • Language is preserved (e.g., S → a is present since original could produce a).

  • Next usual simplification steps (optional): eliminate unit productions (rules of the form X → Y) and then remove any useless symbols if present.

Comments

Popular posts from this blog

Theory Of Computation PCCST302 KTU Semester 3 BTech 2024 Scheme

Non deterministic Finite Automata NFA

Formal Definition - Turing Machine