Removing useless symbols and productions

 

1. What are “useless symbols”?

A useless symbol in a CFG is a non-terminal or terminal that does not contribute to deriving any string of terminals from the start symbol.

There are two types of useless symbols:

  1. Non-generating symbols

    • Symbols that cannot derive any terminal string.

    • Example: If a non-terminal never leads to terminals, it is useless.

  2. Non-reachable symbols

    • Symbols that cannot be reached from the start symbol.

    • Even if they could generate terminals, if the start symbol never reaches them, they are useless.

We remove both types to simplify the grammar.


2. Step 1: Remove Non-Generating Symbols

Algorithm :

  1. Mark all terminal symbols as generating.

  2. Iteratively mark non-terminals A as generating if there exists a production:

    AαA \rightarrow \alpha

    where every symbol in α is already generating.

  3. Repeat until no new generating symbols are found.

  4. Remove all non-generating symbols and any production containing them.


Example

CFG:

SAB | a AaA | B Bb CcC

Step 1: Identify generating symbols

  • Terminals: a, b, c → generating

  • Check non-terminals:

    • B → b → B is generating

    • A → aA (a is terminal, A? not yet generating) → keep checking

    • A → B → B generating → A is generating

    • S → AB | a → both A & B generating → S generating

    • C → cC → c is terminal, C? C depends on itself → never fully generating → C is non-generating

Remove C and its productions:

SAB | a AaA | B Bb

3. Step 2: Remove Non-Reachable Symbols

Algorithm :

  1. Start with the start symbol S as reachable.

  2. Iteratively mark symbols reachable if they appear on the right-hand side of a production of a reachable non-terminal.

  3. Remove all non-reachable symbols and their productions.


Example (continuing)

CFG after Step 1:

SAB | a AaA | B Bb
  • Start with S → reachable

  • S → AB → A, B reachable

  • A → aA | B → A and B already reachable

  • B → b → B already reachable

✅ All symbols are reachable → nothing removed in this step.


Concept of a Dependency Graph

  • Vertices: Each vertex represents a variable (non-terminal) in the CFG.

  • Edges: There is an edge from C → D if C has a production containing D, i.e., CxDyC \rightarrow x D y, where x,yx, y are strings of terminals and/or variables.

  • Interpretation:

    • If there is a path from S (start symbol) to some variable V, then V is reachable.

    • If no path exists, V is unreachable → useless → remove it.

This is essentially a graph reachability problem.

Consider the Grammer

S->.aS|A
A->a
B->aa
Dependency Graph

B is a non reachable state so remove the useless production.
S->.aS|A
A->a


4. Final Simplified CFG

SAB | a AaA | B Bb
  • Useless symbol C removed.

  • Productions only involve symbols contributing to terminal strings derivable from S.


5. Summary of Steps 

  1. Remove non-generating symbols

    • Symbols that can never produce a terminal string.

  2. Remove non-reachable symbols

    • Symbols that the start symbol cannot reach.

  3. Result: A simplified CFG generating exactly the same language.


6. Application to NPDA → CFG

After converting an NPDA to a CFG:

  • Many non-terminals like [q_i X q_j] may never appear in derivations from S → non-reachable → remove.

  • Some non-terminals may never derive terminals, e.g., intermediate states that cannot pop the stack fully → non-generating → remove.

✅ This makes the NPDA-derived CFG much smaller and manageable, while still generating the same language.

Comments

Popular posts from this blog

Theory Of Computation PCCST302 KTU Semester 3 BTech 2024 Scheme

Non deterministic Finite Automata NFA

Formal Definition - Turing Machine