Building CFG from Push Down Automata

June 18, 2025

1. Goal of NPDA → CFG Conversion

For every context-free language (CFL) there is an NPDA that accepts it, and conversely every language accepted by an NPDA is context-free. To prove the second direction formally, Linz gives a method for converting an NPDA $M$ into an equivalent CFG $G$.

NPDA: $M = (Q, \Sigma, \Gamma, \delta, q_0, z, F)$
- $Q$ = set of states
- $\Sigma$ = input alphabet
- $\Gamma$ = stack alphabet
- $\delta$ = transition function
- $q_0$ = initial state
- $z$ = initial stack symbol
- $F$ = set of final states
CFG: $G = (V, \Sigma, P, S)$
- $V$ = set of non-terminals
- $\Sigma$ = terminal alphabet (same as NPDA input)
- $P$ = set of productions
- $S$ = start symbol

2. Simplifying Assumptions

For ease of conversion (as Linz recommends), assume:

Single final state $q_f$.
Acceptance by empty stack when reaching $q_f$. That is, the NPDA accepts a string $w$ if, starting from $(q_0, w, z)$, it can reach $(q_f, \lambda, \lambda)$.
Every transition into the final state must empty the stack, i.e., $\delta(q, a, X) \ni (q_f, \lambda)$.

These assumptions do not change the language accepted, and they make the CFG construction systematic.

3. Core Idea

We want to define CFG non-terminals that describe how the NPDA processes input while manipulating the stack.

Non-terminal: $[q_i X q_j]$
Meaning: “The NPDA can go from state $q_i$ to state $q_j$, popping $X$ from the stack and leaving the rest of the stack unchanged, while reading some input string.”
Each such non-terminal therefore stands for a whole computation segment: the input read from the moment $X$ is on top of the stack until the stack first becomes one symbol shorter. Any symbols pushed in between must have been popped again by then.
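
The intended meaning can also be written as an invariant that the construction is designed to satisfy. This is a sketch in the usual instantaneous-description notation; the symbol $\vdash^{*}$ (zero or more NPDA moves) is an assumption of this note and is not used elsewhere in the post:

$[q_i X q_j] \Rightarrow^{*} w \quad\Longleftrightarrow\quad (q_i, w, X) \vdash^{*} (q_j, \lambda, \lambda), \qquad w \in \Sigma^{*}$

Read left to right, it says that whatever the grammar derives from $[q_i X q_j]$ is an input string that lets the NPDA remove $X$; read right to left, it says that every such input string is derivable.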

4. CFG Components

(a) Non-terminals $V$

$V = \{ [q_i X q_j] \mid q_i, q_j \in Q, X \in \Gamma \}$

Each $[q_i X q_j]$ represents consuming input that removes X from the stack while moving from $q_i$ to $q_j$.
This allows the CFG to mimic the NPDA’s stack behavior using productions.
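
As a small illustration, the set $V$ can be enumerated mechanically. The sketch below assumes each non-terminal $[q_i X q_j]$ is represented as a Python tuple `(q_i, X, q_j)`; the function name `nonterminals` and the two-state machine are hypothetical choices made only for this example.

```python
from itertools import product

def nonterminals(states, stack_symbols):
    """Return every triple (q_i, X, q_j), standing for the variable [q_i X q_j]."""
    return [(qi, x, qj) for qi, x, qj in product(states, stack_symbols, states)]

# Hypothetical two-state NPDA with stack alphabet {z, A}.
variables = nonterminals(["q0", "q1"], ["z", "A"])
print(len(variables))  # |Q|^2 * |Gamma| = 2 * 2 * 2 = 8
```

The count $|Q|^2 \cdot |\Gamma|$ grows quickly, although many of these non-terminals usually turn out to be unused.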

(b) Start Symbol $S$

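$S \rightarrow [q_0 z q_j] \quad \text{for all } q_j \in Q$

The start symbol simulates the NPDA starting in $q_0$ with stack $z$.
Each $q_j$ is a state in which the NPDA may be once $z$ has been popped, i.e., once the stack is empty. Under the single-final-state assumption of Section 2, only $q_j = q_f$ can lead to acceptance, so $S \rightarrow [q_0 z q_f]$ suffices.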

(c) Production Rules $P$

For each NPDA transition:

$\delta(q_i, a, X) \ni (q_k, Y_1 Y_2 \dots Y_m)$

$q_i$ = current state
$a \in \Sigma \cup \{\lambda\}$ = input symbol read
$X$ = stack top symbol
$q_k$ = next state
$Y_1 \dots Y_m$ = symbols that replace $X$ on the stack, with $Y_1$ the new top (may be empty, $m=0$)

Cases

Case 1: Pop operation (m=0)

$\delta(q_i, a, X) \ni (q_k, \lambda) \quad\Rightarrow\quad [q_i X q_k] \rightarrow a$

Pop X and read input a (or λ).
No intermediate stack symbols; the NPDA goes directly from $q_i$ to $q_k$.

Case 2: Push one symbol (m=1)

$\delta(q_i, a, X) \ni (q_k, Y_1) \quad\Rightarrow\quad [q_i X q_j] \rightarrow a [q_k Y_1 q_j] \quad\forall q_j \in Q$

NPDA replaces X with one stack symbol $Y_1$.
The end state $q_j$ is the state reached once $Y_1$ has been popped; the transition does not determine it, so one rule is added for every $q_j \in Q$.

Case 3: Push multiple symbols (m ≥ 2)

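$\delta(q_i, a, X) \ni (q_k, Y_1 Y_2 \dots Y_m) \quad\Rightarrow\quad [q_i X q_j] \rightarrow a [q_k Y_1 q_{l_1}] [q_{l_1} Y_2 q_{l_2}] \dots [q_{l_{m-1}} Y_m q_j] \quad\forall q_{l_1}, \dots, q_{l_{m-1}}, q_j \in Q$

NPDA replaces X with $Y_1 \dots Y_m$ (stack growth).
Intermediate states $q_{l_1}, \dots, q_{l_{m-1}}$ are the states reached just after $Y_1, \dots, Y_{m-1}$ have been popped; they are not known in advance, so every combination is included.
Number of rules: $|Q|^{m-1}$ choices of intermediate states for each end state $q_j$, i.e., $|Q|^{m}$ rules for each NPDA transition.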

Important: This is why NPDA → CFG conversion can produce very large grammars.
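
The three cases can be folded into one routine, since Case 2 is just Case 3 with an empty list of intermediate states. The following is a minimal sketch, not a reference implementation: the function name `productions_for`, the tuple encoding of non-terminals, and the use of `""` for λ are all assumptions made for illustration.

```python
from itertools import product

def productions_for(qi, a, x, qk, pushed, states):
    """Rules contributed by one move: delta(qi, a, x) contains (qk, pushed).

    A rule is (head, terminal, body). The head and each body item are triples
    (p, Y, q) standing for [p Y q]; terminal is a, with "" meaning lambda.
    pushed lists Y_1..Y_m with Y_1 (the new stack top) first.
    """
    m = len(pushed)
    if m == 0:
        # Case 1: pure pop, [qi x qk] -> a
        return [((qi, x, qk), a, [])]
    rules = []
    # Cases 2 and 3: choose q_{l_1}, ..., q_{l_{m-1}} and q_j freely from Q.
    for chain in product(states, repeat=m):
        starts = (qk,) + chain[:-1]  # state in which each Y_t is on top
        body = [(s, y, e) for s, y, e in zip(starts, pushed, chain)]
        rules.append(((qi, x, chain[-1]), a, body))
    return rules

# Hypothetical move delta(q0, a, z) containing (q0, Az) over Q = {q0, q1}.
rules = productions_for("q0", "a", "z", "q0", ["A", "z"], ["q0", "q1"])
print(len(rules))  # 4 rules: |Q|^m with |Q| = 2 and m = 2
```

Iterating over `product(states, repeat=m)` makes the exponential growth in $m$ explicit, which is why long pushes are often split into several shorter ones before converting.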

5. Example from Linz: $L = \{a^{n} b^{n} : n \ge 1\}$

NPDA

States: $Q = \{q_0, q_1\}$
Input: $\Sigma = \{a, b\}$
Stack: $\Gamma = \{z, A\}$
Initial: $q_0, z$
Accept: by empty stack, in state $q_1$ (which plays the role of $q_f$)

Transitions:

δ	Meaning
δ(q₀, a, z) = (q₀, Az)	push A over z
δ(q₀, a, A) = (q₀, AA)	push A over A
δ(q₀, b, A) = (q₁, λ)	pop A on b
δ(q₁, b, A) = (q₁, λ)	pop A on b
δ(q₁, λ, z) = (q₁, λ)	pop z at end
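
Before converting, it can help to check which strings this NPDA accepts. The sketch below simulates the five transitions with a breadth-first search over configurations and accepts when the input is consumed and the stack is empty. The names `DELTA` and `accepts` are illustrative, `""` stands for λ, and the search is only guaranteed to terminate because this machine has no λ-move that grows the stack.

```python
from collections import deque

# Transition table copied from the example; "" stands for lambda and the
# leftmost character of a stack string is the top of the stack.
DELTA = {
    ("q0", "a", "z"): [("q0", "Az")],
    ("q0", "a", "A"): [("q0", "AA")],
    ("q0", "b", "A"): [("q1", "")],
    ("q1", "b", "A"): [("q1", "")],
    ("q1", "", "z"): [("q1", "")],
}

def accepts(word, start="q0", bottom="z"):
    """Breadth-first search over configurations (state, remaining input, stack)."""
    queue = deque([(start, word, bottom)])
    seen = set()
    while queue:
        state, rest, stack = queue.popleft()
        if not rest and not stack:
            return True  # input consumed and stack empty
        if (state, rest, stack) in seen or not stack:
            continue
        seen.add((state, rest, stack))
        top, below = stack[0], stack[1:]
        # lambda-moves: the input is left untouched
        for nxt, push in DELTA.get((state, "", top), []):
            queue.append((nxt, rest, push + below))
        # moves that read one input symbol
        if rest:
            for nxt, push in DELTA.get((state, rest[0], top), []):
                queue.append((nxt, rest[1:], push + below))
    return False

for w in ["ab", "aabb", "", "aab", "abb", "ba"]:
    print(repr(w), accepts(w))
```

Only `"ab"` and `"aabb"` are accepted among these six strings; the empty string is rejected because no move applies to $z$ in state $q_0$ without reading an $a$.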

CFG Non-terminals

$[q_i X q_j]$ with $q_i, q_j \in \{q_0, q_1\}$ and $X \in \{z, A\}$ (eight non-terminals in total)

Start Symbol

$S \rightarrow [q_0 z q_1]$

Productions

NPDA Transition	CFG Rule
δ(q₀, a, z) → (q₀, Az)	[q₀ z q₀] → a [q₀ A q₀][q₀ z q₀]
	[q₀ z q₀] → a [q₀ A q₁][q₁ z q₀]
	[q₀ z q₁] → a [q₀ A q₀][q₀ z q₁]
	[q₀ z q₁] → a [q₀ A q₁][q₁ z q₁]
δ(q₀, a, A) → (q₀, AA)	[q₀ A q₀] → a [q₀ A q₀][q₀ A q₀]
	[q₀ A q₀] → a [q₀ A q₁][q₁ A q₀]
	[q₀ A q₁] → a [q₀ A q₀][q₀ A q₁]
	[q₀ A q₁] → a [q₀ A q₁][q₁ A q₁]
δ(q₀, b, A) → (q₁, λ)	[q₀ A q₁] → b
δ(q₁, b, A) → (q₁, λ)	[q₁ A q₁] → b
δ(q₁, λ, z) → (q₁, λ)	[q₁ z q₁] → λ

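This grammar generates exactly the strings $a^n b^n$ with $n \ge 1$. Only $[q_0 z q_1]$, $[q_0 A q_1]$, $[q_1 A q_1]$ and $[q_1 z q_1]$ derive terminal strings; every rule that mentions $[q_0 z q_0]$, $[q_0 A q_0]$, $[q_1 z q_0]$ or $[q_1 A q_0]$ is useless and can be dropped.
The complexity arises from trying every possible intermediate state (here a single $q_{l_1}$, since $m = 2$), but conceptually, every non-terminal represents a computation segment in the NPDA.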
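
As a check, the string $aabb$ can be derived using only productions from the table. This is a worked sketch in which each step rewrites the leftmost non-terminal:

$S \Rightarrow [q_0 z q_1]$
$\Rightarrow a \, [q_0 A q_1][q_1 z q_1]$
$\Rightarrow a a \, [q_0 A q_1][q_1 A q_1][q_1 z q_1]$
$\Rightarrow a a b \, [q_1 A q_1][q_1 z q_1]$
$\Rightarrow a a b b \, [q_1 z q_1]$
$\Rightarrow a a b b$

The steps mirror the NPDA run on $aabb$: two pushes of $A$ while reading $a$, two pops while reading $b$, and a final λ-move that removes $z$.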

6. Summary of Steps

Start from NPDA (simplified: single final state, acceptance by empty stack).
Create non-terminals $[q_i X q_j]$ for all states and stack symbols.
Start symbol: $S \rightarrow [q_0 z q_j]$ for all $q_j \in Q$.
Add productions according to transitions:
- Pop only → a single terminal (or λ)
- Push one symbol → terminal followed by one non-terminal, for every end state
- Push multiple symbols → terminal followed by a chain of non-terminals, for all possible intermediate and end states
Resulting CFG generates the same language as NPDA.
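
The steps above can be sketched end to end for the $a^n b^n$ machine. This is an illustrative sketch under stated assumptions rather than a definitive implementation: the names `npda_to_cfg` and `productive` are hypothetical, non-terminals are tuples `(p, X, q)`, `""` stands for λ, and the start variable $[q_0 z q_f]$ is returned directly instead of adding a separate $S$. The second function is an extra clean-up pass, not part of the construction itself, that finds the non-terminals able to derive a terminal string.

```python
from itertools import product

def npda_to_cfg(states, delta, q0, z, qf):
    """Triple construction. delta maps (q, a, X) to a list of (q', pushed string)."""
    rules = []
    for (qi, a, x), moves in delta.items():
        for qk, pushed in moves:
            if not pushed:  # pop: [qi x qk] -> a
                rules.append(((qi, x, qk), a, ()))
                continue
            # push: try every choice of intermediate and end states
            for chain in product(states, repeat=len(pushed)):
                starts = (qk,) + chain[:-1]
                body = tuple(zip(starts, pushed, chain))
                rules.append(((qi, x, chain[-1]), a, body))
    return (q0, z, qf), rules

def productive(rules):
    """Variables that derive at least one terminal string (fixpoint iteration)."""
    good = set()
    changed = True
    while changed:
        changed = False
        for head, _, body in rules:
            if head not in good and all(v in good for v in body):
                good.add(head)
                changed = True
    return good

# Transitions of the example NPDA; the leftmost pushed symbol is the new top.
delta = {
    ("q0", "a", "z"): [("q0", "Az")],
    ("q0", "a", "A"): [("q0", "AA")],
    ("q0", "b", "A"): [("q1", "")],
    ("q1", "b", "A"): [("q1", "")],
    ("q1", "", "z"): [("q1", "")],
}
start, rules = npda_to_cfg(["q0", "q1"], delta, "q0", "z", "q1")
print(len(rules))              # 4 + 4 + 1 + 1 + 1 = 11
print(len(productive(rules)))  # 4 of the 8 possible non-terminals are productive
```

Keeping only the rules whose non-terminals are all productive leaves five productions, which is a much smaller grammar for the same language.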

✅ Key Idea: Each CFG non-terminal simulates the effect of NPDA transitions on stack and input. This shows formally that every language accepted by an NPDA is context-free; together with the converse construction (CFG → NPDA), it proves the equivalence.
