Grammars in Automata

 

What is a grammar (in automata)?

A grammar is a formal tool  that generates strings of a language by repeatedly applying production rules.

Formal definition
A grammar is a 4-tuple G=(V,Σ,R,S)G=(V,\Sigma,R,S):

  • VV: finite set of nonterminals (variables)

  • Σ\Sigma: finite set of terminals (alphabet), VΣ=V\cap\Sigma=\varnothing

  • RR: finite set of productions (rewrite rules)

  • SVS\in V: start symbol

Applying productions creates derivations (leftmost/rightmost), which can be shown by parse trees. The set of all terminal strings derivable from is the languageL(G).

Why grammars matter:

  • They’re the generative counterpart to automata (the recognizers).

  • Each grammar class corresponds to a machine model: finite automata, pushdown automata, linear-bounded automata, Turing machines.


The Chomsky hierarchy (types of grammars)

Type-3: Regular Grammars (RG)

Rule shape (choose one style consistently):

  • Right-linear: AaBaεA\to aB \mid a \mid \varepsilon

  • Left-linear: ABaaεA\to Ba \mid a \mid \varepsilon

Power / Automaton: exactly the regular languagesDFA/NFAregular expressions.
Use: token patterns (lexical analysis).
Example: L=(ab)L=(ab)^*

  • SaAεS\to aA \mid \varepsilon, AbSA\to bS

Note: For some definitions, ε\varepsilon is allowed only if SS doesn’t appear on any RHS and εL\varepsilon\in L.


Type-2: Context-Free Grammars (CFG)

Rule shape: AαA\to \alpha where AVA\in V and α(VΣ)\alpha\in(V\cup\Sigma)^*

 (LHS is a single nonterminal).

Power / Automaton: context-free languagespushdown automata.
Use: programming-language syntax (nested/recursive structures), parse trees/ASTs.
Normal forms: CNF (Chomsky normal form), GNF (Greibach).
Example: L={anbnn0}L=\{a^n b^n\mid n\ge 0\}

  • SaSbεS\to aSb \mid \varepsilon

Ambiguity: a CFG is ambiguous if some string has ≥2 parse trees (e.g., EE+EEEidE\to E+E\mid E*E\mid id). Disambiguate by stratifying precedence:

  • EE+TT,  TTFF,  F(E)idE\to E+T\mid T,\; T\to T*F\mid F,\; F\to (E)\mid id


Type-1: Context-Sensitive Grammars (CSG)

Rule shape (length nondecreasing): αAβαγβ\alpha A \beta \to \alpha\,\gamma\,\beta with γ1|\gamma|\ge 1. (No ε\varepsilon-productions, except a special case for SS.)

Power / Automaton: context-sensitive languageslinear-bounded automata (LBA).
Use: languages requiring synchronized counts beyond a single stack.
Normal form: Kuroda normal form.
Classic example: L={anbncnn1} (not CFL). One standard CSG (intuition: generate markers, then “commute” and translate):

  • SaSBCabcS \to aSBC \mid abc

  • CBBCCB \to BC

  • aBab,  bBbb,  bCbc,  cCccaB \to ab,\; bB \to bb,\; bC \to bc,\; cC \to cc

(High-level idea: each new aa introduces a B,CB,C pair; swap CC past BB with CBBCCB\to BC, then rewrite B,CB,Cinto extra b,cb,c symbols, keeping counts equal.)


Type-0: Unrestricted (Phrase-Structure) Grammars

Rule shape: αβ\alpha \to \beta where α,β(VΣ)\alpha,\beta \in (V\cup\Sigma)^* and α\alpha contains at least one nonterminal.

Power / Automaton: recursively enumerable languagesTuring machines (most general).
Use: theoretical completeness; not practical for parsing.
Note: Constructions typically simulate a TM; explicit examples are verbose.


How grammars relate to automata (at a glance)

Grammar TypeRule RestrictionLanguage ClassEquivalent MachineExample Language
Type-3 Regularlinear (left or right)RegularDFA/NFA(ab)(ab)^*, strings ending in 1
Type-2 Context-Freesingle nonterminal on LHSContext-freePushdown Automaton{anbn}\{a^n b^n\}, balanced parentheses
Type-1 Context-Sensitivenondecreasing length, context-limitedContext-sensitiveLinear-Bounded Automaton{anbncn}\{a^n b^n c^n\}
Type-0 Unrestrictednone (except LHS has a nonterminal)Recursively enumerableTuring Machineall TM-recognizable languages

Comments

Popular posts from this blog

Theory Of Computation PCCST302 KTU Semester 3 BTech 2024 Scheme

Non deterministic Finite Automata NFA

Example DFAs University Questions