Chomsky Hierarchy

Chomsky Hierarchy — one of the most fundamental concepts in Automata Theory, originally proposed by linguist Noam Chomsky in 1956.

This hierarchy classifies formal languages (and the machines that recognize them) into four levels — from the simplest to the most powerful.


🧭 Overview of the Chomsky Hierarchy

TypeGrammar TypeLanguage TypeAutomaton / MachineExamples
Type 0Unrestricted GrammarRecursively Enumerable LanguagesTuring Machin{ a^n b^n c^n
Type 1Context-Sensitive GrammarContext-Sensitive LanguagesLinear Bounded Automaton (LBA){ a^n b^n c^n}
Type 2Context-Free GrammarContext-Free LanguagesPushdown Automaton (PDA)`{ a^n b^n}
Type 3Regular GrammarRegular LanguagesFinite Automaton (FA)`{ a^n}

🧩 1. Type 3 – Regular Languages

🔹 Grammar:

Regular Grammar has rules of the form:

  • AaBA \rightarrow aB

  • AaA \rightarrow a

  • (sometimes AϵA \rightarrow \epsilon)

where A,BA, B are non-terminal symbols and aa is a terminal symbol.

🔹 Machine:

Recognized by a Finite Automaton (FA):

  • DFA (Deterministic Finite Automaton) or NFA (Nondeterministic Finite Automaton).

  • Works with a finite number of states and no external memory.

🔹 Example:

L={ab}={ϵ,a,ab,aab,aaab,}L = \{ a^*b^* \} = \{ \epsilon, a, ab, aab, aaab, \ldots \}

Grammar:

S → aS | bS | ε

🔹 Applications:

  • Lexical analysis in compilers

  • Pattern matching (regular expressions)

  • Text search (grep, regex engines)


🧮 2. Type 2 – Context-Free Languages (CFLs)

🔹 Grammar:

Context-Free Grammar (CFG) has rules of the form:

AαA \rightarrow \alpha

where AA is a single non-terminal, and α\alpha is a string of terminals and/or non-terminals.

🔹 Machine:

Recognized by a Pushdown Automaton (PDA)
a finite automaton with a stack (memory).

🔹 Example:

L={anbnn1}L = \{ a^n b^n \mid n \ge 1 \}

Grammar:

S → aSb | ab

🔹 Idea:

The stack is used to “remember” how many as were seen before matching bs.

🔹 Applications:

  • Syntax analysis (parsing) in compilers

  • Programming language grammar definitions

  • Nested structures (like parentheses, HTML tags)


⚙️ 3. Type 1 – Context-Sensitive Languages (CSLs)

🔹 Grammar:

Context-Sensitive Grammar (CSG) has rules of the form:

αAβαγβ\alpha A \beta \rightarrow \alpha \gamma \beta

where:

  • AA is a non-terminal,

  • α,β,γ\alpha, \beta, \gammaare strings of terminals/non-terminals,

  • and γ1|\gamma| \ge 1 (the rule cannot make the string shorter).

So the replacement of AA may depend on its context (the symbols around it).

🔹 Machine:

Recognized by a Linear Bounded Automaton (LBA)
a Turing Machine whose tape is bounded by the input length (it cannot use infinite tape space).

🔹 Example:

L={anbncnn1}L = \{ a^n b^n c^n \mid n \ge 1 \}

Grammar sketch:

S → aSBC | abc CB → BC aB → ab bB → bb bC → bc cC → cc

🔹 Applications:

  • Natural language processing (some human languages are context-sensitive)

  • Theoretical study of resource-bounded computation


🧠 4. Type 0 – Recursively Enumerable Languages

🔹 Grammar:

Unrestricted Grammar
rules can be of the general form:

αβ\alpha \rightarrow \beta

where α,β\alpha, \beta are strings of terminals and non-terminals,
and α\alpha contains at least one non-terminal.

There are no restrictions on how rewriting happens.

🔹 Machine:

Recognized by a Turing Machine (TM).
This is the most general and most powerful computational model.

🔹 Example:

Languages that can be recognized by a Turing machine, e.g.:

L={aibjcki×j=k}L = \{ a^i b^j c^k \mid i \times j = k \}

This cannot be handled by simpler automata.

🔹 Applications:

  • All computable problems

  • Defines the boundary of what can be computed algorithmically


⚖️ Hierarchy Relationships

RegularContext-FreeContext-SensitiveRecursively Enumerable\text{Regular} \subset \text{Context-Free} \subset \text{Context-Sensitive} \subset \text{Recursively Enumerable}

That means:

  • Every regular language is also context-free.

  • Every context-free language is also context-sensitive.

  • Every context-sensitive language is also recursively enumerable.

But the reverse is not true.

Think of the hierarchy as a ladder of computational power:

Regular < Context-Free < Context-Sensitive < Recursively Enumerable (FA) (PDA) (LBA) (TM)

As we move up the hierarchy:

  • The grammar rules become more general.

  • The automaton becomes more powerful.

  • The languages become more complex.



We have also met several other language families that can be fitted into this picture. Including the families of deterministic context-free languages(LDCF) and recursive languages (LREC), we arrive at the extended hierarchy shown in the following figure.


Comments

Popular posts from this blog

Theory Of Computation PCCST302 KTU Semester 3 BTech 2024 Scheme

Non deterministic Finite Automata NFA

Formal Definition - Turing Machine