Posts

Showing posts from June, 2025

Formal definition of Context-Free Grammer

 Here’s the formal definition of a Context-Free Grammar (CFG). Definition A Context-Free Grammar is a 4-tuple: G = ( V , Σ , R , S ) G = (V, \Sigma, R, S) where: V V  — A finite set of nonterminal symbols (or variables). Σ \Sigma  — A finite set of terminal symbols , such that V ∩ Σ = ∅ V \cap \Sigma = \varnothing R R  — A finite set of production rules of the form A → α A \rightarrow \alpha where A ∈ V A \in V  and α ∈ ( V ∪ Σ ) ∗ \alpha \in (V \cup \Sigma)^*  (that is, α \alpha  is any finite string of terminals and/or nonterminals, possibly empty). S S  — A distinguished element of V V , called the start symbol . ✅ Key condition that makes it “context-free” : In every production A → α A \rightarrow \alpha , the left-hand side consists of exactly one nonterminal symbol A . Characteristics of CFGs Each production’s LHS must be a single nonterminal (unlike context-sensitive grammars). They can describe nested a...

Regular Grammers

Image
  Regular Grammar A Regular Grammar is the most restricted type of grammar in the Chomsky Hierarchy . It generates exactly the class of regular languages (the same languages accepted by finite automata and described by regular expressions ). Formally, a grammar G = ( V , Σ , R , S ) G = (V, \Sigma, R, S)  is regular if all productions in R R  are of one of the following forms: Right-linear form : A → a B or A → a or A → ε A \rightarrow aB \quad \text{or} \quad A \rightarrow a \quad \text{or} \quad A \rightarrow \varepsilon where A , B ∈ V A, B \in V , a ∈ Σ a \in \Sigma . Left-linear form : A → B a or A → a or A → ε A \rightarrow Ba \quad \text{or} \quad A \rightarrow a \quad \text{or} \quad A \rightarrow \varepsilon Note: A grammar must be entirely right-linear or left-linear to be regular. Mixing both may lead to a CFG that is not regular. Intuition Regular grammars correspond to finite automata (DFA/NFA). Productions represent transitions ...

Designing Context-Free Grammers

Designing a Context-Free Grammar (CFG) is really about thinking recursively and breaking a language into smaller, replaceable parts. Here’s a step-by-step method explaining how to design CFGs. 1. Understand the Structure of the Language Identify patterns in the strings you want to generate. Look for recursion or nesting (these are strong hints that the language is context-free). Ask: “What must always match or balance?” Example 1: Balanced a ’s and b ’s: { a n b n   ∣   n ≥ 0 } \{ a^n b^n \ | \ n \ge 0 \} { a n b n   ∣   n ≥ 0 } Pattern: Same number of a ’s followed by same number of b ’s. Recursive idea: If aabb is in the language, then so is aa…bb with more pairs. 2. Choose Nonterminals for Main Components Start with S (start symbol). If the language has subparts, create separate variables for them. Example 2: Arithmetic expressions might need: V = { E , T , F } V = \{ E, T, F \} where E = expression, T = term, F = fa...

Ambiguous Grammars

Ambiguity in grammars is a very important concept in Context-Free Grammars (CFGs) , especially because it directly affects parsing, compiler design, and language clarity .  🔹 What is Ambiguity in a Grammar? A context-free grammar (CFG) is said to be ambiguous if at least one string in the language it generates has more than one distinct parse tree (or equivalently, more than one leftmost or rightmost derivation). 👉 Formally: A CFG G G  is ambiguous if ∃    w ∈ L ( G ) such that w       has two or more different parse trees. \exists \; w \in L(G) \quad \text{such that} \quad w \;\; \text{has two or more different parse trees.} 🔹 Why is Ambiguity a Problem? In programming languages , ambiguity makes parsing uncertain : Example: Does a + b * c mean (a + b) * c or a + (b * c) ? Compilers require deterministic parsing → ambiguous grammars must be avoided or rewritten. 🔹 Example of an Ambiguous Grammar Consider the C...

Resolving Ambiguity

  🔹 What Does Resolving Ambiguity Mean? A CFG is ambiguous if some string has more than one parse tree. To resolve ambiguity , we rewrite the grammar so that every valid string has exactly one parse tree (i.e., unambiguous grammar). 🔹 Methods to Resolve Ambiguity ✅ (a) Enforce Operator Precedence Ensure * is evaluated before + . Example: Ambiguous grammar: E → E + E    ∣    E ∗ E    ∣    ( E )    ∣    a E \to E + E \;|\; E * E \;|\; (E) \;|\; a Unambiguous grammar (with precedence): E → E + T    ∣    T E \to E + T \;|\; T T → T ∗ F    ∣    F T \to T * F \;|\; F F → ( E )    ∣    a F \to (E) \;|\; a Now in a + a * a , multiplication ( * ) happens before addition ( + ). So only one parse tree is possible. ✅ (b) Enforce Operator Associativity Ambiguity also arises in expressions like a - b - c . Should it be (a - b) - c (left-associative) or a - (b - c) (right-associative)? We rewrite grammar for left-associativity : E → E − T    ∣    T E \to E - ...

CFGs in Programming Languages

  🔹 Why CFGs in Programming Languages? Programming languages (like C, Java, Python) have complex syntax : Nested structures: if … then … else … Balanced symbols: { } , ( ) , [ ] Expressions with operator precedence and associativity Loops, function calls, recursive definitions 👉 Regular grammars (or finite automata) are not powerful enough to handle such constructs. CFGs are powerful enough because they naturally describe hierarchical and nested structures . 🔹  CFG as the Basis of Syntax A Context-Free Grammar formally defines the syntax rules of a programming language. Example (simplified arithmetic grammar): E → E + T ∣ T E \to E + T \mid T T → T ∗ F ∣ F T \to T * F \mid F F → ( E ) ∣ id F \to (E) \mid \text{id} This grammar: Defines valid arithmetic expressions. Enforces precedence ( * before + ) and associativity (left-to-right). ✅ So, id + id * id is valid. ❌ But + * id id is not. 🔹 Role in Compiler Design Compilers use CFGs i...