Theory of Computation PCCST302 KTU 2024 Scheme - Model Question Paper and Answers

 



When you google something, the search engine essentially performs pattern matching: it identifies and retrieves web pages containing the search keyword(s), then displays them in a predetermined (typically relevance-based) order. Cast this pattern matching as a decision problem. Also write down the language corresponding to your decision problem.

🔍 Step 1: What happens when you Google something

When you type a keyword, say "Turing Machine", Google:

  1. Looks through its index of web pages (a huge database of text).

  2. Searches for matches — web pages that contain the keyword "Turing Machine".

  3. Returns those pages (ranked by relevance).

This is a pattern-matching task — find whether the keyword appears in a document or not.


💡 Step 2: Casting this as a Decision Problem

In automata theory, a decision problem is one that has a yes/no answer.

So, we can define the problem as follows:

Decision Problem:
Given a web page (or document) w and a search keyword (pattern) p,
decide whether the keyword p appears anywhere in the web page w.

✅ Output YES — if p appears in w
❌ Output NO — if p does not appear in w


Formal Definition

Let’s define it formally as a decision problem:

PATTERN-MATCHING Problem:
Input: Two strings p and w, where p is the pattern (keyword) and w is the text (web page content).
Question: Does p occur as a substring of w?


🧩 Step 3: Defining the Language

Every decision problem can be represented as a language:
the set of all inputs for which the answer is “YES”.

So, for this problem, the language is:

L_{pattern} = { ⟨p, w⟩ ∣ the pattern p occurs as a substring in w }

That is:

  • Each element of the language is a pair ⟨pattern, text⟩.

  • The pair is in the language if and only if the pattern appears in the text.


🧠 Example

Let’s take:

  • p="data"p = "data"

  • w="Ilovedatascience"w = "I love data science"

Since “data” occurs in the web page text,

"data","Ilovedatascience"Lpattern\langle "data", "I love data science" \rangle \in L_{pattern}

But:

  • p="robot"p = "robot"

  • w="Ilovedatascience"w = "I love data science"

Then,

"robot","Ilovedatascience"Lpattern\langle "robot", "I love data science" \rangle \notin L_{pattern}

⚙️ Step 4: Is this Problem Decidable?

✅ Yes — pattern matching is decidable.

Because:

  • There exists a finite, mechanical procedure (like the Knuth-Morris-Pratt algorithm or even a simple scan)
    that can determine in finite time whether p appears in w.

So, a Turing machine can be constructed to:

  1. Read both p and w

  2. Slide p along w

  3. Compare character by character

  4. Halt and accept if a match is found, else reject
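
To make the procedure concrete, here is a minimal Python sketch of the naive slide-and-compare decider described above (a real search engine would use an index or the KMP algorithm, but any correct finite procedure is enough for decidability):

```python
def in_L_pattern(p: str, w: str) -> bool:
    """Decide whether <p, w> is in L_pattern: does p occur as a substring of w?"""
    if p == "":
        return True                       # the empty pattern occurs everywhere
    for i in range(len(w) - len(p) + 1):  # slide p along w
        # compare character by character at offset i
        if all(w[i + j] == p[j] for j in range(len(p))):
            return True                   # match found: accept
    return False                          # no offset matched: reject

print(in_L_pattern("data", "I love data science"))   # True
print(in_L_pattern("robot", "I love data science"))  # False
```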


📘 Final Answer 

Concept | Description
--- | ---
Decision Problem | Given a pattern p and a text w, determine whether p appears in w.
Language | L_{pattern} = { ⟨p, w⟩ : p occurs as a substring of w }
Output | YES (accept) if p is in w; NO (reject) otherwise.
Decidability | Decidable — a Turing Machine can check this in finite time.


Construct a DFA for the language consisting of strings over the alphabet Σ = {a, b} that contains no more than one occurrence of the string aa. (Note that the string aaa contains two occurrences of aa.)
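
One possible construction (a sketch, not the unique answer): track how many occurrences of aa have appeared so far (0 or 1) and whether the previous symbol was an a. That needs five states:

  • q0: no occurrence yet, previous symbol not a (start state)

  • q1: no occurrence yet, previous symbol a

  • q2: one occurrence, previous symbol a

  • q3: one occurrence, previous symbol not a

  • qd: dead state (a second occurrence appeared)

All states except qd are accepting. A small Python simulation of this transition table (the state names are my own labels):

```python
# Transition table for the sketched DFA.
DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "q2", ("q1", "b"): "q0",  # second a in a row: first occurrence of aa
    ("q2", "a"): "qd", ("q2", "b"): "q3",  # yet another a: second (overlapping) occurrence
    ("q3", "a"): "q2", ("q3", "b"): "q3",
    ("qd", "a"): "qd", ("qd", "b"): "qd",  # dead state: trap
}
ACCEPTING = {"q0", "q1", "q2", "q3"}

def accepts(s: str) -> bool:
    state = "q0"
    for ch in s:
        state = DELTA[(state, ch)]
    return state in ACCEPTING

print(accepts("abaab"))  # True: exactly one occurrence of aa
print(accepts("aaa"))    # False: aaa contains two (overlapping) occurrences
print(accepts("aabaa"))  # False: two occurrences of aa
```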


Is the class of languages recognized by NFAs closed under complement? Explain your answer.
Flipping the accepting and non-accepting states of an NFA does not, in general, yield an automaton for the complement; that trick is only valid for DFAs.
However, the class of languages recognized by NFAs (the regular languages) is closed under complement,
because any NFA can be converted to an equivalent DFA, and DFAs are closed under complement.

Thus, we can always convert the NFA to an equivalent DFA, complement that DFA (swap accepting and non-accepting states), and view the result as an NFA, since every DFA is a special case of an NFA.
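
A minimal sketch of that pipeline in Python, assuming the NFA is given as a dictionary of transition sets with no ε-moves (ε-closure is omitted for brevity):

```python
from itertools import chain

def nfa_to_complement_dfa(alphabet, delta, start, accepting):
    """Subset construction, then flip acceptance.
    delta: dict (state, symbol) -> set of states (an NFA without epsilon-moves).
    Returns a DFA recognizing the COMPLEMENT of the NFA's language."""
    start_set = frozenset([start])
    dfa_delta, seen, todo = {}, {start_set}, [start_set]
    while todo:                            # determinize: explore reachable subsets
        S = todo.pop()
        for a in alphabet:
            T = frozenset(chain.from_iterable(delta.get((q, a), ()) for q in S))
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    # Complement at the DFA level: accept subsets containing NO accepting NFA state
    dfa_accepting = {S for S in seen if not (S & accepting)}
    return dfa_delta, start_set, dfa_accepting

# Illustrative NFA (an assumption): accepts strings over {a,b} ending in 'a'
delta = {("p", "a"): {"p", "q"}, ("p", "b"): {"p"}}
dd, s0, acc = nfa_to_complement_dfa("ab", delta, "p", {"q"})

def complement_accepts(w):
    S = s0
    for ch in w:
        S = dd[(S, ch)]
    return S in acc

print(complement_accepts("ab"))  # True: "ab" does not end in 'a'
print(complement_accepts("ba"))  # False: "ba" ends in 'a'
```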

Give a context-free grammar that generates the language: L_pair = { a^i b^j c^k | i = j or j = k, where i, j, k ≥ 0 }

This language is the union of two simple context-free patterns:

  • L_1 = { a^n b^n c^k | n, k ≥ 0 } (equal numbers of a's and b's, an arbitrary number of c's), and

  • L_2 = { a^i b^n c^n | i, n ≥ 0 } (equal numbers of b's and c's, an arbitrary number of a's).

A straightforward CFG for L_pair = L_1 ∪ L_2 is obtained by giving one nonterminal for each case and making the start symbol choose between them.



Grammar

Nonterminals: S,S1,S2,A,B,C,D
Terminals: {a,b,c}
Start symbol: S

Productions:

S → S1 | S2 
S1 → A B 
A → ε | a A b (generates a^n b^n) 
B → ε | c B (generates c^k) 

S2 → C D 
C → ε | a C (generates a^i) 
D → ε | b D c (generates b^n c^n)
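
As a quick sanity check on what the grammar should generate, here is a direct Python membership test for L_pair (it tests the language definition itself, not the grammar):

```python
import re

def in_L_pair(s: str) -> bool:
    """True iff s = a^i b^j c^k with i == j or j == k."""
    m = re.fullmatch(r"(a*)(b*)(c*)", s)
    if not m:
        return False                     # not of the form a* b* c*
    i, j, k = (len(g) for g in m.groups())
    return i == j or j == k

print(in_L_pair("aabbc"))   # True  (i = j = 2)
print(in_L_pair("abbcc"))   # True  (j = k = 2)
print(in_L_pair("aabcc"))   # False (i != j and j != k)
```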

Construct a PDA to recognize the language generated by the
following grammar:
S → aA
A → aABC | bB | a
B → b
C → c

PDA = (Q, Σ, Γ, δ, q, S, accept-by-empty-stack) where:

  • Q = {q}

  • Σ = {a, b, c}

  • Γ = {S, A, B, C, a, b, c} (we push terminals on the stack too)

  • Start state q, initial stack symbol S. Accept when input is consumed and stack is empty.

Transitions (informal description):

  1. For each production X → α, add an ε-transition that replaces X on top of stack by the string α (pushed so that the leftmost symbol of α becomes the new top at the time of matching). Concretely (we list pop X then push the RHS in reverse so the terminal that should be matched next is on top):

  • δ(q, ε, S) ⊢ (q, A a) (replace S by aA, so push A then a so a is on top)

  • δ(q, ε, A) ⊢ (q, C B A a) (replace A by aABC, push in reverse: C, B, A, a so a is top)

  • δ(q, ε, A) ⊢ (q, B b) (replace A by bB, push B then b)

  • δ(q, ε, A) ⊢ (q, a) (replace A by a)

  • δ(q, ε, B) ⊢ (q, b) (replace B by b)

  • δ(q, ε, C) ⊢ (q, c) (replace C by c)

  2. For matching terminals: when the top of the stack is a terminal equal to the current input symbol, consume it and pop it:

  • δ(q, a, a) ⊢ (q, ε) (read a, pop a)

  • δ(q, b, b) ⊢ (q, ε) (read b, pop b)

  • δ(q, c, c) ⊢ (q, ε) (read c, pop c)

(Other combinations have no moves.)

Trace for input aaabc

Convention: show each configuration as (remaining input, stack) with the stack top on the left. Start with stack containing the start symbol S. Accept by empty stack.

Initial: (aaabc, [S])

  1. Apply S → aA (replace S with aA, pushing A then a so a is top):
    (aaabc, [a, A])

  2. Match terminal a (consume input a, pop a):
    (aabc, [A])

  3. Apply A → aABC (replace A with a A B C, pushing C, B, A, a so a is top):
    (aabc, [a, A, B, C])

  4. Match terminal a (consume a, pop a):
    (abc, [A, B, C])

  5. Apply A → a (replace A with terminal a, push a):
    (abc, [a, B, C])

  6. Match terminal a (consume a, pop a):
    (bc, [B, C])

  7. Replace B → b (push terminal b):
    (bc, [b, C])

  8. Match terminal b (consume b, pop b):
    (c, [C])

  9. Replace C → c (push terminal c):
    (c, [c])

  10. Match terminal c (consume c, pop c):
    (ε, [])

Stack is empty and input is consumed → accepted.
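
This construction can be sketched in a few lines of Python: a configuration is just an (input position, stack) pair, ε-moves expand a nonterminal on top of the stack, and match moves consume input. The nondeterministic choice among the A-productions is handled by backtracking:

```python
GRAMMAR = {                    # X -> list of right-hand sides
    "S": ["aA"],
    "A": ["aABC", "bB", "a"],
    "B": ["b"],
    "C": ["c"],
}

def pda_accepts(w: str, stack=("S",), i=0) -> bool:
    """Single-state PDA, accept by empty stack. Stack top is stack[0]."""
    if not stack:
        return i == len(w)                   # empty stack + consumed input: accept
    top, rest = stack[0], stack[1:]
    if top in GRAMMAR:                       # epsilon-move: expand nonterminal
        return any(pda_accepts(w, tuple(rhs) + rest, i) for rhs in GRAMMAR[top])
    # top is a terminal: match it against the current input symbol and pop it
    return i < len(w) and w[i] == top and pda_accepts(w, rest, i + 1)

print(pda_accepts("aaabc"))  # True: reproduces the trace above
print(pda_accepts("aab"))    # False: the grammar does not generate aab
```

Because every production here starts with a terminal, each expansion is immediately checked against the input, so the backtracking search always terminates.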

If G is a context free grammar and w is a string of length l in L(G), how long is a derivation of w in G, if G is in Chomsky normal form? How would your answer change if G is in Greibach normal form?

  • If G is in Chomsky Normal Form (CNF) and w ∈ L(G) has length |w| = l > 0, then any complete derivation of w uses exactly 2l - 1 production applications (steps). (If w = ε and the CNF grammar allows S → ε as the only ε-rule, that derivation has 1 step.)

  • If G is in Greibach Normal Form (GNF) and |w| = l, then there is a leftmost derivation of w with exactly l steps (one production per terminal). GNF cannot generate ε except by special convention, so the l = 0 case is separate.

Brief proofs / reasoning:

CNF. In CNF every non-start production is either A → BC or A → a. Let x be the number of uses of A → BC in a derivation and y the number of uses of A → a. Each A → BC increases the number of nonterminals in the sentential form by 1, while each A → a decreases it by 1. Starting with one nonterminal S and ending with zero nonterminals gives

1 + x - y = 0, hence y = x + 1.

But every terminal in the final string is produced by an A → a application, so y = l. Hence x = l - 1 and the total number of productions applied is

x + y = (l - 1) + l = 2l - 1.

GNF. In GNF every production has the form A → aα, where a is a terminal and α is a (possibly empty) string of nonterminals. In a leftmost derivation each application replaces the leftmost nonterminal by a production whose first symbol is a terminal, thereby producing exactly one terminal in the leftmost part of the sentential form. So to produce l terminals you need exactly l leftmost steps. Thus there is a leftmost derivation of length exactly l.
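
The 2l - 1 count is easy to verify mechanically. The sketch below brute-forces leftmost derivations in a small CNF grammar for { a^n b^n : n ≥ 1 } (the grammar itself is an illustrative assumption) and reports the derivation length:

```python
from collections import deque

# A CNF grammar for {a^n b^n : n >= 1} (chosen for illustration):
#   S -> AB | AC,  C -> SB,  A -> a,  B -> b
CNF = {
    "S": [("A", "B"), ("A", "C")],
    "C": [("S", "B")],
    "A": [("a",)],
    "B": [("b",)],
}

def derivation_length(target: str):
    """Breadth-first search over leftmost derivations; returns the number
    of production applications needed to derive `target` (or None)."""
    queue = deque([(("S",), 0)])
    while queue:
        form, steps = queue.popleft()
        if len(form) > len(target):          # CNF forms never shrink: prune
            continue
        i = next((k for k, sym in enumerate(form) if sym in CNF), None)
        if i is None:                        # all terminals: check the result
            if "".join(form) == target:
                return steps
            continue
        if "".join(form[:i]) != target[:i]:  # terminal prefix must match
            continue
        for rhs in CNF[form[i]]:             # expand the leftmost nonterminal
            queue.append((form[:i] + rhs + form[i + 1:], steps + 1))
    return None

for n in (1, 2, 3):
    w = "a" * n + "b" * n
    print(w, derivation_length(w), "expected", 2 * len(w) - 1)
```

The GNF claim can be checked against the PDA question above: that grammar is in GNF, and deriving aaabc (l = 5) used exactly 5 productions (S → aA, A → aABC, A → a, B → b, C → c).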

For each of the following decision problems about a Turing machine M, indicate whether it is decidable or not.
1. Does M take more than 1008 steps on some input?
2. Does M accept ε?
3. Is L(M) context free?

1. Algorithm: enumerate all strings x with |x| ≤ 1008 (finitely many; this suffices because in 1008 steps M can read at most the first 1008 input cells, so every longer input behaves like its length-1008 prefix); simulate M on each x for 1009 steps. If any simulation reaches step 1009, answer YES; if every simulation halts (accepting or rejecting) within 1008 steps, answer NO. This procedure always halts and is correct, hence the property is decidable.
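
A sketch of this decider in Python, assuming a hypothetical step-bounded simulator steps_used(M, x, cap) supplied by the caller (Python has no built-in Turing machine simulator):

```python
from itertools import product

def takes_more_than_1008_steps(M, alphabet, steps_used) -> bool:
    """Decider sketch. `steps_used(M, x, cap)` is an assumed helper that
    runs M on input x for at most `cap` steps and returns the number of
    steps actually performed (capped at `cap`)."""
    # Finitely many inputs to test (astronomically many, but decidability
    # only needs finiteness): lengths 0..1008 over the input alphabet.
    for length in range(1009):
        for chars in product(alphabet, repeat=length):
            if steps_used(M, "".join(chars), cap=1009) >= 1009:
                return True     # M took a 1009th step on this input
    return False                # every input halts within 1008 steps
```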

2. This is the classical acceptance problem restricted to a fixed input. The general acceptance problem
A_TM = { ⟨M, w⟩ ∣ M accepts w } is undecidable. Reduce from it: given ⟨M, w⟩, build a machine M' that on input ε first writes w on its tape (or otherwise simulates having w as input) and then simulates M on w; M' accepts ε iff M accepts w. Thus a decider for “does a TM accept ε?” would decide A_TM, a contradiction. So it is undecidable.

3. This is a nontrivial property of the (recursively enumerable) language of a TM. By Rice’s theorem, any nontrivial semantic property of the language recognized by a TM (a property that holds for some recursively enumerable languages and fails for others) is undecidable. Being context-free is such a nontrivial property: there exist RE languages that are context-free and RE languages that are not. Therefore it is undecidable to test, given ⟨M⟩, whether L(M) is a context-free language.

Let L1 and L2 be Turing-recognizable languages over the same alphabet Σ. Prove that L1 ∩ L2 is also Turing-recognizable.

💡 Step 1. What it means to be Turing-recognizable

A language L is Turing-recognizable (or recursively enumerable) if there exists a Turing Machine (TM) that:

  • Accepts every string in L.

  • May run forever (does not have to halt) for strings not in L.

So the machine doesn’t have to decide; it only needs to recognize.


💡 Step 2. Given two such languages

Let:

  • M1 recognize L1

  • M2 recognize L2

We want a new TM M that recognizes L1 ∩ L2, i.e.,
it accepts a string w if and only if both M1 and M2 accept w.


💡 Step 3. The key idea — run both machines in parallel

If we just run M1M_1 completely first, it might never halt!
So instead, we simulate both gradually:

  • On input w:

    1. Run M1 for 1 step.

    2. Run M2 for 1 step.

    3. Run M1 for another step.

    4. Run M2 for another step.

    5. Continue alternating forever (this is called dovetailing).

Whenever both machines have accepted w, our new machine accepts (sketched below).
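
A Python sketch of the dovetailed simulation, modelling each machine as a generator that yields once per simulated step (the generators and toy “machines” are illustrative assumptions, not real TMs):

```python
def recognize_intersection(run1, run2, w):
    """Dovetailed recognizer sketch for L1 ∩ L2.

    run1(w) and run2(w) are assumed step-wise simulations of M1 and M2:
    generators that yield None after each step and yield True once the
    machine accepts (a non-accepting run may yield None forever)."""
    m1, m2 = run1(w), run2(w)
    acc1 = acc2 = False
    while not (acc1 and acc2):
        if not acc1:
            acc1 = next(m1) is True    # one step of M1
        if not acc2:
            acc2 = next(m2) is True    # one step of M2
    return True                        # both accepted, so w is in L1 ∩ L2

# Toy stand-ins for M1 and M2:
def make_runner(pred, delay):
    def run(w):
        def gen():
            for _ in range(delay):     # pretend to compute for `delay` steps
                yield None
            while not pred(w):         # never accept: run forever
                yield None
            yield True                 # accept
        return gen()
    return run

run1 = make_runner(lambda w: "ab" in w, delay=3)        # L1: contains "ab"
run2 = make_runner(lambda w: len(w) % 2 == 0, delay=5)  # L2: even length
print(recognize_intersection(run1, run2, "abba"))       # True: w is in both
# For a w not in both languages the call may run forever, as a recognizer may.
```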


💡 Step 4. Why this works

  • If w is in both L1 and L2:
    Eventually, both M1 and M2 will accept — so our new machine will also accept.

  • If w is not in both:
    At least one of M1 or M2 will never accept,
    so our machine will never reach the “both accepted” condition — it runs forever.
    (That’s allowed for recognizers.)


Therefore:
The intersection of two Turing-recognizable languages is also Turing-recognizable.

