可能的重点：

必会：

CFG 定义（四元组）
写 grammar（给语言写 CFG）
画 parse tree
判断 ambiguity
leftmost / rightmost derivation

进阶：

CFL vs Regular 区别
PDA 基本理解
简单构造 grammar

Chomsky Hierarchy

类型	名称	工具
Type 3	Regular Language	FSM
⭐ Type 2	Context-Free Language	CFG + PDA
Type 1	Context-Sensitive	LBA
Type 0	Recursively Enumerable	Turing Machine

🔹包含关系：$L₀ ⊃ L₁ ⊃ L₂ ⊃ L₃$

🔹 若一个 grammar 是 regular，则 it must be context-free

Definition of Context-Free Grammar

A CFG is a 4-tuple：$G = (Vₙ, Vₜ, P, S)$

符号	含义	示例
$Vₙ$	Non-terminal variables（非终结符）	$A, B, S, <expr>$
$Vₜ$	Terminal symbols（终结符）	$a, b, 0, 1, +, id$
$P$	Productions（产生式规则）	$A → 0A1 \| B$
$S$	Start variable（开始符号）	$S ∈ Vₙ$

约束条件：$V = Vₙ ∪ Vₜ，Vₙ ∩ Vₜ = ∅$

最关键的是 $A \rightarrow \beta$ （左边只有一个非终结符）

Derivation

Write down the start variable
Find a variable + matching rule，replace with RHS
Repeat until no variables remain

符号表示

⇒：direct derivation（直接推导）
⇒*：derives（零步或多步推导）

CFL 属于一种生成语言 (generate strings) CFG 这种特殊的递归语法结构可以推导出任意符合相应语法的字符串

Parse Tree（语法分析树）

Root：starting symbol
Leaf：terminal symbols（no children）
Interior node：non-terminals（with children）

Parse tree 是 derivation 的 alternative representation

两种推导方式

LL → leftmost 每次展开最左边的非终结符
LR → rightmost 每次展开最右边的非终结符

Ambiguity Problem

歧义问题，顾名思义是，有歧义针对相同的字符串，可能有不同的理解定义是：A CFG G is ambiguous if it generates some strings with more than one distinct parse tree

对于相同的字符串有多个语法分析树，语法分析树不同，编译的时候执行的先后顺序就不同，就会有 Ambiguity Problem

例如：

<EXPR> → <EXPR>+<EXPR> | <EXPR>×<EXPR> | a

对于 a+a×a，存在 two different parse trees：

Tree 1: (+ at root)        Tree 2: (× at root)
     +                           ×
    / \\                         / \\
   a   ×                       +   a
      / \\                     / \\
     a   a                   a   a

编译器必须避免 ambiguity

在编译的时候，我们要保证 Each program must have a unique meaning, Interpretation depends on parse tree 所以我们不能有这种 Ambiguity Problem

消除 Ambiguity Problem 的方法

引入优先级层次

🔸 注意：No algorithm 能将 arbitrary ambiguous CFG 转为 unambiguous

🔸 某些 CFL 是inherently ambiguous（本质上二义）

Pushdown Automaton (PDA) 下推自动机

CFL 对应的机器： PDA（带栈的自动机）

Formal Definition（6-tuple）

$PDA = (S, Σ, Γ, δ, s₀, F)$

符号	含义
$S$	finite set of states
$Σ$	input alphabet
$Γ$	stack alphabet
$δ$	transition: $S×Σₑ×Γₑ → P(S×Γₑ)$
$s₀$	start state
$F$	accept states

核心思想是：多了一个 stack

所以可以：

记住数量
处理嵌套结构

PDA for $L = \{0ⁿ1ⁿ | n≥0\}$

策略：

读 0 → push 0 onto stack
读 1 → pop 0 from stack
结束且 stack empty → accept
否则 → reject

注意这个地方字符串的特点是，前半部分是 0 后半部分是 1，两部分字符的个数相同核心思想就是处理字符串的时候用栈来记录 0 的数量，遇到 0 入栈，等到轮到 1 的时候遇到 1 出栈就能保证两部分数量相同

CFG 与 PDA 的等价性

For every CFG, ∃ PDA that recognizes the same language For every PDA, ∃ CFG that generates the same language

Context-Free Language

必会：

进阶：

Chomsky Hierarchy

Definition of Context-Free Grammar

Derivation

符号表示

Parse Tree（语法分析树）

两种推导方式

Ambiguity Problem

编译器必须避免 ambiguity

消除 Ambiguity Problem 的方法

Pushdown Automaton (PDA) 下推自动机

Formal Definition（6-tuple）

PDA for $L = \{0ⁿ1ⁿ | n≥0\}$

CFG 与 PDA 的等价性

关键结论

Comments

必会：

进阶：

Chomsky Hierarchy

Definition of Context-Free Grammar

Derivation

符号表示

Parse Tree（语法分析树）

两种推导方式

Ambiguity Problem

编译器必须避免 ambiguity

消除 Ambiguity Problem 的方法

Pushdown Automaton (PDA) 下推自动机

Formal Definition（6-tuple）

PDA for $L = \{0ⁿ1ⁿ | n≥0\}$

CFG 与 PDA 的等价性

关键结论

Comments

Scan to share on WeChat