1. What is a Symbol Table?

符号表是什么？

Symbol Table is one of the most important data structures in a compiler.

符号表是编译器中最重要的数据结构之一。

It stores information about identifiers in the source program.

它用来存储源程序中各种 标识符（identifiers） 的信息。

A symbol table records the names used in a program and their associated attributes.

符号表记录程序中出现的名字，以及这些名字相关的属性信息。

2. Why do we need a Symbol Table?

为什么需要符号表？

Programs contain many named entities, such as:

variables
procedures / functions
data objects
classes

程序中有很多被命名的对象，比如：

变量
过程 / 函数
数据对象
类

The compiler must keep track of them.

编译器必须跟踪并管理这些名字。

核心理解

编译器不是只“看到一个名字”就结束了，它必须知道：

this identifier refers to what

这个标识符到底代表什么
what type it has

它是什么类型
where it is valid

它在哪个作用域有效
where it is stored

它存在哪里

3. Main functions of the Symbol Table

符号表的主要作用

(1) Store declarations

保存声明信息

The compiler stores information about declared objects.

编译器要保存程序中所有被声明对象的信息。

(2) Support identifier checking

支持标识符检查

When an identifier is used, the compiler checks:

whether it has been declared
whether it is used correctly
whether the type is correct

当一个标识符被使用时，编译器要检查：

是否已经声明
使用是否合法
类型是否正确

(3) Help code generation

帮助代码生成

The symbol table may store the store address of an object.

符号表还可能保存对象的 存储地址，方便后续生成目标代码。

4. What information is stored?

符号表中存什么？

这个是考试很容易考的重点。

(1) Identifier itself

标识符本身

The symbol table must store the identifier name itself.

符号表必须保存标识符名字本身。

Two possible methods:

store the string directly

直接存字符串
store a pointer to a separate string area

存指针，指向独立字符串区

难点理解

直接存字符串简单，但可能浪费空间。

Using a separate string area is more space-efficient.

使用独立字符串区通常更节省空间。

(2) Type information

类型信息

The symbol table stores type information.

符号表要保存 类型信息。

Examples:

Boolean
Integer
Float

More complex cases:

arrays
procedures / functions
records

Arrays

For an array, we may need:

number of subscripts / dimensions
subscript type
range of subscripts

对于数组，可能要存：

维数 / 下标个数
下标类型
下标范围

Procedures / Functions

For a procedure or function, we may need:

number of parameters
parameter types
return type (for functions)

对于过程 / 函数，可能要存：

参数个数
参数类型
返回值类型（函数）

Records

For a record, we may need:

field names
field types

对于记录体 / 结构体，可能要存：

字段名
字段类型

(3) Scope and lifetime

作用域和生存期

The symbol table stores scope and lifetime information.

符号表要保存 作用域（scope） 和 生存期（lifetime） 信息。

这两个词要分清：

scope = where the identifier can be used

作用域 = 这个名字在哪些地方能用
lifetime = how long the object exists

生存期 = 这个对象存在多久

(4) Store address

存储地址

The symbol table may also store the store address.

符号表还可以保存对象的存储地址。

(5) Other information

其他信息

Possible extra information includes:

pointers for printing identifiers in alphabetical order
lists of declaration/use positions

额外还可能保存：

为了按字母顺序输出标识符而设置的指针
标识符声明 / 使用位置列表

5. Efficiency requirements

效率要求

这部分很容易变成简答题。

Searching must be efficient

查找必须高效

The compiler repeatedly searches the symbol table when identifiers appear.

编译器每次遇到标识符时，都可能要查符号表。

Adding new entries must also be efficient

插入新条目也必须高效

When a new identifier is encountered, it must be added quickly.

当第一次遇到新标识符时，需要高效插入。

Meaning:

A symbol table must support efficient lookup and efficient insertion.

符号表必须支持高效查找和高效插入。

6. How is it used during compilation?

编译过程中怎么使用符号表？

课件提到两种方式。

Method 1

Lexical analyzer recognizes identifiers, and syntax analyzer accesses the symbol table.

词法分析器先识别 identifier，再由语法分析器去访问符号表。

Method 2

The lexical analyzer directly searches/inserts into the symbol table and returns a reference.

词法分析器直接查找/插入符号表，然后把表项引用返回给语法分析器。

你要抓住的点

无论哪种方式，本质都一样：

identifier is recognized
symbol table is searched
new entry may be inserted
later phases use the stored information

也就是：

先识别名字
再查表
必要时插入
后续阶段继续使用这些信息

7. Pre-loading the symbol table

预先装入符号表

The symbol table may be partially built before compilation starts.

符号表可以在正式编译前先装入一些内容。

Examples:

standard I/O routines in Pascal
imported package symbols in Java

例如：

Pascal 的标准输入输出函数
Java 中 import 进来的包符号

重点理解

符号表不一定只放“用户自己写的名字”，也可以预装：

built-in names
library routines
imported symbols

8. Data structures for implementation

符号表可用什么数据结构实现？

The PPT mentions:

binary trees
hash tables

Meaning:

A symbol table can be implemented using data structures such as binary trees or hash tables.

符号表可以用二叉树或哈希表等数据结构实现。

难点提醒

这里老师更看重的是你知道：

why efficiency matters

为什么效率重要
why these structures are suitable

为什么这些结构适合高效查找和插入

9. Scope, Block, and Scoping Rules

作用域、块、作用域规则

这部分是 重点 + 难点。

Scope

Every declaration has a scope.

每个声明都有自己的 作用域。

Block

A block is a program unit that defines the scope of declarations.

块（block） 是限定声明作用范围的程序单元。

Scoping rules

Scoping rules determine which declaration an identifier use refers to.

作用域规则 决定某次标识符使用到底对应哪一个声明。

中文直白理解

当程序里同名变量很多时，编译器必须根据 block 和 scoping rules 判断：

现在看到的这个 x，到底是哪一个 x。

10. Three block structures

三种块结构

这个地方很容易考比较题。

(1) Monolithic blocks

The whole program is one block.

整个程序只有一个块。

Characteristics:

an identifier can be declared only once
every identifier must be declared
no local scope

特点：

标识符只能声明一次
每个标识符都必须先声明
没有局部作用域

(2) Flat blocks

There are global and local blocks, but no deeper nesting.

有全局和局部两层，但没有更深嵌套。

Characteristics:

global declarations cannot be duplicated
local declarations in the same block cannot be duplicated
an identifier must have either a local or global declaration

特点：

全局声明不能重复
同一局部块内不能重复声明
使用时必须能在局部或全局找到声明

(3) Nested blocks

Blocks can be nested inside other blocks.

块可以嵌套在别的块里。

Used in languages such as:

Pascal
Ada
C
Java

Characteristics:

an identifier can only be declared once in the same block
when used, it must be declared in the current block or an enclosing block

特点：

同一个块中一个名字只能声明一次
使用时必须能在当前块或外层块中找到声明

11. Most-Closely Nested Rule

12. What must the Symbol Table do in nested scopes?

在嵌套作用域下，符号表必须完成什么任务？

(1) Find the correct declaration

找到正确声明

Given an occurrence of an identifier, the symbol table must determine which declaration it refers to.

给定一个标识符出现位置，符号表必须判断它对应哪个声明。

(2) Prevent duplicate declarations at the same lexical depth

防止同一词法层次重复声明

An identifier should not be declared more than once at the same lexical depth.

在同一 lexical depth 中，同名标识符不能重复声明。

(3) Handle hiding / shadowing correctly

正确处理遮蔽

The symbol table must allow inner declarations to hide outer ones according to scoping rules.

符号表必须根据作用域规则正确处理内层对外层的遮蔽。

13. Simple stack organization

简单栈式组织

The PPT mentions a simple stack organization for local variables.

课件提到可以用一种简单的 栈式组织 来处理局部变量。

Idea:

when entering a block/procedure, allocate space
when leaving, deallocate it

思想是：

进入一个块/过程时分配空间
离开时回收空间

难点理解

课件进一步指出：

it may be better not to physically remove symbol table entries, but to make them inaccessible outside the block.

更好的做法不一定是把表项物理删除，而是让它们在块外 不可访问。

你要会讲

The important issue is visibility, not necessarily physical deletion.

关键问题是“可见性”，不一定是“物理删除”。

14. 考前最该背的高频句

下面这些句子很适合考试直接写。

定义类

A symbol table is a compiler data structure used to store identifiers and their attributes.

符号表是编译器中用来存储标识符及其属性的数据结构。

作用类

It supports declaration processing, semantic checking, and code generation.

它支持声明处理、语义检查和代码生成。

信息类

A symbol table may store identifier names, type information, scope, lifetime, and store addresses.

符号表可以存储标识符名字、类型信息、作用域、生存期和存储地址。

效率类

Efficient lookup and insertion are essential for symbol table implementation.

高效查找和高效插入是符号表实现中的关键要求。

作用域类

Scoping rules determine which declaration is associated with a given identifier use.

作用域规则决定一个标识符使用对应哪个声明。

15. 重点与难点总表

Key Points vs Difficult Points

Key Points 重点

definition of symbol table

符号表定义
functions of symbol table

符号表作用
information stored in symbol table

符号表保存的信息
efficiency requirements

效率要求
possible data structures

可用数据结构
scope / block / scoping rules

作用域 / 块 / 作用域规则
monolithic, flat, nested blocks

三种块结构
most-closely nested rule

最近嵌套规则

Difficult Points 难点

difference between scope and lifetime

scope 和 lifetime 的区别
how identifier lookup works in nested blocks

嵌套块中如何查找标识符
how shadowing works

遮蔽如何发生
why symbol table entries may remain but become inaccessible

为什么条目可以不删除，但块外不可见

16. 一段考前速记版

Final Exam Quick Review

Symbol Table is a key compiler data structure that stores identifiers and their attributes.

符号表 是编译器核心数据结构，用于保存标识符及其属性。

Its main functions are:

storing declarations
checking identifier usage
supporting code generation

它的主要作用是：

保存声明信息
检查标识符使用
支持代码生成

A symbol table may contain:

identifier name
type information
scope
lifetime
store address

符号表中可能包含：

标识符名字
类型信息
作用域
生存期
存储地址

It must support efficient lookup and efficient insertion.

它必须支持 高效查找 和 高效插入。

In nested block languages, identifier lookup follows the most-closely nested rule: the nearest enclosing declaration is used.

在嵌套块语言中，标识符查找遵循 最近嵌套规则：使用最近的外层声明。

1. What is a Symbol Table?

符号表是什么？

2. Why do we need a Symbol Table?

为什么需要符号表？

核心理解

3. Main functions of the Symbol Table

符号表的主要作用

(1) Store declarations

(2) Support identifier checking

(3) Help code generation

4. What information is stored?

符号表中存什么？

(1) Identifier itself

难点理解

(2) Type information

Arrays

Procedures / Functions

Records

(3) Scope and lifetime

这两个词要分清：

(4) Store address

(5) Other information

5. Efficiency requirements

效率要求

Searching must be efficient

Adding new entries must also be efficient

Meaning:

6. How is it used during compilation?

编译过程中怎么使用符号表？

Method 1

Method 2

你要抓住的点

7. Pre-loading the symbol table

预先装入符号表

重点理解

8. Data structures for implementation

符号表可用什么数据结构实现？

Meaning:

难点提醒

9. Scope, Block, and Scoping Rules

作用域、块、作用域规则

Scope

Block

Scoping rules

中文直白理解

10. Three block structures

三种块结构

(1) Monolithic blocks

(2) Flat blocks

(3) Nested blocks

11. Most-Closely Nested Rule

最近嵌套规则

中文解释

关键词

例子理解

12. What must the Symbol Table do in nested scopes?

在嵌套作用域下，符号表必须完成什么任务？

(1) Find the correct declaration

(2) Prevent duplicate declarations at the same lexical depth

(3) Handle hiding / shadowing correctly

13. Simple stack organization

简单栈式组织

难点理解

你要会讲

14. 考前最该背的高频句

定义类

作用类

信息类

效率类

作用域类

最近嵌套规则

15. 重点与难点总表

Key Points vs Difficult Points

Key Points 重点

Difficult Points 难点

16. 一段考前速记版

Final Exam Quick Review

Comments

Scan to share on WeChat