Skip to content

Commit

Permalink
Update lec7
Browse files Browse the repository at this point in the history
  • Loading branch information
RangerNJU authored and LaplaceDem0n committed Dec 4, 2020
1 parent 9eb2d16 commit a298040
Show file tree
Hide file tree
Showing 42 changed files with 333 additions and 187 deletions.
Binary file added .gitbook/assets/Ex4-1.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added .gitbook/assets/Ex4-2.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added .gitbook/assets/Ex4-3.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added .gitbook/assets/Ex4-4.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added .gitbook/assets/Ex4-5.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added .gitbook/assets/Ex4-6.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
16 changes: 15 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,20 @@ Getting started with static program analysis. Read this and start writing your f

静态程序分析入门。阅读此书并着手编写你的第一个静态程序分析器吧!

## 我想听到你的声音

- **批评的意见很有价值。** 这是我第一次书写教程,一定有很多做得不好的地方。如果你觉得我写得不好,可以选择提issue或者通过邮箱联系我(ranger.nju#gmail.com)。
- 如果你觉得我写得不错,可以到GitHub仓库中给我一个Star,也可以在自己的社交圈子中宣传,让更多的人了解这个项目。


## 更新记录与里程碑事件记录

1. Oct, 2020. 设立Github Repo
2. 16th, Oct. 第一次得到Star,第一次被Fork
3. 29th, Oct . 更新第七课《过程间分析》至第四章
4. 30th, Oct. 处理第一次PR,更新文档结构


# 这一《静态程序分析》教程对谁有用?

学生,开发者,研究者……几乎所有当代生活者都能从中受益。
Expand Down Expand Up @@ -54,7 +68,7 @@ Getting started with static program analysis. Read this and start writing your f
- 空指针引用与内存泄漏等:几乎每个程序编写者都被这两个问题所困扰过
2. 提高程序安全性
- Private information leak, injection attack, etc.
- 隐私信息泄漏:TODO
- 隐私信息泄漏:这一问题在移动应用中较为普遍,如果你感兴趣,可以参考[这篇论文](https://www.ieee-security.org/TC/SP2012/posters/ScanDal.pdf)
- [注入攻击](https://en.wikipedia.org/wiki/Code_injection):这是网络安全中非常常见的议题。不熟悉的读者可以查看[W3School](https://www.w3schools.com/sql/sql_injection.asp)[Wiki](https://en.wikipedia.org/wiki/SQL_injection)上关于SQL注入攻击的例子。
3. 为编译优化提供基础技术
- Dead code elimination, code motion, etc.
Expand Down
10 changes: 2 additions & 8 deletions SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,13 +8,7 @@
* [初见——静态分析是什么?](ch1/01-01-whats-spa.md)
* [中间表示——静态分析器的输入](ch1/01-02-ir.md)
* [第二章-数据流分析——应用](ch2/README.md)
* [数据流分析](ch2/02-00-dataflow-analysis.md)
* [到达定值分析](ch2/02-01-reaching-def-analysis.md)
* [活跃变量分析](ch2/02-02-live-var-analysis.md)
* [可用表达式分析](ch2/02-03-available-exp-analysis.md)
* [第三章-数据流分析——理论](ch3/README.md)
* [数据流分析](ch3/03-00-dataflow-analysis.md)
* [到达定值分析](ch3/03-01-reaching-def-analysis.md)
* [活跃变量分析](ch3/03-02-live-var-analysis.md)
* [可用表达式分析](ch3/03-03-available-exp-analysis.md)
* [第四章-过程间分析](ch4/README.md)
* [过程间分析简介](ch4/04-01-inter-analysis-spa.md)

10 changes: 10 additions & 0 deletions ch1/01-01-whats-spa.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
# 初见——静态分析是什么?

**WARNING:即将进入施工未完毕区域。**

> 静态程序分析是指**不编译**出二进制代码通过测试用例对程序进行测试,仅通过**静态地**分析程序得到程序**不平凡**的性质的过程。
## 静态程序分析的抽象定义与诠释
Expand Down Expand Up @@ -138,3 +140,11 @@ complete: 报出的问题都是对的 must analysis: outputs information that mu

### 关于判定问题中经常用到的术语

在计算机的判定性问题中和各种医学诊断中,都会提到这几种概念:
1. True positive
2. True negative
3. False positive
4. False negative

[Youtube上](https://www.youtube.com/watch?v=rFwu69tuMiU)有一个简单的介绍视频。

8 changes: 8 additions & 0 deletions ch1/01-02-ir.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,13 @@
# 中间表示——静态分析器的输入

如果你学习过编译原理课程,能理解:

> 静态分析器的输入是线型IR(三地址码3AC),而非树型IR(抽象语法树AST)
可以直接跳过本小节。

**WARNING:即将进入施工未完毕区域。**

## 从编译器的组件谈起

一个典型的编译器分成一下几个部分:
Expand Down
14 changes: 10 additions & 4 deletions ch1/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,14 @@

在这一部分中,将正式地介绍:

- 什么是静态程序分析(下简称为静态分析)
- 这一技术有什么样的应用
- 为什么它值得我们去学习与研究
-
- 什么是静态程序分析(下简称为静态分析)?
- 如何设计一个实用的静态程序分析器?

**注:第一章到第三章在B站上有相应的视频。不建议读者现在阅读一到三章的内容。在作者腾出时间整理文稿之前,建议读者先到B站观看相应的视频。**

对应的视频在这里:
- [第二课-IR](https://www.bilibili.com/video/BV1zE411s77Z)

[这里](https://ranger-nju.gitbook.io/static-program-analysis-book/ch4)直接跳转到施工完毕区域(第四章)。

**WARNING:即将进入施工未完毕区域。**
9 changes: 9 additions & 0 deletions ch2/README.md
Original file line number Diff line number Diff line change
@@ -1,2 +1,11 @@
# 第二章-数据流分析的理论与应用

**注:第一章到第三章在B站上有相应的视频。不建议读者现在阅读一到三章的内容。在作者腾出时间整理文稿之前,建议读者先到B站观看相应的视频。**

对应的视频在这里:
- [第三课-数据流分析一](https://www.bilibili.com/video/BV1oE411K79d)
- [第四课-数据流分析二](https://www.bilibili.com/video/BV19741197zA)

[这里](https://ranger-nju.gitbook.io/static-program-analysis-book/ch4)直接跳转到施工完毕区域(第四章)。

**WARNING:即将进入施工未完毕区域。**
180 changes: 6 additions & 174 deletions ch3/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,180 +2,12 @@

帮助理解之前所数的数据流分析技术,让你在将来review的时候更有效率。忘记了也能自己推导出大概。

## 基础(Recall离散数学)
**注:第一章到第三章在B站上有相应的视频。不建议读者现在阅读一到三章的内容。在作者腾出时间整理文稿之前,建议读者先到B站观看相应的视频。**

### Yet Another(Math Instead of Program) View to Iterative Algorithm
对应的课程在这里:
- [第五课-数据流分析理论一](https://www.bilibili.com/video/BV1A741117it)
- [第六课-数据流分析理论二](https://www.bilibili.com/video/BV1964y1M7nL)

Forward:根据IN求OUT
May:Merge时求并,初始化为bottom
[这里](https://ranger-nju.gitbook.io/static-program-analysis-book/ch4)直接跳转到施工完毕区域(第四章)。

文字描述->符号化描述。

*下标:Node,上标:迭代次数。最后i和i+1的结果一致。*

右下角引出不动点(数学定义)。

- 有解性(一定有解)
- 解的唯一性(假设取得了最好的)
- 最大最小不动点?
- 什么时候算法给出解?

**这是数十年来程序分析的问题精华。接下来我们接触一些必要的数学。**

偏序: (能够容忍不可比较性)

用例子记忆。

poset(偏序集)

- 自反
- 反对称
- 传递

Lattice(格)之图。


正式介绍Lattice之前,介绍两个概念:上下界。

注意:*上下界不需要是集合S中的元素。*

例子。

定义一个least upper bound(lub or join),即最小上界。类似地定义最大下界。

例子。

属性:
1. join和meet不一定有。
2. 有则唯一。( 证明-反证法。)

定义Lattice: 任取两个元素,都有join和meet就是lattice。

定义Semilattice:join和meet有且只有一个。

定义Complete Lattice:任何一个子集都有join和meet。

top和bottom。

性质:有穷则complete,反之不行(Recall Ex1)

程序通常是有限的(表达式,变量等是有限的)。因此,在数据流分析中主要关注Complete Lattice。

定义Product Lattice:新的集合,新的关系和join和meet。

性质:

1. A product lattice is a lattice
2. If a product lattice L is a product of complete lattices, then L is also complete

Lattice上数据流分析的框架:

例子。

总结:Data flow analysis can be seen as iteratively applying transfer functions and meet/join operations on the values of a lattice.

基础已备。回顾开头的三个问题:

回顾单调性(和高中数学定义一致)。

**不动点定理**

前提: Complete Lattice/Function is Monotonic/Lattice is Finite.

效果:给出了固定的求不动点的方法。

证明:
1. 存在
2. 最小

存在-f(bottom)中的元素一定在L中,由bottom的定义,这个式子自然成立。 *微积分:单调有界必收敛。*

最小-从bottom开始和从任意一个x开始都满足关系。

#### 喜闻乐见的五块钱

从PL与数学的证明(T的不同方向)讲起。

通过清晰地定义问题的相关因素,缩小问题。

1. 重新定义问题的Scope(上下文敏感指针分析可以跑得必非上下文敏感指针分析更快)。
2. 科研任务/工作任务需要通过沟通缩小Scope。

### Relate Iterative Algorithm to Fixed-Point Theorem

上节课的Product Lattice。

Transfer Function**应该被设计为**是单调的。
join和meet都可以被证明是单调的。

回答问题:
1. Yes。因为不动点原理。
2. Yes。因为推理出来...

何时能达到不动点?
首先定义高度。

提示:每个node一次一步,一个node最坏情况下走h步。所有node最坏情况下要走$h*k$步。

所以算法快慢与程序的规模和Lattice Domain有关。

### May and Must Analyses, a Lattice View

相当于总结上文。
TODO:从前三节课程的理解到一张图解释

所有分析过程一定从不安全的结果向安全但没有意义的结果推进,
以reaching definition为例子,每个块被初始化为全0,代表没有definition可以reach到特定程序点。
Truth的位置在Safe和Unsafe中间,越接近Safe,精度就越低。

问题:为什么一定能越过Truth达到Fixed Point呢?
是否Safe是由Transfer Function和Mering的策略决定的,也就是由设计算法的人决定的。

<img src="../.gitbook/assets/mayMustSum.png" style="zoom: 50%;" />

### Meet/Join-Over-All-Paths Solution(MOP)

> 我们的结论有多准?
Meet:一旦数据流汇聚了,如何处理?

PL基础小知识:
程序中的汇聚从何而来?<-分支和循环结构
- 分支:if,switch,try catch, exception, promise(JS) ...
- 循环:while, do while, for ...

这是分析精度的标杆。

### Iterative Algorithm vs MOP

IA:在每次第一次汇聚的时候就等一等,先merge。不需要枚举,运算量更小,结果没那么准。
MOP:没必要等,在最后一次汇聚的时候才merge。因为要枚举Path,运算量更大,结果更准。

关于精度,有一个简单的证明。(前提是F满足单调性)。

如果F满足分配律(高中数学)。那么MOP和我们的IA一样准。

好消息!当join/meet使用set union/intersection时(之前举过的三个例子都是可分配的)。

### 单调性与不动点

### 函数的可分布性

### Constant Propagation

> Given a variable x at program point p, determine whether x is guaranteed to hold a constant value at p. 在程序点P指定一个变量X,判断X是否在这点是一个常量。
Undefine->Constant->NotAConstant

下划线表示PL领域的通配符。


other -> 两个UNDEF;一个UNDEF一个Constant=>不能是NAC:例如x+y,x第一次是UNDEF,第二次是14,y一直都是2,则两次的结果一次是NAC一次是常量=>这个Transfer Funtion不满足单调性。

## Worklist Algorithm

作为IA的优化版本,懂了IA之后WA很容易懂。

IA适合性质的分析与证明,工作中使用的往往是WA。

WA为什么快->IA为什么慢->
**WARNING:即将进入施工未完毕区域。**
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading

0 comments on commit a298040

Please sign in to comment.