Introduction to Static Analysis Ch.4

# Introduction to Static Analysis Ch.4 A General Static Analysis Based on a Trasitional Semantics # Chapterの内容 ### 静的解析3step(復習) 1. 調べたいpropertyからsemanticsを定義 1. semanticsを良く近似するabstractionを定義 1. semantics,abstractionからalgorithmを構築 # 4.1 Semantics as State Transitions #### 検証したいProperty 到達可能性 Programの実行中に、ある条件を満たすパスはない（Safety）ことの証明 #### 3章におけるCompositional-Styleの限界制御フローを無視しているため、現実の言語に存在する動的ジャンプに対応できない - 関数呼び出し - (non-)local goto - 間接ジャンプ - 関数ポインタ - Function as Values(例えば関数オブジェクト？) - 動的ディスパッチ - 例外 Semanticsのoutputはプログラム終了時の「最終状態」 = 中間状態を明示的に見ることはできない #### Transitional Style Semanticsによる解釈直感的に、Transitional Styleでは「Runtimeがあるラベルにいるとき、どのような状況があり得るか」を表すので、制御フローを考慮した解析が可能 Semanticsのoutputが実行パスの中間状態を明示的に含む = 到達可能性の判定はoutputを見るだけで良い ## 4.1.1 Concrete Semantics あるState $s$からState $s'$への遷移を次の記号で表現 $$ s \hookrightarrow s' $$ 状態遷移列は次のように表現 $$ s_0 \hookrightarrow s_1 \hookrightarrow s_2\hookrightarrow ... $$ １本の状態遷移列は１つの実行パスに相当 ### State 状態$s\in \mathbb{S}$をProgram labelとMachine Stateの組$(l,m)\in\mathbb{L}\times\mathbb{M}$と定義 #### Program label Programの各Statementに固有のラベルを貼る状態$s$のラベルは次に実行されるStatementを指す :::info この言語では制御フローが決定的（programのみから構築可能）制御フローが非決定的なとき、nextラベルは現在のstateに依存する（遷移先そのものも近似するか、動的解析を併用して遷移先を絞るとか） ::: #### Machine State これまでの実行を反映したデータと、これからの実行に必要なデータ一般の命令型プログラミング言語だと、以下の3つ - Memory - Environment - 変数名から場所へのmapping table - global/local変数の関係のように、スコープごとにenvironmentがある - Continuation - return contextのStack - return contextはreturn先のアドレスとenvironment :::danger この章で使う言語ではmemory stateだけ見てればいい以下で出てくる$(l,m)$は全部label,memory stateのタプル ::: ### Example 4.1 ##### Program 1 ```c input(x); //label 0 while(x <= 99) //label 1 { x := x + 1; //label 2 } ... // label 3 ``` ##### CFG of Program 1 ![](https://i.imgur.com/JazQWj9.png) inputに応じて、複数の実行パスが考えられる以下はそれぞれinputが100,99,0のときの例 $$ \begin{align} &(0,\emptyset)\hookrightarrow(1,x\mapsto100)\hookrightarrow(3,x\mapsto100)...\\ &(0,\emptyset)\hookrightarrow(1,x\mapsto99)\hookrightarrow(2,x\mapsto99)\hookrightarrow(1,x\mapsto100)\hookrightarrow(3,x\mapsto100)...\\ &(0,\emptyset)\hookrightarrow(1,x\mapsto0)\hookrightarrow(2,x\mapsto0)\hookrightarrow(1,x\mapsto1)\hookrightarrow(2,x\mapsto1)...\\ \end{align} $$ :::warning $(1,x\mapsto100)$は$(1,\{x\mapsto100\})$の略記 ::: ### Set of Reachable States 到達可能な状態集合を知りたい到達可能状態集合＝考えられる全ての実行パスに対してPath上のStateを集めたものつまり、それぞれの実行パス上のStateで和集合を取ればよい ### Example 4.2 Reachable States Ex4.1のProgramでのReachable States inputで入れる値を$0,99,100$に限定 $$ \begin{align} &\{ (0,\emptyset),(1,x\mapsto100),(3,x\mapsto100)\ \}\\ \cup &\{ (0,\emptyset),(1,x\mapsto 99),(2,x\mapsto 99),(1,x\mapsto 100),(3,x\mapsto 100) \}\\ \cup &\{ (0,\emptyset),(1,x\mapsto 0),(2,x\mapsto 0),(1,x\mapsto 1),(2,x\mapsto 1),... \} \end{align} $$ ### Example 4.3 $Step^i$ operation Example 4.2では、それぞれの実行パスを導出してから和集合を取った（DFS的） $Step$によるアプローチでは複数の実行パスを同時に探索していく（BFS的） ##### Execution Treeのイメージ（ラベルのみ表記） ![](https://i.imgur.com/JrNK8oN.png) ２章でやったように、$Step$は実行を1step分進める関数 $$ \begin{align} Step^0(I) &= I = \{(0,\emptyset)\} \\ Step^1(I) &= \{(1,x\mapsto100),(1,x\mapsto99),(1,x\mapsto0)\} \\ Step^2(I) &= \{(3,x\mapsto100),(2,x\mapsto99),(2,x\mapsto0)\} \\ Step^2(I) &= \{(1,x\mapsto100),(1,x\mapsto1)\} \\ Step^3(I) &= \{(3,x\mapsto100),(2,x\mapsto1)\} \\ Step^4(I) &= \{(1,x\mapsto2)\} \\... \end{align} $$ 通過可能状態の集合はこれらの和集合を取ることで得られるすなわち、 $$ \bigcup_{i=0}Step^i(I) = I \cup Step^1(I) \cup Step^2(I)... $$ これは漸化式表現できて、$i$ step実行したとき通過可能状態集合$C_i$とすると、 $$ \begin{align} C_0 &= I \\ C_{i+1} &= I \cup Step(C_i) \end{align} $$ この漸化式が収束するときの集合がプログラムの通過可能状態集合となる即ち次の関数$F$の最小不動点$\mathbf{lfp}F$が解である $$ F: \mathscr{P}(\mathbb{S}) \rightarrow \mathscr{P}(\mathbb{S}) \\ F(X) = I \cup Step(X) $$ ### Theorem4.1 $F(X)=I\cup Step(X)$の最小不動点$\mathbf{lfp}F$は次のようになる $$ \mathbf{lfp}F= \bigcup_{i\geq0}F^i(\emptyset) $$ #### Term ##### 完備半順序(complete partial order) 集合$E$,2項関係$\preceq$に対し、以下が成り立つものを半順序集合と呼ぶ - 反射律 $x\preceq x$ - 推移律 $x\preceq y\ \land\ y\preceq z \implies x\preceq z$ - 反対称律 $x\preceq y\ \land\ y\preceq x \implies x = y$ 半順序集合のうち、最小元が存在し、E上の任意のChainが最小上界をもつものを完備半順序と呼ぶ ##### 最小上界(least upper bound) 順序集合$E$の２要素$x,y$に対し、$x,y$より大きい要素のうち最小のものもし存在すれば$x\sqcup y$で表す $E$がべき集合のとき、これは単なる和集合演算に等しい Ex. $\{0\},\{2\}$の最小上界は$\{0,2\}$ 最小上界は上限と同じです（最大$\neq$上限と勘違いしていた） ##### 鎖(chain) 順序集合$E$上の部分集合で、完全(total)なものすなわち、任意の2要素が比較可能 ##### 単調(monotone) $f:E\rightarrow F$として、$x,y\in E$ に対し$x \preceq y$ ならば$f(x) \preceq f(y)$ ##### 連続(continuous) $f:E\rightarrow F$として、$E,F$を完備半順序とする $E$上の任意の鎖$G$に対し、次が成り立つ - $G$に$f$を適用した集合の最小上界は、$G$の最小上界に$f$を適用したものと一致する $$ \sqcup\{f(x)|x\in G\} = f(\sqcup G) $$ ###### lemma $f$が連続ならば、単調である #### Kleeneの不動点定理 $f:E\rightarrow E$が連続かつ$E$が下限$\perp$をもつ完備半順序であるとき、$f$は以下で表される最小不動点をもつ。 $$ \rm{lfp}\ f = \bigcup_{n\in\mathbb{N}}f^n(\perp) $$ #### 証明 $f$は連続だから単調で、$\perp$は下限なので、以下の不等式を得る $$ \perp \preceq f(\perp) \preceq f^2(\perp) \preceq...\preceq f^n(\perp) $$ すなわち、$\{f^n(\perp) | n \in \mathbb{N}\}$は$E$上の鎖(chain)を形成する $E$は完備半順序なのでこの鎖は最小上界をもち、これを$X$とする $$ X = \bigcup_{n\in N}f^n(\perp) $$ $\perp$は下限なので、任意の$e\in E$に対し、以下が成り立つ $$ \perp \cup\ e = e $$ これらより、次の等式が成り立つ $$ \begin{align} f(X) &= f(\bigcup_{n\in N}f^n(\perp))\\ &= \bigcup_{n\in N}f^{n+1}(\perp)\\ &= \perp \cup \bigcup_{n\in N}f^{n+1}(\perp)\\ &= \bigcup_{n\in N}f^n(\perp)\\ &= X \end{align} $$ 即ち、$X$は$f$の不動点 ##### Xの最小性 $X'\neq X$を$f$の任意の不動点とし、$f^n(\perp)\preceq X'$を仮定 $X'$は不動点で$f$は単調なので、 $$ f^{n+1}(\perp) = f(f^n(\perp)) \preceq f(X)' = X' $$ $\perp\preceq X'$なので、帰納的に$f^n(\perp) \preceq X$ これより直ちに $$X = \bigcup_{n\in N}f^n(\perp) \preceq X'$$ すなわち$X$が最小不動点である ### Definition4.1 ここまでの話のまとめ #### 遷移関係状態ドメイン$\mathbb{S}$上の２つの要素$s,s'$に対し、「遷移がある」という関係を$s\hookrightarrow s'$で表すこの遷移関係をべき集合にliftした概念が$Step$であるすなわち、$S \in \mathscr{P}(\mathbb{S})$に対し、$s\in S$からの遷移先を集めた状態集合をとる $$ Step(S) = \{s' | s\hookrightarrow s', s\in \mathbb{S}\} $$ #### Semantics ここでのConcrete Semanticsは、到達可能な状態集合を示す Programに対し、その初期状態集合$I \in \mathbb{S}$とするこのとき、Semanticsは次の関数$F$の最小不動点$\rm{lfp}F$となる $$ F(X) = I \cup Step(X) $$ ## Recipe for Defining a Concrete Transitional Semantics ### Definition 4.2 Def4.1でConcrete Semanticsをを$F:\mathscr{P}(\mathbb{S}) \rightarrow \mathscr{P}(\mathbb{S})$の最小不動点と定義した #### 用語定義 - $F$ : Concrete Semantic Function - $\mathscr{P}(\mathbb{S})$ : Concrete Semantic domain (or Concrete Domain) # 4.2 Abstract Semantics as Abstract State Transitions Concrete Semanticsを定義したので、Abstractionを考える ### Goal 関数$F$のabstract版として、$F^\sharp$を定義したい $$ \begin{align} F&:\mathscr{P}(\mathbb{S})\rightarrow \mathscr{P}(\mathbb{S})\\ F^\sharp &:\ \ \mathbb{S}^\sharp\ \ \rightarrow \ \ \ \mathbb{S}^\sharp \end{align} $$ Abstract Domain $\mathbb{S}^\sharp$はConcrete Domain $\mathscr{P}(\mathbb{S})$ に対する抽象化関数の中身もそれぞれ抽象化したものを定義したい $$ \begin{align} F(X) = I \cup Step(X) \\ F^\sharp(X^\sharp) = I^\sharp \cup^\sharp Step^\sharp(X^\sharp) \end{align} $$ ### 抽象化に対する要件 - abstract semanticsは常に(現実的な)有限時間で計算可能である - abstract semanticsはconcrete semanticsである$\rm{lfp}F$を過大近似する - conservative, soundである ## 4.2.1 Abstraction of the Semantic Domain **(復習)** Concrete DomainはStateのべき集合 StateはLabel集合とMachine state集合の直積(Cartesian Product) $$ \begin{align} \mathbb{D} &= \mathscr{P}(\mathbb{S}) \\ \mathbb{S} &= \mathbb{L} \times \mathbb{M} \end{align} $$ 要素でいうと Concrete Domainの要素はStateの部分集合 Stateの要素はlabelとmachine ### Program-Label-Wise Reachability それぞれのLabelにおける到達可能状態の集合を知りたい #### Concrete Semanticsで導かれるもの到達可能な$\rm{(label,machine\ state)}$のタプルの集合 $$ \mathscr{P}(\mathbb{S}) = \mathscr{P}(\mathbb{L} \times \mathbb{M}) $$ ##### Ex.A collection of states $$ \{(0,\emptyset),(1,\{x\mapsto100\}),(1,\{x\mapsto99\}),(1,\{x\mapsto1\}),(2,\{x\mapsto1\}),(2,\{x\mapsto99\}),(3,\{x\mapsto100\})\} $$ #### 欲しいものそれぞれのLabelにおける到達可能状態の集合＝Labelと到達可能状態の集合のmapping $$ \mathbb{L} \rightarrow \mathscr{P}(\mathbb{M}) $$ ##### Ex.B label-wise collections $$ \begin{align} \{&0 \mapsto \{\emptyset \},\\ &1\mapsto\{\{x\mapsto1\},\{x\mapsto99\},\{x\mapsto100\}\} \\ &2\mapsto\{\{x\mapsto1\},\{x\mapsto99\} \}\\ &3\mapsto\{\{x\mapsto100\}\}\\ \} \end{align} $$ :::danger ISA本の方ではタプル表記 mappingなのを明示するために$\mapsto$で表記 ::: $\mathscr{P}(\mathbb{M})$がまだ重いので抽象化して軽くしたい $\rightarrow$ $\mathscr{P}(\mathbb{M})$に対する抽象化$\mathbb{M}^\sharp$を考える = Labelから到達可能状態の集合の抽象化へのmapping $$ \mathbb{L} \rightarrow \mathbb{M}^\sharp $$ ##### Question label-wise abstraction Value AbstractionとしてInterval Abstractionを採用したとき、Ex.Bはどのように抽象化されるか $$ \begin{align} \{&0\mapsto \ ?\\ &1\mapsto \ ?\\ &2\mapsto \ ?\\ &3\mapsto \ ? \} \end{align} $$ <details> <summary>answer</summary> $$ \begin{align} \{&0\mapsto \ \{x\mapsto[1,100]\}\\ &1\mapsto \ \{x\mapsto[1,100]\}\\ &2\mapsto \ \{x\mapsto[1,99]\}\\ &3\mapsto \ \{x\mapsto[100,100]\} \} \end{align} $$ </details> #### ドメイン抽象化まとめ 2段階抽象化した $$ \mathscr{P}(\mathbb{L} \times \mathbb{M}) \rightsquigarrow (\mathbb{L} \rightarrow \mathscr{P}(\mathbb{M})) \rightsquigarrow (\mathbb{L} \rightarrow \mathbb{M}^\sharp) $$ ### Abstract Domain by Galois Connection $S^\sharp$はどのようなドメインか - 4章では完備半順序(Complete Partial Order)を使う - 最小元$\perp$が存在し、任意のChainが最小上界をもつ - 3章では結び半束(join-semilattice)を使った - 任意の2要素が最小上界をもつ - 完備半順序と結び半束は包含関係にない - 完備半順序だが結び半束でない例 - ![](https://i.imgur.com/99c6QpT.png) - 結び半束だが完備半順序でない例 - Intervals abstract domainから無限大を抜いたもの - $[0,1]\leq[0,2]\leq...$は無限長のChainを形成するが、上限はない - $[0,\infty]$を導入するとこれが上限になるので、Interval abstract domain 自体はCPO #### Galois Connection 完備半順序集合であるConcrete Domain$(\mathscr{P}(\mathbb{L} \times \mathbb{M}),\subseteq)$に対し、同じく完備半順序集合のAbstract Domain$(\mathbb{L} \rightarrow \mathscr{P}(\mathbb{M}),\sqsubseteq)$を定義し、順序を保存するようなconcretization $\gamma$ ,abstraction $\alpha$を定義することでこの2つはガロア接続される $$ (\mathscr{P}(\mathbb{L} \times \mathbb{M}),\subseteq) \ ^\gamma\rightleftarrows_\alpha (\mathbb{L} \rightarrow \mathscr{P}(\mathbb{M}),\sqsubseteq) $$ #### 詳細 ![](https://i.imgur.com/Jwi1uKl.png) - $(\wp(\mathbb{L}\times\mathbb{M}),\subseteq)$から$(\mathbb{L}\rightarrow\wp({\mathbb{M}}),\sqsubseteq)$ - ラベルごとに集約しただけなので実は等価 - $(\mathbb{L}\rightarrow\wp({\mathbb{M}}),\sqsubseteq)$から$(\mathbb{L}\rightarrow \mathbb{M}^\sharp,\sqsubseteq)$ - ラベルごとにMemoryを抽象化 - このMemory抽象化もまたGalois Connection - ($\wp(\mathbb{M}),\subseteq) \rightleftarrows (\mathbb{M}^\sharp,\sqsubseteq_M)$ - $\sqsubseteq_M$は完備半順序集合$\mathbb{M}$上の順序関係 - $\sqsubseteq$はlabel-wise - $a^\sharp \sqsubseteq b^\sharp\ \textrm{iff}\ \forall l \in \mathbb{L}:a^\sharp(l) \sqsubseteq_M b^\sharp(l)$ #### Notations 1. 写像$f\in A\rightarrow B$は$f$は$f \in \mathscr{P}(A\times B)$と変換できる - ![](https://i.imgur.com/lcHKRA3.png) 1. 関係$f\subseteq A\times B$は、写像$f\in A\rightarrow \mathscr{P}(B)$と相互変換できる - ![](https://i.imgur.com/95DL6UO.png) - Ex. 関係$\hookrightarrow \subseteq\mathbb{S}\times\mathbb{S}$は写像$\hookrightarrow\in\mathbb{S}\rightarrow\mathscr{P}(\mathbb{S})$と同じ - $\hookrightarrow =\{(s_0,s_1),(s_0,s_2),(s_2,s_1)\}$は $\hookrightarrow = \{s_0\mapsto\{s_1,s_2\},s_1\mapsto\emptyset,s_2\mapsto\{s_1\}\}$である 1. 写像$f:A\rightarrow B$に対し、べき写像(造語)$\mathscr{P}(f)$を以下で定義する $$ \begin{align} \mathscr{P}(f) &: \mathscr{P}(A) \rightarrow \mathscr{P}(B)\\ \mathscr{P}(f)(X) &= \{f(x) | x \in X\}\ w.r.t.\ X \in \mathscr{P}(A) \end{align} $$ ![](https://i.imgur.com/gkF6SMT.png) :::info $X$は$A$の部分集合 $X \subseteq A$の要素に対し$f$を適用したものを集める操作 ::: - - Ex.写像$Step$ 3. 写像$f:A\rightarrow \mathscr{P}(B)$に対するべき写像の和集合をとったものを$\breve{\mathscr{P}}(f)$と定義する $$ \begin{align} \mathscr{P}(f) &: \mathscr{P}(A) \rightarrow \mathscr{P}(\mathscr{P}(B))\\ \breve{\mathscr{P}}(f)&: \mathscr{P}(A) \rightarrow \mathscr{P}(B) \\ \breve{\mathscr{P}}(f)&= \bigcup\{f(x)|x\in X\} \end{align} $$ - 直感的に$\mathscr{P}(\mathscr{P}(B))$は$B$の部分集合の集合 - これの和集合をとると$B$の部分集合、つまり$B$のべき集合の要素になる ![](https://i.imgur.com/w6tpSIJ.png) 1. 2つの写像$f:A\rightarrow B$と$g:A'\rightarrow B'$に対し、$(f,g)$を以下で定義する $$ \begin{align} (f,g) &: (A\times A') \rightarrow (B\times B')\\ (f,g)(a,a') &= (f(a),f(a')) \end{align} $$ - それぞれの写像を適用してタプルにするだけ ## 4.2.2 Abstraction of Semantic Functions | | Concrete | Abstract | | -------- | -------- | -------- | | State | $\mathbb{S}=\mathbb{L}\times\mathbb{M}$ | $\mathbb{S}^\sharp = \mathbb{L}\rightarrow\mathbb{M}^\sharp$ | |Domain and Range| $F:\mathscr{P}(\mathbb{S})\rightarrow \mathscr{P}(\mathbb{S})$ | $F^\sharp: \mathbb{S}^\sharp \rightarrow \mathbb{S}^\sharp$ | |Semantic Function Definition| $F(X) = I \cup Step(X)$ |$F^\sharp(X^\sharp) = \alpha(I) \cup^\sharp Step^\sharp(X^\sharp)$| |Step Function|$Step = \breve{\mathscr{P}}(\mathbb{\hookrightarrow})$|$Step^\sharp = \mathscr{P}(\rm{id},\cup^\sharp_M)\circ\pi\circ\breve{\mathscr{P}}(\hookrightarrow^\sharp)$ | |Transitional Relation| $\hookrightarrow \subseteq (\mathbb{L}\times\mathbb{M})\times(\mathbb{L}\times\mathbb{M})$|$\hookrightarrow^\sharp \subseteq(\mathbb{L}^\sharp\times\mathbb{M}^\sharp)\times(\mathbb{L}^\sharp\times\mathbb{M}^\sharp)$| ### State 4.2.1章で説明した抽象化 ### Domain and Range Semanticsのinputとoutputのドメイン ### Semantic Function Definition ### Step Function #### Concrete Notation2で示したように、関係$\hookrightarrow \subseteq\mathbb{S}\times\mathbb{S}$は写像$\hookrightarrow\in\mathbb{S}\rightarrow\mathscr{P}(\mathbb{S})$と同じ $Step$は$\hookrightarrow$のべき写像版と捉えられるので、$\breve{\mathscr{P}}(\hookrightarrow)$になる #### Abstract | Operator | Domain and Range |Operation| | -------- | -------- | ------- | |$\hookrightarrow^\sharp$|$\mathbb{L}\times\mathbb{M}^\sharp\rightarrow \mathscr{P}(\mathbb{L}\times\mathbb{M^\sharp})$|| | $\breve{\mathscr{P}}(\hookrightarrow^\sharp)$| $\mathscr{P}(\mathbb{L\times \mathbb{M^\sharp}})\rightarrow\mathscr{P}(\mathbb{L}\times\mathbb{M^\sharp})$ |各Stateに対し、遷移可能なStateをもってくる| |$\pi$|$\mathscr{P}(\mathbb{L}\times\mathbb{M^\sharp}) \rightarrow \mathscr{P}(\mathbb{L}\times\mathscr{P}(\mathbb{M}^\sharp))$|同じラベルで集約する | |$\mathscr{P}(\rm{id},\cup^\sharp_M)$|$\mathscr{P}(\mathbb{L}\times\mathscr{P}(\mathbb{M}^\sharp)) \rightarrow \mathscr{P}(\mathbb{L\times\mathbb{M^\sharp}})$|集約した抽象メモリの集合をjoinする| - $\pi$により同じラベルに対する抽象メモリ状態は１ヶ所に集約される - つまり、$\pi$のRangeは$\mathbb{L} \rightarrow \mathscr{P}(\mathbb{M}^\sharp)$ と捉えても良い(Notation 2) - 最終的に得られるRange $\mathscr{P}(\mathbb{L\times\mathbb{M^\sharp}})$は$(\mathbb{L} \rightarrow\mathbb{M}^\sharp)$と等価 - すなわち$\mathbb{S}^\sharp$で、全体としても$\mathbb{S}^\sharp\rightarrow \mathbb{S}^\sharp$になる ### Example4.4 One-step transition $Step^\sharp$ abstract stateの集合$\{(l_1,M^\sharp_1),(l_2,M^\sharp_2)\}$に対し$Step^\sharp=\mathscr{P}(\rm{id},\cup^\sharp_M)\circ\pi\circ\breve{\mathscr{P}}(\hookrightarrow^\sharp)$の適用を考える 1. $\breve{\mathscr{P}}(\hookrightarrow^\sharp)$を適用 $$ \begin{align} &\hookrightarrow^\sharp(l_1,M_1^\sharp)\ \cup\hookrightarrow^\sharp(l_2,M_2^\sharp)\\ =&\{(l_1,M_1'^\sharp),(l_2,M_1''^\sharp),(l_1,M_2''^\sharp)\} \end{align} $$ 1. 1.の結果に$\pi$を適用(ラベルごとに集約) $$ \{(l_1,\{M_1'^\sharp,M_2'^\sharp\}),(l_2,\{M''^\sharp_1\})\} $$ 1. 2.の結果に$\mathscr{P}(\rm{id},\cup^\sharp_M)$を適用（抽象メモリ状態をunionする） ### Conditions for Sound $\hookrightarrow^\sharp,\cup^\sharp,\cup^\sharp_M$ Soundnessを満たすための条件 #### $\hookrightarrow^\sharp$ $$ \breve{\mathscr{P}}(\hookrightarrow)\circ\gamma \subseteq \gamma\circ\breve{\mathscr{P}}(\hookrightarrow^\sharp) $$ （具体化した状態での遷移）　＜＝　（抽象状態で遷移させてから具体化） ![](https://i.imgur.com/YdCETmd.png) #### $\cup^\sharp,\cup^\sharp_M$ $$ \cup\circ(\gamma_-,\gamma_-) \subseteq \gamma_-\circ\cup^\sharp_- $$ （具体化させてからunion）　＜＝　（抽象状態でunionしてから具体化） #### Abstract semanticsの正当性 - concrete Domain/ concrete semantic Function - Galois connected abstract domain - abstract semantic function - sound operators を正しく定義するとよい ## 4.2.3 Recipe for Defining an Abstract Transition Semantics 1. memory State集合$\mathbb{M}$,labal集合$\mathbb{L}$の定義 1. Concrete Semanticsの定義 1. Abstract Domain、Abstract Semantic Functionを定義 1. Abstract Domain の性質チェック - $\mathbb{S}^\sharp$と$\mathbb{M}^\sharp$がそれぞれ完備半順序か - $\mathscr{P}(\mathbb{S})$と$\mathbb{S}^\sharp$,$\mathscr{P}(\mathbb{M})$と$\mathbb{M}^\sharp$がそれぞれでガロア接続か 1. Operatorのsoundnessをチェック 1. 上記が満たされる時、concrete Semantics $\rm{lfp} F$を過大近似する解析機が構築できる - Theorem4.2 Theorem 4.3 ### Theorem 4.2 Sound Static Analysis by $F^\sharp$ $F,F^\sharp$を上記で定義したとき $\mathbb{S}^\sharp$の高さが有限であり、$F^\sharp$が単調もしくはextensiveならば、以下の計算により$\rm{lfp} F$有限回の操作で過大近似できる $$ \bigsqcup_{i\geq0}F^{\sharp i}(\perp) $$ これは$\rm{lfp} F$の過大近似だから次の等価な２式がなりたつ $$ \rm{lfp}F \subseteq\gamma(\bigsqcup_{i\geq0}F^{\sharp i}(\perp))\\ \alpha(\rm{lfp}F)\sqsubseteq \bigsqcup_{i\geq0}F^{\sharp i}(\perp) $$ ### Theorem 4.3 widening operator - Theorem4.2での前提条件 - $\mathbb{S}^\sharp$の高さが有限 - $F^\sharp$が単調もしくはextensive これが満たされないとき、Theorem4.2は収束しない可能性がある Widening Operator $\bigtriangledown$を用いる $$ Y_0=\perp\\ Y_{i+1} = Y_i \bigtriangledown F^\sharp(Y_i) $$ この操作により得られる列は、Widening Operatorの定義により有限な鎖である $$ Y_0 \sqsubseteq Y_1 \sqsubseteq Y_2 \sqsubseteq ... \sqsubseteq Y_{lim} $$ 最後の要素$Y_{lim}$は$\rm{lfp}F$を過大近似する ### Using the Result of the Analysis Concrete Semantics $\rm{lfp}F$はProgramの全ての到達可能Stateを表す解析機は、これを過大近似する「Error Stateに到達しない」ことの検証 - 解析機の結果とError StateのSetの積集合が空 - ProgramはErrorStateに到達しない - 解析機の結果とError StateのSetの積集合が空でない - ProgramがErrorStateに到達する可能性はあるが、確実ではない - 過大近似により生じたimprecisionの可能性があるため - アラートを出してユーザにさらなるinspectionを求める（triage） # 4.3 Analysis Algorithms Based on Global Iterations ## 4.3.1 Basic Algorithms ### Algorithm from Theorem 4.2 - Theorem4.2での前提条件（再掲） - $\mathbb{S}^\sharp$の高さが有限 - $F^\sharp$が単調もしくはextensive - Theorem4.2でのAbstract Semantics（計算したい要素） - $\bigsqcup_{i\geq0}F^{\sharp i}(\perp)$ $\perp$が下限で$F^\sharp$が単調なので、以下の不等式を満たす鎖$\rm{Chain}$が作れる $$ \rm{Chain} = \{F^{\sharp i}(\perp)\ | i\in\mathbb{N}\}\\ \perp \sqsubseteq F^{\sharp 1}(\perp) \sqsubseteq F^{\sharp2 }(\perp) \sqsubseteq... $$ $\mathbb{S^\sharp}$の高さは有限だから、集合$\rm{Chain}$も有限で、次の最大要素が存在する $$ \rm{max}(Chain) = \bigsqcup_{i\geq0}F^{\sharp i}(\perp) $$ 前述の不等式より、これは鎖$\rm{Chain}$の右端要素に等しいつまり、$\rm{Chain}$の最右端要素がアルゴリズムで計算すべき要素 ##### 補足 $\rm{Chain}$の最大要素を$M$とすると、$\rm{Chain}$の任意の要素$c$に対して $$ M \sqcup c = M $$ ```python C = bottom while(C is in R): R = C C = f_abstract(C) return R ``` これは以下の漸化式を収束するまで繰り返すアルゴリズムである $$ X_0 = \perp\\ X_{i+1} = F^\sharp(X_i) $$ ### Algorithm from Theorem 4.3 $$ Y_0=\perp\\ Y_{i+1} = Y_i \bigtriangledown F^\sharp(Y_i) $$ これを満たすアルゴリズムを作れば良い ```python C = bottom while(C is in R): R = C C = widening(C,f_abstract(C)) return R ``` ## 4.3.2 Worklist Algorithm Theorem 4.3のアルゴリズムは極めて効率が悪い ### Loop Unrolling #### 1回目のループ Wideningは収束性を保証するためのテクニック 1回目はWideningを使わずに処理、ということも許される $$ \begin{align} C &= \perp\\ C &\leftarrow F^\sharp(C)\\ &= (\alpha(I) \cup^\sharp \mathscr{P}(\rm{id},\cup^\sharp_M)\circ\pi\circ\breve{\mathscr{P}}(\hookrightarrow^\sharp)(C))\\ &= (\alpha(I) \cup^\sharp \emptyset)\\ &= \alpha(I) \end{align} $$ 1回目は必ず$\alpha(I)$になる #### 2回目のループここでWideningの方のアルゴリズムに切り変える $$ \begin{align} C &= \alpha(I)\\ C &\leftarrow C \bigtriangledown F^\sharp(C)\\ &= \alpha(I) \bigtriangledown (\alpha(I) \cup^\sharp Step^\sharp(C))\\ &= \alpha(I) \bigtriangledown Step^\sharp(C) \end{align} $$ 直感的には成り立つ気がするけど証明はできてない $\bigtriangledown$は$\cup^\sharp$をさらにover-approximateするもののはずなので、$()$内の$\alpha(I)$は意味ないはずこれにより、アルゴリズムは次のようになる $$ Y_0=\alpha(I)\\ Y_{i+1} = Y_i \bigtriangledown Step^\sharp(Y_i) $$ ```python C = abstract(I) while(C is in R): R = C C = widening(C,step_abstract(C)) return R ``` #### 更新式まだ計算が重い $$ \begin{align} C &\leftarrow C \bigtriangledown Step^\sharp(C)\\ &= C \bigtriangledown \mathscr{P}(\rm{id},\cup^\sharp_M)\circ\pi\circ\breve{\mathscr{P}}(\hookrightarrow^\sharp)(C) \end{align} $$ $\breve{\mathscr{P}}(\hookrightarrow^\sharp)(C)$は各ラベルからの遷移を取ってくる操作対応するAbstract Memory Stateの変化がないなら遷移先も変わらないので、計算を省略してもいい ![](https://i.imgur.com/uzxaDEK.png) $$ \begin{align} &iter1:\{(l_1,m_1)\}\\ &iter2:\{(l_1,m_1),(l_2,m_2)\}\\ &iter3: \{(l_1,m_1\cup^\sharp m_1'),(l_2,m_2),(l_3,m_3)\} \end{align} $$ - $iter1$と$iter2$を比べると、$l_1$ラベルのmemory側に変化はない - $iter3$で遷移を調べるとき、$(l_1,m_1)$からの遷移はすでに探索済みなので再訪問する必要はない（更新前の$C$に入ってる） - 同様に、$iter4$では$(l_2,m_2)$からの遷移を調べる必要はない - $(l_1,m_1\cup^\sharp m_1')$からの遷移を調べる必要はある ```python worklist = labels c = abstract(I) while(worklist is not empty): old = c c = widen(c,step(c_worklist)) worklist = [l for l in labels if c(l) > old(l)] #label l gets new info return old ``` ### Precision - 一般にWideningはprecisionを大きく損なってしまう - 使用は必要なとき（収束が保障できないとき）に限った方がよい - 循環制御フロー（while,cyclicになるgotoとか）が遷移先のときはwideningを使うとよい - その他は上界演算$\cup^\sharp$で十分 # 4.4 Use Example of the Framework 4.3節までは理論的な話以下の言語を使って具体的に構築していく ## 4.4.1 Simple Imeprative Language 基本は3章で使った言語と同一(変更点を赤字で強調) $$ \begin{align} x &\in \mathbb{X} \\\\ C ::&=\\ &|\ \textbf{skip} \\ &|\ C;C \\ &|\ x := E \\ &|\ \textbf{input}(x) \\ &|\ \textbf{if}(B)\{C\}\textbf{else}(B)\{C\} \\ &|\ \textbf{while}(B)\{C\} \\ &|\ \color{red}{\textbf{goto}\ E} \\ E ::&= \\ &|\ n \\ &|\ x \\ &|\ \color{red}{E + E} \\ B ::&= \\ &|\ \textbf{true}|\textbf{false} \\ &|\ \color{red}{E < E} \\ &|\ \color{red}{E = E} \\ P ::&= C \end{align} $$ #### 変更点 - 各labelにはuniqueな整数ラベルが貼られる - 文法として動的gotoを追加 - $\rm{goto\ E}$の形で、$\rm{E}$の評価値のラベルにjumpする - 比較演算子の右辺値にもExpression $\rm{E}$を用いることができる - 簡単化のため、演算子を減らす - 二項演算子はAddition$+$だけ - 比較演算子は$<,=$の2つ ### Program Labels and Execution Order program中のラベル間の遷移関係からCFGを構築する $\ll C,l' \gg$を次のように定義する（ただし、$\textrm{label}(C)=l$） $$ \begin{align} \ll C,l' \gg = \textrm{case}\ C\ \textrm{of} \\ \textbf{skip} &: \{\textrm{next}(l) = l'\} \\ x:=E &: \{\textrm{next}(l) = l'\} \\ \textbf{input}(x) &: \{\textrm{next}(l) = l'\} \\ C_1;C_2 &: \{\textrm{next}(l) = \textrm{label}(C_1)\} \cup \ll C_1,\textrm{label}(C_2)\gg \cup \ll C_2,l' \gg \\ \textbf{if}(B)\{C_1\}\textbf{else}\{C_2\} &: \{\textrm{nextTrue}(l)=\textrm{label}(C_1),\textrm{nextFalse}(l)=\textrm{label}(C_2)\} \cup \ll C_1, l'\gg \cup \ll C_2, l' \gg \\ \textbf{while}(B)\{C\} &: \{\textrm{nextTrue}(l)=\textrm{label}(C),\textrm{nextFalse}(l)=l'\} \cup\ll C,l\gg\\ \textbf{goto}\ E &: \{\} \end{align} $$ #### $\textbf{skip},x:=E,\textbf{input}(x)$ $\{\textrm{next}(l) = l'\}$ ![](https://i.imgur.com/8f9HJ5n.png) #### $C_1,C_2$ $\{\textrm{next}(l) = \textrm{label}(C_1)\} \cup\ll C_1,\textrm{label}(C_2)\gg \cup \ll C_2,l' \gg$ ![](https://i.imgur.com/eFZlA5w.png) #### $\textbf{if}(B)\{C_1\}\textbf{else}\{C_2\}$ $\{\textrm{nextTrue}(l)=\textrm{label}(C_1),\textrm{nextFalse}(l)=\textrm{label}(C_2)\}\ \cup \ll C_1, l'\gg \cup \ll C_2, l' \gg$ ![](https://i.imgur.com/Fo3Snja.png) #### $\textbf{while}(B)\{C\}$ $\{\textrm{nextTrue}(l)=\textrm{label}(C),\textrm{nextFalse}(l)=l'\} \cup\ll C,l\gg$ ![](https://i.imgur.com/oIYj8QX.png) - 直感的に、$C$を実行したあとのラベルは$l'$であるという関係を表す - プログラム$p$とその最終ラベル$l_{end}$に対し$\ll p,l_{end} \gg$は、プログラム中のラベル遷移の集合 - つまりCFGを構築できる（gotoを除く） - backwardに探索するイメージ ## 4.4.2 Concrete State Transition Semantics Concrete Semanticsとは、以下の関数$F$の最小不動点$\textrm{lfp}F$のことだった $$ F:\wp(\mathbb{S})\rightarrow \wp(\mathbb{S})\\ F(X) = I \cup Step(X)\\ Step(X) = \breve{\wp}(\hookrightarrow) $$ $\hookrightarrow$を定義すれば$F$が定まるこの言語における「$\hookrightarrow$」を定義 $\hookrightarrow$とは、StateからStateへ遷移することができるという関係 $$ \hookrightarrow\subseteq\mathbb{S} \times \mathbb{S} $$ StateはLabelとMemoryの組で表現 $$ \mathbb{S} = \mathbb{L}\times\mathbb{M} \\ $$ Memoryとは、変数からValueへのマッピング $$ \mathbb{M} = \mathbb{X}\rightarrow\mathbb{V}\\ $$ Valueとは、整数値もしくはラベルのこと $$ \mathbb{V} = \mathbb{Z} \cup\mathbb{L} $$ :::info Valueは文脈によって整数値とラベルの2つの意味をもつ $x + y$において$x$,$y$は整数値と解釈されるし、$\textbf{goto}\ e$の$e$はラベルと解釈される整数値とポインタ値の区別をしてないイメージ ::: ラベルは次に実行するStatementを指し示しているから、ラベルの示すStatementによって場合分けして$\hookrightarrow$を定義 $$ \begin{align} \textbf{skip} &: (l,m) \hookrightarrow(\textrm{next}(l),m)\\ \textbf{input}(x) &: (l,m) \hookrightarrow(\textrm{next}(l),update_x(m,z))\\ x := E &: (l,m) \hookrightarrow(\textrm{next}(l),update_x(m,eval_E(m)))\\ C_1;C_2 &: (l,m) \hookrightarrow(\textrm{next}(l),m)\\ \textbf{if}(B)\{C_1\}\textbf{else}\{C_2\} &: (l,m) \hookrightarrow(\textrm{nextTrue}(l),filter_B(m))\\ &: (l,m) \hookrightarrow(\textrm{nextFalse}(l),filter_{\neg B}(m))\\ \textbf{while}(B)\{C\}&:(l,m)\hookrightarrow(\textrm{nextTrue}(l),filter_B(m))\\ &: (l,m) \hookrightarrow(\textrm{nextFalse}(l),filter_{\neg B}(m))\\ \textbf{goto}\ E &: (l,m) \hookrightarrow(eval_E(m),m) \end{align} $$ ところで、$\hookrightarrow$は$\mathbb{S} \rightarrow \wp({\mathbb{S}})$という写像と等価であるため、以下のような定義をしてもよい $$ \hookrightarrow((l,m)) = \begin{cases} \{(\textrm{next}(l),m)\} & (\textrm{label statement is}\ \textbf{skip}) \\ \{(\textrm{next}(l),update_x(m,z))\} & (\textrm{label statement is}\ \textbf{input}(x)) \\ \{(\textrm{next}(l),update_x(m,eval_E(m)))\} & (\textrm{label statement is}\ x := E) \\ \{(\textrm{next}(l),m)\} & (\textrm{label statement is}\ C_1;C_2)\\ \{(\textrm{nextTrue}(l),filter_B(m)),(\textrm{nextFalse}(l),filter_{\neg B}(m)) \} & (\textrm{label statement is}\ \textbf{if}(B)\{C_1\}\textbf{else}\{C_2\}) \\ \{(\textrm{nextTrue}(l),filter_B(m)),(\textrm{nextFalse}(l),filter_{\neg B}(m)) \} & (\textrm{label statement is}\ \textbf{while}(B)\{C\}) \\ \{(eval_E(m),m)\} & (\textrm{label statement is}\ \textbf{goto}\ E) \end{cases} $$ ## 4.4.3 Abstract State 4.2節で定義したガロア接続より、$\mathbb{M}^\sharp$は完備半順序で、$\wp(\mathbb{M})$とガロア接続 $$ (\wp(\mathbb{M}),\subseteq) \leftrightarrows (\mathbb{M^\sharp},\sqsubseteq) $$ これを次のように構築 concrete memory stateは変数名から値へのmap abstractでは変数名から値の抽象へのmapと定義 $$ \mathbb{M} = \mathbb{X} \rightarrow \mathbb{V}\\ \mathbb{M}^\sharp = \mathbb{X} \rightarrow \mathbb{V}^\sharp $$ ここで、value abstraction $\mathbb{V}^\sharp$もガロア接続で、 $$ (\wp(\mathbb{V}),\subseteq) \leftrightarrows (\mathbb{V^\sharp},\sqsubseteq) $$ Concreteにおける$\mathbb{V}=\mathbb{Z}\cup\mathbb{L}$は、$v\in \mathbb{V}$が文脈により整数値もしくはラベルの意味で解釈される、という意味だった静的解析ではどちらの意味になるのか判断できないので、2つの意味を別々にもたせる？ :::info $\textbf{goto}(E)$の抽象解釈のときにラベルが具体的な値でないと不都合だからかも（後述）実装するなら$\wp({\mathbb{L}})$は$\mathbb{Z}^\sharp$から計算することになりそう ::: $$ \mathbb{V}^\sharp = \mathbb{Z}^\sharp \times \mathbb{L}^\sharp $$ $\mathbb{Z}^\sharp$は$\wp(\mathbb{Z})$とガロア接続 $\mathbb{L}^\sharp$は$\wp(\mathbb{L})$のことつまり$\mathbb{V}^\sharp$は整数値の過大近似と、それに対応するラベルの集合 ![](https://i.imgur.com/CdJwDoU.png) ## 4.4.4 Abstract State Transition Semantics Abstract Semanticsは以下の関数$F^\sharp$からTheorem4.2/4.3のアルゴリズムを用いて計算できる $$ F^\sharp:\mathbb{S}^\sharp \rightarrow \mathbb{S}^\sharp\\ F^\sharp(S^\sharp) = \alpha(I) \cup^\sharp Step^\sharp(S^\sharp) \\ Step^\sharp = \wp(id,\cup^\sharp_M)\circ\pi\circ\breve{\wp}(\hookrightarrow^\sharp) $$ $\hookrightarrow^\sharp,\cup^\sharp,\cup_M^\sharp$を定義すれば$F^\sharp$が導かれる ### Define $\hookrightarrow^\sharp$ Concreteの遷移関係$\hookrightarrow$に基づいて、$\hookrightarrow^\sharp:\mathbb{S}^\sharp\rightarrow \wp(\mathbb{S^\sharp})$を定義 #### $\textbf{skip}$ $(l,M^\sharp) \hookrightarrow^\sharp(\textrm{next}(l),M^\sharp)$ ラベルを進めるだけ #### $\textbf{input}(x)$ $(l,M^\sharp) \hookrightarrow^\sharp(\textrm{next}(l),update_x^\sharp(M^\sharp,\alpha(\mathbb{Z})))$ ラベルを進め、抽象メモリ状態の変数$x$に$\mathbb{Z}$全体の抽象化を割り当てる #### $x:=E$ $(l,M^\sharp) \hookrightarrow^\sharp(\textrm{next}(l),update_x^\sharp(M^\sharp,eval_E^\sharp(M^\sharp)))$ ラベルを進め、抽象メモリ状態の変数$x$に$E$の抽象評価値を割り当てる #### $C_1;C_2$ $(l,M^\sharp) \hookrightarrow^\sharp (\textrm{next}(l),M^\sharp)$ `;`自体は何もしない ※ $\textrm{next}(l)=\textrm{label}(C_1)%$ #### $\textbf{if}(B)\{C_1\}\textbf{else}\{C_2\}$ #### $\textbf{while}(B)\{C\}$ $(l,M^\sharp) \hookrightarrow^\sharp (\textrm{nextTrue}(l),filter^\sharp_B(M^\sharp))$ $(l,M^\sharp) \hookrightarrow^\sharp (\textrm{nextFalse}(l),filter^\sharp_{\neg B}(M^\sharp))$ `ifelse/while`は制御フローの分岐点 #### $\textbf{goto}(E)$ $(l,M^\sharp) \hookrightarrow^\sharp (l_{jmp},M^\sharp)\ \textrm{for}\ l_{jmp}\in L\ \textrm{of}\ (z^\sharp,L)=\textrm{eval}^\sharp_E(M^\sharp)$ $\textrm{eval}^\sharp_E(M^\sharp)$の評価$(z^\sharp,L)\in \mathbb{V}^\sharp$のラベル集合$L$がジャンプ先の候補 #### Soundness of $\hookrightarrow^\sharp$ $update_x^\sharp,eval_E^\sharp,filter^\sharp_B,filter^\sharp_{\neg B}$がSoundnessであればよいある抽象に対し、(具体化してから操作)$\subseteq$(抽象操作してから具体化)となるのがSoundnessの要件 ##### Ex. $filter$ - Concrete:$filter_{\neg B}=\neg filter_B$ ![](https://i.imgur.com/6ZWOebn.png) - Abstract:$filter^\sharp_{\neg B}\neq \neg filter^\sharp_B$ - more Precisely, $filter^\sharp_{\neg B}(M^\sharp) \sqcap filter^\sharp_B(M^\sharp) \sqsupseteq \perp$ ![](https://i.imgur.com/GY1IYXO.png) ![](https://i.imgur.com/45r1isL.png) ##### Ex. ValueはInterval Abstractionを採用 $M^\sharp=\{x\mapsto[1,10]\},B=(x > 5)$ - $filter_B(M^\sharp) = \{x\mapsto[6,10]\}$ - $filter_{\neg B}(M^\sharp) = \{x\mapsto[1,5]\}$ $\therefore filter^\sharp_{\neg B}(M^\sharp) \sqcap filter^\sharp_B(M^\sharp)=\perp$ ##### Question ValueはInterval Abstractionを採用 $M^\sharp=\{x\mapsto[1,10]\},B=(x \% 2 == 0)$ - $filter_B(M^\sharp) = \{x\mapsto?\}$ - $filter_{\neg B}(M^\sharp) = \{x\mapsto?\}$ $\therefore filter^\sharp_{\neg B}(M^\sharp) \sqcap filter^\sharp_B(M^\sharp) \neq \perp$ <details> <summary>answer</summary> - $filter_B(M^\sharp) = \{x\mapsto[2,10]\}$ - $filter_{\neg B}(M^\sharp) = \{x\mapsto[1,9]\}$ </details> ### Define $\cup^\sharp,\cup^\sharp_M$ $\hookrightarrow^\sharp$同様、Soundにしたい上限演算$\sqcup$が$\mathbb{S}^\sharp$上で閉じているならば、$\sqcup$はSoundな$\cup^\sharp$として利用できる同様に、上限演算$\sqcup_M$が$\mathbb{M}^\sharp$上で閉じているならば、$\sqcup$はSoundな$\cup^\sharp_M$として利用できる #### Proof $$ \begin{align} (\gamma\circ\sqcup)(a^\sharp,b^\sharp) &= \gamma(a^\sharp \sqcup b^\sharp)\\ &\supseteq \gamma(a^\sharp)\cup\gamma(b^\sharp)\\ &=(\cup\circ(\gamma,\gamma))(a^\sharp,b^\sharp) \end{align} $$ すなわち、$(\gamma\circ\sqcup)\supseteq(\cup\circ(\gamma,\gamma))$ :::info $\gamma$の単調性により、 $\gamma(a^\sharp \sqcup b^\sharp) \supseteq \gamma(a^\sharp)$と$\gamma(a^\sharp \sqcup b^\sharp) \supseteq \gamma(b^\sharp)$が成立この2つから直ちに$\gamma(a^\sharp \sqcup b^\sharp) \supseteq \gamma(a^\sharp) \cup \gamma(b^\sharp)$ :::