0%

南大《软件分析》04.Data Flow Analysis - Applications II

前言

本文是南京大学李樾、谭添老师的《软件分析》课程的笔记。内容来自课程ppt以及课程讲解。

课程主页:Static Program Analysis | Tai-e (pascal-lab.net)

授课视频:视频南京大学《软件分析》课程01(Introduction)_哔哩哔哩_bilibili

课程实验repo:pascal-lab/Tai-e-assignments: Tai-e assignments for static program analysis (github.com)

Contents

  1. Overview of Data Flow Analysis
  2. Preliminaries of Data Flow Analysis
  3. Reaching Definitions Analysis
  4. Live Variables Analysis
  5. Available Expressions Analysis

Live Variables Analysis

Live Variables Analysis

Live variables analysis tells whether the value of variable v at program point p could be used along some path in CFG starting at p. If so, v is live at p; otherwise, v is dead at p.

  • Information of live variables can be used for register allocations. e.g., at some point all registers are full and we need to use one, then we should favor using a register with a dead value.

Understanding Live Variables Analysis

Abstraction

  • Data Flow Values/Facts

    • All the variables in a program

    • Can be represented by bit vertors

      e.g., V1, V2, V3, V4, …, V100(100 variables)

      $\underbrace{0000…0}_{100}$

      Bit i from the left represents variable Vi

Safe-approximation

Algorithm of Live Variables Analysis

INPUT: CFG($def_B$ and $use_B$ computed for each basic block B)

OUTPUT: $IN[B]$ and $OUT[B]$ for each basic block B

METHOD:

Example

Iteration 0(Init)

OUTPUT use def INPUT
exit 000 0000
B5 000 0000
B4 000 0000
B3 000 0000
B2 000 0000
B1 000 0000

Iteration 1

OUTPUT use def INPUT
exit 000 0000
B5 000 0000 p z 000 1000
B4 000 0000 U 000 1000 y x q 010 1000
B3 000 1000 x 100 1000
B2 010 1000 U 100 1000 k m y 100 1001
B1 100 1001 p q z y x 001 1101

Changes occur in IN[] of B1, B2, B3, B4, B5

Iteration 2

OUTPUT use def INPUT
exit 000 0000
B5 000 0000 p z 000 1000
B4 100 1001 U 000 1000 y x q 010 1001
B3 000 1000 x 100 1000
B2 010 1001 U 100 1000 k m y 100 1001
B1 100 1001 p q z y x 001 1101

Changes orrcur in IN[] of B4

Iteration 3

OUTPUT use def INPUT
exit 000 0000
B5 000 0000 p z 000 1000
B4 100 1001 U 000 1000 y x q 010 1001
B3 000 1000 x 100 1000
B2 010 1001 U 100 1000 k m y 100 1001
B1 100 1001 p q z y x 001 1101

No changes occur in any IN[]

Avaliable Expression Analysis

Avaliable Expression Analysis

An expression x op y is available at program point p if (1) all paths from the entry to p must pass through the evaluation of x op y, and (2) after the last evaluation of x op y, there is no redefinition of x or y

  • This definition means at program p, we can replace expression x op y by the result of its last evaluation
  • The information of available expressions can be used for detecting global common subexpressions

Understanding Avaliable Expression Analysis

Abstraction

  • Data Flow Values/Facts

    • All the expressions in a program

    • Can be represented by bit vectors

      e.g., E1, E2, E3, E4, …, E100 (100 expressions)

      $\underbrace{0000…0}_{100}$

      Bit i from the left represents expression Ei

Safe-approximation

  • Add to OUT the expression x op y(gen)
  • Delete from IN any expression involving variable a(kill)

$OUT[B] = gen_B \cup(IN[B] - kill_B)$

$IN[B] = \cap_{P~a~predecessor~of~B }OUT[P]$

$\cap$ : All paths from entry to point p must pass through the evaluation of x op y

For safety of the analysis, it may report an expression as unavailable even if it is availabe(must analysis -> under-approximation)

Algorithm of Avaliable Expressions Analysis

INPUT: CFG($kill_B$ and $gen_B$ computed for each basic block B)

OUTPUT: IN[B] and OUT[B] for each basic block B

METHOD:

Example

Iteration 0(Init)

INPUT gen kill OUTPUT
entry 00000
B1 11111
B2 11111
B3 11111
B4 11111
B5 11111

Iteration 1

INPUT gen kill OUTPUT
entry 00000
B1 00000 p-1 y 10000
B2 11111 $\cap$ 10000 z/5 $e^7*x$ k,p 01010
B3 01010 y + 3 z 00011
B4 01010 2*y $e^7*x$ x q 01110
B5 00011 $\cap$ 01110 $e^7*x$ z/5 m y 01010

Changes occur in OUT[] of B1 B2 B3 B4 B5

Iteration 2

INPUT gen kill OUTPUT
entry 00000
B1 00000 p-1 y 10000
B2 01110 $\cap$ 10000 z/5 $e^7*x$ k,p 01010
B3 01010 y + 3 z 00011
B4 01010 2*y $e^7*x$ x q 01110
B5 00011 $\cap$ 01110 $e^7*x$ z/5 m y 01010

No changes occur in any OUT[]

Analysis Comparison

Reaching Definitions Live Variables Available Expressions
Domain Set of definitions Set of variables Set of expressions
Direction Forwards Backwards Forwards
Mat/Must May May Must
Boundary $OUT[entry]=\phi$ $IN[exit]=\phi$ $OUT[entry]=\phi$
Initialization $OUT[B]=\phi$ $IN[B]=\phi$ $OUT[B]=\cup$
Transfer function $OUT=gen\cup(IN-kill)$ - -
Meet $\cup$ $\cup$ $\cap$