80 views
# Some Ideas and Questions about Value Set Analysis This blog is about my __basic ideas__ about VSA and __some questions__, whose aim is to try to expose my misunderstanding and throw some problems about the difficulty in applying VSA onto RadecoIL. The remainder of the blog is organized as follows: + _My Basic Ideas about Applying VSA onto RadecoIL_ + _Data Struct_ + _Algorithm_ + _Some Obstacle in front of Our Work_ + _Initial Global Data_ + _Call Site_ + _Condition and Loop_ By the way, all of these mentioned below will __ONLY__ concentrate on the basic VSA from ["Analyzing Memory Accesses in x86 Executables"](https://research.cs.wisc.edu/wpis/papers/cc04.pdf). + The __GMOD-Based Merge Function__ mentioned in ["Improved Memory-Access Analysis for x86 Executables"](http://research.cs.wisc.edu/wpis/papers/etaps08.invited.pdf) + __VSA with ASI__ mentioned in [DIVINE: DIscovering Variables IN Executables](http://ai2-s2-pdfs.s3.amazonaws.com/4e41/71cdf4b02565532c15951d959fe623f830a8.pdf) will be included after solving all of problems in this blog. ## My Basic Ideas about Applying VSA onto RadecoIL ### Data Struct VSA consists of three basic compositions: + a_loc: A_locs corresponds to the variables in source code, which contain local variables (gathered from R2), global (gathered from R2), heap-allocation variables and registers. + Value Set: N-tuple of PIC, for n memory regions, while RIC consist of four numbers (a, b, c, d) as a segment: __a * [b, c] + d__. In general, __pure numbers could be in the GLOBAL region (Like what the original paper did).__ + Abstract Store: Abstract Stores are a map from a_locs to Value Sets, for every single instruction. In RadecoIL, it could be bounded with every single OpCode/Phi node. From now on, I have three ideas about the whole data structures, but I'm not sure which to choose, or other better ones. - [ ] _Building a HashMap for every OpCode/Phi node with another HashMap. The latter will map every a\_loc to its value set._ The __shortage__ of this implement is the huge memory and time consumption, as both equal O(nmp), while n is the number of SSA nodes, m is the numbers of a_locs and p is the numbers of memory regions. - [ ] Another idea is based on the truth that __every Opcode/Phi node will only changes one or two a\_locs__. We could _maintain a HashMap from a\_locs to value sets during the traveling in SSA_. This will cut down the consumption, but increase the possibility of writing bugs. The reason is that this implementation need us to consider much more special cases, like a loop or conditional statements. - [ ] A middle course, maintain abstract store at every basic block's entry and exit. ### Algorithm The algoriths for all data structures are similar, working in as BFS. 1. The program maintains a __heap__, which consist of basic blocks. The sort key of the heap is __the number of unvisited predecessor__, from small to big. The aim of this heap is that we could distinguish whether the block is a loop entry. When Heap pops a basic block, if the number of unvisited predecessor is __zero__, it means is's not a loop entry block, otherwise, it is. 2. Push the start node into heap with the initial abstract store and a_locs. 3. If the heap is empty, go to step 7. 4. Pop a basic block, if it's a loop entry, use limited widening for every phi node, otherwise, use simple union/join. 5. Trave the Opcode node in order within the poped block, and calculate the abstract store at the exit of this block. 6. Push all the successor of the poped block. Go to step 4. 7. End the program. ## Some Obstacles in front of Our Work ### Initial Global Data However, the algorithm mentioned above is not complete. The first obstacle is the initial global data. The problem occurs in the step 2 for _initial abstract store_ We all know that global datas could have __initial value__. It could be an address, or a pure number, or more common, a zero number in .bss segment. But the point is, we only concentrate on single function, that means we could not know whether other functions have changed their value. __We cannot have a accurate initial abstrace store for global__. Only one solution I could have is to assume they are any value. ### Call Site Call Statements are the most difficult point in the whole alogrithm. __We cannot calculate abstract store when we meet an OpCall__, because we cannot know what happen will inside the callee, even the BP/SP could be changed in special situation like function alloc. In the original paper, the auther solved this problem by building a __supergraph of the executable__, which includes all the functions. But we could do that, because it's an amazing consumption on time and memory. Thus, the only solution is to __analyze functions in order__, from leaf to the root in a special tree. The tree consists of functions, where the father function node calls the son function node. But, there are still some sub-problems: + In common situation, the leaf functions are always library functions. So __how could we get the abstract store from library functions__? + The point of analyzing functions in order is that we should get the register who works for the return value. So __how could we finger this register out__? (See afC command in radare2, implementing afCr may be a way to go) + Another point of this implement is to find out the preserved registers, which will be used to determine the abstract store in caller function. The method mentioned in the original paper is try to find the smallest sp_delta, but __how to find the sp\_delta?__ + __Recursive function__ If above problems have been solved (or maybe there are still other problems), we could gather information form callee function and handle OpCall ## Condition and Loop We have mentioned a method to finger the loop entry. __By the way, that idea was made by myself, so I'm not sure it's total right, even I have proved it.__ There're also other problems about the limited widening: + The first problem is the, in the original paper, authers could easily calculate the value range of the register which controls the execution of the loop. However, in our radeco-lib, it's not an easy work. Because our block's selectors are always flag register, but not general registers. __How could we get the real condition for a conditional statement?__ It should be solved, even we bypass it in VSA, but we will meet it in C-AST. + The second problem is the Affine Relations of general registers. __I think it's a hard work to find them__ However, the problem in Condition and Loop is not so urgent, because we can give up a litte preciseness and replace Limited Widening with Ordinary Widening ;P ## Some Pre-work should be done 1. Make a HashMap from SSA node into it's registers (for calculating abstract store) 2. Gather global variables and their initial value from r2, especially some readonly data.