---
tags: decompiler
title: pycdc patterns
---

# pycdc rules

## R0 -> R16

Name: Separating Control Structures

Original

```
if x:
    s1
for i in y:
    s2
```

Transformation:

```
if x:
    s1
FET_null()
for i in y:
    s2
```

Description: Consecutive control structures can be difficult to parse, especially when the decompiler perceives two control structures as one. Adding a no-op statement between them fixes this.

## Unsupported instructions:

Note: struck-through instructions are those handled by the Python 3.9 conversion.

- [x] ~~JUMP_IF_NOT_EXC_MATCH~~
- [x] ~~LIST_TO_TUPLE~~
- [x] CALL_FUNCTION_EX
- [x] BUILD_MAP_UNPACK_WITH_CALL
- [x] BUILD_SET
- [x] BEGIN_FINALLY
- [x] BUILD_TUPLE_UNPACK_WITH_CALL
- [x] MAP_ADD
- [x] ~~LOAD_ASSERTION_ERROR~~
- [x] ~~DICT_MERGE~~
- [x] UNPACK_EX
- [x] WITH_CLEANUP_START
- [x] GET_YIELD_FROM_ITER
- [x] ~~WITH_EXCEPT_START~~
- [x] ~~RERAISE~~
- [x] CALL_FINALLY
- [x] LOAD_CLASSDEREF
- [x] BUILD_TUPLE_UNPACK
- [x] ~~DICT_UPDATE~~
- [x] BUILD_MAP_UNPACK
- [x] BEFORE_ASYNC_WITH
- [ ] STORE_ANNOTATION
- [ ] EXTENDED_ARG
- [x] BUILD_LIST_UNPACK

![](https://i.imgur.com/txCQFp3.jpg)

## New rules:

![](https://i.imgur.com/r8ZIAF9.png)

r1: Set initialization

Original Stmt.

```python
{x, y, z}
```

Transformation

```python
FET_set(x, y, z)
```

r10: Expanding `yield` statement.

Original Stmt.

```python
yield from z
```

Transformation

```python
for i in z:
    yield i
```

or

```python
FET_yield_from(z)
```

r11: Dictionary generation to list generation.

Original Stmt.

```python
{a:b for a,b in c}
```

Transformation

```python
[(a,b) for a,b in c]
```

:::success
Add only 3 above.
:::

r15:

```python
def PYFUNCTYPE(restype, *argtypes):
    a = 1
    class CFunctionType(_CFuncPtr):
        _argtypes_ = argtypes
        _restype_ = restype
        _a = a
    return CFunctionType
```

```python
def PYFUNCTYPE(restype, *argtypes):
    a = 1
    class CFunctionType(_CFuncPtr):
        _argtypes_ = FET_argtypes
        _restype_ = restype_Scope_change
        _a = a_Scope_change
    return CFunctionType
```

## Unsupported instructions

1. 
`BUILD_SET`

py version: (3, 8) (2, 7) (3, 7) (3, 9) (3, 4) (3, 5) (3, 6)

Instructions:

```
temp = {1, 2, 3}
```

transformation:

```
temp = custom_set(1, 2, 3)
```

py27:

regex: `[LOAD_CONST]+[BUILD_SET]`

Transformation: `[LOAD_NAME][LOAD_CONST]+[CALL_FUNCTION]`

```
28 LOAD_CONST (':')
30 LOAD_CONST ('1')
32 LOAD_CONST (':')
34 LOAD_CONST ('5')
36 BUILD_SET 4
```

transforms to:

```
 6 LOAD_NAME (foo)
 8 LOAD_CONST (':')
10 LOAD_CONST ('1')
12 LOAD_CONST (':')
14 LOAD_CONST ('5')
16 CALL_FUNCTION (4 positional arguments)
```

:::warning
py39 uses frozenset instead. Use R25.
:::

2. `CALL_FUNCTION_EX`

py version:

instructions:

```
def send(self, *args, **kwargs):
    self._na('send() method')
    return self._sock.send(*args, **kwargs)
```

transforms to:

```
def send(self, args, kwargs):
    self._na('send() method')
    return self._sock.send(args, kwargs)
```

regex: `[LOAD_FAST,LOAD_NAME, LOAD_ATTR][LOAD_FAST]+[CALL_FUNCTION_EX]`

transforms to: `[LOAD_FAST,LOAD_NAME, LOAD_ATTR][LOAD_FAST]+[CALL_FUNCTION]`

original:

```
10 LOAD_FAST (self)
12 LOAD_ATTR (_sock)
14 LOAD_ATTR (send)
16 LOAD_FAST (args)
18 LOAD_FAST (kwargs)
20 CALL_FUNCTION_EX (keyword and positional arguments)
```

transforms to:

```
10 LOAD_FAST (self)
12 LOAD_ATTR (_sock)
14 LOAD_ATTR (send)
16 LOAD_FAST (args)
18 LOAD_FAST (kwargs)
20 CALL_FUNCTION (2 positional arguments)
```

3. 
`BUILD_MAP_UNPACK_WITH_CALL`:

```
foo(a, b= 1, c=2, **d)
```

```
 0 LOAD_NAME (foo)
 2 LOAD_NAME (a)
 4 BUILD_TUPLE 1
 6 LOAD_CONST (1)
 8 LOAD_CONST (2)
10 LOAD_CONST (('b', 'c'))
12 BUILD_CONST_KEY_MAP 2
14 LOAD_NAME (d)
16 BUILD_MAP_UNPACK_WITH_CALL (2 mappings)
18 CALL_FUNCTION_EX (keyword and positional arguments)
20 POP_TOP
22 LOAD_CONST (None)
24 RETURN_VALUE
```

Transforms to:

```
foo(a,double_star_d, b= 1, c=2)
```

```
 0 LOAD_NAME (foo)
 2 LOAD_NAME (a)
 4 LOAD_NAME (double_star_d)
 6 LOAD_CONST (1)
 8 LOAD_CONST (2)
10 LOAD_CONST (('b', 'c'))
12 CALL_FUNCTION_KW (4 total positional and keyword args)
14 POP_TOP
16 LOAD_CONST (None)
18 RETURN_VALUE
```

4. `BUILD_TUPLE_UNPACK_WITH_CALL`

```
foo(a,*d, b= 1, c=2, *e)
```

```
 0 LOAD_NAME (foo)
 2 LOAD_NAME (a)
 4 BUILD_TUPLE 1
 6 LOAD_NAME (d)
 8 LOAD_NAME (e)
10 BUILD_TUPLE_UNPACK_WITH_CALL 3
12 LOAD_CONST (1)
14 LOAD_CONST (2)
16 LOAD_CONST (('b', 'c'))
18 BUILD_CONST_KEY_MAP 2
20 CALL_FUNCTION_EX (keyword and positional arguments)
22 POP_TOP
24 LOAD_CONST (None)
26 RETURN_VALUE
```

converts to:

```
foo(a,d_single_star,e_single_star, b= 1, c=2 )
```

```
 0 LOAD_NAME (foo)
 2 LOAD_NAME (a)
 4 LOAD_NAME (d_single_star)
 6 LOAD_NAME (e_single_star)
 8 LOAD_CONST (1)
10 LOAD_CONST (2)
12 LOAD_CONST (('b', 'c'))
14 CALL_FUNCTION_KW (5 total positional and keyword args)
16 POP_TOP
18 LOAD_CONST (None)
20 RETURN_VALUE
```

![](https://i.imgur.com/00F47cV.png)

5. `BUILD_TUPLE_UNPACK`

```
temp = ("", *_strprefixes)
```

```
 0 LOAD_CONST (('',))
 2 LOAD_NAME (_strprefixes)
 4 BUILD_TUPLE_UNPACK 2
 6 STORE_NAME (temp)
 8 LOAD_CONST (None)
10 RETURN_VALUE
```

Transforms to:

```
temp = ("", _strprefixes_star)
```

```
 0 LOAD_CONST ('')
 2 LOAD_NAME (_strprefixes_star)
 4 BUILD_TUPLE 2
 6 STORE_NAME (temp)
 8 LOAD_CONST (None)
10 RETURN_VALUE
```

6. 
`BUILD_LIST_UNPACK`

```
temp = ["", *_strprefixes]
```

```
 0 LOAD_CONST (('',))
 2 LOAD_NAME (_strprefixes)
 4 BUILD_LIST_UNPACK 2
 6 STORE_NAME (temp)
 8 LOAD_CONST (None)
10 RETURN_VALUE
```

Transforms to:

```
temp = ["", _strprefixes_star]
```

```
 0 LOAD_CONST ('')
 2 LOAD_NAME (_strprefixes_star)
 4 BUILD_LIST 2
 6 STORE_NAME (temp)
 8 LOAD_CONST (None)
10 RETURN_VALUE
```

7. `BUILD_MAP_UNPACK`

```
temp = {**_strprefixes}
```

```
0 LOAD_NAME (_strprefixes)
2 BUILD_MAP_UNPACK 1
4 STORE_NAME (temp)
6 LOAD_CONST (None)
8 RETURN_VALUE
```

Transforms to:

```
temp = {_strprefixes_two_star}
```

```
0 LOAD_NAME (_strprefixes_two_star)
2 BUILD_SET 1
4 STORE_NAME (temp)
6 LOAD_CONST (None)
8 RETURN_VALUE
```

8. `BEGIN_FINALLY`

```
try:
    z=z
except:
    z=z
finally:
    z=z
```

```
1:     0 SETUP_FINALLY (to 32)
       2 SETUP_FINALLY (to 12)
2:     4 LOAD_NAME (z)
       6 STORE_NAME (z)
       8 POP_BLOCK
      10 JUMP_FORWARD (to 28)
3: >> 12 POP_TOP
      14 POP_TOP
      16 POP_TOP
4:    18 LOAD_NAME (z)
      20 STORE_NAME (z)
      22 POP_EXCEPT
      24 JUMP_FORWARD (to 28)
      26 END_FINALLY
   >> 28 POP_BLOCK
      30 BEGIN_FINALLY
6: >> 32 LOAD_NAME (z)
      34 STORE_NAME (z)
      36 END_FINALLY
      38 LOAD_CONST (None)
      40 RETURN_VALUE
```

Converts to:

```
try:
    z=z
    a= True
except:
    z=z
if a:
    z=z
```

```
1:     0 SETUP_FINALLY (to 14)
2:     2 LOAD_NAME (z)
       4 STORE_NAME (z)
3:     6 LOAD_CONST (True)
       8 STORE_NAME (a)
      10 POP_BLOCK
      12 JUMP_FORWARD (to 30)
4: >> 14 POP_TOP
      16 POP_TOP
      18 POP_TOP
5:    20 LOAD_NAME (z)
      22 STORE_NAME (z)
      24 POP_EXCEPT
      26 JUMP_FORWARD (to 30)
      28 END_FINALLY
6: >> 30 LOAD_NAME (a)
      32 POP_JUMP_IF_FALSE (to 38)
7:    34 LOAD_NAME (z)
      36 STORE_NAME (z)
   >> 38 LOAD_CONST (None)
      40 RETURN_VALUE
```

9. 
`UNPACK_EX`

```
tok, *remainder = _wsp_splitter(value, 1)
```

```
 0 LOAD_NAME (_wsp_splitter)
 2 LOAD_NAME (value)
 4 LOAD_CONST (1)
 6 CALL_FUNCTION (2 positional arguments)
 8 UNPACK_EX 1
10 STORE_NAME (tok)
12 STORE_NAME (remainder)
14 LOAD_CONST (None)
16 RETURN_VALUE
```

converts to:

```
tok, one_star_remainder = _wsp_splitter(value, 1)
```

```
 0 LOAD_NAME (_wsp_splitter)
 2 LOAD_NAME (value)
 4 LOAD_CONST (1)
 6 CALL_FUNCTION (2 positional arguments)
 8 UNPACK_SEQUENCE 2
10 STORE_NAME (tok)
12 STORE_NAME (one_star_remainder)
14 LOAD_CONST (None)
16 RETURN_VALUE
```

10. `GET_YIELD_FROM_ITER`

```
def _pp(self, indent=''):
    yield from z
```

```
 0 LOAD_GLOBAL (z)
 2 GET_YIELD_FROM_ITER
 4 LOAD_CONST (None)
 6 YIELD_FROM
 8 POP_TOP
10 LOAD_CONST (None)
12 RETURN_VALUE
```

converts to:

reference for [yield from](https://stackoverflow.com/questions/9708902/in-practice-what-are-the-main-uses-for-the-yield-from-syntax-in-python-3-3)

```
def _pp(self, indent=''):
    for i in z:
        yield i
```

```
       0 LOAD_GLOBAL (z)
       2 GET_ITER
   >>  4 FOR_ITER (to 16)
       6 STORE_FAST (i)
6:     8 LOAD_FAST (i)
      10 YIELD_VALUE
      12 POP_TOP
      14 JUMP_ABSOLUTE (to 4)
   >> 16 LOAD_CONST (None)
      18 RETURN_VALUE
```

11. `MAP_ADD`

```
kwds['params'] = {a:b for a,b in c}
```

```
       0 BUILD_MAP 0
       2 LOAD_FAST (.0)
   >>  4 FOR_ITER (to 20)
2:     6 UNPACK_SEQUENCE 2
       8 STORE_FAST (a)
      10 STORE_FAST (b)
2:    12 LOAD_FAST (a)
2:    14 LOAD_FAST (b)
      16 MAP_ADD 2
      18 JUMP_ABSOLUTE (to 4)
   >> 20 RETURN_VALUE
```

converts to:

```
kwds['params'] = [(a,b) for a,b in c]
```

```
2:     0 BUILD_LIST 0
       2 LOAD_FAST (.0)
   >>  4 FOR_ITER (to 22)
2:     6 UNPACK_SEQUENCE 2
       8 STORE_FAST (a)
      10 STORE_FAST (b)
2:    12 LOAD_FAST (a)
      14 LOAD_FAST (b)
      16 BUILD_TUPLE 2
      18 LIST_APPEND 2
      20 JUMP_ABSOLUTE (to 4)
   >> 22 RETURN_VALUE
```

12. 
`BEFORE_ASYNC_WITH`

```
async def _fetch_certificate(self):
    async with _RSClient as session:
        z=z
```

```
       0 LOAD_GLOBAL (_RSClient)
       2 BEFORE_ASYNC_WITH
       4 GET_AWAITABLE
       6 LOAD_CONST (None)
       8 YIELD_FROM
      10 SETUP_ASYNC_WITH (to 22)
      12 STORE_FAST (session)
4:    14 LOAD_FAST (z)
      16 STORE_FAST (z)
      18 POP_BLOCK
      20 LOAD_CONST (None)
   >> 22 WITH_CLEANUP_START
      24 GET_AWAITABLE
      26 LOAD_CONST (None)
      28 YIELD_FROM
      30 WITH_CLEANUP_FINISH
      32 END_FINALLY
      34 LOAD_CONST (None)
      36 RETURN_VALUE
```

converts to:

```
async def _fetch_certificate(self):
    with _RSClient as session:
        z=z
```

```
       0 LOAD_GLOBAL (_RSClient)
       2 SETUP_WITH (to 14)
       4 STORE_FAST (session)
4:     6 LOAD_FAST (z)
       8 STORE_FAST (z)
      10 POP_BLOCK
      12 LOAD_CONST (None)
   >> 14 WITH_CLEANUP_START
      16 WITH_CLEANUP_FINISH
      18 END_FINALLY
      20 LOAD_CONST (None)
      22 RETURN_VALUE
```

13. `WITH_CLEANUP_START`

```
def _fetch_certificate(self):
    with _RSClient as session:
        z=z
```

```
       0 LOAD_GLOBAL (_RSClient)
       2 SETUP_WITH (to 14)
       4 STORE_FAST (session)
4:     6 LOAD_FAST (z)
       8 STORE_FAST (z)
      10 POP_BLOCK
      12 LOAD_CONST (None)
   >> 14 WITH_CLEANUP_START
      16 WITH_CLEANUP_FINISH
      18 END_FINALLY
      20 LOAD_CONST (None)
      22 RETURN_VALUE
```

converts to:

```
def _fetch_certificate(self):
    session = _RSClient
    z=z
    session.close()
```

```
    0 LOAD_GLOBAL (_RSClient)
    2 STORE_FAST (session)
3:  4 LOAD_FAST (z)
    6 STORE_FAST (z)
4:  8 LOAD_FAST (session)
   10 LOAD_ATTR (close)
   12 CALL_FUNCTION (0 positional arguments)
   14 POP_TOP
   16 LOAD_CONST (None)
   18 RETURN_VALUE
```

14. 
`CALL_FINALLY`

```
def foo():
    try:
        return pipepager(text, 'more')
    finally:
        os.unlink(filename)
```

```
       0 SETUP_FINALLY (to 16)
3:     2 LOAD_GLOBAL (pipepager)
       4 LOAD_GLOBAL (text)
       6 LOAD_CONST ('more')
       8 CALL_FUNCTION 2
      10 POP_BLOCK
      12 CALL_FINALLY (to 16)
      14 RETURN_VALUE
5: >> 16 LOAD_GLOBAL (os)
      18 LOAD_METHOD (unlink)
      20 LOAD_GLOBAL (filename)
      22 CALL_METHOD 1
      24 POP_TOP
      26 END_FINALLY
      28 LOAD_CONST (None)
      30 RETURN_VALUE
```

converts to:

```
       0 SETUP_FINALLY (to 16)
3:     2 LOAD_GLOBAL (pipepager)
       4 LOAD_GLOBAL (text)
       6 LOAD_CONST ('more')
       8 CALL_FUNCTION 2
      10 STORE_FAST (return_here)
      12 POP_BLOCK
      14 BEGIN_FINALLY
5: >> 16 LOAD_GLOBAL (os)
      18 LOAD_METHOD (unlink)
      20 LOAD_GLOBAL (filename)
      22 CALL_METHOD 1
      24 POP_TOP
      26 END_FINALLY
      28 LOAD_CONST (None)
      30 RETURN_VALUE
```

```
def foo():
    try:
        return_here= pipepager(text, 'more')
    finally:
        os.unlink(filename)
```

15. `LOAD_CLASSDEREF`

original:

```
def PYFUNCTYPE(restype, *argtypes):
    a= 1
    class CFunctionType(_CFuncPtr):
        _argtypes_ = argtypes
        _restype_ = restype
        _a = a
    return CFunctionType
```

```
    0 LOAD_NAME (__name__)
    2 STORE_NAME (__module__)
    4 LOAD_CONST ('PYFUNCTYPE.<locals>.CFunctionType')
    6 STORE_NAME (__qualname__)
5:  8 LOAD_CLASSDEREF (argtypes)
   10 STORE_NAME (_argtypes_)
6: 12 LOAD_CLASSDEREF (restype)
   14 STORE_NAME (_restype_)
7: 16 LOAD_CLASSDEREF (a)
   18 STORE_NAME (_a)
   20 LOAD_CONST (None)
   22 RETURN_VALUE
```

change to:

```
def PYFUNCTYPE(restype, *argtypes):
    a= 1
    class CFunctionType(_CFuncPtr):
        _argtypes_ = argtypes_Scope_change
        _restype_ = restype_Scope_change
        _a = a_Scope_change
    return CFunctionType
```

```
    0 LOAD_NAME (__name__)
    2 STORE_NAME (__module__)
    4 LOAD_CONST ('PYFUNCTYPE.<locals>.CFunctionType')
    6 STORE_NAME (__qualname__)
4:  8 LOAD_NAME (argtypes_Scope_change)
   10 STORE_NAME (_argtypes_)
5: 12 LOAD_NAME (restype_Scope_change)
   14 STORE_NAME (_restype_)
6: 16 LOAD_NAME (a_Scope_change)
   18 STORE_NAME (_a)
   20 LOAD_CONST (None)
   22 RETURN_VALUE
```
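
The rules in these notes are stated as bytecode patterns, but the source-level form of r1, r10 and r11 can be prototyped with Python's standard `ast` module to check what a rewritten statement should look like. The following is only a rough sketch under that assumption, not the way the rewriting is actually done here: the class name `RewriteUnsupported` and the helper `rewrite` are hypothetical, the loop variable `i` is taken verbatim from r10 and may shadow an existing name, and `ast.unparse` requires Python 3.9 or later.

```python
import ast


class RewriteUnsupported(ast.NodeTransformer):
    """Hypothetical source-level sketch of rules r1, r10 and r11."""

    def visit_Set(self, node):
        # r1: {x, y, z} -> FET_set(x, y, z)
        self.generic_visit(node)
        call = ast.Call(
            func=ast.Name(id="FET_set", ctx=ast.Load()),
            args=node.elts,
            keywords=[],
        )
        return ast.copy_location(call, node)

    def visit_Expr(self, node):
        # r10: a bare `yield from z` statement -> `for i in z: yield i`
        self.generic_visit(node)
        if isinstance(node.value, ast.YieldFrom):
            yield_i = ast.Expr(
                value=ast.Yield(value=ast.Name(id="i", ctx=ast.Load()))
            )
            loop = ast.For(
                target=ast.Name(id="i", ctx=ast.Store()),
                iter=node.value.value,
                body=[yield_i],
                orelse=[],
            )
            return ast.copy_location(loop, node)
        return node

    def visit_DictComp(self, node):
        # r11: {a: b for ...} -> [(a, b) for ...]
        self.generic_visit(node)
        pair = ast.Tuple(elts=[node.key, node.value], ctx=ast.Load())
        comp = ast.ListComp(elt=pair, generators=node.generators)
        return ast.copy_location(comp, node)


def rewrite(source: str) -> str:
    """Apply the three rewrites to a source string and return new source."""
    tree = RewriteUnsupported().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # Python 3.9+


if __name__ == "__main__":
    print(rewrite("temp = {1, 2, 3}"))
    print(rewrite("kwds['params'] = {a: b for a, b in c}"))
    print(rewrite("def _pp(self, indent=''):\n    yield from z\n"))
```

Like the rules themselves, these rewrites change what the program does (a list of pairs is not a dict, and a `for` loop does not forward `send()` or `throw()` the way `yield from` does), so the output is only meaningful as input to the decompiler, with the placeholder names mapped back afterwards.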