# [Note] Utilization of Python disassembler/decompiler Decompyle++

###### tags: `python decompiler`, `python`, `decompiler`, `disassembler`, `pycdas`, `pycdc`, `decompyle++`, `reverse engineer`

[toc]

## Goal
To retrieve the Python source code by decompiling `*.pyc` and `*.pyo` files back to `*.py`. The goal is the same as in the companion note [Utilization of uncompyle6](https://hackmd.io/@TomasZheng/B1oHY0Zs_).

## Introduction
| Abbr. | Description |
| -------- | -------- |
| .py | This is normally the input source code that you've written. |
| .pyc | This is the compiled bytecode. If you import a module, Python builds a `*.pyc` file containing the bytecode so that importing the module again later is easier (and faster). |
| .pyo | This was a file format used before Python 3.5 for `*.pyc` files created with the optimization (`-O`) flag. |
| .pyd | This is a dynamic link library that contains a Python module, or a set of modules, to be called by other Python code. It could be a `*.so` file on Linux or a `*.dll`-like file on Windows. Files of this kind are hard to decompile. |

Both pycdas and pycdc are written in C++ and are lightweight. At commit *99b35a11* (2021 Nov 22), the pycdc binary is only 860 KB.

## Environment
```shell
Tomas# uname -a
Linux tomas 5.4.0-91-generic #102~18.04.1-Ubuntu SMP Thu Nov 11 14:46:36 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Tomas# python --version
Python 2.7.17
Tomas# cmake --version
cmake version 3.10.2

CMake suite maintained and supported by Kitware (kitware.com/cmake).
```

## Build and Setup
```shell
Tomas# git clone https://github.com/zrax/pycdc.git
Tomas# cd pycdc
Tomas# cmake .
-- The C compiler identification is GNU 7.5.0
-- The CXX compiler identification is GNU 7.5.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found PythonInterp: /usr/bin/python (found version "2.7.17")
-- Configuring done
-- Generating done
-- Build files have been written to: /home/tomas/tmp/disassembler/pycdc
Tomas# make
[  2%] Generating bytes/python_10.cpp, bytes/python_11.cpp, bytes/python_13.cpp, bytes/python_14.cpp, bytes/python_15.cpp, bytes/python_16.cpp, bytes/python_20.cpp, bytes/python_21.cpp, bytes/python_22.cpp, bytes/python_23.cpp, bytes/python_24.cpp, bytes/python_25.cpp, bytes/python_26.cpp, bytes/python_27.cpp, bytes/python_30.cpp, bytes/python_31.cpp, bytes/python_32.cpp, bytes/python_33.cpp, bytes/python_34.cpp, bytes/python_35.cpp, bytes/python_36.cpp, bytes/python_37.cpp, bytes/python_38.cpp, bytes/python_39.cpp, bytes/python_310.cpp
Scanning dependencies of target pycxx
[  4%] Building CXX object CMakeFiles/pycxx.dir/bytecode.cpp.o
[  7%] Building CXX object CMakeFiles/pycxx.dir/data.cpp.o
[  9%] Building CXX object CMakeFiles/pycxx.dir/pyc_code.cpp.o
[ 12%] Building CXX object CMakeFiles/pycxx.dir/pyc_module.cpp.o
[ 14%] Building CXX object CMakeFiles/pycxx.dir/pyc_numeric.cpp.o
[ 17%] Building CXX object CMakeFiles/pycxx.dir/pyc_object.cpp.o
[ 19%] Building CXX object CMakeFiles/pycxx.dir/pyc_sequence.cpp.o
[ 21%] Building CXX object CMakeFiles/pycxx.dir/pyc_string.cpp.o
[ 24%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_10.cpp.o
[ 26%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_11.cpp.o
[ 29%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_13.cpp.o
[ 31%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_14.cpp.o
[ 34%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_15.cpp.o
[ 36%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_16.cpp.o
[ 39%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_20.cpp.o
[ 41%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_21.cpp.o
[ 43%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_22.cpp.o
[ 46%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_23.cpp.o
[ 48%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_24.cpp.o
[ 51%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_25.cpp.o
[ 53%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_26.cpp.o
[ 56%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_27.cpp.o
[ 58%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_30.cpp.o
[ 60%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_31.cpp.o
[ 63%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_32.cpp.o
[ 65%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_33.cpp.o
[ 68%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_34.cpp.o
[ 70%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_35.cpp.o
[ 73%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_36.cpp.o
[ 75%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_37.cpp.o
[ 78%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_38.cpp.o
[ 80%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_39.cpp.o
[ 82%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_310.cpp.o
[ 85%] Linking CXX static library libpycxx.a
[ 85%] Built target pycxx
Scanning dependencies of target pycdas
[ 87%] Building CXX object CMakeFiles/pycdas.dir/pycdas.cpp.o
[ 90%] Linking CXX executable pycdas
[ 90%] Built target pycdas
Scanning dependencies of target pycdc
[ 92%] Building CXX object CMakeFiles/pycdc.dir/pycdc.cpp.o
[ 95%] Building CXX object CMakeFiles/pycdc.dir/ASTree.cpp.o
[ 97%] Building CXX object CMakeFiles/pycdc.dir/ASTNode.cpp.o
[100%] Linking CXX executable pycdc
[100%] Built target pycdc
Tomas# tree -L 1
├── ASTNode.cpp
├── ASTNode.h
├── ASTree.cpp
├── ASTree.h
├── bytecode.cpp
├── bytecode.h
├── bytecode_ops.inl
├── bytes
├── CMakeCache.txt
├── CMakeFiles
├── cmake_install.cmake
├── CMakeLists.txt
├── data.cpp
├── data.h
├── FastStack.h
├── libpycxx.a
├── LICENSE
├── Makefile
├── pyc_code.cpp
├── pyc_code.h
├── pycdas
├── pycdas.cpp
├── pycdc
├── pycdc.cpp
├── pyc_module.cpp
├── pyc_module.h
├── pyc_numeric.cpp
├── pyc_numeric.h
├── pyc_object.cpp
├── pyc_object.h
├── pyc_sequence.cpp
├── pyc_sequence.h
├── pyc_string.cpp
├── pyc_string.h
├── PythonBytecode.txt
├── README.markdown
├── scripts
└── tests
```
The build produces two binaries, **pycdas** and **pycdc**. The following sections show how to use them.

## Generate the python bytecode
Here is my sample code *test.py*.
```python
# File name: test.py
# This program prints Hello, world!

print('Hello, world!')
```
To generate the bytecode, run the module with and without the `-O` flag. Under Python 2.7, `python -m test.py` imports `test` as a module, which executes it and writes `test.pyc` (or `test.pyo` with `-O`) as a side effect. The trailing "No module named test.py" message is expected, because `-m` takes a module name rather than a file name.
```shell
Tomas# python -m test.py
Hello, world!
/usr/local/bin/python: No module named test.py
Tomas# ls -al
-rw-rw-r-- 1 tomas tomas  60 Jun 12 15:12 test.py
-rw-rw-r-- 1 tomas tomas 117 Jun 12 15:13 test.pyc
Tomas# python -O -m test.py
Hello, world!
/usr/local/bin/python: No module named test.py
Tomas# ls -al
-rw-rw-r-- 1 tomas tomas  60 Jun 12 15:12 test.py
-rw-rw-r-- 1 tomas tomas 117 Jun 12 15:13 test.pyc
-rw-rw-r-- 1 tomas tomas 117 Jun 12 15:13 test.pyo
```

## Record the use of pycdc and pycdas
### Decompile the pyc and pyo with pycdc
```shell
tomas# ls */
test/:
test.py  test.pyc  test.pyo

out/:
tomas# uncompyle6 -o out test/*.pyc
tomas# ls -al out
total 12
drwxrwxr-x  2 tomas tomas 4096 Jun 12 15:32 .
drwxr-xr-x 19 tomas tomas 4096 Jun 12 15:33 ..
-rw-rw-r--  1 tomas tomas  223 Jun 12 15:32 test.py
tomas# ./pycdc test.pyo
# Source Generated with Decompyle++
# File: test.pyo (Python 2.7)

print 'Hello, world!'
tomas# ./pycdc test.pyc
# Source Generated with Decompyle++
# File: test.pyc (Python 2.7)

print 'Hello, world!'
```
### Disassemble the pyc and pyo with pycdas
```shell
Tomas# ./pycdas test.pyc
test.pyc (Python 2.7)
[Code]
    File Name: test.py
    Object Name: <module>
    Arg Count: 0
    Locals: 0
    Stack Size: 1
    Flags: 0x00000040 (CO_NOFREE)
    [Names]
    [Var Names]
    [Free Vars]
    [Cell Vars]
    [Constants]
        'Hello, world!'
        None
    [Disassembly]
        0       LOAD_CONST              0: 'Hello, world!'
        3       PRINT_ITEM
        4       PRINT_NEWLINE
        5       LOAD_CONST              1: None
        8       RETURN_VALUE
Tomas# ./pycdas test.pyo
test.pyo (Python 2.7)
[Code]
    File Name: test.py
    Object Name: <module>
    Arg Count: 0
    Locals: 0
    Stack Size: 1
    Flags: 0x00000040 (CO_NOFREE)
    [Names]
    [Var Names]
    [Free Vars]
    [Cell Vars]
    [Constants]
        'Hello, world!'
        None
    [Disassembly]
        0       LOAD_CONST              0: 'Hello, world!'
        3       PRINT_ITEM
        4       PRINT_NEWLINE
        5       LOAD_CONST              1: None
        8       RETURN_VALUE
```

## pycdas and pycdc Usage
>**To run pycdas**, the PYC Disassembler:
>`./pycdas [PATH TO PYC FILE]`
>The byte-code disassembly is printed to stdout.
>
>**To run pycdc**, the PYC Decompiler:
>`./pycdc [PATH TO PYC FILE]`
>The decompiled Python source is printed to stdout. Any errors are printed to stderr.

## Summary
pycdc is a convenient tool for reversing Python bytecode, but it sometimes crashes with a segmentation fault. Running both pycdc and uncompyle6 and comparing their output gives a clearer result. It is also worth noting that these tools cannot reverse **`*.pyd`** files!

## Reference
https://github.com/zrax/pycdc

https://stackoverflow.com/questions/35735669/how-do-i-decompile-python-3-5-pyc

https://ephrain.net/linux-%E4%BD%BF%E7%94%A8-decompile-pycdc-%E5%8F%8D%E7%B5%84%E8%AD%AF-pyc-%E6%AA%94%E6%A1%88/
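
## Appendix: generating the bytecode with py_compile
As a supplement to the "Generate the python bytecode" step, the bytecode can also be produced explicitly rather than as a side effect of `python -m`. The following is a minimal sketch, assuming the standard-library `py_compile` module. The helper names `compile_to_pyc` and `demo` and the temporary directory are hypothetical and used only for illustration. It also reads back the 4-byte header: a CPython `.pyc` starts with a 2-byte version magic followed by `\r\n`, which is presumably what the tools use to report a version such as "(Python 2.7)".

```python
import os
import py_compile
import struct
import tempfile


def compile_to_pyc(source_path, pyc_path):
    """Compile source_path into pyc_path and return (magic, tail).

    magic is the 2-byte little-endian version tag at the start of the file,
    tail is the two bytes that follow it (b"\r\n" for CPython bytecode).
    """
    # doraise=True turns compile errors into exceptions instead of stderr text.
    py_compile.compile(source_path, cfile=pyc_path, doraise=True)
    with open(pyc_path, "rb") as f:
        header = f.read(4)
    (magic,) = struct.unpack("<H", header[:2])
    return magic, header[2:4]


def demo():
    """Write a sample test.py (hypothetical location) and compile it."""
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "test.py")
    with open(src, "w") as f:
        f.write("print('Hello, world!')\n")
    pyc = os.path.join(workdir, "test.pyc")
    magic, tail = compile_to_pyc(src, pyc)
    return pyc, magic, tail


if __name__ == "__main__":
    pyc, magic, tail = demo()
    print(pyc, magic, repr(tail))
```

The resulting file can then be passed to `./pycdc` or `./pycdas` in the same way as the `test.pyc` used earlier. The magic value depends on the interpreter that ran the script, so the decompiler output will report that interpreter's version.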