# [Note] Utilization of Python disassembler/decompiler Decompyle++
###### tags: `python decompiler`, `python`, `decompiler`, `disassembler`, `pycdas`, `pycdc`, `decompyle++`, `reverse engineer`
[toc]
## Goal
To recover Python source code by decompiling `*.pyc` and `*.pyo` files back to `*.py`.
The goal is the same as in the companion note [Utilization of uncompyle6](https://hackmd.io/@TomasZheng/B1oHY0Zs_).
## Introduction
| Extension | Description |
| -------- | -------- |
| .py | The source code that you write. |
| .pyc | The compiled bytecode. When you import a module, Python writes a `*.pyc` file containing the bytecode so that later imports are faster. |
| .pyo | The file format used before Python 3.5 for bytecode compiled with the optimization (`-O`) flag. |
| .pyd | A dynamic link library that contains a Python module, or a set of modules, to be called by other Python code. It is a DLL-like file on Windows; the Linux counterpart is a `*.so` file. Files of this kind are hard to decompile. |

Decompyle++ provides two tools: `pycdas`, a bytecode disassembler, and `pycdc`, a bytecode decompiler. Both are written in C++ and are lightweight: at commit `99b35a11` (2021-11-22), the `pycdc` binary is only 860 KB.
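
As a rough illustration of what makes a `.pyc` recognizable, the sketch below compiles a throwaway script and inspects the first four bytes of the result, which hold a version-specific magic number. It assumes Python 3.4 or later (for `importlib.util.MAGIC_NUMBER`) rather than the Python 2.7 used in the rest of this note, and the helper name `read_magic` is made up for this example.

```python
import importlib.util
import os
import py_compile
import tempfile


def read_magic(pyc_path):
    """Return the 4-byte magic number stored at the start of a .pyc file."""
    with open(pyc_path, "rb") as f:
        return f.read(4)


# Compile a throwaway script so the example is self-contained.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "test.py")
with open(src, "w") as f:
    f.write("print('Hello, world!')\n")
pyc = py_compile.compile(src, cfile=os.path.join(workdir, "test.pyc"), doraise=True)

magic = read_magic(pyc)
# The last two bytes of the magic number are b"\r\n".
print(magic[2:] == b"\r\n")
# A .pyc written by this interpreter carries this interpreter's magic number.
print(magic == importlib.util.MAGIC_NUMBER)
```

A decompiler has to read this header first, because the bytecode format differs between Python versions.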
## Environment
```shell
Tomas# uname -a
Linux tomas 5.4.0-91-generic #102~18.04.1-Ubuntu SMP Thu Nov 11 14:46:36 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Tomas# python --version
Python 2.7.17
Tomas# cmake --version
cmake version 3.10.2
CMake suite maintained and supported by Kitware (kitware.com/cmake).
```
## Build and Setup
```shell
Tomas# git clone https://github.com/zrax/pycdc.git
Tomas# cd pycdc
Tomas# cmake .
-- The C compiler identification is GNU 7.5.0
-- The CXX compiler identification is GNU 7.5.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found PythonInterp: /usr/bin/python (found version "2.7.17")
-- Configuring done
-- Generating done
-- Build files have been written to: /home/tomas/tmp/disassembler/pycdc
Tomas# make
[ 2%] Generating bytes/python_10.cpp, bytes/python_11.cpp, bytes/python_13.cpp, bytes/python_14.cpp, bytes/python_15.cpp, bytes/python_16.cpp, bytes/python_20.cpp, bytes/python_21.cpp, bytes/python_22.cpp, bytes/python_23.cpp, bytes/python_24.cpp, bytes/python_25.cpp, bytes/python_26.cpp, bytes/python_27.cpp, bytes/python_30.cpp, bytes/python_31.cpp, bytes/python_32.cpp, bytes/python_33.cpp, bytes/python_34.cpp, bytes/python_35.cpp, bytes/python_36.cpp, bytes/python_37.cpp, bytes/python_38.cpp, bytes/python_39.cpp, bytes/python_310.cpp
Scanning dependencies of target pycxx
[ 4%] Building CXX object CMakeFiles/pycxx.dir/bytecode.cpp.o
[ 7%] Building CXX object CMakeFiles/pycxx.dir/data.cpp.o
[ 9%] Building CXX object CMakeFiles/pycxx.dir/pyc_code.cpp.o
[ 12%] Building CXX object CMakeFiles/pycxx.dir/pyc_module.cpp.o
[ 14%] Building CXX object CMakeFiles/pycxx.dir/pyc_numeric.cpp.o
[ 17%] Building CXX object CMakeFiles/pycxx.dir/pyc_object.cpp.o
[ 19%] Building CXX object CMakeFiles/pycxx.dir/pyc_sequence.cpp.o
[ 21%] Building CXX object CMakeFiles/pycxx.dir/pyc_string.cpp.o
[ 24%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_10.cpp.o
[ 26%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_11.cpp.o
[ 29%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_13.cpp.o
[ 31%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_14.cpp.o
[ 34%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_15.cpp.o
[ 36%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_16.cpp.o
[ 39%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_20.cpp.o
[ 41%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_21.cpp.o
[ 43%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_22.cpp.o
[ 46%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_23.cpp.o
[ 48%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_24.cpp.o
[ 51%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_25.cpp.o
[ 53%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_26.cpp.o
[ 56%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_27.cpp.o
[ 58%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_30.cpp.o
[ 60%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_31.cpp.o
[ 63%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_32.cpp.o
[ 65%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_33.cpp.o
[ 68%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_34.cpp.o
[ 70%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_35.cpp.o
[ 73%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_36.cpp.o
[ 75%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_37.cpp.o
[ 78%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_38.cpp.o
[ 80%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_39.cpp.o
[ 82%] Building CXX object CMakeFiles/pycxx.dir/bytes/python_310.cpp.o
[ 85%] Linking CXX static library libpycxx.a
[ 85%] Built target pycxx
Scanning dependencies of target pycdas
[ 87%] Building CXX object CMakeFiles/pycdas.dir/pycdas.cpp.o
[ 90%] Linking CXX executable pycdas
[ 90%] Built target pycdas
Scanning dependencies of target pycdc
[ 92%] Building CXX object CMakeFiles/pycdc.dir/pycdc.cpp.o
[ 95%] Building CXX object CMakeFiles/pycdc.dir/ASTree.cpp.o
[ 97%] Building CXX object CMakeFiles/pycdc.dir/ASTNode.cpp.o
[100%] Linking CXX executable pycdc
[100%] Built target pycdc
Tomas# tree -L 1
├── ASTNode.cpp
├── ASTNode.h
├── ASTree.cpp
├── ASTree.h
├── bytecode.cpp
├── bytecode.h
├── bytecode_ops.inl
├── bytes
├── CMakeCache.txt
├── CMakeFiles
├── cmake_install.cmake
├── CMakeLists.txt
├── data.cpp
├── data.h
├── FastStack.h
├── libpycxx.a
├── LICENSE
├── Makefile
├── pyc_code.cpp
├── pyc_code.h
├── pycdas
├── pycdas.cpp
├── pycdc
├── pycdc.cpp
├── pyc_module.cpp
├── pyc_module.h
├── pyc_numeric.cpp
├── pyc_numeric.h
├── pyc_object.cpp
├── pyc_object.h
├── pyc_sequence.cpp
├── pyc_sequence.h
├── pyc_string.cpp
├── pyc_string.h
├── PythonBytecode.txt
├── README.markdown
├── scripts
└── tests
```
The build produces two binaries, **pycdas** and **pycdc**. The following sections show how to use them.
## Generate the Python bytecode
Here is my sample code, *test.py*.
```python
# File name: test.py
# This program prints Hello, world!
print('Hello, world!')
```
To generate the bytecode:
```shell
Tomas# python -m test.py
Hello, world!
/usr/local/bin/python: No module named test.py
Tomas# ls -al
-rw-rw-r-- 1 tomas tomas 60 Jun 12 15:12 test.py
-rw-rw-r-- 1 tomas tomas 117 Jun 12 15:13 test.pyc
Tomas# python -O -m test.py
Hello, world!
/usr/local/bin/python: No module named test.py
Tomas# ls -al
-rw-rw-r-- 1 tomas tomas 60 Jun 12 15:12 test.py
-rw-rw-r-- 1 tomas tomas 117 Jun 12 15:13 test.pyc
-rw-rw-r-- 1 tomas tomas 117 Jun 12 15:13 test.pyo
```
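
The `python -m test.py` call above writes `test.pyc` only as a side effect of importing `test`; the "No module named test.py" message most likely appears because the interpreter then looks for a submodule named `py`. A more direct route, sketched here as an alternative and not what this note actually ran, is the standard `py_compile` module. The file names are illustrative, and the `optimize` argument exists only on Python 3.2+, where optimized output is still a `.pyc` file instead of a `.pyo`.

```python
import os
import py_compile
import tempfile

# Write the sample script into a scratch directory (illustrative path).
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "test.py")
with open(src, "w") as f:
    f.write("print('Hello, world!')\n")

# Plain bytecode, comparable to what an import would cache.
pyc = py_compile.compile(src, cfile=os.path.join(workdir, "test.pyc"), doraise=True)

# Bytecode compiled at optimization level 1 (the -O flag); Python 3.2+ only.
opt = py_compile.compile(
    src, cfile=os.path.join(workdir, "test.opt-1.pyc"), doraise=True, optimize=1
)

print(os.path.basename(pyc), os.path.basename(opt))
```

Unlike the import trick, `py_compile` does not execute the script, so nothing is printed by `test.py` itself.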
## Using pycdc and pycdas
### Decompile the pyc and pyo with pycdc
```shell
tomas# ls */
test/:
test.py test.pyc test.pyo
out/:
tomas# uncompyle6 -o out test/*.pyc
tomas# ls -al out
total 12
drwxrwxr-x 2 tomas tomas 4096 Jun 12 15:32 .
drwxr-xr-x 19 tomas tomas 4096 Jun 12 15:33 ..
-rw-rw-r-- 1 tomas tomas 223 Jun 12 15:32 test.py
tomas# ./pycdc test.pyo
# Source Generated with Decompyle++
# File: test.pyo (Python 2.7)
print 'Hello, world!'
tomas# ./pycdc test.pyc
# Source Generated with Decompyle++
# File: test.pyc (Python 2.7)
print 'Hello, world!'
```
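
Because pycdc prints the recovered source to stdout, saving it to a file takes a redirect or a small wrapper. The following is a minimal sketch of such a wrapper; the function name `decompile_to_file` is hypothetical, and the default `./pycdc` path assumes the binary built earlier sits in the current directory.

```python
import subprocess


def decompile_to_file(pyc_path, out_path, decompiler="./pycdc"):
    """Run the decompiler on pyc_path and save its stdout to out_path.

    Returns (return_code, stderr_text). `decompiler` is assumed to be the
    path of the pycdc binary built earlier; adjust it to your layout.
    """
    proc = subprocess.Popen(
        [decompiler, pyc_path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    out, err = proc.communicate()
    # Only keep the output when the tool exited cleanly.
    if proc.returncode == 0:
        with open(out_path, "wb") as f:
            f.write(out)
    return proc.returncode, err.decode("utf-8", "replace")
```

A call such as `decompile_to_file("test.pyc", "out/test.py")` would then write the decompiled source into `out/` and hand back anything the tool reported on stderr.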
### Disassemble the pyc and pyo with pycdas
```shell
Tomas# ./pycdas test.pyc
test.pyc (Python 2.7)
[Code]
File Name: test.py
Object Name: <module>
Arg Count: 0
Locals: 0
Stack Size: 1
Flags: 0x00000040 (CO_NOFREE)
[Names]
[Var Names]
[Free Vars]
[Cell Vars]
[Constants]
'Hello, world!'
None
[Disassembly]
0 LOAD_CONST 0: 'Hello, world!'
3 PRINT_ITEM
4 PRINT_NEWLINE
5 LOAD_CONST 1: None
8 RETURN_VALUE
Tomas# ./pycdas test.pyo
test.pyo (Python 2.7)
[Code]
File Name: test.py
Object Name: <module>
Arg Count: 0
Locals: 0
Stack Size: 1
Flags: 0x00000040 (CO_NOFREE)
[Names]
[Var Names]
[Free Vars]
[Cell Vars]
[Constants]
'Hello, world!'
None
[Disassembly]
0 LOAD_CONST 0: 'Hello, world!'
3 PRINT_ITEM
4 PRINT_NEWLINE
5 LOAD_CONST 1: None
8 RETURN_VALUE
```
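
To cross-check the pycdas listing, the standard `marshal` and `dis` modules can load and disassemble a `.pyc`, provided it was produced by the same Python version that runs the script. This sketch assumes Python 3.7+ and its 16-byte header; the opcodes it prints will differ from the Python 2.7 listing above (for example, there is no `PRINT_ITEM`). The helper name `load_code` is hypothetical.

```python
import dis
import marshal
import os
import py_compile
import tempfile

HEADER_SIZE = 16  # Python 3.7+; Python 2.7 files use an 8-byte header instead.


def load_code(pyc_path):
    """Return the top-level code object stored in a .pyc file."""
    with open(pyc_path, "rb") as f:
        f.seek(HEADER_SIZE)
        return marshal.load(f)


# Build a sample .pyc so the example is self-contained.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "test.py")
with open(src, "w") as f:
    f.write("print('Hello, world!')\n")
pyc = py_compile.compile(src, cfile=os.path.join(workdir, "test.pyc"), doraise=True)

code = load_code(pyc)
print(code.co_filename, code.co_name)  # compare with "File Name" / "Object Name"
print(code.co_consts)                  # compare with the [Constants] section
dis.dis(code)                          # compare with the [Disassembly] section
```

The advantage of pycdas over this approach is that it does not need a matching interpreter installed.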
## pycdas and pycdc Usage
>**To run pycdas**, the PYC Disassembler: `./pycdas [PATH TO PYC FILE]` The byte-code disassembly is printed to stdout.
>
>**To run pycdc**, the PYC Decompiler: `./pycdc [PATH TO PYC FILE]` The decompiled Python source is printed to stdout. Any errors are printed to stderr.
## Summary
This tool is convenient for reversing Python bytecode, but it sometimes crashes with a segmentation fault. Running both pycdc and uncompyle6 and comparing their output gives a clearer picture. It is also worth noting that neither tool can reverse `*.pyd` files.
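
One way to combine the two tools is to try them in order and keep the first result that comes back cleanly. The sketch below is only an illustration: the function name `first_successful` is hypothetical, and it assumes `pycdc` sits in the current directory, `uncompyle6` is on the `PATH`, and both print the recovered source to stdout when given a single file argument.

```python
import subprocess


def first_successful(pyc_path, commands):
    """Try each decompiler command in order; return (tool, source) or (None, None).

    A negative return code on POSIX means the process was killed by a
    signal, e.g. -11 for a segmentation fault.
    """
    for cmd in commands:
        try:
            proc = subprocess.Popen(
                cmd + [pyc_path],
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE,
            )
        except OSError:
            continue  # tool not installed at this path
        out, _ = proc.communicate()
        if proc.returncode == 0 and out:
            return cmd[0], out.decode("utf-8", "replace")
    return None, None


# Assumed locations: pycdc built in the current directory, uncompyle6 on PATH.
DECOMPILERS = [["./pycdc"], ["uncompyle6"]]
```

Calling `first_successful("test.pyc", DECOMPILERS)` would return the name of the tool that worked together with its output, or `(None, None)` if every tool failed.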
## Reference
https://github.com/zrax/pycdc
https://stackoverflow.com/questions/35735669/how-do-i-decompile-python-3-5-pyc
https://ephrain.net/linux-%E4%BD%BF%E7%94%A8-decompile-pycdc-%E5%8F%8D%E7%B5%84%E8%AD%AF-pyc-%E6%AA%94%E6%A1%88/