owned this note changed 5 years ago
Linked with GitHub

有效清理 Python Code 中的無用部份 - 利用 Python AST - 洪任諭 (PCMan)

歡迎來到 https://hackmd.io/c/COSCUP2018 共筆

Image Not Showing Possible Reasons
  • The image file may be corrupted
  • The server hosting the image is unavailable
  • The image path is incorrect
  • The image format is not supported
Learn More →

點擊本頁上方的 開始用 Markdown 一起寫筆記!
手機版請點選上方 按鈕展開議程列表。

slides, code: PCMan/python-find-unused-func

Speaker: Jen Yee "PCMan" Hong

Speak

  • 這個方法不是很有效
  • 也不是很安全
  • 雖然說用了 AST tool,可是今天不會講任何compiler知識

Problems of Legacy Code

  • large and complicated code base
  • It works! (but I don't know why)
  • Unused variables
  • Unused functions
  • Unused imports
  • Unreachable code path
  • No documentations
  • Broken unit tests

Static Code Analysis

Some Helpful tools

  • Pyflake + autoflake
    • Delete unused variables
    • Delete unsued imports
    • Expand import *
    • Cannot delete unused functions
  • Vulture
    • (Claim to) find unused functions
    • Provide confidence levels
    • Unfortunately, does not work in some cases

Code Coverage

Coverage.py

  • Measure code coverage of Python programs
  • Beautiful reports
  • Often used to calculate unit test coverage
  • Example:
 > coverage run your_program.py
 > coverage report # text-based summary report
 > coverage html # generate colorful detailed reports

https://coverage.readthedocs.io/en/coverage-4.5.1a/

Code Coverage Tests

Pros:

  • Very detailed reports (statement level)
  • Can observe actual behavior at runtime
    Cons:
  • Need code that can run (not for broken legacy code)
  • Need test cases with good quality and coverage
  • May not work reliably with concurrency (such as gevent)

Alternatives?

grep ?
regex ?

DIY with Python AST!

Python AST

Some useful Python AST node types

  • ast.Module: the root node
  • ast.FunctionDef: function definition
  • ast.Attribute: access attribute of an object
  • ast.Name: symbol name
  • ast.Call: method invocation
  • ast.ImportFrom: from xxx import yyy
  • ast.NodeVisitor: tree traversal

Find unused function

  1. Find all defined functions in the whole source tree
  2. Find all

Pitfalls & Limitations

  • False negative (unsued but not recongnized)
    • Only compare symbol names
    • Different functions can have the same name
    • Unused recursive functions cannot be found (referenced by itself -> always used)
  • Cannot handle dynamic cases:
    • getattr(obj, 'func_name')()
    • globals()['func_name']()
    • from module import * -> use autoflake to remove this

The Whole Process

  1. Remove unused import and expand * with autoflake
  2. examine every getattr(), globals(), and __import__ in the code( to make sure they don'y reference functions)
  3. List unused functions
  4. Delete unsued functions (with an IDE lik PyCharm)
  5. Repeat steps 1 - 4 until all unused functions are deleted
    a. after deleting some functions, their dependencies may become unsued as well
tags: COSCUP2018 source
Select a repo