## Pretty Python Code

Or: How I Learned To Stop Worrying And Love Linters

---

## Housekeeping

Objective: Why, What, and How of code style

Agenda

* Slides
* Examples
* Cookiecutter demo?
* Questions?

---

## The Zen of Python

Type `import this` in a Python interpreter

```
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
```

---

## Readability counts

Code is read more often than it is written.

Even if nobody else uses your code, _you_ will probably read it in the future and try to figure out what you were thinking now.

Ugly code can perform fine, but code that looks pretty and performs well is better.

---

## Simple is better than compact

Example: Looking up the force constant of a torsion in Parsley and converting it to kJ/mol

```python3
parsley.get_parameter_handler('ProperTorsions').get_parameter({"id": "t143"})[0].k[0].value_in_unit(unit.kilojoule_per_mole)
```

Linters will complain (124 character line)

```python3
torsions = parsley.get_parameter_handler('ProperTorsions')
torsion_param = torsions.get_parameter({"id": "t143"})[0]
k_kcal = torsion_param.k[0]
k = k_kcal.value_in_unit(unit.kilojoule_per_mole)
```

Functionally identical, but practically better

---

## Flat is better than nested

```python3
for parameter_handler in parsley.registered_parameter_handlers:
    if parameter_handler._TAGNAME == "ProperTorsions":
        for parameter in parameter_handler.parameters:
            if parameter.id == "t143":
                k_kcal = parameter.k[0]
                k = k_kcal.value_in_unit(unit.kilojoule_per_mole)
```

```python3
torsions = parsley.get_parameter_handler('ProperTorsions')
torsion_param = torsions.get_parameter({"id": "t143"})[0]
k_kcal = torsion_param.k[0]
k = k_kcal.value_in_unit(unit.kilojoule_per_mole)
```

---

... or worse, with an index-based iteration

```python3
for i in range(parsley.n_registered_parameter_handlers):
    if parsley.registered_parameter_handlers[i]._TAGNAME == "ProperTorsions":
        parameter_handler = parsley.registered_parameter_handlers[i]
        for j in range(parameter_handler.n_parameters):
            parameter = parameter_handler.parameters[j]
            if parameter.id == "t143":
                k_kcal = parameter.k[0]
                k = k_kcal.value_in_unit(unit.kilojoule_per_mole)
```

---

## Practicality beats purity

```
Special cases aren't special enough to break the rules.
Although practicality beats purity.
```

OpenMM uses `lowerCamelCase` in its Python API, not the `snake_case` recommended by PEP8.

But it is **consistent with the rest of the project**, which is written in C++ and has bindings to other languages.

```
A style guide is about consistency.
Consistency with this style guide is important.
Consistency within a project is more important.
Consistency within one module or function is the most important.
```

---

Existing OpenMM API

```python3
from simtk import openmm, unit

my_sys = openmm.System()

for _ in range(10):
    my_sys.addParticle(39.948 * unit.atomic_mass_units)

for idx in range(my_sys.getNumParticles()):
    my_sys.getParticleMass(idx)
```

A strictly "Pythonic" API:

```python3
for _ in range(10):
    my_sys.add_particle(39.948 * unit.atomic_mass_units)

[p.mass for p in my_sys.particles]
```

---

## Linters enforce style across multiple developers

Everybody has style preferences:

* Tabs or spaces?
* Single quote `'` or double quotes `"`?
* Maximum line length?
* Import orders?

Linters are (mostly) deterministic, uncompromising, and unforgiving. They work the same for everybody, no matter the job title or programming experience.

---

## Using linters

Generally very stable; they do not often change behavior in significant ways

Python linters are all installable with pip or conda and generally lightweight

Most linters can be turned off on a per-line basis

---

# Black

"the uncompromising Python code formatter"

It will make changes (potentially many) to the formatting of your code. It will not ask permission or give a warning; it'll just do it.

```shell=bash
$ black my_script.py
reformatted my_script.py
All done! ✨ 🍰 ✨
1 file reformatted.
```

[`yapf`](https://github.com/google/yapf) is an alternative, but used less commonly in scientific software today.
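
A formatter like Black is only supposed to change layout, so the reformatted file should parse to the same syntax tree as the original. Below is a minimal sketch of that idea using only the standard-library `ast` module. The `scale` function and the `same_ast` helper are made up for illustration, and the "after" string is written by hand to resemble Black-style output; it was not produced by running Black.

```python
import ast

# Hand-written "before" source: cramped spacing and single quotes.
before = '''
def scale(values,factor = 2):
    return {'scaled':[v*factor for v in values]}
'''

# Hand-written "after" source in a Black-like style (an assumption, not real
# Black output): normalized spacing and double quotes.
after = '''
def scale(values, factor=2):
    return {"scaled": [v * factor for v in values]}
'''


def same_ast(source_a: str, source_b: str) -> bool:
    """Return True if both sources parse to the same abstract syntax tree."""
    return ast.dump(ast.parse(source_a)) == ast.dump(ast.parse(source_b))


# The text differs, but the parsed program does not.
print(before != after)          # True
print(same_ast(before, after))  # True
```

Whitespace and quote style do not survive parsing, which is why a purely cosmetic reformat can be applied automatically without changing what the code does.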

---

## Linters alone aren't perfect

The torsion example from earlier is still hard to read

```python3
parsley.get_parameter_handler("ProperTorsions").get_parameter({"id": "t143"})[0].k[
    0
].value_in_unit(unit.kilojoule_per_mole)
```

Compared to breaking up the logic by hand

```python3
torsions = parsley.get_parameter_handler("ProperTorsions")
torsion_param = torsions.get_parameter({"id": "t143"})[0]
k_kcal = torsion_param.k[0]
k = k_kcal.value_in_unit(unit.kilojoule_per_mole)
```

---

# isort

"I sort your imports so you don't have to"

---

# flake8

Reports soft style issues, does not edit code.

```shell=bash
$ flake8 my_script.py
my_script.py:5:80: E501 line too long (141 > 79 characters)
my_script.py:7:1: E302 expected 2 blank lines, found 1
my_script.py:9:1: W391 blank line at end of file
```

Each code is a specific (often minor) thing that can help readability.

---

## flake8 plugins

Lots of plugins available for special cases: [curated list](https://github.com/DmytroLitvinov/awesome-flake8-extensions)

e.g. [`flake8-absolute-import`](https://pypi.org/project/flake8-absolute-import/)

```python3
from openff.toolkit.topology.molecule import Molecule  # OK
from .molecule import Molecule  # ABS101
```

I use this to enforce absolute imports. flake8 will yell at me if I try to commit a relative import.

---

# pyupgrade

https://github.com/asottile/pyupgrade

Upgrades syntax for newer versions of Python

---

# mypy

**Optional** type hints can be added to function/variable/class definitions - but they are effectively comments when the code runs

```python3
def square(input):
    return input ** 2

def square(input: float) -> float:
    return input ** 2
```

---

## Type hints make code more readable

What does `ForceField.get_parameter()` take in and return?

```python3
class ForceField:
    ...

    def get_parameter(self, parameter_attrs):
        ...
```

vs

```python3
    def get_parameter(
        self,
        parameter_attrs: Dict[str, str],
    ) -> List[ParameterType]:
        ...
```

It's clarified in the docstring, but often we read the docs just to figure out a type, e.g. whether this returns a list or a single parameter

---

## Type hints help clarify your API

```python3
class FrozenMolecule:
    @classmethod
    def from_qcschema(
        cls,
        qca_record,
        client=None,
        toolkit_registry=GLOBAL_TOOLKIT_REGISTRY,
        allow_undefined_stereo=False,
    ):
        ...
```

What type is `qca_record` supposed to be?

---

## Type hints help clarify your API

```python3
class RDKitToolkitWrapper:
    def assign_partial_charges(
        self,
        molecule,
        partial_charge_method=None,
        use_conformers=None,
        strict_n_conformers=False,
        _cls=None,
    ):
        ...
```

What should `partial_charge_method` be if not `None`? Does this function return anything?

---

## Type hints help clarify your API

```python3
class RDKitToolkitWrapper:
    def assign_partial_charges(
        self,
        molecule: FrozenMolecule,
        partial_charge_method: str = "mmff94",
        use_conformers: Optional[List[Quantity]] = None,
        strict_n_conformers: bool = False,
        _cls=None,  # Note this was not hinted
    ) -> None:
        ...
```

Now, the highlights of the docstring are encoded in the function signature and some of the behavior is clarified.

---

## Type hints help clarify your API

https://github.com/openforcefield/openff-toolkit/blob/61eaadb312ecb396ac94610fd0a45223426da8e4/openff/toolkit/typing/engines/smirnoff/forcefield.py#L1246

---

## Using linters manually

Let linters do the boring work for you. Most work as command-line tools that rewrite code in place and/or give informative messages about style issues.

```shell=bash
$ black physical_validation
$ isort physical_validation
$ flake8 physical_validation
```

Some Python linters have settings that can be modified in configuration files, e.g. `setup.cfg`.

---

## Use IDEs!

Linters integrate well with IDEs. Take advantage of them - even default settings are often good.

In PyCharm, you can auto-lint with something like control + alt + L (maybe option + command + L on Mac)

---

## Using linters automatically

The `pre-commit` tool automatically runs linters for you in the background when you run `git commit` - it will error out and prevent the commit from going through if any step fails

You need a config file (`.pre-commit-config.yaml`), `pre-commit` installed (conda/pip), and the hooks installed (`pre-commit install`).

---

## Pre-commit stopping a commit

```shell=bash
$ git commit -m "Bad commit"
black................................................Failed
- hook id: black
- exit code: 1

would reformat physical_validation/ensemble.py
Oh no! 💥 💔 💥
1 file would be reformatted.
```

Run `black physical_validation`, then add and commit:

```shell=bash
$ git commit -m 'Better commit'
black................................................Passed
[detached HEAD 986250e] Better commit
 1 file changed, 1 insertion(+), 1 deletion(-)
```

---

## Using linters in CI

[Example workflow](https://github.com/openforcefield/openff-system/blob/master/.github/workflows/lint.yaml) in the OpenFF Toolkit or [another](https://github.com/shirtsgroup/physical_validation/blob/master/.github/workflows/lint.yaml) in `physical_validation`.

Basically

```shell=bash
# CI service starts up VM, checks out your repo, etc.
pip install black isort
black project
isort project
```

Takes ~30 seconds, start to finish

Opinion: Best to separate linting checks from the rest of your tests

---

## Other linters

Lots of other linters out there

* autopep8
* pylint

---

## Opinions

1. Enforce style early in a project
2. Use automation where possible
3. Linting warnings are almost always valuable
4. Resist per-line ignores
5. Use meaningful variable names

---

Links and resources

* Run `import this` in a Python interpreter
* [PEP8](https://www.python.org/dev/peps/pep-0008/)
* A [style guide](https://github.com/openforcefield/openff-system/blob/master/docs/developing.md#style) I put together
* [`flake8` docs](https://flake8.pycqa.org/en/latest/)
* [`flake8` plugins](https://github.com/DmytroLitvinov/awesome-flake8-extensions)
* [`mypy` docs](https://mypy.readthedocs.io/en/stable/)