## Pretty Python Code
Or: How I Learned To Stop Worrying And Love Linters
---
## Housekeeping
Objective: Why, What, and How of code style
Agenda
* Slides
* Examples
* Cookiecutter demo?
* Questions?
---
## The Zen of Python
Type `import this` in a Python interpreter
```
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
```
---
## Readability counts
Code is read more often than it is written. Even if nobody else uses your code, _you_ will probably read it in the future and try to figure out what you were thinking now.
Ugly code can perform fine, but code that looks pretty and performs well is better.
---
## Simple is better than compact
Example: Looking up the force constant of a torsion in Parsley and converting it to kJ/mol
```python3
parsley.get_parameter_handler('ProperTorsions').get_parameter({"id": "t143"})[0].k[0].value_in_unit(unit.kilojoule_per_mole)
```
Linters will complain (124 character line)
```python3
torsions = parsley.get_parameter_handler('ProperTorsions')
torsion_param = torsions.get_parameter({"id": "t143"})[0]
k_kcal = torsion_param.k[0]
k = k_kcal.value_in_unit(unit.kilojoule_per_mole)
```
Functionally identical, but practically better
---
## Flat is better than nested
```python3
for parameter_handler in parsley.registered_parameter_handlers:
    if parameter_handler._TAGNAME == "ProperTorsions":
        for parameter in parameter_handler.parameters:
            if parameter.id == "t143":
                k_kcal = parameter.k[0]
                k = k_kcal.value_in_unit(unit.kilojoule_per_mole)
```
```python3
torsions = parsley.get_parameter_handler('ProperTorsions')
torsion_param = torsions.get_parameter({"id": "t143"})[0]
k_kcal = torsion_param.k[0]
k = k_kcal.value_in_unit(unit.kilojoule_per_mole)
```
---
... or worse, with index-based iteration
```python3
for i in range(parsley.n_registered_parameter_handlers):
    if parsley.registered_parameter_handlers[i]._TAGNAME == "ProperTorsions":
        parameter_handler = parsley.registered_parameter_handlers[i]
        for j in range(parameter_handler.n_parameters):
            parameter = parameter_handler.parameters[j]
            if parameter.id == "t143":
                k_kcal = parameter.k[0]
                k = k_kcal.value_in_unit(unit.kilojoule_per_mole)
```
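The same contrast can be reproduced without a force field. A minimal sketch, assuming a made-up list of dictionaries as a stand-in for the parameter handlers; the data and both `find_k_*` helpers are hypothetical and not part of the OpenFF API:

```python
# Hypothetical stand-in data: each handler has a tag and a list of parameters.
handlers = [
    {"tag": "Bonds", "parameters": [{"id": "b1", "k": 500.0}]},
    {
        "tag": "ProperTorsions",
        "parameters": [{"id": "t142", "k": 0.5}, {"id": "t143", "k": 1.2}],
    },
]


def find_k_nested(handlers, tag, parameter_id):
    """Index-based, deeply nested lookup (the style to avoid)."""
    for i in range(len(handlers)):
        if handlers[i]["tag"] == tag:
            for j in range(len(handlers[i]["parameters"])):
                if handlers[i]["parameters"][j]["id"] == parameter_id:
                    return handlers[i]["parameters"][j]["k"]
    return None


def find_k_flat(handlers, tag, parameter_id):
    """Same lookup, iterating over items directly in one generator expression."""
    matches = (
        parameter["k"]
        for handler in handlers
        if handler["tag"] == tag
        for parameter in handler["parameters"]
        if parameter["id"] == parameter_id
    )
    # next() returns the first match, or None if nothing matched.
    return next(matches, None)
```

Both functions return the same value; the second one never touches an index and has a single level of indentation.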
---
## Practicality beats purity
```
Special cases aren't special enough to break the rules.
Although practicality beats purity.
```
OpenMM uses `lowerCamelCase` in its Python API, not the `snake_case` recommended by PEP 8. But it is **consistent with the rest of the project**, which is written in C++ and has bindings to other languages.
```
A style guide is about consistency. Consistency with this
style guide is important. Consistency within a project is
more important. Consistency within one module or function is
the most important.
```
---
Existing OpenMM API
```python3
from simtk import openmm, unit
my_sys = openmm.System()
for _ in range(10):
    my_sys.addParticle(39.948 * unit.atomic_mass_units)
for idx in range(my_sys.getNumParticles()):
    my_sys.getParticleMass(idx)
```
A strictly "Pythonic" API:
```python3
for _ in range(10):
    my_sys.add_particle(39.948 * unit.atomic_mass_units)
[p.mass for p in my_sys.particles]
```
---
## Linters enforce style across multiple developers
Everybody has style preferences:
* Tabs or spaces?
* Single quotes `'` or double quotes `"`?
* Maximum line length?
* Import order?
Linters are (mostly) deterministic, uncompromising, and unforgiving. They work the same for everybody, no matter the job title or programming experience.
---
## Using linters
Generally very stable, do not often change behavior in significant ways
Python linters are all installable with pip or conda and generally lightweight
Most linters can be turned off on a per-line basis
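A short sketch of what per-line opt-outs look like for the tools covered in this talk. The variable names are made up for illustration; the comment directives are the ones each tool documents:

```python
# isort: leave this one import where it is
import os  # isort: skip

# flake8: silence one specific code on one line only
DESCRIPTION = "a deliberately long string literal that would otherwise trip the line-length check in flake8"  # noqa: E501

# black: keep a hand-aligned block exactly as written
# fmt: off
IDENTITY = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
# fmt: on

# mypy: accept one assignment that it would otherwise reject
count: int = "three"  # type: ignore[assignment]
```

Naming the specific code (`E501`, `assignment`) keeps the opt-out narrow, so other problems on the same line are still reported.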
---
# Black
"the uncompromising Python code formatter"
It will make changes (potentially many) to the formatting of your code. It will not ask permission or give a warning; it will just do it.
```shell=bash
$ black my_script.py
reformatted my_script.py
All done! ✨ 🍰 ✨
1 file reformatted.
```
[`yapf`](https://github.com/google/yapf) is an alternative, but used less commonly in scientific software today.
---
## Linters alone aren't perfect
The torsion example from earlier is still hard to read
```python3
parsley.get_parameter_handler("ProperTorsions").get_parameter({"id": "t143"})[0].k[
    0
].value_in_unit(unit.kilojoule_per_mole)
```
Compared to breaking up the logic by hand
```python3
torsions = parsley.get_parameter_handler("ProperTorsions")
torsion_param = torsions.get_parameter({"id": "t143"})[0]
k_kcal = torsion_param.k[0]
k = k_kcal.value_in_unit(unit.kilojoule_per_mole)
```
---
# isort
"I sort your imports so you don't have to"
---
# flake8
Reports soft style issues, does not edit code.
```shell=bash
$ flake8 my_script.py
my_script.py:5:80: E501 line too long (141 > 79 characters)
my_script.py:7:1: E302 expected 2 blank lines, found 1
my_script.py:9:1: W391 blank line at end of file
```
Each code is a specific (often minor) thing that can help readability.
---
## flake8 plugins
Lots of plugins available for special cases: [curated list](https://github.com/DmytroLitvinov/awesome-flake8-extensions)
e.g. [`flake8-absolute-import`](https://pypi.org/project/flake8-absolute-import/)
```python3
from openff.toolkit.topology.molecule import Molecule # OK
from .molecule import Molecule # ABS101
```
I use this to enforce absolute imports. flake8 will yell at me if I try to commit a relative import.
---
# pyupgrade
https://github.com/asottile/pyupgrade
Upgrades syntax for newer versions of Python
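A few rewrites of the kind pyupgrade performs, written out by hand as a sketch. The `Base`, `Old`, and `New` classes and the variable names are made up for illustration, and which rewrites are applied depends on the target-version flag you pass (for example `--py36-plus`):

```python
class Base:
    def __init__(self):
        self.ready = True


class Old(Base):
    def __init__(self):
        super(Old, self).__init__()  # before: Python 2 style super call


class New(Base):
    def __init__(self):
        super().__init__()  # after: zero-argument super


name = "t143"
old_label = "torsion {}".format(name)  # before: str.format
new_label = f"torsion {name}"  # after: f-string

old_ids = set(["t142", "t143"])  # before: set built from a list
new_ids = {"t142", "t143"}  # after: set literal
```

Each pair behaves identically; only the syntax is modernized.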
---
# mypy
**Optional** type hints can be added to function/variable/class definitions, but they are effectively comments when the code runs
```python3
# Without type hints
def square(input):
    return input ** 2

# With type hints
def square(input: float) -> float:
    return input ** 2
```
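A small sketch of what "effectively comments" means in practice. The `double` function is made up for illustration. Python runs the mismatched call without complaint, while mypy would be expected to report an incompatible argument type for it:

```python
def double(value: float) -> float:
    return value * 2


# Python does not enforce the hint: a str is accepted and repeated.
result = double("ab")
print(result)  # abab
```

The bug only surfaces later, wherever `result` is used as a number. A type checker points at the call itself.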
---
## Type hints make code more readable
What does `ForceField.get_parameter()` take in and return?
```python3
class ForceField:
    ...

    def get_parameter(self, parameter_attrs):
        ...
```
vs
```python3
def get_parameter(
    self,
    parameter_attrs: Dict[str, str],
) -> List[ParameterType]:
    ...
```
It's clarified in the docstring, but often we read the docs just to figure out a type, e.g. whether this returns a list or a single parameter
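A runnable toy version of the hinted signature, as a sketch. The `ParameterType` and `ForceField` classes here are simplified stand-ins written for this slide, not the OpenFF implementations:

```python
from dataclasses import dataclass
from typing import Dict, List, get_type_hints


@dataclass
class ParameterType:
    """Toy stand-in for a force field parameter (hypothetical)."""

    id: str
    periodicity: int


class ForceField:
    """Toy stand-in for a force field (hypothetical)."""

    def __init__(self, parameters: List[ParameterType]) -> None:
        self._parameters = parameters

    def get_parameter(
        self,
        parameter_attrs: Dict[str, str],
    ) -> List[ParameterType]:
        """Return every parameter matching all given attribute values."""
        return [
            parameter
            for parameter in self._parameters
            if all(
                getattr(parameter, key, None) == value
                for key, value in parameter_attrs.items()
            )
        ]


force_field = ForceField([ParameterType("t142", 2), ParameterType("t143", 3)])
matches = force_field.get_parameter({"id": "t143"})

# The hints are ordinary metadata that tools (and readers) can inspect.
hints = get_type_hints(ForceField.get_parameter)
```

The return hint answers the question from the slide directly: the result is a list, so the caller indexes it with `[0]` to get a single parameter.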
---
## Type hints help clarify your API
```python3
class FrozenMolecule:
    @classmethod
    def from_qcschema(
        cls,
        qca_record,
        client=None,
        toolkit_registry=GLOBAL_TOOLKIT_REGISTRY,
        allow_undefined_stereo=False,
    ):
        ...
```
What type is `qca_record` supposed to be?
---
## Type hints help clarify your API
```python3
class RDKitToolkitWrapper:
    def assign_partial_charges(
        self,
        molecule,
        partial_charge_method=None,
        use_conformers=None,
        strict_n_conformers=False,
        _cls=None,
    ):
        ...
```
What should `partial_charge_method` be if not `None`? Does this function return anything?
---
## Type hints help clarify your API
```python3
class RDKitToolkitWrapper:
    def assign_partial_charges(
        self,
        molecule: FrozenMolecule,
        partial_charge_method: str = "mmff94",
        use_conformers: Optional[List[Quantity]] = None,
        strict_n_conformers: bool = False,
        _cls=None,  # Note this was not hinted
    ) -> None:
        ...
```
Now, the highlights of the docstring are encoded in the function signature and some of the behavior is clarified.
---
## Type hints help clarify your API
https://github.com/openforcefield/openff-toolkit/blob/61eaadb312ecb396ac94610fd0a45223426da8e4/openff/toolkit/typing/engines/smirnoff/forcefield.py#L1246
---
## Using linters manually
Let linters do the boring work for you. Most work as command-line tools that rewrite code in place and/or give informative error messages about style issues.
```shell=bash
$ black physical_validation
$ isort physical_validation
$ flake8 physical_validation
```
Some Python linters have settings that can be modified in configuration files, e.g. `setup.cfg`.
---
## Use IDEs!
Linters integrate well with IDEs. Take advantage of them - even default settings are often good.
In PyCharm, you can auto-lint with something like control + alt + L (maybe option + command + L on Mac)
---
## Using linters automatically
The `pre-commit` tool automatically runs linters for you when you run `git commit`. If any hook fails, it errors out and prevents the commit from going through.
You need a config file `.pre-commit-config.yaml`, `pre-commit` installed (conda/pip), and the hooks installed (`pre-commit install`).
---
## Pre-commit stopping a commit
```shell=bash
$ git commit -m "Bad commit"
black................................................Failed
- hook id: black
- exit code: 1
would reformat physical_validation/ensemble.py
Oh no! 💥 💔 💥
1 file would be reformatted.
```
Run `black physical_validation`, then add and commit:
```shell=bash
$ git commit -m 'Better commit'
black................................................Passed
[detached HEAD 986250e] Better commit
1 file changed, 1 insertion(+), 1 deletion(-)
```
---
## Using linters in CI
[Example workflow](https://github.com/openforcefield/openff-system/blob/master/.github/workflows/lint.yaml) in The OpenFF Toolkit or [another](https://github.com/shirtsgroup/physical_validation/blob/master/.github/workflows/lint.yaml) in `physical_validation`.
Basically
```shell=bash
# CI service starts up VM, checks out your repo, etc.
pip install black isort
black --check project
isort --check-only project
```
Takes ~30 seconds, start to finish
Opinion: Best to separate linting checks from the rest of your tests
---
## Other linters
Lots of other linters out there
* autopep8
* pylint
---
## Opinions
1. Enforce style early in a project
2. Use automation where possible
3. Linting warnings are almost always valuable
4. Resist per-line ignores
5. Use meaningful variable names
---
## Links and resources
* Run `import this` in a Python interpreter
* [PEP8](https://www.python.org/dev/peps/pep-0008/)
* A [style guide](https://github.com/openforcefield/openff-system/blob/master/docs/developing.md#style) I put together
* [`flake8` docs](https://flake8.pycqa.org/en/latest/)
* [`flake8` plugins](https://github.com/DmytroLitvinov/awesome-flake8-extensions)
* [`mypy` docs](https://mypy.readthedocs.io/en/stable/)