# Migrating Airflow to Pytest

## 1. Why?

- `nose` is a rather old package
- `nose` documentation is not among the best
- `pytest` has good and extensive documentation
- `pytest` fixtures are really helpful and make tests more powerful
- `pytest` uses an `.ini` configuration file

There is also the case of parametrized tests. Pytest allows running a single parametrized test method, which is not possible when subclassing `unittest.TestCase`. This raises the first question.

### Full pytest or not

Should we adopt the `pytest` way of writing tests using plain `assert` statements, or do we want to keep using `unittest.TestCase`? Below you can find the pros and cons of going the fully-pytest way.

#### Pros:
- explicit is better than implicit: `assert true_statement` > `self.assertTrue(true_statement)`
- no non-Pythonic names like `assertTrue`

#### Cons:
- it changes the way we are used to working
- it requires a lot of changes

If we decide on a full migration, we may consider using https://pypi.org/project/unittest2pytest/

## 2. Proof of concept

As the first step I changed `nosetests` to `pytest` in the CI script (I also simplified the CI for faster runs). I ignored all the `nosetests` flags and used `pytest.ini` for configuration. At the beginning I used the following configuration:

```
[pytest]
addopts = -ra --doctest-modules --doctest-continue-on-failure
faulthandler_timeout=180
log_print = False
log_level = INFO
filterwarnings =
    ignore::DeprecationWarning
    ignore::PendingDeprecationWarning
    ignore::RuntimeWarning
    ignore::UserWarning
```

For more information check the pytest documentation: http://doc.pytest.org/en/latest/reference.html#ini-options-ref

After running the CI build, some tests timed out, resulting in a totally broken build.

### 2.1 Broken tests

In the next few iterations I excluded the following tests:

```
tests/dags/test_dag_serialization.py::TestStringifiedDAGs  # Not run on actual CI
tests/jobs/test_scheduler_job.py::TestSchedulerJob
tests/www/api/experimental/test_dag_runs_endpoint.py::TestDagRunsEndpoint
    - test_get_dag_runs_success
    - test_get_dag_runs_success_with_state_parameter
    - test_get_dag_runs_success_with_capital_state_parameter
    - test_get_dag_runs_success_with_state_no_result
tests/utils/test_dag_processing.py::TestDagFileProcessorAgent
```

By excluding those tests using `@pytest.mark.skip`, I obtained a runnable CI build.

#### What is common for those tests?

All of them seem to be related to DAGs, and many use multiprocessing and similar mechanisms. I think this may be related to the strange behaviour of tests like this one:

```python
import unittest
from multiprocessing import Process, Queue


def my_function(q, x):
    raise Exception('test')


class TestTest(unittest.TestCase):
    def test1(self):
        queue = Queue()
        p = Process(target=my_function, args=(queue, 1))
        p.start()
        p.join()  # this blocks until the process terminates
        queue.get()
        self.assertEqual(1, 2)
```

What happens in such a case?

```
root@fc90041c43f3:/opt/airflow# nosetests tests.test_proces
/usr/local/lib/python3.6/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: <http://initd.org/psycopg/docs/install.html#binary-install-from-pypi>.
  """)
Process Process-1:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/airflow/tests/test_proces.py", line 6, in my_function
    raise Exception('test')
Exception: test
^C
----------------------------------------------------------------------
Ran 1 test in 59.593s

OK
[2019-10-07 18:48:43,422] {settings.py:200} DEBUG - Disposing DB connection pool (PID 144)
```

As you can see, the test hangs after the exception is raised in the subprocess (`queue.get()` blocks forever, because the child process died before putting anything on the queue). Moreover, if I kill the process, the test... succeeds? Pytest also hangs, but on kill it raises an error. I still have to debug those tests.

### 2.2 Side effects

Even after excluding those modules, there were still many failing or erroring tests. Moreover, they were failing on CI, where the whole test suite is run, but not locally when I ran a single test. That suggests side effects. To eliminate them, I created a pytest fixture that is used for all tests. Its task is to restore the environment variables after every test, so each test starts from a clean environment:

```python
import os

import pytest


@pytest.fixture(autouse=True)
def reset_environment():
    """Resets env variables to their pre-test values."""
    init_env = os.environ.copy()
    yield
    changed_env = dict(os.environ)
    for key, value in changed_env.items():
        if key not in init_env:
            # the test added this variable - remove it
            del os.environ[key]
        elif value != init_env[key]:
            # the test changed this variable - restore it
            os.environ[key] = init_env[key]
```

The result:

```
====== 34 failed, 3267 passed, 81 skipped, 128 error in 847.38s (0:14:07) ======
```

Here you can find the build with all the errors: https://travis-ci.org/PolideaInternal/airflow/builds/594816055

Most of the errors look like this one:

```
cls = <class 'airflow.contrib.hooks.wasb_hook.WasbHook'>, conn_id = ''
session = <sqlalchemy.orm.session.Session object at 0x7f814d73b278>

    @classmethod
    @provide_session
    def _get_connections_from_db(cls, conn_id, session=None):
        db = (
            session.query(Connection)
            .filter(Connection.conn_id == conn_id)
            .all()
        )
        session.expunge_all()
        if not db:
            raise AirflowException(
>               "The conn_id `{0}` isn't defined".format(conn_id))
E           airflow.exceptions.AirflowException: The conn_id `` isn't defined
```

There are 157 occurrences of this error.

## 3. How to improve the Airflow test suite

1. Achieve a 100% green build. This requires:
   - fixing the tests that are breaking the build (2.1),
   - fixing the failing / erroring tests (2.2).
2. Improve the tests that rely on database side effects:
   - To achieve that, we can adopt the following process: use the `reset_db` fixture > analyse the problem > fix the test > check that the test passes with the fixture > remove the fixture.

```python
import pytest

from airflow.utils import db


@pytest.fixture(autouse=True, scope="class")
def reset_db():
    db.resetdb()
```

Migrating to `pytest` requires only the first step. The second one is a 'nice to have', but it seems that most will agree it would truly improve our tests.

### Suggestions:
- keep the `reset_environment` fixture, as it's a quick and simple way to ensure a clean environment
- add a CI build / scheduled build with the `reset_db` autouse fixture (or others) to check that tests do not rely on database side effects. As suggested by Jarek Potiuk, we can try to make this fixture faster by dumping and reloading the database state (a rough sketch follows below). In that form it could be used in the average build.
### Ideas:
- in the future we can use `pytest-xdist` for faster, distributed test runs: https://github.com/pytest-dev/pytest-xdist
- we can use `pytest.mark` to mark, for example, system tests or flaky tests. This would make it easy to build custom CI jobs that run only selected tests: http://doc.pytest.org/en/latest/example/markers.html

## Action points:
- Full pytest, or pytest + `unittest`?
- How should we approach fixing the tests? Should we mark the failing tests as expected to fail using `@pytest.mark.xfail` and then fix them (a sketch follows below)? This way the work could be split up.
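For illustration, here is how the `pytest.mark` idea and the `@pytest.mark.xfail` approach could look in practice. The `system` marker name and both test names are made up for the example; only the marking technique itself is what is being proposed.

```python
import pytest


# A custom marker (it can be registered via the `markers` option in pytest.ini).
# A dedicated CI job could then run only these tests with `pytest -m system`.
@pytest.mark.system
def test_real_gcs_upload():
    ...


# A known-broken test marked as expected to fail: it is reported as "xfailed"
# instead of breaking the build, and shows up as "xpassed" once somebody fixes it,
# so the fixing work can be split among contributors.
@pytest.mark.xfail(reason="fails due to a missing connection in the DB, see section 2.2")
def test_wasb_hook_connection():
    ...
```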