# Causal Inference with [DoWhy](https://github.com/Microsoft/dowhy) <br> <br> <small>Neil John D. Ortega</small><br> <small> <small>ML Engineer @ LINE Fukuoka</small><br> <small>2021/02/26</small> </small>

---

## Agenda

- Why Causal Inference (CI)?
- Why [DoWhy](https://github.com/Microsoft/dowhy)?
- 4-Step CI Workflow
- Case Study: What Causes Hotel Booking Cancellations?
- Recap

---

## Why Causal Inference (CI)?

- Increased usage of ML algorithms in decision-making across verticals, often with huge consequences<!-- .element: class="fragment" -->
- Increased need to understand<!-- .element: class="fragment" -->
  - why a model suggested a particular decision<!-- .element: class="fragment" -->
  - what the effects of that decision are (often with ethical/social consequences)<!-- .element: class="fragment" -->
- Causal inference can help, but it has its own set of challenges:<!-- .element: class="fragment" -->
  - different frameworks can be used<!-- .element: class="fragment" -->
  - comparing assumptions is not straightforward<!-- .element: class="fragment" -->
  - comparing the robustness of results is not straightforward<!-- .element: class="fragment" -->

---

## Why [DoWhy](https://github.com/Microsoft/dowhy)?

- Allows modeling problems as a causal graph, making the assumptions more explicit<!-- .element: class="fragment" -->
- Provides a single interface for different CI frameworks (structural causal models vs. the potential outcomes framework)<!-- .element: class="fragment" -->
- Allows automatic validation of assumptions and assessment of the robustness of estimates under "what-if" scenarios<!-- .element: class="fragment" -->

---

## 4-Step CI Workflow

1. **Model** the problem as a causal graph
2. **Identify** a target estimand under the causal model
3. **Estimate** the causal effect based on the identified estimand
4. **Refute** the obtained estimate

----

### (1) Model the problem as a causal graph

- Causal graph
  - a DAG that encodes domain knowledge as a set of causal assumptions

![](https://i.imgur.com/MJotPHh.png)

----

### (1) Model the problem as a causal graph

- Intervention graph
  - keeping everything else the same, we select a treatment whose causal effect on the outcome we want to estimate

![](https://i.imgur.com/Z4mOwQn.png)

----

### (2) Identify the target estimand

- How to express the desired quantity from the **intervention graph** in terms of statistical quantities estimable from data generated by the **causal graph**
- Can we estimate it from the given data?

----

### (3) Estimate the causal effect

- Keeping all confounders constant, estimate the conditional probability $P(Y|T=t)$ of the outcome given the treatment
- Statistical/ML methods are employed here
- Faces the same challenges as non-causal estimation (e.g. bias-variance tradeoff)

----

### (4) Refute the obtained estimate

- Test how the estimates behave under "what-if" scenarios
- Can be done with
  - any one of the previous steps (unit test-like), or
  - the entire pipeline (integration test-like)
- The more tests you can run, the better, i.e. :arrow_up: confidence in the model
- :warning: **DOES NOT PROVE CORRECTNESS**
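----

### The 4 Steps in DoWhy (sketch)

A minimal sketch of how the four steps might look with DoWhy's `CausalModel` API, assuming a hypothetical bookings dataframe with a `different_room_assigned` treatment, an `is_canceled` outcome, and a couple of illustrative confounders (the column names, estimator, and refuter below are placeholders, not prescriptions):

```python
import pandas as pd
from dowhy import CausalModel

# Hypothetical bookings data (columns assumed for illustration only)
df = pd.read_csv("hotel_bookings.csv")

# (1) Model: encode domain assumptions as a causal graph
model = CausalModel(
    data=df,
    treatment="different_room_assigned",
    outcome="is_canceled",
    common_causes=["lead_time", "booking_changes"],  # assumed confounders
)

# (2) Identify: derive a statistical estimand (e.g. backdoor) from the graph
identified_estimand = model.identify_effect(proceed_when_unidentifiable=True)

# (3) Estimate: compute the effect with a statistical/ML estimator
estimate = model.estimate_effect(
    identified_estimand,
    method_name="backdoor.propensity_score_matching",
)
print(estimate.value)

# (4) Refute: stress-test the estimate under a "what-if" scenario
refutation = model.refute_estimate(
    identified_estimand,
    estimate,
    method_name="random_common_cause",
)
print(refutation)
```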
---

## Case Study: What Causes Hotel Booking Cancellations?

---

## Recap

<style>
.reveal ul {font-size: 32px !important;}
</style>

- Increased need to peek inside the black box of automated decision-making, and to determine the effects of a decision should we implement one<!-- .element: class="fragment" -->
- Causal inference lends itself to the above problem, but with its own set of challenges<!-- .element: class="fragment" -->
- [DoWhy](https://github.com/Microsoft/dowhy) is a good starting point for integrating causal techniques into the DS/ML pipeline<!-- .element: class="fragment" -->
  - Good level of abstraction over the steps of the CI workflow<!-- .element: class="fragment" -->
  - Out-of-the-box implementations of techniques for each step (mix and match, but with due diligence)<!-- .element: class="fragment" -->
  - Automated robustness checks for higher confidence in the model and its estimates<!-- .element: class="fragment" -->

---

# Thank you! :nerd_face:

---

## References

<!-- .slide: data-id="references" -->

<style>
.reveal p {font-size: 20px !important;}
.reveal ul, .reveal ol {
  display: block !important;
  font-size: 32px !important;
}
section[data-id="references"] p {
  text-align: center !important;
}
</style>

[1] Sharma, A. and Kiciman, E. "DoWhy: An End-to-End Library for Causal Inference." arXiv preprint arXiv:2011.04216 (2020).
{"metaMigratedAt":"2023-06-15T19:54:00.145Z","metaMigratedFrom":"YAML","title":"Causal Inference with DoWhy","breaks":true,"description":"View the slide with \"Slide Mode\".","slideOptions":"{\"spotlight\":{\"enabled\":false}}","contributors":"[{\"id\":\"ed2adf4d-7b64-4cc8-9c2f-656c184d7122\",\"add\":4995,\"del\":9736}]"}