Available greenhouse gas emissions datasets are often incomplete due to inconsistent reporting and poor transparency. Filling the gaps in these datasets allows for more accurate targeting of mitigation strategies and therefore a faster reduction of overall emissions.
This page is a guide for practitioners on how to use automated classification methods to fill these gaps. Different problems require different solutions, so this page tries to point you to the methods most likely to work for your problem. No guarantees (please don't sue me, I'm on an academic salary...), but it works for us!
Click here to view the paper, and click here to cite it. This page is public, so please comment or suggest edits using HackMD so that we can keep this an active guide to ML in industrial ecology!
"How to" guide
The figure below outlines the dataset properties that should guide your decision about which classifiers are most suitable for your problem. Each of these steps is discussed in the three sections below.