# A privacy-preserving auditable system for carbon offset retirement tracking
# Abstract
In order to limit the damage of rising temperatures the world needs to reach net-zero carbon emissions as rapidly as possible. Organisations must decarbonise as quickly as they can but for emissions that are unavoidable and where decarbonisation is not currently possible, they should *offset* climate damage with an equivalent climate benefit. How then can organisations demonstrate that they are correctly using this offset mechanism? We propose a privacy-preserving system for tracking offsets used by an organisation while providing traceability and auditability through a commitment scheme and a smart contract-driven interactive proof. An implementation of the system at the University of Cambridge for tracking the carbon offsetting of business travel is also detailed, which utilizes the Tezos blockchain for public accountability.
# Introduction
Carbon credits are financial instruments granted to organisations that can prove that they have either avoided emitting carbon or have directly reduced atmospheric carbon through means that would not have been possible without financial incentives. The underlying mechanisms for reduction or avoidance come in many different forms. They can be nature-based (e.g. avoided deforestation) or technological (e.g. direct air capture or replacing fossil fuel generation with renewable energy where it is not yet financially feasible to do so).
Care must be taken to ensure carbon credits are only issued once for any given climate benefit. To ensure this, credits are tracked by issuing bodies in *registries*, which link each credit with its benefit.
These issued carbon credits can be purchased by other organisations who have unavoidable carbon emissions they seek to mitigate. These emissions are matched with carbon credits of equivalent benefit and the two *offset* each other. In order to balance climate damage and climate benefit, when used as an offset a carbon credit must be _retired_. This is irreversible and ensures the credit can't be subsequently used to offset additional emissions.
The problem this paper addresses is how organisations can effectively track their retirement of carbon credits whilst maintaining transparency and accountability for their usage.
A solution requires the following properties:
**Multi-level privacy** Accurately tracking carbon emissions necessarily involves storing potentially sensitive data (e.g. personally identifiable information for travel, supply information for industrial processes). This data cannot be shared publicly and exposure might be limited even internally in larger organisations. Multi-level privacy enables progressive summarisation and aggregation, with each level being available to a wider audience.
**Auditability** Public statements of emissions should come with a mechanism for auditing the emitting activities that underlie them and for establishing that these are unavoidable emissions, with offsetting additional to decarbonisation.
**Incremental attestation** Consumers of emissions data should be able to attest that, given a sequential set of data entries at a certain level of privacy, each entry follows the previous and also that each entry is derived from the entry at the privacy level immediately below it.
**Availability and permanence of public emissions data** In order to build reliable systems on top of public emissions data, the data itself must be available. Additionally, since data on emissions-generating activities may outlive organisations themselves, the data must be permanent and not dependent on the emitting organisation's survival.
# Existing approaches
At present emissions data is tracked internally at many organisations, for some of whom it is a legal requirement^[https://www.gov.uk/government/publications/environmental-reporting-guidelines-including-mandatory-greenhouse-gas-emissions-reporting-guidance], but there are only a few mechanisms by which summaries of this data are made publicly available.
### Financial reporting
The method used predominantly by publicly listed companies is to include summaries in financial accounts. While this approach is arguably permanent (the UK Companies House is 178 years old and the US SEC is 88 years old, after all), it fails to provide any of the other required properties. Summaries are also very coarse, with many being limited to Scope 1, 2 and 3 emissions only.
### Carbon credit registries
Another approach for organisations retiring carbon credits to offset their emissions is to report a summary of emissions data as a "retirement reason" when retiring individual credits. This rarely happens, though, and retirement reasons usually refer to individual emission scopes. This approach also has issues with availability and permanence, with registries being run and funded by standards bodies or by companies authorised to do so on their behalf.
### Blockchain-based retirement systems
There have been a wide variety of efforts^[https://toucan.earth/] ^[https://www.flowcarbon.com/] ^[https://moss.earth/] ^[https://www.klimadao.finance/] to bring carbon credits onto blockchains, turning them into *tokens* which represent the underlying credit itself. These tokens can then be moved or sold, finally being 'burnt', which retires the underlying credit. Some providers (such as Toucan) support issuing on-chain message-containing tokens that represent each retirement.
# A new system for tracking carbon credit retirements
This paper proposes a new system for tracking an organisation's carbon credit usage and retirements. It consists of a series of interlinked data layers, the lowest of which contains the private, sensitive information on the organisation's emitting activities and the highest of which contains a public coarse-grained summary. Each data entry is linked with the data entry both previous to it and below it.
Intermediate levels offer varying degrees of information hiding. At the private level, individual emitting activities are tracked at a granular level and are available for accounting and analysis internally within the organisation. Private entries are then summarised into audit entries, removing personally identifiable and sensitive information.
These auditable entries contain information that is semi-public: with access to all of them, one could deanonymise or reverse-engineer internal data. Consider an organisation offsetting the carbon emissions from non-renewable energy sources used by computer servers. The time duration, location and utilisation of all servers would expose significant aspects of the organisation's operation. On the other hand, revealing only a small number makes it very difficult to draw any real inferences.
Lastly are the public entries, further summarised from the auditable entries. These feature only information that the organisation is happy to reveal entirely to the public and may be limited to just the type of activity, the carbon emissions and the carbon credits retired or intended to be retired against the activity.
```mermaid
flowchart TB
Private --> Auditable --> Public
```
More formally, consider a system operating over $T$ timesteps with $3$ levels of access, public ($n=2$), audit ($n=1$) and private ($n=0$):
$$
D_n^t = \left(O_n(D_{n-1}^{t}),\ H(D_{n-1}^{t}),\ H(D_n^{t-1})\right)
$$
Where:
* $D_n^t$ is the data entry at level $n$ at timestep $t$
* $O_n$ is the operator at level $n$ which takes the data entry at level $n-1$ and removes or downsamples information
* $H$ is a cryptographically secure hash function
In plain terms, each data entry has three components. The first is a value derived by the operator from the data entry at the level below. The second is a hash of that lower-level data entry. The third is a hash of the previous data entry (timestep $t-1$) at the current level.
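The definition can be sketched in Python as follows. This is a minimal illustration rather than the system's implementation: SHA-256 over canonical JSON stands in for $H$, the operator is applied to the value component of the lower-level entry, and the names `Entry`, `make_entry`, `make_private_entry` and `verify_links` are hypothetical. The base cases (no level below the private level, no predecessor at the first timestep) are not specified in the text, so the sketch assumes they are represented by `None`.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Entry:
    value: Any                 # O_n(D_{n-1}^t), or the raw record at level 0
    below_hash: Optional[str]  # H(D_{n-1}^t); None at level 0 (assumption)
    prev_hash: Optional[str]   # H(D_n^{t-1}); None at the first timestep (assumption)


def H(entry: Optional[Entry]) -> Optional[str]:
    """Hash an entry. SHA-256 over canonical JSON is an assumed choice."""
    if entry is None:
        return None
    payload = json.dumps(asdict(entry), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def make_private_entry(record: Any, prev: Optional[Entry]) -> Entry:
    """Level 0: store the raw record and link to the previous private entry."""
    return Entry(value=record, below_hash=None, prev_hash=H(prev))


def make_entry(operator: Callable[[Any], Any], below: Entry,
               prev: Optional[Entry]) -> Entry:
    """Level n > 0: apply the operator to the entry below and link both ways."""
    return Entry(value=operator(below.value), below_hash=H(below), prev_hash=H(prev))


def verify_links(entry: Entry, below: Optional[Entry], prev: Optional[Entry]) -> bool:
    """Check that an entry commits to the given lower-level and previous entries."""
    return entry.below_hash == H(below) and entry.prev_hash == H(prev)


# Two timesteps at the private and audit levels, using made-up records.
def drop_name(record: dict) -> dict:
    return {k: v for k, v in record.items() if k != "name"}


private_0 = make_private_entry({"name": "A", "emissions_kg": 120}, prev=None)
audit_0 = make_entry(drop_name, below=private_0, prev=None)
private_1 = make_private_entry({"name": "B", "emissions_kg": 80}, prev=private_0)
audit_1 = make_entry(drop_name, below=private_1, prev=audit_0)
```

Anyone holding `audit_1`, `audit_0` and `private_1` can call `verify_links(audit_1, private_1, audit_0)` to confirm that the audit entry follows its predecessor and commits to the private entry beneath it.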
Figure 1 shows (TODO: fix different terms and no time) a diagram of the resulting structure.

Operators are the mechanism for preserving privacy and there can be a unique one at each level. Some examples may be downsampling, aggregation or dropping columns from tabular data.
### Self-auditing
The data structure can be seen as three interlinked hash chains that, for each data entry, form a commitment scheme. Once a public entry is published to the blockchain, the organisation has *committed* to the value of the audit entry whose hash is contained in that public entry.
Self-auditing is achieved through the use of a smart contract that, after a series of $N$ public retirements, requires the organisation to reveal $M$ randomly chosen audit entries, where $M \ll N$. Even small ratios (such as 3 in 1,000) can give confidence that the data underpinning public entries is correct whilst revealing very little about the aggregate activity statistics of the organisation.
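The reveal step can be sketched as follows. This is an illustration under stated assumptions, not the smart contract itself: the seed is taken to be some public, unpredictable byte string, SHA-256 over canonical JSON stands in for the hash function, and `commit`, `select_audit_indices`, `verify_reveal` and `detection_probability` are hypothetical names. The last function applies the standard hypergeometric argument: if $K$ of $N$ audit entries are inconsistent with their commitments, a sample of $M$ drawn without replacement contains at least one of them with probability $1 - \binom{N-K}{M} / \binom{N}{M}$.

```python
import hashlib
import json
import random
from math import comb


def commit(audit_entry: dict) -> str:
    """Hash of an audit entry, as it would appear inside a public entry (assumed format)."""
    payload = json.dumps(audit_entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def select_audit_indices(seed: bytes, n: int, m: int) -> list[int]:
    """Pick m distinct indices out of n, reproducibly from a shared seed."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n), m))


def verify_reveal(committed_hashes: list[str], revealed: dict[int, dict]) -> bool:
    """Check every revealed audit entry against the hash committed for its index."""
    return all(commit(entry) == committed_hashes[i] for i, entry in revealed.items())


def detection_probability(n: int, k: int, m: int) -> float:
    """Chance that a sample of m out of n hits at least one of k inconsistent entries."""
    return 1 - comb(n - k, m) / comb(n, m)
```

Under this model, `detection_probability(1000, 500, 3)` is about 0.875 while `detection_probability(1000, 1, 3)` is 0.003, so a small sample is far more sensitive to widespread inconsistencies than to isolated ones.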
### Examples
#### Business travel
We consider application of the proposed system to an organisation wishing to offset necessary business travel (we will detail the real-world implementation of such a system later).
At the most granular level, private, are the individual flight legs themselves, which contain information such as the flyer's details, budget, reason for travel, details of approvals, time of travel and calculated emissions. Since this data contains personally identifiable information (PII) it should not be shared outside the organisation; additionally, in a large organisation this data would likely be kept within business units or departments.
The operator at the semi-public auditable level simply drops fields, so its entries contain only the source, destination, mileage, reason for travel and calculated emissions. A small number of these can be revealed without leading to deanonymisation.
Finally the operator at the public level drops further fields, resulting in entries containing only mileage and calculated emissions. Public information on the mileage of business travel enables monitoring of progress in decarbonising travel, i.e. there should be fewer low-mileage flights over time.
```mermaid
flowchart TB
private[<b>Private</b>\nSource/destination\nFlyer's details\nMileage\nBudget approval\nReason for travel\nTime of travel\nEmissions] --> audit[<b>Auditable</b>\nSource/destination\nMileage\nEmissions\n] --> public[<b>Public</b>\nMileage\nEmissions]
```
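Because both operators in this example only drop fields, they can be written as simple projections. The sketch below is illustrative: the field names are hypothetical stand-ins for the fields described in the text, the sample record is invented placeholder data, and the auditable field list follows the prose description (which includes the reason for travel).

```python
# Hypothetical field names for the two non-private levels.
AUDIT_FIELDS = ("source", "destination", "mileage", "reason_for_travel", "emissions_kg")
PUBLIC_FIELDS = ("mileage", "emissions_kg")


def project(record: dict, fields: tuple) -> dict:
    """Keep only the named fields; everything else is dropped."""
    return {name: record[name] for name in fields}


def audit_operator(private_record: dict) -> dict:
    """Private -> auditable: remove PII and internal approval details."""
    return project(private_record, AUDIT_FIELDS)


def public_operator(audit_record: dict) -> dict:
    """Auditable -> public: keep only mileage and emissions."""
    return project(audit_record, PUBLIC_FIELDS)


# Invented placeholder record for one flight leg.
private_record = {
    "flyer": "A. Example",
    "budget_code": "X-000",
    "approved_by": "B. Example",
    "departure_time": "2023-01-01T09:00:00Z",
    "reason_for_travel": "conference",
    "source": "AAA",
    "destination": "BBB",
    "mileage": 100,
    "emissions_kg": 50.0,
}

audit_record = audit_operator(private_record)
public_record = public_operator(audit_record)
```

Writing the public operator in terms of the auditable entry, rather than the private one, mirrors the layering: each level is derived only from the level immediately below it.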
#### Last-mile delivery
In this example, we consider application of the proposed system to last-mile delivery wherein packages are moved from a local hub to their final destinations at residential or commercial locations.
At the private level data is collected per-vehicle over some time interval (e.g. a day or shift) and contains driver details, vehicle make, fuel type, a GPS trace for the period and calculated emissions.
The operator at the auditable level drops some fields but also applies a privacy-preserving transform^[Hoh, B., Gruteser, M., Xiong, H. and Alrabady, A., 2007, October. Preserving privacy in GPS traces via uncertainty-aware path cloaking. In Proceedings of the 14th ACM conference on Computer and communications security (pp. 161-171).] to the GPS traces, as otherwise these might expose private location information on, e.g., residential properties. The auditable level should contain vehicle make, fuel type, a privacy-preserving GPS trace, mileage and calculated emissions.
The operator at the public level removes all but vehicle make, fuel type, mileage and calculated emissions. As with business travel the data available in public entries allows for monitoring of decarbonisation of the delivery fleet.
```mermaid
flowchart TB
private[<b>Private</b>\nTime\nDriver details\nVehicle make\nFuel type\nGPS trace\nEmissions] --> audit[<b>Auditable</b>\nVehicle make\nFuel type\nPrivacy-preserving GPS trace\nMileage\nEmissions] --> public[<b>Public</b>\nVehicle make\nFuel type\nMileage\nEmissions]
```
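One simple way to picture the auditable-level operator is sketched below. Note that it substitutes plain coordinate rounding for the uncertainty-aware path cloaking cited in the text, which is considerably more involved; rounding alone does not offer the same privacy guarantees. Mileage is computed from the original trace with the haversine formula before the trace is coarsened. The function names, field names and the two-decimal precision (roughly 1 km in latitude) are assumptions made for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(a: tuple, b: tuple) -> float:
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


def trace_mileage(trace: list) -> float:
    """Sum of distances between consecutive points of a GPS trace."""
    return sum(haversine_miles(p, q) for p, q in zip(trace, trace[1:]))


def coarsen_trace(trace: list, decimals: int = 2) -> list:
    """Round coordinates and collapse consecutive duplicates (a stand-in transform)."""
    coarse = []
    for lat, lon in trace:
        point = (round(lat, decimals), round(lon, decimals))
        if not coarse or coarse[-1] != point:
            coarse.append(point)
    return coarse


def audit_operator(private_record: dict) -> dict:
    """Private -> auditable: drop driver details, coarsen the trace, add mileage."""
    trace = private_record["gps_trace"]
    return {
        "vehicle_make": private_record["vehicle_make"],
        "fuel_type": private_record["fuel_type"],
        "gps_trace": coarsen_trace(trace),
        "mileage": round(trace_mileage(trace), 2),
        "emissions_kg": private_record["emissions_kg"],
    }
```

Computing mileage before coarsening keeps the reported distance faithful to the route actually driven, while the published trace no longer pinpoints individual addresses.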
#### Cloud hosting
In this final example we use an organisation that wishes to offset against emissions from non-renewable energy consumed by servers operating in their cloud environment.
At the private level data is collected per-server at an hourly interval and contains the server type, power usage over the interval, utilisation, energy generation mix, carbon emissions, geographic location and location (rack) within the datacentre.
The operator at the auditable level would drop datacentre-specific fields and coarsen geographic location. It would include server type, energy generation mix and carbon emissions.
Finally the operator at the public level would include only power usage, energy generation mix and carbon emissions. As with previous examples, this would enable monitoring of the decarbonisation of energy generation for the organisation's datacentres.
```mermaid
flowchart TB
private[<b>Private</b>\nTime\nServer type\nPower usage\nUtilisation\nGeographic location\nRack location\nEnergy Mix] --> audit[<b>Auditable</b>\nTime\nServer type\nPower usage\nUtilisation\nLocation\nEnergy Mix\nEmissions] --> public[<b>Public</b>\nPower usage\nEnergy Mix\nEmissions]
```
# Implementation
* Brief overview of the system being implemented, its purpose, and objectives.
## Architecture and Platform
* Describe the architecture and hardware platform used for the implementation.
## Tools and Libraries
* List the tools and libraries used for development and testing of the system.
## Implementation Details
* Detail how each component of the system was implemented, with examples where necessary. This should include any custom code or modifications made to existing libraries.
## Integration and Testing
* Discuss how the various components of the system were integrated and how the system was tested for robustness and correctness.
## Deployment
* Discuss how the system was deployed, including any challenges faced during deployment, and how the system was made available to users.
# Discussion
* SJ: What do we discuss here?
# Limitations and Future Work
* Discuss any limitations of the current implementation and possible directions for future work to address these limitations.