Peer scoring - SN6

Definitions

There are miners $m \in M$ and questions $q \in Q$, each with a respective cutoff time $t_{q}$. For any pair $(m, q)$, miner $m$ submits a time series of forecasts $(p_{m, q, t})_{t \leq t_{q}}$, with $t \in T_{q}$, a list of time points (depending on the question) preceding the cutoff $t_{q}$. If a miner does not submit a forecast, we denote their submission by $p_{\emptyset}$.

Let $S(p_{m, q, t}, o_{q})$ be the score of a given prediction when the question resolves to $o_{q} \in \{0, 1\}$ (with $1$ indicating that the underlying event occurred and $0$ otherwise).

Peer Score

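For a set of $n$ miners, the peer score on a question $q$ at time step $t$ takes the following form:

$$S(p_{m, q, t}, o_{q}) = \frac{1}{n - 1} \sum_{j \neq m} \left( \log(p_{m, q, t}) - \log(p_{j, q, t}) \right) = \log(p_{m, q, t}) - \log\left( GM(p_{j, q, t})_{j \neq m} \right),$$

where $GM(p_{j, q, t})_{j \neq m}$ denotes the geometric mean of the predictions of all miners except $m$ at time step $t$. The sum runs over the $n - 1$ other miners, hence the factor $\frac{1}{n - 1}$, which is what makes the second equality hold. Read each $p$ here as the probability the forecast assigns to the realised outcome, that is $p$ if $o_{q} = 1$ and $1 - p$ if $o_{q} = 0$; this is the reading under which a confident and wrong forecast scores $-\infty$.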

To prevent instabilities, we clip the miners' predictions. Without clipping, a single confident and wrong prediction (e.g., submitting $1$ when the question resolves to $0$) would result in a score of $-\infty$, effectively eliminating the miner. We clip the predictions to the interval $(0.1, 0.99)$.
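The peer score and the clipping step can be sketched together as follows. This is a minimal illustration, not the subnet's actual code: the function names are hypothetical, the clipping bounds are the ones quoted above, and the sketch assumes that the logarithm is applied to the probability assigned to the realised outcome and that the mean is taken over the other $n - 1$ miners.

```python
import math

CLIP_LOW, CLIP_HIGH = 0.1, 0.99  # clipping interval quoted in the text


def clip(p, low=CLIP_LOW, high=CLIP_HIGH):
    """Clamp a raw prediction into [low, high]."""
    return min(max(p, low), high)


def outcome_prob(p, outcome):
    """Probability the forecast assigns to the realised outcome (assumption)."""
    return p if outcome == 1 else 1.0 - p


def peer_scores(predictions, outcome):
    """Peer score of every miner at a single time step.

    predictions: dict miner_id -> raw probability that the event occurs.
    Returns dict miner_id -> log(own prob) - mean of log(prob) over the others,
    which equals log(own prob) - log(geometric mean of the others).
    Requires at least two miners.
    """
    logs = {
        m: math.log(outcome_prob(clip(p), outcome))
        for m, p in predictions.items()
    }
    total = sum(logs.values())
    n = len(logs)
    return {m: lp - (total - lp) / (n - 1) for m, lp in logs.items()}


scores = peer_scores({"a": 0.9, "b": 0.5, "c": 1.0}, outcome=0)
```

With this normalisation the peer scores of all miners at a time step sum to zero, and miner `c`, who submitted $1$ on a question resolving to $0$, receives a large negative but finite score thanks to clipping.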

To incentivize full coverage, we adopt the following rule:

$$S(p_{\emptyset}, o_{q}) = \text{worst possible score on a given question}.$$
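The text does not spell out how the worst possible score is computed. One possible reading, sketched below, is to replace a missing submission with the clipped prediction that lies as far from the outcome as clipping allows, and then score it like any other forecast. The function name `fill_missing` and the use of `None` for a missing submission are assumptions made for illustration.

```python
CLIP_LOW, CLIP_HIGH = 0.1, 0.99  # clipping interval quoted in the text


def fill_missing(prediction, outcome, low=CLIP_LOW, high=CLIP_HIGH):
    """Return the clipped prediction to score; None means no submission.

    Assumption: a missing forecast is treated as the worst admissible one,
    i.e. the clipping bound furthest from the realised outcome.
    """
    if prediction is None:
        return low if outcome == 1 else high
    return min(max(prediction, low), high)
```

For example, `fill_missing(None, 1)` yields the lower bound and `fill_missing(None, 0)` the upper bound, so a miner who skips a question is never better off than one who submits.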

Weights

We associate a weight $w_{q, t}$ to each submission, depending on its time $t \in T_{q}$. We choose exponentially decreasing weights based on the intuition that predicting gets exponentially harder as one goes back in time. Denote

$$T_{q} = [A_{q}, B_{q}].$$

We divide the time segment $[A_{q}, B_{q}]$ into $n$ intervals defined by the endpoints $t_{0}, t_{1}, \dots, t_{n}$, where $t_{0} = A_{q}$ and $t_{n} = B_{q}$. Here $n$ is the number of intervals, not the number of miners. For each interval $[t_{j}, t_{j + 1}]$ (currently of equal length, e.g., 4 hours), we set the weight

$$w_{q, t_{j}} = e^{- \frac{n}{n - j} + 1},$$

with $j$ increasing from $0$ to $n - 1$. The first interval therefore has weight $e^{0} = 1$, and the last one, just before the cutoff, has weight $e^{-(n - 1)}$.
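As a quick illustration of the weight formula, the following sketch computes the weights for a question split into `n` intervals. The function name is hypothetical; the formula is the one given above.

```python
import math


def interval_weights(n):
    """Weights w_j = exp(-n / (n - j) + 1) for j = 0, ..., n - 1."""
    return [math.exp(-n / (n - j) + 1) for j in range(n)]


weights = interval_weights(4)
```

For `n = 4` the weights are roughly 1, 0.717, 0.368 and 0.050, so the earliest interval counts about twenty times as much as the one just before the cutoff.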

Averaging per Window

Each $p_{m, q, t}$ is in fact the arithmetic average of the miner's predictions within a given time window $[t_{j}, t_{j + 1}]$. That is, writing $p_{j}$ for the averaged prediction of window $j$,

$$p_{j} = \frac{\sum_{t' \in [t_{j}, t_{j + 1}]} p_{t'}}{\sum_{t' \in [t_{j}, t_{j + 1}]} 1}.$$
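The per-window averaging can be sketched as below. The function name, the representation of submissions as `(timestamp, probability)` pairs and the numeric timestamps are assumptions for illustration. The sketch also treats windows as half-open, so a submission exactly on a boundary $t_{j + 1}$ falls into window $j + 1$; the text writes closed intervals and does not say how boundary submissions are handled.

```python
from collections import defaultdict


def window_averages(submissions, start, window_length, n_windows):
    """Average a miner's raw submissions inside each time window.

    submissions: iterable of (timestamp, probability) pairs.
    Returns a list of length n_windows with the mean prediction per window,
    or None for windows without any submission.
    """
    buckets = defaultdict(list)
    for t, p in submissions:
        j = int((t - start) // window_length)
        if 0 <= j < n_windows:  # ignore submissions outside [A_q, B_q)
            buckets[j].append(p)
    return [
        sum(buckets[j]) / len(buckets[j]) if j in buckets else None
        for j in range(n_windows)
    ]


# Timestamps in hours since the question opened, 4-hour windows.
averages = window_averages([(0.5, 0.2), (1.0, 0.4), (5.0, 0.9)], 0, 4, 3)
```

Here the first window averages two submissions, the second holds a single one, and the third is empty, which is where the rule for missing forecasts would apply.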

Weighted Average

Given our weights $w_{q, t}$ and the miner's time series $(p_{m, q, t})$, we compute the following time-weighted average for each miner:

$$S_{m, q} = \frac{\sum_{t} w_{q, t} \, S(p_{m, q, t}, o_{q})}{\sum_{t} w_{q, t}}.$$
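In code, the time-weighted average is a plain weighted mean of the per-window scores. A minimal sketch, with a hypothetical function name and the scores and weights passed as equally long lists:

```python
def time_weighted_score(scores, weights):
    """Weighted mean of per-window scores: sum(w * s) / sum(w)."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


s_mq = time_weighted_score([1.0, -1.0], [3.0, 1.0])  # (3 - 1) / 4
```

Because the weights are normalised by their sum, a miner with the same score in every window gets exactly that score for the question, whatever the number of windows.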

Moving Average and Extremisation

We build a moving average

$$L_{m, q} = \sum_{q' \in Q_{N}} S_{m, q'},$$

where the sum is taken over $Q_{N}$, the last $N$ questions (with $N$ chosen as the average number of questions a miner would receive during the immunity period).

We then extremise this moving average to reward better predictors and to filter out miners who, on average, do not contribute any signal:

$$R_{m, q} = \max(L_{m, q}, 0)^{2}.$$

The weight of the miner is then given by the normalised value, obtained by dividing by the total over all miners so that the weights sum to one:

$$W_{m, q} = \frac{R_{m, q}}{\sum_{m' \in M} R_{m', q}}.$$
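The three steps of this section (sum over the last $N$ questions, extremisation, normalisation) can be sketched as follows. The function name and input layout are hypothetical. The sketch assumes that the normalising constant is the total over all miners, so that the weights sum to one, and it returns all-zero weights when no miner has a positive moving average, a case the text does not cover.

```python
def miner_weights(recent_scores):
    """Turn per-question scores into normalised miner weights.

    recent_scores: dict miner_id -> list of S_{m, q'} over the last N questions.
    """
    totals = {m: sum(s) for m, s in recent_scores.items()}         # L_{m, q}
    extremised = {m: max(l, 0.0) ** 2 for m, l in totals.items()}  # R_{m, q}
    norm = sum(extremised.values())
    if norm == 0:
        # Assumption: nobody is rewarded when no miner beats the baseline.
        return {m: 0.0 for m in extremised}
    return {m: r / norm for m, r in extremised.items()}


weights = miner_weights({"a": [1.0, 1.0], "b": [0.5, 0.5], "c": [-1.0, 0.2]})
```

In this example the moving sums are 2, 1 and -0.8, so squaring turns a 2:1 lead into a 4:1 share of the weight (0.8 versus 0.2), while miner `c` is filtered out entirely.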

New Miners

When a miner registers at time $t$ on the subnet, they will submit predictions for questions $q$ that opened at a time $t_{q, 0} < t$. In such cases, we assign the new miner a score of $0$, which corresponds to the baseline (i.e., when a miner is neither contributing new information compared to the aggregate nor being penalized). We also assign them a score of $0$ for questions in the moving average that resolved before the miner registered.
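The new-miner rule amounts to a single comparison of timestamps. A minimal sketch, with a hypothetical function name and numeric timestamps:

```python
def effective_score(raw_score, registered_at, question_opened_at):
    """Score credited to a miner for one question in the moving average.

    Questions that opened before the miner registered count as 0 (baseline).
    This also covers questions that resolved before registration, since
    those necessarily opened before it as well.
    """
    if question_opened_at < registered_at:
        return 0.0
    return raw_score
```

A miner registering at time 10 is thus credited 0 for a question opened at time 5, whatever they submit, and receives their actual score for a question opened at time 12.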



Read more

Scoring rule v3

scoring update v3