# Layer Sweep

## Metric

$$completeness_{layer} = \frac{p_{layer}(object) - p_{-1}(object)}{p_{-1}(object)}$$

$$contribution_{layer} = \frac{p_{layer}(object) - p_{layer - 1}(object)}{p_{-1}(object)}$$

The causal score is measured by restoring $h_{subject}$ at different layers in a corrupted run (one with a different subject), but only at the `subject_last` position.

$$causal\_score_{layer} = \frac{p(answer | h_{subject}^{layer} \text{ restored on corrupted run}) - p(answer | \text{clean run})}{p(answer | \text{clean run})}$$

| completeness | contribution | causal_score | faithfulness |
| -------- | -------- | -------- | -------- |

<!--
relation = `The capital of {} is`

Each calculated with 3 ICL examples

$$completeness_{layer} = \frac{p_{layer}(object) - p_{-1}(object)}{p_{-1}(object)}$$

---

$$contribution_{layer} = \frac{p_{layer}(object) - p_{layer - 1}(object)}{p_{-1}(object)}$$

| J_norm | Jh_norm |
| -------- | -------- |

---

## Causal Tracing

## ICL-Mean estimation performance at different layers
-->
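
The three formulas under Metric can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for these notes: it assumes $p_{-1}$ means the final layer (Python-style index `-1`), that the per-layer probabilities are already available as a list, and that the value "before" layer 0 is supplied by a hypothetical `p_before_first` argument, since the notes do not say what precedes the first layer. All function and parameter names are made up for the example.

```python
from typing import List, Sequence


def completeness(p_obj: Sequence[float]) -> List[float]:
    """(p_layer - p_final) / p_final per layer.

    Assumes p_obj[-1] is the final-layer probability of the object.
    """
    p_final = p_obj[-1]
    return [(p - p_final) / p_final for p in p_obj]


def contribution(p_obj: Sequence[float], p_before_first: float = 0.0) -> List[float]:
    """(p_layer - p_{layer-1}) / p_final per layer.

    p_before_first is a hypothetical stand-in for the value preceding
    layer 0; the notes do not define it.
    """
    p_final = p_obj[-1]
    previous = [p_before_first] + list(p_obj[:-1])
    return [(p - q) / p_final for p, q in zip(p_obj, previous)]


def causal_score(p_restored: Sequence[float], p_clean: float) -> List[float]:
    """(p(answer | h restored at layer) - p(answer | clean)) / p(answer | clean).

    p_restored[i] is the answer probability on the corrupted run after
    restoring h_subject at layer i (subject_last position only).
    """
    return [(p - p_clean) / p_clean for p in p_restored]
```

With `p_before_first = 0`, the contributions telescope and sum to 1, and the completeness of the final layer is 0 by construction. A causal score of 0 means restoring $h_{subject}$ at that layer fully recovers the clean-run probability.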