Prompt = The following is an improvement of the initial summary, which is factually correct, fluent, etc...
Feedback = This is incorrect.
1) -feedback, -rerank
2) -feedback, +rerank with sim(refinement, prompt)
3) +feedback, +rerank with sim(refinement, feedback)
4) +feedback, +rerank with sim(refinement, prompt and feedback)
a) +feedback, +rerank with sim(refinement, [prompt; feedback])
b) +feedback, +rerank with [sim(refinement, prompt) + $\alpha$ sim(refinement, feedback)], with $\alpha=1$ as default
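A minimal sketch of the reranking score in variant 4b may help make it concrete. The notes do not say how sim is computed, so `bow_cosine` below is a hypothetical stand-in (cosine similarity over bag-of-words counts), and the name `rerank` and its parameters are likewise assumptions. Any similarity function with the same signature could be passed in instead.

```python
import math
from collections import Counter


def bow_cosine(a: str, b: str) -> float:
    """Hypothetical stand-in for sim: cosine similarity of bag-of-words counts."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank(refinements, prompt, feedback, alpha=1.0, sim=bow_cosine):
    """Sort candidate refinements by sim(r, prompt) + alpha * sim(r, feedback)."""
    def score(r):
        return sim(r, prompt) + alpha * sim(r, feedback)
    # Highest score first; sorted() is stable, so ties keep their input order.
    return sorted(refinements, key=score, reverse=True)
```

Setting `alpha=0` reduces this to condition 2 (rerank by similarity to the prompt only), while a large `alpha` approaches condition 3 (rerank by similarity to the feedback only).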
$y^n = \{y^n_1, y^n_2, \dots\}$ where $y^n_i$ is the $i^{\text{th}}$ $n$-gram in $y$.
$f^n = \{f^n_1, f^n_2, \dots\}$
$y'^n = \{y'^n_1, y'^n_2, \dots\}$
Measure:
1) $\text{ROUGE}(f, y' - y)$ using the $n$-grams above, where $y' - y$ denotes the $n$-grams of $y'$ that do not appear in $y$
2) $\text{ROUGE}(f, y') - \text{ROUGE}(f, y)$
3) $\text{sim}(f, y') - \text{sim}(f, y)$
4) Human Evaluation: "Does $y'$ incorporate (some/all) of the feedback in $f$?"
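Measures 1 and 2 could be sketched as follows. This is a simplified illustration, not a standard ROUGE implementation: it assumes whitespace tokenization, no stemming, and ROUGE-N recall with the feedback $f$ as the reference. It also assumes that $y$ is the initial summary, $y'$ the refinement, and that $y' - y$ is the multiset difference of their $n$-grams. The function names are hypothetical.

```python
from collections import Counter


def ngrams(text: str, n: int) -> Counter:
    """Multiset of n-grams over lowercase whitespace tokens."""
    toks = text.lower().split()
    return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))


def rouge_n_recall(reference: Counter, candidate: Counter) -> float:
    """Clipped n-gram overlap divided by the number of reference n-grams."""
    total = sum(reference.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, candidate[gram]) for gram, count in reference.items())
    return overlap / total


def measure_1(f: str, y: str, y_prime: str, n: int = 1) -> float:
    """ROUGE(f, y' - y): overlap of the feedback with n-grams newly added in y'."""
    # Counter subtraction keeps only n-grams with a positive remaining count.
    added = ngrams(y_prime, n) - ngrams(y, n)
    return rouge_n_recall(ngrams(f, n), added)


def measure_2(f: str, y: str, y_prime: str, n: int = 1) -> float:
    """ROUGE(f, y') - ROUGE(f, y): change in overlap with the feedback."""
    f_grams = ngrams(f, n)
    return rouge_n_recall(f_grams, ngrams(y_prime, n)) - rouge_n_recall(f_grams, ngrams(y, n))
```

Measure 3 has the same shape as `measure_2` with a similarity function in place of the ROUGE recall. Note that very short feedback such as "This is incorrect." shares few content $n$-grams with any summary, so lexical measures like these may be uninformative in that case.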