# Improving Robustness in Paper–Reviewer Assignment for AAAI 2026
## Overview
Paper–reviewer assignment sits at the heart of the conference peer-review process. When done well, it ensures that submissions are evaluated by reviewers who are both qualified and willing to engage.
As conferences continue to grow in size, however, assignment algorithms face new challenges. Scale has increased dramatically, and with it the risk of strategic or coordinated behavior. Modern assignment methods must therefore do more than simply maximize reviewer–paper similarity: they must also be robust to manipulation, promote diversity among reviewers, and at the same time remain computationally efficient.
For AAAI 2026, we designed a new paper–reviewer assignment algorithm with these goals in mind.
___
### Problem Setting and Default Algorithm Used in AAAI 2025
In the standard setting:
- There is a set of **papers** $P$ with $n_p$ papers and a set of **reviewers** $R$ with $n_r$ reviewers.
- Each paper $p$ must be assigned exactly $\ell_p$ reviewers.
- Each reviewer $r$ can review at most $\ell_r$ papers.
- Assignments are represented by a **binary matrix** $x \in \{0,1\}^{n_p \times n_r}$, where:
- $x_{p,r} = 1$ means reviewer $r$ is assigned to paper $p$.
- A **similarity matrix** $S \in \mathbb{R}_{\ge 0}^{n_p \times n_r}$ measures how suitable a reviewer is for a paper. Each entry $S_{p,r}$ represents the predicted quality of reviewer $r$'s review of paper $p$.
- The overall quality of an assignment is defined as the **sum of similarities** over all assigned paper–reviewer pairs:
$$
\text{Quality}(x) = \sum_{p,r} x_{p,r} S_{p,r}
$$
A commonly used paper assignment algorithm, also used in AAAI 2025, solves the following integer linear program, which maximizes assignment quality subject to workload constraints (the constraint matrix is totally unimodular, so the linear relaxation already has an integral optimal solution):
$$
\begin{aligned}
\text{Maximize} \quad & \sum_{p,r} x_{p,r} S_{p,r} \\
\text{Subject to} \quad
& \sum_r x_{p,r} = \ell_p \quad && \forall p \in P \\
& \sum_p x_{p,r} \le \ell_r \quad && \forall r \in R \\
& x_{p,r} \in \{0,1\} \quad && \forall p \in P, r \in R
\end{aligned}
$$
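As an illustration, the program above can be solved with an off-the-shelf LP solver; since the constraint matrix is totally unimodular, solving the linear relaxation and rounding suffices. A minimal sketch using `scipy.optimize.linprog` (the function name and toy data are ours, not the AAAI pipeline):

```python
import numpy as np
from scipy.optimize import linprog

def assign_papers(S, l_p, l_r):
    """Maximize total similarity: each paper gets exactly l_p reviewers,
    each reviewer at most l_r papers."""
    n_p, n_r = S.shape
    c = -S.ravel()  # linprog minimizes, so negate the similarities
    # Each paper p gets exactly l_p reviewers: sum_r x[p, r] = l_p
    A_eq = np.kron(np.eye(n_p), np.ones((1, n_r)))
    b_eq = np.full(n_p, l_p)
    # Each reviewer r reviews at most l_r papers: sum_p x[p, r] <= l_r
    A_ub = np.kron(np.ones((1, n_p)), np.eye(n_r))
    b_ub = np.full(n_r, l_r)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
    # Total unimodularity makes the LP optimum integral, so rounding is safe.
    return res.x.reshape(n_p, n_r).round().astype(int)

# Toy instance: 2 papers, 3 reviewers, 2 reviewers per paper, cap of 2 per reviewer
S = np.array([[0.9, 0.1, 0.5],
              [0.2, 0.8, 0.6]])
x = assign_papers(S, l_p=2, l_r=2)
```

A dense matrix formulation like this is only for exposition; at the scale of tens of thousands of papers and reviewers, sparse constraint matrices or a min-cost-flow formulation would be used instead.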
## Assignment Process for AAAI 2026
### Two-Phase Matching
As with prior years, AAAI 2026 employed a two-phase reviewer assignment process designed to balance review quality, efficiency, and scalability.
In Phase 1, we matched 22,495 papers to a pool of 24,854 reviewers, assigning each paper:
- one Senior Program Committee (SPC) member,
- one reciprocal Program Committee (PC) member, and
- two non-reciprocal PC members.
Papers that received overwhelmingly negative feedback at this stage could be rejected early, allowing the review process to focus attention on more competitive submissions.
In Phase 2, additional reviewers were assigned as needed. The number and seniority of these reviewers depended on outcomes from Phase 1, including review quality, reviewer availability, and the need for additional expertise.
Overall, the two-phase structure helps filter out weaker submissions early, enables more efficient use of reviewer resources, and provides valuable flexibility in how reviewers are assigned as the process evolves.
### Similarity Score Computation
The similarity matrix $S$ is computed from two sources: content-based scores and bids. The content-based scores were computed using a text similarity model comparing the paper’s text with the reviewer’s past work on OpenReview, and normalized to be within $[0, 1]$.
Reviewer bids were incorporated into the similarity score using the following transformation:
$$
S_{p,r} = (\text{content-based score})^{\text{bidscore}}
$$
Bid-score exponents of **20**, **1**, **0.67**, **0.4**, and **0.25** correspond to the categories *not willing*, *not entered*, *in a pinch*, *willing*, and *eager*, respectively. Because content-based scores lie in $[0, 1]$, exponents above 1 penalize assignments to unwilling reviewers, while exponents below 1 amplify scores for reviewers who express interest.
For Phase 2 of the assignment, we realized that the text similarity model in OpenReview does not fully capture subject-area information. We therefore used
$$\text{content-based score} = 0.7 \cdot \text{text similarity score} + 0.3 \cdot \text{subject area score},$$
where the subject-area information was scraped from OpenReview.
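The score computation above can be sketched as follows (the function names and example values are ours; the exponents and blend weights are those from the post):

```python
# Bid-score exponents: values < 1 amplify a score in [0, 1], while the
# large "not willing" exponent drives the score toward 0.
BID_EXPONENT = {"not willing": 20, "not entered": 1,
                "in a pinch": 0.67, "willing": 0.4, "eager": 0.25}

def content_based_score(text_sim, subject_sim):
    # Phase 2 blend; Phase 1 used the text similarity score alone.
    return 0.7 * text_sim + 0.3 * subject_sim

def similarity(content_score, bid):
    # S_{p,r} = (content-based score) ** bidscore, which stays in [0, 1]
    return content_score ** BID_EXPONENT[bid]
```

For example, a content-based score of 0.5 is left unchanged by *not entered*, raised above 0.5 by *eager*, and driven to nearly zero by *not willing*.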
## A More Robust Assignment Algorithm used in AAAI 2026
Due to an increasing need for robustness against manipulative bids, as well as the increased scale of AAAI 2026, we adopted a new assignment algorithm [(Cui et al., 2026)](http://arxiv.org/abs/2601.14402). It builds on a randomized assignment method that maximizes a concave, perturbed similarity objective subject to standard load and conflict constraints [(Xu et al., 2024)](https://arxiv.org/abs/2310.05995). In addition, the algorithm incorporates *soft constraints* that explicitly encode several desiderata aimed at improving robustness, described below.
The paper assignment problem is formulated as a **fractional optimization program** similar to the Default Algorithm mentioned above. Here, however, each $x_{p,r} \in [0, Q]$ represents the probability of assigning reviewer $r$ to paper $p$, with $Q \in (0,1]$ upper-bounding marginal assignment probabilities.
A **concave, nondecreasing perturbation function** $f : [0,Q] \rightarrow \mathbb{R}$ is applied to encourage randomized assignments.
The optimization maximizes a combination of similarity-based matching quality and additional soft objectives:
$$
\max_{x, s}\;
\sum_{p \in P}\sum_{r \in R} S_{p,r}\, f(x_{p,r})
\;+\;
\sum_{k=1}^{K} O^k(x, s^k)
$$
where the first term captures reviewer–paper similarity and each $O^k$ represents a soft objective (e.g., diversity or anti-collusion) with auxiliary variables $s^k$. Additional engineering techniques (e.g., piecewise-linear approximations of $f$) keep the program fast enough to solve at conference scale.
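As an illustration of the piecewise-linear trick (our own sketch, with $f(x) = \sqrt{x}$ as a stand-in for the actual perturbation function): a concave, nondecreasing $f$ on $[0, Q]$ can be represented as the minimum of a few secant lines, which an LP models exactly with one auxiliary variable per term.

```python
import numpy as np

def secant_pieces(f, Q, k):
    """Approximate a concave, nondecreasing f on [0, Q] with k secant lines."""
    xs = np.linspace(0.0, Q, k + 1)
    ys = f(xs)
    slopes = np.diff(ys) / np.diff(xs)
    intercepts = ys[:-1] - slopes * xs[:-1]
    return slopes, intercepts

def f_approx(slopes, intercepts, x):
    # For concave f, the chord whose interval contains x lies below f while
    # every other chord, extended, lies above it, so the minimum selects
    # the correct piece and slightly underestimates f between breakpoints.
    return np.min(slopes * x + intercepts)

slopes, intercepts = secant_pieces(np.sqrt, Q=1.0, k=4)
```

In the optimization itself, each term $S_{p,r} f(x_{p,r})$ is then replaced by an auxiliary variable constrained to lie below all of the secant lines, keeping the program linear.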
Overall, the formulation subsumes standard similarity-based matching while naturally extending it to support **randomization, diversity, and anti-collusion objectives** within a single optimization framework. Concretely, the soft objectives encode three desiderata:
- **Reviewer diversity**
Encourages each paper to be reviewed by individuals from different geographic regions. This helps reduce correlated biases and groupthink, improves the breadth of feedback, and makes the overall matching less susceptible to coordinated behavior.
- **Coauthorship penalty**
Discourages assigning reviewers with prior collaborations to the same paper, even when no formal conflict of interest exists. Past collaborators often share perspectives and incentives, which can undermine the independence of reviews.
- **Bid-based 2-cycle penalty**
Reduces reciprocal reviewing arrangements in which two reviewers bid positively on and are assigned to each other’s papers. Such arrangements incentivize strategic bidding and pose a clear risk to the integrity of the review process.
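The bid-based 2-cycle check can be sketched as follows (a toy data layout of our own, not the production code):

```python
from itertools import combinations

def count_bid_two_cycles(assigned, authored, positive_bids):
    """Count reviewer pairs (a, b) where each reviews, and positively
    bid on, a paper authored by the other."""
    def reviews_with_bid(a, b):
        # a is assigned to, and positively bid on, some paper authored by b
        return any(p in assigned[a] and (a, p) in positive_bids
                   for p in authored.get(b, ()))
    return sum(1 for a, b in combinations(assigned, 2)
               if reviews_with_bid(a, b) and reviews_with_bid(b, a))

# Toy example: r1 and r2 bid on and review each other's papers; r3 does not
assigned = {"r1": {"p2"}, "r2": {"p1"}, "r3": {"p1"}}
authored = {"r1": {"p1"}, "r2": {"p2"}}
positive_bids = {("r1", "p2"), ("r2", "p1")}
n_cycles = count_bid_two_cycles(assigned, authored, positive_bids)
```

In the optimization, assignments contributing to such cycles are penalized through a soft objective rather than counted after the fact.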
---
## Results
Each phase of the assignment completed in under **30 minutes**, demonstrating that the additional robustness constraints can be incorporated at scale without compromising operational feasibility.
### Phase 1 Results
| Metric | Ours | Default |
|------------------|-------|---------|
| Relative Quality | 0.972 | 1.000 |
| Coauthors | 158 | 1028 |
| 2-cycles | 0 | 950 |
| Diversity | 0.747 | 0.555 |
In Phase 1, the new algorithm retained **97.2% of the maximum achievable assignment quality** while completely eliminating bid-based 2-cycles. At the same time, reviewer diversity increased by over **34%**, indicating a substantial reduction in collusion risk with only a modest trade-off in similarity.
Similar trends were observed in Phase 2.
### Phase 2 Results
| Metric | Ours | Default |
|------------------|-------|---------|
| Relative Quality | 0.976 | 1.000 |
| Coauthors | 240 | 637 |
| 2-cycles | 0 | 65 |
| Diversity | 0.662 | 0.627 |
---
### Score Analysis
In Phase 1, we used similarity scores directly output by OpenReview and incorporated reviewer bids. Compared to the default assignment, our algorithm produced scores with a **slightly lower mean but higher variance**, reflecting a broader exploration of feasible high-quality assignments under robustness constraints.


*(Figures: similarity score distributions for Phase 1)*
In Phase 2, we modified how aggregate scores were computed. In addition to OpenReview similarities, we incorporated paper and reviewer subject-area information. Despite this change, the overall score distribution trends remained consistent with Phase 1.


*(Figures: similarity score distributions for Phase 2)*
---
### Reviewer Load
Reviewer load distributions under the new algorithm closely matched those of the default assignment, indicating that robustness improvements did not come at the cost of uneven or excessive reviewer workloads.


*(Figures: reviewer load histograms)*
---
### Bids Analysis

Across Phase 1, *willing* and *eager* bids accounted for roughly **30%** of all bids, with similar distributions across subject areas.
An important question is whether bidding still matters after introducing strong robustness constraints. To examine this, we computed, for each reviewer, the fraction of their assigned papers that they had bid positively on. Among **16,010 reviewers** with at least one positive bid, the median ratio was **1.0**, meaning that over half of these reviewers received only papers they had explicitly bid on.
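This statistic can be computed as follows (a sketch; the names and toy data are ours):

```python
def bid_coverage(assigned, positive_bids):
    """For each reviewer, the fraction of assigned papers they bid
    positively on."""
    return {r: sum((r, p) in positive_bids for p in papers) / len(papers)
            for r, papers in assigned.items() if papers}

# Toy example: r1 bid on one of two assigned papers, r2 on its only one
assigned = {"r1": {"p1", "p2"}, "r2": {"p3"}}
positive_bids = {("r1", "p1"), ("r2", "p3")}
ratios = bid_coverage(assigned, positive_bids)
```

A median of 1.0 over such ratios then means at least half the reviewers have coverage 1.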

*(Figures: bid ratio distributions)*
We also examined the distribution of bids by subject area, using Phase 2 data (subject-area information was available only in Phase 2). The distribution of bids is roughly uniform across subject areas.
*(Figure: bid distribution by subject area)*

### Subject Area Analysis
Subject-area analyses were conducted using Phase 2 data, as subject-area information was available only in that phase. Most submissions fell into Machine Learning and Computer Vision.
While average scores were broadly similar across areas, subject areas with more submissions tended to achieve slightly higher scores. This is expected, as larger areas typically have a denser pool of suitable reviewers.


*(Figures: subject area counts and score distributions)*
---
## Concluding Remarks
In this post, we introduced the paper assignment algorithm used for AAAI 2026 and explained how we implemented it. Our new algorithm substantially improves the robustness of large-scale paper–reviewer assignments, eliminating clear forms of strategic behavior and increasing diversity, while retaining nearly all of the assignment quality achieved by standard methods. For future conferences, we encourage reviewers to submit bids and to provide more information about their past work; this helps the algorithm compute better similarity and subject-area scores and thereby improves the matching.
## References
- Michael Cui, Chenxin Dai, Yixuan Even Xu, and Fei Fang.
**A Unified Framework for Scalable and Robust Paper Assignment.**
*arXiv preprint*, 2026.
https://arxiv.org/abs/2601.14402
- Yixuan Even Xu, Steven Jecmen, Zimeng Song, and Fei Fang.
**A One-Size-Fits-All Approach to Improving Randomness in Paper Assignment.**
*arXiv preprint*, 2024.
https://arxiv.org/abs/2310.05995