Xingyi Yang
## Common Response

We thank all reviewers for their comments and constructive suggestions. We would like to remark on our technical novelty as a common response. In this study, three modules are integrated to address the inconsistency issues in SSOD.

**Adaptive Sample Assignment** addresses the label-shifting phenomenon caused by the extra static label assignment in SSOD. The static label assignment breaks an important property that holds in semi-supervised classification. In classification, the instance-level pseudo-label satisfies

$$\hat{\mathbf y} = \mathop{\mathrm{argmin}}_{\hat{\mathbf y}}\mathcal L(f_t(\mathbf x^u), \hat{\mathbf y}^u)$$

meaning that the one-hot pseudo-label $\hat{\mathbf y}$ can be reapplied to the teacher model and aligns with its own prediction. This property is critical to semi-supervised learning, as it induces no bias or label noise even in the simplest scenario where the student model is identical to the teacher. However, this rule is easily broken in most previous SSOD methods, which adopt heuristic, static anchor assignments: the labels assigned to anchors differ greatly from the anchors' own predictions, which is the root of the pseudo-label drifting phenomenon in `Fig. 1`. Therefore, we propose to assign anchors so as to minimize the loss on unlabeled images:

$$\min_{a_1, \cdots, a_N} \sum_n^N \Big[\mathcal{L}_{cls}\big(f_s(\mathbf x^u)_n, \hat{\mathbf {y}}_{a_n}^u\big) + \mathcal{L}_{reg}\big(f_s(\mathbf x^u)_n, \hat{\mathbf {y}}_{a_n}^u\big)\Big]$$

where $n$ is the anchor index, $a_n \in \{1, 2, \cdots, L+1\}$ stands for the assigned pseudo-bbox index among the $L$ predicted bboxes, and the index $L+1$ represents the background label.
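The assignment above reduces to a per-anchor argmin over the combined losses. A minimal numpy sketch, assuming precomputed per-anchor, per-pseudo-bbox loss matrices (the function name and inputs are illustrative, not our actual implementation):

```python
import numpy as np

def adaptive_sample_assignment(cls_loss, reg_loss, bg_loss):
    """Toy ASA sketch: cls_loss and reg_loss are (N, L) matrices of
    per-anchor, per-pseudo-bbox losses; bg_loss is the (N,) background
    classification loss. Returns a_n in {0..L-1} for a pseudo-bbox,
    or L for the background label, minimizing the summed loss."""
    fg_cost = cls_loss + reg_loss                               # (N, L)
    # Background candidates carry no regression term.
    cost = np.concatenate([fg_cost, bg_loss[:, None]], axis=1)  # (N, L+1)
    return cost.argmin(axis=1)

# Two anchors, two pseudo-bboxes:
a = adaptive_sample_assignment(
    np.array([[1.0, 0.2], [0.5, 0.9]]),   # classification losses
    np.array([[0.5, 0.1], [0.2, 0.8]]),   # regression losses
    np.array([0.1, 1.0]))                 # background losses
# anchor 0 -> background (index 2), anchor 1 -> pseudo-bbox 0
```

In the real detector the losses are differentiable functions of the student's predictions; the sketch only shows the assignment rule itself.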
In this way, when the pseudo-bboxes are applied back to a student model identical to the teacher, the newly-assigned labels from the pseudo-bboxes align with the model's own predictions. This dynamic assignment strategy, dubbed adaptive sample assignment (ASA), is devised to reduce noise in the pseudo targets of anchors for semi-supervised object detection. We find that ASA takes a similar formulation to that proposed in supervised models such as [1], and thus apply ASA to both labeled and unlabeled images to make a unified detector for SSOD. In summary, although the ASA module is similar in form to [1], which is used in supervised object detection, it is motivated differently, to solve the pseudo-label drifting problem unique to SSOD. Moreover, we have compared the performance of ASA under both supervised and semi-supervised settings in `Tab. 4`, and found that the performance gain of ASA in SSOD is almost twice that in the supervised setting. The extra improvement comes from suppressing the label noise induced by static anchor assignment.

**Novelty in Feature alignment**. The feature alignment between classification and regression also addresses a problematic pseudo-bbox evaluation protocol that is unique to SSOD and absent from semi-supervised classification. Relying purely on the confidence score to output pseudo-bboxes yields unsatisfactory bboxes that are sensitive to input augmentations and model weight changes and tend to oscillate during training. Better alignment of the cls-reg tasks calibrates the classification score against the bbox quality. No prior work in SSOD has applied a feature alignment module to reduce the gap between high classification confidence and accurate bounding box prediction. Besides, [2] restricts its feature re-sampling to each scale, while alignment across multiple feature scales (FAM-3D) has not been introduced before, even in the general object detection task.
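To make the cross-scale part of FAM-3D concrete, here is a minimal numpy sketch of re-sampling a feature between two adjacent pyramid levels with a fractional offset $d_2$; the function name is illustrative, and nearest-neighbour upsampling stands in for the resize operator:

```python
import numpy as np

def fam3d_level_interp(p_l, p_l1, d2):
    """Sketch of fractional-level re-sampling: blend the feature map at
    pyramid level l (H x W) with level l+1 (H/2 x W/2) resized to H x W,
    weighted by the scale offset d2 in [0, 1]."""
    # Nearest-neighbour 2x upsampling of the coarser level.
    up = p_l1.repeat(2, axis=0).repeat(2, axis=1)
    # d2 = 0 keeps level l; d2 = 1 fully uses level l+1; in between,
    # a weighted average interpolates at the non-integer level offset.
    return (1.0 - d2) * p_l + d2 * up
```

Because the blend is a fixed linear combination of the two levels, gradients flow through both inputs, which is what makes the cross-scale re-sampling differentiable.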
Our 3D feature alignment is indeed new. As the first attempt to address the cls-reg inconsistency, the 1.0% mAP improvement (including the 0.4% mAP extra improvement of FAM-3D) on MS-COCO should be enough to highlight its significance. We have uploaded a new demo video in the `Supplementary Material` to illustrate the superiority of the FAM-3D module: without FAM-3D, the model presents high-score but low-quality noisy predictions, which FAM-3D suppresses.

**Novelty in GMM-based thresholding**. We make the first attempt to introduce a GMM to dynamically adjust the pseudo-label threshold in SSOD. It reduces the inconsistency, over the long training process, in which some labels are first treated as background but later as foreground, which incurs both inefficient learning and confirmation bias in SSOD. This GMM module is significant because it frees us from tedious finetuning of the threshold hyperparameter and also brings a performance boost.

Therefore, the three modules are closely tied to the SSOD setting and are integrated to solve the inconsistency issues. We achieve a truly compelling improvement over past SSOD methods, with 40.0 mAP using 10% of COCO labels and 47.2 mAP on COCO-additional evaluations. We believe this work provides another perspective for analyzing the unique inconsistency problems in SSOD and thus contributes to the community.

[1] Zheng Ge, Songtao Liu, Feng Wang, Zeming Li, and Jian Sun. YOLOX: Exceeding YOLO series in 2021.
[2] Chengjian Feng, Yujie Zhong, Yu Gao, Matthew R. Scott, and Weilin Huang. TOOD: Task-aligned one-stage object detection. ICCV 2021.

# Reviewer tBiG

We thank the reviewer for acknowledging our problem setup, and thanks again for considering our experiments comprehensive and convincing.

```>>> Q1``` The ASA algorithm is not applicable to anchor-free object detectors.

```>>> A1``` **Our ASA generalises to both anchor-based and anchor-free detectors**.
Anchor-free object detectors differ from anchor-based methods in the definition of "anchors": anchor-based methods predefine a set of anchor boxes on each feature map, while anchor-free methods regress bboxes from the center points of feature maps. In this study, we unify the notions of **anchor points** in anchor-free detectors and **anchor boxes** in anchor-based detectors as our `anchor` (`Footnote on page 4`). Our ASA algorithm is designed for adaptive anchor assignment and can be applied to both anchor-based and anchor-free methods by simply replacing their assignment modules.

```>>> Q2.1``` How does the FAM-3D module reduce the pseudo-label inconsistency?

```>>> A2.1``` We thank the reviewer for the question. In SSOD, bboxes are evaluated only according to their classification scores. Unfortunately, the classification score does not necessarily reflect the quality of the bounding box, resulting in false positives whose scores exceed the threshold but whose bboxes are unsatisfactory. In FAM-3D, the regression branch is enhanced, preventing high-confidence predictions with poor bounding box regression. Say we have a box with a large confidence score (above the threshold): a plain detection head may produce erroneous regression results, while the FAM-3D module provides a more accurate bounding box through flexible feature selection. In return, such high-score, high-quality pseudo boxes further refine the student with calibrated predictions. We have uploaded a new demo video in the `Supplementary Material` for your reference.

```>>> Q2.2``` How do we know that the improvement shown in Table 5 is solely due to the better consistency brought by FAM-3D?

```>>> A2.2``` Thanks for your question. As illustrated in `Figure 6`, FAM-3D reduces the mis-calibration between classification score and bbox IoU, such that the classification score serves as a better indicator for pseudo-box filtering. `Table 5` also compares the relative improvement of FAM-3D under both semi-supervised and fully-supervised settings.
The improvement under the semi-supervised setting is almost twice that under the fully-supervised setting, implying the extra benefit of FAM-3D in SSOD.

```>>> Q2.3``` Is the process in Eq. 4 differentiable?

```>>> A2.3``` Yes, both Eq. 3 and Eq. 4 are fully differentiable. For $0 \leq d_2\leq 1$, the features $\mathbf P(:, :, l+1)$ at level $l+1$ are rescaled to the same size as $\mathbf P(:, :, l)$, and then a weighted average of the resized $\mathbf P(:, :, l+1)$ and $\mathbf P(:, :, l)$ is taken according to $d_2$ to interpolate at non-integer offsets. For example, when $d_2=0.5$, we re-sample as

$$\mathbf P'(i, j, l) = 0.5 \times \text{resize}(\mathbf P)(i, j, l+1) + 0.5 \times \mathbf P(i, j, l)$$

By the way, the notation $\mathbf P$ represents a pyramid of feature maps, with $\mathbf P(i, j, l)$ standing for the feature at planar coordinate $(i, j)$ ($1\le i\le H_l$ and $1 \le j \le W_l$) in the $l^{\mathrm{th}}$ pyramid level.

```>>> Q3``` A principled method to address the pseudo-label drifting issue.

```>>> A3``` Thanks for the nice question. All designs are closely bound to a single core problem: "**How can we reduce the pseudo-label inconsistency in SSOD?**", as depicted in the common response. In summary, ASA reduces the target shifting with consistent assignment; FAM-3D calibrates the classification score against the bbox accuracy; the GMM alleviates the fluctuation of the confidence score, thus stabilizing the number of pseudo targets during SSOD training. We have elaborated, in the adaptive anchor assignment method, our motivation to solve this issue.

# Reviewer eai7

We thank the reviewer for the constructive comments and would like to address them as follows.

`>>> Q1` Technical novelty

`>>> A1` We thank Reviewer eai7 for the question. Our technical contribution is described in the common response, and we summarize it again:

1. **Novelty in the problem definition**.
We believe that our biggest contribution is the formal introduction of three inconsistency problems in SSOD. We define them, point out their existence, and bring up new solutions.

2. **Novelty in adaptive sample assignment**. Our adaptive sample assignment is specially designed for SSOD. The pseudo-label drifting issue is explained in the general reply (also see `Figure 1 in the paper`), and we have elaborated our motivation in the adaptive anchor assignment method to solve this issue. We will revise our submission to highlight this point. Coincidentally, our approach shares a similar form with standard adaptive assignment in the fully supervised scenario, but the two methods are designed for different problem setups and with different motivations.

3. **Novelty in Feature alignment**. No prior work in SSOD has applied a feature alignment module to reduce the gap between classification confidence and bounding box accuracy. Besides, feature alignment across multiple feature scales (FAM-3D) has not been introduced before, even in the general object detection task. Our 3D feature alignment is indeed new.

4. **Novelty in GMM-based thresholding**. We make the first attempt to introduce a GMM to dynamically adjust the pseudo-label threshold in SSL.

In sum, all of our problem setups and techniques are, for the first time, introduced to the SSOD task. Our proposed methods bring about a ~3 mAP improvement on the MS-COCO dataset, which should also be considered a strong and practical contribution. We kindly hope the reviewer will give due consideration to our novelty.

`>>> Q2` Relations between the introduced modules

`>>> A2` Thanks for the question.
All designs are closely bound to a single core problem: "**How can we reduce the inconsistency in SSOD?**" ASA reduces the target shifting with consistent assignment; FAM-3D calibrates the classification score against the bbox accuracy; the GMM alleviates the fluctuation of the confidence score, thus stabilizing the number of pseudo targets during SSOD training. The three modules address three different aspects of a single problem, with sufficient evidence (`Fig. 3-6`) and significant improvements (`Table 4`, `Fig. 7-8`).

`>>> Q3` Writing improvements

`>>> A3` We sincerely thank the reviewer for the suggestion. We will soon revise our manuscript for better presentation and language quality. Stay tuned.

# Reviewer 6uS6

We sincerely thank the reviewer for the insightful and constructive comments. We are glad that the reviewer finds our work well written and clearly presented. The concerns are fully addressed as follows.

```>>> Q1``` Highlight the novelty and contributions

```>>> A1``` We thank the reviewer for the question. Our methodological novelty and contributions are summarized below:

1. **Novelty in the problem definition**. We believe that our biggest contribution is the formal introduction of three inconsistency problems in SSOD. We define them, point out their existence, and bring up new solutions. A good question is worth a million good answers.

2. **New solution to improper pseudo-label assignment**. Our adaptive sample assignment is specially designed for SSOD. The pseudo-label drifting issue is explained in the general reply (also see `Figure 1 in the paper`), and we have elaborated our motivation in the adaptive anchor assignment method to solve this issue. We have revised our submission to highlight this point. Coincidentally, our approach shares a similar form with the standard adaptive assignment [1][2][3] in the fully supervised scenario, but our method is designed with a completely different problem setup and motivation.
We are also pleasantly surprised to find that our ASA, though simple, sufficiently addresses the assignment shifting in SSOD.

3. **Novelty in Feature alignment**. No prior work in SSOD has applied a feature alignment module to reduce the gap between high classification confidence and accurate bounding box prediction. Besides, [4] restricts its re-sampling to each scale, while alignment across multiple feature scales (FAM-3D) has not been introduced before, even in the general object detection task. Our 3D feature alignment is indeed new. As the first attempt to address the cls-reg inconsistency, the 0.3% - 0.4% mAP improvement on MS-COCO should be enough to highlight its significance.

4. **Novelty in GMM-based thresholding**. We make the first attempt to introduce a GMM to dynamically adjust the pseudo-label threshold in SSL.

In sum, all of our problem setups and techniques are, for the first time, introduced to the SSOD task. Our proposed methods bring about a ~3 mAP improvement on the MS-COCO dataset, which should also be considered a strong contribution. We kindly hope the reviewer will give due consideration to our contribution.

[1] Zheng Ge, Songtao Liu, Zeming Li, Osamu Yoshie, and Jian Sun. OTA: Optimal transport assignment for object detection. CVPR 2021.
[2] Zheng Ge, Songtao Liu, Feng Wang, Zeming Li, and Jian Sun. YOLOX: Exceeding YOLO series in 2021.
[3] Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, and Sergey Zagoruyko. End-to-end object detection with transformers. ECCV 2020.
[4] Chengjian Feng, Yujie Zhong, Yu Gao, Matthew R. Scott, and Weilin Huang. TOOD: Task-aligned one-stage object detection. ICCV 2021.

# Reviewer 4adR

```>>> Q1``` Definition of $j$ in Equation 2

```>>> A1``` Sorry for the typo. $C_{ij}$ in Eq. 2 should be written as $C_{il}$, which represents the matching cost between the $i$-th anchor/feature point and the $l$-th gt bbox. We have revised Eq. 2 accordingly.

```>>> Q2``` Effect of $\lambda_{dist}$.
```>>> A2``` We thank R4adR for the question. Through our experiments, $\lambda_{dist}=0.001$ serves to stabilize training as a weak centerness prior. We provide the results for $\lambda_{dist}\in \{0.001, 0.002, 0.01\}$. For $\lambda_{dist}<0.001$, the assignment is highly unstable, especially at the beginning of training when the matching cost is high and inaccurate, which results in a very low mAP. When $\lambda_{dist}$ is large, the centerness prior cancels out the performance benefit of our proposed ASA.

| $\lambda_{dist}$ | <0.001 | 0.001 | 0.002 | 0.01 |
|--|--|--|--|--|
| mAP | 35.0 $\pm$ 5 | 40.0 | 39.8 | 39.2 |

```>>> Q3``` Name for adaptive thresholding

```>>> A3``` Thanks for the suggestion. We have revised the name from `temporal consistency` to `thresholding consistency`.

```>>> Q4``` Connection between Figure 1 and the three contributions

```>>> A4``` We thank the reviewer for the question. We believe that Figure 1 highlights our three motivations clearly.

1. **Assignment inconsistency**. In the right figures, the IoU-based strategy may make erroneous assignments, with a bbox even assigned to anchors far away from the real object and with high loss values (red dots far from the polar bear). Our ASA, on the other hand, consistently assigns the bbox to features close to the object center with low matching cost.

2. **Cls-Reg inconsistency**. In the left figures, the classification loss quickly decreases and overfits, while the regression objective cannot be properly optimized, showing the imbalanced optimization of the two tasks. Our feature alignment re-samples the features for the regression branch, thus balancing the two tasks with faster convergence.

3. **Temporal inconsistency**. In the right figures, there are bound to be more and more predicted bounding boxes, given a fixed score threshold, as training goes on. Therefore, false positives are inevitable to some extent.
For example, the bounding boxes of the polar bear undergo constant updates throughout training, even with multiple pseudo labels for the same object. Our GMM approach (Row 2), on the other hand, reduces the abnormal pseudo targets, as the increasing adaptive threshold (shown in Fig. 5) suppresses false predictions and thus results in superior performance.

`Figure 1` thus clearly states all our motivations. If anything is still unclear, please leave messages here and we will try our best to address them.
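As a closing illustration of the GMM-based thresholding discussed throughout this response, here is a minimal numpy sketch that fits a two-component 1-D Gaussian mixture to confidence scores with EM and reads off an adaptive threshold; the initialization, iteration count, and grid search are illustrative choices, not our exact training recipe:

```python
import numpy as np

def gmm_threshold(scores, iters=50):
    """Fit a two-component 1-D Gaussian mixture to confidence scores
    with EM, then return the score where the posterior of the
    higher-mean (foreground) component first reaches 0.5."""
    s = np.asarray(scores, dtype=float)
    # Initialize the two components at low/high score quantiles.
    mu = np.array([np.quantile(s, 0.25), np.quantile(s, 0.75)])
    var = np.array([s.var() + 1e-6, s.var() + 1e-6])
    pi = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: responsibility of each component for each score.
        dens = pi / np.sqrt(2 * np.pi * var) * np.exp(
            -(s[:, None] - mu) ** 2 / (2 * var))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update mixture weights, means, and variances.
        nk = resp.sum(axis=0)
        pi = nk / len(s)
        mu = (resp * s[:, None]).sum(axis=0) / nk
        var = (resp * (s[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
    fg = mu.argmax()  # higher-mean component = foreground scores
    # Scan a score grid for the first point with fg-posterior >= 0.5.
    grid = np.linspace(s.min(), s.max(), 200)
    dens = pi / np.sqrt(2 * np.pi * var) * np.exp(
        -(grid[:, None] - mu) ** 2 / (2 * var))
    post = dens[:, fg] / dens.sum(axis=1)
    return grid[np.argmax(post >= 0.5)]
```

Because the threshold is re-derived from the current score distribution, it rises as the detector's confidences sharpen over training, which is the behavior shown in Fig. 5.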
