# NeurIPS Rebuttal

---

## Reviewer 1: WA(6), Confidence(4)

We deeply appreciate the reviewer's valuable comments and concerns. We hope that the concerns can be resolved through our clarifications in this rebuttal.

`Q1. Is it necessary to train the MQ-Net in each round? What if we train it in the first round (or first several rounds) and fix it in the remaining rounds?`

The goal of MQ-Net is to adaptively find the best trade-off between purity and informativeness throughout the entire AL period, since the optimal balance varies with the learning stage of the target classifier. Thus, if we fixed MQ-Net after the first round (or first several rounds), it would no longer adjust this trade-off, leading to a suboptimal result. For example, if MQ-Net emphasized purity over informativeness at the first round and kept sticking to that policy, many informative examples would be ignored in query selection at later AL rounds.

`Q2. The performance comparison between MQ-Net and other simple alternatives is not discussed. How do MQ-Net's performances differ from those simple alternatives?`

This is a very good point. As shown in Figure 4, the best trade-off between purity and informativeness differs by the learning stage and the open-set noise ratio. If we used heuristic rules as simple alternatives, we would have to search for the best rules every time we train a new classifier on a new dataset. In contrast, as shown in Figure 3, MQ-Net successfully finds the best trade-off throughout the learning stages with varying OOD ratios by leveraging our meta-learning framework. This flexibility of MQ-Net is a clear advantage over the simple alternatives.

`Q3. The architecture of MQ-Net is not clearly stated. The activation functions are not reported, and the layer number is only reported in the appendix.`

We appreciate the reviewer for pointing out this important issue. We used a shallow MLP architecture with 2 layers, a hidden dimension of 64, and the ReLU activation function. We clarified these details of the architecture in Section 5.1 with the R1Q3 mark.
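For concreteness, a minimal PyTorch sketch of this configuration (2 layers, hidden dimension 64, ReLU, scalar meta-score output) is shown below; the class and variable names are illustrative only and are not taken from our released code.

```python
import torch
import torch.nn as nn

class MQNetSketch(nn.Module):
    """Illustrative 2-layer MLP meta-model: <P(x), I(x)> -> scalar meta-score."""
    def __init__(self, input_dim: int = 2, hidden_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),  # layer 1
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),          # layer 2: scalar meta-score
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, 2) = [purity score P(x), informativeness score I(x)]
        return self.net(z).squeeze(-1)
```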
`Q4. The multi-score version is intuitive by adding more input dimensions to MQ-Net. Furthermore, when more scores are used, there would be a score selection problem in MQ-Net. The solution to that problem would increase the quality of the paper.`

Thank you very much for helping us improve our paper. Although using multiple (more than two) scores is a very interesting topic, it seems to be beyond the scope of this paper. We leave this topic for potential future work.

---

## Reviewer 2: R(3), Confidence(4)

## TODO: Q2-2 (done?), Q3-1 (done?), Q3-2(2) (done?), Q4 (done?), Q8 (done?)

We deeply appreciate the reviewer's valuable comments and reasonable concerns. We hope that they can be resolved through our clarifications in this rebuttal.

#### Major Concerns (Q1-Q3)

`Q1. About problem definition: There is no need to create a new concept and call it an open-set noise problem. This concept is wider. For instance, some instances belong to ID but contain noise in x. It is not OOD data but it still contains noise.`

Yes, we deal with the IN/OOD problem, the same setting as in CCAL and SIMILAR. In fact, "open-set noise" is frequently used as a **synonym** of "out-of-distribution (OOD)" data in the machine learning literature on open-set recognition [Salehi et al., 2021], OOD detection [Yang et al., 2021], open-set noisy label handling [Wei et al., 2021; Wang et al., 2018], and open-set semi-supervised learning [Saito et al., 2021; Yu et al., 2020; Huang et al., 2021]. In particular, [Saito et al., 2021; Yu et al., 2020; Huang et al., 2021], in which OOD examples are mixed with **clean** IN examples, use the term "**open-set** semi-supervised learning" just like our paper. When noise in IN examples is addressed as well, it is common to specify closed-set noise and open-set noise together (e.g., see [Sachdeva et al., 2021]), which is beyond the scope of this paper. Overall, we have not created a new, wider concept or setting. Following your advice, we will clarify that closed-set noise is not involved in our method.

- [Salehi et al., 2021] "A Unified Survey on Anomaly, Novelty, Open-Set, and Out-of-Distribution Detection: Solutions and Future Challenges," arXiv preprint arXiv:2110.14051, 2021.
- [Yang et al., 2021] "Generalized Out-of-Distribution Detection: A Survey," arXiv preprint arXiv:2110.11334, 2021.
- [Wei et al., 2021] "Open-Set Label Noise Can Improve Robustness Against Inherent Label Noise," In NeurIPS, 2021.
- [Wang et al., 2018] "Iterative Learning with Open-Set Noisy Labels," In CVPR, 2018.
- [Saito et al., 2021] "OpenMatch: Open-Set Semi-Supervised Learning with Open-Set Consistency Regularization," In NeurIPS, 2021.
- [Yu et al., 2020] "Multi-Task Curriculum Framework for Open-Set Semi-Supervised Learning," In ECCV, 2020.
- [Huang et al., 2021] "Trash to Treasure: Harvesting OOD Data with Cross-Modal Matching for Open-Set Semi-Supervised Learning," In ICCV, 2021.
- [Sachdeva et al., 2021] "EvidentialMix: Learning with Combined Open-Set and Closed-Set Noisy Labels," In WACV, 2021.

`Q2-1. About motivation of the purity-informativeness dilemma: Figure 1 is not convincing enough since LL4AL and CCAL are both non-typical methods and they are not comparable. Also, the example only shows the first 10 rounds and shows low-noise (10% and 20% OOD rate) cases.`

LL is indeed a representative HI-focused method in that it only uses the predicted loss as the informativeness score, with no purity score combined. CCAL is also a representative HP-focused method since it carefully incorporates the purity score by using CSI. Thus, they are comparable from the perspective of showing the dominance between the HI-focused method and the HP-focused method throughout the learning stages (i.e., AL rounds). Our 10-round experiment is a quite typical setting in the AL literature [Yoo et al., 2019; Moon et al., 2020]. For a higher OOD rate (e.g., 30%), a similar trend was observed, where the cross point appeared at a later round. Following your suggestion, we replaced Figure 1(c) with the plot for a 30% OOD rate (see the revised paper).

- [Yoo et al., 2019] "Learning Loss for Active Learning," In CVPR, 2019.
- [Moon et al., 2020] "Confidence-Aware Learning for Deep Neural Networks," In ICML, 2020.

`Q2-2. About motivation of the purity-informativeness dilemma: Why can't we maintain purity all the time and at the same time acquire high informativeness? If there is an ideal method, the proportion of OOD samples in the unlabeled data pool will naturally be higher and higher, and more attention should be attached to purity.`

Usually, the purity score favors examples for which the model exhibits high confidence (i.e., is certain in its prediction), while the informativeness score favors examples for which the model exhibits low confidence (i.e., is uncertain in its prediction). That is, an opposite trend between the two scores is natural; e.g., if an example shows a high purity score, then its informativeness score is likely to be low. Therefore, it is very difficult to achieve high purity and high informativeness all the time. Favoring high purity over the AL rounds is not challenging, because we can select a query set with high purity by including only easy examples from the unlabeled set. However, at later rounds, this strategy does not yield a significant gain in model performance due to the low informativeness of the selected set. We empirically observed that, as the model performance increases, 'fewer but highly informative' examples are more impactful than 'more but less informative' examples in terms of improving the model performance. Therefore, it is necessary to emphasize informativeness in later AL rounds, even at the risk of choosing OOD examples.

For the second question, the size of the unlabeled set is assumed to be considerably larger than those of the query set and the labeled set, e.g., vast amounts of unlabeled images collected by web crawling. Then, even with an ideal method, the proportion of OOD examples in the unlabeled pool would only slightly increase throughout the AL rounds.

`Q3-1. About experiment results: There is no error bar. The author didn't provide the code.`

We added the error bars and standard deviations in Figure 3 and Table 8 in the revised version. We apologize for omitting our source code and now provide it at [the link](https://anonymous.4open.science/r/MQNet-43E6/) (updated in the revised paper with the R2Q3 mark). For implementing SIMILAR and CCAL, we used the source code available at their official GitHub links (SIMILAR: https://github.com/decile-team/distil and CCAL: https://github.com/RUC-DWBI-ML/CCAL).

`Q3-2. About experiment results: The experimental results on CCAL and SIMILAR are strange in low-noise situations (10% and 20% OOD rate), where they are even worse than a typical uncertainty-based sampling strategy (e.g., CONF). The SIMILAR authors said, "If there is low noise then it should only be less challenging and the performance should at least be consistent and better than MARGIN." The CCAL authors showed me their new experiments on low-noise data scenarios, also better than typical uncertainty-based measures. Is it a fair comparison?`

We appreciate the reviewer for these careful comments and answer your questions from two perspectives.

(1) We would like to clarify that a low performance of CCAL is also reported in their original paper [Du et al., 2021]. See the left-most plot (20% OOD rate) of Figure 4 for CIFAR-10: the accuracy of CCAL is lower than those of several baselines. For your convenience, here is [the link](https://bit.ly/3cXQ3Cm) to Figure 4. For SIMILAR, a low OOD rate was not considered in their original paper [Kothawade et al., 2021]. We are not aware of the new experimental results which the reviewer received from the CCAL authors, and we are not sure whether such unpublished, private communication can be considered for the review.

- [Kothawade et al., 2021] "SIMILAR: Submodular Information Measures Based Active Learning in Realistic Scenarios," In NeurIPS, 2021.
- [Du et al., 2021] "Contrastive Coding for Active Learning under Class Distribution Mismatch," In ICCV, 2021.

(2) Moreover, at the initial phase of our work, we had thought that a low OOD rate would be less challenging, just as the SIMILAR authors thought. However, it turned out that our initial thought was wrong, for the following reason. In the low-noise case, standard AL methods such as CONF and MARGIN can query many IN examples **even without careful consideration of purity**. As shown in Table R1, with 10% noise, the ratio of IN examples in the query set reaches 75.2% at the last AL round for CONF. This number is fairly similar to 88.4% and 90.2% for CCAL and SIMILAR, respectively. In contrast, the difference between CONF and CCAL or SIMILAR becomes much larger (i.e., from 16.2% to 41.8% or 67.8%) in the high-noise case (60% noise in Table R2). Therefore, especially in the low-noise case, the two purity-based methods, SIMILAR and CCAL, run the risk of **overly selecting less-informative IN examples** for which the model already shows high confidence, leading to lower generalization performance than the standard AL methods. Putting these two facts together, we believe that the low performance of CCAL and SIMILAR at a low OOD rate is not strange, and we hope that our analysis is persuasive.

Table R1: Accuracy and ratio of IN examples in the query set for our split-dataset experiment on CIFAR-10 with open-set noise of **10%**, where $\% Q_{in}$ means the percentage of IN examples in a query set.

| Method | Round | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|:------:|:-----:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| CONF | ACC |62.3|74.8|80.8|84.5|86.8|89.0|90.6|91.5|92.4|92.8|
| | $\% Q_{in}$ |87.6|82.2|80.8|79.0|75.2|76.2|74.0|74.6|74.0|**75.2**|
| CCAL | ACC |61.2|71.8|78.2|82.3|85.0|87.0|88.2|89.2|89.8|90.6|
| | $\% Q_{in}$ |89.0|88.4|89.2|88.6|89.6|88.8|90.4|88.0|88.6|**88.4**|
| SIMILAR | ACC |63.5|73.5|77.9|81.5|84.0|86.3|87.6|88.5|89.2|89.9|
| | $\% Q_{in}$ |91.4|91.0|91.6|92.6|92.6|91.4|92.2|90.6|90.8|**90.2**|

Table R2: Accuracy and ratio of IN examples in the query set for our split-dataset experiment on CIFAR-10 with open-set noise of **60%**, where $\% Q_{in}$ means the percentage of IN examples in a query set.

| Method | Round | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|:------:|:-----:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| CONF | ACC |56.1|65.2|69.6|73.6|76.3|80.3|81.6|83.7|84.9|85.4|
| | $\% Q_{in}$ |37.4|32.2|28.2|25.4|25.6|20.0|20.8|17.0|18.0|**16.2**|
| CCAL | ACC |56.5|67.0|72.2|76.3|80.2|82.9|84.6|85.7|86.6|87.5|
| | $\% Q_{in}$ |42.0|38.6|39.8|41.2|38.6|42.2|42.2|40.4|42.2|**41.8**|
| SIMILAR | ACC |57.6|67.6|72.0|75.7|79.7|82.2|84.2|85.9|86.8|87.4|
| | $\% Q_{in}$ |56.0|61.0|67.2|66.6|67.4|67.2|68.0|67.1|68.2|**67.8**|

#### Minor Questions (Q4-Q11)

`Q4. In lines 63-64, learning is for updating the META model to better output Φ(<P(x),I(x)>; w), instead of using it to get better P(x) and I(x). It feels like it is just to train a classifier, but in deep learning tasks, the feature representation and classifier are jointly trained.`

Because $P(x)$ and $I(x)$ are obtained from the classifier, $P(x)$ and $I(x)$ get better as the training progresses (with more labeled examples). Of course, the meta-model $\Phi(\langle P(x), I(x) \rangle; w)$ is also improved to produce better prioritization. Thus, the score functions and the meta-model are improved together, as you precisely expect.

`Q5. In line 95, why choose a classifier-dependent approach to get a meta-model? What is the motivation?`

In fact, we did not choose a classifier-dependent approach. Line 95 is just the introduction of OSR methods in the related work section.

`Q6. In lines 277-288, why did you use CSI and LL for purity and informativeness scores, respectively?`

As shown in Section 5.4, we used CONF and LL as the informativeness scores, and CSI and ReAct as the purity scores. We chose the combination of LL and CSI as the default setting of MQ-Net, since it shows good overall accuracy, as reported in Table 2. The other combinations also showed better accuracy than the baselines.

`Q7. Is MQ-Net jointly trained with the backbone classifier, or not, like LL?`

MQ-Net is trained disjointly from the backbone classifier. That is, the training procedure alternates between the classifier and MQ-Net; the details can be found in the algorithm pseudocode in Appendix B.
`Q8. Equation 3 looks similar to finding Pareto fronts. Could the authors provide some discussion about the situation where the size of a Pareto front set is less than the batch size in the active learning process?`

This is a good point. Equation 3 is similar to finding Pareto fronts, but it is slightly different. The Pareto front is the set of examples having at least one dominance in purity or informativeness over all other examples (see [Liu et al., 2015] for details). In contrast, the skyline constraint in Equation 3 only ensures that the output score of MQ-Net satisfies $\Phi(z_{x_i}) > \Phi(z_{x_j})$ if $P(x_i) > P(x_j)$ **and** $I(x_i) > I(x_j)$ for all $i$, $j$. That is, the Pareto front does not necessarily have to be the set with the highest MQ-Net scores. Thus, regardless of the size of the Pareto front set, MQ-Net simply queries examples in decreasing order of their meta-scores within the budget (i.e., batch size).

- [Liu et al., 2015] "Finding Pareto Optimal Groups: Group-Based Skyline," In VLDB, 2015.
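To make this distinction concrete, the following small NumPy sketch (illustrative only; not our released code, and the function names are hypothetical) checks the pairwise dominance condition behind the skyline constraint and performs budget-based selection by meta-score, which does not depend on the size of the Pareto front.

```python
import numpy as np

def count_skyline_violations(P, I, phi):
    """Count pairs violating the skyline constraint: whenever
    P[i] > P[j] and I[i] > I[j], we require phi[i] > phi[j]."""
    n, violations = len(P), 0
    for i in range(n):
        for j in range(n):
            if P[i] > P[j] and I[i] > I[j] and not (phi[i] > phi[j]):
                violations += 1
    return violations  # 0 means the constraint holds on these examples

def select_query(phi, budget):
    """Query the top-`budget` examples by meta-score, regardless of how
    many examples lie on the Pareto front."""
    return np.argsort(-phi)[:budget]
```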
`Q9. Is ResNet-18 pre-trained?`

No, we did not use any pre-trained networks.

`Q10. Did the authors conduct repeated trials per experiment?`

Of course. We clarified this in Lines 294-295. We also added the error bars in the revised paper.

`Q11. In lines 284-285, the authors already defined the cost of querying OOD data samples. Why is it not an evaluation metric in the later experimental result analysis?`

Thank you very much for helping us improve our paper. Following the reviewer's suggestion, we conducted additional experiments and analyzed the effect of different costs for querying OOD examples in Appendix F of the revised version. Table 7 summarizes the performance change with four different labeling costs (i.e., 0.5, 1, 2, and 4) for the split-dataset setup on CIFAR-10 with an open-set ratio of 40%. Overall, MQ-Net consistently outperformed the four baselines regardless of the labeling cost. Meanwhile, CCAL and SIMILAR were more robust to a higher labeling cost than CONF and CORESET. This is because CCAL and SIMILAR, which favor high-purity examples, query more in-distribution examples than CONF and CORESET, so they are less affected by labeling costs, especially when the cost $\tilde{c}$ is high.

---

## Reviewer 3: A(7), Confidence(2)

## TODO: Q4

We deeply appreciate the reviewer's constructive comments and positive feedback on our manuscript.

`Q1. Computational efficiency of MQ-Net?`

This is a good point. MQ-Net needs one additional meta-training phase at every AL round. However, it is not very expensive, because MQ-Net uses a very light MLP architecture and the meta-training set is small. For example, the size of the meta-training set is only 1% of the labeled+unlabeled set for our split CIFAR-10/100 experiments.

`Q2. A running example of what the OOD data and informative versus non-informative data looks like would be helpful.`

Thank you very much for your careful comment. As you mentioned, Figure 1(a) is intended to explain the purity-informativeness dilemma, but we could not include the details due to lack of space. Let us detail our intention in presenting Figure 1(a). The task is to classify dogs and cats in a given image dataset. (1) The HP-LI subset includes trivial (easy) cases of dogs and cats. (2) The HP-HI subset includes moderate and hard cases of dogs and cats, e.g., properly labeled dog-like cats and cat-like dogs. (3) The LP-HI subset includes other similar animals (e.g., wolves and jaguars) which may share some features with dogs and cats. (4) The LP-LI subset includes other dissimilar animals. Overall, it is clear that HP-HI is the most preferable; however, it is NOT clear which of HP-LI and LP-HI is more preferable. This issue is defined as the purity-informativeness dilemma. We will add this explanation in the supplementary material or on an external web page (e.g., a GitHub repository).

`Q3. Equation 1 is defined as the optimal query set approach, but is not mentioned otherwise in the paper. Also, the cost constraint in Equation 1 is used in MQ-Net but is not mentioned in that section.`

Equation 1 formalizes the open-set AL problem. Here, MQ-Net is used to derive a query set $S_Q$ in each round. More specifically, the examples in decreasing order of the meta-score $\Phi(x; w)$ within the budget $b$ form the query set. We expect this query set to be very close to $S_Q^*$ in Equation 1. In Section 4.1, we focused on the training of MQ-Net itself, and the overall procedure involving the budget was not included there. The overall procedure is described in Appendix B (see the AL procedure pseudocode). We will improve the presentation so that Equation 1 and Section 4.1 are better connected.

`Q4. The intuitive interpretation of L(S_Q) was not clear. The main idea from previous sections was that we want a loss that emphasizes informativeness more in later rounds. How does this loss function do that?`

This is a very good question. $L(S_Q)$ is designed to favor high-loss (i.e., uncertain) examples, which tend to be highly informative, in the meta-model $\Phi(\cdot; w)$. Also, the HP-HI subset is the most preferred by the skyline constraint. Because the backbone classifier is not mature at early rounds, the loss value may not precisely represent the informativeness. As a side effect, simply in-distribution (IN) examples can be selected more often at early rounds than at later rounds. As the classifier matures, the loss value becomes able to precisely capture the informativeness. Consequently, the informativeness is properly emphasized at later rounds by Equation 3 **with the varying capability of the classifier**. Figure 4(a) empirically confirms that informativeness is more emphasized as the training progresses. Also, in Figure 3, the gap between MQ-Net and the purity-based methods (CCAL and SIMILAR) becomes larger at later rounds. Another piece of evidence is provided below: we measured the proportion of IN examples in the query set at each round of MQ-Net. As shown in Table R3, this proportion decreases as the training progresses, because informative rather than pure examples are favored at later rounds.

Table R3: Accuracy and ratio of IN examples in the query set for our split-dataset experiment on CIFAR-10 with open-set noise of 40%, where $\% Q_{in}$ means the percentage of IN examples in a query set.

| Method | Round | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|:------:|:-----:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| MQ-Net | ACC |59.6|73.1|79.5|82.9|85.7|88.2|89.3|90.1|90.9|91.5|
| | $\% Q_{in}$ |88.4|81.1|76.6|72.8|66.2|63.4|57.6|61.4|56.7|57.6|

`Q5. Typos.`

Thank you very much for helping us improve our paper. The second appearance of HP should be changed to HI, and we fixed the typo.
---

## Reviewer 4: R(3), Confidence(3)

We are very glad that you have acknowledged the main contribution of this paper. At the same time, we deeply appreciate your valuable comments and reasonable concerns. During the rebuttal process, we have already addressed all of your concerns in the revised paper. Therefore, we look forward to hearing your positive feedback.

#### Major Concerns (Q1-Q3)

`Q1-1. No code is provided and no report about the resource usage.`

Thank you very much for helping us improve our paper. We provide our code at [the link](https://anonymous.4open.science/r/MQNet-43E6/). All methods are implemented with PyTorch 1.8.0 and executed using a single NVIDIA Tesla V100 GPU. The experiments for ImageNet could be run smoothly using this resource. We added this information about the code and resource usage to the revised paper with the R4Q1 mark.

`Q1-2. It is also unclear how the hyperparameters of the MQ-Net were chosen.`

Since MQ-Net is trained on low-dimensional meta-input, we decided to use a shallow MLP architecture with 2 layers, a hidden dimension of 64, and the ReLU activation function. The hyperparameters of MQ-Net, including its architecture and the optimization configurations, are specified in Appendix C.

`Q2. No standard deviation and random baseline.`

Thanks again for pointing out these issues. We added the error bars and the result of the random baseline in Figure 3 and Table 8 in the revised paper. We also added the result of the random baseline in Table 1. Evidently, the standard deviations are very small, and the significance of the empirical results is sufficiently high.

`Q3. How do you do the z-score normalization, i.e., over which parts do you compute the mean and standard deviation?`

We conduct z-score normalization for each scalar score $O(x)$ and $Q(x)$. That is, we iteratively compute the mean and standard deviation over the unlabeled examples at every AL round. The mean and standard deviation are computed before the meta-training, and they are used for the z-score normalization at that round. We further clarified this procedure in the revised paper in Lines 237-240 with the R4Q3 mark.
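For illustration (not our released code), the per-round normalization described above amounts to the following, applied separately to each scalar score over the unlabeled examples of the current round; the small epsilon is an assumption added here only for numerical safety.

```python
import numpy as np

def zscore_per_round(scores: np.ndarray) -> np.ndarray:
    """Z-score normalization of one scalar score (e.g., O(x) or Q(x)),
    recomputed over the unlabeled pool at every AL round before meta-training."""
    mu, sigma = scores.mean(), scores.std()
    return (scores - mu) / (sigma + 1e-8)  # epsilon guards against zero std
```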
#### Minor Questions (Q4-Q8)

`Q4. Figure 4: it appears that all red OOD samples received a high purity score - shouldn't it be vice versa?`

In Figure 4, most red OOD examples received purity scores of around 0.7-0.8, which are regarded as low among the CSI-based purity scores. Some OOD examples may receive a high purity score over 0.9, since the open-set recognition performance of CSI is not very accurate in AL due to the lack of a sufficient amount of clean labeled examples.

`Q5. In Theorem 4.1, the constraint that the activation function needs to be monotonically non-decreasing is not mentioned (it is, however, in the appendix).`

Thank you for pointing out this important issue. Following your suggestion, we fixed Theorem 4.1 by adding the constraint on the activation function. See the updated Theorem 4.1 in the revised paper.

`Q6. In the problem statement, different annotation costs for the annotation of IN and OOD examples are introduced. But in the experiments, the costs for OOD and IN labeling are set to be the same.`

We agree with and appreciate this comment. Following the reviewer's suggestion, we conducted additional experiments and analyzed the effect of different costs for querying OOD examples in Appendix F of the revised version. Table 7 summarizes the performance change with four different labeling costs (i.e., 0.5, 1, 2, and 4) for the split-dataset setup on CIFAR-10 with an open-set ratio of 40%. Overall, MQ-Net consistently outperformed the four baselines regardless of the labeling cost. Meanwhile, CCAL and SIMILAR were more robust to a higher labeling cost than CONF and CORESET. This is because CCAL and SIMILAR, which favor high-purity examples, query more in-distribution examples than CONF and CORESET, so they are less affected by labeling costs, especially when the cost $\tilde{c}$ is high.

`Q7. Some parts of the paper could be improved in readability.`

Per your suggestion, we removed an unimportant notation, $T_{in}$, from the problem statement.

`Q8. Typos.`

Thank you very much for helping us improve our paper. We fixed all the typos in the revised paper.
