# NeurIPS 2022 Rebuttal

## To all

We thank all reviewers for their positive feedback (the proposed GCD method is effective, solid and well-motivated [GP8j, dTQi, JCAw]; the class number estimation is effective [dTQi, JCAw]; the experimental results are extensive, impressive and convincing [GP8j, dTQi, JCAw]; and the paper is well-written and easy to follow [GP8j, dTQi]; etc.) and constructive comments.

<!--
* the proposed S-FINCH effectively exploits reliable cross-instance positive relations for better representation learning, which considers labelled and unlabelled data under a unified schema. [GP8j, dTQi, JCAw]
* the proposed S-FINCH provides an effective way to estimate the class number without repeated runs. [dTQi, JCAw]
* the paper is well-motivated: our method can work efficiently without knowing the number of clusters, which is useful in real-world applications. [dTQi]
* the solution to the problem of clustering on data with unknown classes is solid. [dTQi]
* the experimental results are comprehensive, extensive, impressive and convincing, showing our method outperforms other baselines by a significant margin on different datasets with different granularity. [GP8j, dTQi, JCAw]
* the ablation study supports the effectiveness of our method. [GP8j, JCAw]
* the paper is well organized, well-written, and easy to follow. [GP8j, dTQi]
-->

We have addressed individual concerns carefully in the response to each reviewer. In the updated version, we have revised the paper following the suggestions and highlighted the revisions in blue. Additional results, details and discussions will be included in the final version.

## Reviewer GP8j

**Novelty**

We study the practical and challenging problem of generalized category discovery (GCD), a relatively new task recently introduced in [32].
* First, to tackle this challenge, we propose a joint representation learning and category discovery framework that effectively explores the cross-instance positive relations from both labelled and unlabelled data, which is substantially different from [32], which simply uses two augmented versions of each instance to form the positive pairs.
* Second, it is non-trivial to obtain reliable positive relations from the unlabelled data, which contains instances from both seen and unseen classes. To this end, we propose S-FINCH as part of our framework to dynamically explore positive relations during training for the GCD task. Meanwhile, S-FINCH also serves as a transfer clustering algorithm for GCD to produce the class discovery results.
* Third, we propose a much more efficient class number estimation approach with a one-by-one merging strategy (see Table 6 in Appendix D).

Hence, we believe our method is not simply a modified version of FINCH.

**Details on GCD and the test stage**

We strictly follow the protocol of Vaze et al. [32] for fair comparison. GCD considers the situation where we have a collection of images of which part are labelled and the rest are not. The objective is to automatically group the unlabelled images based on their semantics. Hence, the model takes all images as input and predicts a label assignment for each unlabelled instance. The same unlabelled data are used during training and testing, except that random augmentation is applied during training. For evaluation, we also follow [32] in adopting the Hungarian algorithm to compare the predicted label assignment with the ground-truth labels of the unlabelled data. More in-depth details can be found in Appendix E of [32]. We have clarified the process in the updated paper (lines 277-279).

**Questions**

> Q1: What is the difference between GCD and the open-set data clustering?

A1: We are not sure what open-set data clustering means here.
If the reviewer refers to open-set recognition (OSR): OSR aims at detecting the unlabelled instances from new categories without distinguishing between unseen classes, i.e., OSR can be considered a K+1 classification problem with no discovery process for the new classes, while GCD not only finds the new classes but also groups unlabelled instances based on their semantics. If the reviewer refers to clustering: unsupervised clustering is inherently ambiguous (i.e., objects can be partitioned into different groups based on different but equally valid criteria), while GCD considers a partially supervised setting with prior knowledge of some known classes, which aims at removing the ambiguity problem in clustering. Hence, GCD is a more practical and realistic setting.

> Q2: As the method is designed for the GCD task that could discover new classes from the unlabeled data whatever the classes they are from, I want to know the results of the scenarios where the unlabeled data are only from seen classes and the unlabeled data are only from the unseen classes.

A2: Thanks for this helpful suggestion. We have experimented by directly testing on the suggested cases, with the following results. We can see that our method performs well on both cases: we outperform Vaze et al. [32] on all unseen classes and some seen classes. We have also included the results in Appendix G of the updated supplementary material.
| All unseen | CIFAR10 | CIFAR100 | ImgNet100 | CUB | SCars | Herbarium |
| ---- | :----: | :----: | :----: | :----: | :----: | :----: |
| Vaze et al. [32] | 87.5 | 57.7 | 69.0 | 53.2 | 32.6 | 16.1 |
| Ours | **97.6** | **82.7** | **74.3** | **56.5** | **39.3** | **37.9** |

| All seen | CIFAR10 | CIFAR100 | ImgNet100 | CUB | SCars | Herbarium |
| ---- | :----: | :----: | :----: | :----: | :----: | :----: |
| Vaze et al. [32] | **99.1** | **93.1** | 77.6 | **92.5** | **85.3** | 30.7 |
| Ours | 98.5 | 84.4 | **83.3** | 79.1 | 72.0 | **55.4** |

> Q3: The data splitting in the experiments are too ideal where the numbers of data from both seen and novel classes are balance. I wonder the experimental results when the unlabeled data from the novel classes are small.

A3: We agree that real-world data can exhibit complicated distributions, and long-tailed distributions are common. In fact, Herbarium19 is a long-tailed dataset in which different classes contain an unbalanced number of instances, varying from 10 to 500, so the experimental results on Herbarium19 already demonstrate the feasibility of this challenging setting. In addition, to mimic the case where there are only a few instances from novel classes, we experiment by including only 10% of the instances in each unseen class. The results are shown below. Under this challenging scenario, the performance of both methods drops, while the results remain reasonably good for our method. <mark>When the number of unlabelled instances from unseen classes greatly decreases, our method maintains reasonably good performance on 'unseen' classes, while Vaze et al. [32] gives much worse performance, though it achieves better performance on 'seen' classes on some datasets. Overall, our method consistently achieves the best trade-off.</mark>

| CIFAR-100 | all | seen | unseen |
| ---- | :----: | :----: | :----: |
| Vaze et al. [32] | **87.7** | **91.6** | 8.9 |
| Ours | 79.9 | 81.0 | **58.1** |

| ImgNet-100 | all | seen | unseen |
| ---- | :----: | :----: | :----: |
| Vaze et al. [32] | 64.3 | 69.8 | 37.1 |
| Ours | **78.4** | **81.3** | **63.7** |

| CUB-200 | all | seen | unseen |
| ---- | :----: | :----: | :----: |
| Vaze et al. [32] | **73.9** | **86.5** | 10.7 |
| Ours | 52.6 | 58.2 | **24.7** |

| Herbarium19 | all | seen | unseen |
| ---- | :----: | :----: | :----: |
| Vaze et al. [32] | 25.0 | 28.3 | 10.0 |
| Ours | **42.9** | **47.5** | **22.2** |

> Q4: As the results are only evaluated on the small or medium scale datasets, I wonder if the proposed approach still be superior on the large-scale datasets or with a more powerful pretrained model.

A4: We agree with the reviewer's hypothesis and observation. Indeed, we have included experiments on a wide spectrum of datasets (see Table 8 in Appendix F), from small to large scale (though medium-large compared with the very largest), and from coarse- to fine-grained, and our method consistently performs well on all of them. We believe the same conclusion will hold for other larger datasets; however, we lack the computing resources to carry out more such experiments. To validate the effect of a more powerful pretrained model, we experiment on CIFAR-100 by replacing the ViT-B-16 with the more powerful ViT-B-8 pretrained by DINO. The results are as follows. <mark>With a more powerful pretrained backbone, the performance is further improved in all cases, indicating that our framework generalizes to different backbones.</mark>

| Model | all | seen | unseen |
| :---- | :----: | :----: | :----: |
| ViT-B-16 | 81.5 | 82.4 | 79.7 |
| ViT-B-8 | **83.1** | **82.6** | **84.2** |

**Typos**

Thanks. We have fixed them in the revised paper.
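For reference, the evaluation protocol mentioned above (matching predicted cluster IDs to ground-truth labels before computing accuracy) can be sketched as follows. This is an illustrative pure-Python toy with a hypothetical helper name (`clustering_accuracy`) that brute-forces the optimal one-to-one matching on a small example; at realistic scale one would use the Hungarian algorithm, e.g. `scipy.optimize.linear_sum_assignment`.

```python
from itertools import permutations

def clustering_accuracy(y_true, y_pred):
    """Best accuracy over all one-to-one mappings of cluster IDs to class labels.

    Toy brute-force stand-in for the Hungarian algorithm; fine for a handful
    of clusters, use scipy.optimize.linear_sum_assignment at scale.
    """
    clusters = sorted(set(y_pred))
    classes = sorted(set(y_true))
    # Pad so every cluster can map to some label (padding labels never match).
    while len(classes) < len(clusters):
        classes.append(f"_pad{len(classes)}")
    best = 0
    for perm in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, perm))
        hits = sum(mapping[p] == t for p, t in zip(y_pred, y_true))
        best = max(best, hits)
    return best / len(y_true)

# Example: clusters 0/1/2 align with labels "cat"/"dog"/"bird" up to renaming.
y_true = ["cat", "cat", "dog", "dog", "bird", "bird"]
y_pred = [1, 1, 0, 0, 2, 2]
print(clustering_accuracy(y_true, y_pred))  # → 1.0
```

Since accuracy is invariant to how clusters are numbered, the optimal matching is what makes the metric well-defined for unlabelled data.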
## Reviewer dTQi

We thank the reviewer for the constructive suggestions and questions.

> Q1: In Sec. 3.3, it seems the chain length is an important hyper-parameter. It would be great to investigate how different chain lengths impact the clustering.

A1: We agree that the chain length is an important factor. Intuitively, the chain length $\lambda$ should be positively correlated with the number of labelled instances in each class $n_\ell$; meanwhile, $\lambda$ should be smaller than $n_\ell$ but not too small when $n_\ell$ is small (an extremely small $\lambda$ would lead to slow convergence and less useful chains). The square root is the simplest option that satisfies this intuition, so we simply apply the square root. This might not be the best option, but we did not investigate further given the strong performance of the simple square root. We compare our dynamic length with fixed chain lengths in Table 2, Appendix A; our dynamic length overall works better than the best possible fixed chain length. In addition, we also experiment with other possible dynamic chain length formulations, namely $\lambda=\lceil n_\ell/2 \rceil$ and $\lambda=\lceil\sqrt[3]{n_\ell}\ \rceil$. Results are shown below. <mark>We observe that neither the larger $\lambda=\lceil n_\ell/2 \rceil$ nor the smaller $\lambda=\lceil\sqrt[3]{n_\ell}\ \rceil$ works well in general. We hypothesize that different formulations lead to different cluster numbers at each level, thus producing more wrong or fewer right positive relations. There might be other options, but since the square root works considerably well, we use it as our default choice.</mark>

| CIFAR-100 | all | seen | unseen |
| ---- | :----: | :----: | :----: |
| $\lceil n_\ell/2 \rceil$ | 81.4 | **84.5** | 75.2 |
| $\lceil\sqrt[3]{n_\ell}\ \rceil$ | 72.5 | 77.1 | 63.2 |
| $\lceil\sqrt{n_\ell}\ \rceil$ (Ours) | **81.5** | 82.4 | **79.7** |

| CUB-200 | all | seen | unseen |
| ---- | :----: | :----: | :----: |
| $\lceil n_\ell/2 \rceil$ | 45.5 | 45.5 | 45.5 |
| $\lceil\sqrt[3]{n_\ell}\ \rceil$ | 42.4 | 45.0 | 41.1 |
| $\lceil\sqrt{n_\ell}\ \rceil$ (Ours) | **57.1** | **58.7** | **55.6** |

> Q2: In Sec. 3.3, although the proposed SFN is interesting and effective, It is difficult to understand why these constraints are adopted. It will be helpful to provide some insights into why the simple way fails and why SFN uses these constraints.

A2: We found that a simple FN or a long chain often leads to a single large cluster containing all instances, which is fatal to hierarchical clustering. To avoid this, we apply the following constraints for SFN:

* We use short chains with a dynamically determined length, to reduce the chance of forming a single cluster containing all instances.
* A labelled instance is not allowed to be the SFN of another labelled instance more than once, to further avoid multiple chains being connected together. This also inherently ensures that chains of different labelled classes will not be merged.
* Neither of the above two constraints is applied to the unlabelled instances, so that an unlabelled instance can join a labelled chain or an unlabelled cluster based on its semantic similarities.

> Q3: In Sec. 3.4, the idea of estimating the clustering quality by joint reference score is great. But I have several questions: a) What partition of D^{l} and D^{v} is used in experiments? How does this partition impact the cluster number estimation?
> b) It is helpful to give some insights into why such a joint reference score is adopted.

A3: (a) We simply set $|D^{l}|:|D^{v}|$ to 8:2, a common ratio widely used for validation, without any tuning. We have followed the suggestion to add this detail in the updated paper (line 289). We further experimented with other ratios, namely 9:1 and 7:3. The results change only slightly (as follows); overall, our method is not sensitive to the ratio.

| Ratio | CIFAR10 | CIFAR100 | ImgNet100 | CUB | SCars | Herbarium |
| ---- | :----: | :----: | :----: | :----: | :----: | :----: |
| GT | 10 | 100 | 100 | 200 | 196 | 683 |
| 9:1 | 13 | 97 | 107 | 152 | 195 | 446 |
| 8:2 | 12 | 103 | 100 | 155 | 182 | 490 |
| 7:3 | 12 | 102 | 107 | 151 | 183 | 423 |

(b) The intuition is that we want the overall measurement on the labelled and unlabelled subsets to be the best, so that we obtain a good overall score. Thus, we combine the labelled clustering accuracy and the silhouette score into the joint reference score. (1) The silhouette score is an intrinsic clustering quality index: we want to maximize intra-cluster compactness and inter-cluster discrepancy on the unlabelled data (without access to GT labels). (2) The labelled clustering accuracy is an extrinsic index: we hope the labelled data can be accurately clustered to the greatest extent (with access to GT labels).

> Q4: In L220, the author simply picks the third level. Please give a detailed analysis of this parameter.

A4: We expect the cluster number to be neither too large nor too small at the picked level. If it is too large (at a lower level), we will have fewer pairs of positive relations in each mini-batch. If it is too small (at a higher level), excessive wrong positive relations will be generated. So a good choice is a level that over-clusters the labelled instances from known classes to some extent. We empirically find the third level to be a good level that slightly over-clusters the labelled instances from the known classes.
We further experiment on other levels and find that over-clustering levels 3 and 4 are similarly good, while level 2 is worse because fewer positive relations are explored in each mini-batch. Even at level 2, our method still performs on par with Vaze et al. [32]. Details are added in Appendix A (lines 30-37).

| CIFAR-100 | all | seen | unseen |
| ---- | :----: | :----: | :----: |
| Vaze et al. [32] | 70.8 | 77.6 | 57.0 |
| Ours w/ level 2 | 72.4 | 79.6 | 58.0 |
| Ours w/ level 3 | 81.5 | **82.4** | 79.7 |
| Ours w/ level 4 | **81.6** | 81.9 | **80.8** |

| CUB-200 | all | seen | unseen |
| ---- | :----: | :----: | :----: |
| Vaze et al. [32] | 51.3 | 56.6 | 48.7 |
| Ours w/ level 2 | 50.9 | 55.8 | 48.5 |
| Ours w/ level 3 | **57.1** | **58.7** | **55.6** |
| Ours w/ level 4 | 52.9 | 53.1 | 52.8 |

> Q5: Can you list the latency of each method?

A5: We have reported the clustering latency of our method and Vaze et al. [32] in Appendix D (Table 6) of our submission. The latency mainly consists of two parts: feature extraction and clustering. For all methods, feature extraction takes less than 0.02s per image, <mark>as shown below</mark>, while our clustering method is 6-30 times faster than that of Vaze et al. [32]. We have measured the latency more comprehensively and report the more detailed numbers in Appendix D.

| Methods | Time cost |
| -------- | -------- |
| RankStats+ [7] | 0.015s±0.001 |
| UNO+ [6] | 0.017s±0.001 |
| ORCA [2] | 0.015s±0.001 |
| Vaze et al. [32] | 0.014s±0.001 |
| Ours | 0.014s±0.001 |

> Q6: How many times do you run for each dataset? Can you provide the mean and variance of the main tables?

A6: In our original paper, we reported the results of a single run. We have followed the suggestion to repeat our experiments 5 times (3 times on Herbarium19 due to the time limit) and added the mean and std to our main tables (Tables 1 & 2). The std is generally quite small, indicating the stability of our method.
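To make the two ingredients of the joint reference score from A3(b) concrete (an extrinsic clustering accuracy on the labelled split and an intrinsic silhouette score on the unlabelled split), here is a minimal from-scratch sketch. The equal weighting `w=0.5` and the helper names are our illustrative assumptions, not the paper's exact formulation.

```python
import math

def silhouette(points, labels):
    """Mean silhouette coefficient from pairwise Euclidean distances."""
    scores = []
    for i, (p, li) in enumerate(zip(points, labels)):
        same = [math.dist(p, q) for j, (q, lj) in enumerate(zip(points, labels))
                if lj == li and j != i]
        if not same:                  # singleton cluster scores 0 by convention
            scores.append(0.0)
            continue
        a = sum(same) / len(same)     # mean intra-cluster distance
        b = min(                      # mean distance to the nearest other cluster
            sum(math.dist(p, q) for q, lj in zip(points, labels) if lj == c)
            / sum(1 for lj in labels if lj == c)
            for c in set(labels) if c != li
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

def joint_reference_score(labelled_acc, unlab_points, unlab_labels, w=0.5):
    """Hypothetical combination of the extrinsic measure (clustering accuracy
    on labelled data) and the intrinsic one (silhouette on unlabelled data)."""
    return w * labelled_acc + (1 - w) * silhouette(unlab_points, unlab_labels)

# Two well-separated clusters score close to 1 on silhouette:
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
print(round(silhouette(pts, [0, 0, 1, 1]), 2))
```

Sweeping the candidate cluster number and keeping the one that maximizes this joint score is the search loop the estimation in Sec. 3.4 implies; only the combination rule here is a sketch.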
**Limitations**

Indeed, we have broadly discussed the limitations of our approach in Appendix H. For the additional questions:

> (1) From Table 3, it seems the proposed method prefers to infer a small cluster number. Will it bring some potential negative impact?

(1) Class number estimation remains an open challenge, especially in GCD. So far, only Vaze et al. [32] provide a solution for class number estimation in GCD. We notice that the underestimation of the class number mainly happens on the challenging fine-grained datasets, where the difference between classes is subtle and multiple similar classes may thus be grouped as one. Hence, in scenarios where an accurate cluster number estimation is required, this can have a negative impact. Though our method achieves comparably good performance to Vaze et al. [32] much more efficiently, more effort is still needed to achieve accurate novel class number estimation.

> (2) It seems that the method can’t be SOTA on seen classes. Will it be limited to data with all known classes?

<!--
(2) We agree that our method is not SOTA on seen (known) classes of some datasets, though the gap between our method and other methods is not big. For those with SOTA performance on seen classes, we observe a sharp drop on the unseen classes, indicating a notable bias towards the seen classes. Our method shows superior performance on all unseen classes and some seen classes, thus achieving overall SOTA performance across the board. We further evaluated the case where the unlabelled instances are only from seen classes. <mark>Experimental results are shown below. Our method can still achieve reasonably good performance, although Vaze et al. [32] may reach higher accuracy on some datasets because it overfits to the seen classes.</mark> More details are in Appendix G of the updated supplementary material.
-->

(2) Please refer to Q2 & A2 in the response to reviewer GP8j.
We agree that our method is not SOTA on seen classes of some datasets, though the gap between our method and other methods is not big. For the other methods outperforming ours on seen classes, we observe a sharp drop in their unseen class performance, indicating a notable bias towards the seen classes. Our method shows superior performance on all unseen classes and some seen classes, thus achieving overall SOTA performance across the board. We further evaluated the case where the unlabelled instances are only from seen classes; our method still achieves reasonably good performance (more details in Appendix G of the updated supplementary material).

**Typos**

Thank you. We have fixed them in the updated paper.

## Reviewer JCAw

We thank the reviewer for the detailed investigation and valuable suggestions. We carefully answer each question as follows:

> Q1: In Sec. 3.3, the reason why the chain length λ (Line209) is set to the square root of the number of labelled instances in each class (i.e., λ=⌈√nℓ⌉) is not clear. How about the sensitivity of the results w.r.t. the different values of λ? It seems that there were no discussions or quantitative results about this claim.

A1: Please refer to the response to reviewer dTQi (Q1 & A1). We provide additional discussion and results on other choices in Appendix A. The chain length should be positively correlated with (but smaller than) the number of labelled instances, while not being too small. The concrete formulation satisfying these constraints is not unique, and the square root is the simplest one we could think of. <mark>We experimented with other possible choices and found our simple square root to be an effective option, though there might be better (possibly more complex) alternatives.</mark>

> Q2: In Sec. 3.3, in order to obtain a high purity of each cluster, the authors chose the third level from the bottom of hierarchy to generate the pseudo labels. Why did the authors use this level?
> What will the results change if we choose the higher or lower level? Or how do the different levels affect the quality of pseudo labels?

A2: <mark>Please refer to the response to reviewer dTQi (Q4 & A4).</mark> The consideration behind this choice is that the level should slightly over-cluster the labelled instances so that chains can be generated to produce useful positive relations. We generally found that level three satisfies this and adopt it for all datasets. We further show a comparison with other levels in Appendix A (lines 30-37, Table 3).

> Q3: In [3], a similar strategy, named Neighborhood Contrastive Learning (NCL), was also used to construct the pseudo-positive pairs for contrastive learning. Could the author compare the difference and connection between NCL and the proposed method in this paper? How about the performance gain of the pseudo-positive pair generation in this method compared to NCL [3] if we fix the other parts?

A3: **Difference**: (1) NCL uses a memory bank to store samples, while ours uses batch-wise samples, which is more memory-efficient. (2) NCL directly uses the k nearest neighbours in the memory bank to generate positive samples, while ours uses connected components in the graph of each mini-batch to generate global positive relations, subject to the constraints of SFN, bringing more thoughtful and reliable positive relations than kNN. (3) NCL also relies on a hard negative generation process for training, which leverages the assumption that the unlabelled data are all from unseen classes; this holds for NCD but not for the GCD setting. **Connection**: the idea of pseudo-positive generation is similar and shown to be effective. For the experimental comparison, according to NCL's formulation of the number of positive samples (i.e., $|M|/C^u/2$), this number is 1 in our setting on CIFAR-100 and CUB-200.
Thus, it corresponds exactly to the experiment replacing S-FINCH with the nearest neighbour for positive relation generation, shown in the first row of Table 4 in the main paper. Under this same setting, S-FINCH achieves a significant improvement over the nearest neighbour (i.e., the NCL positive relation extraction strategy).

> Q4: The biggest concern: I noticed that some results in Tab. 1 in this paper are highly different from the results reported in Tab. 2 of [1], even for the same method on CIFAR10, CIFAR100, ImageNet-100. For example, [1] claimed that their method achieved 76.9%, 84.5%, and 61.7% respectively on "All", "Old", and "New" classes on CIFAR100 dataset; however, in the Tab. 1 of this paper, the performances of the method in [1] on CIFAR-100 dataset were just reported as 70.8%, 77.6%, 57.0%, which were significantly lower than the paper [1]. This is a common observation if we compare Tab. 1 of this paper with Tab. 2 of [1], or Tab. 2 of this paper with Tab. 3 of [1]. We can conclude that the performances of different baselines methods (RankStats+, UNO+, method of [1]) reported in this paper are mostly lower than the results in [1]. If we adopt the results in [1], the proposed method in this paper did not outperform other baselines. Could the author explain more about these results? It is so weird because some results are still consistent across these two papers. For example, the method in [1] always achieved the same performance (91.5%/97.9%/88.2% for "All"/"Seen"/"Unseen" classes) on CIFAR-100 in Tab.1 of this paper and Tab.2 of [1].

A4: We double-checked this and can confirm that we compare with the latest published and arXiv version of [1] ([paper link](https://openaccess.thecvf.com/content/CVPR2022/papers/Vaze_Generalized_Category_Discovery_CVPR_2022_paper.pdf)), in which the authors adopted more rigorous and challenging data splits. They also improved the evaluation protocol to better reflect the challenge of the GCD task (see Appendix E of [1]).
Our reported numbers for [1] are from the up-to-date version of [1], and we strictly follow [1] in adopting the new data splits and evaluation protocol for fair comparison. The numbers that the reviewer referred to are from an earlier version of [1].

> Q5: In Tab. 5, if we compare the Rows (0)-(2), we can conclude that of we adopt "k-means & u-u" or "FINCH & u-u", the performances on "Seen" classes will significantly decrease, compared to Row (0). Furthermore, if we compare Rows (0)(4), we can observe that the combination of "S-FINCH & u-ℓ can drop the performances on "Unseen classes" (on CIFAR-100) or on "Seen classes" (on CUB-200). Could the author provide more detailed discussion about these observations?

A5: (1) "k-means & u-u" in Row (1) and "FINCH & u-u" in Row (2) apply k-means and FINCH to all instances, whether seen or unseen, to group instances together. Labelled instances from different classes are not prohibited from being put in the same cluster, so wrong positive relations connecting labelled instances from different seen classes can be used for training, which is harmful to the performance on <mark>seen classes</mark>, causing the slight performance drop on seen classes in Row (0) vs. Rows (1, 2). Overall, however, more useful positive relations are introduced to improve the training, boosting the performance significantly on the unseen classes and leading to a notable overall performance boost. (2) For "S-FINCH & u-ℓ", positive pair relations are only introduced for the seen classes, biasing the training towards the seen classes and thus bringing negative effects on the unseen classes, as well as on the seen classes of the more challenging fine-grained CUB-200 dataset. After introducing the "u-u" pairs in Row (5) to remove this bias, both the seen and unseen performance are consistently boosted, further demonstrating the effectiveness of our positive relation generation method for GCD.
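To illustrate the connected-component step mentioned above (A3 and A5 to this reviewer): unlike raw kNN positives, every pair of instances sharing a component of the within-batch graph becomes a positive. A minimal union-find sketch with hypothetical helper names, omitting the graph construction and the SFN constraints themselves:

```python
def connected_components(n, edges):
    """Union-find over n nodes; returns a component ID per node."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for u, v in edges:
        parent[find(u)] = find(v)
    return [find(i) for i in range(n)]

def positive_pairs(n, edges):
    """All instance pairs that share a component count as positives."""
    comp = connected_components(n, edges)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if comp[i] == comp[j]]

# A 6-instance mini-batch where edges 0-1, 1-2 form a chain and 4-5 pair off;
# node 3 is isolated, so it contributes no positive pairs.
print(positive_pairs(6, [(0, 1), (1, 2), (4, 5)]))
# → [(0, 1), (0, 2), (1, 2), (4, 5)]
```

Note how (0, 2) is a positive despite having no direct edge, which is the transitive effect the component-based formulation provides over kNN.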
> Q6: In Appendix C, I observed the peaks of three curves almost coincide at similar values. However, as for SCars dataset, the peaks of the labelled accuracy and silhouette score are located at highly different x-axis values. Is it a common phenomenon in the three fine-grained datasets? Could the author provide more discussion about this?

A6: We further plotted the curves on the other datasets in Appendix C. We observe that this is a common phenomenon on CUB-200 and S-Cars, where inter-class differences are subtle, but not on Herbarium19, due to its more challenging long-tailed distribution. For fine-grained datasets, the inter-cluster distance does not change much with the cluster number, while the intra-cluster distance decreases as the cluster number increases; thus the silhouette score favours over-clustering in such cases. For the clustering accuracy on labelled data, labelled instances from the same class may be mis-clustered into multiple clusters, and under-clustering may reduce this error; hence its trend is the opposite of the silhouette score. Our class number estimation takes both measures into consideration, resulting in a reasonably good estimation.

> Q7: In Appendix D, the time efficiencies between [1] and the proposed method in this paper. How about the time efficiency of the estimator in [2] compared to the above methods?

A7: The method of [1] improves on DTC [2] with Brent's optimization algorithm for GCD. This optimization leads to a significant efficiency improvement, from O(n) to O(log(n)). As our method is already notably more efficient than [1], we did not compare with [2].

[1] Generalized category discovery. CVPR 2022.
[2] Learning to discover novel visual categories via deep transfer clustering. ICCV 2019.
[3] Neighborhood contrastive learning for novel class discovery. CVPR 2021.


    Cheatsheet

    Syntax Example Reference
    # Header Header 基本排版
    - Unordered List
    • Unordered List
    1. Ordered List
    1. Ordered List
    - [ ] Todo List
    • Todo List
    > Blockquote
    Blockquote
    **Bold font** Bold font
    *Italics font* Italics font
    ~~Strikethrough~~ Strikethrough
    19^th^ 19th
    H~2~O H2O
    ++Inserted text++ Inserted text
    ==Marked text== Marked text
    [link text](https:// "title") Link
    ![image alt](https:// "title") Image
    `Code` Code 在筆記中貼入程式碼
    ```javascript
    var i = 0;
    ```
    var i = 0;
    :smile: :smile: Emoji list
    {%youtube youtube_id %} Externals
    $L^aT_eX$ LaTeX
    :::info
    This is a alert area.
    :::

    This is a alert area.

    Versions and GitHub Sync
    Get Full History Access

    • Edit version name
    • Delete

    revision author avatar     named on  

    More Less

    Note content is identical to the latest version.
    Compare
      Choose a version
      No search result
      Version not found
    Sign in to link this note to GitHub
    Learn more
    This note is not linked with GitHub
     

    Feedback

    Submission failed, please try again

    Thanks for your support.

    On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

    Please give us some advice and help us improve HackMD.

     

    Thanks for your feedback

    Remove version name

    Do you want to remove this version name and description?

    Transfer ownership

    Transfer to
      Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

        Link with GitHub

        Please authorize HackMD on GitHub
        • Please sign in to GitHub and install the HackMD app on your GitHub repo.
        • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
        Learn more  Sign in to GitHub

        Push the note to GitHub Push to GitHub Pull a file from GitHub

          Authorize again
         

        Choose which file to push to

        Select repo
        Refresh Authorize more repos
        Select branch
        Select file
        Select branch
        Choose version(s) to push
        • Save a new version and push
        • Choose from existing versions
        Include title and tags
        Available push count

        Pull from GitHub

         
        File from GitHub
        File from HackMD

        GitHub Link Settings

        File linked

        Linked by
        File path
        Last synced branch
        Available push count

        Danger Zone

        Unlink
        You will no longer receive notification when GitHub file changes after unlink.

        Syncing

        Push failed

        Push successfully