# Rebuttal_DBD

## Response to Reviewer RsZj

We sincerely thank you for your valuable time and comments. We are encouraged by the positive comments on the **novelties**, **extensive experiments**, **good performance**, and **benefits to the field**.

**<u>Q1</u>**: Did not compare DBD with some recent works such as "Spectral Signatures in Backdoor Attacks" (NeurIPS 2018) and "Detecting Backdoor Attacks on Deep Neural Networks by Activation Clustering" (AAAI 2019), which are both based on data removal. It is unclear how differently DBD removes the poisoned data as compared with the existing works.

**<u>R1</u>**: Thanks for these insightful comments! As we mentioned in Section 4.4, we do not intend to accurately separate poisoned and benign samples, as detection-based methods (e.g., Spectral Signatures (SS) and Activation Clustering (AC)) do. This is mainly because these methods may not be able to remove enough poisoned samples while preserving enough benign samples simultaneously, i.e., there is a trade-off between BA and ASR. However, we do understand your concern. To alleviate it, we compare the filtering ability of our DBD (Stage 2) with that of your suggested methods. As shown in Tables 1-2, the filtering performance of DBD is on par with that of SS and AC. DBD is even better than those methods when filtering poisoned samples generated by more complicated attacks (i.e., WaNet and Label-Consistent).

Table 1. The successful filtering rate (poisoned/all, %) w.r.t. the number of filtered samples (in the target class) on the CIFAR-10 dataset.

|                  | $\epsilon \rightarrow$ | 250           | 500            | 1000           | 1500           |
|:----------------:|:----------------------:|:-------------:|:--------------:|:--------------:|:--------------:|
| BadNets          | SS <br> DBD            | 95.73<br>100  | 93.20<br>97.60 | 87.60<br>90.87 | 80.71<br>70.09 |
| Blended          | SS <br> DBD            | 0.80<br>97.87 | 10.53<br>94.67 | 28.87<br>87.27 | 35.29<br>75.16 |
| WaNet            | SS <br> DBD            | 0<br>100      | 0<br>100       | 1.00<br>99.47  | 7.42<br>97.46  |
| Label-Consistent | SS <br> DBD            | 2.40<br>43.47 | 4.40<br>37.07  | 9.00<br>34.47  | 13.78<br>32.53 |

Table 2. The number of remaining poisoned samples over filtered non-malicious samples on the CIFAR-10 dataset.

|                      | BadNets    | Blended    | WaNet      | Label-Consistent |
|:--------------------:|:----------:|:----------:|:----------:|:----------------:|
| SS ($\epsilon=500$)  | 1801/42500 | 2421/42500 | 2500/42500 | 1217/42500       |
| SS ($\epsilon=1000$) | 1186/35000 | 2067/35000 | 2400/35000 | 1115/35000       |
| AC                   | 0/42500    | 0/37786    | 5000/45546 | 1250/39998       |
| DBD                  | 8/25000    | 6/25000    | 38/25000   | 13/25000         |

Besides, we also conduct standard training on the non-malicious samples filtered by SS and AC. As shown in Table 3, the hidden backdoor will still be created in many cases, even though the detection-based defenses are sometimes accurate.

Table 3. The BA (%) over ASR (%) of models trained on non-malicious samples filtered by SS and AC on the CIFAR-10 dataset.

|                  | SS ($\epsilon=500$) | SS ($\epsilon=1000$) | AC          |
|:----------------:|:-------------------:|:--------------------:|:-----------:|
| BadNets          | 92.99/100           | 93.27/99.99          | 85.90/0     |
| Blended          | 92.84/99.07         | 92.56/99.18          | 77.17/0     |
| WaNet            | 92.69/98.13         | 91.92/99.00          | 84.60/99.02 |
| Label-Consistent | 92.93/99.79         | 92.88/99.86          | 75.95/99.75 |

Please refer to Appendix (Section M) in our revision for more details.

**<u>Q2</u>**: I suspect the proposed approach works only with SimCLR. What are the necessary conditions that make a feature extractor scatter poisoned data points in the feature space?
Can you point out other feature extractors having the same properties as SimCLR? Does DBD work with these extractors too?

**<u>R2</u>**: Thanks for your insightful questions! We believe that all self-supervised methods (not just SimCLR) can be adopted in our DBD. As we described in the Introduction and Section 3, this is mainly due to the power of the decoupling process and the strong data transformations involved. To alleviate your concerns, we also examine our DBD with other self-supervised methods. As shown in Table 4, all DBD variants have similar performance. Please refer to Appendix (Section N) in our revision for more details.

Table 4. The BA (%) over ASR (%) of our DBD with different self-supervised methods on the CIFAR-10 dataset.

|        | BadNets    | Blended    | WaNet      | Label-Consistent |
|:------:|:----------:|:----------:|:----------:|:----------------:|
| SimCLR | 92.41/0.96 | 92.18/1.73 | 91.20/0.39 | 91.45/0.34       |
| MoCo   | 93.01/1.21 | 92.42/0.24 | 91.69/1.30 | 91.46/0.19       |
| BYOL   | 91.98/0.82 | 91.38/0.51 | 91.37/1.28 | 90.09/0.17       |

**<u>Q3</u>**: The effect of the final fine-tuning step is unclear. How does DBD perform without this phase?

**<u>R3</u>**: Thanks for this question. As we described in Section 4.5, the (semi-supervised) fine-tuning process prevents the side effects of poisoned samples while still exploiting the useful information they contain, and therefore increases the BA and decreases the ASR simultaneously. The results in Table 5 verify this. Please refer to Section 5.3.3 (Table 3) of our paper for more details. Note: 'SS with SCE' denotes DBD without semi-supervised fine-tuning; a short sketch of the SCE loss is given after Table 5 for reference.

Table 5. The BA (%) over ASR (%) of our DBD with or without semi-supervised fine-tuning.

|           | BadNets    | Blended    | Label-Consistent | WaNet      |
|:---------:|:----------:|:----------:|:----------------:|:----------:|
| DBD (w/o) | 82.34/5.12 | 82.30/6.24 | 81.81/5.43       | 81.15/7.08 |
| DBD (w/)  | 92.41/0.96 | 92.18/1.73 | 91.45/0.34       | 91.20/0.39 |
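For completeness, below is a minimal PyTorch sketch of the symmetric cross-entropy (SCE) loss that 'SS with SCE' refers to, assuming the standard formulation of Wang et al. (2019); the weighting coefficients and the clamping constant are illustrative defaults, not the exact values used in our experiments.

```python
import torch
import torch.nn.functional as F

def sce_loss(logits, targets, alpha=0.1, beta=1.0, clamp_min=1e-4):
    """Symmetric cross-entropy: alpha * CE + beta * RCE (reverse cross-entropy).

    The reverse term clamps the one-hot labels so log(0) becomes log(clamp_min);
    alpha, beta, and clamp_min are illustrative defaults, not our exact settings.
    """
    # Standard cross-entropy term.
    ce = F.cross_entropy(logits, targets)

    # Reverse cross-entropy term: -sum_k p_k * log(y_k) with clamped one-hot y.
    probs = F.softmax(logits, dim=1)
    one_hot = F.one_hot(targets, num_classes=logits.size(1)).float()
    one_hot = one_hot.clamp(min=clamp_min, max=1.0)
    rce = (-probs * one_hot.log()).sum(dim=1).mean()

    return alpha * ce + beta * rce
```

The reverse term grows when the model assigns low probability to the given label, which is one reason per-sample losses under this objective provide a useful signal for the later filtering step.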
**<u>Q4</u>**: I also suggest the authors move the section about the resistance to adaptive attacks from the Appendix into the main paper, as adaptive attacks are becoming a more serious threat today. Please explain your adaptive attack settings more clearly (for example, what trigger size you used and how you tuned the hyper-parameters).

**<u>R4</u>**: Thank you for this constructive suggestion! We have moved it into the main paper (Section X) and provided more details in our revision.

**<u>Q5</u>**: It would also be good if the authors discuss how an attack may work around the proposed defense, and how to further defend against such workarounds.

**<u>R5</u>**: Thank you for this constructive suggestion! We discussed the resistance of our DBD to potential adaptive attacks in Appendix H. The results show that our method is resistant to the discussed adaptive attack. As such, we did not further analyze how to better defend against this adaptive attack. We are very willing to test your suggested adaptive methods if you can kindly provide more details.

## Response to Reviewer h7wA

We sincerely thank you for your valuable time and comments. We are encouraged by the positive comments on the **novelties**, **strong sets of experiments and baselines**, **practicability**, and **good writing**.

**<u>Q1</u>**: The proposed method modifies the underlying training procedure, multiplying the training time, which I think is significant for practitioners. Addressing this issue is essential to support the practicality of the method.

**<u>R1</u>**: Thank you for this insightful comment! As we analyzed in Appendix K, our method is roughly 2-3 times slower than the standard training process in general, which we think is tolerable; our DBD is therefore still practical. In fact, most existing backdoor defenses (e.g., NC and NAD) also incur additional computational costs. However, we do understand your concern. There are two possible ways to accelerate DBD:

1. If there is a secure pre-trained backbone (e.g., one from a trusted source), people can directly use it and save the time of the self-supervised stage.
2. People can apply existing acceleration methods (e.g., mixed precision training) to each stage of DBD to speed up the whole training process.

**<u>Q2</u>**: The primary assumption of the paper is that learning in semi-supervised learning is safe. However, Carlini's recent work [1] demonstrates the attacker's effectiveness under the same threat model, i.e., when the attacker is only allowed to poison data. If the poisoning is efficient, then the proposed defense exposes the model to a different attack.

**<u>R2</u>**: Thank you for this insightful comment! Carlini's work requires that attackers know which samples are labeled and which are unlabeled (at least some of them). This threat model is practical in classical semi-supervised training, where the attacker can control the training set. However, in our method, whether a sample is labeled or not is determined by the first two stages of DBD, whose results are not accessible to attackers. As such, this attack cannot be used against our DBD, since our threat model targets poisoning-based backdoor attacks, where attackers cannot know or control the training details. We have added more details in Appendix (Section H) to discuss this.

## Response to Reviewer MY71

We sincerely thank you for your valuable time and comments. We are encouraged by the positive comments on the **good generalization** and **effectiveness**.

**<u>Q1</u>**: The paper lacks a theoretical analysis of the proposed method. I understand that this paper mainly focuses on empirical performance, but it is quite surprising that the proposed method performs well on label-consistent attacks. This is because the proposed method decouples the label corruption and feature corruption. When label corruption no longer exists, what is the advantage of the proposed method?

**<u>R1</u>**: Thank you for the comments and the insightful question!

- We admit that we did not provide a theoretical analysis of our DBD. This part is very difficult, especially since our method is the first work trying to analyze the learning behavior of poisoned samples.
  We hope that this work can inspire follow-up works to better understand the learning of poisoned samples (theoretically or empirically).
- Our DBD is effective in defending against clean-label attacks not because it can successfully filter poisoned samples; indeed, DBD fails to filter poisoned samples in this case, since there is no label corruption (as you mentioned). The success of DBD against clean-label attacks is mostly because the strong data augmentations involved in self-supervised learning damage the trigger patterns and therefore make them unlearnable (without the guidance of labels). In particular, it is harder for clean-label attacks to implant the trigger pattern than for poisoned-label attacks, since the target-class features contained in the poisoned images hinder the learning of the trigger pattern.

**<u>Q2</u>**: For the label-noise learning in the second step, there are also many choices besides the symmetric cross-entropy method. Investigating more noisy-label algorithms might be interesting.

**<u>R2</u>**: Thank you for this constructive suggestion! We do understand your concern that the choice of noisy-label algorithm may sharply influence the overall performance of DBD. To alleviate it, we also examine our DBD with other noisy-label methods (i.e., GCE and NCE+RCE). As shown in Table 2, all DBD variants have similar performance. Please refer to Appendix (Section O) in our revision for more details.

Table 2. The BA (%) over ASR (%) of our DBD with different noisy-label methods on the CIFAR-10 dataset.

|                 | BadNets    | Blended    | WaNet      | Label-Consistent |
|:---------------:|:----------:|:----------:|:----------:|:----------------:|
| DBD (SCE)       | 92.41/0.96 | 92.18/1.73 | 91.20/0.39 | 91.45/0.34       |
| DBD (GCE)       | 92.93/0.88 | 93.06/1.27 | 92.25/1.51 | 91.05/0.15       |
| DBD (NCE + RCE) | 92.95/1.00 | 92.65/0.78 | 92.24/1.40 | 91.08/0.14       |

**<u>Q3</u>**: The two-step method makes the algorithm not end-to-end. It would be interesting to investigate the possibility of making it end-to-end.

**<u>R3</u>**: Thank you for this constructive suggestion! We think one of the most interesting findings of this paper is that the classical end-to-end training paradigm strengthens the backdoor threat. We believe keeping our DBD non-end-to-end may better emphasize this. Besides, since no human operation is required between the stages of our DBD, the whole training pipeline is general and can, in a sense, be regarded as end-to-end. However, we are very willing to try it if you can kindly provide more details about the end-to-end design.

**<u>Q4</u>**: I do not think excluding the detection-based methods is fair, since those methods are strong baselines, especially for the BadNets and Blended attacks. Also, since the main contribution of the paper is the empirical performance, it is necessary to compare with different kinds of baselines.

**<u>R4</u>**: Thanks for these insightful comments! As we mentioned in Section 4.4, we do not intend to accurately separate poisoned and benign samples, as detection-based methods (e.g., Spectral Signatures (SS) and Activation Clustering (AC)) do. This is mainly because these methods may not be able to remove enough poisoned samples while preserving enough benign samples simultaneously, i.e., there is a trade-off between BA and ASR. However, we do understand your concern. To alleviate it, we compare the filtering ability of our DBD (Stage 2) with two representative detection-based methods (i.e., SS and AC); please see Tables 1-3 in our response to Reviewer RsZj (R1) above for the detailed comparison, and the minimal filtering sketch given right below.
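To make the comparison above easier to interpret, here is a minimal sketch of how the second-stage filtering can be implemented: samples are ranked by their per-sample loss, and only the lowest-loss fraction is kept as the high-credible labeled set, with the rest treated as unlabeled for the semi-supervised stage. The function and variable names, the use of plain cross-entropy for ranking, and the default rate are illustrative assumptions rather than our exact implementation.

```python
import torch

@torch.no_grad()
def split_high_credible(model, loader, filtering_rate=0.5, device="cuda"):
    """Rank training samples by per-sample loss and keep the lowest-loss fraction.

    Returns (high_credible_idx, low_credible_idx): the first set keeps its labels
    for supervised learning, while the second is treated as unlabeled during the
    semi-supervised fine-tuning stage. A sketch only; the loss used for ranking
    and the default rate are illustrative.
    """
    model.eval()
    criterion = torch.nn.CrossEntropyLoss(reduction="none")
    losses, indices = [], []
    for images, labels, idx in loader:   # loader assumed to yield sample indices
        logits = model(images.to(device))
        losses.append(criterion(logits, labels.to(device)).cpu())
        indices.append(idx)
    losses, indices = torch.cat(losses), torch.cat(indices)

    order = torch.argsort(losses)                 # ascending: low-loss samples first
    num_keep = int(filtering_rate * len(order))
    return indices[order[:num_keep]], indices[order[num_keep:]]
```

The `filtering_rate` argument plays the role of the filtering rate $\alpha$ discussed in Section 5.3.1 of the paper.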
## Response to Reviewer G82o

We sincerely thank you for your valuable time and comments. We are encouraged by the positive comments on the **simple but effective idea** and **extensive experiments**.

**<u>Q1</u>**: The primary concern of this paper is the extra computation cost of the proposed DBD pipeline, considering that training a self-supervised and semi-supervised model costs far more computational resources than a supervised learning model. Would it be possible to use a public pre-trained feature extractor to replace the self-supervised feature extractor? The authors are welcome to discuss this.

**<u>R1</u>**: Thank you for the insightful comment and question! As we analyzed in Appendix K, our method is roughly 2-3 times slower than the standard training process in general, which we think is tolerable; our DBD is therefore still practical. In fact, most existing backdoor defenses (e.g., NC and NAD) also incur additional computational costs. However, we do understand your concern. There are two possible ways to accelerate DBD:

- If there is a secure pre-trained backbone (e.g., one from a trusted source), people can directly use it and save the time of the self-supervised stage (as you suggested).
- People can apply existing acceleration methods (e.g., mixed precision training) to each stage of DBD.

<font color="blue">We download a pre-trained ResNet-50 backbone from https://github.com/leftthomas/SimCLR, and replace the self-supervised feature extractor with it in our DBD.</font> As shown in Table 1, DBD with this pre-trained backbone performs on par with, or better than, the original DBD; a minimal loading sketch is provided after our response to Q2 below. However, we note that using public pre-trained feature extractors is still risky when the model source is not secure. For example, if the pre-trained feature extractor is infected (with the same poisoned samples contained in the training set), using it in our DBD will still create hidden backdoors, as shown in Table 3 (line 'DBD without SS').

Table 1. The BA (%) over ASR (%) of our DBD with or without a pre-trained feature extractor on the CIFAR-10 dataset.

|           | BadNets    | Blended    | Label-Consistent | WaNet      |
|:---------:|:----------:|:----------:|:----------------:|:----------:|
| DBD (w/)  | 94.53/0.54 | 94.81/0.73 | 93.22/0          | 94.08/1.36 |
| DBD (w/o) | 92.41/0.96 | 92.18/1.73 | 91.45/0.34       | 91.20/0.39 |

**<u>Q2</u>**: In Section 5.2, the authors list the results of "No defense" to show the backdoor defenses' impact on the original model's accuracy and the effectiveness of the backdoor mitigation. It is not clear how the model works without a defense. Suppose the authors train the original model in a supervised learning manner. In that case, directly using a self-supervised learning paradigm should also be included to illustrate each step's contribution.

**<u>R2</u>**: Thank you for the constructive suggestion! In our paper, 'No Defense' means training a model with standard end-to-end supervised training. We do understand your concern that analyzing each step's contribution is important. Note that directly using a self-supervised learning paradigm should itself be considered a defense, since it introduces the decoupling process. In the ablation study (Section 5.3.3, Table 3), we have discussed the effectiveness of each stage of our DBD, including directly using a self-supervised learning paradigm as you mentioned. The results show that it is also effective in reducing backdoor threats (although less effective than our full DBD). Please refer to Table 3 (lines 'SS with CE' and 'SS with SCE') for more details.
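As referenced in R1 above, a minimal sketch of replacing the self-supervised stage with a trusted pre-trained backbone might look as follows. The checkpoint path, the key layout of the downloaded weights, and the decision to freeze the backbone are assumptions for illustration, not a description of the exact procedure behind Table 1.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

def build_from_pretrained_backbone(ckpt_path, num_classes=10):
    """Replace Stage 1 (self-supervised learning) with a trusted pre-trained
    backbone. The checkpoint path and state-dict key layout are placeholders;
    SimCLR-style checkpoints usually need their projection-head keys dropped
    or remapped before loading.
    """
    backbone = resnet50(weights=None)
    backbone.fc = nn.Identity()                    # keep the 2048-d features only
    state = torch.load(ckpt_path, map_location="cpu")
    backbone.load_state_dict(state, strict=False)  # tolerate non-matching (head) keys

    for p in backbone.parameters():                # treat the trusted backbone as fixed
        p.requires_grad = False

    classifier = nn.Linear(2048, num_classes)      # trained in the later stages
    return nn.Sequential(backbone, classifier)
```

Stages 2-3 then proceed on top of the extracted features, which is where the Stage-1 time saving discussed below comes from.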
**<u>Q3</u>**: In the fine-tuning step, the sensitivity of lambda should also be discussed, since it controls the ratio of unlabelled data.

**<u>R3</u>**: Thank you for this constructive suggestion! We do understand your concern that analyzing the effects of key hyper-parameters is important. In fact, we have included these experiments in Section 5.3.1 (Figure 5). The results show that DBD can still maintain relatively high benign accuracy (and low ASR) even when the filtering rate $\alpha$ is relatively small (e.g., 30%). However, we also note that the high-credible dataset may contain poisoned samples when $\alpha$ is very large, which in turn creates hidden backdoors again during the fine-tuning process. Defenders should specify $\alpha$ based on their specific needs.

**<u>Q4</u>**: In Figure 1(a), the poisoned samples' embeddings seem far from the target label (label 3) and closer to other, non-targeted labels (labels 1, 7, and 9). It would be great if the authors could give explanations.

**<u>R4</u>**: Thank you for the insightful question! Firstly, note that t-SNE only preserves 'local structure' rather than 'global structure'. In other words, it can show the difference between clusters, while the specific distances are sometimes meaningless. However, we do understand your concern about why a backdoor attack can succeed even when the embeddings of poisoned samples are not close to those of the (benign) samples with the target label. In general, a successful backdoor attack only requires that the embeddings of different types of samples (i.e., samples with different labels, and samples with or without the trigger) can be separated into distinct clusters; the remaining FC layers will then 'assign' a label to each cluster. Besides, Figure 1(a) is the visualization of a poison-label attack, where different poisoned samples may have different ground-truth labels and therefore contain many features different from those of benign samples with the target label. This is probably why the phenomenon you mentioned occurs. In contrast, Figure 1(b) is the visualization of a clean-label attack, where all poisoned samples have the same ground-truth label. As such, in this case, the embeddings of poisoned samples are close to those of samples with the target label.

## Thank you for the recognition and insightful comments!

Thanks for your recognition and insightful comments! Please kindly find our detailed explanations as follows:

---

**Q1**: The safety of semi-supervised learning.

**R1**: Thank you for your further questions; we do understand your concern. We try to explain it from three aspects, as follows:

- Suppose the attackers poison only unlabeled samples, as in Carlini's work. In this case, Carlini's attack cannot affect the semi-supervised stage of our DBD, since the input to our DBD method is fully labeled data. Our DBD will discard unlabeled data at the beginning, so the interpolated malicious unlabeled samples will be removed.
- Suppose the attackers add labels to those malicious unlabeled samples. We admit that those samples will then be used in the training process of our DBD. However, in this case, Carlini's attack itself is still not able to attack the semi-supervised stage of our DBD.
  Specifically, as shown in Section 3.1 of Carlini's work, the attackers need to insert a path of unlabeled samples beginning at a selected labeled sample $x'$ (with the target class) and ending at the target sample $x^{*}$ to fulfill their malicious purpose. However, since the labeled/unlabeled partition depends on the loss computed in the second stage of DBD and on the defender-assigned filtering rate $\alpha$, attackers are not able to know which samples are labeled, i.e., they cannot pick such an $x'$.
- Suppose the attackers somehow find a way to ensure that the attacker-specified $x'$ is chosen as a labeled sample in the semi-supervised stage of our DBD. In this case, the attack goal of misclassifying the selected sample as the target class can probably be achieved. We will explore how to do this in our future work. If such a single-point attack succeeds against our DBD method, we will not be surprised, as DBD is not designed to defend against this attack. In particular, Carlini's work has provided some useful methods to defend against it, such as pairwise influence (see Section 5.3 of Carlini's work). This method can be naturally combined with our DBD method at the second stage to identify triggered and interpolated points simultaneously.

According to the above three aspects, we do not think that Carlini's work poses a severe threat to our DBD method. We hope that the above explanations provide clearer information to alleviate your concern. The discussion with you has been very pleasant and helpful, and we are willing to discuss further if you have any other concerns. Thanks again for your insightful comment, which inspires us to explore a more effective adaptive attack against our DBD method in future work.

---

**Q2**: More details about the speed constraints.

**R2**: Thank you for the constructive suggestions! Following your suggestion, we report the training time of each stage of our DBD as follows:

Table 1. The training time (minutes) of each stage of our DBD on the CIFAR-10 dataset.

|     | Stage 1 (Self-supervised Learning) | Stage 2 (Filtering) + Stage 3 (Fine-tuning) |
|:---:|:----------------------------------:|:-------------------------------------------:|
| DBD | 50.8                               | 260.5                                       |

As shown in Table 1, using a safe pre-trained backbone (to skip the self-supervised learning stage) can indeed save some time in our DBD. Note that we only trained for 100 epochs (instead of the standard 1000 epochs) in our DBD to save time, as we described in Appendix K. This is why we can only save 50.8 minutes by using a pre-trained backbone; with the standard 1000 epochs, we could save roughly 508 minutes. Inspired by your questions, we also notice that our DBD can be further accelerated by using fewer epochs in Stages 2-3. As shown in Table 2, our DBD has converged after 115 epochs. In other words, we can save a further 115.8 minutes by setting the number of epochs to 115 (instead of 200) in Stages 2-3. People should assign training epochs based on their specific needs.

Table 2. The benign accuracy (%) w.r.t. the training epoch in Stages 2-3 of our DBD on the CIFAR-10 dataset.

| Epoch | 25    | 40    | 60    | 80    | 85    | 115   | 116   | 117   | 118   | 125   | 126   | 127   |
|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|
| BA    | 85.86 | 88.96 | 90.60 | 91.85 | 92.42 | 93.02 | 93.02 | 93.16 | 93.15 | 93.37 | 93.04 | 93.00 |

Thank you again for your insightful questions. We will add more details and discussions in Appendix K in our final version.
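Finally, to make the mixed-precision acceleration mentioned in our responses to Reviewers h7wA and G82o concrete, here is a generic PyTorch automatic-mixed-precision (AMP) training loop. It is a standard pattern that could wrap any stage of DBD, not the exact configuration behind the timings in Table 1.

```python
import torch

def train_with_amp(model, loader, optimizer, epochs=10, device="cuda"):
    """Generic automatic mixed-precision (AMP) training loop. Any stage of DBD
    could be wrapped this way to reduce wall-clock training time; the epoch
    count and the loss are placeholders."""
    scaler = torch.cuda.amp.GradScaler()
    criterion = torch.nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            with torch.cuda.amp.autocast():    # run the forward pass in fp16 where safe
                loss = criterion(model(images), labels)
            scaler.scale(loss).backward()      # scale the loss to avoid fp16 gradient underflow
            scaler.step(optimizer)             # unscale gradients and apply the optimizer step
            scaler.update()
    return model
```

On GPUs with tensor cores this usually shortens per-epoch time noticeably without changing the training logic.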
