---
layout: post
title: Rethinking Architecture Selection in Differentiable NAS
tags: [DARTS, NAS, deep-learning]
authors: {TBD}
---

<script src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"></script>

<!-- content -->

# Rethinking Architecture Selection in Differentiable NAS

### November 2022 | # DARTS # optimization # NAS # deep-learning
### 10-605 Fall 2022, Course Project
### Shiyu He, Liang Li, Wenjia Zhang

This post is the course project of 10-605, Fall 2022 semester.

In this blog post, we review Rethinking Architecture Selection in Differentiable NAS (Wang et al., [2021](https://arxiv.org/abs/2108.04392))[[1]](#1), a conference paper at ICLR 2021. We start by describing the problem of supernet optimization during architecture search, then introduce the solution proposed by the paper together with its theoretical and empirical results. Specifically, how do the values of the architecture parameters reflect operation strength? Does the magnitude of an architecture parameter necessarily indicate the operation's contribution to the supernet's performance? In this blog post, we discuss these questions from multiple perspectives. We show that the magnitude of the parameters does not necessarily indicate performance contribution, and that architecture selection from the supernet can be significantly improved with an alternative, perturbation-based method.

## Table of Contents
1. [Introduction and Background](#sec1)
2. [A new selection method: Measuring operation influence directly](#sec2)
3. [Evaluation with SOTA Methods]()
4. [Results and Analysis]()

## <strong> 1 Introduction and Background<a name="sec1"></a> </strong>

Neural Architecture Search (NAS) algorithms have become popular across academia and industry for their potential to automate the process of discovering high-performance architectures. One of the best-known methods, DARTS (Liu et al., [2019](https://arxiv.org/abs/1806.09055))[[2]](#2), enables the search to be performed with a gradient-based optimizer in an end-to-end differentiable manner. In particular, the resulting supernet can be optimized via gradient-based methods, and the operations associated with the largest architecture parameters are selected to form the final architecture.

> <strong>DARTS</strong>
>
> DARTS (Liu et al., [2019](https://arxiv.org/abs/1806.09055))[[2]](#2) is a pioneering work that introduces the general differentiable NAS framework, which we refer to throughout this post. In DARTS, the topology and the operations are searched together. Concretely, at the end of the search, it selects one operation for every edge in the normal (reduction) cell based on the architecture parameter α. It then selects two input edges for each node in the cell by comparing the largest α of every input edge. The final architecture consists of one operation on each of the eight edges in the normal cell. The operation on an edge is either a "none" op, or selected from a pool of seven candidates: skip connection, avg pool 3x3, max pool 3x3, sep conv 3x3, sep conv 5x5, dil conv 3x3, and dil conv 5x5.

<center><img src="https://i.imgur.com/4FyyHHE.png" width="500" height="250"/>

###### [Figure 1](https://drive.google.com/file/d/1s6vb_KJnjDtqJXHmCFCR-dMWEVEWqx2D/view): An overview of DARTS in the NAS framework (borrowed from lecture slides)

<img src="https://i.imgur.com/nHXcF1W.png" width="500" height="250"/>

###### [Figure 2](https://paperswithcode.com/method/darts): DARTS mechanism overview. (a) Operations on each edge are initially unknown. (b) Continuous relaxation of the search space by placing a mixture of candidate operations on each edge. (c) Joint optimization of the mixing probabilities and the network weights by solving a bilevel optimization problem. (d) Inducing the final architecture from the learned mixing probabilities.
</center>
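To make the continuous relaxation in Figure 2(b) and the magnitude-based rule in Figure 2(d) concrete, here is a minimal PyTorch-style sketch. It is not the official DARTS implementation: the candidate pool is simplified and plain convolutions stand in for DARTS' separable and dilated convolutions; all module and function names are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def candidate_ops(channels):
    """Illustrative candidate pool (a simplification of the seven DARTS candidates)."""
    return nn.ModuleList([
        nn.Identity(),                                            # skip_connect
        nn.AvgPool2d(3, stride=1, padding=1),                     # avg_pool_3x3
        nn.MaxPool2d(3, stride=1, padding=1),                     # max_pool_3x3
        nn.Conv2d(channels, channels, 3, padding=1, bias=False),  # stand-in for sep_conv_3x3
        nn.Conv2d(channels, channels, 5, padding=2, bias=False),  # stand-in for sep_conv_5x5
    ])

class MixedEdge(nn.Module):
    """One supernet edge: a softmax(alpha)-weighted sum of all candidate ops."""
    def __init__(self, channels):
        super().__init__()
        self.ops = candidate_ops(channels)
        # Architecture parameters alpha, one per candidate operation.
        self.alpha = nn.Parameter(1e-3 * torch.randn(len(self.ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)  # mixing probabilities
        return sum(w * op(x) for w, op in zip(weights, self.ops))

def magnitude_based_choice(edge: MixedEdge) -> int:
    """DARTS' magnitude rule: keep the op with the largest architecture parameter."""
    return int(edge.alpha.argmax())
```

The question the rest of this post examines is whether `magnitude_based_choice`, i.e. simply taking `alpha.argmax()`, is actually a sound selection rule.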
Despite its simplicity, the effectiveness of DARTS has been questioned. For example, a simple randomized search (Li & Talwalkar, [2019](https://arxiv.org/abs/1902.07638))[[3]](#3) outperforms the original DARTS, and several new NAS baselines have been proposed on top of DARTS. In addition, reproducibility issues with published NAS results make further investigation difficult. The robustness of DARTS has also been called into doubt: in one experiment (Zela et al., [2020](https://arxiv.org/abs/1909.09656))[[4]](#4), DARTS was tested on four different search spaces, and the observed performance showed significantly degenerate patterns. It has also been observed that DARTS degenerates to networks filled with parameter-free operations such as the skip connection or even random noise, leading to poor performance of the selected architecture.

Current analyses of DARTS failures mostly concentrate on the supernet optimization procedure, based on the implicit assumption that the values of the architecture parameters reflect operation strength. But is this assumption reliable for architecture selection? In this post, we raise the question: <em>is this underlying assumption true in most cases?</em> We then test the assumption and show that in many cases the architecture parameter does not really indicate the operation's importance in the supernet. Specifically, the operation with the larger architecture parameter does not necessarily yield higher validation accuracy, nor is the parameter a reliable indicator for selecting the best operation. Instead, the <strong><em>strength</em></strong> of each operation should be evaluated based on its contribution to the supernet's performance. Accordingly, a perturbation-based method, which selects the operation on an edge by how much removing it perturbs the supernet's accuracy, can be used to derive the final architecture from the supernet.

[[click here to go to top]](#sec1) [[click here to go to reference]](#ref)

## <strong> 2 A new selection method: Measuring operation influence directly <a name="sec2"></a></strong>

### <strong> 2.1 Challenging the Implicit Assumption </strong>

Now it is time to take a close look at the relationship between the architecture parameter, $\alpha$, and operation strength, in order to finally answer the question: does $\alpha$ necessarily represent the strength of the underlying operation? From here on, we use $\alpha$ to denote the DARTS architecture parameters. DARTS jointly optimizes $\alpha$ and the model weights $w$ with the following bilevel objective via alternating gradient updates:

$$\min_{\alpha}\ L_{val}(w^*, \alpha), \quad \text{s.t.} \quad w^* = \arg\min_{w \in W}\ L_{train}(w, \alpha) \qquad (1)$$

Based on this objective, $\alpha$ is used to form the output of each edge and to derive the final architecture from the supernet, under the implicit assumption that a higher $\alpha$ value indicates a stronger underlying operation.
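As a concrete reference point, here is a first-order sketch of the alternating updates behind Eq. (1): one gradient step on $\alpha$ using validation data, then one on $w$ using training data. This is a sketch under simplifying assumptions, not DARTS' code; it omits the second-order correction of the original paper, and all names are illustrative.

```python
import torch.nn.functional as F

def darts_search_step(supernet, alpha_optimizer, weight_optimizer,
                      train_batch, val_batch):
    """One first-order alternating step for the bilevel objective in Eq. (1).

    `alpha_optimizer` holds only the architecture parameters alpha;
    `weight_optimizer` holds only the network weights w.
    """
    x_train, y_train = train_batch
    x_val, y_val = val_batch

    # Architecture step: descend L_val with respect to alpha.
    alpha_optimizer.zero_grad()
    val_loss = F.cross_entropy(supernet(x_val), y_val)
    val_loss.backward()
    alpha_optimizer.step()

    # Weight step: descend L_train with respect to w (alpha held fixed).
    weight_optimizer.zero_grad()
    train_loss = F.cross_entropy(supernet(x_train), y_train)
    train_loss.backward()
    weight_optimizer.step()

    return val_loss.item(), train_loss.item()
```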
Let's start by giving a precise definition of the <strong><em>strength</em></strong> of an operation. For a single edge of a pretrained supernet, the strength of an operation on that edge can be naturally defined as the supernet accuracy after we discretize the edge to this operation. The operation that achieves the best discretization accuracy can be considered the best operation for the given edge (with sufficient training and fine-tuning to guarantee convergence, of course). <em>Figure 3</em> below illustrates the relationship between $\alpha$ and operation strength on randomly selected edges of the DARTS supernet. As we can see, the magnitude of $\alpha$ does not necessarily (and sometimes even inversely!) agree with strength measured by discretization accuracy at convergence.

![Figure_3](https://i.imgur.com/xR5St7U.png)

###### [Figure 3](https://arxiv.org/abs/2108.04392): $\alpha$ vs. discretization accuracy at convergence of all operations on 2 randomly selected edges from a pretrained DARTS supernet (one subplot per edge).

From the supernet's selection perspective in <em>Figure 4</em>, we can see that under $\alpha$-based selection <em>(4(a))</em>, $\alpha_{skip\_connect}$ > $\alpha_{sep\_conv\_3x3}$ on 12 of 14 edges, whereas under strength-based selection <em>(4(b))</em>, the supernet benefits more from discretizing to $sep\_conv\_3x3$ than to $skip\_connect$ on half of the edges. Consequently, the child architecture derived from $\alpha$ values lacks representation ability and performs poorly because it contains too many skip connections.

<center><img src="https://i.imgur.com/6sLZE3S.png" width="600" height="300"/></center>

###### [Figure 4(a)](https://arxiv.org/abs/2108.04392): Operation strength on each edge of Search Space 2 ($skip\_connect, sep\_conv\_3x3$). Edges in red are associated with the largest $\alpha$.

<center><img src="https://i.imgur.com/aT6iSpw.png" width="600" height="300"/></center>

###### [Figure 4(b)](https://arxiv.org/abs/2108.04392): Operation strength on each edge of Search Space 2 ($skip\_connect, sep\_conv\_3x3$). Operations that result in the highest discretization validation accuracy at convergence. Parameterized operations are marked red.

So far we have demonstrated the "skip connection domination" issue caused by selecting the architecture based on $\alpha$ values. Several other works [[5]](#5)[[6]](#6) have verified this issue as well, pointing out that DARTS tends to assign large $\alpha$ to skip connections, resulting in shallow architectures with poor generalizability. Other works [[1]](#1)[[7]](#7) expanded on this topic and argued that the phenomenon is by itself a reasonable outcome as DARTS refines its estimation of the optimal feature map, but that it inevitably renders $\alpha_{skip}$ ineffective for architecture selection. To address this, an alternative indicator, <em>operation strength</em>, was proposed, which shows potential for evaluating each operation's contribution to the supernet's performance.
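To make the strength measure used in Figures 3 and 4 operational, the sketch below spells out the discretization-accuracy procedure under stated assumptions. The `discretize_edge`, `finetune`, and `validate` helpers are hypothetical placeholders standing in for the corresponding steps, not functions from the paper's code.

```python
import copy

def operation_strength(supernet, edge_id, num_ops,
                       discretize_edge, finetune, validate):
    """Strength of each candidate op on one edge: the supernet's validation
    accuracy after discretizing the edge to that op and fine-tuning until
    re-convergence.

    `discretize_edge(net, edge_id, op_idx)`, `finetune(net)`, and `validate(net)`
    are hypothetical caller-supplied helpers.
    """
    strengths = []
    for op_idx in range(num_ops):
        trial = copy.deepcopy(supernet)          # keep the pretrained supernet intact
        discretize_edge(trial, edge_id, op_idx)  # keep only this op on the edge
        finetune(trial)                          # expensive: recover from the discretization
        strengths.append(validate(trial))        # discretization accuracy at convergence
    return strengths                             # best op for the edge = argmax(strengths)
```

Because `finetune` must be run for every candidate operation on every edge, this ground-truth measure is too expensive to use directly for search; the next section describes a cheaper, perturbation-based approximation.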
### <strong> 2.2 An Alternative: Perturbation-Based Architecture Selection</strong>

In the previous section, we re-examined the assumption that the magnitude of $\alpha$ is an indication of operation strength. In this section, we introduce the new selection method, perturbation-based architecture selection. As the name suggests, this method shifts the focus away from $\alpha$. Instead, it seeks a direct relationship between operation strength and the supernet's performance, aiming for better final performance. Before diving into the details, let us build some intuition for the approach.

The major difference between perturbation-based architecture selection and traditional DARTS (Liu et al., 2019) is the opposite stance on the assumption that $\alpha$ indicates operation importance in the supernet. To account for how much an operation influences the performance of the supernet, we use a measure called discretization accuracy; in this context, discretization means collapsing the continuous mixture of operations on an edge to a single operation. The disturbance that discretization causes to the supernet can hurt validation accuracy [[1]](#1), so the process requires additional fine-tuning for the supernet to re-converge. As a result, computing the discretization accuracy of the supernet is computationally expensive, since the supernet has to be carefully tuned for each operation on each edge.

The proposed perturbation-based process attempts to reduce this computational cost while retaining the same, or even better, validation accuracy of the supernet. The algorithm introduces a selection process that visits the edges in random order: for each operation on an edge, it evaluates the supernet's validation accuracy with that particular operation removed. Given all these validation accuracies, the best operation for the edge is identified as the one whose removal leads to the largest drop in validation accuracy. Note that this is similar in spirit to decision tree pruning: to avoid overfitting and to reduce the complexity of a decision tree, we often need to tune the model, and one popular method is reduced error pruning (Elomaa et al., 2001) [[8]](#8), which keeps only the tree nodes whose removal does not decrease validation accuracy. The benefit of the perturbation-based approach is that it preserves the operations that have the biggest influence on the supernet's validation accuracy, while avoiding the expensive cost of re-training the supernet to convergence for every candidate discretization.

<center><img src="https://i.imgur.com/9T8FLAI.png"/>

###### [Figure 5](https://i.imgur.com/9T8FLAI.png): Pseudo-code of the perturbation-based algorithm
</center>

### <strong> 2.3 Implementation on DARTS </strong>

This method is implemented as a layer on top of DARTS and replaces the magnitude-based discretization of the supernet. Whenever the best operation for an edge is selected, that edge is discretized to the selected operation. To recover the validation accuracy lost to discretization, the supernet is then tuned for a few epochs before the next selection; in the implementation examples below, this number is limited to 5.
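Putting Sections 2.2 and 2.3 together, here is a minimal sketch in the spirit of the pseudo-code in Figure 5. The masking, discretization, validation, and tuning helpers are hypothetical placeholders, and details such as cell handling and the exact tuning schedule are simplified.

```python
import random

def perturbation_based_selection(supernet, edges, ops_on,
                                 mask_op, unmask_op, discretize_edge,
                                 validate, tune, tune_epochs=5):
    """Select one operation per edge by the accuracy drop its removal causes.

    Hypothetical caller-supplied helpers:
      ops_on(edge)                       -> candidate op indices currently on `edge`
      mask_op / unmask_op(net, edge, op) -> temporarily remove / restore one op
      discretize_edge(net, edge, op)     -> keep only `op` on `edge`
      validate(net)                      -> supernet validation accuracy
      tune(net, epochs)                  -> briefly fine-tune the supernet weights
    """
    selection = {}
    remaining = list(edges)
    random.shuffle(remaining)                          # visit edges in random order
    for edge in remaining:
        base_acc = validate(supernet)
        drops = {}
        for op in ops_on(edge):
            mask_op(supernet, edge, op)                # remove just this operation ...
            drops[op] = base_acc - validate(supernet)  # ... and measure the accuracy drop
            unmask_op(supernet, edge, op)
        best_op = max(drops, key=drops.get)            # largest drop = most important op
        discretize_edge(supernet, edge, best_op)       # commit the choice on this edge
        tune(supernet, tune_epochs)                    # recover accuracy before the next edge
        selection[edge] = best_op
    return selection
```

In contrast to the discretization-accuracy sketch above, only a short `tune` call per edge is needed, which is where the computational savings come from.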
## <strong> 3 Evaluation with SOTA Methods </strong>

In this section, we evaluate the proposed method against state-of-the-art methods, namely DARTS (Liu et al., 2019) [[2]](#2) and NAS-Bench-201 (Dong & Yang, 2020) [[9]](#9), as well as other variants of DARTS. Below, we refer to the proposed perturbation-based architecture selection as PT, and to its combination with DARTS as DARTS+PT.

### <strong> 3.1 Performance with DARTS in the CNN Search Space </strong>

Based on the algorithm skeleton discussed in Figure 5, we incorporate PT as the architecture selection step of DARTS, while keeping the rest of the search and training process unchanged. The empirical results show that DARTS+PT performs significantly better than traditional DARTS, with the test error reduced from 3.00% to 2.61%. We can also compare performance across various DARTS variants, including SDARTS (Smooth DARTS) (Chen & Hsieh, 2020) [[10]](#10) and SGAS (Sequential Greedy Architecture Search) (Li et al., 2020) [[11]](#11). By design, SGAS performs progressive search space shrinking, unlike the other methods; therefore, when comparing with these methods, we only substitute the magnitude-based operation selection step. In the end, the best-performing DARTS+PT configuration reaches a test error of 2.48%, which ranks among the top of the DARTS variants. The takeaway is that architecture selection remains critical in DARTS, and DARTS+PT is a more competitive method than traditional DARTS thanks to the perturbation-based selection.

### <strong> 3.2 Performance on the NAS-Bench-201 Search Space </strong>

To examine the broader applicability of PT, we can further compare the traditional magnitude-based method and the new PT approach on the NAS-Bench-201 search space. NAS-Bench-201 follows an architecture design similar to DARTS; both are cell-based (Dong & Yang, 2020) [[9]](#9). The comparison is conducted over 3 datasets: CIFAR-10, CIFAR-100, and ImageNet16-120. CIFAR-10 consists of a total of 60,000 32x32 colour images in 10 classes, and CIFAR-100 consists of 100 classes containing 600 images each (Krizhevsky, 2009) [[12]](#12). As we can see from the test accuracy plotted in Figure 6, the PT-based approach outperforms magnitude-based selection and is more stable on all three datasets.

<center><img src="https://i.imgur.com/Xe7YM9R.png"/>

###### [Figure 6](https://i.imgur.com/Xe7YM9R.png): Test accuracy of the magnitude-based method and PT on the NAS-Bench-201 search space on 3 datasets
</center>

### <strong> 3.3 Conclusions on Performance </strong>

Figure 7 shows the complete comparison of PT against the magnitude-based selection counterpart of each method. The test error results bounded by the red box significantly outperform the traditional magnitude-based selection approaches across various state-of-the-art methods, including but not limited to DARTS and NAS-Bench-201. One key takeaway is that architectures selected with PT consistently perform better. The other highlight of the table is that replacing magnitude-based selection increases the search cost for an equivalent number of parameters: for example, it takes on average 0.8 GPU days to search with DARTS+PT, whereas DARTS takes half of that cost, 0.4 GPU days. This raises the question of the trade-off between the gain in performance and the search cost.

<center><img src="https://i.imgur.com/3KjGjQU.png" width="550" height="520"/>

###### [Figure 7]: Test accuracies of state-of-the-art architectures on CIFAR-10
</center>

## <strong> 4 Results and Analysis </strong>
* Robustness of DARTS

  According to Zela et al. (2020)[[13]](#13), DARTS gives poor and unstable performance when deriving architectures. To examine this robustness issue, the comparison experiments below show that DARTS tends to generate architectures with many skip connections and high test error. However, when perturbation-based architecture selection is combined with DARTS, test errors improve significantly across different spaces and datasets, and meaningful architectures are generated. This experiment further supports the usefulness of DARTS itself, just not of the original magnitude-based architecture selection.

<center><img src="https://i.imgur.com/wxQQ7Mw.png" width="500" height="250"/>

###### [Figure 8]: Test error (%) of several differentiable NAS methods across 4 spaces and 3 datasets
</center>

<center><img src="https://i.imgur.com/X0j3YJG.png" width="700" height="200"/>

###### [Figure 9]: Comparison of normal cells found on S2 and S4.
</center>

* Progressive tuning

  As mentioned before, at each round of operation selection the supernet is tuned after discretizing an edge, which helps it regain some accuracy. To rule out the conjecture that the higher performance of perturbation-based architecture selection comes only from this tuning process, an ablation study compares the performance of original DARTS (magnitude-based) with tuning against the new DARTS (perturbation-based) with tuning. As the results show, DARTS (perturbation-based) with tuning outperforms most of the time, which indicates that the operation selection method is the critical part of the improvement.

<center><img src="https://i.imgur.com/qfR2FrQ.png" width="400" height="300"/>

###### [Figure 10]: Validation accuracy (%) in the operation selection phase on S2.
</center>

* Fixing $\alpha$

  Unlike magnitude-based selection, perturbation-based architecture selection does not make selections based on the magnitude of $\alpha$, which makes one rethink the necessity of optimizing $\alpha$ at all. Comparison experiments show, perhaps surprisingly, that DARTS (perturbation-based) with fixed $\alpha$ performs at least as well as DARTS (perturbation-based). This result indicates that when using the proposed perturbation-based architecture selection, optimizing $\alpha$ may be unnecessary, and a strong architecture search method is still obtained.

<center><img src="https://i.imgur.com/1Uaim8H.png" width="450" height="70"/>

###### [Figure 11]: DARTS+PT vs. DARTS+PT (fixed $\alpha$) test error (%) on CIFAR-10.
</center>

<strong> Conclusion </strong>

This work investigates the magnitude-based architecture selection process of DARTS and shows that a high magnitude of $\alpha$ does not correspond to high operation strength. As an alternative, the perturbation-based architecture selection method is introduced, which measures each operation's contribution to the supernet's validation accuracy on each edge and then makes selections. Various test results confirm its much improved performance with DARTS. This new method gives researchers another way to derive a better final architecture, as well as more flexibility in training supernets without the need to optimize $\alpha$. The significance of $\alpha$ in differentiable NAS methods should therefore be reconsidered.

[[click here to go to top]](#sec1) [[click here to go to reference]](#ref)

<h2 id="ref">References</h2>
<a id="1">[1]</a> Ruochen Wang, Minhao Cheng, Xiangning Chen, Xiaocheng Tang, Cho-Jui Hsieh. Rethinking Architecture Selection in Differentiable NAS. In International Conference on Learning Representations, [2021](https://arxiv.org/abs/2108.04392).

<a id="2">[2]</a> Hanxiao Liu, Karen Simonyan, and Yiming Yang. DARTS: Differentiable Architecture Search. In International Conference on Learning Representations, [2019](https://arxiv.org/abs/1806.09055).

<a id="3">[3]</a> Liam Li and Ameet Talwalkar. Random Search and Reproducibility for Neural Architecture Search, [2019](https://arxiv.org/abs/1902.07638).

<a id="4">[4]</a> Arber Zela, Thomas Elsken, Tonmoy Saikia, Yassine Marrakchi, Thomas Brox, and Frank Hutter. Understanding and Robustifying Differentiable Architecture Search. In International Conference on Learning Representations, [2020](https://arxiv.org/abs/1909.09656).

<a id="5">[5]</a> Kaifeng Bi, Changping Hu, Lingxi Xie, Xin Chen, Longhui Wei, and Qi Tian. Stabilizing DARTS with Amended Gradient Estimation on Architectural Parameters, [2019](https://arxiv.org/abs/1910.11831).

<a id="6">[6]</a> Hanwen Liang, Shifeng Zhang, Jiacheng Sun, Xingqiu He, Weiran Huang, Kechen Zhuang, and Zhenguo Li. DARTS+: Improved Differentiable Architecture Search with Early Stopping, [2019](https://arxiv.org/abs/1909.06035).

<a id="7">[7]</a> Klaus Greff, Rupesh K. Srivastava, and Jürgen Schmidhuber. Highway and Residual Networks Learn Unrolled Iterative Estimation. In International Conference on Learning Representations (ICLR), [2017](https://arxiv.org/abs/1612.07771).

<a id="8">[8]</a> Tapio Elomaa, Matti Kääriäinen. An Analysis of Reduced Error Pruning. Journal of Artificial Intelligence Research, [2001](https://arxiv.org/abs/1106.0668).

<a id="9">[9]</a> Xuanyi Dong, Yi Yang. NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search. In International Conference on Learning Representations, [2020](https://arxiv.org/abs/2001.00326v2).

<a id="10">[10]</a> Xiangning Chen, Cho-Jui Hsieh. Stabilizing Differentiable Architecture Search via Perturbation-based Regularization. In International Conference on Machine Learning, [2020](https://arxiv.org/abs/2002.05283).

<a id="11">[11]</a> Guohao Li, Guocheng Qian, Itzel C. Delgadillo, Matthias Müller, Ali Thabet, Bernard Ghanem. SGAS: Sequential Greedy Architecture Search. In Conference on Computer Vision and Pattern Recognition (CVPR), [2020](https://www.deepgcns.org/auto/sgas).

<a id="12">[12]</a> Alex Krizhevsky. Learning Multiple Layers of Features from Tiny Images, Chapter 3: Object Classification Experiments, [2009](https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf).

<a id="13">[13]</a> Arber Zela, Thomas Elsken, Tonmoy Saikia, Yassine Marrakchi, Thomas Brox, and Frank Hutter. Understanding and Robustifying Differentiable Architecture Search. In International Conference on Learning Representations, [2020](https://arxiv.org/abs/1909.09656).
