## Paper TODO

#### Gbetondji

<!-- - cat plots for 2 eigenvalue conditions -->
<!-- - table of 2 eigenvalue conditions for all inputs + all layers -->
<!-- - mimetic initialization -->
<!-- - $|| HFC(X_l) ||_2 / || LFC(X_l) ||_2$ (y-axis) vs. each layer l (x-axis) in each of the models (can plot all models in the same plot in different colors) -->
<!-- - add asymmetry of H and A plots -->
- show ViT- has better performance than ViT, ViT+ as layers are increased
<!-- - flatness measure -->
<!-- - Figure 1 increase norm a bit, right now neighboring images look very similar -->
<!-- - send me all individual photos in Figure 1, will modify the figure to add more detail and other details -->
<!-- - Figure 2 increase font sizes, change legends to remove parentheses and text inside parentheses -->
<!-- - Figure 4 increase font sizes -->
- describing the tables / plots
- describing the experimental setups

#### Matt

<!-- - related work -->
<!-- - limitations -->
<!-- - conclusion -->

MATT TODO
- modify Figure 1
<!-- - show how without residual connection, we get rank collapse, an alternative proof to \cite{dong2021attention}. describe how this is a generalization of work below! (look at work again to make sure it has fixed weights) -->
- talk about pitfalls of prior theoretical analyses (i.e., anti-oversmoothing statements, work below assuming $W_V W_{proj} = I$ (2nd bullet point), when do ViTs work)
- talk about how other work can be explained by this work (i.e., mimetic initialization, small eigenspectrum (third bullet point below, successful pretrained models))

RELATED:
- SIDENOTE: this related work analyzes eigenvalues of self-attention to motivate a contrastive-learning solution to oversmoothing, a new layer called ContraNorm: \url{https://arxiv.org/pdf/2303.06562.pdf}. Here they define 'effective rank', which measures how 'vanishing' the distribution of singular values is. This is also vaguely related, and focuses on the dynamics of contrastive learning: \url{https://arxiv.org/pdf/2303.04435.pdf}
- another related paper which shows that with $W_V W_{proj} = I$ and no residual connections we get rank deficiency: \url{https://arxiv.org/pdf/2306.01610.pdf}
- another fix which divides all weights by their spectral norm: \url{https://proceedings.mlr.press/v202/zhai23a/zhai23a.pdf}. This forces the eigenspectrum to be small
<!-- \end{itemize} -->

- double-check all proofs
- full pass through paper

## new directions

#### Gbetondji

- doubtful about weight-sharing
  - success has to do with domain
  - might not do well on images, e.g., Sinkformer image results aren't very convincing (cat vs. dog); best results are on point clouds
- stability
  - overlap between stability and normalization
  - without normalization, models don't work, even with $(1-\tau)$
  - theory from graph NNs
  - need new architecture
  - prefer to be between theory and practice, not develop new strong theoretical results
- doubly stochastic attention: seems to work okay (in the Sinkformer paper)
- HOW DO VISION TRANSFORMERS WORK? (https://arxiv.org/pdf/2202.06709.pdf)
  - ViT low-pass, CNNs high-pass
  - ViT loss smoother and flatter, not convex; CNN loss closer to convex, not as flat
  - flatness: seems related to generalization
  - convexity: related to data efficiency
  - avg pooling: works as a low-pass filter, balances the high-pass filter bias, and flattens the landscape
  - what's the equivalent for ViT (i.e., a high-pass filter)? this is what the anti-oversmoothing paper does: upweight high frequencies (see the frequency-ratio sketch below)
- best possible architecture:
  - flat and convex
  - above paper: non-principled ViT + CNN layers
  - can we use gradient flow ideas to make a block that balances high and low pass, i.e., GNNs as Gradient Flows
- experiments:
  - visualize the loss landscape for different eigenvalue choices: both flat and convex
    - similar to HOW DO VISION TRANSFORMERS WORK?
  - convexity: Hessian experiments
  - data efficiency:
    - on small datasets, CNNs outperform ViT
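To make the low-pass vs. high-pass intuition concrete, here is a minimal sketch (my own illustrative helper, not code from any of the papers above) that splits a feature map into low- and high-frequency components with an FFT and reports $\|HFC\|_2 / \|LFC\|_2$; the cutoff and toy pooling check are arbitrary choices. Average pooling, being a low-pass operation, should drive the ratio down.

```python
import numpy as np

def hfc_lfc_ratio(x, cutoff=0.25):
    """Split a 2-D feature map into low/high-frequency parts with an FFT
    and return ||HFC||_2 / ||LFC||_2 (illustrative diagnostic only)."""
    spec = np.fft.fftshift(np.fft.fft2(x))          # centered spectrum
    h, w = x.shape
    yy, xx = np.mgrid[:h, :w]
    radius = np.hypot(yy - h / 2, xx - w / 2)       # distance from the DC bin
    low_mask = radius <= cutoff * min(h, w)         # disc of low frequencies
    lfc = np.fft.ifft2(np.fft.ifftshift(spec * low_mask)).real
    hfc = x - lfc                                   # residual = high frequencies
    return np.linalg.norm(hfc) / (np.linalg.norm(lfc) + 1e-12)

# toy check: 2x2 average pooling (a low-pass op) should reduce the ratio
rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32))
pooled = x.reshape(16, 2, 16, 2).mean(axis=(1, 3)).repeat(2, axis=0).repeat(2, axis=1)
print(hfc_lfc_ratio(x), hfc_lfc_ratio(pooled))
```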
## new idea

Consider the doubly-stochastic transformer:
$$X^{(t+1)} = X^{(t)} + A X^{(t)} W$$
where $A$ is bi-stochastic (i.e., its rows and columns sum to 1) and symmetric, and $W$ is symmetric. We can parameterize both $A$ and $W$ as follows:
$$W = L^\top L$$
$$A = n_r(n_c( \cdots n_r(n_c(\exp(X^\top V^\top V X))) \cdots ))$$
where $n_r(\cdot)$ and $n_c(\cdot)$ denote row and column normalization. This is Sinkhorn's algorithm, described in more detail here: https://proceedings.mlr.press/v151/sander22a/sander22a.pdf (a numerical sketch follows the list below).

This transformer has some very nice properties:
- eigenvalues of $A$ are $\in [-1,1]$:
  - (upper) https://res.mdpi.com/d_attachment/symmetry/symmetry-12-00369/article_deploy/symmetry-12-00369-v2.pdf
  - (both, need better source) https://mathoverflow.net/questions/239951/lower-bound-for-smallest-eigenvalue-of-symmetric-doubly-stochastic-metropolis-ha
- the above property means we can have our own version of Theorem 5.1 from GNNs as Gradient Flows: https://arxiv.org/pdf/2206.10991.pdf
  - this means we have conditions on the eigenvalues of $W$ to avoid oversmoothing
- and we can prove a theorem similar to Theorem 1 from GRAND: https://arxiv.org/pdf/2106.10934.pdf (just need to require that the maximum eigenvalue of $W$ is $1/\lambda_{A,\max}$ and we are done, so long as we weight $X^{(t)}$ by $\alpha$ and $A X^{(t)} W$ by $1-\alpha$, for $\alpha \in (0,1)$, which shouldn't change the analysis of Theorem 5.1)
- (maybe) can make an extension to sheaves
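A minimal numpy sketch of one such layer for sanity-checking the claims above. The shapes, the fixed number of Sinkhorn iterations, and the convention that rows of $X$ are tokens are my assumptions, not fixed by the note; it is a sketch, not a tuned implementation.

```python
import numpy as np

def sinkhorn(K, n_iter=100):
    """Alternating row/column normalization (Sinkhorn) of a positive symmetric
    kernel K; the iterates approach a doubly-stochastic matrix."""
    A = K.copy()
    for _ in range(n_iter):
        A = A / A.sum(axis=1, keepdims=True)   # n_r: make rows sum to 1
        A = A / A.sum(axis=0, keepdims=True)   # n_c: make columns sum to 1
    return 0.5 * (A + A.T)                     # symmetrize the final iterate

def ds_transformer_layer(X, V, L, alpha=0.5):
    """One update X <- alpha * X + (1 - alpha) * A X W with
    A = Sinkhorn(exp(X V^T V X^T)) and W = L^T L (symmetric PSD).
    Rows of X are tokens here (assumed convention)."""
    scores = X @ V.T @ V @ X.T                       # n x n, symmetric similarities
    A = sinkhorn(np.exp(scores - scores.max()))      # max-shift for numerical stability
    W = L.T @ L
    return alpha * X + (1 - alpha) * A @ X @ W

rng = np.random.default_rng(0)
n, d, k = 8, 4, 4
X = rng.standard_normal((n, d))
V = rng.standard_normal((k, d))
L = rng.standard_normal((d, d))

A = sinkhorn(np.exp(X @ V.T @ V @ X.T))
print(A.sum(axis=0).round(3), A.sum(axis=1).round(3))   # both ~1 (doubly stochastic)
print(np.abs(np.linalg.eigvalsh(A)).max())              # eigenvalues stay in [-1, 1]
print(ds_transformer_layer(X, V, L).shape)              # (8, 4)
```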
## meeting notes

### 27/06/23

## derivation

$$\frac{\partial \mathbf{x}(t)}{\partial t} = \mathrm{div}\left[\frac{1}{g(\mathbf{x}(t),t)}\nabla\mathbf{x}(t)\right]$$
$$g(\mathbf{x}(t),t) := k(x(t),x(t')) = \exp(\phi(x(t))^\top\phi(x(t')))$$
$$\frac{\partial \mathbf{x}(t)}{\partial t} = \nabla\mathbf{K}^*\nabla\mathbf{x}(t)$$
$$y_i = \sum_{j \in N_i} g_\theta(x_j - x_i)\,(x_j-x_i)$$
$$\mathrm{div}(g \nabla x)$$
$$TV(X) = \sum_{j\in N_i} |x_i-x_j|$$
$$\dot{x} = \mathrm{div}\left(\frac{\nabla x}{\|\nabla x \|}\right)$$

- vector calculus background: https://en.wikipedia.org/wiki/Vector_calculus_identities
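The update $y_i = \sum_{j \in N_i} g_\theta(x_j - x_i)(x_j - x_i)$ is straightforward to discretize. Below is a small sketch that uses a Perona-Malik-style conductivity $g(s) = \exp(-(s/\kappa)^2)$ as a placeholder for $g_\theta$ (the graph, step size, and $\kappa$ are arbitrary choices, not from the meeting); it shows the edge-preserving behaviour: noise within flat regions diffuses away while the large jump survives.

```python
import numpy as np

def diffusion_step(x, neighbors, kappa=1.0, tau=0.1):
    """Explicit Euler step of x_i <- x_i + tau * sum_{j in N_i} g(x_j - x_i)(x_j - x_i),
    with g(s) = exp(-(s / kappa)^2): small differences diffuse, large jumps are kept."""
    y = np.zeros_like(x)
    for i, nbrs in enumerate(neighbors):
        d = x[nbrs] - x[i]                              # differences to the neighbours
        y[i] = np.sum(np.exp(-(d / kappa) ** 2) * d)
    return x + tau * y

# toy signal on a path graph: two noisy plateaus separated by a sharp step
n = 20
rng = np.random.default_rng(0)
x = np.concatenate([np.zeros(10), 5.0 * np.ones(10)]) + 0.3 * rng.standard_normal(n)
neighbors = [[j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)]
for _ in range(100):
    x = diffusion_step(x, neighbors)
print(x.round(2))   # plateaus are smoothed, the 0 -> 5 step is preserved
```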
## papers to read

READ!!!!:

### Semantic Diffusion Network for Semantic Segmentation
- https://proceedings.neurips.cc/paper_files/paper/2022/file/396446770f5e8496ca1feb02079d4fb7-Paper-Conference.pdf

### IMAGE SELECTIVE SMOOTHING AND EDGE DETECTION BY NONLINEAR DIFFUSION
- https://epubs.siam.org/doi/pdf/10.1137/0729012?casa_token=27EEtqsq9nAAAAAA:FY_KstyZAMyWeFQr624t3VBG89QU10t0v0KHF7dXt1nc6NzyjXhQ63fJSodwoZzOveG_rBnalF8

### Scale-Space and Edge Detection Using Anisotropic Diffusion
- https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=56205&casa_token=x8kggJ_mwH8AAAAA:q_qPlNaLPj-bArP6_EWQjpNHNevGMscMglQkgr06D5GfEyCJdxrivFARjRw8Tzg5Xd_0IvnaIg&tag=1

### GREAD: Graph Neural Reaction-Diffusion Networks
- https://arxiv.org/pdf/2211.14208.pdf

### STABILIZING TRANSFORMER TRAINING BY PREVENTING ATTENTION ENTROPY COLLAPSE
- https://arxiv.org/pdf/2303.06296.pdf

### Non-local Neural Networks
- https://arxiv.org/pdf/1711.07971.pdf

### A Tour of Modern Image Filtering
- http://i2pc.es/coss/Docencia/SignalProcessingReviews/Milanfar2013.pdf

### ANTI-OVERSMOOTHING IN DEEP VISION TRANSFORMERS VIA THE FOURIER DOMAIN ANALYSIS: FROM THEORY TO PRACTICE
- https://arxiv.org/pdf/2203.05962.pdf

### Anisotropic Diffusion in Image Processing
- https://www.mia.uni-saarland.de/weickert/Papers/book.pdf

## Dynamical Models for NNs

### NEURAL MECHANICS: SYMMETRY AND BROKEN CONSERVATION LAWS IN DEEP LEARNING DYNAMICS
- Daniel Kunin*, Javier Sagastuy-Brena, Surya Ganguli, Daniel L.K. Yamins, Hidenori Tanaka*
- https://arxiv.org/pdf/2012.04728.pdf
- per-channel norm is more stable during training than parameter weights
- differentiable symmetry (for parameters): the loss doesn't change under a differentiable transformation of these parameters
  - i.e., $f(\theta) = f(\psi(\theta, \alpha))$ where $\theta \rightarrow \psi(\theta, \alpha)$ is the differentiable transformation
  - let $\alpha = I \in G$ be the identity element of $G$: for all $\theta$ we have $\psi(\theta,I) = \theta$; this enforces the following structure on the gradient $\nabla f$,
    $$\langle \nabla f, \partial_\alpha \psi|_{\alpha=I} \rangle = 0$$
    i.e., the gradient $\nabla f$ is perpendicular to the vector field $\partial_\alpha \psi|_{\alpha=I}$ for all $\theta$ (similar result for the Hessian)
- they describe constraints or symmetries on gradients/Hessians for common symmetries (translation, scale, rescale)
- GD update: $\theta = \theta - \eta g(\theta)$ is a forward Euler discretization with step size $\eta$ of the ODE (gradient flow) $\frac{d\theta}{dt} = -g(\theta)$ (as $\eta \rightarrow 0$, GD matches the ODE)
- given symmetries, they derive conservation laws throughout training: how the gradient w.r.t. time is constrained (a worked example follows this section)
  - e.g., for translation symmetry, the sum of parameters is conserved, effectively constraining dynamics to a hyperplane
- gradient flow is too simple, need to take into account other things!
  - weight decay: $$\frac{d \theta}{d t} = -g(\theta) - \lambda \theta$$
  - momentum: $$(1 - \beta) \frac{d \theta}{d t} = -g(\theta)$$
  - stochasticity: via the CLT, assume $\hat{g}_{\mathcal{B}}(\theta) - g(\theta)$ is a Gaussian r.v. with mean $\mu = 0$ and covariance $\Sigma(\theta)$. NOTE! The gradient of the loss (whether batched or not) is orthogonal to the generator vector field $\partial_\alpha \psi|_{\alpha=I}$, so the noise must observe the same property. For this to hold for arbitrary noise we must have $\Sigma(\theta)\, \partial_\alpha \psi|_{\alpha=I} = 0$, i.e., the symmetry in NN architectures projects the noise from SGD onto low-rank subspaces, leaving the gradient flow dynamics in these directions unchanged. TODO: the interaction between stochasticity and discretization is non-trivial though!
  - discretization: can use _modified equation analysis_ (Warming and Hyett, 1974)
    - _modified loss_: the divergence between gradient flow and discretized SGD is given by the gradient correction $-\frac{\eta}{2}Hg$, i.e., the gradient of $-\frac{\eta}{4}\|\nabla \mathcal{L}\|^2$. The modified loss is $\tilde{\mathcal{L}} = \mathcal{L} + \frac{\eta}{4}\|\nabla \mathcal{L}\|^2$ and the modified gradient flow ODE is $$\frac{d \theta}{d t} = -g(\theta) - \frac{\eta}{2}H(\theta)\, g(\theta)$$
    - _modified flow_: think of a continuous trajectory $\theta(t)$ that passes through the discrete steps taken by GD, then identify the differential equation that generates this trajectory: rearranging the GD update $\theta_{t+1} = \theta_t - \eta g(\theta_t)$ and assuming $\theta(t) = \theta_t$ and $\theta(t + \eta) = \theta_{t+1}$ gives $-g(\theta_t) = \frac{\theta(t + \eta) - \theta(t)}{\eta}$; Taylor expanding the RHS gives $-g(\theta_t) = \frac{d \theta}{d t} + \frac{\eta}{2} \frac{d^2 \theta}{dt^2} + O(\eta^2)$. As $\eta \rightarrow 0$ we recover gradient flow, but for small $\eta$ we get $$\frac{d \theta}{d t} = -g(\theta) - \frac{\eta}{2} \frac{d^2 \theta}{dt^2}.$$ This was done by Kovachki & Stuart, 2019 to get a more realistic continuous model for momentum.
- they get exact learning dynamics for more realistic versions of gradient flow
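A worked instance of the conservation-law idea (my own toy example, not one from the paper): a two-layer linear model $W_2 W_1$ is invariant under the rescale symmetry $W_1 \to \lambda W_1$, $W_2 \to W_2 / \lambda$; differentiating the invariance at $\lambda = 1$ gives $\langle \nabla_{W_1}\mathcal{L}, W_1\rangle = \langle \nabla_{W_2}\mathcal{L}, W_2\rangle$, so gradient flow conserves $\|W_1\|_F^2 - \|W_2\|_F^2$ (the layer balancedness of the 'Layers are Automatically Balanced' paper listed below). Small-step gradient descent tracks it closely:

```python
import numpy as np

rng = np.random.default_rng(0)
d, h, o, N = 5, 7, 3, 50
X, Y = rng.standard_normal((d, N)), rng.standard_normal((o, N))
W1, W2 = rng.standard_normal((h, d)), rng.standard_normal((o, h))

def grads(W1, W2):
    """Gradients of L = 0.5 * ||W2 W1 X - Y||_F^2 w.r.t. W1 and W2."""
    R = W2 @ W1 @ X - Y
    return W2.T @ R @ X.T, R @ X.T @ W1.T

balance = lambda: np.sum(W1 ** 2) - np.sum(W2 ** 2)   # conserved under gradient flow
print("balance before:", round(float(balance()), 4))

eta = 1e-4                                            # small step ~ continuous-time flow
for _ in range(5000):
    g1, g2 = grads(W1, W2)
    W1, W2 = W1 - eta * g1, W2 - eta * g2
print("balance after: ", round(float(balance()), 4))  # changes only by O(eta) discretization error
```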
### Noether's Learning Dynamics: Role of Symmetry Breaking in Neural Networks
- Hidenori Tanaka, Daniel Kunin
- https://arxiv.org/pdf/2105.02716.pdf
- gap between tools in physics and learning dynamics in NNs
  - root cause: gradient flow $\frac{d q}{dt} = -\nabla f$ is first-order in time, whereas classical mechanics, e.g., Newton's second law $m \frac{d^2q}{dt^2} = -\nabla f$, is second-order in time
  - bridging the gap: (discretized) GD with a finite learning rate introduces implicit acceleration, making it amenable to the Lagrangian formulation: a variational formulation used to study the role of symmetry in dynamical systems

### Graph Neural Networks as Gradient Flows: understanding graph convolutions via energy
- Francesco Di Giovanni, James Rowbottom, Benjamin P. Chamberlain, Thomas Markovich, Michael M. Bronstein
- https://arxiv.org/pdf/2206.10991.pdf
- leverages a fundamental physical idea: particles evolve by minimizing an energy, so we can study the dynamics of a system through the functional expression of the energy
- class of differential equations that minimize an energy: _gradient flows_
- 2 ways to understand GNN dynamics:
  1. start from an energy functional
     - parameterize the energy functional
     - take the GNN equations to follow the direction of steepest descent
     - <b>physical interpretation</b>: multi-particle dynamics
     - e.g., $\mathcal{E}_\theta(\cdot): \mathbb{R}^{n \times d} \rightarrow \mathbb{R}$ is the energy function of $\dot{\mathbf{F}}(t) := \frac{d \mathbf{F}(t)}{dt} = -\nabla_{\mathbf{F}} \mathcal{E}_\theta(\mathbf{F}(t))$
  2. start from evolution equations (i.e., $\frac{d \theta}{d t}$)
     - describe a general GNN update rule that has GCN, GraphSAGE, and GCNII as special cases
     - derive an energy which, when differentiated, yields the general GNN update rule
     - they interpret this energy physically: they separate the interaction matrix into positive- and negative-eigenvalue components, then describe these as attraction vs. repulsion terms
- [!] different from Hidenori Tanaka's works: time here refers to GNN layers instead of training epochs
- closed-form update for any layer given the eigenvectors of $\mathbf{W}$ and $\Delta$ (the graph Laplacian)
- describe oversmoothing as when the positive eigenvalues of $\mathbf{W}$ dominate the negative ones
- without residual connections, $\mathbf{W}$ is less powerful
- MAIN MESSAGE OF SECTION 4: discrete gradient flows of $\mathcal{E}_\theta$ are equivalent to linear graph convolutions with symmetric weights shared across layers
- MAIN MESSAGE OF SECTION 4: graph convolutions with symmetric weights identify curves along which the energy $\mathcal{E}_\theta$ decreases, i.e., they act as 'approximate' gradient flows (a small check of this statement follows below)
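A small check of the second MAIN MESSAGE in the simplest possible setting (my own reduced instance, not the paper's full parameterized energy): with symmetric $\bar{A}$ and $W$, the graph-convolution step $F \mapsto \bar{A} F W$ equals $-\nabla_F \mathcal{E}(F)$ for the quadratic energy $\mathcal{E}(F) = -\tfrac{1}{2}\,\mathrm{tr}(F^\top \bar{A} F W)$, so following it descends this energy.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 4
A = rng.standard_normal((n, n))
A = 0.5 * (A + A.T)                      # symmetric "adjacency" surrogate
W = rng.standard_normal((d, d))
W = 0.5 * (W + W.T)                      # symmetric shared channel-mixing weights
F = rng.standard_normal((n, d))          # node features

energy = lambda F: -0.5 * np.trace(F.T @ A @ F @ W)

# finite-difference gradient of the energy vs. the analytic claim grad E(F) = -A F W
eps = 1e-6
num_grad = np.zeros_like(F)
for i in range(n):
    for j in range(d):
        bump = np.zeros_like(F)
        bump[i, j] = eps
        num_grad[i, j] = (energy(F + bump) - energy(F - bump)) / (2 * eps)

print(np.max(np.abs(num_grad - (-(A @ F @ W)))))   # ~1e-9: A F W is exactly -grad E
```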
### Charting Flat Minima Using the Conserved Quantities of Gradient Flow
- Bo Zhao, Iordan Ganev, Robin Walters, Rose Yu, Nima Dehmamy
- https://openreview.net/pdf?id=JvcLG3eek70
- 2022 NeurIPS Workshop
- question: _How are parameter space symmetries of NN architectures related to flat minima?_
- discover a set of continuous symmetries which keep nonlinear NNs invariant under certain conditions
- <b>another reference to Noether's theorem!!!</b>: when a continuous symmetry exists, some quantity remains unchanged during gradient flow dynamics
- given a group $G$, an action $\cdot$ on parameter space $\Theta$ is a symmetry of loss $\mathcal{L}$ if it leaves the loss invariant: $\mathcal{L}(g \cdot \theta) = \mathcal{L}(\theta), \;\;\; \forall \theta \in \Theta, g \in G$
- describe a nonlinear symmetry for NNs
  - the consequence of this is the existence of extended, flat minima
- define: a _conserved quantity_ of gradient flow (GF) is a function $Q(\cdot)$ such that for any two time points $s,t$ along a GF trajectory, $Q(\theta(s)) = Q(\theta(t))$
- some $Q$ for GF have been discovered <b>(see paper below!)</b>
- they generalize the above result: for all $M$ in the Lie algebra $\mathfrak{g}$

### Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced

## Learning Symmetries

### Automatic Symmetry Discovery with Lie Algebra Convolutional Network
- Nima Dehmamy, Robin Walters, Yanchen Liu, Dashun Wang, Rose Yu
- https://arxiv.org/pdf/2109.07103.pdf
- introduce L-conv: an infinitesimal version of G-conv
- L-conv leads to Noether's theorem and conservation laws
- can optimize the Noether current to discover symmetries
- future work: try to incorporate spatial symmetries (e.g., particle physics)

### A tradeoff between universality of equivariant models and learnability of symmetries
- Vasco Portilheiro
- https://arxiv.org/pdf/2210.09444.pdf
- formalizes symmetry learning and provides impossibility results

### Learning Linear Symmetries in Data Using Moment Matching
- Colin Hagemeyer
- https://arxiv.org/pdf/2204.01213.pdf
- April 4th, 2022
- motivation: symmetries are useful for a lot of physics/ML problems (data augmentation, CNNs, identity-preserving transformations)
- goal: learning <b>discrete</b> symmetries
- makes assumptions that restrict symmetries to have eigenvalues that are either 1 or -1

### AugNet: End-to-End Unsupervised Visual Representation Learning with Image Augmentation
- Mingxiang Chen, Zhanguo Chang, Haonan Lu, Bitao Yang, Zhuang Li, Liufang Guo, Zhecheng Wang
- https://arxiv.org/pdf/2106.06250.pdf
- June 11th, 2021
- learn low-dimensional embeddings that encode semantic similarity
- they use 8 augmentation methods to make the embedding invariant to these augmentations

## Surveys

### Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges

## Meeting Notes
