# 1. Vision Transformers (An image is worth 16x16 words)

- Pretrained on 300+ million images
- SOTA on ImageNet (88.55%)
- Added CLS token to patch tokens => responsible for predicting true label

###### tags: `vit`
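
To make the patch-token and CLS-token idea concrete, here is a minimal NumPy sketch of the input stage only. It is an illustration under stated assumptions, not the paper's implementation: the helper names `patchify` and `embed_with_cls`, the 224x224 input size, the embedding width `dim = 64`, and the random projection weights are all hypothetical, and position embeddings and the Transformer encoder are left out.

```python
import numpy as np


def patchify(image, patch_size=16):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0, "image must tile evenly"
    gh, gw = h // patch_size, w // patch_size
    # (gh, p, gw, p, c) -> (gh, gw, p, p, c) -> (gh * gw, p * p * c)
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(gh * gw, patch_size * patch_size * c)


def embed_with_cls(image, w_proj, cls_token, patch_size=16):
    """Linearly project each patch, then prepend the CLS token."""
    patches = patchify(image, patch_size)   # (N, p * p * C)
    tokens = patches @ w_proj               # (N, D)
    # The CLS token sits at index 0; its final encoder output would feed
    # the classification head that predicts the label.
    return np.concatenate([cls_token[None, :], tokens], axis=0)  # (N + 1, D)


rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))                    # assumed input size
dim = 64                                             # hypothetical embedding width
w_proj = rng.normal(size=(16 * 16 * 3, dim)) * 0.02  # stand-in for learned weights
cls_token = np.zeros(dim)                            # learned in a real model

tokens = embed_with_cls(image, w_proj, cls_token)
print(tokens.shape)  # (197, 64): 14 * 14 = 196 patch tokens plus 1 CLS token
```

In a trained model both `w_proj` and `cls_token` are learned parameters; the zeros and random values here only stand in for them so the shapes can be checked.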