# Patches are all you need?

- Main claim: patches are what lead to improved performance, at least to a certain extent
- Stem [implementation](https://github.com/rwightman/pytorch-image-models/blob/7c67d6aca992f039eece0af5f7c29a43d48c00e4/timm/models/convmixer.py#L40-L44)

  ```python
  self.stem = nn.Sequential(
      nn.Conv2d(in_chans, dim, kernel_size=patch_size, stride=patch_size),
      activation(),
      nn.BatchNorm2d(dim)
  )
  ```

- Blocks [implementation](https://github.com/rwightman/pytorch-image-models/blob/7c67d6aca992f039eece0af5f7c29a43d48c00e4/timm/models/convmixer.py#L45-L56)

  ```python
  self.blocks = nn.Sequential(
      *[nn.Sequential(
          Residual(nn.Sequential(
              nn.Conv2d(dim, dim, kernel_size, groups=dim, padding="same"),
              activation(),
              nn.BatchNorm2d(dim)
          )),
          nn.Conv2d(dim, dim, kernel_size=1),
          activation(),
          nn.BatchNorm2d(dim)
      ) for i in range(depth)]
  )
  ```

- Note how the depthwise `nn.Conv2d(dim, dim, kernel_size, groups=dim, padding="same")` mixes spatial locations only, while the pointwise `nn.Conv2d(dim, dim, kernel_size=1)` mixes channels only. Same in spirit as MLP-Mixer's token-mixing and channel-mixing MLPs.

###### tags: `vit`
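The spatial-only vs. channel-only mixing claim can be checked numerically: perturbing a single input channel of a depthwise conv changes only that output channel, while a 1x1 (pointwise) conv propagates the perturbation to every output channel. A small sketch (dimensions here are arbitrary, not from the paper):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Depthwise conv: groups == channels, so each output channel sees one input channel
dw = nn.Conv2d(4, 4, kernel_size=3, groups=4, padding="same", bias=False)
# Pointwise conv: 1x1 kernel, mixes across channels at each spatial location
pw = nn.Conv2d(4, 4, kernel_size=1, bias=False)

x = torch.randn(1, 4, 8, 8)
x_perturbed = x.clone()
x_perturbed[0, 1] += 1.0  # perturb channel 1 only

# Per-output-channel magnitude of the change
diff_dw = (dw(x) - dw(x_perturbed)).abs().sum(dim=(0, 2, 3))
diff_pw = (pw(x) - pw(x_perturbed)).abs().sum(dim=(0, 2, 3))

print(diff_dw)  # only index 1 is nonzero: depthwise keeps channels separate
print(diff_pw)  # all indices nonzero: pointwise mixes channels
```

This is exactly the separation MLP-Mixer makes explicit with its two MLP types; ConvMixer just realizes it with grouped and 1x1 convolutions.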
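For context, the two snippets above can be assembled into a full ConvMixer-style classifier. A minimal sketch, with illustrative (not the paper's) hyperparameter defaults and a standard global-pool classification head as in the timm source:

```python
import torch
import torch.nn as nn

class Residual(nn.Module):
    """Adds the input back to the output of the wrapped module."""
    def __init__(self, fn):
        super().__init__()
        self.fn = fn

    def forward(self, x):
        return self.fn(x) + x

def conv_mixer(dim=64, depth=2, kernel_size=5, patch_size=4,
               in_chans=3, num_classes=10, activation=nn.GELU):
    return nn.Sequential(
        # Stem: non-overlapping patch embedding via strided conv
        nn.Conv2d(in_chans, dim, kernel_size=patch_size, stride=patch_size),
        activation(),
        nn.BatchNorm2d(dim),
        # Blocks: depthwise (spatial mixing) + pointwise (channel mixing)
        *[nn.Sequential(
            Residual(nn.Sequential(
                nn.Conv2d(dim, dim, kernel_size, groups=dim, padding="same"),
                activation(),
                nn.BatchNorm2d(dim),
            )),
            nn.Conv2d(dim, dim, kernel_size=1),
            activation(),
            nn.BatchNorm2d(dim),
        ) for _ in range(depth)],
        # Head: global average pool then linear classifier
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(dim, num_classes),
    )

model = conv_mixer()
out = model(torch.randn(2, 3, 32, 32))
print(out.shape)  # (batch, num_classes)
```

Note the isotropic design: `dim` is constant throughout, and spatial resolution only changes once, at the stem.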