# 🐛 LLaMA 3.2 Vision — RuntimeError: `cutlassF: no kernel found to launch!`
## 🔍 Problem Description
While running Meta's `Llama-3.2-11B-Vision` model using Hugging Face Transformers and PyTorch, the following error occurred during inference:
```bash
\anaconda3\envs\llama3vision\lib\site-packages\transformers\models\mllama\modeling_mllama.py", line 316, in forward
attn_output = F.scaled_dot_product_attention(query, key, value, attn_mask=attention_mask)
RuntimeError: cutlassF: no kernel found to launch!
```
The error is raised inside PyTorch's scaled dot-product attention (SDPA), at this call in `modeling_mllama.py`:
```python
attn_output = F.scaled_dot_product_attention(query, key, value, attn_mask=attention_mask)
```
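The `cutlassF` prefix appears to point at PyTorch's memory-efficient attention backend, so it helps to record which backends are enabled and what the GPU supports before changing versions. The sketch below is a hypothetical helper (`describe_sdpa_environment` is not part of Transformers or PyTorch) that collects these facts. It assumes the first CUDA device is the one used for inference; the original traceback does not record the GPU or dtype.

```python
def describe_sdpa_environment() -> dict:
    """Collect facts that are useful when SDPA cannot find a kernel."""
    try:
        import torch
    except ImportError:
        # Nothing to report if torch is missing from the active environment.
        return {"torch": None}

    report = {
        "torch": torch.__version__,        # e.g. "2.2.1+cu121"
        "cuda_build": torch.version.cuda,  # None on CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        # Assumption: device 0 is the GPU the model runs on.
        report["device"] = torch.cuda.get_device_name(0)
        report["compute_capability"] = torch.cuda.get_device_capability(0)
        report["bf16_supported"] = torch.cuda.is_bf16_supported()

    # Which SDPA backends PyTorch is currently allowed to choose from.
    report["sdpa_backends"] = {
        "flash": torch.backends.cuda.flash_sdp_enabled(),
        "mem_efficient": torch.backends.cuda.mem_efficient_sdp_enabled(),
        "math": torch.backends.cuda.math_sdp_enabled(),
    }
    return report


if __name__ == "__main__":
    for key, value in describe_sdpa_environment().items():
        print(f"{key}: {value}")
```

Including this output in a bug report makes it easier to tell a version problem from a hardware or dtype limitation.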
## 📦 Setup Details
* Torch version: 2.2.0+cu121 or later in the 2.2 series works (the command below installs 2.2.1); 2.1.2+cu121 triggers the error.
```bash
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121
```
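After reinstalling, it is worth confirming which build the environment actually picked up. The following is a minimal sketch, assuming 2.2.0 is the lowest working release (the note only establishes that 2.1.2 fails and 2.2.x works). The names `MIN_TORCH`, `parse_version`, and `torch_is_new_enough` are illustrative, not part of any library.

```python
import re
from importlib import metadata

# Assumption: lowest release expected to work, per the versions noted above.
MIN_TORCH = (2, 2, 0)


def parse_version(version: str):
    """Split '2.2.1+cu121' into ((2, 2, 1), 'cu121')."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    release = tuple(int(part) for part in match.groups())
    # The part after '+' identifies the build, e.g. the CUDA toolkit tag.
    local = version.partition("+")[2] or None
    return release, local


def torch_is_new_enough(version: str, minimum=MIN_TORCH) -> bool:
    """Return True if the release part of `version` is at least `minimum`."""
    release, _ = parse_version(version)
    return release >= minimum


if __name__ == "__main__":
    try:
        installed = metadata.version("torch")
    except metadata.PackageNotFoundError:
        print("torch is not installed in this environment")
    else:
        status = "ok" if torch_is_new_enough(installed) else "too old"
        print(f"torch {installed}: {status}")
```

Reading the version from package metadata avoids importing `torch` itself, so the check still works when the import is what fails.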