# M3 Attention Maps

We show two attention maps for each head, computed on two fixed utterances. The left column shows the maps for the shorter utterance; the right column shows the maps for the longer one.

We also show the properties of each head: **id**, **category**, **globalness (G)**, **verticality (V)**, and **diagonality (D)**. Each metric value is followed by the head's rank for that metric among all heads. For example, `G: 5.120 (8)` means the head has a globalness of **5.120** and ranks **8th** in globalness among all heads.

## Layer 1

![](https://i.imgur.com/mQsnOoq.png)

## Layer 2

![](https://i.imgur.com/pSzPCdv.png)

## Layer 3

![](https://i.imgur.com/ynrWa9x.png)

## Layer 4

![](https://i.imgur.com/iPQDuaD.png)

## Layer 5

![](https://i.imgur.com/ym0CrNH.png)

## Layer 6

![](https://i.imgur.com/osPOhaR.png)

## Layer 7

![](https://i.imgur.com/uF94So6.png)

## Layer 8

![](https://i.imgur.com/3TeNyZt.png)

## Layer 9

![](https://i.imgur.com/p5sGkS7.png)

## Layer 10

![](https://i.imgur.com/zRSZJaY.png)

## Layer 11

![](https://i.imgur.com/BJuBV4M.png)

## Layer 12

![](https://i.imgur.com/xbtWH94.png)