
# DAnA

## Introduction

- The authors raise three concerns about previous work: (1) global average pooling on support features loses information, (2) CNNs are poor at modelling varying spatial distributions, and (3) class-specific representations rely heavily on the assumption that those vectors are representative enough.
- The hypothesis is that if these three problems were trivial in FSOD, the choice of support images should not affect performance much.
- The experimental setup tests the models on the same query set 100 times, changing the set of support images each time.
- The paper proposes two modules: background attenuation (BA) and cross-image spatial attention (CISA).
- The BA module is inspired by wave interference: each feature vector of a high-level feature map is viewed as a wave along the channel dimension. When the representative vector is added back to the feature map, **the local features having different wave patterns can be considered as noise** by the detection network.
- CISA is proposed to adaptively transform support images into query-position-aware (QPA) vectors.
- A QPA vector carries the support information considered most relevant to each query region. By finding the correlation between query proposals and QPA vectors, the model can determine whether a query region belongs to a target object.
- The dual-awareness attention mechanisms are used to **capture object-wise correlations.**

![](https://i.imgur.com/GarZo3x.png)

## Methodology

### BA block

-

### CISA block
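
As a rough illustration of the cross-image spatial attention idea from the introduction (one support-derived vector per query position), here is a minimal sketch. It is not the paper's CISA block: it assumes plain scaled dot-product attention (the scaling is borrowed from Transformer attention), flattened feature maps stored as Python lists, and cosine similarity as a stand-in for the correlation step. The function names `query_position_aware_vectors` and `cosine` are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def query_position_aware_vectors(query_feats, support_feats):
    """Return one support-derived vector per query position.

    query_feats:   list of Nq feature vectors, each of length C
    support_feats: list of Ns feature vectors, each of length C
    ASSUMPTION: generic dot-product attention, not the paper's exact block.
    """
    channels = len(support_feats[0])
    scale = math.sqrt(channels)
    qpa = []
    for q in query_feats:
        # Attention of this query position over all support positions.
        weights = softmax([dot(q, s) / scale for s in support_feats])
        # Weighted sum of support features, channel by channel.
        qpa.append([
            sum(w * s[c] for w, s in zip(weights, support_feats))
            for c in range(channels)
        ])
    return qpa


def cosine(u, v):
    """Cosine similarity, used here as a simple correlation score."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)) + 1e-12)


# Toy example: 2 query positions, 3 support positions, 2 channels.
query = [[10.0, 0.0], [0.0, 10.0]]
support = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
qpa = query_position_aware_vectors(query, support)
scores = [cosine(q, v) for q, v in zip(query, qpa)]
```

Because the attention weights sum to one, each QPA vector is a convex combination of support features, so no support position is discarded the way it would be under global average pooling. It is only reweighted per query position.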