Embedded Gaussian Affinity

Introduced by Wang et al. in Non-local Neural Networks

Embedded Gaussian Affinity is a type of affinity or self-similarity function between two points $\mathbf{x}_{i}$ and $\mathbf{x}_{j}$ that uses a Gaussian function in an embedding space:

$$ f\left(\mathbf{x}_{i}, \mathbf{x}_{j}\right) = e^{\theta\left(\mathbf{x}_{i}\right)^{T}\phi\left(\mathbf{x}_{j}\right)} $$

Here $\theta\left(\mathbf{x}_{i}\right) = W_{\theta}\mathbf{x}_{i}$ and $\phi\left(\mathbf{x}_{j}\right) = W_{\phi}\mathbf{x}_{j}$ are two linear embeddings.
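
As a rough illustration of the formula above, the following NumPy sketch computes the affinity for every pair of positions at once. The function name, the array shapes, and the choice to store one position per row of `x` are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def embedded_gaussian_affinity(x, w_theta, w_phi):
    """Pairwise embedded Gaussian affinity.

    x:        (n, c) array, one row per position (layout assumed here)
    w_theta:  (d, c) embedding matrix W_theta
    w_phi:    (d, c) embedding matrix W_phi
    returns:  (n, n) array with entry [i, j] = exp(theta(x_i)^T phi(x_j))
    """
    theta = x @ w_theta.T          # (n, d): row i is W_theta x_i
    phi = x @ w_phi.T              # (n, d): row j is W_phi x_j
    return np.exp(theta @ phi.T)   # (n, n) matrix of affinities

# Small random inputs, used only to exercise the function.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_theta = 0.1 * rng.normal(size=(3, 8))
w_phi = 0.1 * rng.normal(size=(3, 8))

f = embedded_gaussian_affinity(x, w_theta, w_phi)
```

Every entry of `f` is strictly positive, and because $W_{\theta}$ and $W_{\phi}$ are different matrices, `f[i, j]` and `f[j, i]` generally differ.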

Note that the self-attention module used in the original Transformer model is a special case of the non-local operation with the embedded Gaussian affinity. To see this, take the normalization factor to be $\mathcal{C}\left(\mathbf{x}\right) = \sum_{\forall{j}}f\left(\mathbf{x}_{i}, \mathbf{x}_{j}\right)$. Then, for a given $i$, the weights $\frac{1}{\mathcal{C}\left(\mathbf{x}\right)}f\left(\mathbf{x}_{i}, \mathbf{x}_{j}\right)$ are a softmax along the dimension $j$, and the non-local operation $\mathbf{y}_{i} = \frac{1}{\mathcal{C}\left(\mathbf{x}\right)}\sum_{\forall{j}}f\left(\mathbf{x}_{i}, \mathbf{x}_{j}\right)g\left(\mathbf{x}_{j}\right)$ becomes $\mathbf{y} = \text{softmax}\left(\mathbf{x}^{T}W^{T}_{\theta}W_{\phi}\mathbf{x}\right)g\left(\mathbf{x}\right)$, which is the self-attention form in the Transformer model. This relates the more recent self-attention model to the classic computer vision method of non-local means.
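
The equivalence can be checked numerically. The sketch below is a minimal illustration, assuming a linear $g\left(\mathbf{x}_{j}\right) = W_{g}\mathbf{x}_{j}$ and one position per row of `x`, so the matrix products are transposed relative to the formula in the text. All function names and shapes are hypothetical and chosen for this example only.

```python
import numpy as np

def softmax(z, axis=-1):
    # Subtract the max along the axis for numerical stability.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def non_local_embedded_gaussian(x, w_theta, w_phi, w_g):
    """y_i = (1 / C(x)) * sum_j f(x_i, x_j) g(x_j), with C(x) = sum_j f(x_i, x_j)."""
    f = np.exp((x @ w_theta.T) @ (x @ w_phi.T).T)  # (n, n) affinities
    c = f.sum(axis=1, keepdims=True)               # (n, 1) normalizer per i
    return (f / c) @ (x @ w_g.T)                   # (n, d_out)

def softmax_attention(x, w_theta, w_phi, w_g):
    """The same quantity written as softmax over j of the raw scores."""
    scores = (x @ w_theta.T) @ (x @ w_phi.T).T     # (n, n) dot products
    return softmax(scores, axis=1) @ (x @ w_g.T)   # (n, d_out)

# Small random inputs, used only to compare the two forms.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
w_theta = 0.1 * rng.normal(size=(4, 8))
w_phi = 0.1 * rng.normal(size=(4, 8))
w_g = 0.1 * rng.normal(size=(6, 8))

y_nl = non_local_embedded_gaussian(x, w_theta, w_phi, w_g)
y_attn = softmax_attention(x, w_theta, w_phi, w_g)
```

The two outputs agree up to floating-point error, which can be confirmed with `np.allclose(y_nl, y_attn)`. The softmax form is usually preferable in practice because subtracting the row maximum avoids overflow in the exponential.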

Source: Non-local Neural Networks

Tasks

Task	Papers	Share
Object Detection	3	21.43%
Nutrition	1	7.14%
Ensemble Learning	1	7.14%
Medical Object Detection	1	7.14%
Food recommendation	1	7.14%
Object Localization	1	7.14%
Action Classification	1	7.14%
Action Recognition	1	7.14%
Instance Segmentation	1	7.14%

Categories

Affinity Functions