TSRUc, or Transformation-based Spatial Recurrent Unit c, is a modification of a ConvGRU used in the TriVD-GAN architecture for video generation.
Instead of computing the reset gate $r$ and resetting $h_{t-1}$, the TSRUc computes the parameters of a transformation $\theta$, which we use to warp $h_{t-1}$. The rest of our model is unchanged (with $\hat{h}_{t-1}$ playing the role of the reset hidden state $h'_{t}$ in $c$'s update equation from ConvGRU). The TSRUc module is described by the following equations:
$$ \theta_{h,x} = f\left(h_{t-1}, x_{t}\right) $$
$$ \hat{h}_{t-1} = w\left(h_{t-1}; \theta_{h, x}\right) $$
$$ c = \rho\left(W_{c} \star_{n}\left[\hat{h}_{t-1};x_{t}\right] + b_{c} \right) $$
$$ u = \sigma\left(W_{u} \star_{n}\left[h_{t-1};x_{t}\right] + b_{u} \right) $$
$$ h_{t} = u \odot h_{t-1} + \left(1-u\right) \odot c $$
In these equations, $\sigma$ and $\rho$ are the elementwise sigmoid and ReLU functions, respectively, $\odot$ is elementwise multiplication, and $\star_{n}$ denotes a convolution with a kernel of size $n \times n$. Brackets denote feature concatenation.
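
The five equations above can be sketched as a single recurrent step in NumPy. This is a minimal illustration under stated assumptions, not the TriVD-GAN implementation: the helper names (`conv2d_same`, `shift_warp`, `tsruc_step`), the `params` dictionary layout, and the channel-first array shapes are all hypothetical. In particular, `shift_warp` is a stand-in for $w$ that only translates the hidden state by an integer offset, and $f$ is passed in by the caller, since the text above does not specify either function.

```python
import numpy as np


def conv2d_same(x, W, b):
    """Zero-padded n x n convolution (cross-correlation) that keeps H and W.

    x: (C_in, H, W), W: (C_out, C_in, n, n) with odd n, b: (C_out,).
    """
    c_out, _, n, _ = W.shape
    pad = n // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    height, width = x.shape[1:]
    out = np.zeros((c_out, height, width))
    for i in range(n):
        for j in range(n):
            # Contribution of kernel tap (i, j), mixed across input channels.
            window = xp[:, i:i + height, j:j + width]
            out += np.einsum("oc,chw->ohw", W[:, :, i, j], window)
    return out + b[:, None, None]


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def shift_warp(h, theta):
    """Assumed stand-in for w: translate h by theta = (dy, dx), wrapping around."""
    dy, dx = theta
    return np.roll(h, shift=(dy, dx), axis=(1, 2))


def tsruc_step(h_prev, x, params, f, w=shift_warp):
    """One TSRUc update. h_prev: (C_h, H, W), x: (C_x, H, W)."""
    theta = f(h_prev, x)                      # theta_{h,x} = f(h_{t-1}, x_t)
    h_hat = w(h_prev, theta)                  # warped previous state
    # Candidate c uses the warped state concatenated with the input.
    c = np.maximum(0.0, conv2d_same(np.concatenate([h_hat, x], axis=0),
                                    params["Wc"], params["bc"]))
    # Update gate u uses the unwarped previous state.
    u = sigmoid(conv2d_same(np.concatenate([h_prev, x], axis=0),
                            params["Wu"], params["bu"]))
    return u * h_prev + (1.0 - u) * c         # h_t
```

To use it, call `tsruc_step(h, x, params, f=...)` with `params["Wc"]` and `params["Wu"]` of shape `(C_h, C_h + C_x, n, n)` and biases of shape `(C_h,)`. Note that only the candidate $c$ sees the warped state $\hat{h}_{t-1}$, while the update gate $u$ and the final blend use $h_{t-1}$ directly, matching the equations.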
Source: Transformation-based Adversarial Video Prediction on Large-Scale Data

| Component | Type |
|---|---|
| Convolution | Convolutions |
| Dense Connections | Feedforward Networks |
| Max Pooling | Pooling Operations |
| ReLU | Activation Functions |
| Sigmoid Activation | Activation Functions |