ESPNet is a convolutional neural network for semantic segmentation of high-resolution images under resource constraints. It is built on the efficient spatial pyramid (ESP) module, a convolutional module that is efficient in terms of computation, memory, and power.
Source: ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation
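
To give a rough sense of where the ESP module's savings in computation and memory come from, the sketch below counts convolution weights. It assumes the decomposition described in the ESPNet paper: a 1x1 convolution reduces the input to `d = N / K` channels, and `K` parallel dilated 3x3 convolutions (dilation `2**(k-1)` for branch `k`) each map `d` to `d` channels. The function names, the divisibility requirement, and the example sizes are illustrative assumptions, not part of any ESPNet codebase, and biases are ignored.

```python
from typing import List


def standard_conv_params(m: int, n_out: int, kernel: int = 3) -> int:
    """Weights in a standard kernel x kernel convolution, m -> n_out channels (no bias)."""
    return kernel * kernel * m * n_out


def esp_params(m: int, n_out: int, k_branches: int, kernel: int = 3) -> int:
    """Weights in an ESP-style block (no bias).

    Assumption for this sketch: n_out is divisible by k_branches.
    Reduce: a 1x1 convolution maps m channels to d = n_out / k_branches.
    Split and transform: k_branches parallel dilated kernel x kernel
    convolutions, each mapping d -> d channels.
    """
    if n_out % k_branches != 0:
        raise ValueError("n_out must be divisible by k_branches in this sketch")
    d = n_out // k_branches
    reduce_weights = m * d
    branch_weights = k_branches * kernel * kernel * d * d
    return reduce_weights + branch_weights


def effective_kernel_sizes(k_branches: int, kernel: int = 3) -> List[int]:
    """Effective side length per branch, assuming dilation 2**(k-1) for k = 1..K."""
    return [(kernel - 1) * 2 ** k + 1 for k in range(k_branches)]


if __name__ == "__main__":
    m = n_out = 128   # hypothetical channel counts
    k_branches = 4    # hypothetical number of parallel branches
    print(standard_conv_params(m, n_out))      # 147456
    print(esp_params(m, n_out, k_branches))    # 40960
    print(effective_kernel_sizes(k_branches))  # [3, 5, 9, 17]
```

Under these assumptions the ESP-style block uses 3.6 times fewer weights than a standard 3x3 convolution with the same input and output widths, while its widest branch covers a 17x17 region.
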
Task | Papers | Share |
---|---|---|
Speech Recognition | 10 | 23.26% |
Automatic Speech Recognition (ASR) | 8 | 18.60% |
Semantic Segmentation | 5 | 11.63% |
Speech Separation | 2 | 4.65% |
Decoder | 2 | 4.65% |
Real-Time Semantic Segmentation | 2 | 4.65% |
Robust Speech Recognition | 1 | 2.33% |
Speech Enhancement | 1 | 2.33% |
Spoken Language Understanding | 1 | 2.33% |
Component | Type |
---|---|
1x1 Convolution | Convolutions |
Convolution | Convolutions |
ESP | Image Model Blocks |
Kaiming Initialization | Initialization |
PReLU | Activation Functions |