VOS is a video object segmentation model consisting of two network components. The target appearance model is a light-weight module, learned during inference with fast optimization techniques, that predicts a coarse but robust target segmentation. The segmentation model, trained exclusively offline, refines these coarse scores into high-quality segmentation masks.
Source: Learning Fast and Robust Target Models for Video Object Segmentation
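The two-component design described above can be sketched in a toy NumPy example (this is an illustrative sketch, not the authors' implementation): a light-weight linear target appearance model is fitted at inference time by closed-form ridge regression, a fast least-squares optimization, to produce coarse per-pixel target scores, and a simple threshold stands in for the offline-trained segmentation network that would refine those scores into a mask. All function names and the linear/ridge choice are assumptions made for illustration.

```python
import numpy as np

def fit_target_model(features, labels, lam=1e-2):
    """Online 'target appearance model': closed-form ridge regression,
    w = (X^T X + lam*I)^-1 X^T y (a fast least-squares optimization)."""
    d = features.shape[1]
    A = features.T @ features + lam * np.eye(d)
    b = features.T @ labels
    return np.linalg.solve(A, b)

def coarse_scores(features, w):
    """Coarse but robust target score for each pixel's feature vector."""
    return features @ w

def refine(scores, threshold=0.5):
    """Stand-in for the offline-trained segmentation network that turns
    coarse scores into a binary mask (here just a threshold)."""
    return (scores > threshold).astype(np.uint8)

# Toy "first frame": 200 pixels with 2-D features plus a bias feature;
# the true mask marks pixels whose first feature is positive.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
Phi = np.hstack([X, np.ones((200, 1))])   # append a bias column
y = (X[:, 0] > 0).astype(float)           # ground-truth first-frame mask

w = fit_target_model(Phi, y)              # optimized during "inference"
mask = refine(coarse_scores(Phi, w))      # refined binary mask
```

In the actual method the refinement stage is a learned network processing score maps at multiple resolutions; the threshold here only illustrates the division of labor between the fast online model and the offline-trained refiner.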
| Task | Papers | Share |
|---|---|---|
| Video Object Segmentation | 80 | 21.62% |
| Semantic Segmentation | 79 | 21.35% |
| Video Semantic Segmentation | 78 | 21.08% |
| Semi-Supervised Video Object Segmentation | 32 | 8.65% |
| Optical Flow Estimation | 10 | 2.70% |
| Unsupervised Video Object Segmentation | 7 | 1.89% |
| One-shot visual object segmentation | 7 | 1.89% |
| Visual Object Tracking | 6 | 1.62% |
| Object Detection | 6 | 1.62% |