Bilateral Reference for High-Resolution Dichotomous Image Segmentation
We introduce a novel bilateral reference framework (BiRefNet) for high-resolution dichotomous image segmentation (DIS). It comprises two essential components: the localization module (LM) and the reconstruction module (RM) with our proposed bilateral reference (BiRef). The LM aids object localization using global semantic information. Within the RM, we use BiRef for the reconstruction process, where hierarchical patches of images provide the source reference and gradient maps serve as the target reference. These components collaborate to generate the final predicted maps. We also introduce auxiliary gradient supervision to enhance focus on regions with finer details. Furthermore, we outline practical training strategies tailored for DIS to improve map quality and the training process. To validate the general applicability of our approach, we conduct extensive experiments on four tasks, which show that BiRefNet outperforms task-specific cutting-edge methods across all benchmarks. Our code is available at https://github.com/ZhengPeng7/BiRefNet.
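To make the two kinds of reference concrete, the following is a minimal sketch of how image patches and a gradient map could be derived from a grayscale image. It is an illustration under stated assumptions, not the authors' implementation: the function names `gradient_magnitude` and `split_into_patches`, the central-difference gradient, and the fixed non-overlapping patch grid are all hypothetical choices made here for clarity.

```python
def gradient_magnitude(img):
    """Finite-difference gradient magnitude of a 2-D grayscale image.

    `img` is a list of equal-length rows of floats. This stands in for the
    "gradient map" idea only; the operator used here is an assumption.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Central differences, with indices clamped at the borders.
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def split_into_patches(img, rows, cols):
    """Split an image into rows x cols equal, non-overlapping patches.

    Patches are returned in row-major order. Calling this with several
    grid sizes would give a coarse-to-fine set of patches (an assumption
    about what "hierarchical" could look like in practice).
    """
    h, w = len(img), len(img[0])
    if h % rows or w % cols:
        raise ValueError("image size must be divisible by the patch grid")
    ph, pw = h // rows, w // cols
    return [
        [row[c * pw:(c + 1) * pw] for row in img[r * ph:(r + 1) * ph]]
        for r in range(rows)
        for c in range(cols)
    ]


# Usage: a 4x4 image with a vertical edge down the middle.
image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
grad = gradient_magnitude(image)          # large values along the edge
patches = split_into_patches(image, 2, 2)  # four 2x2 patches
```

In this toy input the gradient map is non-zero only in the two columns next to the edge, which is the kind of fine-detail region the auxiliary gradient supervision is described as emphasizing.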
Results from the Paper
Ranked #1 on RGB Salient Object Detection on HRSOD (using extra training data)
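Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
---|---|---|---|---|---|
Camouflaged Object Segmentation | CAMO | BiRefNet | MAE | 0.023 | # 1 |
Camouflaged Object Segmentation | CAMO | BiRefNet | Weighted F-Measure | 0.888 | # 1 |
Camouflaged Object Segmentation | CAMO | BiRefNet | S-Measure | 0.907 | # 1 |
Camouflaged Object Segmentation | CHAMELEON | BiRefNet | S-measure | 0.928 | # 1 |
Camouflaged Object Segmentation | CHAMELEON | BiRefNet | weighted F-measure | 0.898 | # 1 |
Camouflaged Object Segmentation | CHAMELEON | BiRefNet | MAE | 0.015 | # 1 |
Camouflaged Object Segmentation | COD | BiRefNet | MAE | 0.030 | # 6 |
Camouflaged Object Segmentation | COD | BiRefNet | Weighted F-Measure | 0.888 | # 1 |
Camouflaged Object Segmentation | COD | BiRefNet | S-Measure | 0.907 | # 1 |
RGB Salient Object Detection | DAVIS-S | BiRefNet (DUTS, HRSOD, UHRSD) | S-measure | 0.973 | # 1 |
RGB Salient Object Detection | DAVIS-S | BiRefNet (DUTS, HRSOD, UHRSD) | F-measure | 0.978 | # 1 |
RGB Salient Object Detection | DAVIS-S | BiRefNet (DUTS, HRSOD, UHRSD) | MAE | 0.005 | # 1 |
RGB Salient Object Detection | DAVIS-S | BiRefNet (HRSOD, UHRSD) | S-measure | 0.973 | # 1 |
RGB Salient Object Detection | DAVIS-S | BiRefNet (HRSOD, UHRSD) | F-measure | 0.977 | # 2 |
RGB Salient Object Detection | DAVIS-S | BiRefNet (HRSOD, UHRSD) | MAE | 0.006 | # 2 |
RGB Salient Object Detection | DAVIS-S | BiRefNet (DUTS) | S-measure | 0.946 | # 8 |
RGB Salient Object Detection | DAVIS-S | BiRefNet (DUTS) | F-measure | 0.937 | # 8 |
RGB Salient Object Detection | DAVIS-S | BiRefNet (DUTS) | MAE | 0.012 | # 7 |
Dichotomous Image Segmentation | DIS-TE1 | BiRefNet | max F-Measure | 0.866 | # 2 |
Dichotomous Image Segmentation | DIS-TE1 | BiRefNet | weighted F-measure | 0.829 | # 1 |
Dichotomous Image Segmentation | DIS-TE1 | BiRefNet | MAE | 0.036 | # 1 |
Dichotomous Image Segmentation | DIS-TE1 | BiRefNet | S-Measure | 0.889 | # 1 |
Dichotomous Image Segmentation | DIS-TE1 | BiRefNet | E-measure | 0.917 | # 1 |
Dichotomous Image Segmentation | DIS-TE1 | BiRefNet | HCE | 115 | # 2 |
Dichotomous Image Segmentation | DIS-TE2 | BiRefNet | max F-Measure | 0.906 | # 2 |
Dichotomous Image Segmentation | DIS-TE2 | BiRefNet | weighted F-measure | 0.876 | # 1 |
Dichotomous Image Segmentation | DIS-TE2 | BiRefNet | MAE | 0.031 | # 2 |
Dichotomous Image Segmentation | DIS-TE2 | BiRefNet | S-Measure | 0.913 | # 2 |
Dichotomous Image Segmentation | DIS-TE2 | BiRefNet | E-measure | 0.943 | # 2 |
Dichotomous Image Segmentation | DIS-TE2 | BiRefNet | HCE | 283 | # 2 |
Dichotomous Image Segmentation | DIS-TE3 | BiRefNet | max F-Measure | 0.920 | # 2 |
Dichotomous Image Segmentation | DIS-TE3 | BiRefNet | weighted F-measure | 0.888 | # 2 |
Dichotomous Image Segmentation | DIS-TE3 | BiRefNet | MAE | 0.029 | # 1 |
Dichotomous Image Segmentation | DIS-TE3 | BiRefNet | S-Measure | 0.918 | # 2 |
Dichotomous Image Segmentation | DIS-TE3 | BiRefNet | E-measure | 0.937 | # 4 |
Dichotomous Image Segmentation | DIS-TE3 | BiRefNet | HCE | 617 | # 3 |
Dichotomous Image Segmentation | DIS-TE4 | BiRefNet | max F-Measure | 0.906 | # 2 |
Dichotomous Image Segmentation | DIS-TE4 | BiRefNet | weighted F-measure | 0.866 | # 1 |
Dichotomous Image Segmentation | DIS-TE4 | BiRefNet | MAE | 0.038 | # 1 |
Dichotomous Image Segmentation | DIS-TE4 | BiRefNet | S-Measure | 0.902 | # 3 |
Dichotomous Image Segmentation | DIS-TE4 | BiRefNet | E-measure | 0.940 | # 2 |
Dichotomous Image Segmentation | DIS-TE4 | BiRefNet | HCE | 2830 | # 3 |
Dichotomous Image Segmentation | DIS-VD | BiRefNet | max F-Measure | 0.897 | # 2 |
Dichotomous Image Segmentation | DIS-VD | BiRefNet | weighted F-measure | 0.863 | # 1 |
Dichotomous Image Segmentation | DIS-VD | BiRefNet | MAE | 0.036 | # 1 |
Dichotomous Image Segmentation | DIS-VD | BiRefNet | S-Measure | 0.905 | # 1 |
Dichotomous Image Segmentation | DIS-VD | BiRefNet | E-measure | 0.937 | # 2 |
Dichotomous Image Segmentation | DIS-VD | BiRefNet | HCE | 1039 | # 3 |
RGB Salient Object Detection | DUT-OMRON | BiRefNet (DUTS, HRSOD, UHRSD) | MAE | 0.035 | # 1 |
RGB Salient Object Detection | DUT-OMRON | BiRefNet (DUTS, HRSOD, UHRSD) | F-measure | 0.845 | # 2 |
RGB Salient Object Detection | DUT-OMRON | BiRefNet (DUTS, HRSOD, UHRSD) | S-Measure | 0.881 | # 1 |
RGB Salient Object Detection | DUT-OMRON | BiRefNet (DUTS, HRSOD, UHRSD) | mean F-Measure | 0.838 | # 1 |
RGB Salient Object Detection | DUT-OMRON | BiRefNet (DUTS, HRSOD, UHRSD) | mean E-Measure | 0.908 | # 1 |
RGB Salient Object Detection | DUT-OMRON | BiRefNet (DUTS, HRSOD, UHRSD) | Weighted F-Measure | 0.830 | # 1 |
RGB Salient Object Detection | DUT-OMRON | BiRefNet (HRSOD, UHRSD) | MAE | 0.039 | # 3 |
RGB Salient Object Detection | DUT-OMRON | BiRefNet (HRSOD, UHRSD) | F-measure | 0.831 | # 5 |
RGB Salient Object Detection | DUT-OMRON | BiRefNet (HRSOD, UHRSD) | S-Measure | 0.875 | # 2 |
RGB Salient Object Detection | DUT-OMRON | BiRefNet (HRSOD, UHRSD) | mean F-Measure | 0.817 | # 2 |
RGB Salient Object Detection | DUT-OMRON | BiRefNet (HRSOD, UHRSD) | mean E-Measure | 0.889 | # 2 |
RGB Salient Object Detection | DUT-OMRON | BiRefNet (HRSOD, UHRSD) | Weighted F-Measure | 0.804 | # 3 |
RGB Salient Object Detection | DUT-OMRON | BiRefNet (DUTS) | MAE | 0.035 | # 1 |
RGB Salient Object Detection | DUT-OMRON | BiRefNet (DUTS) | F-measure | 0.810 | # 6 |
RGB Salient Object Detection | DUT-OMRON | BiRefNet (DUTS) | S-Measure | 0.860 | # 5 |
RGB Salient Object Detection | DUT-OMRON | BiRefNet (DUTS) | mean E-Measure | 0.884 | # 3 |
RGB Salient Object Detection | DUTS-TE | BiRefNet (DUTS, HRSOD, UHRSD) | MAE | 0.016 | # 1 |
RGB Salient Object Detection | DUTS-TE | BiRefNet (DUTS, HRSOD, UHRSD) | max F-measure | 0.944 | # 1 |
RGB Salient Object Detection | DUTS-TE | BiRefNet (DUTS, HRSOD, UHRSD) | S-Measure | 0.941 | # 1 |
RGB Salient Object Detection | DUTS-TE | BiRefNet (DUTS, HRSOD, UHRSD) | mean E-Measure | 0.969 | # 1 |
RGB Salient Object Detection | DUTS-TE | BiRefNet (DUTS, HRSOD, UHRSD) | mean F-Measure | 0.933 | # 1 |
RGB Salient Object Detection | DUTS-TE | BiRefNet (DUTS, HRSOD, UHRSD) | Weighted F-Measure | 0.932 | # 1 |
RGB Salient Object Detection | DUTS-TE | BiRefNet (DUTS) | MAE | 0.025 | # 6 |
RGB Salient Object Detection | DUTS-TE | BiRefNet (DUTS) | max F-measure | 0.910 | # 6 |
RGB Salient Object Detection | DUTS-TE | BiRefNet (DUTS) | S-Measure | 0.922 | # 5 |
RGB Salient Object Detection | DUTS-TE | BiRefNet (DUTS) | mean E-Measure | 0.946 | # 3 |
RGB Salient Object Detection | DUTS-TE | BiRefNet (HRSOD, UHRSD) | MAE | 0.020 | # 2 |
RGB Salient Object Detection | DUTS-TE | BiRefNet (HRSOD, UHRSD) | max F-measure | 0.935 | # 2 |
RGB Salient Object Detection | DUTS-TE | BiRefNet (HRSOD, UHRSD) | S-Measure | 0.937 | # 2 |
RGB Salient Object Detection | DUTS-TE | BiRefNet (HRSOD, UHRSD) | mean E-Measure | 0.953 | # 2 |
RGB Salient Object Detection | DUTS-TE | BiRefNet (HRSOD, UHRSD) | mean F-Measure | 0.918 | # 2 |
RGB Salient Object Detection | DUTS-TE | BiRefNet (HRSOD, UHRSD) | Weighted F-Measure | 0.910 | # 2 |
RGB Salient Object Detection | HRSOD | BiRefNet (DUTS, HRSOD, UHRSD) | S-Measure | 0.960 | # 1 |
RGB Salient Object Detection | HRSOD | BiRefNet (DUTS, HRSOD, UHRSD) | max F-Measure | 0.962 | # 1 |
RGB Salient Object Detection | HRSOD | BiRefNet (DUTS, HRSOD, UHRSD) | MAE | 0.011 | # 1 |
RGB Salient Object Detection | HRSOD | BiRefNet (HRSOD, UHRSD) | S-Measure | 0.960 | # 1 |
RGB Salient Object Detection | HRSOD | BiRefNet (HRSOD, UHRSD) | max F-Measure | 0.958 | # 2 |
RGB Salient Object Detection | HRSOD | BiRefNet (HRSOD, UHRSD) | MAE | 0.014 | # 2 |
RGB Salient Object Detection | HRSOD | BiRefNet (DUTS) | S-Measure | 0.943 | # 6 |
RGB Salient Object Detection | HRSOD | BiRefNet (DUTS) | max F-Measure | 0.934 | # 7 |
RGB Salient Object Detection | HRSOD | BiRefNet (DUTS) | MAE | 0.021 | # 8 |
Camouflaged Object Segmentation | NC4K | BiRefNet | S-measure | 0.915 | # 1 |
Camouflaged Object Segmentation | NC4K | BiRefNet | weighted F-measure | 0.890 | # 1 |
Camouflaged Object Segmentation | NC4K | BiRefNet | MAE | 0.023 | # 1 |
RGB Salient Object Detection | UHRSD | BiRefNet (DUTS) | S-Measure | 0.922 | # 7 |
RGB Salient Object Detection | UHRSD | BiRefNet (DUTS) | max F-Measure | 0.928 | # 7 |
RGB Salient Object Detection | UHRSD | BiRefNet (DUTS) | MAE | 0.035 | # 7 |
RGB Salient Object Detection | UHRSD | BiRefNet (DUTS, HRSOD, UHRSD) | S-Measure | 0.952 | # 3 |
RGB Salient Object Detection | UHRSD | BiRefNet (DUTS, HRSOD, UHRSD) | max F-Measure | 0.960 | # 1 |
RGB Salient Object Detection | UHRSD | BiRefNet (DUTS, HRSOD, UHRSD) | MAE | 0.016 | # 1 |
RGB Salient Object Detection | UHRSD | BiRefNet (HRSOD, UHRSD) | S-Measure | 0.953 | # 1 |
RGB Salient Object Detection | UHRSD | BiRefNet (HRSOD, UHRSD) | max F-Measure | 0.960 | # 1 |
RGB Salient Object Detection | UHRSD | BiRefNet (HRSOD, UHRSD) | MAE | 0.019 | # 2 |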
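
For readers unfamiliar with the MAE rows above: in segmentation benchmarks, MAE is commonly computed as the mean absolute difference between the predicted map and the ground-truth mask, both scaled to [0, 1], so lower values are better. The sketch below illustrates that definition on a single image. The function name `mean_absolute_error` is a hypothetical helper, and the benchmark's official evaluation code may differ in details such as resizing or averaging across images.

```python
def mean_absolute_error(pred, gt):
    """Mean absolute error between a predicted map and a ground-truth mask.

    Both inputs are lists of equal-length rows with values in [0, 1].
    The result is averaged over all pixels of one image.
    """
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("prediction and ground truth must have the same shape")
    total = sum(
        abs(p - g)
        for pred_row, gt_row in zip(pred, gt)
        for p, g in zip(pred_row, gt_row)
    )
    count = sum(len(row) for row in pred)
    return total / count


# Usage: one uncertain pixel (0.5 where the mask is 1) out of four.
prediction = [[1.0, 0.5], [0.0, 0.0]]
mask = [[1.0, 1.0], [0.0, 0.0]]
score = mean_absolute_error(prediction, mask)
```

A dataset-level score would then typically be the average of these per-image values.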