1 code implementation • 6 Aug 2023 • Zheng Ma, Mianzhi Pan, Wenhan Wu, Kanzhi Cheng, Jianbing Zhang, ShuJian Huang, Jiajun Chen
Experiments on our proposed datasets demonstrate that popular VLMs perform worse in the food domain than in the general domain.
no code implementations • 18 Oct 2022 • Zheng Ma, Shi Zong, Mianzhi Pan, Jianbing Zhang, ShuJian Huang, Xinyu Dai, Jiajun Chen
In recent years, vision and language pre-training (VLP) models have advanced the state of the art in a variety of cross-modal downstream tasks.