1 code implementation • 19 May 2023 • Suhyeon Lee, Won Jun Kim, Jinho Chang, Jong Chul Ye
Many recent works have focused on training adapter networks that serve as an information bridge between image processing networks and LLMs. However, to realize the full reasoning potential of LLMs on visual information, visual and language features should presumably be allowed to interact more freely.