Vision and Language Pre-Trained Models

FashionCLIP is a CLIP model fine-tuned on fashion data (more than 800K image-text pairs). It is the first foundation model for fashion.

Source: Contrastive language and vision learning of general fashion concepts
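
Since the one associated paper is tagged with the Retrieval task below, a short usage example may help. The following is a minimal sketch of zero-shot caption scoring with FashionCLIP, assuming the publicly released Hugging Face checkpoint patrickjohncyh/fashion-clip and the standard transformers CLIP API; the image file name and candidate captions are hypothetical.

```python
# Minimal sketch: score candidate captions against a product photo with
# FashionCLIP. Assumes the "patrickjohncyh/fashion-clip" checkpoint on the
# Hugging Face Hub (not part of this page) and a local image file.
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("patrickjohncyh/fashion-clip")
processor = CLIPProcessor.from_pretrained("patrickjohncyh/fashion-clip")

image = Image.open("product.jpg")  # hypothetical product photo
captions = [
    "a red evening dress",
    "a pair of white running shoes",
    "a blue denim jacket",
]

# Encode the image and all candidate captions in one batch.
inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds image-to-text similarity scores; softmax ranks captions.
probs = outputs.logits_per_image.softmax(dim=-1)
for caption, p in zip(captions, probs[0].tolist()):
    print(f"{p:.3f}  {caption}")
```

The same encoders work in the other direction: embed a catalog of product images once, then rank them against a free-text query for text-to-image retrieval.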

Tasks

Task        Papers   Share
Retrieval   1        100.00%
