Zero-shot 3D Point Cloud Classification
6 papers with code • 2 benchmarks • 2 datasets
Most implemented papers
PointCLIP: Point Cloud Understanding by CLIP
PointCLIP adds an inter-view adapter that better extracts the global feature and adaptively fuses the few-shot knowledge learned from 3D into CLIP, which is pre-trained in 2D.
PointCLIP V2: Prompting CLIP and GPT for Powerful 3D Open-world Learning
In this paper, we first combine CLIP and GPT into a unified 3D open-world learner, named PointCLIP V2, which fully unleashes their potential for zero-shot 3D classification, segmentation, and detection.
Uni3D: Exploring Unified 3D Representation at Scale
Scaling up representations for images or text has been extensively investigated in the past few years and has led to revolutions in learning vision and language.
CLIP2Point: Transfer CLIP to Point Cloud Classification with Image-Depth Pre-training
We propose CLIP2Point, an image-depth pre-training method that uses contrastive learning to transfer CLIP to the 3D domain and adapts it to point cloud classification.
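As a rough illustration of contrastive image-depth pre-training, the sketch below computes a generic symmetric InfoNCE-style loss between paired image and depth embeddings. It is not the CLIP2Point objective itself: the function names, the temperature of 0.07, and the toy 2D embeddings are all assumptions made for the example, and a real setup would obtain the embeddings from image and depth encoders.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(image_feats, depth_feats, temperature=0.07):
    """Hypothetical helper: InfoNCE averaged over image->depth and depth->image.

    image_feats[i] and depth_feats[i] are assumed to describe the same object;
    every other pairing in the batch is treated as a negative.
    """
    img = [l2_normalize(v) for v in image_feats]
    dep = [l2_normalize(v) for v in depth_feats]
    # Pairwise cosine similarities, scaled by the temperature.
    logits = [
        [sum(a * b for a, b in zip(i_vec, d_vec)) / temperature for d_vec in dep]
        for i_vec in img
    ]
    logits_transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (
        diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits_transposed)
    )


# Toy batch: aligned pairs should score a much lower loss than shuffled pairs.
aligned_loss = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                          [[1.0, 0.0], [0.0, 1.0]])
shuffled_loss = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                           [[0.0, 1.0], [1.0, 0.0]])
```

Minimizing a loss of this kind pulls each depth embedding toward the embedding of its paired image and pushes it away from the other images in the batch, which is the general mechanism by which a depth encoder can be aligned with a pre-trained image encoder.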
Beyond First Impressions: Integrating Joint Multi-modal Cues for Comprehensive 3D Representation
Insufficient synergy neglects the idea that a robust 3D representation should align with the joint vision-language space, rather than independently aligning with each modality.
OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation
When integrated with powerful 2D open-world models such as ODISE and GroundingDINO, excellent results were observed on open-vocabulary instance segmentation.