no code implementations • 6 Jun 2024 • Xinhang Liu, Yu-Wing Tai, Chi-Keung Tang, Pedro Miraldo, Suhas Lohit, Moitreya Chatterjee
Extensions of Neural Radiance Fields (NeRFs) to model dynamic scenes have enabled their near photo-realistic, free-viewpoint rendering.
1 code implementation • 25 Apr 2024 • Haomiao Ni, Bernhard Egger, Suhas Lohit, Anoop Cherian, Ye Wang, Toshiaki Koike-Akino, Sharon X. Huang, Tim K. Marks
To guide video generation with the additional image input, we propose a "repeat-and-slide" strategy that modulates the reverse denoising process, allowing the frozen diffusion model to synthesize a video frame-by-frame starting from the provided image.
no code implementations • 17 Apr 2024 • Deepti Hegde, Suhas Lohit, Kuan-Chuan Peng, Michael J. Jones, Vishal M. Patel
To this end, we propose CLIX$^\text{3D}$, a multimodal fusion and supervised contrastive learning framework for 3D object detection that performs alignment of object features from same-class samples of different domains while pushing the features from different classes apart.
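The supervised contrastive objective mentioned above can be illustrated with a minimal numpy sketch (not the paper's implementation; embeddings, labels, and the temperature value are illustrative): same-class embeddings are pulled together and all others pushed apart.

```python
import numpy as np

def supervised_contrastive_loss(z, labels, tau=0.1):
    """Supervised contrastive loss: for each anchor, pull embeddings with
    the same label together and push all other embeddings apart."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # L2-normalize
    sim = z @ z.T / tau                                 # pairwise similarities
    n = len(labels)
    loss, count = 0.0, 0
    for i in range(n):
        pos = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not pos:
            continue
        denom = np.sum([np.exp(sim[i, k]) for k in range(n) if k != i])
        for p in pos:
            loss += -np.log(np.exp(sim[i, p]) / denom)
            count += 1
    return loss / count

# Aligned same-class features give a lower loss than misaligned ones.
labels = np.array([0, 0, 1, 1])
aligned = np.array([[1, 0], [1, 0.05], [0, 1], [0.05, 1]], dtype=float)
mixed   = np.array([[1, 0], [0, 1], [1, 0.05], [0.05, 1]], dtype=float)
print(supervised_contrastive_loss(aligned, labels) <
      supervised_contrastive_loss(mixed, labels))   # True
```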
no code implementations • 17 Apr 2024 • Deepti Hegde, Suhas Lohit, Kuan-Chuan Peng, Michael J. Jones, Vishal M. Patel
This can enable improved performance in downstream tasks that are equivariant to such transformations.
no code implementations • 23 Feb 2024 • Sourya Basu, Suhas Lohit, Matthew Brand
Recent work by Finzi et al. (2021) directly solves the equivariance constraint for arbitrary matrix groups to obtain equivariant MLPs (EMLPs).
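The constraint-solving idea of Finzi et al. can be sketched for a toy case (the group, representation, and dimensions here are illustrative, not from the paper): an equivariant weight matrix must satisfy $W\rho(g) = \rho(g)W$, which vectorizes to a linear system whose nullspace gives the equivariant weight basis.

```python
import numpy as np

# Solve W @ rho = rho @ W for the generator of C4 acting on R^2.
# Column-major vectorization turns the constraint into
# (rho^T kron I - I kron rho) vec(W) = 0; the SVD nullspace is the basis.
rho = np.array([[0., -1.], [1., 0.]])          # rotation by 90 degrees
I2 = np.eye(2)
C = np.kron(rho.T, I2) - np.kron(I2, rho)      # constraint on vec(W)

_, s, Vt = np.linalg.svd(C)
basis = [Vt[i].reshape(2, 2, order="F") for i in range(4) if s[i] < 1e-10]
print(len(basis))                              # 2 equivariant basis matrices

W = 1.5 * basis[0] + 0.3 * basis[1]            # any combination is equivariant
print(np.allclose(W @ rho, rho @ W))           # True
```

Commuting with the generator implies commuting with every power of it, so checking one generator suffices for a cyclic group.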
no code implementations • ICCV 2023 • Nithin Gopalakrishnan Nair, Anoop Cherian, Suhas Lohit, Ye Wang, Toshiaki Koike-Akino, Vishal M. Patel, Tim K. Marks
To this end, and capitalizing on the powerful fine-grained generative control offered by the recent diffusion-based generative models, we introduce Steered Diffusion, a generalized framework for photorealistic zero-shot conditional image generation using a diffusion model trained for unconditional generation.
no code implementations • 28 Sep 2023 • Manish Sharma, Moitreya Chatterjee, Kuan-Chuan Peng, Suhas Lohit, Michael Jones
We first pretrain these factor matrices on the RGB modality, for which ample training data is assumed to exist, and then add only a few trainable parameters for training on the IR modality to avoid overfitting, while encouraging these parameters to capture cues complementary to those learned from RGB alone.
no code implementations • 25 Sep 2023 • Zachariah Carmichael, Suhas Lohit, Anoop Cherian, Michael Jones, Walter Scheirer
Prototypical part neural networks (ProtoPartNNs), namely PROTOPNET and its derivatives, are an intrinsically interpretable approach to machine learning.
1 code implementation • CVPR 2023 • Anoop Cherian, Kuan-Chuan Peng, Suhas Lohit, Kevin A. Smith, Joshua B. Tenenbaum
To answer this question, we propose SMART: a Simple Multimodal Algorithmic Reasoning Task and the associated SMART-101 dataset, for evaluating the abstraction, deduction, and generalization abilities of neural networks in solving visuo-linguistic puzzles designed specifically for children in the 6-8 age group.
no code implementations • 8 Sep 2022 • Sk Miraj Ahmed, Suhas Lohit, Kuan-Chuan Peng, Michael J. Jones, Amit K. Roy-Chowdhury
In such cases, transferring knowledge from a neural network trained on a well-labeled large dataset in the source modality (RGB) to a neural network that works on a target modality (depth, infrared, etc.) can be highly beneficial.
1 code implementation • 19 Oct 2021 • David W. Romero, Suhas Lohit
Frequently, transformations occurring in data can be better represented by a subset of a group than by a group as a whole, e.g., rotations in $[-90^{\circ}, 90^{\circ}]$.
no code implementations • 29 Sep 2021 • Huan Wang, Suhas Lohit, Michael Jeffrey Jones, Yun Fu
Compared to existing state-of-the-art methods that employ more advanced distillation losses, we achieve new state-of-the-art accuracy using the original KD loss combined with stronger augmentation schemes.
no code implementations • 8 Dec 2020 • Suhas Lohit, Shubhendu Trivedi
These newly proposed convolutional layers naturally extend the notion of convolution to functions on the unit sphere $S^2$ and the group of rotations $SO(3)$, and are equivariant to 3D rotations.
no code implementations • 7 Dec 2020 • Suhas Lohit, Michael Jones
Model compression methods are important to allow for easier deployment of deep learning models in compute-, memory-, and energy-constrained environments such as mobile phones.
1 code implementation • 5 Dec 2020 • Huan Wang, Suhas Lohit, Mike Jones, Yun Fu
What makes a "good" data augmentation (DA) in knowledge distillation (KD)?
no code implementations • 5 Dec 2020 • Huan Wang, Suhas Lohit, Michael Jones, Yun Fu
We add loss terms for training the student that measure the dissimilarity between student and teacher outputs of the auxiliary classifiers.
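A distillation term of this kind can be sketched as a temperature-softened KL divergence summed over (teacher, student) output pairs; this is a generic Hinton-style stand-in, and the logits and temperature below are illustrative, not from the paper.

```python
import numpy as np

def softmax(x, tau=1.0):
    e = np.exp((x - x.max()) / tau)
    return e / e.sum()

def kd_loss(student_logits, teacher_logits, tau=4.0):
    """Distillation term: KL between temperature-softened teacher and
    student distributions, scaled by tau^2."""
    p = softmax(teacher_logits, tau)
    q = softmax(student_logits, tau)
    return tau**2 * np.sum(p * (np.log(p) - np.log(q)))

# Sum a dissimilarity term over each pair of auxiliary-classifier outputs.
teacher_outs = [np.array([2.0, 0.5, -1.0]), np.array([1.5, 1.0, -0.5])]
student_outs = [np.array([1.8, 0.6, -0.9]), np.array([0.2, 0.1, 0.0])]
total = sum(kd_loss(s, t) for s, t in zip(student_outs, teacher_outs))
print(total > 0)   # KL is non-negative; zero only for a perfect match
```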
no code implementations • 3 Dec 2020 • Suhas Lohit, Rushil Anirudh, Pavan Turaga
Motion capture (mocap) and time-of-flight based sensing of human actions are becoming increasingly popular modalities to perform robust activity analysis.
1 code implementation • 18 Jun 2020 • Rushil Anirudh, Suhas Lohit, Pavan Turaga
In this paper, we propose the generative patch prior (GPP) that defines a generative prior for compressive image recovery, based on patch-manifold models.
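The core of the patch-prior idea can be sketched in a linear toy setting (a random linear map stands in for a trained patch generator, and all dimensions are illustrative): the image is constrained to the generator's range, which makes an underdetermined compressive system solvable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a 64-pixel "image" made of 4 patches of 16 pixels, each the
# output of a patch generator G, observed through m < n measurements.
d_patch, d_latent, n_patches = 16, 4, 4
n_pix = d_patch * n_patches
m = 32                                       # compressive: m < n_pix

G = rng.standard_normal((d_patch, d_latent))
A = rng.standard_normal((m, n_pix))

z_true = rng.standard_normal(d_latent * n_patches)
B = np.kron(np.eye(n_patches), G)            # stacked latents -> pixels
x_true = B @ z_true
y = A @ x_true

# Recovery searches only over the generator's range. With this linear
# stand-in that is a least-squares problem; a real (nonlinear) generator
# would require iterative optimization of the latent codes.
z_hat = np.linalg.lstsq(A @ B, y, rcond=None)[0]
x_hat = B @ z_hat
print(np.allclose(x_hat, x_true))            # True: the prior resolves m < n
```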
1 code implementation • CVPR 2019 • Suhas Lohit, Qiao Wang, Pavan Turaga
We call this a temporal transformer network (TTN).
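The core operation of such a network, resampling a sequence along a learned monotone time warp, can be sketched as follows (a hand-rolled numpy version with illustrative parameterization, not the paper's code): positive increments guarantee monotonicity, and linear interpolation makes the warp differentiable.

```python
import numpy as np

def warp_sequence(seq, increments):
    """Resample seq (shape (T, d)) along a monotone warp built from
    positive increments, then read off values by linear interpolation."""
    T = len(seq)
    inc = np.exp(increments)                 # positivity -> monotone warp
    gamma = np.concatenate([[0.0], np.cumsum(inc)])
    gamma = gamma / gamma[-1] * (T - 1)      # warp maps [0, T-1] onto itself
    lo = np.floor(gamma).astype(int)
    hi = np.minimum(lo + 1, T - 1)
    w = gamma - lo
    return (1 - w)[:, None] * seq[lo] + w[:, None] * seq[hi]

seq = np.arange(10, dtype=float).reshape(-1, 1)   # a simple ramp signal
identity = warp_sequence(seq, np.zeros(9))        # uniform increments
print(np.allclose(identity, seq))                 # True: identity warp
```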
no code implementations • 8 Sep 2018 • Suhas Lohit, Rajhans Singh, Kuldeep Kulkarni, Pavan Turaga
Using standard datasets, we demonstrate that, when tested over a range of MRs, a rate-adaptive network provides high-quality reconstruction over the entire range, yielding up to about 15 dB improvement over previous methods in which the network is valid for only one MR. We demonstrate the effectiveness of our approach for sample-efficient object tracking, where video frames are acquired at dynamically varying MRs, and we also extend this algorithm to learn the measurement operator jointly with image recognition networks.
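The rate-adaptive idea can be sketched with a toy measurement model (a pseudo-inverse stands in for the learned recovery network, and dimensions are illustrative): one ordered measurement matrix serves every rate by keeping only its first m rows, and recovery quality improves as m grows.

```python
import numpy as np

rng = np.random.default_rng(0)

# One ordered measurement matrix Phi, reused at every measurement rate by
# truncating to its first m rows; a single recovery pipeline handles all m.
n = 64
Phi = rng.standard_normal((128, n)) / np.sqrt(n)
x = rng.standard_normal(n)

def reconstruct_error(m):
    y = Phi[:m] @ x                          # measure at rate m/n
    x_hat = np.linalg.pinv(Phi[:m]) @ y      # min-norm recovery stand-in
    return np.linalg.norm(x_hat - x)

# The first 16 rows are a subset of the first 48, so the higher-rate row
# space contains the lower-rate one and the error can only shrink.
print(reconstruct_error(48) < reconstruct_error(16))   # True
```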
no code implementations • 8 Jun 2018 • Li-Chi Huang, Kuldeep Kulkarni, Anik Jha, Suhas Lohit, Suren Jayasuriya, Pavan Turaga
Visual Question Answering (VQA) is a complex semantic task requiring both natural language processing and visual recognition.
no code implementations • 30 Aug 2017 • Suhas Lohit, Pavan Turaga
Non-Euclidean constraints are inherent in many kinds of data in computer vision and machine learning, typically as a result of specific invariance requirements that need to be respected during high-level inference.
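A standard example of enforcing such a constraint is projecting an unconstrained 3x3 estimate onto the rotation group $SO(3)$ via the SVD (a textbook construction, shown here as a generic sketch rather than this paper's method):

```python
import numpy as np

def project_to_so3(M):
    """Nearest rotation matrix (in Frobenius norm) to a 3x3 matrix:
    orthogonalize via SVD and fix the determinant sign."""
    U, _, Vt = np.linalg.svd(M)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    return U @ D @ Vt

rng = np.random.default_rng(0)
R = project_to_so3(rng.standard_normal((3, 3)))
print(np.allclose(R @ R.T, np.eye(3)))    # True: orthogonal
print(np.isclose(np.linalg.det(R), 1.0))  # True: proper rotation
```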
no code implementations • 15 Aug 2017 • Suhas Lohit, Kuldeep Kulkarni, Ronan Kerviche, Pavan Turaga, Amit Ashok
We show empirically that our algorithm yields reconstructions with higher PSNRs compared to iterative algorithms at low measurement rates and in the presence of measurement noise.
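The PSNR metric quoted above is standard; for reference, a minimal implementation (the example image and peak value are illustrative):

```python
import numpy as np

def psnr(x, x_hat, peak=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((x - x_hat) ** 2)
    return 10.0 * np.log10(peak**2 / mse)

img = np.zeros((8, 8))
noisy = img + 0.1        # constant error of 0.1 on a [0, 1]-range image
print(psnr(img, noisy))  # ~20 dB
```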
1 code implementation • CVPR 2016 • Kuldeep Kulkarni, Suhas Lohit, Pavan Turaga, Ronan Kerviche, Amit Ashok
The intermediate reconstruction is fed into an off-the-shelf denoiser to obtain the final reconstructed image.