no code implementations • 3 Apr 2024 • Cheng Zhao, Su Sun, Ruoyu Wang, Yuliang Guo, Jun-Jun Wan, Zhou Huang, Xinyu Huang, Yingjie Victor Chen, Liu Ren
Most 3D Gaussian Splatting (3D-GS) based methods for urban scenes initialize 3D Gaussians directly with 3D LiDAR points, which not only underutilizes LiDAR data capabilities but also overlooks the potential advantages of fusing LiDAR with camera data.
no code implementations • 3 Apr 2024 • Su Sun, Cheng Zhao, Yuliang Guo, Ruoyu Wang, Xinyu Huang, Yingjie Victor Chen, Liu Ren
The 3D Inpainter with abstract representation at coarse levels is trained offline using various scenes to complete occluded surfaces.
1 code implementation • 29 Mar 2024 • Abhinav Kumar, Yuliang Guo, Xinyu Huang, Liu Ren, Xiaoming Liu
We argue that the cause of failure is the sensitivity of depth regression losses to noise of larger objects.
3D Object Detection, 3D Object Detection From Monocular Images
no code implementations • 23 Mar 2024 • Yuliang Guo, Abhinav Kumar, Cheng Zhao, Ruoyu Wang, Xinyu Huang, Liu Ren
Monocular 3D reconstruction for categorical objects heavily relies on accurately perceiving each object's pose.
2 code implementations • 25 Jan 2024 • Tianhe Ren, Shilong Liu, Ailing Zeng, Jing Lin, Kunchang Li, He Cao, Jiayu Chen, Xinyu Huang, Yukang Chen, Feng Yan, Zhaoyang Zeng, Hao Zhang, Feng Li, Jie Yang, Hongyang Li, Qing Jiang, Lei Zhang
We introduce Grounded SAM, which uses Grounding DINO as an open-set object detector to combine with the segment anything model (SAM).
1 code implementation • NeurIPS 2023 • Yunhao Ge, Hong-Xing Yu, Cheng Zhao, Yuliang Guo, Xinyu Huang, Liu Ren, Laurent Itti, Jiajun Wu
A major challenge in monocular 3D object detection is the limited diversity and quantity of objects in real datasets.
no code implementations • 20 Nov 2023 • Shisheng Hu, Jie Gao, Xinyu Huang, Mushu Li, Kaige Qu, Conghao Zhou, Xuemin Shen
A DT of the ISAC device is constructed to predict the impact of potential decisions on the long-term computation cost of the server, based on which the decisions are made with closed-form formulas.
2 code implementations • 23 Oct 2023 • Xinyu Huang, Yi-Jie Huang, Youcai Zhang, Weiwei Tian, Rui Feng, Yuejie Zhang, Yanchun Xie, Yaqian Li, Lei Zhang
Specifically, for predefined commonly used tag categories, RAM++ showcases 10.2 mAP and 15.4 mAP enhancements over CLIP on OpenImages and ImageNet.
no code implementations • 9 Jun 2023 • Xinyu Huang, Wen Wu, Xuemin Sherman Shen
In this paper, we propose a digital twin (DT)-assisted resource demand prediction scheme to enhance prediction accuracy for multicast short video streaming.
2 code implementations • 6 Jun 2023 • Youcai Zhang, Xinyu Huang, Jinyu Ma, Zhaoyang Li, Zhaochuan Luo, Yanchun Xie, Yuzhuo Qin, Tong Luo, Yaqian Li, Shilong Liu, Yandong Guo, Lei Zhang
We are releasing the RAM at https://recognize-anything.github.io/ to foster the advancements of large models in computer vision.
2 code implementations • 10 Mar 2023 • Xinyu Huang, Youcai Zhang, Jinyu Ma, Weiwei Tian, Rui Feng, Yuejie Zhang, Yaqian Li, Yandong Guo, Lei Zhang
This paper presents Tag2Text, a vision language pre-training (VLP) framework, which introduces image tagging into vision-language models to guide the learning of visual-linguistic features.
no code implementations • 15 Feb 2023 • Yuting Fang, Stuart T. Johnston, Matt Faria, Xinyu Huang, Andrew W. Eckford, Jamie Evans
Our results show that the activation probability at the B-NM increases as this B-NM is located closer to the center of the B-NM population and the aggregate absorption rate of the drug molecules non-linearly increases as the population density increases.
no code implementations • 13 Nov 2022 • Xinyu Huang, Mushu Li, Wen Wu, Conghao Zhou, Xuemin Sherman Shen
Particularly, two DTs are constructed for emulating the cloud-edge collaborative transcoding process by analyzing spatial-temporal information of individual videos and transcoding configurations of transcoding queues, respectively.
1 code implementation • 12 Jul 2022 • Xinyu Huang, Youcai Zhang, Ying Cheng, Weiwei Tian, RuiWei Zhao, Rui Feng, Yuejie Zhang, Yaqian Li, Yandong Guo, Xiaobo Zhang
However, the image-text pairs co-occurrent on the Internet typically lack explicit alignment information, which is suboptimal for VLP.
no code implementations • 7 Jun 2022 • HaoYuan Chen, Chen Li, Xiaoyan Li, Md Mamunur Rahaman, Weiming Hu, Yixin Li, Wanli Liu, Changhao Sun, Hongzan Sun, Xinyu Huang, Marcin Grzegorzek
In addition, we conducted an ablation experiment and an interchangeability experiment to verify the ability and interchangeability of the three channels.
no code implementations • 25 May 2022 • Weiming Hu, HaoYuan Chen, Wanli Liu, Xiaoyan Li, Hongzan Sun, Xinyu Huang, Marcin Grzegorzek, Chen Li
Ensemble learning is a way to improve the accuracy of algorithms, and finding multiple learning models with complementarity types is the basis of ensemble learning.
no code implementations • 17 May 2022 • Haiqing Zhang, Chen Li, Shiliang Ai, HaoYuan Chen, Yuchao Zheng, Yixin Li, Xiaoyan Li, Hongzan Sun, Xinyu Huang, Marcin Grzegorzek
The gold standard for gastric cancer detection is gastric histopathological image analysis, but there are certain drawbacks in the existing histopathological detection and diagnosis.
Histopathological Image Classification, Image Classification
no code implementations • 9 May 2022 • Xinyu Huang, Conghao Zhou, Wen Wu, Mushu Li, Huaqing Wu, Xuemin Shen
In this paper, we present a digital twin (DT)-assisted adaptive video streaming scheme to enhance personalized quality-of-experience (PQoE).
no code implementations • 18 Apr 2022 • Shuojia Zou, Chen Li, Hongzan Sun, Peng Xu, Jiawei Zhang, Pingli Ma, YuDong Yao, Xinyu Huang, Marcin Grzegorzek
The detection of tiny objects in microscopic videos is a challenging problem, especially in large-scale experiments.
no code implementations • 18 Apr 2022 • Yuchao Zheng, Chen Li, Xiaomin Zhou, HaoYuan Chen, Hao Xu, Yixin Li, Haiqing Zhang, Xiaoyan Li, Hongzan Sun, Xinyu Huang, Marcin Grzegorzek
Method: This paper proposes a deep ensemble model based on image-level labels for the binary classification of benign and malignant lesions of breast histopathological images.
1 code implementation • CVPR 2022 • Yuyan Li, Yuliang Guo, Zhixin Yan, Xinyu Huang, Ye Duan, Liu Ren
In this paper, we propose a 360 monocular depth estimation pipeline, OmniFusion, to tackle the spherical distortion issue.
Ranked #6 on Depth Estimation on Stanford2D3D Panoramic
no code implementations • 17 Feb 2022 • Weiming Hu, Chen Li, Xiaoyan Li, Md Mamunur Rahaman, Yong Zhang, HaoYuan Chen, Wanli Liu, YuDong Yao, Hongzan Sun, Ning Xu, Xinyu Huang, Marcin Grzegorzek
Traditional machine learning methods achieve a maximum accuracy of 76.02%, and deep learning methods achieve a maximum accuracy of 95.37%.
no code implementations • 14 Feb 2022 • Jian Wu, Wanli Liu, Chen Li, Tao Jiang, Islam Mohammad Shariful, Hongzan Sun, Xiaoqi Li, Xintong Li, Xinyu Huang, Marcin Grzegorzek
Image analysis technology is used to address the shortcomings of traditional manual methods in the analysis of disease, wastewater treatment, and environmental change monitoring, and convolutional neural networks (CNNs) play an important role in microscopic image analysis.
2 code implementations • 13 Dec 2021 • Youcai Zhang, Yuhao Cheng, Xinyu Huang, Fei Wen, Rui Feng, Yaqian Li, Yandong Guo
Multi-label learning in the presence of missing labels (MLML) is a challenging problem.
no code implementations • 13 Apr 2021 • Xintong Li, Weiming Hu, Chen Li, Tao Jiang, Hongzan Sun, Xiaoyan Li, Xinyu Huang, Marcin Grzegorzek
Finally, the application prospect of the analytical method in this field is discussed.
no code implementations • 5 Jan 2021 • Tantan Zhao, Lijun He, Xinyu Huang, Fan Li
In this paper, by considering the interaction between video encoding and edge caching, we investigate the quality of experience (QoE)-driven cross-layer optimization of secure video transmission over the wireless backhaul link in cloud-edge collaborative networks.
Multimedia
no code implementations • 16 Oct 2020 • Xinyu Huang, Lijun He, Xing Chen, Liejun Wang, Fan Li
In this paper, we propose a joint task type and vehicle speed-aware task offloading and resource allocation strategy to decrease the vehicle's energy cost for executing tasks and increase the revenue of the vehicle for processing tasks within the delay constraint.
1 code implementation • 23 Jan 2019 • Wei Li, Chengwei Pan, Rong Zhang, Jiaping Ren, Yuexin Ma, Jin Fang, Feilong Yan, Qichuan Geng, Xinyu Huang, Huajun Gong, Weiwei Xu, Guoping Wang, Dinesh Manocha, Ruigang Yang
Our augmented approach combines the flexibility in a virtual environment (e.g., vehicle movements) with the richness of the real world to allow effective simulation of anywhere on earth.
no code implementations • 27 Nov 2018 • Qichuan Geng, Hong Zhang, Xinyu Huang, Sen Wang, Feixiang Lu, Xinjing Cheng, Zhong Zhou, Ruigang Yang
As it is labor-intensive to annotate semantic parts on real street views, we propose a specific approach to implicitly transfer part features from synthesized images to real street views.
1 code implementation • 8 Sep 2018 • Yan Xia, Yang Zhang, Dingfu Zhou, Xinyu Huang, Cheng Wang, Ruigang Yang
Then, the image together with the retrieved shape model is fed into the proposed network to generate the fine-grained 3D point cloud.
no code implementations • 1 Aug 2018 • Qichuan Geng, Xinyu Huang, Zhong Zhou, Ruigang Yang
Confusing classes that are ubiquitous in the real world often degrade performance for many vision-related applications like object detection, classification, and segmentation.
2 code implementations • 16 Mar 2018 • Xinyu Huang, Peng Wang, Xinjing Cheng, Dingfu Zhou, Qichuan Geng, Ruigang Yang
In this paper, we provide a sensor fusion scheme integrating camera videos, consumer-grade motion sensors (GPS/IMU), and a 3D semantic map in order to achieve robust self-localization and semantic segmentation for autonomous driving.
no code implementations • 26 Oct 2016 • Yajie Zhao, Qingguo Xu, Xinyu Huang, Ruigang Yang
The main purpose of this paper is to synthesize realistic face images without occlusions based on the images captured by these cameras.