Search Results for author: Bo Li

Found 589 papers, 249 papers with code

Profanity-Avoiding Training Framework for Seq2seq Models with Certified Robustness

no code implementations • EMNLP 2021 • Hengtong Zhang, Tianhang Zheng, Yaliang Li, Jing Gao, Lu Su, Bo Li

To address this problem, we propose a training framework with certified robustness to eliminate the causes that trigger the generation of profanity.

Dialogue Generation Style Transfer

Alibaba Speech Translation Systems for IWSLT 2018

no code implementations • IWSLT (EMNLP) 2018 • Nguyen Bach, Hongjie Chen, Kai Fan, Cheung-Chi Leung, Bo Li, Chongjia Ni, Rong Tong, Pei Zhang, Boxing Chen, Bin Ma, Fei Huang

This work describes the En→De Alibaba speech translation system developed for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2018.

Sentence Translation

Relative Pose Estimation of Calibrated Cameras with Known SE(3) Invariants

1 code implementation • ECCV 2020 • Bo Li, Evgeniy Martyushev, Gim Hee Lee

In this paper, we present a complete, comprehensive study of the relative pose estimation problem for a calibrated camera constrained by a known $\mathrm{SE}(3)$ invariant, which involves 5 minimal problems in total.

Pose Estimation Translation

SemanticAdv: Generating Adversarial Examples via Attribute-conditioned Image Editing

1 code implementation • ECCV 2020 • Haonan Qiu, Chaowei Xiao, Lei Yang, Xinchen Yan, Honglak Lee, Bo Li

Deep neural networks (DNNs) have achieved great successes in various vision applications due to their strong expressive power.

Adversarial Attack Attribute +2

RLeXplore: Accelerating Research in Intrinsically-Motivated Reinforcement Learning

1 code implementation • 29 May 2024 • Mingqi Yuan, Roger Creus Castanyer, Bo Li, Xin Jin, Glen Berseth, Wenjun Zeng

Extrinsic rewards can effectively guide reinforcement learning (RL) agents in specific tasks.

Real-Time Dynamic Robot-Assisted Hand-Object Interaction via Motion Primitives

no code implementations • 29 May 2024 • Mingqi Yuan, Huijiang Wang, Kai-Fung Chu, Fumiya Iida, Bo Li, Wenjun Zeng

These challenges arise from the need for accurate real-time perception of human actions, adaptive control algorithms for robots, and the effective coordination between human and robotic movements.

AI Risk Management Should Incorporate Both Safety and Security

no code implementations • 29 May 2024 • Xiangyu Qi, Yangsibo Huang, Yi Zeng, Edoardo Debenedetti, Jonas Geiping, Luxi He, Kaixuan Huang, Udari Madhushani, Vikash Sehwag, Weijia Shi, Boyi Wei, Tinghao Xie, Danqi Chen, Pin-Yu Chen, Jeffrey Ding, Ruoxi Jia, Jiaqi Ma, Arvind Narayanan, Weijie J Su, Mengdi Wang, Chaowei Xiao, Bo Li, Dawn Song, Peter Henderson, Prateek Mittal

The exposure of security vulnerabilities in safety-aligned language models, e.g., susceptibility to adversarial attacks, has shed light on the intricate interplay between AI safety and AI security.

Scalable Visual State Space Model with Fractal Scanning

no code implementations • 23 May 2024 • Lv Tang, Haoke Xiao, Peng-Tao Jiang, Hao Zhang, Jinwei Chen, Bo Li

To address this challenge, State Space Models (SSMs) like Mamba have emerged as efficient alternatives, initially matching Transformer performance in NLP tasks and later surpassing Vision Transformers (ViTs) in various CV tasks.

Image Classification

ChatScene: Knowledge-Enabled Safety-Critical Scenario Generation for Autonomous Vehicles

no code implementations • 22 May 2024 • Jiawei Zhang, Chejian Xu, Bo Li

We present ChatScene, a Large Language Model (LLM)-based agent that leverages the capabilities of LLMs to generate safety-critical scenarios for autonomous vehicles.

Autonomous Driving Language Modelling +1

Spatial Matching of 2D Mammography Images and Specimen Radiographs: Towards Improved Characterization of Suspicious Microcalcifications

no code implementations • 21 May 2024 • Noor Nakhaei, Chrysostomos Marasinou, Akinyinka Omigbodun, Nina Capiro, Bo Li, Anne Hoyt, William Hsu

Accurate characterization of suspicious microcalcifications is critical to determine whether these calcifications are associated with invasive disease.

Multi-level Personalized Federated Learning on Heterogeneous and Long-Tailed Data

no code implementations • 10 May 2024 • Rongyu Zhang, Yun Chen, Chenrui Wu, Fangxin Wang, Bo Li

Federated learning (FL) offers a privacy-centric distributed learning framework, enabling model training on individual clients and central aggregation without necessitating data exchange.

Autonomous Vehicles Image Classification +2

WorldQA: Multimodal World Knowledge in Videos through Long-Chain Reasoning

no code implementations • 6 May 2024 • Yuanhan Zhang, Kaichen Zhang, Bo Li, Fanyi Pu, Christopher Arif Setiadharma, Jingkang Yang, Ziwei Liu

Multimodal information, together with our knowledge, helps us to understand the complex and dynamic world.

Multiple-choice Video Understanding +1

Provably Unlearnable Examples

no code implementations • 6 May 2024 • Derui Wang, Minhui Xue, Bo Li, Seyit Camtepe, Liming Zhu

Nevertheless, the absence of mechanisms that can verify how robust the UEs are against unknown unauthorized models and train-time techniques engenders several problems.

Data Augmentation

Outlier Gradient Analysis: Efficiently Improving Deep Learning Model Performance via Hessian-Free Influence Functions

no code implementations • 6 May 2024 • Anshuman Chhabra, Bo Li, Jian Chen, Prasant Mohapatra, Hongfu Liu

Influence functions offer a robust framework for assessing the impact of each training data sample on model predictions, serving as a prominent tool in data-centric learning.

Leveraging the Human Ventral Visual Stream to Improve Neural Network Robustness

1 code implementation • 4 May 2024 • Zhenan Shao, Linjian Ma, Bo Li, Diane M. Beck

Human object recognition likely owes its robustness, in part, to the increasingly resilient representations that emerge along the hierarchy of the ventral visual cortex.

Decision Making Object Recognition

ASAM: Boosting Segment Anything Model with Adversarial Tuning

1 code implementation • 1 May 2024 • Bo Li, Haoke Xiao, Lv Tang

In the evolving landscape of computer vision, foundation models have emerged as pivotal tools, exhibiting exceptional adaptability to a myriad of tasks.

Image Segmentation Segmentation +1

Introducing v0.5 of the AI Safety Benchmark from MLCommons

1 code implementation • 18 Apr 2024 • Bertie Vidgen, Adarsh Agrawal, Ahmed M. Ahmed, Victor Akinwande, Namir Al-Nuaimi, Najla Alfaraj, Elie Alhajjar, Lora Aroyo, Trupti Bavalatti, Max Bartolo, Borhane Blili-Hamelin, Kurt Bollacker, Rishi Bomassani, Marisa Ferrara Boston, Siméon Campos, Kal Chakra, Canyu Chen, Cody Coleman, Zacharie Delpierre Coudert, Leon Derczynski, Debojyoti Dutta, Ian Eisenberg, James Ezick, Heather Frase, Brian Fuller, Ram Gandikota, Agasthya Gangavarapu, Ananya Gangavarapu, James Gealy, Rajat Ghosh, James Goel, Usman Gohar, Sujata Goswami, Scott A. Hale, Wiebke Hutiri, Joseph Marvin Imperial, Surgan Jandial, Nick Judd, Felix Juefei-Xu, Foutse khomh, Bhavya Kailkhura, Hannah Rose Kirk, Kevin Klyman, Chris Knotz, Michael Kuchnik, Shachi H. Kumar, Srijan Kumar, Chris Lengerich, Bo Li, Zeyi Liao, Eileen Peters Long, Victor Lu, Sarah Luger, Yifan Mai, Priyanka Mary Mammen, Kelvin Manyeki, Sean McGregor, Virendra Mehta, Shafee Mohammed, Emanuel Moss, Lama Nachman, Dinesh Jinenhally Naganna, Amin Nikanjam, Besmira Nushi, Luis Oala, Iftach Orr, Alicia Parrish, Cigdem Patlak, William Pietri, Forough Poursabzi-Sangdeh, Eleonora Presani, Fabrizio Puletti, Paul Röttger, Saurav Sahay, Tim Santos, Nino Scherrer, Alice Schoenauer Sebag, Patrick Schramowski, Abolfazl Shahbazi, Vin Sharma, Xudong Shen, Vamsi Sistla, Leonard Tang, Davide Testuggine, Vithursan Thangarasa, Elizabeth Anne Watkins, Rebecca Weiss, Chris Welty, Tyler Wilbers, Adina Williams, Carole-Jean Wu, Poonam Yadav, Xianjun Yang, Yi Zeng, Wenhui Zhang, Fedor Zhdanov, Jiacheng Zhu, Percy Liang, Peter Mattson, Joaquin Vanschoren

We created a new taxonomy of 13 hazard categories, of which 7 have tests in the v0.5 benchmark.

SambaLingo: Teaching Large Language Models New Languages

no code implementations • 8 Apr 2024 • Zoltan Csaki, Bo Li, Jonathan Li, Qiantong Xu, Pian Pawakapan, Leon Zhang, Yun Du, Hengyu Zhao, Changran Hu, Urmish Thakker

In this paper, we present a comprehensive investigation into the adaptation of LLMs to new languages.

ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming

1 code implementation • 6 Apr 2024 • Simone Tedeschi, Felix Friedrich, Patrick Schramowski, Kristian Kersting, Roberto Navigli, Huu Nguyen, Bo Li

When building Large Language Models (LLMs), it is paramount to bear safety in mind and protect them with guardrails.

KnowHalu: Hallucination Detection via Multi-Form Knowledge Based Factual Checking

1 code implementation • 3 Apr 2024 • Jiawei Zhang, Chejian Xu, Yu Gai, Freddy Lecue, Dawn Song, Bo Li

This paper introduces KnowHalu, a novel approach for detecting hallucinations in text generated by large language models (LLMs), utilizing step-wise reasoning, multi-formulation query, multi-form knowledge for factual checking, and fusion-based detection mechanism.

Fact Checking Hallucination +1

QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs

1 code implementation • 30 Mar 2024 • Saleh Ashkboos, Amirkeivan Mohtashami, Maximilian L. Croci, Bo Li, Martin Jaggi, Dan Alistarh, Torsten Hoefler, James Hensman

We introduce QuaRot, a new Quantization scheme based on Rotations, which is able to quantize LLMs end-to-end, including all weights, activations, and KV cache in 4 bits.

Quantization

Checkpoint Merging via Bayesian Optimization in LLM Pretraining

no code implementations • 28 Mar 2024 • Deyuan Liu, Zecheng Wang, Bingning Wang, WeiPeng Chen, Chunshan Li, Zhiying Tu, Dianhui Chu, Bo Li, Dianbo Sui

The rapid proliferation of large language models (LLMs) such as GPT-4 and Gemini underscores the intense demand for resources during their training processes, posing significant challenges due to substantial computational and environmental costs.

Bayesian Optimization

Multi-Task Dense Prediction via Mixture of Low-Rank Experts

1 code implementation • 26 Mar 2024 • YuQi Yang, Peng-Tao Jiang, Qibin Hou, Hao Zhang, Jinwei Chen, Bo Li

Furthermore, to control the parameters and computational cost brought by the increase in the number of experts, we take inspiration from LoRA and propose to leverage the low-rank format of a vanilla convolution in the expert network.

Decoder

TablePuppet: A Generic Framework for Relational Federated Learning

1 code implementation • 23 Mar 2024 • Lijie Xu, Chulin Xie, Yiran Guo, Gustavo Alonso, Bo Li, Guoliang Li, Wei Wang, Wentao Wu, Ce Zhang

In this paper, we formalize this problem as relational federated learning (RFL).

Federated Learning

PPA-Game: Characterizing and Learning Competitive Dynamics Among Online Content Creators

no code implementations • 22 Mar 2024 • Renzhe Xu, Haotian Wang, Xingxuan Zhang, Bo Li, Peng Cui

We introduce the Proportional Payoff Allocation Game (PPA-Game) to model how agents, akin to content creators on platforms like YouTube and TikTok, compete for divisible resources and consumers' attention.

Contrastive Balancing Representation Learning for Heterogeneous Dose-Response Curves Estimation

1 code implementation • 21 Mar 2024 • Minqin Zhu, Anpeng Wu, Haoxuan Li, Ruoxuan Xiong, Bo Li, Xiaoqing Yang, Xuan Qin, Peng Zhen, Jiecheng Guo, Fei Wu, Kun Kuang

Estimating the individuals' potential response to varying treatment doses is crucial for decision-making in areas such as precision medicine and management science.

counterfactual Decision Making +2

Empowering Segmentation Ability to Multi-modal Large Language Models

no code implementations • 21 Mar 2024 • YuQi Yang, Peng-Tao Jiang, Jing Wang, Hao Zhang, Kai Zhao, Jinwei Chen, Bo Li

Multi-modal large language models (MLLMs) can understand image-language prompts and demonstrate impressive reasoning ability.

Dialogue Generation Segmentation +1

Few-shot Object Localization

1 code implementation • 19 Mar 2024 • Yunhan Ren, Bo Li, Chengyang Zhang, Yong Zhang, BaoCai Yin

This task achieves generalized object localization by leveraging a small number of labeled support samples to query the positional information of objects within corresponding images.

Model Optimization Object +2

RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content

no code implementations • 19 Mar 2024 • Zhuowen Yuan, Zidi Xiong, Yi Zeng, Ning Yu, Ruoxi Jia, Dawn Song, Bo Li

The innovative use of constrained optimization and a fusion-based guardrail approach represents a significant step forward in developing more secure and reliable LLMs, setting a new standard for content moderation frameworks in the face of evolving digital threats.

Data Augmentation

Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression

no code implementations • 18 Mar 2024 • Junyuan Hong, Jinhao Duan, Chenhui Zhang, Zhangheng Li, Chulin Xie, Kelsey Lieberman, James Diffenderfer, Brian Bartoldson, Ajay Jaiswal, Kaidi Xu, Bhavya Kailkhura, Dan Hendrycks, Dawn Song, Zhangyang Wang, Bo Li

While state-of-the-art (SoTA) compression methods boast impressive advancements in preserving benign task performance, the potential risks of compression in terms of safety and trustworthiness have been largely neglected.

Ethics Fairness +1

COLEP: Certifiably Robust Learning-Reasoning Conformal Prediction via Probabilistic Circuits

1 code implementation • 17 Mar 2024 • Mintong Kang, Nezihe Merve Gürel, Linyi Li, Bo Li

In this work, we propose a certifiably robust learning-reasoning conformal prediction framework (COLEP) via probabilistic circuits, which comprise a data-driven learning component that trains statistical models to learn different semantic concepts, and a reasoning component that encodes knowledge and characterizes the relationships among the trained models for logic reasoning.

Conformal Prediction

Shake to Leak: Fine-tuning Diffusion Models Can Amplify the Generative Privacy Risk

1 code implementation • 14 Mar 2024 • Zhangheng Li, Junyuan Hong, Bo Li, Zhangyang Wang

While diffusion models have recently demonstrated remarkable progress in generating realistic images, privacy risks also arise: published models or APIs could generate training images and thus leak privacy-sensitive training information.

Inference Attack Membership Inference Attack

2023 Low-Power Computer Vision Challenge (LPCVC) Summary

no code implementations • 11 Mar 2024 • Leo Chen, Benjamin Boardley, Ping Hu, Yiru Wang, Yifan Pu, Xin Jin, Yongqiang Yao, Ruihao Gong, Bo Li, Gao Huang, Xianglong Liu, Zifu Wan, Xinwang Chen, Ning Liu, Ziyi Zhang, Dongping Liu, Ruijie Shan, Zhengping Che, Fachao Zhang, Xiaofeng Mou, Jian Tang, Maxim Chuprov, Ivan Malofeev, Alexander Goncharenko, Andrey Shcherbin, Arseny Yanchenko, Sergey Alyamkin, Xiao Hu, George K. Thiruvathukal, Yung Hsiang Lu

This article describes the 2023 IEEE Low-Power Computer Vision Challenge (LPCVC).

3D-aware Image Generation and Editing with Multi-modal Conditions

no code implementations • 11 Mar 2024 • Bo Li, Yi-ke Li, Zhi-fen He, Bin Liu, Yun-Kun Lai

3D-consistent image generation from a single 2D semantic label is an important and challenging research topic in computer graphics and computer vision.

Attribute Disentanglement +2

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

no code implementations • 8 Mar 2024 • Gemini Team, Machel Reid, Nikolay Savinov, Denis Teplyashin, Dmitry, Lepikhin, Timothy Lillicrap, Jean-Baptiste Alayrac, Radu Soricut, Angeliki Lazaridou, Orhan Firat, Julian Schrittwieser, Ioannis Antonoglou, Rohan Anil, Sebastian Borgeaud, Andrew Dai, Katie Millican, Ethan Dyer, Mia Glaese, Thibault Sottiaux, Benjamin Lee, Fabio Viola, Malcolm Reynolds, Yuanzhong Xu, James Molloy, Jilin Chen, Michael Isard, Paul Barham, Tom Hennigan, Ross Mcilroy, Melvin Johnson, Johan Schalkwyk, Eli Collins, Eliza Rutherford, Erica Moreira, Kareem Ayoub, Megha Goel, Clemens Meyer, Gregory Thornton, Zhen Yang, Henryk Michalewski, Zaheer Abbas, Nathan Schucher, Ankesh Anand, Richard Ives, James Keeling, Karel Lenc, Salem Haykal, Siamak Shakeri, Pranav Shyam, Aakanksha Chowdhery, Roman Ring, Stephen Spencer, Eren Sezener, Luke Vilnis, Oscar Chang, Nobuyuki Morioka, George Tucker, Ce Zheng, Oliver Woodman, Nithya Attaluri, Tomas Kocisky, Evgenii Eltyshev, Xi Chen, Timothy Chung, Vittorio Selo, Siddhartha Brahma, Petko Georgiev, Ambrose Slone, Zhenkai Zhu, James Lottes, Siyuan Qiao, Ben Caine, Sebastian Riedel, Alex Tomala, Martin Chadwick, Juliette Love, Peter Choy, Sid Mittal, Neil Houlsby, Yunhao Tang, Matthew Lamm, Libin Bai, Qiao Zhang, Luheng He, Yong Cheng, Peter Humphreys, Yujia Li, Sergey Brin, Albin Cassirer, Yingjie Miao, Lukas Zilka, Taylor Tobin, Kelvin Xu, Lev Proleev, Daniel Sohn, Alberto Magni, Lisa Anne Hendricks, Isabel Gao, Santiago Ontanon, Oskar Bunyan, Nathan Byrd, Abhanshu Sharma, Biao Zhang, Mario Pinto, Rishika Sinha, Harsh Mehta, Dawei Jia, Sergi Caelles, Albert Webson, Alex Morris, Becca Roelofs, Yifan Ding, Robin Strudel, Xuehan Xiong, Marvin Ritter, Mostafa Dehghani, Rahma Chaabouni, Abhijit Karmarkar, Guangda Lai, Fabian Mentzer, Bibo Xu, Yaguang Li, Yujing Zhang, Tom Le Paine, Alex Goldin, Behnam Neyshabur, Kate Baumli, Anselm Levskaya, Michael Laskin, Wenhao Jia, Jack W. 
Rae, Kefan Xiao, Antoine He, Skye Giordano, Lakshman Yagati, Jean-Baptiste Lespiau, Paul Natsev, Sanjay Ganapathy, Fangyu Liu, Danilo Martins, Nanxin Chen, Yunhan Xu, Megan Barnes, Rhys May, Arpi Vezer, Junhyuk Oh, Ken Franko, Sophie Bridgers, Ruizhe Zhao, Boxi Wu, Basil Mustafa, Sean Sechrist, Emilio Parisotto, Thanumalayan Sankaranarayana Pillai, Chris Larkin, Chenjie Gu, Christina Sorokin, Maxim Krikun, Alexey Guseynov, Jessica Landon, Romina Datta, Alexander Pritzel, Phoebe Thacker, Fan Yang, Kevin Hui, Anja Hauth, Chih-Kuan Yeh, David Barker, Justin Mao-Jones, Sophia Austin, Hannah Sheahan, Parker Schuh, James Svensson, Rohan Jain, Vinay Ramasesh, Anton Briukhov, Da-Woon Chung, Tamara von Glehn, Christina Butterfield, Priya Jhakra, Matthew Wiethoff, Justin Frye, Jordan Grimstad, Beer Changpinyo, Charline Le Lan, Anna Bortsova, Yonghui Wu, Paul Voigtlaender, Tara Sainath, Shane Gu, Charlotte Smith, Will Hawkins, Kris Cao, James Besley, Srivatsan Srinivasan, Mark Omernick, Colin Gaffney, Gabriela Surita, Ryan Burnell, Bogdan Damoc, Junwhan Ahn, Andrew Brock, Mantas Pajarskas, Anastasia Petrushkina, Seb Noury, Lorenzo Blanco, Kevin Swersky, Arun Ahuja, Thi Avrahami, Vedant Misra, Raoul de Liedekerke, Mariko Iinuma, Alex Polozov, Sarah York, George van den Driessche, Paul Michel, Justin Chiu, Rory Blevins, Zach Gleicher, Adrià Recasens, Alban Rrustemi, Elena Gribovskaya, Aurko Roy, Wiktor Gworek, Sébastien M. R. 
Arnold, Lisa Lee, James Lee-Thorp, Marcello Maggioni, Enrique Piqueras, Kartikeya Badola, Sharad Vikram, Lucas Gonzalez, Anirudh Baddepudi, Evan Senter, Jacob Devlin, James Qin, Michael Azzam, Maja Trebacz, Martin Polacek, Kashyap Krishnakumar, Shuo-Yiin Chang, Matthew Tung, Ivo Penchev, Rishabh Joshi, Kate Olszewska, Carrie Muir, Mateo Wirth, Ale Jakse Hartman, Josh Newlan, Sheleem Kashem, Vijay Bolina, Elahe Dabir, Joost van Amersfoort, Zafarali Ahmed, James Cobon-Kerr, Aishwarya Kamath, Arnar Mar Hrafnkelsson, Le Hou, Ian Mackinnon, Alexandre Frechette, Eric Noland, Xiance Si, Emanuel Taropa, Dong Li, Phil Crone, Anmol Gulati, Sébastien Cevey, Jonas Adler, Ada Ma, David Silver, Simon Tokumine, Richard Powell, Stephan Lee, Kiran Vodrahalli, Samer Hassan, Diana Mincu, Antoine Yang, Nir Levine, Jenny Brennan, Mingqiu Wang, Sarah Hodkinson, Jeffrey Zhao, Josh Lipschultz, Aedan Pope, Michael B. Chang, Cheng Li, Laurent El Shafey, Michela Paganini, Sholto Douglas, Bernd Bohnet, Fabio Pardo, Seth Odoom, Mihaela Rosca, Cicero Nogueira dos santos, Kedar Soparkar, Arthur Guez, Tom Hudson, Steven Hansen, Chulayuth Asawaroengchai, Ravi Addanki, Tianhe Yu, Wojciech Stokowiec, Mina Khan, Justin Gilmer, Jaehoon Lee, Carrie Grimes Bostock, Keran Rong, Jonathan Caton, Pedram Pejman, Filip Pavetic, Geoff Brown, Vivek Sharma, Mario Lučić, Rajkumar Samuel, Josip Djolonga, Amol Mandhane, Lars Lowe Sjösund, Elena Buchatskaya, Elspeth White, Natalie Clay, Jiepu Jiang, Hyeontaek Lim, Ross Hemsley, Zeyncep Cankara, Jane Labanowski, Nicola De Cao, David Steiner, Sayed Hadi Hashemi, Jacob Austin, Anita Gergely, Tim Blyth, Joe Stanton, Kaushik Shivakumar, Aditya Siddhant, Anders Andreassen, Carlos Araya, Nikhil Sethi, Rakesh Shivanna, Steven Hand, Ankur Bapna, Ali Khodaei, Antoine Miech, Garrett Tanzer, Andy Swing, Shantanu Thakoor, Lora Aroyo, Zhufeng Pan, Zachary Nado, Jakub Sygnowski, Stephanie Winkler, Dian Yu, Mohammad Saleh, Loren Maggiore, Yamini Bansal, Xavier Garcia, Mehran 
Kazemi, Piyush Patil, Ishita Dasgupta, Iain Barr, Minh Giang, Thais Kagohara, Ivo Danihelka, Amit Marathe, Vladimir Feinberg, Mohamed Elhawaty, Nimesh Ghelani, Dan Horgan, Helen Miller, Lexi Walker, Richard Tanburn, Mukarram Tariq, Disha Shrivastava, Fei Xia, Qingze Wang, Chung-Cheng Chiu, Zoe Ashwood, Khuslen Baatarsukh, Sina Samangooei, Raphaël Lopez Kaufman, Fred Alcober, Axel Stjerngren, Paul Komarek, Katerina Tsihlas, Anudhyan Boral, Ramona Comanescu, Jeremy Chen, Ruibo Liu, Chris Welty, Dawn Bloxwich, Charlie Chen, Yanhua Sun, Fangxiaoyu Feng, Matthew Mauger, Xerxes Dotiwalla, Vincent Hellendoorn, Michael Sharman, Ivy Zheng, Krishna Haridasan, Gabe Barth-Maron, Craig Swanson, Dominika Rogozińska, Alek Andreev, Paul Kishan Rubenstein, Ruoxin Sang, Dan Hurt, Gamaleldin Elsayed, Renshen Wang, Dave Lacey, Anastasija Ilić, Yao Zhao, Adam Iwanicki, Alejandro Lince, Alexander Chen, Christina Lyu, Carl Lebsack, Jordan Griffith, Meenu Gaba, Paramjit Sandhu, Phil Chen, Anna Koop, Ravi Rajwar, Soheil Hassas Yeganeh, Solomon Chang, Rui Zhu, Soroush Radpour, Elnaz Davoodi, Ving Ian Lei, Yang Xu, Daniel Toyama, Constant Segal, Martin Wicke, Hanzhao Lin, Anna Bulanova, Adrià Puigdomènech Badia, Nemanja Rakićević, Pablo Sprechmann, Angelos Filos, Shaobo Hou, Víctor Campos, Nora Kassner, Devendra Sachan, Meire Fortunato, Chimezie Iwuanyanwu, Vitaly Nikolaev, Balaji Lakshminarayanan, Sadegh Jazayeri, Mani Varadarajan, Chetan Tekur, Doug Fritz, Misha Khalman, David Reitter, Kingshuk Dasgupta, Shourya Sarcar, Tina Ornduff, Javier Snaider, Fantine Huot, Johnson Jia, Rupert Kemp, Nejc Trdin, Anitha Vijayakumar, Lucy Kim, Christof Angermueller, Li Lao, Tianqi Liu, Haibin Zhang, David Engel, Somer Greene, Anaïs White, Jessica Austin, Lilly Taylor, Shereen Ashraf, Dangyi Liu, Maria Georgaki, Irene Cai, Yana Kulizhskaya, Sonam Goenka, Brennan Saeta, Ying Xu, Christian Frank, Dario de Cesare, Brona Robenek, Harry Richardson, Mahmoud Alnahlawi, Christopher Yew, Priya Ponnapalli, Marco 
Tagliasacchi, Alex Korchemniy, Yelin Kim, Dinghua Li, Bill Rosgen, Kyle Levin, Jeremy Wiesner, Praseem Banzal, Praveen Srinivasan, Hongkun Yu, Çağlar Ünlü, David Reid, Zora Tung, Daniel Finchelstein, Ravin Kumar, Andre Elisseeff, Jin Huang, Ming Zhang, Ricardo Aguilar, Mai Giménez, Jiawei Xia, Olivier Dousse, Willi Gierke, Damion Yates, Komal Jalan, Lu Li, Eri Latorre-Chimoto, Duc Dung Nguyen, Ken Durden, Praveen Kallakuri, Yaxin Liu, Matthew Johnson, Tomy Tsai, Alice Talbert, Jasmine Liu, Alexander Neitz, Chen Elkind, Marco Selvi, Mimi Jasarevic, Livio Baldini Soares, Albert Cui, Pidong Wang, Alek Wenjiao Wang, Xinyu Ye, Krystal Kallarackal, Lucia Loher, Hoi Lam, Josef Broder, Dan Holtmann-Rice, Nina Martin, Bramandia Ramadhana, Mrinal Shukla, Sujoy Basu, Abhi Mohan, Nick Fernando, Noah Fiedel, Kim Paterson, Hui Li, Ankush Garg, Jane Park, DongHyun Choi, Diane Wu, Sankalp Singh, Zhishuai Zhang, Amir Globerson, Lily Yu, John Carpenter, Félix de Chaumont Quitry, Carey Radebaugh, Chu-Cheng Lin, Alex Tudor, Prakash Shroff, Drew Garmon, Dayou Du, Neera Vats, Han Lu, Shariq Iqbal, Alex Yakubovich, Nilesh Tripuraneni, James Manyika, Haroon Qureshi, Nan Hua, Christel Ngani, Maria Abi Raad, Hannah Forbes, Jeff Stanway, Mukund Sundararajan, Victor Ungureanu, Colton Bishop, Yunjie Li, Balaji Venkatraman, Bo Li, Chloe Thornton, Salvatore Scellato, Nishesh Gupta, Yicheng Wang, Ian Tenney, Xihui Wu, Ashish Shenoy, Gabriel Carvajal, Diana Gage Wright, Ben Bariach, Zhuyun Xiao, Peter Hawkins, Sid Dalmia, Clement Farabet, Pedro Valenzuela, Quan Yuan, Ananth Agarwal, Mia Chen, Wooyeol Kim, Brice Hulse, Nandita Dukkipati, Adam Paszke, Andrew Bolt, Kiam Choo, Jennifer Beattie, Jennifer Prendki, Harsha Vashisht, Rebeca Santamaria-Fernandez, Luis C. 
Cobo, Jarek Wilkiewicz, David Madras, Ali Elqursh, Grant Uy, Kevin Ramirez, Matt Harvey, Tyler Liechty, Heiga Zen, Jeff Seibert, Clara Huiyi Hu, Andrey Khorlin, Maigo Le, Asaf Aharoni, Megan Li, Lily Wang, Sandeep Kumar, Norman Casagrande, Jay Hoover, Dalia El Badawy, David Soergel, Denis Vnukov, Matt Miecnikowski, Jiri Simsa, Praveen Kumar, Thibault Sellam, Daniel Vlasic, Samira Daruki, Nir Shabat, John Zhang, Guolong Su, Jiageng Zhang, Jeremiah Liu, Yi Sun, Evan Palmer, Alireza Ghaffarkhah, Xi Xiong, Victor Cotruta, Michael Fink, Lucas Dixon, Ashwin Sreevatsa, Adrian Goedeckemeyer, Alek Dimitriev, Mohsen Jafari, Remi Crocker, Nicholas FitzGerald, Aviral Kumar, Sanjay Ghemawat, Ivan Philips, Frederick Liu, Yannie Liang, Rachel Sterneck, Alena Repina, Marcus Wu, Laura Knight, Marin Georgiev, Hyo Lee, Harry Askham, Abhishek Chakladar, Annie Louis, Carl Crous, Hardie Cate, Dessie Petrova, MICHAEL QUINN, Denese Owusu-Afriyie, Achintya Singhal, Nan Wei, Solomon Kim, Damien Vincent, Milad Nasr, Christopher A. Choquette-Choo, Reiko Tojo, Shawn Lu, Diego de Las Casas, Yuchung Cheng, Tolga Bolukbasi, Katherine Lee, Saaber Fatehi, Rajagopal Ananthanarayanan, Miteyan Patel, Charbel Kaed, Jing Li, Shreyas Rammohan Belle, Zhe Chen, Jaclyn Konzelmann, Siim Põder, Roopal Garg, Vinod Koverkathu, Adam Brown, Chris Dyer, Rosanne Liu, Azade Nova, Jun Xu, Alanna Walton, Alicia Parrish, Mark Epstein, Sara McCarthy, Slav Petrov, Demis Hassabis, Koray Kavukcuoglu, Jeffrey Dean, Oriol Vinyals

In this report, we present the latest model of the Gemini family, Gemini 1.5 Pro, a highly compute-efficient multimodal mixture-of-experts model capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio.

Ranked #20 on Code Generation on HumanEval

Code Generation Math Word Problem Solving +1

BjTT: A Large-scale Multimodal Dataset for Traffic Prediction

2 code implementations • 8 Mar 2024 • Chengyang Zhang, Yong Zhang, Qitan Shao, Jiangtao Feng, Bo Li, Yisheng Lv, Xinglin Piao, BaoCai Yin

The key challenge of the TTG task is how to associate text with the spatial structure of the road network and traffic data for generating traffic situations.

Traffic Prediction

Testing Business Cycle Theories: Evidence from the Great Recession

no code implementations • 6 Mar 2024 • Bo Li

Empirical business cycle studies using cross-country data usually cannot achieve causal relationships while within-country studies mostly focus on the bust period.

COMMIT: Certifying Robustness of Multi-Sensor Fusion Systems against Semantic Attacks

no code implementations • 4 Mar 2024 • Zijian Huang, Wenda Chu, Linyi Li, Chejian Xu, Bo Li

In this work, we propose the first robustness certification framework, COMMIT, to certify the robustness of multi-sensor fusion systems against semantic attacks.

Autonomous Vehicles object-detection +2

KeNet: Knowledge-enhanced Doc-Label Attention Network for Multi-label text classification

no code implementations • 4 Mar 2024 • Bo Li, Yuyan Chen, Liang Zeng

It is imperative to additionally acknowledge that the significance of knowledge is substantiated in the realm of MLTC.

Information Retrieval Multi Label Text Classification +4

Differentially Private Synthetic Data via Foundation Model APIs 2: Text

1 code implementation • 4 Mar 2024 • Chulin Xie, Zinan Lin, Arturs Backurs, Sivakanth Gopi, Da Yu, Huseyin A Inan, Harsha Nori, Haotian Jiang, Huishuai Zhang, Yin Tat Lee, Bo Li, Sergey Yekhanin

Lin et al. (2024) recently introduced the Private Evolution (PE) algorithm to generate DP synthetic images with only API access to diffusion models.

Privacy Preserving

Improving Adversarial Energy-Based Model via Diffusion Process

no code implementations • 4 Mar 2024 • Cong Geng, Tian Han, Peng-Tao Jiang, Hao Zhang, Jinwei Chen, Søren Hauberg, Bo Li

Generative models have shown strong generation ability while efficient likelihood estimation is less explored.

Denoising Density Estimation

Perceptive self-supervised learning network for noisy image watermark removal

1 code implementation • 4 Mar 2024 • Chunwei Tian, Menghua Zheng, Bo Li, Yanning Zhang, Shichao Zhang, David Zhang

Specifically, the mentioned paired watermark images are obtained in a self-supervised way, and paired noisy images (i.e., noisy and reference images) are obtained in a supervised way.

Self-Supervised Learning

Boosting Box-supervised Instance Segmentation with Pseudo Depth

no code implementations • 2 Mar 2024 • Xinyi Yu, Ling Yan, PengTao Jiang, Hao Chen, Bo Li, Lin Yuanbo Wu, Linlin Ou

This innovative approach empowers the network to simultaneously predict masks and depth, enhancing its ability to capture nuanced depth-related information during the instance segmentation process.

Box-supervised Instance Segmentation Depth Estimation +4

HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding

1 code implementation • 1 Mar 2024 • Zhaorun Chen, Zhuokai Zhao, Hongyin Luo, Huaxiu Yao, Bo Li, Jiawei Zhou

While large vision-language models (LVLMs) have demonstrated impressive capabilities in interpreting multi-modal contexts, they invariably suffer from object hallucinations (OH).

Hallucination Object +1

ROME: Memorization Insights from Text, Probability and Hidden State in Large Language Models

no code implementations • 1 Mar 2024 • Bo Li, Qinghua Zhao, Lijie Wen

Probing the memorization of large language models holds significant importance.

Memorization

Tree-Regularized Tabular Embeddings

1 code implementation • 1 Mar 2024 • Xuan Li, Yun Wang, Bo Li

Tabular neural networks (NNs) have attracted remarkable attention, and recent advances have gradually narrowed the performance gap with respect to tree-based models on many public datasets.

Binary Classification

A Novel Approach to Industrial Defect Generation through Blended Latent Diffusion Model with Online Adaptation

1 code implementation • 29 Feb 2024 • Hanxi Li, Zhengxun Zhang, Hao Chen, Lin Wu, Bo Li, Deyin Liu, Mingwen Wang

Effectively addressing the challenge of industrial Anomaly Detection (AD) necessitates an ample supply of defective samples, a constraint often hindered by their scarcity in industrial contexts.

Anomaly Detection Decoder +1

DART: Depth-Enhanced Accurate and Real-Time Background Matting

no code implementations • 24 Feb 2024 • Hanxi Li, Guofeng Li, Bo Li, Lin Wu, Yan Cheng

In this paper, we leverage the rich depth information provided by the RGB-Depth (RGB-D) cameras to enhance background matting performance in real-time, dubbed DART.

Bayesian Inference Edge-computing +1

Paper
Add Code

Mitigating Fine-tuning Jailbreak Attack with Backdoor Enhanced Alignment

no code implementations • 22 Feb 2024 • Jiongxiao Wang, Jiazhao Li, Yiquan Li, Xiangyu Qi, Junjie Hu, Yixuan Li, Patrick McDaniel, Muhao Chen, Bo Li, Chaowei Xiao

Despite the general capabilities of Large Language Models (LLMs) like GPT-4 and Llama-2, these models still require fine-tuning or adaptation with customized data when it comes to meeting the specific business demands and intricacies of tailored use cases.

Paper
Add Code

Handling Ambiguity in Emotion: From Out-of-Domain Detection to Distribution Estimation

no code implementations • 20 Feb 2024 • Wen Wu, Bo Li, Chao Zhang, Chung-Cheng Chiu, Qiujia Li, Junwen Bai, Tara N. Sainath, Philip C. Woodland

The evidential uncertainty measure is extended to quantify the uncertainty in emotion distribution estimation.

Classification Emotion Classification

Paper
Add Code

ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs

1 code implementation • 19 Feb 2024 • Fengqing Jiang, Zhangchen Xu, Luyao Niu, Zhen Xiang, Bhaskar Ramasubramanian, Bo Li, Radha Poovendran

In this paper, we propose a novel ASCII art-based jailbreak attack and introduce a comprehensive benchmark Vision-in-Text Challenge (ViTC) to evaluate the capabilities of LLMs in recognizing prompts that cannot be solely interpreted by semantics.

Paper
Code

Beyond Quantities: Machine Learning-based Characterization of Inequality in Infrastructure Quality Provision in Cities

no code implementations • 14 Feb 2024 • Bo Li, Ali Mostafavi

While a growing body of literature has recognized the importance of characterizing infrastructure inequality in cities and provided quantified metrics to inform urban development plans, the majority of the existing approaches focus primarily on measuring the quantity of infrastructure, assuming that more infrastructure is better.

Paper
Add Code

Game of Trojans: Adaptive Adversaries Against Output-based Trojaned-Model Detectors

no code implementations • 12 Feb 2024 • Dinuka Sahabandu, Xiaojun Xu, Arezoo Rajabi, Luyao Niu, Bhaskar Ramasubramanian, Bo Li, Radha Poovendran

We propose and analyze an adaptive adversary that can retrain a Trojaned DNN and is also aware of SOTA output-based Trojaned model detectors.

Paper
Add Code

HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal

1 code implementation • 6 Feb 2024 • Mantas Mazeika, Long Phan, Xuwang Yin, Andy Zou, Zifan Wang, Norman Mu, Elham Sakhaee, Nathaniel Li, Steven Basart, Bo Li, David Forsyth, Dan Hendrycks

Automated red teaming holds substantial promise for uncovering and mitigating the risks associated with the malicious use of large language models (LLMs), yet the field lacks a standardized evaluation framework to rigorously assess new methods.

181

Paper
Code

C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models

1 code implementation • 5 Feb 2024 • Mintong Kang, Nezihe Merve Gürel, Ning Yu, Dawn Song, Bo Li

Specifically, we provide conformal risk analysis for RAG models and certify an upper confidence bound of generation risks, which we refer to as conformal generation risk.

Retrieval

Paper
Code

Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks

1 code implementation • 30 Jan 2024 • Andy Zhou, Bo Li, Haohan Wang

Despite advances in AI alignment, language models (LM) remain vulnerable to adversarial attacks or jailbreaking, in which adversaries modify input prompts to induce harmful behavior.

Paper
Code

Validating Climate Models with Spherical Convolutional Wasserstein Distance

no code implementations • 26 Jan 2024 • Robert C. Garrett, Trevor Harris, Bo Li, Zhuo Wang

The validation of global climate models is crucial to ensure the accuracy and efficacy of model output.

Paper
Add Code

GRATH: Gradual Self-Truthifying for Large Language Models

no code implementations • 22 Jan 2024 • Weixin Chen, Dawn Song, Bo Li

GRATH iteratively refines truthfulness data and updates the model, leading to a gradual improvement in model truthfulness in a self-supervised manner.

Paper
Add Code

Benchmarking Large Multimodal Models against Common Corruptions

1 code implementation • 22 Jan 2024 • Jiawei Zhang, Tianyu Pang, Chao Du, Yi Ren, Bo Li, Min Lin

This technical report aims to fill a deficiency in the assessment of large multimodal models (LMMs) by specifically examining the self-consistency of their outputs when subjected to common corruptions.

Benchmarking

Paper
Code

BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models

1 code implementation • 20 Jan 2024 • Zhen Xiang, Fengqing Jiang, Zidi Xiong, Bhaskar Ramasubramanian, Radha Poovendran, Bo Li

Moreover, we show that LLMs endowed with stronger reasoning capabilities exhibit higher susceptibility to BadChain, exemplified by a high average attack success rate of 97.0% across the six benchmark tasks on GPT-4.

Backdoor Attack

Paper
Code

Efficient Adapter Finetuning for Tail Languages in Streaming Multilingual ASR

no code implementations • 17 Jan 2024 • Junwen Bai, Bo Li, Qiujia Li, Tara N. Sainath, Trevor Strohman

Meanwhile, the heterogeneous nature and imbalanced data abundance of different languages may cause performance degradation, leading to asynchronous peak performance for different languages during training, especially on tail ones.

Paper
Add Code

Crafter: Facial Feature Crafting against Inversion-based Identity Theft on Deep Models

no code implementations • 14 Jan 2024 • Shiming Wang, Zhe Ji, Liyao Xiang, Hao Zhang, Xinbing Wang, Chenghu Zhou, Bo Li

However, such methods cannot defend against adaptive attacks, in which an attacker takes a countermove against a known defence strategy.

Paper
Add Code

Convolutional Neural Network Ensemble Learning for Hyperspectral Imaging-based Blackberry Fruit Ripeness Detection in Uncontrolled Farm Environment

no code implementations • 9 Jan 2024 • Chollette C. Olisah, Ben Trewhella, Bo Li, Melvyn L. Smith, Benjamin Winstone, E. Charles Whitfield, Felicidad Fernández Fernández, Harriet Duncalfe

To address this engineering application challenge, this paper proposes a novel multi-input convolutional neural network (CNN) ensemble classifier for detecting subtle traits of ripeness in blackberry fruits.

Ensemble Learning

Paper
Add Code

CaMML: Context-Aware Multimodal Learner for Large Models

no code implementations • 6 Jan 2024 • Yixin Chen, Shuai Zhang, Boran Han, Tong He, Bo Li

In this work, we introduce Context-Aware MultiModal Learner (CaMML), for tuning large multimodal models (LMMs).

Ranked #51 on Visual Question Answering on MM-Vet

Visual Question Answering

Paper
Add Code

VOT: Revolutionizing Speaker Verification with Memory and Attention Mechanisms

no code implementations • 28 Dec 2023 • Hongyu Wang, Hui Li, Bo Li

Speaker verification is the task of judging the similarity of two unknown voices in an open set, where the ideal speaker embedding should condense discriminant information into a compact utterance-level representation with small intra-speaker distances and large inter-speaker distances. We propose a novel model named Voice Transformer (VOT) for speaker verification.

Speaker Verification

Paper
Add Code

WordScape: a Pipeline to extract multilingual, visually rich Documents with Layout Annotations from Web Crawl Data

1 code implementation • NeurIPS 2023 • Maurice Weber, Carlo Siebenschuh, Rory Butler, Anton Alexandrov, Valdemar Thanner, Georgios Tsolakis, Haris Jabbar, Ian Foster, Bo Li, Rick Stevens, Ce Zhang

Together with the pipeline, we will additionally release 9.5M URLs to Word documents which can be processed using WordScape to create a dataset of over 40M pages.

document understanding Question Answering +1

Paper
Code

Labels Need Prompts Too: Mask Matching for Natural Language Understanding Tasks

no code implementations • 14 Dec 2023 • Bo Li, Wei Ye, Quansen Wang, Wen Zhao, Shikun Zhang

Textual label names (descriptions) are typically semantically rich in many natural language understanding (NLU) tasks.

Natural Language Understanding

Paper
Add Code

CF-NeRF: Camera Parameter Free Neural Radiance Fields with Incremental Learning

no code implementations • 14 Dec 2023 • Qingsong Yan, Qiang Wang, Kaiyong Zhao, Jie Chen, Bo Li, Xiaowen Chu, Fei Deng

Neural Radiance Fields (NeRF) have demonstrated impressive performance in novel view synthesis.

Incremental Learning Novel View Synthesis

Paper
Add Code

USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models

no code implementations • 13 Dec 2023 • Shaojin Ding, David Qiu, David Rim, Yanzhang He, Oleg Rybakov, Bo Li, Rohit Prabhavalkar, Weiran Wang, Tara N. Sainath, Zhonglin Han, Jian Li, Amir Yazdanbakhsh, Shivani Agrawal

We conducted extensive experiments with a 2-billion parameter USM on a large-scale voice search dataset to evaluate our proposed method.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Paper
Add Code

Decoupling Degradation and Content Processing for Adverse Weather Image Restoration

no code implementations • 8 Dec 2023 • Xi Wang, Xueyang Fu, Peng-Tao Jiang, Jie Huang, Mi Zhou, Bo Li, Zheng-Jun Zha

The former facilitates channel-dependent degradation removal operation, allowing the network to tailor responses to various adverse weather types; the latter, by integrating Fourier's global properties into channel-independent content features, enhances network capacity for consistent global content reconstruction.

Image Restoration

Paper
Add Code

An explanation for the distribution characteristics of stock returns

no code implementations • 5 Dec 2023 • Bo Li

In this work, we assume that the effects of events or information on prices obey a normal distribution, while financial markets often overreact or underreact to events or information, resulting in non-normal distributions of stock returns.

Paper
Add Code

Efficient Incremental Potential Contact for Actuated Face Simulation

no code implementations • 3 Dec 2023 • Bo Li, Lingchen Yang, Barbara Solenthaler

We present a quasi-static finite element simulator for human face animation.

Paper
Add Code

Revisiting Single Image Reflection Removal In the Wild

1 code implementation • 29 Nov 2023 • Yurui Zhu, Xueyang Fu, Peng-Tao Jiang, Hao Zhang, Qibin Sun, Jinwei Chen, Zheng-Jun Zha, Bo Li

This research focuses on the issue of single-image reflection removal (SIRR) in real-world conditions, examining it from two angles: the collection pipeline of real reflection pairs and the perception of real reflection locations.

Reflection Removal

Paper
Code

Panoptic Video Scene Graph Generation

3 code implementations • CVPR 2023 • Jingkang Yang, Wenxuan Peng, Xiangtai Li, Zujin Guo, Liangyu Chen, Bo Li, Zheng Ma, Kaiyang Zhou, Wayne Zhang, Chen Change Loy, Ziwei Liu

PVSG relates to the existing video scene graph generation (VidSGG) problem, which focuses on temporal interactions between humans and objects grounded with bounding boxes in videos.

Graph Generation Panoptic Scene Graph Generation +5

Paper
Code

ChatTraffic: Text-to-Traffic Generation via Diffusion Model

1 code implementation • 27 Nov 2023 • Chengyang Zhang, Yong Zhang, Qitan Shao, Bo Li, Yisheng Lv, Xinglin Piao, BaoCai Yin

The key challenge of the TTG task is how to associate text with the spatial structure of the road network and traffic data for generating traffic situations.

Traffic Prediction

Paper
Code

DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer

1 code implementation • 27 Nov 2023 • Junyuan Hong, Jiachen T. Wang, Chenhui Zhang, Zhangheng Li, Bo Li, Zhangyang Wang

To ensure that the prompts do not leak private information, we introduce the first private prompt generation mechanism, by a differentially-private (DP) ensemble of in-context learning with private demonstrations.

In-Context Learning Language Modelling +3

Paper
Code

Generalization and Hallucination of Large Vision-Language Models through a Camouflaged Lens

no code implementations • 19 Nov 2023 • Lv Tang, Peng-Tao Jiang, Zhihao Shen, Hao Zhang, Jinwei Chen, Bo Li

Large Vision-Language Models (LVLMs) have seen burgeoning development and increasing attention recently.

counterfactual Hallucination +3

Paper
Add Code

TextGuard: Provable Defense against Backdoor Attacks on Text Classification

1 code implementation • 19 Nov 2023 • Hengzhi Pei, Jinyuan Jia, Wenbo Guo, Bo Li, Dawn Song

In this work, we propose TextGuard, the first provable defense against backdoor attacks on text classification.

Sentence text-classification +1

Paper
Code

SparseSpikformer: A Co-Design Framework for Token and Weight Pruning in Spiking Transformer

no code implementations • 15 Nov 2023 • Yue Liu, Shanlin Xiao, Bo Li, Zhiyi Yu

As the third-generation neural network, the Spiking Neural Network (SNN) has the advantages of low power consumption and high energy efficiency, making it suitable for implementation on edge devices.

Paper
Add Code

Geometry-Calibrated DRO: Combating Over-Pessimism with Free Energy Implications

no code implementations • 8 Nov 2023 • Jiashuo Liu, Jiayun Wu, Tianyu Wang, Hao Zou, Bo Li, Peng Cui

Machine learning algorithms minimizing average risk are susceptible to distributional shifts.

Paper
Add Code

Identifying and Mitigating Vulnerabilities in LLM-Integrated Applications

no code implementations • 7 Nov 2023 • Fengqing Jiang, Zhangchen Xu, Luyao Niu, Boxin Wang, Jinyuan Jia, Bo Li, Radha Poovendran

Successful exploits of the identified vulnerabilities result in the users receiving responses tailored to the intent of a threat initiator.

Code Completion

Paper
Add Code

OtterHD: A High-Resolution Multi-modality Model

1 code implementation • 7 Nov 2023 • Bo Li, Peiyuan Zhang, Jingkang Yang, Yuanhan Zhang, Fanyi Pu, Ziwei Liu

In this paper, we present OtterHD-8B, an innovative multimodal model evolved from Fuyu-8B, specifically engineered to interpret high-resolution visual inputs with granular precision.

Ranked #86 on Visual Question Answering on MM-Vet

Visual Question Answering

3,483

Paper
Code

Invariant-Feature Subspace Recovery: A New Class of Provable Domain Generalization Algorithms

1 code implementation • 2 Nov 2023 • Haoxiang Wang, Gargi Balasubramaniam, Haozhe Si, Bo Li, Han Zhao

First, in the binary classification setup of Rosenfeld et al. (2021), we show that our first algorithm, ISR-Mean, can identify the subspace spanned by invariant features from the first-order moments of the class-conditional distributions, and achieve provable domain generalization with $d_s+1$ training environments.

Binary Classification Domain Generalization +2

Paper
Code

IMPRESS: Evaluating the Resilience of Imperceptible Perturbations Against Unauthorized Data Usage in Diffusion-Based Generative AI

1 code implementation • NeurIPS 2023 • Bochuan Cao, Changjiang Li, Ting Wang, Jinyuan Jia, Bo Li, Jinghui Chen

IMPRESS is based on the key observation that imperceptible perturbations could lead to a perceptible inconsistency between the original image and the diffusion-reconstructed image. This inconsistency can be used to devise a new optimization strategy for purifying the image, which may weaken the protection of the original image from unauthorized data usage (e.g., style mimicking, malicious editing).

Image Generation

Paper
Code

Bipartite Graph Pre-training for Unsupervised Extractive Summarization with Graph Convolutional Auto-Encoders

1 code implementation • 29 Oct 2023 • Qianren Mao, Shaobo Zhao, Jiarui Li, Xiaolei Gu, Shizhu He, Bo Li, JianXin Li

Pre-trained sentence representations are crucial for identifying significant sentences in unsupervised document extractive summarization.

Extractive Summarization Sentence +2

Paper
Code

DiffAttack: Evasion Attacks Against Diffusion-Based Adversarial Purification

1 code implementation • NeurIPS 2023 • Mintong Kang, Dawn Song, Bo Li

In particular, we propose a deviated-reconstruction loss at intermediate diffusion steps to induce inaccurate density gradient estimation to tackle the problem of vanishing/exploding gradients.

Adversarial Purification

Paper
Code

CBD: A Certified Backdoor Detector Based on Local Dominant Probability

1 code implementation • NeurIPS 2023 • Zhen Xiang, Zidi Xiong, Bo Li

Notably, for backdoor attacks with random perturbation triggers bounded by $\ell_2 \leq 0.75$ which achieves more than 90% attack success rate, CBD achieves 100% (98%), 100% (84%), 98% (98%), and 72% (40%) empirical (certified) detection true positive rates on the four benchmark datasets GTSRB, SVHN, CIFAR-10, and TinyImageNet, respectively, with low false positive rates.

Backdoor Attack Conformal Prediction

Paper
Code

Gradual Domain Adaptation: Theory and Algorithms

1 code implementation • 20 Oct 2023 • Yifei He, Haoxiang Wang, Bo Li, Han Zhao

Unsupervised domain adaptation (UDA) adapts a model from a labeled source domain to an unlabeled target domain in a one-off way.

Unsupervised Domain Adaptation

Paper
Code

Effective and Efficient Federated Tree Learning on Hybrid Data

no code implementations • 18 Oct 2023 • Qinbin Li, Chulin Xie, Xiaojun Xu, Xiaoyuan Liu, Ce Zhang, Bo Li, Bingsheng He, Dawn Song

To address this, we propose HybridTree, a novel federated learning approach that enables federated tree learning on hybrid data.

Federated Learning

Paper
Add Code

RGM: A Robust Generalizable Matching Model

1 code implementation • 18 Oct 2023 • Songyan Zhang, Xinyu Sun, Hao Chen, Bo Li, Chunhua Shen

Finding corresponding pixels within a pair of images is a fundamental computer vision task with various applications.

Optical Flow Estimation

Paper
Code

Exploring Decision-based Black-box Attacks on Face Forgery Detection

no code implementations • 18 Oct 2023 • Zhaoyu Chen, Bo Li, Kaixun Jiang, Shuang Wu, Shouhong Ding, Wenqiang Zhang

Further, the fake faces by our method can pass face forgery detection and face recognition, which exposes the security problems of face forgery detectors.

Face Recognition

Paper
Add Code

Towards Training-free Open-world Segmentation via Image Prompt Foundation Models

no code implementations • 17 Oct 2023 • Lv Tang, Peng-Tao Jiang, Hao-Ke Xiao, Bo Li

The realm of computer vision has witnessed a paradigm shift with the advent of foundational models, mirroring the transformative influence of large language models in the domain of natural language processing.

Segmentation

Paper
Add Code

Ring-A-Bell! How Reliable are Concept Removal Methods for Diffusion Models?

1 code implementation • 16 Oct 2023 • Yu-Lin Tsai, Chia-Yi Hsu, Chulin Xie, Chih-Hsun Lin, Jia-You Chen, Bo Li, Pin-Yu Chen, Chia-Mu Yu, Chun-Ying Huang

While efforts have been made to mitigate such problems, either by implementing a safety filter at the evaluation stage or by fine-tuning models to eliminate undesirable concepts or styles, the effectiveness of these safety measures in dealing with a wide range of prompts remains largely unexplored.

Paper
Code

Unraveling Fundamental Properties of Power System Resilience Curves using Unsupervised Machine Learning

no code implementations • 16 Oct 2023 • Bo Li, Ali Mostafavi

Trapezoidal archetypes explain resilience curves based on (1) the duration of sustained function loss and (2) a constant recovery rate.

Paper
Add Code

LRRU: Long-short Range Recurrent Updating Networks for Depth Completion

no code implementations • ICCV 2023 • YuFei Wang, Bo Li, Ge Zhang, Qi Liu, Tao Gao, Yuchao Dai

Existing deep learning-based depth completion methods generally employ massive stacked layers to predict the dense depth map from sparse input data.

Depth Completion

Paper
Add Code

Octopus: Embodied Vision-Language Programmer from Environmental Feedback

1 code implementation • 12 Oct 2023 • Jingkang Yang, Yuhao Dong, Shuai Liu, Bo Li, Ziyue Wang, Chencheng Jiang, Haoran Tan, Jiamu Kang, Yuanhan Zhang, Kaiyang Zhou, Ziwei Liu

Large vision-language models (VLMs) have achieved substantial progress in multimodal perception and reasoning.

Decision Making

240

Paper
Code

InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining

1 code implementation • 11 Oct 2023 • Boxin Wang, Wei Ping, Lawrence McAfee, Peng Xu, Bo Li, Mohammad Shoeybi, Bryan Catanzaro

After instruction tuning on Retro, InstructRetro demonstrates significant improvement over the instruction tuned GPT on a wide range of zero-shot tasks.

Decoder Question Answering +3

8,923

Paper
Code

PST: Improving Quantitative Trading via Program Sketch-based Tuning

no code implementations • 9 Oct 2023 • Zhiming Li, Junzhe Jiang, Yushi Cao, Aixin Cui, Bozhi Wu, Bo Li, Yang Liu, Dongning Sun

Particularly, PST first proposes using a novel symbolic program sketch to embed the abstract human expert knowledge of market trends.

Program Synthesis reinforcement-learning

Paper
Add Code

AI-based association analysis for medical imaging using latent-space geometric confounder correction

no code implementations • 3 Oct 2023 • Xianjing Liu, Bo Li, Meike W. Vernooij, Eppo B. Wolvius, Gennady V. Roshchupkin, Esther E. Bron

AI has greatly enhanced medical image analysis, yet its use in epidemiological population imaging studies remains limited due to visualization challenges in non-linear models and lack of confounder control.

Paper
Add Code

RLLTE: Long-Term Evolution Project of Reinforcement Learning

2 code implementations • 28 Sep 2023 • Mingqi Yuan, Zequn Zhang, Yang Xu, Shihao Luo, Bo Li, Xin Jin, Wenjun Zeng

We present RLLTE: a long-term evolution, extremely modular, and open-source framework for reinforcement learning (RL) research and application.

Language Modelling Large Language Model +2

442

Paper
Code

Massive End-to-end Models for Short Search Queries

no code implementations • 22 Sep 2023 • Weiran Wang, Rohit Prabhavalkar, Dongseong Hwang, Qiujia Li, Khe Chai Sim, Bo Li, James Qin, Xingyu Cai, Adam Stooke, Zhong Meng, CJ Zheng, Yanzhang He, Tara Sainath, Pedro Moreno Mengibar

In this work, we investigate two popular end-to-end automatic speech recognition (ASR) models, namely Connectionist Temporal Classification (CTC) and RNN-Transducer (RNN-T), for offline recognition of voice search queries, with up to 2B model parameters.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Add Code

Rethinking Imitation-based Planner for Autonomous Driving

no code implementations • 19 Sep 2023 • Jie Cheng, Yingbing Chen, Xiaodong Mei, Bowen Yang, Bo Li, Ming Liu

In recent years, imitation-based driving planners have reported considerable success.

Autonomous Driving Data Augmentation

Paper
Add Code

Visible and NIR Image Fusion Algorithm Based on Information Complementarity

no code implementations • 19 Sep 2023 • Zhuo Li, Bo Li

Second, to generate the initial visible-NIR complementarity weight map, the difference maps of visible and NIR are filtered by the extend-DoG filter.

Paper
Add Code

Self-supervised Multi-view Clustering in Computer Vision: A Survey

no code implementations • 18 Sep 2023 • Jiatai Wang, Zhiwei Xu, Xuewen Yang, Hailong Li, Bo Li, Xuying Meng

However, as contrastive learning continues to evolve within the field of computer vision, self-supervised learning has also made substantial research progress and is progressively becoming dominant in MVC methods.

Clustering Contrastive Learning +3

Paper
Add Code

Zero-Shot Co-salient Object Detection Framework

1 code implementation • 11 Sep 2023 • Haoke Xiao, Lv Tang, Bo Li, Zhiming Luo, Shaozi Li

Despite recent advancements in deep learning models, these models still rely on training with well-annotated CoSOD datasets.

Co-Salient Object Detection Object +2

Paper
Code

DiffSmooth: Certifiably Robust Learning via Diffusion Models and Local Smoothing

1 code implementation • 28 Aug 2023 • Jiawei Zhang, Zhongzhu Chen, Huan Zhang, Chaowei Xiao, Bo Li

Diffusion models have been leveraged to perform adversarial purification and thus provide both empirical and certified robustness for a standard model.

Adversarial Purification Denoising

Paper
Code

Classification Committee for Active Deep Object Detection

no code implementations • 16 Aug 2023 • Lei Zhao, Bo Li, Xingxing Wei

The role of the classification committee is to select the most informative images according to their uncertainty values from the view of classification, which is expected to focus more on the discrepancy and representativeness of instances.

Active Learning Classification +3

Paper
Add Code

An Interpretable Machine Learning Model with Deep Learning-based Imaging Biomarkers for Diagnosis of Alzheimer's Disease

no code implementations • 15 Aug 2023 • Wenjie Kang, Bo Li, Janne M. Papma, Lize C. Jiskoot, Peter Paul De Deyn, Geert Jan Biessels, Jurgen A. H. R. Claassen, Huub A. M. Middelkoop, Wiesje M. van der Flier, Inez H. G. B. Ramakers, Stefan Klein, Esther E. Bron

However, some machine learning methods based on imaging data have poor interpretability because it is usually unclear how they make their decisions.

3D Object Classification Interpretable Machine Learning

Paper
Add Code

Target before Shooting: Accurate Anomaly Detection and Localization under One Millisecond via Cascade Patch Retrieval

1 code implementation • 13 Aug 2023 • Hanxi Li, Jianfei Hu, Bo Li, Hao Chen, Yongbin Zheng, Chunhua Shen

In this framework, the anomaly detection problem is solved via a cascade patch retrieval procedure that retrieves the nearest neighbors for each test image patch in a coarse-to-fine fashion.

Ranked #1 on Supervised Anomaly Detection on BTAD

Supervised Anomaly Detection

Paper
Code

LoRA-FA: Memory-efficient Low-rank Adaptation for Large Language Models Fine-tuning

no code implementations • 7 Aug 2023 • Longteng Zhang, Lin Zhang, Shaohuai Shi, Xiaowen Chu, Bo Li

The low-rank adaptation (LoRA) method can largely reduce the number of trainable parameters for fine-tuning large language models (LLMs); however, it still requires expensive activation memory to update low-rank weights.

Paper
Add Code

Eva: A General Vectorized Approximation Framework for Second-order Optimization

no code implementations • 4 Aug 2023 • Lin Zhang, Shaohuai Shi, Bo Li

Second-order optimization algorithms exhibit excellent convergence properties for training deep learning models, but often incur significant computation and memory overheads.

Paper
Add Code

Benchmarking and Analyzing Generative Data for Visual Recognition

no code implementations • 25 Jul 2023 • Bo Li, Haotian Liu, Liangyu Chen, Yong Jae Lee, Chunyuan Li, Ziwei Liu

Advancements in large pre-trained generative models have expanded their potential as effective data generators in visual recognition.

Benchmarking Retrieval

Paper
Add Code

MMBench: Is Your Multi-modal Model an All-around Player?

3 code implementations • 12 Jul 2023 • YuAn Liu, Haodong Duan, Yuanhan Zhang, Bo Li, Songyang Zhang, Wangbo Zhao, Yike Yuan, Jiaqi Wang, Conghui He, Ziwei Liu, Kai Chen, Dahua Lin

In response to these challenges, we propose MMBench, a novel multi-modality benchmark.

Visual Question Answering

2,876

Paper
Code

SoK: Privacy-Preserving Data Synthesis

no code implementations • 5 Jul 2023 • Yuzheng Hu, Fan Wu, Qinbin Li, Yunhui Long, Gonzalo Munilla Garrido, Chang Ge, Bolin Ding, David Forsyth, Bo Li, Dawn Song

As the prevalence of data analysis grows, safeguarding data privacy has become a paramount concern.

Image Generation Privacy Preserving

Paper
Add Code

Structured Network Pruning by Measuring Filter-wise Interactions

no code implementations • 3 Jul 2023 • Wenting Tang, Xingxing Wei, Bo Li

Utilizing this new redundancy criterion, we propose a structured network pruning approach SNPFI (Structured Network Pruning by measuring Filter-wise Interaction).

Image Classification Network Pruning

Paper
Add Code

Query-Efficient Decision-based Black-Box Patch Attack

no code implementations • 2 Jul 2023 • Zhaoyu Chen, Bo Li, Shuang Wu, Shouhong Ding, Wenqiang Zhang

In this work, we first explore the decision-based patch attack.

Face Verification Image Classification

Paper
Add Code

Learning to Pan-sharpening with Memories of Spatial Details

1 code implementation • 28 Jun 2023 • Maoxun Yuan, Tianyi Zhao, Bo Li, Xingxing Wei

To address this issue, in this paper we observe that the spatial details from PAN images are mainly high-frequency cues, i.e., the edges reflect the contour of input PAN images.

Paper
Code

FunQA: Towards Surprising Video Comprehension

1 code implementation • 26 Jun 2023 • Binzhu Xie, Sicheng Zhang, Zitang Zhou, Bo Li, Yuanhan Zhang, Jack Hessel, Jingkang Yang, Ziwei Liu

Surprising videos, such as funny clips, creative performances, or visual illusions, attract significant attention.

Question Answering Text Generation +3

Paper
Code

Synthetic data shuffling accelerates the convergence of federated learning under data heterogeneity

1 code implementation • 23 Jun 2023 • Bo Li, Yasin Esfandiari, Mikkel N. Schmidt, Tommy S. Alstrøm, Sebastian U. Stich

In this paper, we establish a precise and quantifiable correspondence between data heterogeneity and parameters in the convergence rate when a fraction of data is shuffled across clients.

Federated Learning

Paper
Code

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

no code implementations • NeurIPS 2023 • Boxin Wang, Weixin Chen, Hengzhi Pei, Chulin Xie, Mintong Kang, Chenhui Zhang, Chejian Xu, Zidi Xiong, Ritik Dutta, Rylan Schaeffer, Sang T. Truong, Simran Arora, Mantas Mazeika, Dan Hendrycks, Zinan Lin, Yu Cheng, Sanmi Koyejo, Dawn Song, Bo Li

Yet, while the literature on the trustworthiness of GPT models remains limited, practitioners have proposed employing capable GPT models for sensitive applications such as healthcare and finance -- where mistakes can be costly.

Adversarial Robustness Ethics +1

Paper
Add Code

Prior-knowledge-informed deep learning for lacune detection and quantification using multi-site brain MRI

no code implementations • 18 Jun 2023 • Bo Li, Jeroen de Bresser, Wiro Niessen, Matthias Van Osch, Wiesje M. van der Flier, Geert Jan Biessels, Meike W. Vernooij, Esther Bron

Lacunes of presumed vascular origin, also referred to as lacunar infarcts, are important to assess cerebral small vessel disease and cognitive diseases such as dementia.

Paper
Add Code

Deep learning-based group-wise registration for longitudinal MRI analysis in glioma

no code implementations • 18 Jun 2023 • Claudia Chinea Hammecher, Karin van Garderen, Marion Smits, Pieter Wesseling, Bart Westerman, Pim French, Mathilde Kouwenhoven, Roel Verhaak, Frans Vos, Esther Bron, Bo Li

The proposed methods may serve as an alternative to classical toolboxes, to provide further insight into glioma growth.

Image Registration

Paper
Add Code

Evaluation and Optimization of Gradient Compression for Distributed Deep Learning

1 code implementation • 15 Jun 2023 • Lin Zhang, Longteng Zhang, Shaohuai Shi, Xiaowen Chu, Bo Li

To accelerate distributed training, many gradient compression methods have been proposed to alleviate the communication bottleneck in synchronous stochastic gradient descent (S-SGD), but their efficacy in real-world applications still remains unclear.

Quantization

Paper
Code

MIMIC-IT: Multi-Modal In-Context Instruction Tuning

2 code implementations • 8 Jun 2023 • Bo Li, Yuanhan Zhang, Liangyu Chen, Jinghao Wang, Fanyi Pu, Jingkang Yang, Chunyuan Li, Ziwei Liu

We release the MIMIC-IT dataset, instruction-response collection pipeline, benchmarks, and the Otter model.

Ranked #88 on Visual Question Answering on MM-Vet

In-Context Learning Visual Question Answering

3,483

Paper
Code

MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos

1 code implementation • 7 Jun 2023 • JieLin Qiu, Jiacheng Zhu, William Han, Aditesh Kumar, Karthik Mittal, Claire Jin, Zhengyuan Yang, Linjie Li, JianFeng Wang, Ding Zhao, Bo Li, Lijuan Wang

To address these challenges and provide a comprehensive dataset for this new direction, we have meticulously curated the MMSum dataset.

Text Summarization Video Summarization

Paper
Code

How to Estimate Model Transferability of Pre-Trained Speech Models?

1 code implementation • 1 Jun 2023 • Zih-Ching Chen, Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Shuo-Yiin Chang, Rohit Prabhavalkar, Hung-Yi Lee, Tara N. Sainath

In this work, we introduce a "score-based assessment" framework for estimating the transferability of pre-trained speech models (PSMs) for fine-tuning target tasks.

Paper
Code

Competing for Shareable Arms in Multi-Player Multi-Armed Bandits

1 code implementation • 30 May 2023 • Renzhe Xu, Haotian Wang, Xingxuan Zhang, Bo Li, Peng Cui

In reality, agents often have to learn and maximize the rewards of the resources at the same time.

Multi-Armed Bandits

Paper
Code

UMD: Unsupervised Model Detection for X2X Backdoor Attacks

no code implementations • 29 May 2023 • Zhen Xiang, Zidi Xiong, Bo Li

Backdoor (Trojan) attack is a common threat to deep neural networks, where samples from one or more source classes embedded with a backdoor trigger will be misclassified to adversarial target classes.

Paper
Add Code

On the Tool Manipulation Capability of Open-source Large Language Models

1 code implementation • 25 May 2023 • Qiantong Xu, Fenglu Hong, Bo Li, Changran Hu, Zhengyu Chen, Jian Zhang

In this paper, we ask whether we can enhance open-source LLMs to be competitive with leading closed LLM APIs in tool manipulation, with a practical amount of human supervision.

125

Paper
Code

Mixture-of-Expert Conformer for Streaming Multilingual ASR

no code implementations • 25 May 2023 • Ke Hu, Bo Li, Tara N. Sainath, Yu Zhang, Francoise Beaufays

We evaluate the proposed model on a set of 12 languages, and achieve an average 11.9% relative improvement in WER over the baseline.

Automatic Speech Recognition speech-recognition +1

Paper
Add Code
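A relative WER improvement like the one quoted in the entry above is usually computed as the reduction in word error rate divided by the baseline word error rate. The sketch below shows that arithmetic only. The function names and the sample numbers are hypothetical, and averaging the per-language relative gains is an assumption; the listing does not say how the paper aggregates across its 12 languages.

```python
def relative_wer_improvement(baseline_wer, new_wer):
    """Fraction by which new_wer improves on baseline_wer (0.2 means 20%)."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


def average_relative_improvement(pairs):
    """Mean of per-language relative improvements.

    `pairs` is an iterable of (baseline_wer, new_wer) tuples, one per language.
    Averaging per-language gains is an assumption made for this sketch.
    """
    gains = [relative_wer_improvement(b, n) for b, n in pairs]
    return sum(gains) / len(gains)


if __name__ == "__main__":
    # Hypothetical WERs (in percent) for two languages, not taken from the paper.
    sample = [(10.0, 8.0), (20.0, 18.0)]
    print(f"{average_relative_improvement(sample):.1%}")
```

Note that a relative improvement is not a difference in percentage points: going from 10.0% to 8.0% WER is a 2-point absolute drop but a 20% relative improvement.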

GrowSP: Unsupervised Semantic Segmentation of 3D Point Clouds

1 code implementation • CVPR 2023 • Zihui Zhang, Bo Yang, Bing Wang, Bo Li

Our method consists of three major components, 1) the feature extractor to learn per-point features from input point clouds, 2) the superpoint constructor to progressively grow the sizes of superpoints, and 3) the semantic primitive clustering module to group superpoints into semantic elements for the final semantic segmentation.

3D Semantic Segmentation Segmentation +1

146

Paper
Code

Reconstructive Neuron Pruning for Backdoor Defense

1 code implementation • 24 May 2023 • Yige Li, Xixiang Lyu, Xingjun Ma, Nodens Koren, Lingjuan Lyu, Bo Li, Yu-Gang Jiang

Specifically, RNP first unlearns the neurons by maximizing the model's error on a small subset of clean samples and then recovers the neurons by minimizing the model's error on the same data.

backdoor defense

Paper
Code

Modular Domain Adaptation for Conformer-Based Streaming ASR

no code implementations • 22 May 2023 • Qiujia Li, Bo Li, Dongseong Hwang, Tara N. Sainath, Pedro M. Mengibar

Speech data from different domains has distinct acoustic and linguistic characteristics.

Domain Adaptation speech-recognition +1

Paper
Add Code

Can Public Large Language Models Help Private Cross-device Federated Learning?

no code implementations • 20 May 2023 • Boxin Wang, Yibo Jacky Zhang, Yuan Cao, Bo Li, H. Brendan McMahan, Sewoong Oh, Zheng Xu, Manzil Zaheer

We study (differentially) private federated learning (FL) of language models.

Federated Learning

Paper
Add Code

Re-thinking Data Availablity Attacks Against Deep Neural Networks

no code implementations • 18 May 2023 • Bin Fang, Bo Li, Shuang Wu, Ran Yi, Shouhong Ding, Lizhuang Ma

The unauthorized use of personal data for commercial purposes and the clandestine acquisition of private data for training machine learning models continue to raise concerns.

Paper
Add Code

Towards Generalizable Data Protection With Transferable Unlearnable Examples

no code implementations • 18 May 2023 • Bin Fang, Bo Li, Shuang Wu, Tianyi Zheng, Shouhong Ding, Ran Yi, Lizhuang Ma

One of the crucial factors contributing to this success has been the access to an abundance of high-quality data for constructing machine learning models.

Paper
Add Code

Otter: A Multi-Modal Model with In-Context Instruction Tuning

1 code implementation • 5 May 2023 • Bo Li, Yuanhan Zhang, Liangyu Chen, Jinghao Wang, Jingkang Yang, Ziwei Liu

Large language models (LLMs) have demonstrated significant universal capabilities as few/zero-shot learners in various tasks due to their pre-training on vast amounts of text data, as exemplified by GPT-3, which was later boosted into InstructGPT and ChatGPT, effectively following natural language instructions to accomplish real-world tasks.

Ranked #8 on Visual Question Answering on BenchLMM

In-Context Learning Instruction Following +2

3,483

Paper
Code

Evaluating ChatGPT's Information Extraction Capabilities: An Assessment of Performance, Explainability, Calibration, and Faithfulness

1 code implementation • 23 Apr 2023 • Bo Li, Gexiang Fang, Yang Yang, Quansen Wang, Wei Ye, Wen Zhao, Shikun Zhang

The capability of Large Language Models (LLMs) like ChatGPT to comprehend user intent and provide reasonable responses has made them extremely popular lately.

132

Paper
Code

The Second Monocular Depth Estimation Challenge

no code implementations • 14 Apr 2023 • Jaime Spencer, C. Stella Qian, Michaela Trescakova, Chris Russell, Simon Hadfield, Erich W. Graf, Wendy J. Adams, Andrew J. Schofield, James Elder, Richard Bowden, Ali Anwar, Hao Chen, Xiaozhi Chen, Kai Cheng, Yuchao Dai, Huynh Thai Hoa, Sadat Hossain, Jianmian Huang, Mohan Jing, Bo Li, Chao Li, Baojun Li, Zhiwen Liu, Stefano Mattoccia, Siegfried Mercelis, Myungwoo Nam, Matteo Poggi, Xiaohua Qi, Jiahui Ren, Yang Tang, Fabio Tosi, Linh Trinh, S. M. Nadim Uddin, Khan Muhammad Umair, Kaixuan Wang, YuFei Wang, Yixing Wang, Mochu Xiang, Guangkai Xu, Wei Yin, Jun Yu, Qi Zhang, Chaoqiang Zhao

This paper discusses the results for the second edition of the Monocular Depth Estimation Challenge (MDEC).

Monocular Depth Estimation

Paper
Add Code

Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study

1 code implementation • 13 Apr 2023 • Boxin Wang, Wei Ping, Peng Xu, Lawrence McAfee, Zihan Liu, Mohammad Shoeybi, Yi Dong, Oleksii Kuchaiev, Bo Li, Chaowei Xiao, Anima Anandkumar, Bryan Catanzaro

Thus, it is still an open question: shall we pretrain large autoregressive LMs with retrieval?

Decoder Open-Ended Question Answering +2

8,923

Paper
Code

Fast vehicle detection algorithm based on lightweight YOLO7-tiny

no code implementations • 12 Apr 2023 • Bo Li, Yihua Chen, Hao Xu, Fei Zhong

The swift and precise detection of vehicles plays a significant role in intelligent transportation systems.

Fast Vehicle Detection

Paper
Add Code

Can SAM Segment Anything? When SAM Meets Camouflaged Object Detection

1 code implementation • 10 Apr 2023 • Lv Tang, Haoke Xiao, Bo Li

In this study, we try to ask if SAM can address the COD task and evaluate the performance of SAM on the COD benchmark by employing maximum segmentation evaluation and camouflage location evaluation.

Object object-detection +3

Paper
Code

Predictive Heterogeneity: Measures and Applications

no code implementations • 1 Apr 2023 • Jiashuo Liu, Jiayun Wu, Bo Li, Peng Cui

As an intrinsic and fundamental property of big data, data heterogeneity exists in a variety of real-world applications, such as precision medicine, autonomous driving, financial applications, etc.

Autonomous Driving Crop Yield Prediction +3

Paper
Add Code

Invertible Convolution with Symmetric Paddings

1 code implementation • 30 Mar 2023 • Bo Li

We show that symmetrically padded convolution can be analytically inverted via DFT.

Paper
Code
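As a rough illustration of the claim in the entry above, the sketch below inverts a 1-D, 3-tap convolution with symmetric (edge-repeating) padding by mirroring the signal and dividing in the DFT domain. It is not the paper's implementation: the restriction to one dimension, the symmetric kernel `[a, b, a]`, and every function name here are assumptions chosen to keep the example short. With a symmetric kernel, the mirrored signal stays mirrored after circular convolution, which is what makes the division valid.

```python
import cmath


def dft(seq, inverse=False):
    """Naive O(n^2) discrete Fourier transform (inverse includes the 1/n factor)."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(seq[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def sym_conv(x, a, b):
    """Convolve x with the kernel [a, b, a] using symmetric (edge-repeating) padding."""
    padded = [x[0]] + list(x) + [x[-1]]
    return [a * padded[i] + b * padded[i + 1] + a * padded[i + 2]
            for i in range(len(x))]


def sym_deconv(y, a, b):
    """Recover x from y = sym_conv(x, a, b); requires b + 2a*cos(w) != 0 for all DFT bins."""
    n = len(y)
    m = 2 * n
    # Mirroring y turns symmetric padding into circular convolution of length 2n.
    y_ext = list(y) + list(reversed(y))
    kernel = [0.0] * m
    kernel[0] += b
    kernel[1] += a
    kernel[-1] += a
    spectrum = [yk / hk for yk, hk in zip(dft(y_ext), dft(kernel))]
    x_ext = dft(spectrum, inverse=True)
    return [v.real for v in x_ext[:n]]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    y = sym_conv(x, 1.0, 4.0)
    print(y)
    print(sym_deconv(y, 1.0, 4.0))
```

Choosing `|b| > 2|a|` keeps the kernel's frequency response away from zero, so the division is well defined; kernels with a zero in their spectrum are not invertible this way.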

Efficient Decision-based Black-box Patch Attacks on Video Recognition

no code implementations • ICCV 2023 • Kaixun Jiang, Zhaoyu Chen, Hao Huang, Jiafeng Wang, Dingkang Yang, Bo Li, Yan Wang, Wenqiang Zhang

First, STDE introduces target videos as patch textures and only adds patches on keyframes that are adaptively selected by temporal difference.

Video Recognition

Paper
Add Code

Graph Transformer GANs for Graph-Constrained House Generation

no code implementations • CVPR 2023 • Hao Tang, Zhenyu Zhang, Humphrey Shi, Bo Li, Ling Shao, Nicu Sebe, Radu Timofte, Luc van Gool

We present a novel graph Transformer generative adversarial network (GTGAN) to learn effective graph node relations in an end-to-end fashion for the challenging graph-constrained house generation task.

Generative Adversarial Network House Generation +1

Paper
Add Code

TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets

3 code implementations • CVPR 2023 • Weixin Chen, Dawn Song, Bo Li

To answer these questions, we propose an effective Trojan attack against diffusion models, TrojDiff, which optimizes the Trojan diffusion and generative processes during training.

Image Generation

Paper
Code

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages

no code implementations • 2 Mar 2023 • Yu Zhang, Wei Han, James Qin, Yongqiang Wang, Ankur Bapna, Zhehuai Chen, Nanxin Chen, Bo Li, Vera Axelrod, Gary Wang, Zhong Meng, Ke Hu, Andrew Rosenberg, Rohit Prabhavalkar, Daniel S. Park, Parisa Haghani, Jason Riesa, Ginger Perng, Hagen Soltau, Trevor Strohman, Bhuvana Ramabhadran, Tara Sainath, Pedro Moreno, Chung-Cheng Chiu, Johan Schalkwyk, Françoise Beaufays, Yonghui Wu

We introduce the Universal Speech Model (USM), a single large model that performs automatic speech recognition (ASR) across 100+ languages.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Paper
Add Code

DeAR: Accelerating Distributed Deep Learning with Fine-Grained All-Reduce Pipelining

1 code implementation • 24 Feb 2023 • Lin Zhang, Shaohuai Shi, Xiaowen Chu, Wei Wang, Bo Li, Chengjian Liu

Communication scheduling has been shown to be effective in accelerating distributed training, which enables all-reduce communications to be overlapped with backpropagation computations.

Scheduling

Paper
Code

Pose-Controllable 3D Facial Animation Synthesis using Hierarchical Audio-Vertex Attention

no code implementations • 24 Feb 2023 • Bin Liu, Xiaolin Wei, Bo Li, Junjie Cao, Yu-Kun Lai

In this paper, a novel pose-controllable 3D facial animation synthesis method is proposed by utilizing hierarchical audio-vertex attention.

Attribute Face Model

Paper
Add Code

UML: A Universal Monolingual Output Layer for Multilingual ASR

no code implementations • 22 Feb 2023 • Chao Zhang, Bo Li, Tara N. Sainath, Trevor Strohman, Shuo-Yiin Chang

Consequently, the UML enables switching the interpretation of each output node depending on the language of the input speech.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Add Code

Delving into the Adversarial Robustness of Federated Learning

no code implementations • 19 Feb 2023 • Jie Zhang, Bo Li, Chen Chen, Lingjuan Lyu, Shuang Wu, Shouhong Ding, Chao Wu

In this work, we propose a novel algorithm called Decision Boundary based Federated Adversarial Training (DBFAT), which consists of two components (local re-weighting and global regularization) to improve both accuracy and robustness of FL systems.

Adversarial Robustness Federated Learning

Paper
Add Code

Massively Multilingual Shallow Fusion with Large Language Models

no code implementations • 17 Feb 2023 • Ke Hu, Tara N. Sainath, Bo Li, Nan Du, Yanping Huang, Andrew M. Dai, Yu Zhang, Rodrigo Cabrera, Zhifeng Chen, Trevor Strohman

In this work, we propose to train a single multilingual language model (LM) for shallow fusion in multiple languages.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Add Code

JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition

no code implementations • 16 Feb 2023 • Zhong Meng, Weiran Wang, Rohit Prabhavalkar, Tara N. Sainath, Tongzhou Chen, Ehsan Variani, Yu Zhang, Bo Li, Andrew Rosenberg, Bhuvana Ramabhadran

We propose JEIT, a joint end-to-end (E2E) model and internal language model (ILM) training method to inject large-scale unpaired text into ILM during E2E training which improves rare-word speech recognition.

Language Modelling speech-recognition +1

Paper
Add Code

PerAda: Parameter-Efficient Federated Learning Personalization with Generalization Guarantees

no code implementations • 13 Feb 2023 • Chulin Xie, De-An Huang, Wenda Chu, Daguang Xu, Chaowei Xiao, Bo Li, Anima Anandkumar

In this paper, we propose PerAda, a parameter-efficient pFL framework that reduces communication and computational costs and exhibits superior generalization performance, especially under test-time distribution shifts.

Generalization Bounds Knowledge Distillation +2

Paper
Add Code

3D Colored Shape Reconstruction from a Single RGB Image through Diffusion

no code implementations • 11 Feb 2023 • Bo Li, Xiaolin Wei, Fengwei Chen, Bin Liu

In the shape prediction module, the reference RGB image is first encoded into a high-level shape feature, and then the shape feature is utilized as a condition to predict the reverse geometric noise in the diffusion model.

3D Reconstruction 3D Shape Generation +1

Paper
Add Code

Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning

no code implementations • 9 Feb 2023 • Zhuolin Yang, Wei Ping, Zihan Liu, Vijay Korthikanti, Weili Nie, De-An Huang, Linxi Fan, Zhiding Yu, Shiyi Lan, Bo Li, Ming-Yu Liu, Yuke Zhu, Mohammad Shoeybi, Bryan Catanzaro, Chaowei Xiao, Anima Anandkumar

Augmenting pretrained language models (LMs) with a vision encoder (e.g., Flamingo) has obtained state-of-the-art results in image-to-text generation.

Few-Shot Learning Image Captioning +3

Paper
Add Code

Interpolation for Robust Learning: Data Augmentation on Wasserstein Geodesics

no code implementations • 4 Feb 2023 • Jiacheng Zhu, JieLin Qiu, Aritra Guha, Zhuolin Yang, XuanLong Nguyen, Bo Li, Ding Zhao

Our work provides a new perspective of model robustness through the lens of Wasserstein geodesic-based interpolation with a practical off-the-shelf strategy that can be combined with existing robust training methods.

Data Augmentation

Paper
Add Code

Defensive ML: Defending Architectural Side-channels with Adversarial Obfuscation

no code implementations • 3 Feb 2023 • Hyoungwook Nam, Raghavendra Pradyumna Pothukuchi, Bo Li, Nam Sung Kim, Josep Torrellas

To address this problem, this paper explores using Adversarial Machine Learning (AML) methods as a defense at the computer architecture layer to obfuscate side channels.

Computer Security

Paper
Add Code

Efficient Domain Adaptation for Speech Foundation Models

no code implementations • 3 Feb 2023 • Bo Li, Dongseong Hwang, Zhouyuan Huo, Junwen Bai, Guru Prakash, Tara N. Sainath, Khe Chai Sim, Yu Zhang, Wei Han, Trevor Strohman, Francoise Beaufays

The FM encoder adapter and decoder are then finetuned to the target domain with a small amount of supervised in-domain data.

Decoder Domain Adaptation +3

Paper
Add Code

Automatic Intrinsic Reward Shaping for Exploration in Deep Reinforcement Learning

1 code implementation • 26 Jan 2023 • Mingqi Yuan, Bo Li, Xin Jin, Wenjun Zeng

We present AIRS: Automatic Intrinsic Reward Shaping that intelligently and adaptively provides high-quality intrinsic rewards to enhance exploration in reinforcement learning (RL).

Benchmarking reinforcement-learning +1

983

Paper
Code

From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition

no code implementations • 19 Jan 2023 • Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Rohit Prabhavalkar, Tara N. Sainath, Trevor Strohman

In this work, we propose a new parameter-efficient learning framework based on neural model reprogramming for cross-lingual speech recognition, which can re-purpose well-trained English automatic speech recognition (ASR) models to recognize other languages.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Add Code

Proportional Fairness in Obnoxious Facility Location

no code implementations • 11 Jan 2023 • Haris Aziz, Alexander Lam, Bo Li, Fahimeh Ramezani, Toby Walsh

On the other hand, in the randomized setting, we identify proportionally fair and strategyproof mechanisms that give an expected welfare within a constant factor of the optimal welfare.

Fairness

Paper
Add Code

A Bertrand duopoly game with differentiated products reconsidered

no code implementations • 3 Jan 2023 • Xiaoliang Li, Bo Li

In this paper, we explore a dynamic Bertrand duopoly game with differentiated products, where firms are boundedly rational and consumers are assumed to possess an underlying CES utility function.

Paper
Add Code

Rethinking the Learning Paradigm for Dynamic Facial Expression Recognition

1 code implementation • CVPR 2023 • HanYang Wang, Bo Li, Shuang Wu, Siyuan Shen, Feng Liu, Shouhong Ding, Aimin Zhou

Dynamic Facial Expression Recognition (DFER) is a rapidly developing field that focuses on recognizing facial expressions in video format.

Ranked #7 on Dynamic Facial Expression Recognition on FERV39k

Dynamic Facial Expression Recognition Facial Expression Recognition

Paper
Code

AREA: Adaptive Reweighting via Effective Area for Long-Tailed Classification

1 code implementation • ICCV 2023 • Xiaohua Chen, Yucan Zhou, Dayan Wu, Chule Yang, Bo Li, QinGhua Hu, Weiping Wang

Consequently, we estimate the size of the spanned space for each category, namely the effective area, by analyzing its samples' distribution in detail.

Paper
Code

Dual Alignment Unsupervised Domain Adaptation for Video-Text Retrieval

no code implementations • CVPR 2023 • Xiaoshuai Hao, Wanqian Zhang, Dayan Wu, Fei Zhu, Bo Li

To tackle this, we propose a novel method named Dual Alignment Domain Adaptation (DADA).

Retrieval Text Retrieval +2

Paper
Add Code

PHA: Patch-Wise High-Frequency Augmentation for Transformer-Based Person Re-Identification

no code implementations • CVPR 2023 • Guiwei Zhang, Yongfei Zhang, Tianyu Zhang, Bo Li, ShiLiang Pu

Although recent studies empirically show that injecting Convolutional Neural Networks (CNNs) into Vision Transformers (ViTs) can improve the performance of person re-identification, the rationale behind it remains elusive.

Person Re-Identification

Paper
Add Code

Sequence Generation with Label Augmentation for Relation Extraction

1 code implementation • 29 Dec 2022 • Bo Li, Dingyao Yu, Wei Ye, Jinglei Zhang, Shikun Zhang

Sequence generation demonstrates promising performance in recent information extraction efforts, by incorporating large-scale pre-trained Seq2Seq models.

Ranked #1 on Relation Extraction on sciERC-sent

Relation Relation Extraction

Paper
Code

Reviewing Labels: Label Graph Network with Top-k Prediction Set for Relation Extraction

no code implementations • 29 Dec 2022 • Bo Li, Wei Ye, Jinglei Zhang, Shikun Zhang

Specifically, for a given sample, we build a label graph to review candidate labels in the Top-k prediction set and learn the connections between them.

Ranked #2 on Relation Extraction on TACRED-Revisited

Relation Relation Extraction

Paper
Add Code

EDoG: Adversarial Edge Detection For Graph Neural Networks

no code implementations • 27 Dec 2022 • Xiaojun Xu, Yue Yu, Hanzhang Wang, Alok Lal, Carl A. Gunter, Bo Li

In this paper, we propose a general adversarial edge detection pipeline EDoG without requiring knowledge of the attack strategies based on graph generation.

Edge Detection Graph Generation +2

Paper
Add Code

Forecasting West Nile Virus with Graph Neural Networks: Harnessing Spatial Dependence in Irregularly Sampled Geospatial Data

no code implementations • 21 Dec 2022 • Adam Tonks, Trevor Harris, Bo Li, William Brown, Rebecca Smith

Machine learning methods have seen increased application to geospatial environmental problems, such as precipitation nowcasting, haze forecasting, and crop yield prediction.

Crop Yield Prediction regression

Paper
Add Code

Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift

no code implementations • 15 Dec 2022 • JieLin Qiu, Yi Zhu, Xingjian Shi, Florian Wenzel, Zhiqiang Tang, Ding Zhao, Bo Li, Mu Li

Multimodal image-text models have shown remarkable performance in the past few years.

Benchmarking Image Captioning +5

Paper
Add Code

On the effectiveness of partial variance reduction in federated learning with heterogeneous data

2 code implementations • CVPR 2023 • Bo Li, Mikkel N. Schmidt, Tommy S. Alstrøm, Sebastian U. Stich

In this paper, we first revisit the widely used FedAvg algorithm in a deep neural network to understand how data heterogeneity influences the gradient updates across the neural network layers.

Federated Learning

Paper
Code

Logic and Commonsense-Guided Temporal Knowledge Graph Completion

1 code implementation • 30 Nov 2022 • Guanglin Niu, Bo Li

To address these challenges, we propose a Logic and Commonsense-Guided Embedding model (LCGE) to jointly learn the time-sensitive representation involving timeliness and causality of events, together with the time-independent representation of events from the perspective of commonsense.

Causal Inference Knowledge Graph Completion +1

Paper
Code

Rethinking Disparity: A Depth Range Free Multi-View Stereo Based on Disparity

1 code implementation • 30 Nov 2022 • Qingsong Yan, Qiang Wang, Kaiyong Zhao, Bo Li, Xiaowen Chu, Fei Deng

Existing learning-based multi-view stereo (MVS) methods rely on the depth range to build the 3D cost volume and may fail when the range is too large or unreliable.

Paper
Code

Tackling Visual Control via Multi-View Exploration Maximization

no code implementations • 28 Nov 2022 • Mingqi Yuan, Xin Jin, Bo Li, Wenjun Zeng

We present MEM: Multi-view Exploration Maximization for tackling complex visual control tasks.

Benchmarking Reinforcement Learning (RL) +1

Paper
Add Code

Confounder Balancing for Instrumental Variable Regression with Latent Variable

no code implementations • 18 Nov 2022 • Anpeng Wu, Kun Kuang, Ruoxuan Xiong, Bo Li, Fei Wu

This paper studies the confounding effects from the unmeasured confounders and the imbalance of observed confounders in IV regression and aims at unbiased causal effect estimation.

regression valid

Paper
Add Code

AnimeRun: 2D Animation Visual Correspondence from Open Source 3D Movies

1 code implementation • 10 Nov 2022 • Li SiYao, Yuhang Li, Bo Li, Chao Dong, Ziwei Liu, Chen Change Loy

Existing correspondence datasets for two-dimensional (2D) cartoon suffer from simple frame composition and monotonic movements, making them insufficient to simulate real animations.

Optical Flow Estimation

Paper
Code

HFedMS: Heterogeneous Federated Learning with Memorable Data Semantics in Industrial Metaverse

1 code implementation • 7 Nov 2022 • Shenglai Zeng, Zonghang Li, Hongfang Yu, Zhihao Zhang, Long Luo, Bo Li, Dusit Niyato

Federated Learning (FL), as a rapidly evolving privacy-preserving collaborative machine learning paradigm, is a promising approach to enable edge intelligence in the emerging Industrial Metaverse.

Federated Learning Privacy Preserving

Paper
Code

Resource-Efficient Transfer Learning From Speech Foundation Model Using Hierarchical Feature Fusion

no code implementations • 4 Nov 2022 • Zhouyuan Huo, Khe Chai Sim, Bo Li, Dongseong Hwang, Tara N. Sainath, Trevor Strohman

Experimental results show that the proposed method achieves better performance on the speech recognition task than existing algorithms, with fewer trainable parameters, lower computational memory cost, and faster training speed.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Add Code

Fairness in Federated Learning via Core-Stability

no code implementations • 3 Nov 2022 • Bhaskar Ray Chaudhury, Linyi Li, Mintong Kang, Bo Li, Ruta Mehta

Nonetheless, the heterogeneous nature of distributed data makes it challenging to define and ensure fairness among local agents.

Decision Making Fairness +1

Paper
Add Code

A Quantum Kernel Learning Approach to Acoustic Modeling for Spoken Command Recognition

no code implementations • 2 Nov 2022 • Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Tara N. Sainath, Sabato Marco Siniscalchi, Chin-Hui Lee

We propose a quantum kernel learning (QKL) framework to address the inherent data sparsity issues often encountered in training large-scale acoustic models in low-resource scenarios.

Spoken Command Recognition

Paper
Add Code

Unified End-to-End Speech Recognition and Endpointing for Fast and Efficient Speech Systems

no code implementations • 1 Nov 2022 • Shaan Bijwadia, Shuo-Yiin Chang, Bo Li, Tara Sainath, Chao Zhang, Yanzhang He

In this work, we propose a method to jointly train the ASR and EP tasks in a single end-to-end (E2E) multitask model, improving EP quality by optionally leveraging information from the ASR audio encoder.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Add Code

DensePure: Understanding Diffusion Models towards Adversarial Robustness

no code implementations • 1 Nov 2022 • Chaowei Xiao, Zhongzhu Chen, Kun Jin, Jiongxiao Wang, Weili Nie, Mingyan Liu, Anima Anandkumar, Bo Li, Dawn Song

By using the highest density point in the conditional distribution as the reversed sample, we identify the robust region of a given instance under the diffusion model's reverse process.

Adversarial Robustness Denoising

Paper
Add Code

Shape Matters: Deformable Patch Attack

1 code implementation • European Conference on Computer Vision 2022 • Zhaoyu Chen, Bo Li, Shuang Wu, Jianghe Xu, Shouhong Ding, Wenqiang Zhang

Though deep neural networks (DNNs) have demonstrated excellent performance in computer vision, they are susceptible and vulnerable to carefully crafted adversarial examples which can mislead DNNs to incorrect outputs.

Paper
Code

CU-Net: LiDAR Depth-Only Completion With Coupled U-Net

1 code implementation • 26 Oct 2022 • YuFei Wang, Yuchao Dai, Qi Liu, Peng Yang, Jiadai Sun, Bo Li

We find that existing depth-only methods can obtain satisfactory results in the areas where the measurement points are almost accurate and evenly distributed (denoted as normal areas), while the performance is limited in the areas where the foreground and background points are overlapped due to occlusion (denoted as overlap areas) and the areas where there are no measurement points around (denoted as blank areas) since the methods have no reliable input information in these areas.

Paper
Code

Group Distributionally Robust Reinforcement Learning with Hierarchical Latent Variables

no code implementations • 21 Oct 2022 • Mengdi Xu, Peide Huang, Yaru Niu, Visak Kumar, JieLin Qiu, Chao Fang, Kuan-Hui Lee, Xuewei Qi, Henry Lam, Bo Li, Ding Zhao

One key challenge for multi-task reinforcement learning (RL) in practice is the absence of task indicators.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

LOT: Layer-wise Orthogonal Training on Improving $\ell_2$ Certified Robustness

1 code implementation • 20 Oct 2022 • Xiaojun Xu, Linyi Li, Bo Li

On the other hand, as existing works show that semi-supervised training helps improve empirical robustness, we aim to bridge the gap and prove that semi-supervised learning also improves the certified robustness of Lipschitz-bounded models.

Adversarial Robustness

Paper
Code

Handling Label Uncertainty for Camera Incremental Person Re-Identification

no code implementations • 17 Oct 2022 • Zexian Yang, Dayan Wu, Wanqian Zhang, Bo Li, Weiping Wang

Specifically, new data collected from new cameras may contain an unknown proportion of identities seen before.

Incremental Learning Person Re-Identification

Paper
Add Code

Product Ranking for Revenue Maximization with Multiple Purchases

1 code implementation • 15 Oct 2022 • Renzhe Xu, Xingxuan Zhang, Bo Li, Yafeng Zhang, Xiaolong Chen, Peng Cui

In this paper, we assume that each consumer can purchase multiple products at will.

Paper
Code

Feature Reconstruction Attacks and Countermeasures of DNN training in Vertical Federated Learning

no code implementations • 13 Oct 2022 • Peng Ye, Zhifeng Jiang, Wei Wang, Bo Li, Baochun Li

To address this problem, we develop a novel feature protection scheme against the reconstruction attack that effectively misleads the search to some pre-specified random values.

Reconstruction Attack Vertical Federated Learning

Paper
Add Code

JOIST: A Joint Speech and Text Streaming Model For ASR

no code implementations • 13 Oct 2022 • Tara N. Sainath, Rohit Prabhavalkar, Ankur Bapna, Yu Zhang, Zhouyuan Huo, Zhehuai Chen, Bo Li, Weiran Wang, Trevor Strohman

In addition, we explore JOIST using a streaming E2E model with an order of magnitude more data, which are also novelties compared to previous works.

Paper
Add Code

OpenOOD: Benchmarking Generalized Out-of-Distribution Detection

3 code implementations • 13 Oct 2022 • Jingkang Yang, Pengyun Wang, Dejian Zou, Zitang Zhou, Kunyuan Ding, Wenxuan Peng, Haoqi Wang, Guangyao Chen, Bo Li, Yiyou Sun, Xuefeng Du, Kaiyang Zhou, Wayne Zhang, Dan Hendrycks, Yixuan Li, Ziwei Liu

Out-of-distribution (OOD) detection is vital to safety-critical machine learning applications and has thus been extensively studied, with a plethora of methods developed in the literature.

Anomaly Detection Benchmarking +3

777

Paper
Code

Scaling Up Deliberation for Multilingual ASR

no code implementations • 11 Oct 2022 • Ke Hu, Bo Li, Tara N. Sainath

In this work, we investigate second-pass deliberation for multilingual speech recognition.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Add Code

The Equalization Losses: Gradient-Driven Training for Long-tailed Object Recognition

1 code implementation • 11 Oct 2022 • Jingru Tan, Bo Li, Xin Lu, Yongqiang Yao, Fengwei Yu, Tong He, Wanli Ouyang

Long-tail distribution is widely spread in real-world applications.

Image Classification Long-tailed Object Detection +4

424

Paper
Code

Improving Long-tailed Object Detection with Image-Level Supervision by Multi-Task Collaborative Learning

1 code implementation • 11 Oct 2022 • Bo Li, Yongqiang Yao, Jingru Tan, Xin Lu, Fengwei Yu, Ye Luo, Jianwei Lu

Specifically, there are an object detection task (consisting of an instance-classification task and a localization task) and an image-classification task in our framework, responsible for utilizing the two types of supervision.

Classification Contrastive Learning +4