no code implementations • 7 Apr 2024 • Yuqing Li, Tao Luo, Qixuan Zhou
While the NTK regime typically assumes that $\lim_{m\to\infty}\frac{\log \kappa}{\log m}=\frac{1}{2}$ and requires each weight parameter to carry the scaling factor $\frac{1}{\sqrt{m}}$, our $\theta$-lazy regime discards this factor and relaxes the condition to $\lim_{m\to\infty}\frac{\log \kappa}{\log m}>0$.
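A minimal numpy sketch of the contrast, assuming a two-layer ReLU network; how the scale $\kappa$ enters the parameterization (here as an explicit divisor $\kappa = m^{\gamma}$) is an illustrative assumption, not the paper's construction.

```python
import numpy as np

m, d = 10_000, 5                     # width m, input dimension d
rng = np.random.default_rng(0)
x = rng.standard_normal(d)

W = rng.standard_normal((m, d))      # hidden weights ~ N(0, 1)
a = rng.standard_normal(m)           # output weights ~ N(0, 1)

relu = lambda z: np.maximum(z, 0.0)

# NTK parameterization: the 1/sqrt(m) factor keeps the output O(1),
# i.e. kappa = sqrt(m), so log(kappa)/log(m) = 1/2.
f_ntk = (a @ relu(W @ x)) / np.sqrt(m)

# Relaxed, theta-lazy-style variant (illustrative assumption): drop the
# fixed 1/sqrt(m) factor and allow any scale kappa = m**gamma with
# gamma > 0, so that log(kappa)/log(m) -> gamma > 0.
gamma = 0.75
kappa = m ** gamma
f_lazy = (a @ relu(W @ x)) / kappa

print(f_ntk, f_lazy)
```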
1 code implementation • 15 Mar 2024 • Binbin Li, Yuqing Li, Siyu Jia, Bingnan Ma, Yu Ding, Zisen Qi, Xingbang Tan, Menghan Guo, Shenghui Liu
This necessitates a dual focus on the syntactic information within individual utterances and the semantic interactions among them.
1 code implementation • 11 Mar 2024 • Yuting Wei, Yuanxing Xu, Xinru Wei, Simin Yang, Yangfu Zhu, Yuqing Li, Di Liu, Bin Wu
Given the importance of ancient Chinese in capturing a rich historical and cultural heritage, the rapid advancement of Large Language Models (LLMs) necessitates benchmarks that can effectively evaluate their understanding of ancient contexts.
1 code implementation • 27 Sep 2023 • Yuqing Li, Wenyuan Zhang, Binbin Li, Siyu Jia, Zisen Qi, Xingbang Tan
Conversational aspect-based sentiment quadruple analysis (DiaASQ) aims to extract target-aspect-opinion-sentiment quadruples from a dialogue.
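A hypothetical example of the target structure; the dialogue and labels below are invented for illustration and are not taken from the paper's dataset.

```python
# A toy dialogue and the quadruples a DiaASQ system would extract.
dialogue = [
    "A: Just got the new phone yesterday.",
    "B: How is it? I heard the battery drains fast.",
    "A: The battery life is actually great, but the camera is disappointing.",
]

# Each quadruple: (target, aspect, opinion, sentiment)
quadruples = [
    ("phone", "battery life", "great", "positive"),
    ("phone", "camera", "disappointing", "negative"),
]
```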
no code implementations • 25 May 2023 • Zhongwang Zhang, Yuqing Li, Tao Luo, Zhi-Qin John Xu
To investigate the underlying mechanism by which dropout facilitates the identification of flatter minima, we study the noise structure of the derived stochastic modified equation for dropout.
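For orientation, stochastic modified equations of this kind generally take the following schematic form, where $\eta$ is the learning rate, $W_t$ is Brownian motion, and $\Sigma(\Theta)$ is the noise covariance; this is the generic template from the SME literature, not the equation derived in the paper.

```latex
% Schematic SME: gradient flow plus a state-dependent diffusion term
% whose covariance \Sigma(\Theta) encodes the structure of the
% dropout-induced noise (generic template, not the paper's equation).
\begin{equation*}
  \mathrm{d}\Theta_t \;=\; -\nabla L(\Theta_t)\,\mathrm{d}t
  \;+\; \sqrt{\eta}\,\Sigma(\Theta_t)^{1/2}\,\mathrm{d}W_t .
\end{equation*}
```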
no code implementations • 17 May 2023 • Zhangchen Zhou, Hanxu Zhou, Yuqing Li, Zhi-Qin John Xu
Previous research has shown that fully-connected networks with small initialization, trained by gradient-based methods, exhibit a phenomenon known as condensation, in which the input weights of hidden neurons align along a few isolated directions during training.
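A simple way to see condensation numerically is to track pairwise cosine similarities of hidden-neuron input weights; the diagnostic below is an assumed illustration, not the paper's exact protocol.

```python
import numpy as np

def cosine_similarity_matrix(W):
    """W: (m, d) array of hidden-neuron input weight vectors."""
    U = W / np.linalg.norm(W, axis=1, keepdims=True)
    return U @ U.T

# Small initialization: weights start with tiny magnitude.
W = np.random.default_rng(0).standard_normal((8, 3)) * 1e-3
C = cosine_similarity_matrix(W)

# At initialization, directions are roughly uniform; condensation during
# training would drive |C[i, j]| toward 1 for most neuron pairs (i, j).
print(np.round(C, 2))
```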
no code implementations • 12 Mar 2023 • Zhengan Chen, Yuqing Li, Tao Luo, Zhangchen Zhou, Zhi-Qin John Xu
The phenomenon of distinct behaviors exhibited by neural networks under varying scales of initialization remains an enigma in deep learning research.
no code implementations • 30 Nov 2021 • Yaoyu Zhang, Yuqing Li, Zhongwang Zhang, Tao Luo, Zhi-Qin John Xu
We prove a general Embedding Principle of the loss landscape of deep neural networks (NNs) that unravels a hierarchical structure of the loss landscape, i.e., the loss landscape of an NN contains all critical points of all narrower NNs.
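A minimal numpy sketch of the one-neuron splitting idea behind such embeddings (a simplified instance, not the paper's general construction): duplicating a neuron and splitting its output weight leaves the network function, and hence the loss, unchanged, so a critical point of the narrow net maps to a point of equal loss in the wider net.

```python
import numpy as np

relu = lambda z: np.maximum(z, 0.0)

def net(x, W, a):
    """Two-layer network: f(x) = a . relu(W x)."""
    return a @ relu(W @ x)

rng = np.random.default_rng(0)
d, m = 3, 4
x = rng.standard_normal(d)
W, a = rng.standard_normal((m, d)), rng.standard_normal(m)

# Splitting embedding: copy neuron 0 and split its output weight as
# alpha * a[0] and (1 - alpha) * a[0]; the sum of the two copies'
# contributions equals the original neuron's contribution.
alpha = 0.3
W_wide = np.vstack([W, W[0]])
a_wide = np.concatenate([a, [0.0]])
a_wide[0], a_wide[-1] = alpha * a[0], (1 - alpha) * a[0]

assert np.allclose(net(x, W, a), net(x, W_wide, a_wide))
```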
no code implementations • 30 Mar 2021 • Yuqing Li, Tao Luo, Chao Ma
In an attempt to better understand the structural benefits and generalization power of deep neural networks, we first present a novel graph-theoretical formulation of neural network models, including fully connected networks, residual networks (ResNet), and densely connected networks (DenseNet).
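As a rough illustration (the adjacency-list encodings below are assumed for intuition and are not the paper's formalism), the three architectures differ only in which forward edges their computation graphs contain.

```python
# Vertices 0..L stand for layers; each entry maps a layer to the later
# layers it feeds. These encodings are illustrative assumptions.
L = 4
chain    = {i: [i + 1] for i in range(L)}                       # MLP: i -> i+1
resnet   = {i: [i + 1] + ([i + 2] if i + 2 <= L else [])        # plus skip
            for i in range(L)}                                  # edges i -> i+2
densenet = {i: list(range(i + 1, L + 1)) for i in range(L)}     # all forward edges
print(chain, resnet, densenet, sep="\n")
```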
no code implementations • 7 Jul 2020 • Yuqing Li, Tao Luo, Nung Kwan Yip
Gradient descent yields zero training loss in polynomial time for deep neural networks despite the non-convex nature of the objective function.
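Results of this type typically take the following schematic form, shown here as a generic template under an NTK-style least-eigenvalue assumption rather than the paper's precise statement; $u(t)$ denotes the vector of network outputs on the training set at step $t$, $y$ the labels, $\eta$ the step size, and $\lambda_0 > 0$ the least eigenvalue of the limiting kernel.

```latex
% Generic template for NTK-style convergence guarantees: for width
% polynomially large in the problem parameters, gradient descent decays
% the training error geometrically (not the paper's exact statement).
\begin{equation*}
  \| y - u(t) \|_2^2 \;\le\;
  \Bigl(1 - \tfrac{\eta \lambda_0}{2}\Bigr)^{t}\, \| y - u(0) \|_2^2 .
\end{equation*}
```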