2 code implementations • 4 Apr 2024 • Yi Ren, Shangmin Guo, Linlu Qiu, Bailin Wang, Danica J. Sutherland
With the widespread adoption of Large Language Models (LLMs), the prevalence of iterative interactions among these models is anticipated to increase.
no code implementations • 7 Feb 2024 • Shangmin Guo, Biao Zhang, Tianlin Liu, Tianqi Liu, Misha Khalman, Felipe Llinares, Alexandre Rame, Thomas Mesnard, Yao Zhao, Bilal Piot, Johan Ferret, Mathieu Blondel
Moreover, responses in these datasets are often sampled from a language model distinct from the one being aligned, and since the model evolves over training, the alignment phase is inevitably off-policy.
no code implementations • 5 Feb 2024 • Tianlin Liu, Shangmin Guo, Leonardo Bianco, Daniele Calandriello, Quentin Berthet, Felipe Llinares, Jessica Hoffmann, Lucas Dixon, Michal Valko, Mathieu Blondel
Aligning language models with human preferences is crucial for reducing errors and biases in these models.
no code implementations • 5 Feb 2024 • Samuel Garcin, James Doran, Shangmin Guo, Christopher G. Lucas, Stefano V. Albrecht
DRED generates levels using a generative model trained over an initial set of level parameters, reducing distributional shift, and achieves significant improvements in ZSG over adaptive level sampling strategies and UED methods.
no code implementations • 16 Jan 2024 • Shangmin Guo, Yi Ren, Stefano V. Albrecht, Kenny Smith
Although much research has been done on proposing new models or loss functions to improve the generalisation of artificial neural networks (ANNs), less attention has been directed to the impact of the training data on generalisation.
no code implementations • 5 Oct 2023 • Samuel Garcin, James Doran, Shangmin Guo, Christopher G. Lucas, Stefano V. Albrecht
A key limitation preventing the wider adoption of autonomous agents trained via deep reinforcement learning (RL) is their limited ability to generalise to new environments, even when these share similar characteristics with environments encountered during training.
no code implementations • 11 Feb 2023 • Yi Ren, Shangmin Guo, Wonho Bae, Danica J. Sutherland
We identify a significant trend in the effect of changes in this initial energy on the resulting features after fine-tuning.
3 code implementations • 2 Aug 2022 • Ibrahim H. Ahmed, Cillian Brewitt, Ignacio Carlucho, Filippos Christianos, Mhairi Dunion, Elliot Fosong, Samuel Garcin, Shangmin Guo, Balint Gyevnar, Trevor McInroe, Georgios Papoudakis, Arrasy Rahman, Lukas Schäfer, Massimiliano Tamborski, Giuseppe Vecchio, Cheng Wang, Stefano V. Albrecht
The development of autonomous agents which can interact with other agents to accomplish a given task is a core area of research in artificial intelligence and machine learning.
1 code implementation • 15 Mar 2022 • Runfa Chen, Yu Rong, Shangmin Guo, Jiaqi Han, Fuchun Sun, Tingyang Xu, Wenbing Huang
Following the great success of Vision Transformer variants (ViTs) in computer vision, ViTs have also demonstrated great potential in domain-adaptive semantic segmentation.
Ranked #7 on Semantic Segmentation on SYNTHIA-to-Cityscapes
1 code implementation • ICLR 2022 • Yi Ren, Shangmin Guo, Danica J. Sutherland
Observing the learning path not only provides a new perspective for understanding knowledge distillation, overfitting, and learning dynamics, but also reveals that the supervisory signal of a teacher network can be very unstable near the best points in training on real tasks.
no code implementations • ICLR 2022 • Shangmin Guo, Yi Ren, Kory Wallace Mathewson, Simon Kirby, Stefano V Albrecht, Kenny Smith
Researchers are using deep learning models to explore the emergence of language in various language games, where simulated agents interact and develop an emergent language to solve a task.
1 code implementation • 7 Jun 2021 • Shangmin Guo, Yi Ren, Kory Mathewson, Simon Kirby, Stefano V. Albrecht, Kenny Smith
Researchers are using deep learning models to explore the emergence of language in various language games, where agents interact and develop an emergent language to solve tasks.
1 code implementation • 4 Dec 2020 • Shangmin Guo, Yi Ren, Agnieszka Słowik, Kory Mathewson
Referential games and reconstruction games are the most common game types for studying emergent languages.
1 code implementation • ICLR 2020 • Yi Ren, Shangmin Guo, Matthieu Labeau, Shay B. Cohen, Simon Kirby
The principle of compositionality, which enables natural language to represent complex concepts via a structured combination of simpler ones, allows us to convey an open-ended set of messages using a limited vocabulary.
1 code implementation • 4 Nov 2019 • Shangmin Guo
Although their encoding method is not compositional in the way natural languages are from a human perspective, the emergent languages can be generalised to unseen inputs and, more importantly, are easier for models to learn.
no code implementations • 11 Oct 2019 • Shangmin Guo, Yi Ren, Serhii Havrylov, Stella Frank, Ivan Titov, Kenny Smith
Since it was first introduced, computer simulation has become an increasingly important tool in evolutionary linguistics.
no code implementations • IJCNLP 2017 • Shangmin Guo, Kang Liu, Shizhu He, Cao Liu, Jun Zhao, Zhuoyu Wei
The IJCNLP-2017 Multi-choice Question Answering (MCQA) task aims at exploring the performance of current Question Answering (QA) techniques via real-world complex questions collected from Chinese Senior High School Entrance Examination papers and the CK12 website.
1 code implementation • EACL 2017 • Shangmin Guo, Xiangrong Zeng, Shizhu He, Kang Liu, Jun Zhao
As one of the most important tests in China, Gaokao is designed to be difficult enough to distinguish excellent high school students.