Search Results for author: Yixiao Zhang

Found 22 papers, 15 papers with code

BUTTER: A Representation Learning Framework for Bi-directional Music-Sentence Retrieval and Generation

no code implementations • NLP4MusA 2020 • Yixiao Zhang, Ziyu Wang, Dingsu Wang, Gus Xia


Exploiting Structural Consistency of Chest Anatomy for Unsupervised Anomaly Detection in Radiography Images

1 code implementation • 13 Mar 2024 • Tiange Xiang, Yixiao Zhang, Yongyi Lu, Alan Yuille, Chaoyi Zhang, Weidong Cai, Zongwei Zhou

To this end, we propose a Simple Space-Aware Memory Matrix for In-painting and Detecting anomalies from radiography images (abbreviated as SimSID).

Anatomy Image Reconstruction +1


Arrange, Inpaint, and Refine: Steerable Long-term Music Audio Generation and Editing via Content-based Controls

1 code implementation • 14 Feb 2024 • Liwei Lin, Gus Xia, Yixiao Zhang, Junyan Jiang

We apply this method to fine-tune MusicGen, a leading autoregressive music generation model.

Audio Generation Music Generation


MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models

no code implementations • 9 Feb 2024 • Yixiao Zhang, Yukara Ikemiya, Gus Xia, Naoki Murata, Marco Martínez, Wei-Hsiang Liao, Yuki Mitsufuji, Simon Dixon

This paper introduces a novel approach to the editing of music generated by such models, enabling the modification of specific attributes, such as genre, mood and instrument, while maintaining other aspects unchanged.

Music Generation Text-to-Music Generation


The Song Describer Dataset: a Corpus of Audio Captions for Music-and-Language Evaluation

1 code implementation • 16 Nov 2023 • Ilaria Manco, Benno Weck, Seungheon Doh, Minz Won, Yixiao Zhang, Dmitry Bogdanov, Yusong Wu, Ke Chen, Philip Tovstogan, Emmanouil Benetos, Elio Quinton, György Fazekas, Juhan Nam

We introduce the Song Describer dataset (SDD), a new crowdsourced corpus of high-quality audio-caption pairs, designed for the evaluation of music-and-language models.

Music Captioning Music Generation +2



Content-based Controls For Music Large Language Modeling

1 code implementation • 26 Oct 2023 • Liwei Lin, Gus Xia, Junyan Jiang, Yixiao Zhang

We aim to further equip the models with direct and content-based controls on innate music languages such as pitch, chords and drum track.

Language Modelling Music Generation +1


Loop Copilot: Conducting AI Ensembles for Music Generation and Iterative Editing

no code implementations • 19 Oct 2023 • Yixiao Zhang, Akira Maezawa, Gus Xia, Kazuhiko Yamamoto, Simon Dixon

Creating music is iterative, requiring varied methods at each stage.

Language Modelling Large Language Model +1


AI (r)evolution -- where are we heading? Thoughts about the future of music and sound technologies in the era of deep learning

no code implementations • 20 Sep 2023 • Giovanni Bindi, Nils Demerlé, Rodrigo Diaz, David Genova, Aliénor Golvet, Ben Hayes, Jiawen Huang, Lele Liu, Vincent Martos, Sarah Nabi, Teresa Pelinski, Lenny Renault, Saurjya Sarkar, Pedro Sarmento, Cyrus Vahidi, Lewis Wolstanholme, Yixiao Zhang, Axel Roebel, Nick Bryan-Kinns, Jean-Louis Giavitto, Mathieu Barthet

The students represent the future generation of AI and music researchers.


Exploring XAI for the Arts: Explaining Latent Space in Generative Music

1 code implementation • 10 Aug 2023 • Nick Bryan-Kinns, Berker Banar, Corey Ford, Courtney N. Reed, Yixiao Zhang, Simon Colton, Jack Armitage

We increase the explainability of the model by: i) using latent space regularisation to force some specific dimensions of the latent space to map to meaningful musical attributes, ii) providing a user interface feedback loop to allow people to adjust dimensions of the latent space and observe the results of these changes in real-time, iii) providing a visualisation of the musical attributes in the latent space to help people understand and predict the effect of changes to latent space dimensions.

Music Generation


Continual Learning for Abdominal Multi-Organ and Tumor Segmentation

1 code implementation • 1 Jun 2023 • Yixiao Zhang, Xinyi Li, Huimiao Chen, Alan Yuille, Yaoyao Liu, Zongwei Zhou

The ability to dynamically extend a model to new data and classes is critical for multiple organ and tumor segmentation.

Continual Learning Organ Segmentation +2


CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection

2 code implementations • ICCV 2023 • Jie Liu, Yixiao Zhang, Jie-Neng Chen, Junfei Xiao, Yongyi Lu, Bennett A. Landman, Yixuan Yuan, Alan Yuille, Yucheng Tang, Zongwei Zhou

The proposed model is developed from an assembly of 14 datasets, using a total of 3,410 CT scans for training and then evaluated on 6,162 external CT scans from 3 additional datasets.

Ranked #1 on Organ Segmentation on BTCV

Organ Segmentation Segmentation +1



Vis2Mus: Exploring Multimodal Representation Mapping for Controllable Music Generation

1 code implementation • 10 Nov 2022 • Runbang Zhang, Yixiao Zhang, Kai Shao, Ying Shan, Gus Xia

In this study, we explore the representation mapping from the domain of visual arts to the domain of music, with which we can use visual arts as an effective handle to control music generation.

Music Generation Representation Learning +1


Learning Hierarchical Metrical Structure Beyond Measures

1 code implementation • 21 Sep 2022 • Junyan Jiang, Daniel Chin, Yixiao Zhang, Gus Xia

In this paper, we explore a data-driven approach to automatically extract hierarchical metrical structures from scores.

Information Retrieval Music Information Retrieval +1


Interpreting Song Lyrics with an Audio-Informed Pre-trained Language Model

1 code implementation • 24 Aug 2022 • Yixiao Zhang, Junyan Jiang, Gus Xia, Simon Dixon

Lyric interpretations can help people understand songs and their lyrics quickly, and can also make it easier to manage, retrieve and discover songs efficiently from the growing mass of music archives.

Language Modelling Retrieval


Fast AdvProp

1 code implementation • ICLR 2022 • Jieru Mei, Yucheng Han, Yutong Bai, Yixiao Zhang, Yingwei Li, Xianhang Li, Alan Yuille, Cihang Xie

Specifically, our modifications in Fast AdvProp are guided by the hypothesis that disentangled learning with adversarial examples is the key for performance improvements, while other training recipes (e.g., paired clean and adversarial training samples, multi-step adversarial attackers) could be largely simplified.

Data Augmentation object-detection +1


SQUID: Deep Feature In-Painting for Unsupervised Anomaly Detection

2 code implementations • CVPR 2023 • Tiange Xiang, Yixiao Zhang, Yongyi Lu, Alan L. Yuille, Chaoyi Zhang, Weidong Cai, Zongwei Zhou

Radiography imaging protocols focus on particular body regions, therefore producing images of great similarity and yielding recurrent anatomical structures across patients.

Anatomy Unsupervised Anomaly Detection


A Light-weight Interpretable Compositional Model for Nuclei Detection and Weakly-Supervised Segmentation

no code implementations • 26 Oct 2021 • Yixiao Zhang, Adam Kortylewski, Qing Liu, Seyoun Park, Benjamin Green, Elizabeth Engle, Guillermo Almodovar, Ryan Walk, Sigfredo Soto-Diaz, Janis Taube, Alex Szalay, Alan Yuille

It requires annotations only on isolated nuclei, rather than on all nuclei in the dataset.

Segmentation Weakly supervised segmentation


Calibrating Concepts and Operations: Towards Symbolic Reasoning on Real Images

1 code implementation • ICCV 2021 • Zhuowan Li, Elias Stengel-Eskin, Yixiao Zhang, Cihang Xie, Quan Tran, Benjamin Van Durme, Alan Yuille

Our experiments show CCO substantially boosts the performance of neural symbolic methods on real images.

Question Answering Visual Question Answering


Learning Interpretable Representation for Controllable Polyphonic Music Generation

2 code implementations • 17 Aug 2020 • Ziyu Wang, Dingsu Wang, Yixiao Zhang, Gus Xia

While deep generative models have become the leading methods for algorithmic composition, it remains a challenging problem to control the generation process because the latent variables of most deep-learning models lack good interpretability.

Disentanglement Music Generation +1


PIANOTREE VAE: Structured Representation Learning for Polyphonic Music

2 code implementations • 17 Aug 2020 • Ziyu Wang, Yiyi Zhang, Yixiao Zhang, Junyan Jiang, Ruihan Yang, Junbo Zhao, Gus Xia

The dominant approach for music representation learning involves the deep unsupervised model family variational autoencoder (VAE).

Music Generation Representation Learning


When Radiology Report Generation Meets Knowledge Graph

no code implementations • 19 Feb 2020 • Yixiao Zhang, Xiaosong Wang, Ziyue Xu, Qihang Yu, Alan Yuille, Daguang Xu

In addition, we proposed a new evaluation metric for radiology image reporting with the assistance of the same composed graph.

Graph Embedding Image Captioning


C2FNAS: Coarse-to-Fine Neural Architecture Search for 3D Medical Image Segmentation

no code implementations • CVPR 2020 • Qihang Yu, Dong Yang, Holger Roth, Yutong Bai, Yixiao Zhang, Alan L. Yuille, Daguang Xu

3D convolutional neural networks (CNNs) have proven very successful in parsing organs or tumours in 3D medical images, but choosing or designing a proper 3D network for a given task context remains complicated and time-consuming.

Image Segmentation Medical Image Segmentation +3

