no code implementations • 26 Apr 2024 • Abhishek Kumar Singh, Ioannis Patras
The rapid evolution of the fashion industry increasingly intersects with technological advancements, particularly through the integration of generative AI.
2 code implementations • 10 Apr 2024 • Alexandros Xenos, Niki Maria Foteinopoulou, Ioanna Ntinou, Ioannis Patras, Georgios Tzimiropoulos
In the first stage, we propose prompting VLLMs to generate descriptions in natural language of the subject's apparent emotion relative to the visual context.
Ranked #1 on Emotion Recognition in Context on EMOTIC
no code implementations • 25 Mar 2024 • Stella Bounareli, Christos Tzelepis, Vasileios Argyriou, Ioannis Patras, Georgios Tzimiropoulos
To this end, in this paper we present DiffusionAct, a novel method that leverages the photo-realistic image generation of diffusion models to perform neural face reenactment.
3 code implementations • 13 Mar 2024 • Zhonglin Sun, Chen Feng, Ioannis Patras, Georgios Tzimiropoulos
This enables our method, namely LAndmark-based Facial Self-supervised learning (LAFS), to learn key representations that are more critical for face recognition.
1 code implementation • 11 Mar 2024 • Omnia Alwazzan, Abbas Khan, Ioannis Patras, Gregory Slabaugh
We propose a novel Multi-modal Outer Arithmetic Block (MOAB) based on arithmetic operations to combine latent representations of the different modalities for predicting the tumor grade (Grade II, III and IV).
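The core idea of outer-arithmetic fusion can be sketched as follows. This is a minimal illustration, not the paper's MOAB: the choice of operations (outer product, outer sum, outer difference), the bias-augmentation trick, and all dimensions here are assumptions for demonstration.

```python
import numpy as np

def outer_fusion(a, b):
    """Fuse two modality embeddings with outer arithmetic operations.

    Stacks the outer product, outer sum, and outer difference of the
    bias-augmented vectors into a multi-channel 2-D map that a small
    CNN classifier could then consume.
    """
    a = np.append(a, 1.0)  # append 1 so each vector's unary terms survive the product
    b = np.append(b, 1.0)
    product = np.outer(a, b)             # a_i * b_j
    addition = a[:, None] + b[None, :]   # a_i + b_j
    difference = a[:, None] - b[None, :] # a_i - b_j
    return np.stack([product, addition, difference])  # (3, len(a)+1, len(b)+1)

# Example: a 4-d embedding from one modality fused with a 3-d one from another
fused = outer_fusion(np.random.randn(4), np.random.randn(3))
print(fused.shape)  # (3, 5, 4)
```

The appeal of the outer operations is that every pairwise interaction between the two modalities' features appears explicitly in the fused map, rather than being collapsed by concatenation or averaging.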
1 code implementation • 10 Mar 2024 • Omnia Alwazzan, Ioannis Patras, Gregory Slabaugh
Fusion of multimodal healthcare data holds great promise to provide a holistic view of a patient's health, taking advantage of the complementarity of different modalities while leveraging their correlation.
no code implementations • 4 Mar 2024 • Zheng Gao, Ioannis Patras
Recent efforts toward this goal are limited to treating each face image as a whole, i.e., learning consistent facial representations at the image level, which overlooks the consistency of local facial representations (i.e., facial regions like the eyes, nose, etc.).
1 code implementation • 19 Feb 2024 • James Oldfield, Markos Georgopoulos, Grigorios G. Chrysos, Christos Tzelepis, Yannis Panagakis, Mihalis A. Nicolaou, Jiankang Deng, Ioannis Patras
The Mixture of Experts (MoE) paradigm provides a powerful way to decompose inscrutable dense layers into smaller, modular computations often more amenable to human interpretation, debugging, and editability.
no code implementations • 5 Feb 2024 • Stella Bounareli, Christos Tzelepis, Vasileios Argyriou, Ioannis Patras, Georgios Tzimiropoulos
Moreover, we show that by embedding real images in the GAN latent space, our method can be successfully used for the reenactment of real-world faces.
1 code implementation • 2 Nov 2023 • Moreno D'Incà, Christos Tzelepis, Ioannis Patras, Nicu Sebe
These paths are then applied to augment images to improve the fairness of a given dataset.
1 code implementation • 25 Oct 2023 • Niki Maria Foteinopoulou, Ioannis Patras
To test this, we evaluate using zero-shot classification of the model trained on sample-level descriptions on four popular dynamic FER datasets.
Ranked #1 on Zero-Shot Facial Expression Recognition on MAFW
no code implementations • 20 Oct 2023 • Alexandros Xenos, Themos Stafylakis, Ioannis Patras, Georgios Tzimiropoulos
This paper is on the problem of Knowledge-Based Visual Question Answering (KB-VQA).
Ranked #5 on Visual Question Answering (VQA) on A-OKVQA (DA VQA Score metric)
1 code implementation • 25 Aug 2023 • Zengqun Zhao, Ioannis Patras
For the visual part, based on the CLIP image encoder, a temporal model consisting of several Transformer encoders is introduced for extracting temporal facial expression features, and the final feature embedding is obtained as a learnable "class" token.
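The class-token readout described above can be sketched as follows. This is a hedged illustration, not the paper's implementation: the embedding size, layer count and head count are placeholder values, and the frame features are assumed to come from a frozen CLIP image encoder.

```python
import torch
import torch.nn as nn

class TemporalHead(nn.Module):
    """Temporal model over per-frame features with a learnable "class" token.

    Frame embeddings (e.g. from a frozen CLIP image encoder) are prepended
    with a learnable token; after a few Transformer encoder layers, the
    token position is read out as the video-level expression embedding.
    """
    def __init__(self, dim=512, num_layers=2, num_heads=8):
        super().__init__()
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        layer = nn.TransformerEncoderLayer(dim, num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)

    def forward(self, frame_feats):              # (batch, frames, dim)
        cls = self.cls_token.expand(frame_feats.size(0), -1, -1)
        x = torch.cat([cls, frame_feats], dim=1)  # prepend the token
        return self.encoder(x)[:, 0]              # video embedding at the token

video_emb = TemporalHead()(torch.randn(2, 16, 512))
print(video_emb.shape)  # torch.Size([2, 512])
```

Because attention lets the class token aggregate from every frame, the readout can weight expressive frames more heavily than neutral ones, which a plain temporal average cannot do.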
no code implementations • 25 Aug 2023 • Zheng Gao, Chen Feng, Ioannis Patras
Inspired by cross-modality learning, we extend this existing framework that only learns from global features by encouraging the global features and intermediate layer features to learn from each other.
no code implementations • 28 Jul 2023 • Ioannis Maniadis Metaxas, Adrian Bulat, Ioannis Patras, Brais Martinez, Georgios Tzimiropoulos
DETR-based object detectors have achieved remarkable performance but are sample-inefficient and exhibit slow convergence.
1 code implementation • ICCV 2023 • Stella Bounareli, Christos Tzelepis, Vasileios Argyriou, Ioannis Patras, Georgios Tzimiropoulos
In this paper, we present our method for neural face reenactment, called HyperReenact, that aims to generate realistic talking head images of a source identity, driven by a target facial pose.
2 code implementations • 23 May 2023 • James Oldfield, Christos Tzelepis, Yannis Panagakis, Mihalis A. Nicolaou, Ioannis Patras
Latent image representations arising from vision-language models have proved immensely useful for a variety of downstream tasks.
1 code implementation • 6 Apr 2023 • Giorgos Kordopatis-Zilos, Giorgos Tolias, Christos Tzelepis, Ioannis Kompatsiaris, Ioannis Patras, Symeon Papadopoulos
We introduce S$^2$VS, a video similarity learning approach with self-supervision.
Ranked #1 on Video Retrieval on FIVR-200K
1 code implementation • CVPR 2023 • Ioannis Maniadis Metaxas, Georgios Tzimiropoulos, Ioannis Patras
Clustering has been a major research topic in the field of machine learning, one to which Deep Learning has recently been applied with significant success.
1 code implementation • CVPR 2023 • Chen Feng, Ioannis Patras
More specifically, within the contrastive learning framework, our method generates soft labels for each sample, with the aid of the coarse labels, against both the other samples and another augmented view of the sample in question.
Ranked #1 on Learning with coarse labels on cifar100
1 code implementation • CVPR 2023 • Simone Barattin, Christos Tzelepis, Ioannis Patras, Nicu Sebe
By optimizing the latent codes directly, we ensure that the identity is kept at a desired distance from the original (with an identity obfuscation loss), whilst preserving the facial attributes (using a novel feature-matching loss in FaRL's deep feature space).
1 code implementation • 21 Nov 2022 • Georgios Zoumpourlis, Ioannis Patras
The first loss applies curriculum learning, forcing each feature extractor to specialize to a subset of the training subjects and promoting feature diversity.
1 code implementation • 27 Sep 2022 • Stella Bounareli, Christos Tzelepis, Vasileios Argyriou, Ioannis Patras, Georgios Tzimiropoulos
In this paper we address the problem of neural face reenactment, where, given a pair of a source and a target facial image, we need to transfer the target's pose (defined as the head pose and its facial expressions) to the source image, by preserving at the same time the source's identity characteristics (e.g., facial shape, hair style, etc.), even in the challenging case where the source and the target faces belong to different identities.
1 code implementation • 22 Sep 2022 • Harsh Panwar, Ioannis Patras
Capsule Networks have shown tremendous advancement in the past decade, outperforming traditional CNNs in various tasks due to their equivariant properties.
1 code implementation • 22 Jul 2022 • Chen Feng, Ioannis Patras
Self-supervised learning has recently achieved great success in representation learning without human annotations.
1 code implementation • 12 Jul 2022 • Niki Maria Foteinopoulou, Ioannis Patras
In the case of affect recognition, we outperform previous vision-based methods in terms of CCC on both the OMG and the AMIGOS datasets.
Ranked #1 on Continuous Affect Estimation on AMIGOS
1 code implementation • ACM ICMR 2022 • Evlampios Apostolidis, Georgios Balaouras, Vasileios Mezaris, Ioannis Patras
Instead of simply modeling the frames' dependencies based on global attention, our method integrates a concentrated attention mechanism that is able to focus on non-overlapping blocks in the main diagonal of the attention matrix, and to enrich the existing information by extracting and exploiting knowledge about the uniqueness and diversity of the associated frames of the video.
Ranked #1 on Unsupervised Video Summarization on TvSum
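The block-diagonal structure of such a concentrated attention mechanism can be sketched with a simple mask. This is an illustration of the general idea only; the block size and how the paper combines the blocks with the uniqueness/diversity information are not reproduced here.

```python
import numpy as np

def block_diagonal_mask(n_frames, block_size):
    """Boolean mask keeping only non-overlapping blocks on the main
    diagonal of an (n_frames, n_frames) attention matrix, so that each
    frame attends only to the frames in its own temporal block.
    The block size here is an illustrative hyper-parameter.
    """
    blocks = np.arange(n_frames) // block_size
    return blocks[:, None] == blocks[None, :]

mask = block_diagonal_mask(6, 2)
# Attention scores outside the diagonal blocks would be set to -inf
# before the softmax, concentrating attention within each block.
print(mask.astype(int))
```

Restricting attention to local blocks reflects the observation that a frame's most informative dependencies for summarization are usually its temporal neighbours, while global context is supplied separately.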
1 code implementation • 5 Jun 2022 • Christos Tzelepis, James Oldfield, Georgios Tzimiropoulos, Ioannis Patras
This work addresses the problem of discovering non-linear interpretable paths in the latent space of pre-trained GANs in a model-agnostic manner.
1 code implementation • 31 May 2022 • James Oldfield, Christos Tzelepis, Yannis Panagakis, Mihalis A. Nicolaou, Ioannis Patras
Recent advances in the understanding of Generative Adversarial Networks (GANs) have led to remarkable progress in visual editing and synthesis tasks, capitalizing on the rich semantics that are embedded in the latent spaces of pre-trained GANs.
1 code implementation • IEEE International Symposium on Multimedia (ISM) 2021 • Evlampios Apostolidis, Georgios Balaouras, Vasileios Mezaris, Ioannis Patras
This paper presents a new method for supervised video summarization.
Ranked #1 on Video Summarization on SumMe
no code implementations • 23 Nov 2021 • James Oldfield, Markos Georgopoulos, Yannis Panagakis, Mihalis A. Nicolaou, Ioannis Patras
This paper addresses the problem of finding interpretable directions in the latent space of pre-trained Generative Adversarial Networks (GANs) to facilitate controllable image synthesis.
1 code implementation • 22 Nov 2021 • Chen Feng, Georgios Tzimiropoulos, Ioannis Patras
Under this setting, unlike previous methods that often introduce multiple assumptions and lead to complex solutions, we propose a simple, efficient and robust framework named Sample Selection and Relabelling (SSR) that achieves SOTA results in various conditions with a minimal number of hyperparameters.
Ranked #1 on Image Classification on CIFAR-10 (with noisy labels)
1 code implementation • ICCV 2021 • Christos Tzelepis, Georgios Tzimiropoulos, Ioannis Patras
This work addresses the problem of discovering, in an unsupervised manner, interpretable paths in the latent space of pretrained GANs, so as to provide an intuitive and easy way of controlling the underlying generative factors.
1 code implementation • 24 Jun 2021 • Giorgos Kordopatis-Zilos, Christos Tzelepis, Symeon Papadopoulos, Ioannis Kompatsiaris, Ioannis Patras
In this work, we propose a Knowledge Distillation framework, called Distill-and-Select (DnS), that starting from a well-performing fine-grained Teacher Network learns: a) Student Networks at different retrieval performance and computational efficiency trade-offs and b) a Selector Network that at test time rapidly directs samples to the appropriate student to maintain both high retrieval performance and high computational efficiency.
Ranked #2 on Video Retrieval on FIVR-200K
1 code implementation • 8 Jun 2021 • Ting-Ting Xie, Christos Tzelepis, Fan Fu, Ioannis Patras
Learning to localize actions in long, cluttered, and untrimmed videos is a hard task, that in the literature has typically been addressed assuming the availability of large amounts of annotated training samples for each class -- either in a fully-supervised setting, where action boundaries are known, or in a weakly-supervised setting, where only class labels are known for each video.
no code implementations • 8 Mar 2021 • Fan Fu, TingTing Xie, Ioannis Patras, Sepehr Jalali
Understanding interactions between objects in an image is an important element for generating captions.
2 code implementations • 11 Feb 2021 • Christos Tzelepis, Ioannis Patras
In this technical report we study the problem of propagation of uncertainty (in terms of variances of given uni-variate normal random variables) through typical building blocks of a Convolutional Neural Network (CNN).
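For the simplest such building block, a linear (fully-connected or convolutional) layer, the variance propagation has a closed form when the inputs are independent. The sketch below shows only that linear case; how the report treats nonlinearities such as ReLU (which require the moments of a truncated/folded normal) is not reproduced here.

```python
import numpy as np

def linear_variance(W, var_x):
    """Propagate per-element variances of independent normal inputs
    through a linear layer y = W x + b:
        Var(y_i) = sum_j W_ij^2 * Var(x_j)
    (the bias b is deterministic, so it does not affect the variance)."""
    return (W ** 2) @ var_x

W = np.array([[1.0, 2.0],
              [0.5, -1.0]])
var_x = np.array([0.1, 0.4])
print(linear_variance(W, var_x))  # [1*0.1 + 4*0.4, 0.25*0.1 + 1*0.4] = [1.7, 0.425]
```

The same rule covers convolutions, since a convolution is a linear map; the squared-weight matrix simply becomes the squared kernel applied to the variance map.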
no code implementations • 15 Jan 2021 • Evlampios Apostolidis, Eleni Adamantidou, Alexandros I. Metsai, Vasileios Mezaris, Ioannis Patras
Video summarization technologies aim to create a concise and complete synopsis by selecting the most informative parts of the video content.
1 code implementation • IEEE Transactions on Circuits and Systems for Video Technology 2020 • Evlampios Apostolidis, Eleni Adamantidou, Alexandros I. Metsai, Vasileios Mezaris, Ioannis Patras
This paper presents a new method for unsupervised video summarization.
Ranked #3 on Unsupervised Video Summarization on TvSum
no code implementations • 25 Aug 2020 • Ting-Ting Xie, Christos Tzelepis, Ioannis Patras
Results in the action localization problem show that the incorporation of second order statistics improves over the baseline network, and that VANp surpasses the accuracy of virtually all other two-stage networks without involving any additional parameters.
no code implementations • 25 Aug 2020 • Ting-Ting Xie, Christos Tzelepis, Ioannis Patras
We use two uncertainty-aware boundary regression losses: first, the Kullback-Leibler divergence between the ground truth location of the boundary and the Gaussian modeling the prediction of the boundary and second, the expectation of the $\ell_1$ loss under the same Gaussian.
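Both losses admit closed forms. The expectation of the $\ell_1$ loss under a Gaussian is the mean of a folded normal, and, if one models the ground-truth boundary as a (narrow) Gaussian, the KL term is the standard Gaussian-Gaussian divergence. The sketch below states those closed forms; treating the ground truth as a Gaussian with a chosen width is an assumption of this illustration, not necessarily the paper's exact formulation.

```python
import math

def expected_l1(mu, sigma, g):
    """E|X - g| for X ~ N(mu, sigma^2), via the folded-normal mean:
    with d = mu - g,
        E|X - g| = sigma*sqrt(2/pi)*exp(-d^2/(2 sigma^2)) + d*erf(d/(sigma*sqrt(2)))."""
    d = mu - g
    return (sigma * math.sqrt(2.0 / math.pi) * math.exp(-d * d / (2.0 * sigma ** 2))
            + d * math.erf(d / (sigma * math.sqrt(2.0))))

def gaussian_kl(g, sigma_g, mu, sigma):
    """KL( N(g, sigma_g^2) || N(mu, sigma^2) ) in closed form."""
    return (math.log(sigma / sigma_g)
            + (sigma_g ** 2 + (g - mu) ** 2) / (2.0 * sigma ** 2) - 0.5)

# A very confident prediction (small sigma) 2 units off the true boundary:
# the expected l1 collapses to (almost) the plain l1 distance.
print(expected_l1(mu=10.0, sigma=0.1, g=12.0))
```

Note the appealing limiting behaviour: as `sigma` shrinks, `expected_l1` approaches the ordinary $\ell_1$ distance, while with `mu == g` it equals `sigma*sqrt(2/pi)`, so residual uncertainty is still penalized.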
1 code implementation • MultiMedia Modeling (MMM) 2019 • Evlampios Apostolidis, Eleni Adamantidou, Alexandros I. Metsai, Vasileios Mezaris, Ioannis Patras
Experimental evaluation on two datasets (SumMe and TVSum) documents the contribution of the attention auto-encoder to faster and more stable training of the model, resulting in a significant performance improvement with respect to the original model and demonstrating the competitiveness of the proposed SUM-GAN-AAE against the state of the art.
Ranked #6 on Unsupervised Video Summarization on SumMe
1 code implementation • AI4TV 2019 • Evlampios Apostolidis, Alexandros I. Metsai, Eleni Adamantidou, Vasileios Mezaris, Ioannis Patras
In this paper we present our work on improving the efficiency of adversarial training for unsupervised video summarization.
Ranked #5 on Unsupervised Video Summarization on TvSum
1 code implementation • ICCV 2019 • Giorgos Kordopatis-Zilos, Symeon Papadopoulos, Ioannis Patras, Ioannis Kompatsiaris
Subsequently, the similarity matrix between all video frames is fed to a four-layer CNN, and then summarized using Chamfer Similarity (CS) into a video-to-video similarity score -- this avoids feature aggregation before the similarity calculation between videos and captures the temporal similarity patterns between matching frame sequences.
Ranked #5 on Video Retrieval on FIVR-200K
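The Chamfer Similarity reduction itself is simple to sketch: for each query frame take its best-matching target frame, then average. Note that in the method described above the similarity matrix first passes through the four-layer CNN; the sketch below shows only the final CS reduction.

```python
import numpy as np

def chamfer_similarity(sim):
    """Summarize a frame-to-frame similarity matrix into a single
    video-to-video score: average, over query frames, of the best
    match among target frames. Tolerant to temporal misalignment,
    since matching frames need not be at the same index."""
    return float(np.mean(np.max(sim, axis=1)))

sim = np.array([[0.9, 0.2, 0.1],
                [0.1, 0.8, 0.3],
                [0.2, 0.1, 0.7]])
print(chamfer_similarity(sim))  # 0.8 = mean of the row maxima (0.9, 0.8, 0.7)
```

Because the max runs over target frames independently per query frame, the score is unaffected by shifts or re-orderings of the matching segments, which is what makes it suitable for near-duplicate video retrieval.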
no code implementations • 21 Jul 2019 • Mina Bishay, Georgios Zoumpourlis, Ioannis Patras
At the heart of our network is a meta-learning approach that learns to compare representations of variable temporal length, that is, either two videos of different length (in the case of few-shot action recognition) or a video and a semantic representation such as a word vector (in the case of zero-shot action recognition).
Ranked #7 on Few Shot Action Recognition on Kinetics-100
no code implementations • 25 May 2019 • Ting-Ting Xie, Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu, Ioannis Patras
Temporal action localization has recently attracted significant interest in the Computer Vision community.
no code implementations • 11 Feb 2019 • Youngkyoon Jang, Hatice Gunes, Ioannis Patras
In this paper, we present a novel single shot face-related task analysis method, called Face-SSD, for detecting faces and for performing various face-related (classification/regression) tasks including smile recognition, face attribute prediction and valence-arousal estimation in the wild.
1 code implementation • 11 Sep 2018 • Giorgos Kordopatis-Zilos, Symeon Papadopoulos, Ioannis Patras, Ioannis Kompatsiaris
To create the dataset, we devise a process for the collection of YouTube videos based on major news events from recent years crawled from Wikipedia and deploy a retrieval pipeline for the automatic selection of query videos based on their estimated suitability as benchmarks.
no code implementations • 7 Aug 2018 • Mina Bishay, Petar Palasek, Stefan Priebe, Ioannis Patras
Patients with schizophrenia often display impairments in the expression of emotion and speech, which are observed in their facial behaviour.
no code implementations • 13 Jan 2018 • Petar Palasek, Ioannis Patras
In this work we explore how the architecture proposed in [8], which expresses the processing steps of the classical Fisher vector pipeline, i.e., dimensionality reduction by principal component analysis (PCA) projection, Gaussian mixture model (GMM) fitting and Fisher vector descriptor extraction, as network layers, can be modified into a hybrid network that combines the benefits of both unsupervised and supervised training methods, resulting in a model that learns a semi-supervised Fisher vector descriptor of the input data.
no code implementations • ICCV 2017 • Ioannis Marras, Petar Palasek, Ioannis Patras
We overcome this by introducing a Markov Random Field (MRF)-based spatial model network between the coarse and the refinement model that introduces geometric constraints on the relative locations of the body joints.
no code implementations • 19 Jul 2017 • Petar Palasek, Ioannis Patras
In this work we propose a novel neural network architecture for the problem of human action recognition in videos.
no code implementations • 22 Jan 2016 • Aria Ahmadi, Ioannis Patras
In this paper, we propose a direct method: we train a Convolutional Neural Network (CNN) that, given a pair of images as input at test time, produces a dense motion field F at its output layer.
no code implementations • 25 Nov 2015 • Christos Tzelepis, Damianos Galanopoulos, Vasileios Mezaris, Ioannis Patras
In this work we deal with the problem of high-level event detection in video.
1 code implementation • 11 Jul 2015 • Heng Yang, Wenxuan Mou, Yichi Zhang, Ioannis Patras, Hatice Gunes, Peter Robinson
In this paper we propose a supervised initialization scheme for cascaded face alignment based on explicit head pose estimation.
1 code implementation • 15 Apr 2015 • Christos Tzelepis, Vasileios Mezaris, Ioannis Patras
In this paper, we propose a maximum margin classifier that deals with uncertainty in data input.
no code implementations • CVPR 2015 • Heng Yang, Ioannis Patras
Our experiments lead to several interesting findings: 1) Surprisingly, most state-of-the-art methods struggle to preserve the mirror symmetry, despite the fact that they have very similar overall performance on the original and mirror images; 2) the low mirrorability is not caused by training or testing sample bias, as all algorithms are trained on both the original images and their mirrored versions; 3) the mirror error is strongly correlated to the localization/alignment error (with correlation coefficients around 0.7).