Search Results for author: Malek Mechergui

Found 2 papers, 0 papers with code

Handling Reward Misspecification in the Presence of Expectation Mismatch

no code implementations • 12 Apr 2024 • Sarath Sreedharan, Malek Mechergui

Detecting and handling misspecified objectives, such as reward functions, has been widely recognized as one of the central challenges within the domain of Artificial Intelligence (AI) safety research.

Goal Alignment: A Human-Aware Account of Value Alignment Problem

no code implementations • 2 Feb 2023 • Malek Mechergui, Sarath Sreedharan

To address this lacuna, we propose a novel formulation for the value alignment problem, named goal alignment that focuses on a few central challenges related to value alignment.
