1 code implementation • 14 Sep 2021 • Pegah Rahimian, Afshin Oroojlooy, Laszlo Toka
For instance, in the optimized policy, we observe that some actions, such as a foul or a ball out, can sometimes be more rewarding than a shot in specific situations.
no code implementations • NeurIPS 2020 • Afshin Oroojlooy, MohammadReza Nazari, Davood Hajinezhad, Jorge Silva
The first attention model is introduced to handle different numbers of roads-lanes, and the second attention model is intended to enable decision-making with any number of phases in an intersection.
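As a rough illustration of why attention suits a varying number of lanes, the sketch below pools any number of lane feature vectors into one fixed-size vector. It is not the paper's architecture: the function names `softmax` and `attention_pool`, the dot-product scoring, and the `query` vector are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(lane_features, query):
    """Pool a variable number of lane vectors into one fixed-size vector.

    lane_features: list of equal-length feature vectors, one per lane
                   (hypothetical features, e.g. queue length, waiting time).
    query:         a vector of the same length (assumed learned elsewhere).
    """
    # Score each lane by its dot product with the query.
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in lane_features]
    # Normalize scores into attention weights that sum to 1.
    weights = softmax(scores)
    # The weighted sum has the feature dimension, whatever the lane count.
    dim = len(query)
    pooled = [
        sum(w * feat[i] for w, feat in zip(weights, lane_features))
        for i in range(dim)
    ]
    return pooled, weights
```

Calling `attention_pool` with two lanes or with five lanes returns a pooled vector of the same length, which is the property that lets one model serve intersections of different sizes.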
4 code implementations • NeurIPS 2018 • Mohammadreza Nazari, Afshin Oroojlooy, Lawrence V. Snyder, Martin Takáč
Our model represents a parameterized stochastic policy, and by applying a policy gradient algorithm to optimize its parameters, the trained model produces the solution as a sequence of consecutive actions in real time, without the need to re-train for every new problem instance.
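The idea of a parameterized stochastic policy trained by policy gradient can be sketched on a toy problem. The example below is a minimal REINFORCE-style update with a running baseline on a single-step, three-action task. It is a stand-in, not the paper's sequence model: the function `train_policy`, the reward values, the learning rate, and the baseline rate are all assumptions chosen for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_policy(rewards, steps=5000, lr=0.1, seed=0):
    """Train a softmax policy over len(rewards) actions with REINFORCE.

    rewards: fixed reward per action (hypothetical toy environment).
    Returns the final action probabilities.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(rewards)  # policy parameters (one logit per action)
    baseline = 0.0                # running average reward, reduces variance
    for _ in range(steps):
        probs = softmax(theta)
        # Sample an action from the current stochastic policy.
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        advantage = rewards[action] - baseline
        # Gradient of log pi(action) w.r.t. logit j is 1[j == action] - probs[j].
        for j in range(len(theta)):
            indicator = 1.0 if j == action else 0.0
            theta[j] += lr * advantage * (indicator - probs[j])
        # Move the baseline toward the observed reward.
        baseline += 0.05 * (rewards[action] - baseline)
    return softmax(theta)
```

For example, `train_policy([0.0, 1.0, 0.2])` should shift most probability mass toward the second action. Once trained, the policy is used by sampling or taking the most probable action, with no further optimization per query.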