no code implementations • 3 Mar 2024 • Mehran Hosseini, Peyman Hosseini
It outperforms standard SDPA on vision and natural language tasks by up to 17% while using one fewer matrix multiplication per head and 25% fewer parameters.
no code implementations • 3 Jan 2024 • Mehran Hosseini, Peyman Hosseini
The inability of image generative models to recreate intricate geometric features, such as those present in human hands and fingers, has been an ongoing problem in image generation for nearly a decade.
1 code implementation • 4 Mar 2023 • Peyman Hosseini, Mehran Hosseini, Sana Sabah Al-Azzawi, Marcus Liwicki, Ignacio Castro, Matthew Purver
We study the influence of different activation functions in the output layer of deep neural network models for soft and hard label prediction in the learning-with-disagreement task.