no code implementations • 16 Apr 2024 • Kaibo Liu, Yiyang Liu, Zhenpeng Chen, Jie M. Zhang, Yudong Han, Yun Ma, Ge Li, Gang Huang
Conventional automated test generation tools struggle to generate test oracles and tricky bug-revealing test inputs.
1 code implementation • 3 Feb 2024 • Dong Huang, Jie M. Zhang, Yuhao Qing, Heming Cui
This paper presents EffiBench, a benchmark with 1,000 efficiency-critical coding problems for assessing the efficiency of code generated by code generation models.
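As an illustration of what "efficiency-critical" means here (a hypothetical sketch of my own, not EffiBench's actual harness), two functionally equivalent solutions to the same problem can differ sharply in runtime, and a benchmark can rank them by measured execution time:

```python
import timeit

# Two functionally equivalent solutions to the same toy problem
# (sum of 1..n), differing only in efficiency.
def sum_naive(n):
    """O(n): accumulate the sum in a loop."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def sum_closed_form(n):
    """O(1): Gauss's closed-form formula."""
    return n * (n + 1) // 2

# Correctness first: both must agree before efficiency is compared.
assert sum_naive(10_000) == sum_closed_form(10_000)

# Efficiency: time each candidate over repeated runs.
t_naive = timeit.timeit(lambda: sum_naive(10_000), number=100)
t_fast = timeit.timeit(lambda: sum_closed_form(10_000), number=100)
```

A benchmark of this kind scores generated code not only on passing tests but on how `t_fast` compares to a reference solution's time.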
1 code implementation • 20 Dec 2023 • Dong Huang, Qingwen Bu, Jie M. Zhang, Michael Luck, Heming Cui
The advancement of natural language processing (NLP) has been significantly boosted by the development of transformer-based large language models (LLMs).
Ranked #1 on Code Generation on HumanEval
no code implementations • 25 Oct 2023 • Yonghao Wu, Zheng Li, Jie M. Zhang, Yong Liu
With the growing interest in Large Language Models (LLMs) for fault localization and program repair, ensuring the integrity and generalizability of LLM-based methods becomes paramount.
no code implementations • 5 Aug 2023 • Xinyue Li, Zhenpeng Chen, Jie M. Zhang, Federica Sarro, Ying Zhang, Xuanzhe Liu
This paper analyzes fairness in automated pedestrian detection, a crucial but under-explored issue in autonomous driving systems.
1 code implementation • 25 Jul 2023 • Zhenpeng Chen, Jie M. Zhang, Federica Sarro, Mark Harman
Existing research mostly improves the fairness of Machine Learning (ML) software regarding a single protected attribute at a time, but this is unrealistic given that many users have multiple protected attributes.
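Handling multiple protected attributes means measuring fairness over their intersections, not each attribute in isolation. A minimal sketch of that idea (my own illustration with made-up data, not the paper's method) computes the positive-prediction rate for every intersectional subgroup and the gap between the best- and worst-treated ones:

```python
from itertools import product

# Hypothetical records: two protected attributes (sex, race) and a
# binary model prediction per individual.
records = [
    {"sex": "F", "race": "A", "pred": 1},
    {"sex": "F", "race": "A", "pred": 0},
    {"sex": "F", "race": "B", "pred": 0},
    {"sex": "M", "race": "A", "pred": 1},
    {"sex": "M", "race": "B", "pred": 1},
    {"sex": "M", "race": "B", "pred": 1},
]

def subgroup_rates(records, attrs):
    """Positive-prediction rate for every non-empty combination of
    protected-attribute values (the intersectional subgroups)."""
    values = {a: sorted({r[a] for r in records}) for a in attrs}
    rates = {}
    for combo in product(*(values[a] for a in attrs)):
        group = [r for r in records
                 if all(r[a] == v for a, v in zip(attrs, combo))]
        if group:
            rates[combo] = sum(r["pred"] for r in group) / len(group)
    return rates

rates = subgroup_rates(records, ["sex", "race"])
# Intersectional disparity: gap between the best- and worst-treated subgroup.
disparity = max(rates.values()) - min(rates.values())
```

Improving fairness for `sex` alone can leave a large `disparity` across intersections such as `("F", "B")`, which is why single-attribute mitigation is unrealistic for users with multiple protected attributes.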
no code implementations • 14 Jul 2022 • Max Hort, Zhenpeng Chen, Jie M. Zhang, Mark Harman, Federica Sarro
How many datasets are used for evaluating bias mitigation methods?
2 code implementations • 7 Jul 2022 • Zhenpeng Chen, Jie M. Zhang, Federica Sarro, Mark Harman
We find that (1) bias mitigation methods significantly decrease ML performance in 53% of the studied scenarios (ranging between 42% and 66% depending on the ML performance metric); (2) they significantly improve fairness, as measured by the 4 metrics used, in 46% of all scenarios (ranging between 24% and 59% depending on the fairness metric); (3) they even decrease both fairness and ML performance in 25% of the scenarios; (4) their effectiveness depends on the task, model, choice of protected attributes, and the set of metrics used to assess fairness and ML performance; and (5) no bias mitigation method achieves the best trade-off in all scenarios.
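The trade-off measured in these findings can be sketched with two toy metrics (a hand-rolled illustration with made-up predictions, not the study's benchmark): accuracy as the ML-performance metric and statistical parity difference as the fairness metric, comparing a baseline model against a hypothetical bias-mitigated one:

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions matching the ground truth."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def spd(y_pred, group):
    """Statistical parity difference: gap in positive-prediction rates
    between two demographic groups (0 = perfectly fair)."""
    g0 = [p for p, g in zip(y_pred, group) if g == 0]
    g1 = [p for p, g in zip(y_pred, group) if g == 1]
    return abs(sum(g0) / len(g0) - sum(g1) / len(g1))

# Hypothetical toy data: labels, group membership, and predictions from
# a baseline model versus a (made-up) bias-mitigated model.
y_true    = [1, 0, 1, 1, 0, 0]
group     = [0, 0, 0, 1, 1, 1]
baseline  = [1, 0, 1, 0, 0, 0]
mitigated = [0, 0, 1, 1, 0, 1]

# Mitigation here improves fairness (lower SPD) at a cost in accuracy —
# the kind of trade-off the study quantifies across many scenarios.
acc_drop = accuracy(y_true, baseline) - accuracy(y_true, mitigated)
spd_gain = spd(baseline, group) - spd(mitigated, group)
```

Both `acc_drop` and `spd_gain` are positive in this toy case; the study's point is that whether this trade-off pays off varies with task, model, attributes, and metric choice.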
1 code implementation • ICLR 2022 • Baptiste Roziere, Jie M. Zhang, Francois Charton, Mark Harman, Gabriel Synnaeve, Guillaume Lample
With little to no parallel data available for programming languages, unsupervised methods are well-suited to source code translation.
no code implementations • 19 Jun 2019 • Jie M. Zhang, Mark Harman, Lei Ma, Yang Liu
This paper provides a comprehensive survey of Machine Learning Testing (ML testing) research.
no code implementations • 24 May 2019 • Jie M. Zhang, Mark Harman, Benjamin Guedj, Earl T. Barr, John Shawe-Taylor
MV mutates training data labels, retrains the model against the mutated data, then uses the metamorphic relation that captures the consequent training performance changes to assess model fit.
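The MV idea can be sketched as follows. This toy version (my own simplification, using a hand-rolled nearest-centroid classifier rather than the paper's setup) flips a fraction of training labels, retrains, and checks the metamorphic relation that training accuracy should degrade roughly in proportion to the mutation ratio when the model fits signal rather than noise:

```python
import random

def train_centroid(X, y):
    """Nearest-centroid classifier: the mean point of each class."""
    cents = {}
    for label in set(y):
        pts = [x for x, l in zip(X, y) if l == label]
        cents[label] = [sum(c) / len(pts) for c in zip(*pts)]
    return cents

def predict(cents, x):
    """Assign x to the class with the nearest centroid."""
    return min(cents, key=lambda l: sum((a - b) ** 2
                                        for a, b in zip(cents[l], x)))

def train_accuracy(X, y):
    """Retrain on (X, y) and score on the same training data."""
    cents = train_centroid(X, y)
    return sum(predict(cents, x) == l for x, l in zip(X, y)) / len(y)

random.seed(0)
# Two well-separated Gaussian clusters, 20 points per class.
X = [[random.gauss(m, 0.3), random.gauss(m, 0.3)]
     for m in (0, 3) for _ in range(20)]
y = [0] * 20 + [1] * 20

base = train_accuracy(X, y)

# Metamorphic step: mutate a fraction of labels and retrain.
ratio = 0.25
y_mut = y[:]
for i in random.sample(range(len(y)), int(ratio * len(y))):
    y_mut[i] = 1 - y_mut[i]

# Relation: training accuracy should drop by about the mutation ratio.
drop = base - train_accuracy(X, y_mut)
```

A model that still scored highly on the mutated labels would be fitting noise (memorizing labels), which is the kind of poor model fit the relation is designed to expose.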