We propose WSAUC, a unified and robust framework for weakly supervised AUC optimization. The framework covers multiple scenarios, including AUC optimization with noisy labels, positive-unlabeled AUC optimization, multi-instance AUC optimization, and semi-supervised AUC optimization with or without noise. It achieves robust AUC optimization through a novel variant of AUC, namely rpAUC. Theoretical and empirical results validate the effectiveness of the framework.
ICDM
Beyond Lexical Consistency: Preserving Semantic Consistency for Program Translation
Yali Du, Yi-Fan Ma,
Zheng Xie, and Ming Li
In The 23rd IEEE International Conference on Data Mining, 2023.
Program translation aims to convert input programs from one programming language to another. Automatic program translation is a prized target of software engineering research, as it leverages the reusability of projects and improves the efficiency of development. Recently, thanks to the rapid development of deep learning model architectures and the availability of large-scale parallel corpora of programs, the performance of program translation has greatly improved. However, existing program translation models are still far from satisfactory in terms of the quality of the translated programs. In this paper, we argue that a major limitation of current approaches is the lack of consideration of semantic consistency: beyond lexical consistency, semantic consistency is also critical for the task. To make program translation models more semantically aware, we propose a general framework named Preserving Semantic Consistency for Program Translation (PSCPT), which enforces semantic consistency via a regularization term in the training objective and can be easily applied to all encoder-decoder methods with various neural networks (e.g., LSTM, Transformer) as the backbone. We conduct extensive experiments on 7 general-purpose programming languages. Experimental results show that with CodeBERT as the backbone, our approach outperforms not only state-of-the-art open-source models but also commercial closed large language models (e.g., text-davinci-002, text-davinci-003) on the program translation task. Our replication package (including code, data, etc.) is publicly available at https://github.com/duyali2000/PSCPT .
AAAI
Cooperative and Adversarial Learning: Co-Enhancing Discriminability and Transferability in Domain Adaptation
Hui Sun,
Zheng Xie, Xin-Ye Li, and Ming Li
In The 37th AAAI Conference on Artificial Intelligence, 2023.
We propose the CALE framework to unify and enhance the two main objectives of domain adaptation: discriminability and transferability. To achieve this, CALE swaps the cooperative examples of the two objectives, enabling the learning of discriminability and transferability to mutually benefit each other. Additionally, adversarial examples are utilized to enhance the robustness of the two objectives themselves. The framework can be applied to improve current domain adaptation approaches and has been shown to outperform existing state-of-the-art methods.
AAAI
Semi-Supervised Learning with Support Isolation by Small-Paced Self-Training
Zheng Xie, Hui Sun, and Ming Li
In The 37th AAAI Conference on Artificial Intelligence, 2023.
In this paper, we address a special scenario of semi-supervised learning in which labels are missing because of a preceding filtering mechanism, i.e., an instance can enter a subsequent process in which its label is revealed if and only if it passes the filtering mechanism. The rejected instances are prohibited from entering the subsequent labeling process for economic or ethical reasons, making the supports of the labeled and unlabeled distributions isolated from each other. In this case, classical semi-supervised learning approaches are prone to fail. We propose a Small-Paced Self-Training framework, which iteratively discovers labeled and unlabeled instance subspaces with bounded Wasserstein distance. We theoretically prove that such a framework can achieve provably low error on the pseudo labels during learning, and we validate the approach through experiments.
2018
IJCAI
Cutting the Software Building Efforts in Continuous Integration by Semi-Supervised Online AUC Optimization
Zheng Xie, and Ming Li
In The 27th International Joint Conference on Artificial Intelligence, 2018.
In this paper, we propose a semi-supervised online AUC optimization algorithm, namely SOLA. The algorithm is suitable for tasks that involve streaming data, label scarcity, and class imbalance. We apply it to build outcome prediction in software continuous integration, where it achieves superior performance.
AAAI
Semi-Supervised AUC Optimization without Guessing Labels of Unlabeled Data
Zheng Xie, and Ming Li
In The 32nd AAAI Conference on Artificial Intelligence, 2018.
We prove theoretical properties of AUC optimization under semi-supervised and positive-unlabeled learning scenarios, and propose a simple yet effective algorithm for semi-supervised and positive-unlabeled AUC optimization. Our algorithm outperforms more elaborate approaches on both tasks.
2017
JOS
Cost-Sensitive Margin Distribution Optimization for Software Bug Localization
The software bug localization problem suffers from data imbalance and the heterogeneous structure of code and natural language. To tackle this problem, we propose a cost-sensitive margin distribution optimization method to enhance classification under imbalanced scenarios, and design a network architecture for processing programming and natural languages. Experimental results validate the effectiveness of our method.
CCML
Cost-Sensitive Margin Distribution Optimization for Software Bug Localization
The software bug localization problem suffers from data imbalance and the heterogeneous structure of code and natural language. To tackle this problem, we propose a cost-sensitive margin distribution optimization method to enhance classification under imbalanced scenarios, and design a network architecture for processing programming and natural languages. Experimental results validate the effectiveness of our method.
ICMC
Music Style Analysis among Haydn, Mozart and Beethoven: an Unsupervised Machine Learning Approach
Ru Wen,
Zheng Xie, Kai Chen, Ruoxuan Guo, Kuan Xu, Wenmin Huang, Jiyuan Tian, and Jiang Wu
In The 43rd International Computer Music Conference, 2017.
We propose an unsupervised music analysis method: a feature extraction technique that captures consecutive note pitch patterns, combined with clustering methods for mining music styles. We apply our method to a newly built corpus of Haydn, Mozart, and Beethoven. The discovered patterns fit the Implication-Realization theory, which confirms the validity of our approach.