
Seminar abstract

Thursday, April 23, 2009, 15:00-16:00, Meeting Room 404, Mong Man Wai Building (蒙民伟楼)

Imbalanced Data: When It Becomes a "Pain" and How to "Ease" It

Charles Ling
Professor
Department of Computer Science, University of Western Ontario, Canada

Abstract:

In this talk I will first discuss and survey different situations where data imbalance becomes a serious problem ("pain"). There are mainly two types of situations. In the first, the misclassification costs are explicitly or implicitly assumed to differ: misclassifying a rare-class example often costs much more than misclassifying a majority-class example. In this case, cost-sensitive learning, when used properly, can handle the problem. In the second, the misclassification costs are assumed equal, and the goal is a classifier that is more accurate than the default classifier, which predicts the majority class for all examples. This can be difficult or impossible for highly imbalanced data. Various methods have been proposed, but what are the capabilities and limitations of various learning algorithms? We use PAC-learning to study these issues. We derive several bounds on the sample size needed to guarantee a given overall error rate and a given error rate on the rare class.
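
The two situations described above can be illustrated with a minimal sketch. It is not taken from the talk: the function names, class counts, and cost values are hypothetical, and the threshold rule is the standard expected-cost argument under the assumption that correct predictions cost nothing.

```python
def default_classifier_accuracy(n_majority, n_rare):
    """Accuracy of the default classifier that always predicts the majority class."""
    return n_majority / (n_majority + n_rare)


def cost_sensitive_threshold(cost_fp, cost_fn):
    """Probability threshold for predicting the rare class.

    cost_fp: cost of labeling a majority example as rare (false positive).
    cost_fn: cost of labeling a rare example as majority (false negative).
    Predicting "rare" has expected cost (1 - p) * cost_fp, predicting
    "majority" has expected cost p * cost_fn, so "rare" is the cheaper
    choice when p >= cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)


def predict_rare(p_rare, cost_fp, cost_fn):
    """Return True if predicting the rare class minimizes expected cost."""
    return p_rare >= cost_sensitive_threshold(cost_fp, cost_fn)


# Hypothetical data set: 990 majority examples and 10 rare examples.
baseline = default_classifier_accuracy(990, 10)

# Equal costs: an example with a 5% chance of being rare is labeled majority.
equal_cost_decision = predict_rare(0.05, cost_fp=1, cost_fn=1)

# Unequal costs (missing a rare example is 99 times worse): same example is labeled rare.
unequal_cost_decision = predict_rare(0.05, cost_fp=1, cost_fn=99)
```

With equal costs, the default classifier is already right on 99% of this hypothetical data, which is why beating it is hard. With unequal costs, the decision threshold drops well below 0.5, so rare-class predictions become worthwhile even at low estimated probability.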

Bio:

Charles X. Ling earned a dual BSc from Shanghai Jiao Tong University in China, and both his MSc and PhD in Computer and Information Science from the University of Pennsylvania (Ivy League) within four years. Since then he has been a faculty member in Computer Science at the University of Western Ontario, Canada, where he is currently a Professor. His main research areas include machine learning and data mining, cognitive modeling, and child education. He has published over 100 research papers in peer-reviewed journals and conferences. He is an Associate Editor for IEEE TKDE and Computational Intelligence Journal, and a Senior Member of IEEE. He is the Director of the Data Mining and E-Business Lab, leading data mining development in CRM, Bioinformatics, and the Internet.