Large-Scale Linear Classification: Status and Challenges
Chih-Jen Lin
Professor
National Taiwan University
Abstract: Many classification methods, such as kernel methods and decision trees, are nonlinear approaches. However, linear methods, which use a simple weight vector as the model, remain very useful for many applications. With careful feature engineering and data represented in a rich, high-dimensional feature space, their performance can be competitive with that of highly nonlinear classifiers. Successful application areas include document classification and computational advertising (CTR prediction). In the first part of this talk, we give an overview of linear classification by introducing commonly used formulations. We discuss optimization techniques developed in our linear-classification package LIBLINEAR for fast training. The flexibility over kernel methods in selecting and employing optimization methods can be clearly seen in our discussion. In the second part of the talk, we select a few examples to demonstrate how linear classification is applied in practice, ranging from small to big data. The third part of the talk discusses issues in applying linear classification to big-data analytics.
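As a concrete illustration (not part of the abstract), the following is a minimal Python sketch of the workflow the abstract describes: documents are mapped into a high-dimensional sparse feature space and a linear model, just a weight vector, is trained. It uses scikit-learn's LinearSVC, which is backed by LIBLINEAR; the tiny corpus and labels are made up purely for illustration.

# Minimal sketch: linear classification of documents with a LIBLINEAR-backed solver.
# The toy corpus below is hypothetical and only serves to show the API flow.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

docs = [
    "cheap loans apply now",        # spam-like
    "limited offer click here",     # spam-like
    "meeting agenda for tomorrow",  # ham-like
    "project report attached",      # ham-like
]
labels = [1, 1, 0, 0]

# Careful feature engineering: map text into a rich, sparse feature space.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(docs)

# L2-regularized linear SVM; the learned model is simply the weight vector clf.coef_.
clf = LinearSVC(C=1.0)
clf.fit(X, labels)

print(clf.predict(vectorizer.transform(["free loans click now"])))

Because the model is only a weight vector, prediction is a single sparse dot product, which is what makes this approach attractive for large-scale problems such as CTR prediction.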
Bio: Chih-Jen Lin is currently a distinguished professor in the Department of Computer Science, National Taiwan University. He obtained his B.S. degree from National Taiwan University in 1993 and his Ph.D. degree from the University of Michigan in 1998. His major research areas include machine learning, data mining, and numerical optimization. He is best known for his work on support vector machines (SVM) for data classification. His software LIBSVM is one of the most widely used and cited SVM packages. For his research work he has received many awards, including the ACM KDD 2010 and ACM RecSys 2013 best paper awards. He is an IEEE fellow, an AAAI fellow, and an ACM distinguished scientist for his contributions to machine learning algorithms and software design.