Smart Learning from Crowds
Dr. Shipeng Yu
Siemens Healthcare USA
Abstract:
For many supervised learning tasks it may be infeasible (or very expensive) to obtain objective and reliable labels. Instead, we can collect subjective (possibly noisy) labels from multiple experts or annotators. In practice, the annotators may disagree substantially, so it is of great practical interest to solve conventional supervised learning problems in this setting. In this talk we describe a probabilistic approach to supervised learning when multiple annotators provide (possibly noisy) labels but no absolute gold standard is available. The proposed algorithm evaluates the different experts and also estimates the actual hidden labels. We also propose a ranking mechanism for evaluating the quality of the annotators, and develop a learning strategy that penalizes non-informative annotators during learning.
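To make the idea concrete, here is a minimal sketch of one standard way to estimate hidden binary labels and annotator reliability jointly with expectation-maximization. It is an illustration under stated assumptions, not necessarily the algorithm presented in the talk: it assumes binary labels, annotators who err independently given the true label, every annotator labeling every item, and no feature-based classifier. The function names `em_binary_crowd` and `annotator_scores`, and the score |sensitivity + specificity - 1| used to flag non-informative annotators, are hypothetical choices made for this example.

```python
def em_binary_crowd(labels, n_iter=50):
    """Estimate hidden binary labels from several noisy annotators.

    labels[i][j] is the 0/1 label that annotator j gave to item i.
    Returns (mu, alpha, beta): the posterior probability that each item's
    true label is 1, and each annotator's estimated sensitivity and
    specificity.
    """
    n_items, n_annot = len(labels), len(labels[0])
    # Initialise the soft labels with the per-item fraction of positive votes.
    mu = [sum(row) / n_annot for row in labels]
    for _ in range(n_iter):
        # M-step: re-estimate the class prior and each annotator's accuracy
        # from the current soft labels. Add-one smoothing (an assumption of
        # this sketch) keeps every estimate strictly between 0 and 1.
        pos = sum(mu)
        neg = n_items - pos
        prior = (pos + 1) / (n_items + 2)
        alpha = [(sum(m * row[j] for m, row in zip(mu, labels)) + 1) / (pos + 2)
                 for j in range(n_annot)]
        beta = [(sum((1 - m) * (1 - row[j]) for m, row in zip(mu, labels)) + 1) / (neg + 2)
                for j in range(n_annot)]
        # E-step: posterior of the hidden label given all annotators' votes.
        new_mu = []
        for row in labels:
            a, b = prior, 1 - prior
            for j, y in enumerate(row):
                a *= alpha[j] if y == 1 else 1 - alpha[j]
                b *= 1 - beta[j] if y == 1 else beta[j]
            new_mu.append(a / (a + b))
        mu = new_mu
    return mu, alpha, beta


def annotator_scores(alpha, beta):
    """Score each annotator by |sensitivity + specificity - 1|.

    An annotator who labels at random scores near 0 under this measure.
    """
    return [abs(a + b - 1) for a, b in zip(alpha, beta)]
```

Sorting annotators by this score gives a simple ranking, and a low score marks an annotator whose labels carry little information about the hidden label.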
Bio:
Shipeng Yu is currently a senior staff scientist at Siemens Healthcare USA, Inc. He received his B.Sc. and M.Sc. degrees in mathematics from Peking University in 2000 and 2003, respectively, and completed his Ph.D. in computer science at the University of Munich, Germany, in 2006. He has worked in many areas of statistical machine learning, including Gaussian processes, Dirichlet processes, multi-task learning, probabilistic dimensionality reduction, and semi-supervised learning. He is interested in machine learning applications in data mining, information and image retrieval, user modeling, healthcare analytics, and personalized medicine.