Deep Label Distribution Learning with Label Ambiguity

Bin-Bin Gao¹, Chao Xing², Chen-Wei Xie¹, Jianxin Wu¹, Xin Geng²

¹ National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China
² School of Computer Science and Engineering, Southeast University, Nanjing 210096, China

IEEE Trans. Image Processing, 26(6), 2017:2825-2838. DOI: Link.

Abstract

Convolutional Neural Networks (ConvNets) have achieved excellent recognition performance in various visual recognition tasks. A large labeled training set is one of the most important factors for their success. However, it is difficult to collect sufficient training images with precise labels in some domains such as apparent age estimation, head pose estimation, multi-label classification and semantic segmentation. Fortunately, there is ambiguous information among labels, which makes these tasks different from traditional classification. Based on this observation, we convert the label of each image into a discrete label distribution, and learn the label distribution by minimizing a Kullback-Leibler divergence between the predicted and ground-truth label distributions using deep ConvNets. The proposed DLDL (Deep Label Distribution Learning) method effectively utilizes the label ambiguity in both feature learning and classifier learning, which prevents the network from over-fitting even when the training set is small. Experimental results show that the proposed approach produces significantly better results than state-of-the-art methods for age estimation and head pose estimation. At the same time, it also improves recognition performance for multi-label classification and semantic segmentation tasks.
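The two steps named in the abstract, converting a label into a discrete label distribution and minimizing a Kullback-Leibler divergence against the predicted distribution, can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the released DLDL code: the Gaussian shape of the target distribution, the value of `sigma`, the age range 0 to 100, and all function names are hypothetical choices made for this example.

```python
import math

def gaussian_label_distribution(true_label, labels, sigma=2.0):
    """Turn a single label into a discrete distribution over `labels`.

    Assumption: a discretized Gaussian centered on the true label.
    The abstract does not specify the shape or the value of sigma.
    """
    weights = [math.exp(-((l - true_label) ** 2) / (2.0 * sigma ** 2)) for l in labels]
    total = sum(weights)
    return [w / total for w in weights]

def softmax(logits):
    """Map raw network outputs to a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted); entries with target == 0 contribute nothing."""
    return sum(
        t * math.log(t / max(p, eps))
        for t, p in zip(target, predicted)
        if t > 0.0
    )

def expected_label(distribution, labels):
    """One possible point prediction: the expectation of the distribution."""
    return sum(p * l for p, l in zip(distribution, labels))

# Hypothetical setup: ages 0..100, ground-truth age 25.
labels = list(range(0, 101))
target = gaussian_label_distribution(25, labels, sigma=2.0)

# Stand-in for a network output: all-zero logits give a uniform prediction.
predicted = softmax([0.0] * len(labels))

# The training loss to minimize for this one example.
loss = kl_divergence(target, predicted)
```

In this sketch the loss is zero only when the predicted distribution matches the target, so neighboring labels (ages 24 and 26 here) also receive supervision, which is one way to read the abstract's claim that label ambiguity is used during learning.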

The training code of DLDL has been released here (7/8/2018).

The testing code and pre-trained models are available here (11/20/2017).

Discussion

Feature visualization.

We visualize low-dimensional embeddings of the features from all baseline methods using the t-SNE algorithm on the ChaLearn, Morph, Pointing'04 and AFLW validation images. Methods based on deep label distributions form clearer semantic clusters than the others.

Visualizations of hand-crafted and deep learned features using the t-SNE algorithm on ChaLearn, Morph, Pointing'04 and AFLW validation sets.

ChaLearn

Morph

Pointing'04

AFLW

Reduce over-fitting.

DLDL can effectively reduce over-fitting when the training set is small.

Comparisons of training and validation MAE of DLDL and all baseline methods on the ChaLearn and AFLW datasets
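For readers unfamiliar with the metric in this comparison, MAE (mean absolute error) and the gap between training and validation MAE can be computed as in the sketch below. The function names and the numbers are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def mean_absolute_error(predictions, targets):
    """Average absolute difference between predicted and true labels."""
    if len(predictions) != len(targets) or not predictions:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(predictions)

def generalization_gap(train_mae, val_mae):
    """Validation MAE minus training MAE; a large positive gap suggests over-fitting."""
    return val_mae - train_mae

# Hypothetical predicted and true ages, for illustration only.
train_mae = mean_absolute_error([24.0, 31.0, 40.0], [25.0, 30.0, 44.0])
val_mae = mean_absolute_error([20.0, 35.0], [25.0, 30.0])
gap = generalization_gap(train_mae, val_mae)
```

A method that over-fits less would show a smaller gap between the two curves as training proceeds.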

Robust performance.

DLDL copes better with small datasets and sparse labels than the classification-based (C-ConvNet) and regression-based (R-ConvNet) baselines.

Comparisons of training and validation MAE of DLDL and all baseline methods on the Morph and BJUT-3D datasets

Analyze the hyper-parameter.

Morph

Pointing'04

VOC2007

ChaLearn

Downloads

Paper
PDF, 5.6 MB

Citation

@article{gao2017deep,
  title={Deep Label Distribution Learning With Label Ambiguity},
  author={Gao, Bin-Bin and Xing, Chao and Xie, Chen-Wei and Wu, Jianxin and Geng, Xin},
  journal={{IEEE} Transactions on Image Processing},
  volume={26},
  number={6},
  pages={2825--2838},
  year={2017},
}

Contact

Please contact Prof. Jianxin Wu (email) and Bin-Bin Gao (email) for questions about the paper.
