# Shen-Huan Lyu @ LAMDA

Shen-Huan Lyu [CV]
PhD Candidate
LAMDA Group
Department of Computer Science and Technology
National Key Laboratory for Novel Software Technology
Nanjing University
Laboratory: 912, Building of Computer Science and Technology, Xianlin Campus of Nanjing University
Email: lvsh at lamda.nju.edu.cn

I am currently a first-year PhD student in the Department of Computer Science and Technology at Nanjing University and a member of the LAMDA Group (LAMDA Publications), led by Professor Zhi-Hua Zhou.

# Supervisor

Professor Zhi-Hua Zhou.

# Biography

I was admitted to the School of Management in September 2013 and received my B.Sc. degree in Statistics from the University of Science and Technology of China in June 2017. In the same year, I was admitted to Nanjing University to pursue a Ph.D. degree.

# Research Interest

My research focuses on statistical learning theory for deep forests (DFs) and deep neural networks (DNNs). Statistical learning theory is one of the core fields of machine learning; as Vapnik said, 'nothing is more practical than a good theory'. Its main goal is to provide a framework for studying the problem of inference, that is, of gaining knowledge, making predictions, making decisions, or constructing models from a set of data. Given the successful application of deep learning (including deep neural networks and deep forests), we are curious about how the architectures of different deep models improve generalization performance. Meanwhile, broader application scenarios also require us to explore the mathematical mechanisms behind this performance in order to provide reliable guarantees for these over-parameterized models.
1. Deep Neural Networks
• Deep neural networks (DNNs) are making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years.
• Margin-based Analysis for Deep Neural Networks
• We prove a distribution-based margin bound for deep neural networks and propose a margin distribution loss function (mdNet) to alleviate the risk of overfitting when training data are limited. Our bound and experiments show that we can restrict the complexity of deep neural networks by minimizing the ratio of the second-order to the first-order statistic of the margin distribution, which may inspire further algorithms that improve the generalization performance of DNNs by considering the margin distribution.
• Decoupling-based Analysis for Over-parameterized Deep Neural Networks
• Deep neural networks often come with a huge number of parameters, even more than the number of training examples, yet these over-parameterized models do not seem to suffer from overfitting. This is quite strange, and the question "why does over-parameterization not overfit?" is fundamental to understanding the success of deep neural networks. Inspired by the way distance metric learning transforms the feature space, we attempt to decouple deep neural networks into a linear dual problem with multi-level nonlinear-transformation constraints. We want to point out that when conventional learning theory concludes that over-parameterization leads to overfitting, the parameters concerned are those of the hypothesis space from which the classifiers are constructed. No such claim has been made about the parameters of the feature space transformation.
2. Deep Forests
• Realizing that the essence of deep learning lies in layer-by-layer processing, in-model feature transformation, and sufficient model complexity, Zhou & Feng recently proposed the deep forest model and the gcForest algorithm to achieve forest representation learning.
• Margin-based Analysis for Deep Forests
• We formulate the forest representation learning approach casForest as an additive model and show that the generalization error can be bounded by $\mathcal{O}(\ln m/m)$ when the margin ratio, that is, the margin standard deviation relative to the margin mean, is sufficiently small. This inspires us to optimize the ratio. To this end, we design a margin distribution reweighting approach for the deep forest model to attain a small margin ratio.
• Interaction-based Improvement for Deep Forests
• We propose a novel deep forest model that uses high-order interactions of input features to generate more diverse and effective representation features. We design a variant of Random Intersection Trees (vRIT) to discover stable high-order interactions and transform them into hierarchical distributed representations by Activated Linear Combination (ALC). These interaction-based representations remove the dependence on the forests in the front layers, which greatly improves computational efficiency.
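
The margin ratio that appears in the margin-based analyses above can be illustrated with a small sketch. It assumes the usual multi-class margin (the score of the true class minus the best score among the other classes) and takes the ratio to be the standard deviation of the margins divided by their mean. The function names `margins` and `margin_ratio` and the toy scores are hypothetical, chosen only for illustration, and the exact statistic used in the papers may differ.

```python
from statistics import mean, pstdev


def margins(scores, labels):
    """Per-example margin: true-class score minus the best other-class score."""
    result = []
    for row, y in zip(scores, labels):
        best_other = max(s for j, s in enumerate(row) if j != y)
        result.append(row[y] - best_other)
    return result


def margin_ratio(scores, labels):
    """Standard deviation of the margins over their mean (assumed form of the ratio)."""
    m = margins(scores, labels)
    mu = mean(m)
    if mu <= 0:
        # A non-positive mean margin makes the ratio meaningless as a complexity proxy.
        raise ValueError("margin mean must be positive")
    return pstdev(m) / mu


# Toy class scores for three examples and three classes (made-up numbers).
scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
]
labels = [0, 1, 2]

print(margins(scores, labels))
print(margin_ratio(scores, labels))
```

A smaller ratio means the margins are both large on average and tightly concentrated, which is the regime in which the bounds described above are favorable.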

# Publications

• Yi-He Chen*, Shen-Huan Lyu*, and Yuan Jiang. Improving Deep Forest by Exploiting High-order Interactions. Preprint.
• Shen-Huan Lyu, Lu Wang, and Zhi-Hua Zhou. Improving Generalization of Neural Networks by Leveraging Margin Distribution. Preprint.

• Shen-Huan Lyu, Liang Yang, and Zhi-Hua Zhou. A Refined Margin Distribution Analysis for Forest Representation Learning. In: Advances in Neural Information Processing Systems 32 (NeurIPS'19), Vancouver, Canada, 2019.

*: Equal Contribution

# Teaching Assistant

# Academic Service

Reviewer for Conferences: AAAI'19, ECAI'19, PRICAI'19, CCML'19, IJCAI'20, NeurIPS'20, IJCAI'21, ICML'21

Reviewer for Journals: Neurocomputing, Transactions on Knowledge Discovery from Data (TKDD)

Volunteer:
The 18th China Symposium on Machine Learning and Applications (MLA’20), Nanjing.
The 16th China Symposium on Machine Learning and Applications (MLA’18), Nanjing.

# Awards & Honors

• The Second Class Academic Scholarship in Nanjing University, Nanjing, 2020
• Artificial Intelligence Scholarship in Nanjing University, Nanjing, 2019
• Presidential Special Scholarship for First-Year Ph.D. Students in Nanjing University, Nanjing, 2017
• The University Silver Prize Scholarship for Excellent Student in University of Science and Technology of China, Hefei, 2016
• The University Silver Prize Scholarship for Excellent Student in University of Science and Technology of China, Hefei, 2014

Mail:
National Key Laboratory for Novel Software Technology, Nanjing University, Xianlin Campus Mailbox 603, 163 Xianlin Avenue, Qixia District, Nanjing 210023, China
(In Chinese:) 南京市栖霞区仙林大道163号，南京大学仙林校区603信箱，软件新技术国家重点实验室，210023。