Yang Yu @ NJUCS

{| border="0" cellpadding="10" width="100%"
|-
| valign="top" align="right" width="200" | <img src="GetFile.aspx?File=MainPage/face.jpg" alt="Image" width="200">
| valign="top" align="left" | <img src="GetFile.aspx?File=MainPage/name_chs.png" alt="Chinese name" width="132" height="51"><span style="float: right">[^cv_ch|(CV in Chinese)]</span>{BR}  <span class="big">Yang Yu (Y. Yu)</span>{BR}  <span style="color:#888888">Can be pronounced as "young you"</span> <br/> Ph.D., Professor<br>LAMDA Group<br/> <a href="http://ai.nju.edu.cn/" target="_blank">School of Artificial Intelligence</a><br/>National Key Laboratory for Novel Software Technology<br/> Nanjing University<br/><br/>Office: 311, Computer Science Building, Xianlin Campus<br/>Email: [mailto:yuy@nju.edu.cn|yuy@nju.edu.cn], [mailto:eyounx@gmail.com|eyounx@gmail.com]  
| valign="top" align="right" | <a href="http://www.lamda.nju.edu.cn" title="http://www.lamda.nju.edu.cn" target="_blank"><img src="GetFile.aspx?File=MainPage/lamda-logo.png" alt="Image" width="204" height="90"></a> <a href="http://www.nju.edu.cn" title="http://www.nju.edu.cn" target="_blank"><img src="GetFile.aspx?File=MainPage/nju-logo.png" alt="Image" width="80" height="95"></a>
|}

<p>I received my Ph.D. degree in Computer Science from Nanjing University in 2011 (supervisor: Prof. [^http://cs.nju.edu.cn/zhouzh|Zhi-Hua Zhou]). I then joined the <a href="http://www.lamda.nju.edu.cn" target="_blank">LAMDA Group</a> (<a href="http://www.lamda.nju.edu.cn/Pub.ashx" target="_blank">LAMDA Publications</a>) in the Department of Computer Science and Technology of Nanjing University as an Assistant Researcher in 2011, and became an Associate Professor in 2014. I joined the School of Artificial Intelligence of Nanjing University as a Professor in 2019.</p>
<p>My research interest is in machine learning, a sub-field of artificial intelligence. Currently, I am working on various aspects of reinforcement learning, including optimization, representation, and transfer. For more information, please see my CV.
([cv|Detailed CV] | [^{UP}MainPage/CV-yuy.pdf| CV in PDF])

==Recent Update==
<style>
.hlcell {
	font-size: 18px;
	color: #000;
	font-family: 'Helvetica', 'Lucida Grande', 'Lucida Sans Unicode', 'Lucida Sans', 'DejaVu Sans', Verdana, sans-serif; 
	font-weight: lighter;
	height: 50px;
	width: 100px;
	text-align:center;
	vertical-align:middle;
	border-color: #880000;
	border-width: 2pt;
	border-radius: 3pt;
	border-style:solid;
	cursor: pointer;
}
</style>
<table width="100%" border="0" cellspacing="5pt" cellpadding="0">
  <tr>
    <td class="hlcell" valign="middle" onClick="window.open('https://arxiv.org/abs/1809.09095','_blank')">StarCraft II</td>
    <td valign="bottom">We published the first paper on reinforcement learning for the full-length game of StarCraft II.</td>
    <td width="20">&nbsp;</td>
    <td class="hlcell" valign="middle" onClick="window.open('https://github.com/eyounx/VirtualTaobao','_blank')">Virtual Taobao</td>
    <td valign="bottom">The Virtual Taobao environment has been released for research on recommender systems and reinforcement learning.</td>
  </tr>
  <tr>
    <td>&nbsp;</td>
    <td>&nbsp;</td>
    <td>&nbsp;</td>
    <td>&nbsp;</td>
    <td>&nbsp;</td>
  </tr><tr>
    <td class="hlcell" valign="middle" onClick="window.open('http://www.lamda.nju.edu.cn/yuy/GetFile.aspx?File=papers/neurips19abl.pdf','_blank')">Neuron &amp; Logic</td>
    <td valign="bottom">Our <a href="http://www.lamda.nju.edu.cn/yuy/GetFile.aspx?File=papers/neurips19abl.pdf" target="_blank">NeurIPS'19 paper</a> connects neural perception and logical reasoning through abductive learning. It is now <a href="https://github.com/AbductiveLearning/ABL-HED" target="_blank">open sourced</a>.</td>
    <td width="20">&nbsp;</td>
    <td class="hlcell" valign="middle" >Talk</td>
    <td valign="bottom">I gave an Early Career Spotlight talk, "Toward Sample Efficient Reinforcement Learning", at IJCAI 2018.</td>
  </tr>
  <tr>
    <td>&nbsp;</td>
    <td>&nbsp;</td>
    <td>&nbsp;</td>
    <td>&nbsp;</td>
    <td>&nbsp;</td>
  </tr>
  <tr>
    <td class="hlcell" valign="middle" onClick="window.open('https://github.com/eyounx/ZOOpt/','_blank')">ZOOpt</td>
    <td valign="bottom">A Python package for derivative-free optimization. Release 0.2.</td>
    <td width="20">&nbsp;</td>
    <td class="hlcell" valign="middle" onClick="window.open('https://awrl.cc/','_blank')">AWRL</td>
    <td valign="bottom">We will hold the <a href="https://awrl.cc/" target="_blank">4th Asian Workshop on Reinforcement Learning</a>.</td>
  </tr>
</table>

==Research==
<div class="imageright"><iframe src="//player.bilibili.com/player.html?aid=22575605&cid=37440336&page=1" scrolling="no" border="0" frameborder="no" framespacing="0" allowfullscreen="true" width="280" height="200"> </iframe><p class="imagedescription">A quickly learned policy beats the level-3 bot in StarCraft II</p></div> Currently, I am mainly focusing on reinforcement learning, which searches for a policy of near-optimal decisions by learning autonomously from interactions with the environment. Despite its promising future, reinforcement learning is still in its infancy, and its potential has not been fully realized in many situations. Our team is working to improve reinforcement learning in various aspects, including theoretical foundations, optimization, model structure, experience reuse, abstraction, and model building, heading toward sample-efficient methods for large-scale physical-world applications.

[papers|Full publication list >>>]

=== Code ===
* GitHub: [^https://github.com/eyounx?tab=repositories] <br/>
* LAMDA code: [^http://www.lamda.nju.edu.cn/Data.ashx]


=== Selected Work ===

* <b>Reinforcement learning</b> aims at learning models for optimal sequential decisions autonomously.
** <b>Environment virtualization for reinforcement learning</b> (with Alibaba and Didi Inc.)<br/> To apply reinforcement learning in real-world industrial applications, our studies find that it is feasible to build virtual environments with good generalizability solely from historical data. These environments enable zero-cost trial-and-error training for industrial applications.
** <b>Experience reuse in reinforcement learning</b> (with Qing Da, Chao Zhang, Zhi-Hua Zhou, etc.)<br/> Our studies design ways to accelerate reinforcement learning by reusing experience, particularly experience accumulated in simulators.
** <b>Reinforcement learning on StarCraft</b> (with Zhen-Jia Pang, Ruo-Zhe Liu, etc.)<br/> Our studies aim to learn good playing policies as efficiently as possible for this extremely large-scale, partially observable real-time strategy game.

<div class="imageright"><a href="https://www.springer.com/cn/book/9789811359552" target="_blank"><img src="http://www.lamda.nju.edu.cn/yuy/GetFile.aspx?File=MainPage%5ccover-elata.JPG"  width="180"><p class="imagedescription">Z.-H. Zhou, Y. Yu, C. Qian. Evolutionary<br/>Learning: Advances in Theories and <br/>Algorithms. Berlin: Springer, 2019.</p></a></div>
* <b>Derivative-free optimization</b> aims at tackling optimization problems with complex structures, such as non-convex, non-differentiable, and non-continuous problems with many local optima. We are working toward theoretically grounded, efficient derivative-free optimization methods for better solving machine learning problems.
** <b>[^research_sal|Model-based derivative-free optimization]</b> (with [^http://www.lamda.nju.edu.cn/qianh|Hong Qian] and [^http://www.lamda.nju.edu.cn/huyq|Yi-Qi Hu], etc.)<br/> For complex optimization problems in real domains, our studies address issues including theoretical foundations, high dimensionality, and noisy evaluation.
** <b>[^research_paratoopt|Approximation analysis &amp; Pareto optimization]</b> (with [^http://www.lamda.nju.edu.cn/qianc|Chao Qian], [^http://www.cs.bham.ac.uk/~xin|Xin Yao] and [^http://cs.nju.edu.cn/zhouzh|Zhi-Hua Zhou], etc.)<br/> Our studies analyze the quality of solutions found by evolutionary algorithms, and design Pareto optimization, which has been shown to be a powerful approximation tool for various subset selection problems.
** <b>[^research_analysistool|Running time analysis of evolutionary optimization]</b> (with [^http://www.lamda.nju.edu.cn/qianc|Chao Qian] and [^http://cs.nju.edu.cn/zhouzh|Zhi-Hua Zhou])<br/>We develop tools for analyzing the complexity of evolutionary algorithms, one of the most fundamental issues in the study of evolutionary algorithms.

<!-----
* <b>[^research_boostpolicy|Functional representation for nonlinear reinforcement learning]</b> (with [^http://www.lamda.nju.edu.cn/houpf|Peng-Fei Hou], [^http://www.lamda.nju.edu.cn/qiany|Yu Qian], [^http://www.lamda.nju.edu.cn/daq|Qing Da] and [^http://cs.nju.edu.cn/zhouzh|Zhi-Hua Zhou])<br/>Reinforcement learning seeks a policy that receives the highest total reward from its environment. Functional representation is a powerful tool for approximating complex functions, which can help learn complex policies that fit practical environments.

* <b>[^research_diversity|The role of diversity in ensemble learning]</b> (with [^http://www.lamda.nju.edu.cn/lin|Nan Li], [^http://www.lamda.nju.edu.cn/liyf|Yu-Feng Li] and [^http://cs.nju.edu.cn/zhouzh|Zhi-Hua Zhou])<br/>Ensemble learning is a machine learning paradigm that achieves state-of-the-art performance. Diversity was believed to be key to the good performance of an ensemble approach, but it previously served only as a heuristic idea. We show that diversity can play the role of regularization. 
----->
([^http://scholar.google.com.hk/citations?user=PG2lDSwAAAAJ|My Google Scholar Citations])


==Teaching==
* Tutorial of Artificial Intelligence (for undergraduate students of AI School. Fall, 2018)
* Advanced Machine Learning. (for graduate students. Fall, 2018)
* Advanced Machine Learning. (for graduate students. Fall, 2017)
* Artificial Intelligence. (for undergraduate students. Spring, 2015, 2016, 2017, 2018) 
* Data Mining. (for M.Sc. students. Fall, 2014, 2013, 2012)
* Digital Image Processing. (for undergraduate students from Dept. Math., Spring, 2014, 2013)
* Introduction to Data Mining. (for undergraduate students. Spring, 2013, 2012)

==Students==

* <b><a href="http://www.lamda.nju.edu.cn/yuy/recruit.ashx" target="_blank"><b>Recuit</b> <span style="position: absolute;margin-top:0px;" width:auto;height:auto;><img src="http://www.lamda.nju.edu.cn/yuy/GetFile.aspx?File=MainPage/recruit.png" height="20pt"></span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;>>></a></b>

* <b>[students|Students >>>]</b>

== ==
<b>Mail:</b><br/> National Key Laboratory for Novel Software Technology, Nanjing University, Xianlin Campus Mailbox 603, 163 Xianlin Avenue, Qixia District, Nanjing 210023, China<br/>(In Chinese:) 南京市栖霞区仙林大道163号，南京大学仙林校区603信箱，软件新技术国家重点实验室，210023。