Assignment 2

Modified: 2014/09/28 12:27 by admin - Uncategorized
We have created a Google Group for discussion. If you have any questions, please post them there so that everybody can see our replies; please do not send your questions to us directly by email.

The Task: Implement Stochastic Gradient Descent for Logistic Regression

Description:
  • Implement Stochastic Gradient Descent for Logistic Regression (referred to as SGDLR).
  • The log-likelihood of Logistic Regression in the binary classification case can be written as:
[Equation image not recovered: log-likelihood]

  • The gradient of the log-likelihood is
[Equation image not recovered: gradient of the log-likelihood]

where
[Equation image not recovered: definition of the terms used in the gradient]

  • A detailed introduction to Stochastic Gradient Descent for Logistic Regression can be found here
  • The task is a multiclass classification problem, so you have to extend the binary classifier to the multiclass case. There are two common strategies for this: "one vs. rest" and "one vs. one". Detailed information can be obtained here. Logistic Regression can also be applied to the multiclass case directly; if you are interested, give it a try!
  • Conduct 10-fold cross-validation with SGDLR on the benchmark data used in Assignment 1, and report the mean and standard deviation of the accuracy.
  • Write a brief report presenting your results and comparing them with your naive Bayes solution. Which one is better?
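
As a rough illustration of the first bullets, here is a minimal sketch of SGDLR with a "one vs. rest" extension. It assumes labels y in {0, 1} for each binary subproblem, for which the standard log-likelihood is l(w) = sum_i [ y_i log p_i + (1 - y_i) log(1 - p_i) ] with p_i = 1 / (1 + exp(-(w^T x_i + b))), and the per-example gradient with respect to w is (y_i - p_i) x_i. This notation may differ from the equation images on the original page. The function names, the bias term b, the learning rate, the epoch count, and the optional L2 penalty are all illustrative choices, not part of the assignment.

```python
import numpy as np

def sigmoid(z):
    # Logistic function written via tanh to avoid overflow for large |z|.
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def train_sgdlr(X, y, lr=0.1, epochs=20, l2=0.0, seed=0):
    """Binary logistic regression trained by SGD (hypothetical helper).

    X: (n, d) float array, y: (n,) array with values in {0, 1}.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        # Visit the examples in a fresh random order every epoch.
        for i in rng.permutation(n):
            p = sigmoid(X[i] @ w + b)
            err = y[i] - p                   # gradient of the log-likelihood w.r.t. the score
            w += lr * (err * X[i] - l2 * w)  # ascent step, with optional L2 shrinkage
            b += lr * err
    return w, b

def train_one_vs_rest(X, y, **kwargs):
    """Train one binary SGDLR model per class ("one vs. rest")."""
    classes = np.unique(y)
    models = [train_sgdlr(X, (y == c).astype(float), **kwargs) for c in classes]
    W = np.stack([w for w, _ in models])     # shape (k, d)
    b = np.array([bias for _, bias in models])
    return classes, W, b

def predict_one_vs_rest(model, X):
    """Predict the class whose binary model gives the highest score."""
    classes, W, b = model
    scores = X @ W.T + b
    return classes[np.argmax(scores, axis=1)]
```

A call such as `model = train_one_vs_rest(X_train, y_train, lr=0.1, epochs=20)` followed by `predict_one_vs_rest(model, X_test)` would give multiclass predictions. Feature scaling and the step-size schedule usually matter for SGD and are left out of this sketch.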



Benchmark Dataset

The same as Assignment 1.
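
One possible way to organize the 10-fold cross-validation over this data is sketched below. It assumes the features and labels are already loaded as NumPy arrays; `cross_validate` and the `fit_predict` callable are hypothetical names introduced for illustration, and the shuffling seed is arbitrary.

```python
import numpy as np

def cross_validate(X, y, fit_predict, k=10, seed=0):
    """Return (mean, std) of accuracy over k folds.

    fit_predict(X_train, y_train, X_test) is a hypothetical callable that
    trains a classifier and returns predicted labels for X_test.
    """
    rng = np.random.default_rng(seed)
    # Shuffle the indices once, then split them into k nearly equal folds.
    folds = np.array_split(rng.permutation(len(y)), k)
    accs = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        pred = fit_predict(X[train_idx], y[train_idx], X[test_idx])
        accs.append(np.mean(pred == y[test_idx]))
    # np.std uses the population formula (ddof=0) by default.
    return float(np.mean(accs)), float(np.std(accs))
```

Passing the same fold split to both the SGDLR and the naive Bayes classifiers would make the comparison in the report more direct, although the assignment text does not require it.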

Programming Language

  • The same language you chose for the first assignment.

Submission

  • Please use this MSWord template to report your results.
  • Do NOT plagiarize; plagiarism will be seriously penalized. Be careful when writing your report: whenever you use the words or work of others, cite them clearly so that one can tell which part is actually yours. Details on how plagiarism is identified can be found in "Introduction to the Guidelines for Handling Plagiarism Complaints".
  • Do NOT falsify results; data fraud will be penalized even more seriously. Record your results honestly in the report, and NEVER modify the performance results manually.
  • Pack your report and code into a zip file named with your student ID, e.g., 'MG1433001.zip'. If you submit more than once, append '_' and a number, such as 'MG1433001_1.zip'. We will use the version with the largest number as your final submission.

    • The file format must be zip; no other format is acceptable!
    • NO submission after the deadline is acceptable!
    • NO email submission will be accepted!
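
The naming rule above can be sketched as a small helper that picks the next archive name and packs the files. This is only an illustration: `submission_name` and `pack_submission` are hypothetical names, and the sketch assumes earlier submissions are tracked as zip files in a local directory, which the assignment does not specify.

```python
import re
import zipfile
from pathlib import Path

def submission_name(student_id, existing):
    """Next archive name: 'ID.zip' first, then 'ID_1.zip', 'ID_2.zip', ..."""
    pattern = re.compile(re.escape(student_id) + r"(?:_(\d+))?\.zip")
    # An unnumbered 'ID.zip' counts as version 0.
    versions = [int(m.group(1) or 0)
                for name in existing if (m := pattern.fullmatch(name))]
    if not versions:
        return f"{student_id}.zip"
    return f"{student_id}_{max(versions) + 1}.zip"

def pack_submission(student_id, files, out_dir="."):
    """Zip the given report and code files under the next free name."""
    out_dir = Path(out_dir)
    name = submission_name(student_id, [p.name for p in out_dir.glob("*.zip")])
    with zipfile.ZipFile(out_dir / name, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            # Store each file flat, under its base name only.
            zf.write(f, arcname=Path(f).name)
    return out_dir / name
```

For example, `submission_name("MG1433001", ["MG1433001.zip"])` gives `'MG1433001_1.zip'`, matching the example in the instructions.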

Upload your file via FTP (please use an FTP client; do not use Windows Explorer or IE):
ftp://lamda.nju.edu.cn/mg_dm14/assignment2/
username/password: will be announced in the first class


Evaluation

We will evaluate your submission according to your implementation and report.

For implementation:
  • Efficiency
  • Performance
  • Code style

For report:
  • Technique: clearly explain all the components you used in your implementation
  • Language: concise, precise, and logical.

If plagiarism is identified, no score will be given for the report.

Contact TA

Mr. Qing Da and Mr. Yue Zhu
