
# Homework 2, CSCI 5525 2021

## 1 Support vector machine implementation using Convex Optimization

1. (a) (15 pts) Using sklearn, implement a linear SVM with slack variables in dual
form by selecting the appropriate method from the descriptions: https://scikit-learn.org/stable/modules/svm.html#svm-mathematical-formulation. Identify the prediction
function that can "predict" an input point, that is, compute the functional margin
of the point. Implement a boundary plotting function (see notes on Canvas).
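A minimal sketch of part (a), assuming sklearn's `SVC` with a linear kernel as the dual-form soft-margin solver (sklearn solves the dual internally; `C` is the slack penalty). The toy data, `functional_margin`, and `boundary_grid` names here are illustrative, not part of the handout; `decision_function` returns w·x + b, the functional margin.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Toy two-class data standing in for MNIST-13 (labels 1 and 3).
X = np.vstack([rng.normal(-1, 0.5, (20, 2)), rng.normal(1, 0.5, (20, 2))])
y = np.array([1] * 20 + [3] * 20)

clf = SVC(kernel="linear", C=1.0)   # dual soft-margin linear SVM
clf.fit(X, y)

def functional_margin(model, points):
    # Signed value of w.x + b for each point; sign gives the prediction.
    return model.decision_function(points)

def boundary_grid(model, X, steps=100):
    # Evaluate the margin on a grid; pass the result to plt.contour with
    # levels=[-1, 0, 1] to draw the margins and the decision boundary.
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.linspace(x_min, x_max, steps),
                         np.linspace(y_min, y_max, steps))
    margins = model.decision_function(np.c_[xx.ravel(), yy.ravel()])
    return xx, yy, margins.reshape(xx.shape)

print(functional_margin(clf, X[:2]))
```

The grid returned by `boundary_grid` is exactly what a matplotlib `contour` call needs for the boundary plot; only the margin evaluation is shown here since the plotting notes live on Canvas.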

(b) (15 pts) Run your SVM implementation on the MNIST-13 dataset. The dataset
contains two classes labeled as 1 and 3 in the first column of the csv file we
provided. All other columns are data values. Compute test performance using
10-fold cross-validation on random 80-20 splits. Report your results and explain
what you find for the following manipulations.

i. Try C = 0.01, 0.1, 1, 10, 100 and show your results, both graphically and by
reporting the number of mistakes on the training and validation data sets.

ii. What is the impact on test performance as C increases?

iii. What happens to the geometric margin 1/||w|| as C increases?

iv. What happens to the number of support vectors as C increases?
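The sweep in (b) can be sketched as below, assuming sklearn's `ShuffleSplit` matches the handout's "10-fold cross-validation on random 80-20 splits" (10 independent random splits with a 20% validation set). The synthetic data is a placeholder for the MNIST-13 csv, and the bookkeeping names (`results`, `n_sv`) are illustrative.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import ShuffleSplit

rng = np.random.default_rng(0)
# Placeholder data; in the assignment, load the MNIST-13 csv instead
# (first column = label in {1, 3}, remaining columns = features).
X = np.vstack([rng.normal(-1, 0.6, (50, 2)), rng.normal(1, 0.6, (50, 2))])
y = np.array([1] * 50 + [3] * 50)

splitter = ShuffleSplit(n_splits=10, test_size=0.2, random_state=0)
results = {}
for C in [0.01, 0.1, 1, 10, 100]:
    train_err, val_err, n_sv = [], [], []
    for tr, va in splitter.split(X):
        clf = SVC(kernel="linear", C=C).fit(X[tr], y[tr])
        train_err.append(np.sum(clf.predict(X[tr]) != y[tr]))  # training mistakes
        val_err.append(np.sum(clf.predict(X[va]) != y[va]))    # validation mistakes
        n_sv.append(len(clf.support_))                         # support vector count
    results[C] = (np.mean(train_err), np.mean(val_err), np.mean(n_sv))
    print(f"C={C}: train/val mistakes and #SV = {results[C]}")
```

Plotting the three averaged quantities against C (log scale) gives the graphical report asked for in (b)i, and the `n_sv` column directly answers (b)iv.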

(c) (10 pts) Answer the following questions about the dual SVM approach:

i. The value of C will typically change the resulting classifier and therefore
also affect the accuracy on test examples.

ii. The quadratic programming problem involves w, b and the slack variables ξi.
We can rewrite the optimization problem in terms of w, b alone. This is done
by explicitly solving for the optimal values of the slack variables
ξi = ξi(w, b) as functions of w, b. Then the resulting minimization problem
over w, b can be formally written as

min over w, b of (1/2)||w||^2 + C Σi ξi(w, b),

where the first (regularization) term biases our solution towards zero in the
absence of any data and the remaining terms give rise to the loss functions,
one loss function per training point, encouraging correct classification. The
values of these slack variables, as functions of w, b, are "loss functions".

a) Derive the functions ξi(w, b) that determine the optimal ξi. The equations
should take on a familiar form.

b) Are all the margin constraints satisfied with these expressions for the
slack variables?

## 1.1 Procedure

Use the code templates provided on Moodle, and any additional instructions
posted by the TA.