1 General Knowledge (15 pts)
1. (3 pts) Derive an expression for the expected loss involving bias, variance, and noise.
2. (3 pts) Explain how to use cross-validation to estimate each of the terms above.
3. (4 pts) Bias in a classifier means that, when classifying new data points drawn from the same distribution as the training data, one category is predicted more often than another.
(a) What aspects of the training data affect classifier bias?
(b) How does the hinge loss function in an SVM handle bias?
(c) Which parameters of an SVM affect bias on test data? How does increasing or decreasing these parameters affect bias?
4. (5 pts) Consider a naive-Bayes generative model for the problem of classifying samples $\{(x_1, y_1), \ldots, (x_n, y_n)\}$, $x_i \in \mathbb{R}^p$ and $y_i \in \{1, \ldots, K\}$, where the marginal distribution of each feature is modeled as a univariate Gaussian, i.e., $p(x_{ij} \mid y_i = k) \sim \mathcal{N}(\mu_{jk}, \sigma_{jk}^2)$, where $k$ represents the class label. Assuming all parameters have been estimated, clearly describe how such a naive-Bayes model will do classification on a test point $x_{\text{test}}$.
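As an illustration of the setup in question 4, the following is a minimal Python sketch of one common way a Gaussian naive-Bayes model can score a test point, assuming the per-class feature means and variances are already estimated. The function name `gnb_predict`, the class prior vector `prior` (not stated in the question), and the toy parameter values are assumptions made for this example only.

```python
import numpy as np

def gnb_predict(x_test, mu, sigma2, prior):
    """Score a test point under a Gaussian naive-Bayes model (illustrative sketch).

    mu, sigma2: arrays of shape (K, p), per-class mean and variance of each feature.
    prior: array of shape (K,), assumed class prior probabilities.
    Returns the predicted label in {1, ..., K} and the unnormalized log-posteriors.
    """
    # Features are treated as independent given the class, so the
    # per-feature Gaussian log-densities are summed over features.
    log_lik = -0.5 * np.sum(
        np.log(2.0 * np.pi * sigma2) + (x_test - mu) ** 2 / sigma2, axis=1
    )
    log_post = np.log(prior) + log_lik
    # Labels are 1-based to match y in {1, ..., K}.
    return int(np.argmax(log_post)) + 1, log_post

# Hypothetical parameters for K = 2 classes and p = 2 features.
mu = np.array([[0.0, 0.0], [3.0, 3.0]])
sigma2 = np.ones((2, 2))
prior = np.array([0.5, 0.5])

label, scores = gnb_predict(np.array([2.5, 2.8]), mu, sigma2, prior)
print(label)
```

Working in log space avoids numerical underflow when the number of features p is large.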
2 Experiments (15 pts)
Imagine we are using 10-fold cross-validation to tune a parameter $\theta$ of a machine learning algorithm, using the training folds for parameter estimation and the held-out fold to evaluate test performance for different values of $\theta$. This produces 10 models, $\{h_1, \ldots, h_{10}\}$;
each model $h_i$ has its own value $\theta_i$ for that parameter and a corresponding error $e_i$. Let $k = \arg\min_i e_i$ be the index of the model with the lowest error. What is the best procedure for going from these 10 individual models to a single model that we can apply to the test data?
a) Choose the model $h_k$?
b) Weight the predictions of each model by $w_i = \exp(-e_i)$?
c) Set $\theta = \theta_k$, then update by training on the held-out data?
Clearly explain your choice and reasoning.
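To make the setup concrete, the following Python sketch produces the 10 per-fold models, their parameter values, and their held-out errors, and then computes the index k. It only reproduces the situation described above and does not implement any of the three options. Ridge regression as the learner, the candidate grid `thetas`, the synthetic data, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def ridge_fit(X, y, theta):
    # Closed-form ridge regression; theta is the regularization strength.
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + theta * np.eye(p), X.T @ y)

def mse(X, y, w):
    return float(np.mean((X @ w - y) ** 2))

def per_fold_models(X, y, thetas, n_folds=10, seed=0):
    """For each fold i, return (theta_i, e_i, w_i): the candidate theta with
    the lowest error on held-out fold i, that error, and the fitted weights."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    results = []
    for i, held_out in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        best = None
        for theta in thetas:
            w = ridge_fit(X[train], y[train], theta)
            err = mse(X[held_out], y[held_out], w)
            if best is None or err < best[1]:
                best = (theta, err, w)
        results.append(best)
    return results

# Synthetic regression data (hypothetical).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.5, size=200)

thetas = [0.01, 0.1, 1.0, 10.0]
models = per_fold_models(X, y, thetas)
errors = [e for _, e, _ in models]
k = int(np.argmin(errors))  # 0-based index of the lowest-error model
print(k, models[k][0], errors[k])
```

Note that each error here is measured on a different held-out fold, so the values are not computed on the same data.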