Exercise 1 Conjectures (5 credits each)
Here is a collection of conjectures. Which are true, and which are false?
- If it is true, provide a formal proof demonstrating so.
- If it is false, give a counterexample, clearly stating why your counterexample satisfies the premise but not the conclusion.
(No marks for just stating True/False.)
Hint: There are quite a few questions here, but each is relatively simple (the counterexamples aren’t very complicated, and the proofs are short). Try playing around with a few examples first to get an intuitive feeling for whether the statement is true before trying to prove it.
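Following this hint, one cheap way to build intuition is to evaluate a candidate inequality on random vectors before attempting a proof. The sketch below (an illustration, not part of the assignment) uses numpy with the standard dot product on R³ standing in for a general inner product:

```python
import numpy as np

rng = np.random.default_rng(0)

# Probe a candidate inequality on random vectors, using the standard
# dot product on R^3 as a concrete inner product.
for _ in range(5):
    a, b, c = rng.standard_normal((3, 3))
    lhs = a @ c            # <a, c>
    rhs = a @ b + b @ c    # <a, b> + <b, c>
    print(f"<a,c> = {lhs:+.3f}   <a,b> + <b,c> = {rhs:+.3f}   holds: {lhs <= rhs}")
```

If a few random trials already violate the inequality, you have a counterexample candidate; if it always holds, that is evidence (not proof) worth trying to prove.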
Let V be a vector space, and let ⟨·, ·⟩ : V × V → R be an inner product over V .
- Triangle inequality for inner products: For all a, b, c ∈ V , ⟨a, c⟩ ≤ ⟨a, b⟩ + ⟨b, c⟩.
- Transitivity of orthogonality: For all a, b, c ∈ V , if ⟨a, b⟩ = 0 and ⟨b, c⟩ = 0 then ⟨a, c⟩ = 0.
- Orthogonality closed under addition: Suppose S = {v1, . . . , vn} ⊆ V is a set of vectors, and x is orthogonal to all of them (that is, for all i = 1, 2, . . . , n, ⟨x, vi⟩ = 0). Then x is orthogonal to any y ∈ Span(S).
- Let S = {v1, v2, . . . , vn} ⊆ V be an orthonormal set of vectors in V . Then for all non-zero x ∈ V , if for all 1 ≤ i ≤ n we have ⟨x, vi⟩ = 0 then x ∉ Span(S).
- Let S = {v1, v2, . . . , vn} ⊆ V be a set of vectors in V (no assumption of orthonormality). Then for all non-zero x ∈ V , if for all 1 ≤ i ≤ n we have ⟨x, vi⟩ = 0 then x ∉ Span(S).
- Let S = {v1, . . . , vn} be a set of orthonormal vectors such that Span(S) = V , and let x ∈ V . Then there is a unique set of coefficients c1, . . . , cn such that x = c1v1 + . . . + cnvn.
- Let S = {v1, . . . , vn} be a set of vectors (no assumption of orthonormality) such that Span(S) = V , and let x ∈ V . Then there is a unique set of coefficients c1, . . . , cn such that x = c1v1 + . . . + cnvn.
- Let S = {v1, v2, . . . , vn} ⊆ V be a set of vectors. If all the vectors are pairwise linearly independent (i.e., for any 1 ≤ i ≠ j ≤ n, the only solution to civi + cjvj = 0 is the trivial solution ci = cj = 0), then the set S is linearly independent.

Exercise 2 Inner Products induce Norms (20 credits)
Let V be a vector space, and let ⟨·, ·⟩ : V × V → R be an inner product on V . Define ‖x‖ := √⟨x, x⟩.
Prove that ‖ · ‖ is a norm.
(Hint: To prove the triangle inequality holds, you may need the Cauchy–Schwarz inequality, ⟨x, y⟩ ≤ ‖x‖ ‖y‖.)
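Before writing the proof, both the Cauchy–Schwarz inequality and the triangle inequality for the induced norm can be sanity-checked numerically. The sketch below uses the standard dot product on R⁴ as the inner product (an assumption; the exercise concerns an arbitrary inner product):

```python
import numpy as np

rng = np.random.default_rng(1)

def norm(x):
    # ||x|| := sqrt(<x, x>), here with the standard dot product.
    return np.sqrt(x @ x)

for _ in range(1000):
    x, y = rng.standard_normal((2, 4))
    # Cauchy-Schwarz: <x, y> <= ||x|| ||y||  (tiny slack for rounding)
    assert x @ y <= norm(x) * norm(y) + 1e-12
    # Triangle inequality: ||x + y|| <= ||x|| + ||y||
    assert norm(x + y) <= norm(x) + norm(y) + 1e-12
```

A thousand random trials passing is of course not a proof, but a failure here would immediately flag an error in the claimed inequality.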
Exercise 3 General Linear Regression with Regularisation (10+10+10+5+5 credits)
Let A ∈ R^{N×N}, B ∈ R^{D×D} be symmetric, positive definite matrices. From the lectures, we can use symmetric positive definite matrices to define a corresponding inner product, as shown below. We can also define a norm using the inner products.
⟨x, y⟩_A := x^T A y
‖x‖_A^2 := ⟨x, x⟩_A
⟨x, y⟩_B := x^T B y
‖x‖_B^2 := ⟨x, x⟩_B
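These definitions translate directly into code. The sketch below is illustrative (the construction M^T M + εI used to obtain a random symmetric positive definite A is an assumption, not part of the exercise) and checks two inner-product axioms numerically:

```python
import numpy as np

rng = np.random.default_rng(2)

D = 4
M = rng.standard_normal((D, D))
A = M.T @ M + 1e-3 * np.eye(D)   # symmetric positive definite by construction

def inner_A(x, y):
    # <x, y>_A := x^T A y
    return x @ A @ y

def sq_norm_A(x):
    # ||x||_A^2 := <x, x>_A
    return inner_A(x, x)

x, y = rng.standard_normal((2, D))
assert np.isclose(inner_A(x, y), inner_A(y, x))   # symmetry, since A = A^T
assert sq_norm_A(x) > 0                           # positive definiteness
```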
Suppose we are performing linear regression, with a training set {(x1, y1), . . . , (xN , yN )}, where for each i, xi ∈ R^D and yi ∈ R. We can define the matrix X = [x1, . . . , xN ]^T ∈ R^{N×D} and the vector y = [y1, . . . , yN ]^T ∈ R^N .
We would like to find θ ∈ R^D, c ∈ R^N such that y ≈ Xθ + c, where the error is measured using ‖ · ‖_A.
We avoid overfitting by adding a weighted regularization term, measured using ‖ · ‖_B. We define the loss function with regularizer:
L_{A,B,y,X}(θ, c) = ‖y − Xθ − c‖_A^2 + ‖θ‖_B^2 + ‖c‖_A^2
For the sake of brevity we write L(θ, c) for L_{A,B,y,X}(θ, c).
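For concreteness, the loss can be written as a small function. This is a direct transcription of the definition above (the function name and argument order are my own choices):

```python
import numpy as np

def loss(theta, c, A, B, y, X):
    # L(theta, c) = ||y - X theta - c||_A^2 + ||theta||_B^2 + ||c||_A^2
    r = y - X @ theta - c          # residual in R^N
    return r @ A @ r + theta @ B @ theta + c @ A @ c
```

Note that the first and third terms use the A-norm (on R^N) while the middle term uses the B-norm (on R^D), matching the dimensions of the residual, c, and θ respectively.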
HINTS:
- You may use (without proof) the property that a symmetric positive definite matrix is invertible.
- We assume that there are sufficiently many non-redundant data points for X to be full rank. In particular, you may assume that the null space of X is trivial (that is, the only solution to Xz = 0 is the trivial solution, z = 0).
- You may use identities of gradients from the lecture slides, so long as you state that you are doing so.
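One practical supplement to these hints (an addition, not part of the assignment): any gradient you derive can be validated against central finite differences. The helper below is a generic sketch; the loss is restated inline so the snippet is self-contained:

```python
import numpy as np

def loss(theta, c, A, B, y, X):
    # L(theta, c) = ||y - X theta - c||_A^2 + ||theta||_B^2 + ||c||_A^2
    r = y - X @ theta - c
    return r @ A @ r + theta @ B @ theta + c @ A @ c

def num_grad_theta(theta, c, A, B, y, X, eps=1e-6):
    # Central finite-difference approximation of the gradient w.r.t. theta.
    g = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        g[i] = (loss(theta + e, c, A, B, y, X)
                - loss(theta - e, c, A, B, y, X)) / (2 * eps)
    return g
```

Evaluate your symbolic ∇θL and num_grad_theta on the same random inputs; entry-wise agreement to roughly 1e-5 is strong evidence the derivation is correct. The same idea works for ∇cL.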
- Find the gradient ∇θL(θ, c).
- Let ∇θL(θ, c) = 0, and solve for θ. If you need to invert a matrix to solve for θ, you should prove the inverse exists.
- Find the gradient ∇cL(θ, c).
- Let ∇cL(θ, c) = 0, and solve for c. If you need to invert a matrix to solve for c, you should prove the inverse exists.
- Show that if we set A = I, c = 0, B = λI, where λ ∈ R, your answer for 3.2 agrees with the analytic solution for the standard least squares regression problem with L2 regularization, given by
θ = (X^T X + λI)^{−1} X^T y.
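This closed form is easy to confirm numerically (a sanity check with arbitrary shapes and λ, chosen here for illustration): the stated θ should zero the gradient of ‖y − Xθ‖² + λ‖θ‖².

```python
import numpy as np

rng = np.random.default_rng(3)

N, D, lam = 20, 5, 0.5
X = rng.standard_normal((N, D))
y = rng.standard_normal(N)

# Closed-form ridge solution: theta = (X^T X + lambda I)^{-1} X^T y.
# Solve the linear system rather than forming the inverse explicitly.
theta = np.linalg.solve(X.T @ X + lam * np.eye(D), X.T @ y)

# Gradient of ||y - X theta||^2 + lambda ||theta||^2 at the solution:
grad = -2 * X.T @ (y - X @ theta) + 2 * lam * theta
print(np.max(np.abs(grad)))  # ~0 up to floating-point error
```

If your Exercise 3 answer is correct, substituting A = I, c = 0, B = λI into it should produce exactly this θ.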