This is an **introduction to machine learning coding assignment** from Australia.

**Exercise 1 Conjectures**

5 credits each

Here is a collection of conjectures. Which are true, and which are false?

(No marks for just stating True/False.)

**Hint:** There are quite a few questions here, but each is relatively simple (the counterexamples aren't very complicated, and the proofs are short). Try playing around with a few examples first to get an intuitive feeling for whether the statement is true before trying to prove it.
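
Following the hint above, one quick way to play with examples is a random numerical search (a sketch using numpy with the standard dot product as the inner product; the statement probed here is the first conjecture below, and this is an intuition aid, not part of a required answer):

```python
import numpy as np

rng = np.random.default_rng(0)

# Probe a candidate statement on many random vectors in R^3,
# using the standard dot product as the inner product.
# Statement probed: "for all a, b, c: <a, c> <= <a, b> + <b, c>".
found_counterexample = False
for _ in range(1000):
    a, b, c = rng.standard_normal((3, 3))
    if np.dot(a, c) > np.dot(a, b) + np.dot(b, c) + 1e-12:
        found_counterexample = True
        break

# Random search finds a violating triple almost immediately;
# a single counterexample is enough to disprove a "for all" claim.
print(found_counterexample)
```

If the search keeps failing to find a counterexample, that is weak evidence the statement is true, and a good cue to start looking for a proof.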

Let $V$ be a vector space, and let $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{R}$ be an inner product over $V$.

- **Triangle inequality for inner products:** For all $\mathbf{a}, \mathbf{b}, \mathbf{c} \in V$, $\langle \mathbf{a}, \mathbf{c} \rangle \le \langle \mathbf{a}, \mathbf{b} \rangle + \langle \mathbf{b}, \mathbf{c} \rangle$.
- **Transitivity of orthogonality:** For all $\mathbf{a}, \mathbf{b}, \mathbf{c} \in V$, if $\langle \mathbf{a}, \mathbf{b} \rangle = 0$ and $\langle \mathbf{b}, \mathbf{c} \rangle = 0$, then $\langle \mathbf{a}, \mathbf{c} \rangle = 0$.
- **Orthogonality closed under addition:** Suppose $S = \{\mathbf{v}_1, \dots, \mathbf{v}_n\} \subseteq V$ is a set of vectors, and $\mathbf{x}$ is orthogonal to all of them (that is, for all $i = 1, 2, \dots, n$, $\langle \mathbf{x}, \mathbf{v}_i \rangle = 0$). Then $\mathbf{x}$ is orthogonal to any $\mathbf{y} \in \mathrm{Span}(S)$.

- Let $S = \{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n\} \subseteq V$ be an **orthonormal** set of vectors in $V$. Then for all **non-zero** $\mathbf{x} \in V$, if for all $1 \le i \le n$ we have $\langle \mathbf{x}, \mathbf{v}_i \rangle = 0$, then $\mathbf{x} \notin \mathrm{Span}(S)$.

- Let $S = \{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n\} \subseteq V$ be a set of vectors in $V$ (no assumption of orthonormality). Then for all **non-zero** $\mathbf{x} \in V$, if for all $1 \le i \le n$ we have $\langle \mathbf{x}, \mathbf{v}_i \rangle = 0$, then $\mathbf{x} \notin \mathrm{Span}(S)$.

- Let $S = \{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ be a set of **orthonormal** vectors such that $\mathrm{Span}(S) = V$, and let $\mathbf{x} \in V$. Then there is a *unique* set of coefficients $c_1, \dots, c_n$ such that $\mathbf{x} = c_1 \mathbf{v}_1 + \dots + c_n \mathbf{v}_n$.

- Let $S = \{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ be a set of vectors (no assumption of orthonormality) such that $\mathrm{Span}(S) = V$, and let $\mathbf{x} \in V$. Then there is a *unique* set of coefficients $c_1, \dots, c_n$ such that $\mathbf{x} = c_1 \mathbf{v}_1 + \dots + c_n \mathbf{v}_n$.

- Let $S = \{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n\} \subseteq V$ be a set of vectors. If all the vectors are pairwise linearly independent (i.e., for any $1 \le i \ne j \le n$, the only solution to $c_i \mathbf{v}_i + c_j \mathbf{v}_j = \mathbf{0}$ is the trivial solution $c_i = c_j = 0$), then the set $S$ is linearly independent.

**Exercise 2 Inner Products induce Norms**

20 credits

Let $V$ be a vector space, and let $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{R}$ be an inner product on $V$. Define $\|\mathbf{x}\| := \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle}$.

Prove that $\| \cdot \|$ is a norm.

(Hint: To prove the triangle inequality holds, you may need the Cauchy–Schwarz inequality, $\langle \mathbf{x}, \mathbf{y} \rangle \le \|\mathbf{x}\| \, \|\mathbf{y}\|$.)
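
Before writing the proof, the three norm axioms can be sanity-checked numerically (a numpy sketch using the standard dot product, so $\|\mathbf{x}\| = \sqrt{\mathbf{x}^T\mathbf{x}}$; this is not a substitute for the proof):

```python
import numpy as np

rng = np.random.default_rng(1)

def norm_from_inner(x):
    # ||x|| := sqrt(<x, x>), here with the standard dot product.
    return np.sqrt(np.dot(x, x))

# Check the norm axioms on many random vectors and scalars.
ok = True
for _ in range(1000):
    x, y = rng.standard_normal((2, 4))
    t = rng.standard_normal()
    ok &= norm_from_inner(x) >= 0                            # non-negativity
    ok &= np.isclose(norm_from_inner(t * x),
                     abs(t) * norm_from_inner(x))            # absolute homogeneity
    ok &= (norm_from_inner(x + y)
           <= norm_from_inner(x) + norm_from_inner(y) + 1e-9)  # triangle inequality

print(bool(ok))
```

The Cauchy–Schwarz step in the proof corresponds exactly to the triangle-inequality check in the loop: expand $\|\mathbf{x}+\mathbf{y}\|^2$ and bound the cross term.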

**Exercise 3 General Linear Regression with Regularisation** (10+10+10+5+5 credits)

Let $\mathbf{A} \in \mathbb{R}^{N \times N}$, $\mathbf{B} \in \mathbb{R}^{D \times D}$ be *symmetric, positive definite* matrices. From the lectures, we can use symmetric positive definite matrices to define a corresponding inner product, as shown below. We can also define a norm using the inner products.

$$\langle \mathbf{x}, \mathbf{y} \rangle_\mathbf{A} := \mathbf{x}^T \mathbf{A} \mathbf{y}, \qquad \|\mathbf{x}\|_\mathbf{A}^2 := \langle \mathbf{x}, \mathbf{x} \rangle_\mathbf{A}$$

$$\langle \mathbf{x}, \mathbf{y} \rangle_\mathbf{B} := \mathbf{x}^T \mathbf{B} \mathbf{y}, \qquad \|\mathbf{x}\|_\mathbf{B}^2 := \langle \mathbf{x}, \mathbf{x} \rangle_\mathbf{B}$$
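
These definitions can be computed directly (a numpy sketch; the construction $\mathbf{A} = \mathbf{M}^T\mathbf{M} + \mathbf{I}$ is just one convenient way to manufacture a symmetric positive definite matrix for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

# Build a random symmetric positive definite matrix A = M^T M + I
# (M^T M is symmetric positive semi-definite; adding I makes it positive definite).
N = 4
M = rng.standard_normal((N, N))
A = M.T @ M + np.eye(N)

def inner_A(x, y):
    # <x, y>_A := x^T A y
    return x @ A @ y

def sq_norm_A(x):
    # ||x||_A^2 := <x, x>_A
    return inner_A(x, x)

x = rng.standard_normal(N)
y = rng.standard_normal(N)

# Symmetry of A gives symmetry of the inner product,
# and positive definiteness gives ||x||_A^2 > 0 for x != 0.
print(bool(np.isclose(inner_A(x, y), inner_A(y, x))))
print(bool(sq_norm_A(x) > 0))
```

The same two functions with $\mathbf{B}$ in place of $\mathbf{A}$ give $\langle \cdot, \cdot \rangle_\mathbf{B}$ and $\|\cdot\|_\mathbf{B}^2$.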

Suppose we are performing linear regression, with a training set $\{(\mathbf{x}_1, y_1), \dots, (\mathbf{x}_N, y_N)\}$, where for each $i$, $\mathbf{x}_i \in \mathbb{R}^D$ and $y_i \in \mathbb{R}$. We can define the matrix $\mathbf{X} = [\mathbf{x}_1, \dots, \mathbf{x}_N]^T \in \mathbb{R}^{N \times D}$ and the vector $\mathbf{y} = [y_1, \dots, y_N]^T \in \mathbb{R}^N$.

We would like to find $\theta \in \mathbb{R}^D$, $\mathbf{c} \in \mathbb{R}^N$ such that $\mathbf{y} \approx \mathbf{X}\theta + \mathbf{c}$, where the error is measured using $\| \cdot \|_\mathbf{A}$.

We avoid overfitting by adding a weighted regularization term, measured using $\| \cdot \|_\mathbf{B}$. We define the loss function with regularizer:

$$L_{\mathbf{A},\mathbf{B},\mathbf{y},\mathbf{X}}(\theta, \mathbf{c}) = \|\mathbf{y} - \mathbf{X}\theta - \mathbf{c}\|_\mathbf{A}^2 + \|\theta\|_\mathbf{B}^2 + \|\mathbf{c}\|_\mathbf{A}^2$$

For the sake of brevity we write $L(\theta, \mathbf{c})$ for $L_{\mathbf{A},\mathbf{B},\mathbf{y},\mathbf{X}}(\theta, \mathbf{c})$.
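
The loss above can be evaluated term by term (a numpy sketch with made-up dimensions and random data, again manufacturing SPD matrices as $\mathbf{M}^T\mathbf{M} + \mathbf{I}$ for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
N, D = 5, 3

# Random symmetric positive definite A (N x N) and B (D x D).
MA = rng.standard_normal((N, N))
A = MA.T @ MA + np.eye(N)
MB = rng.standard_normal((D, D))
B = MB.T @ MB + np.eye(D)

X = rng.standard_normal((N, D))
y = rng.standard_normal(N)

def loss(theta, c):
    # L(theta, c) = ||y - X theta - c||_A^2 + ||theta||_B^2 + ||c||_A^2
    r = y - X @ theta - c
    return r @ A @ r + theta @ B @ theta + c @ A @ c

theta = rng.standard_normal(D)
c = rng.standard_normal(N)

# Each term is a squared A- or B-norm, so the loss is never negative.
print(bool(loss(theta, c) >= 0))
```

Having the loss as a function of `(theta, c)` also makes it easy to check your gradient formulas later with finite differences.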

**HINTS:**

- You may assume $\mathbf{X}$ to be full rank. In particular, you may assume that the null space of $\mathbf{X}$ is trivial (that is, the only solution to $\mathbf{X}\mathbf{z} = \mathbf{0}$ is the trivial solution $\mathbf{z} = \mathbf{0}$).

- Find the gradient $\nabla_\theta L(\theta, \mathbf{c})$.
- Let $\nabla_\theta L(\theta, \mathbf{c}) = \mathbf{0}$, and solve for $\theta$. If you need to invert a matrix to solve for $\theta$, you should prove the inverse exists.

We now compute the gradient with respect to $\mathbf{c}$.

- Find the gradient $\nabla_\mathbf{c} L(\theta, \mathbf{c})$.

- Let $\nabla_\mathbf{c} L(\theta, \mathbf{c}) = \mathbf{0}$, and solve for $\mathbf{c}$. If you need to invert a matrix to solve for $\mathbf{c}$, you should prove the inverse exists.

- Show that if we set $\mathbf{A} = \mathbf{I}$, $\mathbf{c} = \mathbf{0}$, $\mathbf{B} = \lambda \mathbf{I}$, where $\lambda \in \mathbb{R}$, your answer for 3.2 agrees with the analytic solution for the standard least squares regression problem with L2 regularization, given by

$$\theta = (\mathbf{X}^T \mathbf{X} + \lambda \mathbf{I})^{-1} \mathbf{X}^T \mathbf{y}.$$
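
The special case in the last part can be checked numerically (a numpy sketch with made-up data; it verifies that the ridge closed form makes the gradient of $\|\mathbf{y} - \mathbf{X}\theta\|^2 + \lambda\|\theta\|^2$ vanish, which is what your 3.2 answer should reduce to):

```python
import numpy as np

rng = np.random.default_rng(4)
N, D = 8, 3
X = rng.standard_normal((N, D))
y = rng.standard_normal(N)
lam = 0.1

# Closed-form ridge solution: theta = (X^T X + lam I)^{-1} X^T y.
theta = np.linalg.solve(X.T @ X + lam * np.eye(D), X.T @ y)

# With A = I, c = 0, B = lam I the loss is ||y - X theta||^2 + lam ||theta||^2,
# whose gradient is 2 X^T (X theta - y) + 2 lam theta; it vanishes at the minimizer.
grad = 2 * X.T @ (X @ theta - y) + 2 * lam * theta
print(bool(np.allclose(grad, 0)))
```

Note that $\mathbf{X}^T\mathbf{X} + \lambda\mathbf{I}$ is invertible for $\lambda > 0$ (its eigenvalues are bounded below by $\lambda$), which mirrors the invertibility arguments the exercise asks for.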