This assignment uses MATLAB to predict changes in a city's population.

AMATH 301 – Spring 2020

Homework #6

Due on Friday, May 15, 2020

Instructions for submitting:

• Problems 1, 3, 5, and 7 should be submitted to MATLAB Grader. You have 3 attempts

for each problem.

• Problems 2, 4, 6, 8, and 9 should be submitted to Gradescope. The solutions and the

code used to get those solutions should be submitted as a single pdf. All code should be

at the end of the file. You must select which page each problem is on when you submit

to Gradescope.

Part I: Population Data

The file SeaPopData.mat, which is included with the homework, contains the following population data for the city of Seattle.

Year  Population
1860         188
1870        1151
1880        3533
1890       42837
1900       80671
1910      237194
1920      315312
1930      365583
1940      368302
1950      467591
1960      557087
1970      530831
1980      493846
1990      516259
2000      563374
2010      608660

Data from:

1. https://en.wikipedia.org/wiki/Demographics_of_Seattle

2. https://www.seattle.gov/opcd/population-and-demographics

You should load this data into MATLAB using the load command. Be sure that the file

SeaPopData.mat has been downloaded into the same directory as your script file. MATLAB

Grader has its own copy of this file. If the load command is successful, you will have two

new vectors in your workspace, t and Seattle_Pop. The values of the vector t are the number

of years since 1860. Therefore, t = 0 is 1860 and t = 150 is 2010. The vector Seattle_Pop has

the corresponding populations from the table above.

(20 points) Problem 1: MATLAB Grader

(a) Find the line of best fit for the data. That is, find a line P = mt + b where t is the number

of years since 1860 and P is the population of Seattle. Store the slope of the line in the

variable ans1. The most recent population estimate for Seattle is that the 2019 population

was 747,300. Use the equation of the best fit line to predict the population in 2019. Store

this value in the variable ans2.

(b) Find the best fit quadratic function for the data. Use this curve to predict the population

in 2019, and store the prediction in the variable ans3. Repeat this process for the best

fit polynomials of degree 5 and degree 9. Create a 1 × 2 row vector named ans4 with

the predictions for each. The prediction from the degree 5 polynomial should be the first

component of the vector.
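
As a minimal sketch of the fit-then-predict workflow described in parts (a) and (b), the following uses the standard MATLAB functions polyfit and polyval on a small made-up data set. The vectors x and y and all variable names here are hypothetical and are not the Seattle data; only the pattern carries over.

```matlab
% Hypothetical data lying exactly on y = 2x + 1 (not the Seattle data).
x = [0 1 2 3 4];
y = 2*x + 1;

% Degree-1 least-squares fit; coefficients are ordered highest power first.
p1 = polyfit(x, y, 1);
slope = p1(1);
intercept = p1(2);

% Evaluate the fitted line at a point outside the data range.
y_pred = polyval(p1, 6);

% A higher-degree fit only changes the third argument of polyfit.
p2 = polyfit(x, y, 2);
y_pred2 = polyval(p2, 6);
```

Because t counts years since 1860, a prediction for 2019 is evaluated at t = 2019 - 1860 = 159.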

(15 points) Problem 2: Gradescope

(a) Create a plot that contains the Seattle population data and all four of the best fit polynomials

that you computed in Problem 1. Your plot should have the following features:

i. The data should be plotted as black circles.

ii. Your plot should show from t = 0 to t = 160 and the y-axis should show from P = 0

to P = 800,000.

iii. The line of best fit should be blue, the quadratic fit should be red, the degree 5 polynomial should be magenta, and the degree 9 polynomial should be green.

iv. All circles and lines should be large/thick enough to be easily visible when you upload

the plot on the writeup portion of your homework.

v. Label the x-axis with “Years since 1860”.

vi. Label the y-axis with “Seattle Population”.

vii. Include a legend. Use legend labels “data”, “deg 1”, “deg 2”, “deg 5”, and “deg 9”.

(b) What is the real-world interpretation of the slope of the line of best fit for the Seattle

population data? What does it tell us about how the population is changing?

(c) Of the different polynomial fits you tried in Problem 1, which gave the most accurate

prediction of the 2019 population and which gave the least accurate? If you had to predict

the population in 2050, which polynomial fit would you trust the most? Justify your answer.
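
The plotting features listed in part (a) map onto a handful of standard MATLAB commands. The sketch below shows one possible arrangement on made-up data with a single fitted line; the data, axis limits, and axis labels are hypothetical placeholders, not the values the assignment asks for.

```matlab
% Hypothetical data and one fitted curve (not the Seattle data).
x = 0:10:50;
y = [1 3 4 8 9 12];
p1 = polyfit(x, y, 1);
xx = linspace(0, 60, 200);      % fine grid for drawing the fitted curve

figure('Visible', 'off');       % hidden figure; drop the option to see the plot
plot(x, y, 'ko', 'MarkerSize', 8, 'LineWidth', 2);   % black circles
hold on
plot(xx, polyval(p1, xx), 'b', 'LineWidth', 2);      % blue line
hold off
xlim([0 60]);
ylim([0 15]);
xlabel('x (hypothetical)');
ylabel('y (hypothetical)');
legend('data', 'deg 1', 'Location', 'best');
```

Additional curves follow the same pattern between hold on and hold off, using the color codes 'r', 'm', and 'g' for red, magenta, and green.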

Part II: Atmospheric CO2 Data

The amount of CO2 in the atmosphere is regularly measured at the Mauna Loa observatory in

Hawaii. The file CO2_data.mat, which is included with the homework, contains the monthly

averages since 1958. Shown below is a plot of the data. The data has an overall upward trend

as well as seasonal oscillations.

You should load this data into MATLAB using the load command. Be sure that the file

CO2_data.mat has been downloaded into the same directory as your script file. MATLAB

Grader has its own copy of this file. If the load command is successful, you will have two

new vectors in your workspace, t and CO2. The values of the vector t are the number of years

since 1958 corresponding to each month from March 1958 to April 2020. So the first value is

t(1) = 3/12 because March 1958 is the third month (out of 12) since the beginning of 1958.

The vector CO2 has the corresponding CO2 levels.

Data from:

1. https://www.esrl.noaa.gov/gmd/ccgg/trends/data.html

(10 points) Problem 3: MATLAB Grader

It looks like the overall trend of the data might be captured well by using an exponential

fit. Recall from the video “Least-Squares Fitting Methods” that this can be done by data

linearization. Here is a recap of that method:

Let’s say that the data is stored in vectors T and Y. We wish to fit an exponential function

y = a e^{rt} to the data. You can take the natural logarithm of Y and perform a linear fit on the data

vectors T and ln(Y). This gives a linear function ln(y) = mt + b. You can get the values of a and

r for the exponential fit of the original data by using that m = r and b = ln(a).

Use data linearization to get an exponential fit y = a e^{rt} for the CO2 data. Store the value of r

in the variable ans5 and the value of a in the variable ans6.
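
The recap above can be sketched on synthetic data for which the true parameters are known, so the recovered values can be checked. Everything in this block (the values a_true and r_true, the vectors T and Y, and the variable names) is a hypothetical illustration and not the CO2 data.

```matlab
% Hypothetical noise-free data generated from y = 2*exp(0.5*t).
a_true = 2;
r_true = 0.5;
T = 0:0.5:4;
Y = a_true * exp(r_true * T);

% Linear fit of ln(Y) against T: ln(Y) = r*T + ln(a).
p = polyfit(T, log(Y), 1);
r_fit = p(1);          % slope m = r
a_fit = exp(p(2));     % intercept b = ln(a), so a = exp(b)
```

Note that log in MATLAB is the natural logarithm, which is the one this method requires.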

(10 points) Problem 4: Gradescope

Create a plot that contains the atmospheric CO2 data and your exponential fit from Problem 3.

Your plot should have the following features:

i. The data should be plotted as black dots connected by black lines by using the line specification '-k.'.

ii. Your plot should show from t = 0 to t = 65.

iii. The exponential fit should be plotted as a red curve and it should be thick enough to be

easily visible when you upload the plot on the writeup portion of your homework.

iv. Label the x-axis with “Years since 1958”.

v. Label the y-axis with “Atmospheric CO2”.

vi. Include a legend. Use legend labels “data”, “fit curve”.

(10 points) Problem 5: MATLAB Grader

The exponential fit from Problems 3 and 4 is

not great at capturing the overall trend of the data. It can be improved by fitting an exponential

function that is shifted up by a constant:

y = a e^{rt} + b.

For this function, the data linearization trick will not work. Instead, you must create a MATLAB

function that takes the values of a, r, and b as inputs and calculates the sum of squared errors

as output. Then you can use fminsearch to find the values of a, r, and b that minimize the

sum of squared errors (and therefore minimize the root-mean-square error). Use this method

to find the best fit curve of the form y = a e^{rt} + b. For fminsearch, use initial guesses of

a = 30, r = 0.03, and b = 300. Make a 3 × 1 column vector named ans7 with the optimal

parameters [a; r; b]. Calculate the minimum value of the sum of squared errors and store

it in the variable ans8.

Hint: If you use fminsearch properly, the answers to ans7 and ans8 can both be output by

fminsearch.
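
One way to arrange this, sketched here on synthetic data, is to write the sum of squared errors as a function of a single parameter vector, since fminsearch optimizes over one vector argument. The data, the initial guess p0, and the names sse, p_opt, and sse_min are all hypothetical; the two-output call to fminsearch is the standard form that returns both the minimizer and the minimum value.

```matlab
% Hypothetical data from y = 3*exp(0.2*t) + 5 (not the CO2 data).
T = linspace(0, 10, 50);
Y = 3*exp(0.2*T) + 5;

% Sum of squared errors as a function of one parameter vector p = [a; r; b].
sse = @(p) sum((p(1)*exp(p(2)*T) + p(3) - Y).^2);

p0 = [2.5; 0.25; 4];                     % hypothetical initial guess
[p_opt, sse_min] = fminsearch(sse, p0);  % minimizer and minimum value
```

Passing the initial guess as a column vector means p_opt also comes back as a column vector.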

(10 points) Problem 6: Gradescope

Create a plot that contains the atmospheric CO2 data and your best fit curve from Problem 5.

Use the same specifications as given in Problem 4.

(10 points) Problem 7: MATLAB Grader

The best fit curve from Problems 5 and 6 does

a much better job of capturing the overall trend of the data, but it still does not capture the

seasonal oscillations. In order to capture the oscillations, we can find a best fit curve of the

form

y = a e^{rt} + b + c sin(d(t − e))

Create an error function and use fminsearch to find the values of the parameters that minimize

the sum of squared errors. For your initial guess, use the values of a, r, and b that you stored in

ans7 and use c = 3, d = 6 and e = 0. Make a 6×1 column vector named ans9 with the optimal

parameters [a; r; b; c; d; e]. Calculate the minimum value of the sum of squared errors

and store it in ans10.

(10 points) Problem 8: Gradescope

Create a plot that contains the atmospheric CO2 data and your best fit curve from Problem 7.

When you plot the best fit curve, make sure you use enough points to capture the oscillations.

Use the same specifications as given in Problem 4.
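
To see why the number of plotting points matters, note that sin(d(t − e)) has period 2π/d, which is about one year when d is near 6. The sketch below evaluates the model on a fine grid and counts grid points per period. The parameter values are the initial guesses quoted in Problems 5 and 7, used purely for illustration and not fitted values; the grid size of 2000 is an arbitrary assumption.

```matlab
% Illustrative parameter values (the initial guesses, not fitted results).
a = 30; r = 0.03; b = 300; c = 3; d = 6; e = 0;
model = @(t) a*exp(r*t) + b + c*sin(d*(t - e));

period = 2*pi/d;                 % about 1.05 for d = 6
tt = linspace(0, 65, 2000);      % fine grid for plotting
pts_per_period = period / (tt(2) - tt(1));
yy = model(tt);
```

With this grid there are roughly 30 points per oscillation; a grid with only a few points per period would draw the oscillations as jagged or aliased.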

(5 points) Problem 9: Gradescope

Which of the following types of error is most resistant

to outliers (i.e., is least affected by the presence of outliers)?

(a) average error

(b) maximum error

(c) root-mean-square error
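
For reference, the three measures can be computed from a vector of residuals as sketched below. This assumes the usual definitions (mean absolute residual, largest absolute residual, and square root of the mean squared residual); the course may normalize them differently. The residual vector err is hypothetical.

```matlab
% Hypothetical residuals (fit minus data) for four points.
err = [1 -2 2 -1];

avg_err = mean(abs(err));        % average error
max_err = max(abs(err));         % maximum error
rms_err = sqrt(mean(err.^2));    % root-mean-square error
```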

Gradescope Deliverables

Your Gradescope writeup should contain the following:

• Problem 2: The plot with the population data and polynomial fit curves, and answers to

the questions in (b) and (c).

• Problem 4: The plot

• Problem 6: The plot

• Problem 8: The plot

• Problem 9: A letter corresponding to your answer choice

• Code: Code for problems 2, 4, 6, and 8