Population


The population of the U.S. in the years 1920 to 1990 is given by the table

   t | 1920  1930  1940  1950  1960  1970  1980  1990
   y |  105   123   131   150   179   203   228   249

where t is the year and y is the population in millions. The object of this exercise is to predict the population in the year 2000 using different polynomial extrapolations.


a)

First we are going to fit the given data in the least squares sense using the quadratic model

   y(t) = c1 + c2*t + c3*t^2

To what kind of least squares problem will this lead?

linear

nonlinear




How large is this equation system?

Number of equations = ___________________
Number of unknowns = ___________________




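Fill in the following values, assuming that the system is written as A c = b in the least squares sense, with one row per data point in the order of the table and the columns of A ordered like the coefficients c1, c2, c3.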

A[1,2] = ___________________ A[3,1] = ___________________ A[6,3] = ___________________ b[4] = ___________________



Type in the Maple command that solves the least squares problem in one step. Assume that the necessary package is available and that A and b are properly initialized. Assign the result to c. Do not insert spaces in the Maple commands you enter.

___________________



The result is


What population in the year 2000 does this model predict? (Accuracy: at least full millions).

___________________



Now we are going to do the SVD analysis. Which Maple command can be used to compute the singular values in one step? Assume that the necessary package is available, and that A and b are properly initialized. Do not assign the result. No spaces in expression.

___________________



After applying the evalf() function, the result is


Recall that a problem is well conditioned (resp. badly conditioned) if small changes in the input data cause small (resp. large) changes in the solution. The condition number for square matrices

   k(A) = ||A|| * ||A^(-1)||

can be generalized to non-square matrices by

   k(A) = sigma_max / sigma_min

where sigma_max and sigma_min are the largest and the smallest singular value of A. Its interpretation is that k(A)~=1 means good condition while k(A)>>1 means bad condition. Compute the condition number for our problem (2 significant digits).

___________________



Choose one of the following interpretations:

The matrix A is badly conditioned. This comes not from the problem itself, but from the bad scaling.

Because of k(A)>>1 the matrix is badly conditioned. So we set sigma_3 = 0 and we compute again the solution c and the extrapolation for t = 2000.

As always when solving a least squares problem using SVD, we drop at least one (the smallest) singular value.

Because of k(A)>>1 the matrix A is badly conditioned. This comes from the bad scaling. But by now, we cannot know if it also comes from the problem itself.

Because of the high condition number, the problem is well-conditioned.




b)

Now we are going to fit the same data using the similar model


The data range for s is now from ___________________ to ___________________ and the value to predict is s= ___________________



After computing the 8 values for s and solving the least squares problem as in a), we get

and the result of the SVD is


The additional code used to compute this was:
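   with(linalg):                                  # load the linalg package
   # Census data: t = year, y = population in millions
   t := vector([1920,1930,1940,1950,1960,1970,1980,1990]):
   y := vector([ 105, 123, 131, 150, 179, 203, 228, 249]):
   m := vectdim(t):                               # number of data points
   # Matrix with rows [1, t_i, t_i^2] for the quadratic model
   A := matrix([seq([1,t[i],t[i]^2],i=1..m)]):
   c := leastsqrs(A, y):                          # least squares solution of A*c = y
   evalf(evalm(c));
   # Extrapolate the fitted parabola to the year 2000
   y_2000 := evalf(c[1]+c[2]*2000+c[3]*2000^2);
   # Singular values of A, then the ratio of the first to the last one
   Sigma  := evalf(Svd(A));
   kappa  := Sigma[1]/Sigma[vectdim(Sigma)];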
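
The quadratic fit in the Maple listing can be cross-checked outside Maple. The following is a minimal sketch in Python, not part of the original quiz. It assumes that the fit is computed from the normal equations (A^T A) c = A^T y in exact rational arithmetic. In floating point the normal equations would square the condition number of A, which is why fractions are used here. The helper names solve_exact and model are hypothetical.

```python
from fractions import Fraction

# Census data from the exercise: year t, population y in millions
t = [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990]
y = [105, 123, 131, 150, 179, 203, 228, 249]

# Matrix with rows [1, t_i, t_i^2], stored as exact fractions
A = [[Fraction(ti) ** j for j in range(3)] for ti in t]

def solve_exact(M, rhs):
    """Solve the square system M x = rhs by Gauss-Jordan elimination (exact arithmetic)."""
    n = len(M)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for p in range(n):
        # Any nonzero pivot is fine because the arithmetic is exact
        piv = next(r for r in range(p, n) if aug[r][p] != 0)
        aug[p], aug[piv] = aug[piv], aug[p]
        aug[p] = [x / aug[p][p] for x in aug[p]]
        for r in range(n):
            if r != p:
                f = aug[r][p]
                aug[r] = [x - f * z for x, z in zip(aug[r], aug[p])]
    return [aug[i][n] for i in range(n)]

# Normal equations (A^T A) c = A^T y
AtA = [[sum(row[i] * row[j] for row in A) for j in range(3)] for i in range(3)]
Aty = [sum(row[i] * yi for row, yi in zip(A, y)) for i in range(3)]
c = solve_exact(AtA, Aty)

def model(year):
    """Evaluate the fitted parabola c1 + c2*t + c3*t^2."""
    return c[0] + c[1] * year + c[2] * year ** 2

# Residual of the fit; it is orthogonal to the columns of A
residual = [model(ti) - yi for ti, yi in zip(t, y)]
print(float(model(2000)))
```

Because every step is exact, the residual is exactly orthogonal to each column of A, which gives a simple check that the solution really is the least squares solution.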
        

Choose one of the following statements:

The models in a) and b) do not determine the same function t->y(t), and the predicted population in the year 2000 is better than the previous result.

The models in a) and b) determine the same function t->y(t), so the predicted population in the year 2000 is equal to the previous result.

The models in a) and b) do not determine the same function t->y(t), but the predicted population in the year 2000 is equal to the previous result.




Compute the condition number for this problem (at least 2 significant digits).
k(A)= ___________________



Choose one of the following interpretations:

The condition number is not too large, and the components of d are now within the same order of magnitude. We conclude that the artificial bad scaling was responsible for the large condition number in a).

The condition number is still very large, so the problem is ill posed independently of the scaling.
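
How strongly the scaling of the abscissa can influence sigma_max / sigma_min can be illustrated on a smaller problem. The sketch below is an illustration only. It assumes a two-column matrix with rows [1, x_i] (a straight-line model, not the quadratic A of the exercise) and a hypothetical rescaling s = (t - 1955)/35, which need not coincide with the substitution used in b). For two columns the singular values follow in closed form from the 2x2 matrix A^T A.

```python
import math

def cond_two_columns(x):
    """Return sigma_max / sigma_min of the m-by-2 matrix with rows [1, x_i]."""
    m = len(x)
    sx = sum(x)
    sxx = sum(v * v for v in x)
    # A^T A = [[m, sx], [sx, sxx]]; its eigenvalues are the squared singular values
    trace = m + sxx
    det = m * sxx - sx * sx
    lam_max = trace / 2 + math.sqrt(trace * trace / 4 - det)
    lam_min = det / lam_max  # avoids cancellation in trace/2 - sqrt(...)
    return math.sqrt(lam_max / lam_min)

years = list(range(1920, 2000, 10))          # 1920, 1930, ..., 1990
scaled = [(t - 1955) / 35 for t in years]    # hypothetical rescaling to [-1, 1]

k_raw = cond_two_columns(years)
k_scaled = cond_two_columns(scaled)
print(k_raw, k_scaled)
```

The same data points give a condition number several orders of magnitude larger when the raw years are used than when the abscissa is centered and scaled.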




c)

Now we are going to use the model

   y(s) = d1 + d2*s + d3*s^2 + ... + d7*s^6

to fit the population data. What is the problem when fitting a point set with a polynomial of high degree?
___________________



First assume that A has been defined by
A:=matrix([seq([seq(s[i]^j,j=0..6)],i=1..m)]);

that only the three parameters d2, d3 and d5 are different from zero, and that you want to solve this reduced problem using the command
d:=leastsqrs(Ak,y);

Give the command to define Ak. Use no spaces.
___________________




The result with 3 significant digits is



and with the command norm(evalm(Ak&*d-y),2) we get the residual norm 249. Now we want to find the three columns that lead to the minimal residual norm. This type of problem is called
___________________



The number of possible choices is
___________________



What is the Maple command to generate all combinations of three different numbers between 1 and 7, such as [3,4,6]?
___________________
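
The search over column subsets described above can be sketched as an exhaustive loop. The example below is a minimal illustration in Python, not the quiz's Maple solution. It assumes a small hypothetical data set generated from y = 2 + 3*s^2, four monomial columns s^0..s^3 and subsets of size 2. It also solves each small problem through the normal equations, which is adequate only for such tiny, well-scaled cases. All function names are hypothetical.

```python
from itertools import combinations
import math

def solve_normal_equations(cols, y):
    """Least squares coefficients for the given columns via (A^T A) d = A^T y."""
    k = len(cols)
    # Augmented system [A^T A | A^T y]
    g = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(k)]
         + [sum(a * b for a, b in zip(cols[i], y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for p in range(k):
        piv = max(range(p, k), key=lambda r: abs(g[r][p]))
        g[p], g[piv] = g[piv], g[p]
        for r in range(k):
            if r != p:
                f = g[r][p] / g[p][p]
                g[r] = [x - f * z for x, z in zip(g[r], g[p])]
    return [g[i][k] / g[i][i] for i in range(k)]

def residual_norm(cols, coef, y):
    """Euclidean norm of A*d - y for the selected columns."""
    fitted = [sum(c * col[i] for c, col in zip(coef, cols)) for i in range(len(y))]
    return math.sqrt(sum((f - v) ** 2 for f, v in zip(fitted, y)))

def best_subset(columns, y, k):
    """Try every choice of k columns and keep the one with the smallest residual norm."""
    best = None
    for idx in combinations(range(len(columns)), k):
        cols = [columns[j] for j in idx]
        coef = solve_normal_equations(cols, y)
        res = residual_norm(cols, coef, y)
        if best is None or res < best[0]:
            best = (res, idx, coef)
    return best

# Hypothetical toy data: exactly representable by the columns s^0 and s^2
s = [-1.0, -0.5, 0.0, 0.5, 1.0]
y = [2 + 3 * v ** 2 for v in s]
columns = [[v ** j for v in s] for j in range(4)]

res, idx, coef = best_subset(columns, y, 2)
print(idx, coef, res)
```

The column indices here are zero-based, so the pair (0, 2) corresponds to the monomials s^0 and s^2. The cost grows with the number of subsets, so this brute-force approach is only practical for small numbers of columns.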



we get the result


with a residual norm of only 8.74. What are the norm of d and the prediction y(s=0) (3 significant digits)?
||d||= ___________________ y(s=0)= ___________________



If we do the same for all numbers k between 1 and 7 and round to 3 significant digits, we get:


Which of the following statements is true for every best basis problem?

With an increasing number of variables, the solution norm increases.

With an increasing number of variables, the residual norm decreases.




Back to our problem: what are good criteria for choosing the right k?

Choose k as large as possible (in our case 7) because this will minimize the error.

Choose k so that the coefficients di are in the same order of magnitude as the values of y (in our case 105...249).

Choose k less than and not too close to the number of data points (in our case 8).



