I follow some math blogs, partly for entertainment, mostly out of professional interest. That is how I came across the following page.

It contains one of those extrapolations that economics students seem to love. We have the following data.

>yp=[2,4,3,5,9,6,8,9,8,10]

[2, 4, 3, 5, 9, 6, 8, 9, 8, 10]

>xp=1:cols(yp);

Here is a plot.

>statplot(xp,yp):

The suggested model for this curve is the exponential function

y = a + b*exp(c*x)

I am not going to discuss the arbitrariness of this choice here. For a discussion of the problems of prediction, see

This link is from the Intmath page. Or see my own discussion of Taleb's book.

As a first step, we implement a function that computes the norm of the residuals of the model. Minimizing this norm is equivalent to minimizing the sum of squared errors, which is the standard way to measure the error of an approximation in statistics.

>function d([a,b,c],xp,yp) := norm(a+b*exp(c*xp)-yp)

The Nelder-Mead method can minimize this if we choose reasonable start values.

>p=nelder("d",[0,-1,-1];xp,yp)

[12.6146, -12.2232, -0.136811]

>function fp (x,p) := p[1]+p[2]*exp(p[3]*x)

Let us plot the result.

>statplot(xp,yp); ... plot2d("fp(x,p)",>add,color=red):

Of course, the parameters of the solution are only meaningful if the model is. In our case, that is debatable.

Even more doubtful is an extrapolation.

>plot2d("fp(x,p)",1,30,color=red); ... plot2d(xp,yp,>add); ... plot2d(xp,yp,>points,>add):

The problem discussed on the Intmath page is the following: Can you determine the best fit such that the sum of the errors is 0?

It is interesting that the best fit does this automatically. To see this, we differentiate the squared error

(a + b*exp(c*x) - y)^2

with respect to a, b, and c. We do this for each summand of the error sum, with fixed x and y.

>&gradient((a+b*exp(c*x)-y)^2,[a,b,c])

[2*(b*E^(c*x) - y + a), 2*E^(c*x)*(b*E^(c*x) - y + a), 2*b*x*E^(c*x)*(b*E^(c*x) - y + a)]
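For readers who want to re-derive this gradient themselves, here is a sketch using the sympy library (my own substitute here; EMT delegates such symbolic work to Maxima):

```python
import sympy as sp

a, b, c, x, y = sp.symbols("a b c x y")
err2 = (a + b * sp.exp(c * x) - y) ** 2

# gradient of the squared error of a single data point w.r.t. a, b, c
grad = [sp.diff(err2, v) for v in (a, b, c)]

w = a + b * sp.exp(c * x) - y  # the residual
expected = [2 * w, 2 * sp.exp(c * x) * w, 2 * b * x * sp.exp(c * x) * w]
print(all(sp.simplify(g - e) == 0 for g, e in zip(grad, expected)))  # True
```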

In the minimum, this gradient vanishes. Looking at the derivative with respect to a, we see that the sum of the errors a + b*exp(c*x) - y over all data points must be 0.

By the way, we can apply the Broyden algorithm to the gradient. This is much faster and more accurate than the Nelder-Mead algorithm. For this, we need to define a function for the gradient.

>function gradd ([a,b,c],xp,yp) ...
 w=a+b*exp(c*xp)-yp;
 return 2*[sum(w),sum(exp(c*xp)*w),sum(b*xp*exp(c*xp)*w)]
 endfunction
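A quick safeguard against mistakes in such a hand-coded gradient is a finite-difference check. The following Python sketch (NumPy assumed, not part of the EMT session) compares the analytic gradient of the sum of squared errors with central differences:

```python
import numpy as np

xp = np.arange(1, 11)
yp = np.array([2, 4, 3, 5, 9, 6, 8, 9, 8, 10], dtype=float)

def sse(p):
    a, b, c = p
    return np.sum((a + b * np.exp(c * xp) - yp) ** 2)

def gradd(p):
    a, b, c = p
    w = a + b * np.exp(c * xp) - yp   # residuals
    e = np.exp(c * xp)
    return 2 * np.array([np.sum(w), np.sum(e * w), np.sum(b * xp * e * w)])

# compare the analytic gradient with central finite differences
p = np.array([12.6, -12.2, -0.14])
h = 1e-6
num = np.array([(sse(p + h * np.eye(3)[i]) - sse(p - h * np.eye(3)[i])) / (2 * h)
                for i in range(3)])
print(np.max(np.abs(num - gradd(p))))  # should be very small
```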

In our approximate solution, the gradient is already small.

>gradd(p,xp,yp)

[6.36603e-006, 2.01653e-006, -0.00021505]

Start the Broyden method.

>p1=broyden("gradd",p;xp,yp)

[12.6146, -12.2232, -0.136811]

The result is very good.

>gradd(p1,xp,yp)

[-1.24345e-014, 0, 3.41061e-013]

The sum is 0, as expected.

>function dsum ([a,b,c],xp,yp) := sum(a+b*exp(c*xp)-yp)
>dsum(p1,xp,yp)

0

EMT also has a very general function to fit a model to data. It uses the Nelder-Mead algorithm or the Powell method for the minimization.

>modelfit("fp",[0,-1,-1],xp,yp)

[12.6146, -12.2232, -0.13681]
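A rough analogue of modelfit outside EMT is scipy.optimize.curve_fit, which does a Levenberg-Marquardt least-squares fit; the start values below are my own guess:

```python
import numpy as np
from scipy.optimize import curve_fit

xp = np.arange(1, 11)
yp = np.array([2, 4, 3, 5, 9, 6, 8, 9, 8, 10], dtype=float)

def fp(x, a, b, c):
    return a + b * np.exp(c * x)

# least-squares fit from a rough starting guess
popt, pcov = curve_fit(fp, xp, yp, p0=[10, -10, -0.2])
print(popt)

# residuals of the best fit sum to zero, because the model has a free constant a
print(np.sum(fp(xp, *popt) - yp))
```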

If we assume that the model is correct and the errors are random, we can simulate another random experiment and compute a new prediction. The estimated standard deviation of the errors is

>s=d(p1,xp,yp)/sqrt(10)

1.12476301406
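The same error estimate is easy to reproduce in Python (a sketch with NumPy/SciPy, not part of the EMT session): it is just the norm of the residuals divided by sqrt(n).

```python
import numpy as np
from scipy.optimize import curve_fit

xp = np.arange(1, 11)
yp = np.array([2, 4, 3, 5, 9, 6, 8, 9, 8, 10], dtype=float)

def fp(x, a, b, c):
    return a + b * np.exp(c * x)

popt, _ = curve_fit(fp, xp, yp, p0=[10, -10, -0.2])

# estimated standard deviation of the errors: ||residuals|| / sqrt(n)
s = np.linalg.norm(fp(xp, *popt) - yp) / np.sqrt(len(xp))
print(s)  # about 1.125
```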

Now we generate another set of yp.

>seed(1); yp1=fp(xp,p1)+s*normal(size(xp)); ... statplot(xp,yp1):

Again, we compute a best fit using the Nelder-Mead method.

>p2=nelder("d",p1;xp,yp1)

[28.5581, -28.6788, -0.0505113]

And plot it the same way as above.

>plot2d(xp,yp1,a=0,b=30,c=0,d=30); plot2d(xp,yp1,>points,>add); ... plot2d("fp(x,p2)",1,30,color=red,>add):

Depending on your random numbers, you get some value around 12 for x=30. Note that we fixed the seed, so the value here lands nicely at 12 every time.

We can check the distribution of the predicted values with a Monte Carlo simulation. We have to discard the cases where the Nelder-Mead method fails.

>function test (n) ...
 global xp,yp,s,p1;
 res=zeros(1,n);
 loop 1 to n;
   repeat
     yp1=fp(xp,p1)+normal(size(xp))*s;
     {level,p2}=errorlevel(''nelder("d",p1;xp,yp1,eps=0.001)'');
     if level==0 then res[#]=fp(30,p2); break; endif;
   end;
 end;
 return res;
 endfunction

If we do this, we see that the prediction is nearly useless.

>seed(1); plot2d(test(500),>distribution):
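The Monte Carlo experiment can be sketched in Python as follows. Here curve_fit stands in for the Nelder-Mead refit, and the seed and sample count are my own choices, so the numbers will not match the EMT run exactly; the qualitative conclusion (a wide spread of predictions at x=30) is the same.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(1)

xp = np.arange(1, 11, dtype=float)
yp = np.array([2, 4, 3, 5, 9, 6, 8, 9, 8, 10], dtype=float)

def fp(x, a, b, c):
    return a + b * np.exp(c * x)

p1, _ = curve_fit(fp, xp, yp, p0=[10, -10, -0.2])
s = np.linalg.norm(fp(xp, *p1) - yp) / np.sqrt(len(xp))

# simulate data under the fitted model, refit, and predict at x = 30;
# failed or degenerate fits are discarded, as in the EMT version
preds = []
for _ in range(200):
    yp1 = fp(xp, *p1) + s * rng.standard_normal(len(xp))
    try:
        p2, _ = curve_fit(fp, xp, yp1, p0=p1, maxfev=2000)
    except (RuntimeError, ValueError):  # no convergence
        continue
    val = fp(30.0, *p2)
    if np.isfinite(val):
        preds.append(val)

preds = np.array(preds)
print(len(preds), np.median(preds), preds.std())
```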