Recall how the bisection method finds roots of functions in one dimension.
The root is supposed to have been bracketed in an interval
where
and
have different signs. One then evaluates the function at an
intermediate point x and obtains a new, smaller bracketing interval, either
or
. The process continues until the bracketing interval is
acceptably small. It is optional to choose
to be the midpoint of
so that the decrease in the interval length is maximized, when the function
is as uncooperative as it can be, i.e., when the (bad) luck of the draw
forces you to take the bigger bisected segment.
A similar procedure with a simple alteration can be applied to the problem
of minimization. What does it mean to bracket a minimum? A root of a
function is known to be bracketed by a pair of points,
, when the
function has opposite sign at those two points. A minimum, by contrast, is
known to be bracketed only when there is a triplet of points,
, such
that
is less than or equal to both
and
. In this case we
know that the function (if it is nonsingular) has a minimum in the interval
.
The analog of bisection is to choose a new point
, either between
and
or between
and
. Suppose, to be specific, that we make the latter
choice. Then we evaluate
. If
, then the new bracketing
triplet of points is
; otherwise, if
, then the new
bracketing triplet is
. In all cases the middle point of the new
triplet is the abscissa whose ordinate is the best minimum achieved so far;
see Figure 1.6
. We continue the process of bracketing until the distance between the two
outer points of the triplet is sufficiently small.
How small is ''sufficiently'' small? For a minimum located at a value
,
you might naively think that you will be able to bracket it in as small a
range as
, where
is your
computer's floating point epsilon, a number like
(single
precision) or
(double precision).1.4
Not so! In general, the shape of your function
near
will be given
by Taylor's theorem
| or |
The minimum-finding routines of this chapter will often call for a
user-supplied argument
, and return with an abscissa whose fractional
precision is about
(bracketing interval of fractional size about
). Unless you have a better estimate for the right-hand side of equation (1.6), you should set
equal to (not much less than) the square root of your machine's floating precision, since smaller
values will gain you nothing.
It remains to decide on a strategy for choosing the new point
, given
. Suppose that
is a fraction
of the way between
and
,
i.e.
Given, at each stage, a bracketing triplet of points, the next point to be
tried is that which is a fraction
into the larger of the two
intervals (measuring from the central point of the triplet). If you start
out with a bracketing triplet whose segments are not in the golden ratios,
the procedure of choosing successive points at the golden mean point of the
larger segment will quickly converge to the proper, self- replicating ratios.
The golden section search guarantees that each new function evaluation will
(after self-replicating ratios have been achieved) bracket the minimum to an
interval just
times the size of the preceding interval - even in
the worst case. This is comparable to, but not quite as good as, the
which holds when finding roots by bisection. Note that the convergence is
linear, meaning that successive significant figures are gained linearly with
additional function evaluations.
MATLAB Algorithm for the Golden Section Search:
function [xmin, fmin] = golden(f, ax, bx, cx, tol)
%GOLDEN Minimize function of one variable using golden section search
%
% [xmin, fmin] = golden(f, ax, bx, cx, tol) computes a local minimum
% of f. xmin is the computed local minimizer of f and fmin is
% f(xmin). xmin is computed to an relative accuracy of TOL.
%
% The parameters ax, bx and cx must satisfy the following conditions:
% ax < bx < cx, f(bx) < f(ax) and f(bx) < f(cx).
%
% xmin satisfies ax < xmin < cx. golden is guaranteed to succeed if f
% is continuous between ax and cx
%
% Roman Geus, ETH Zuerich, 9.12.97
C = (3-sqrt(5))/2;
R = 1-C;
x0 = ax;
x3 = cx;
if (abs(cx-bx) > abs(bx-ax)),
x1 = bx;
x2 = bx + C*(cx-bx);
else
x2 = bx;
x1 = bx - C*(bx-ax);
end
f1 = feval(f,x1);
f2 = feval(f,x2);
k = 1;
while abs(x3-x0) > tol*(abs(x1)+abs(x2)),
fprintf(1,'k=%4d, |a-b|=%e\n', k, abs(x3-x0));
if f2 < f1,
x0 = x1;
x1 = x2;
x2 = R*x1 + C*x3; % x2 = x1+c*(x3-x1)
f1 = f2;
f2 = feval(f,x2);
else
x3 = x2;
x2 = x1;
x1 = R*x2 + C*x0; % x1 = x2+c*(x0-x2)
f2 = f1;
f1 = feval(f,x1);
end
k = k+1;
end
if f1 < f2,
xmin = x1;
fmin = f1;
else
xmin = x2;
fmin = f2;
end
When
then choose
so that
.
Compute
and replace
,
,
by
,
and
.
Repeat this, until
, then
start with Brent's minimization algorithm.
In each such step,
grows by a factor of
.
Notice that this search may not terminate - e.g. for functions like
or
, in which
case a general algorithm should limit the number of trials of the
exponential search.
The golden section search (Brent's method) is designed to handle, in effect, the worst possible case of function minimization in one dimension, with the uncooperative minimum hunted down and cornered like a scared rabbit. But why assume the worst? If the function is nicely parabolic near its minimum - surely the generic case for sufficiently smooth functions - then the parabola fitted through any three points ought to take us in a single leap to the minimum, or at least very near ´ it. Since we want to find an abscissa rather than an ordinate, the procedure is technically called inverse parabolic interpolation.
The formula for the abscissa
which is the minimum of a parabola through
three points
,
and
is
This formula fails only if three points are collinear, in which case the denominator is zero (minimum of the parabola is infinitely far away). Note, however, that Equation (1.12) is as likely to jump to a parabolic maximum as to a minimum. No minimization scheme that depends solely on Equation (1.12) is likely to succeed in practice.
A good goal is to design a scheme which relies on a sure-but-slow technique, like golden section search, when the function is not cooperative, but which switches over to Equation (1.12) when the function behaves well. The task is nontrivial for several reasons, including:
This improved method is up to the task in all particulars, but will not be discussed in detail here, since the algorithm has a length of about 4 printed pages.
Gaston Gonnet, Institute for Scientific Computing, ETH Zürich, Switzerland