do you remember how to check for the presence of a local max or local min at a point where the derivative is zero?
It went like this: if f'(a) = 0 then you need to check the sign of f"(a). If f"(a) < 0 then it is a local max, if f"(a) > 0 it is a local min, and if f"(a) = 0 the test is inconclusive (the point could be a max, a min, or an inflection point). This was in calc1; in calc3 the situation is similar but unfortunately more complicated. Here is the proposition,
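To make the calc1 rule concrete, here is a small sketch in Python (not part of the worksheet; the example function is my own) applying it to f(x) = x^3 - 3x, whose critical points are x = -1 and x = 1:

```python
def classify(f2a):
    """Classify a critical point a of f from the sign of f''(a)."""
    if f2a < 0:
        return "local max"
    if f2a > 0:
        return "local min"
    return "inconclusive"  # f''(a) = 0: the test says nothing by itself

# f(x) = x^3 - 3x, so f'(x) = 3x^2 - 3 vanishes at x = -1 and x = 1,
# and f''(x) = 6x
f2 = lambda x: 6 * x

assert classify(f2(-1)) == "local max"   # f''(-1) = -6 < 0
assert classify(f2(1)) == "local min"    # f''(1)  =  6 > 0
```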
In order to understand the statement above we need to define what we mean by
> A := matrix(2,2,[fxx,fxy,fyx,fyy]);
               [fxx  fxy]
          A := [        ]
               [fyx  fyy]
we say that A is positive (or positive definite) if,
> vector([a,b]) &* evalm(A) &* matrix(2,1,[a,b]) > 0;
                   [fxx  fxy]    [a]
     0 < [a, b] &* [        ] &* [ ]
                   [fyx  fyy]    [b]
and the inequality is true for all possible values of a and b as long as they are not both zero. Notice that this simplifies to,
> collect(evalm(vector([a,b]) &* evalm(A) &* matrix(2,1,[a,b]))[1],[a,b]) > 0;
     0 < fxx a^2 + (fyx + fxy) b a + b^2 fyy
if the matrix A is symmetric (i.e. when fyx = fxy) this simplifies even more to,
> subs(fyx=fxy,");
     0 < fxx a^2 + 2 fxy b a + b^2 fyy
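The definition can be probed numerically. Here is a sketch (the sample matrices are hypothetical, not from the worksheet): a positive definite symmetric A makes the quadratic form positive in every sampled direction, while an indefinite one takes both signs:

```python
def q(fxx, fxy, fyy, a, b):
    # the quadratic form [a, b] A [a, b]^T for symmetric A = [[fxx, fxy], [fxy, fyy]]
    return fxx * a**2 + 2 * fxy * a * b + fyy * b**2

samples = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -1.0), (-2.0, 3.0)]

# [[2, 1], [1, 3]] is positive definite: q > 0 in every sampled direction
assert all(q(2.0, 1.0, 3.0, a, b) > 0 for a, b in samples)

# [[1, 0], [0, -1]] is indefinite: q takes both signs
assert q(1.0, 0.0, -1.0, 1.0, 0.0) > 0
assert q(1.0, 0.0, -1.0, 0.0, 1.0) < 0
```

Of course sampling directions can only refute positivity, never prove it; the algebra below shows how to decide it exactly.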
Now notice that by factoring out fxx (assuming fxx is not zero) and completing the square we get the right hand side of the above inequality to be:
> fxx*( (a + (fxy/fxx)*b)^2 - ((fxy/fxx)*b)^2 + (fyy/fxx)*b^2);
     fxx ((a + (fxy/fxx) b)^2 - ((fxy/fxx) b)^2 + (fyy/fxx) b^2)
and the last two terms can be combined as,
> R := fxx*( (a + (fxy/fxx)*b)^2 + (1/fxx^2)*(fxx*fyy-fxy^2)*b^2);
     R := fxx ((a + (fxy/fxx) b)^2 + ((fxx fyy - fxy^2)/fxx^2) b^2)
so if we call DET the determinant of the matrix A (see above) when fxy = fyx,
> DET := fxx*fyy - fxy^2;
     DET := fxx fyy - fxy^2
we can now see that the sign of R is controlled by the sign of fxx,
provided that DET > 0. This will be useful later.
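Both facts can be sanity-checked numerically. A sketch (the sample coefficients are hypothetical): the complete-the-square identity behind R, and the claim that when DET > 0 the sign of R is the sign of fxx:

```python
def quad_form(fxx, fxy, fyy, a, b):
    # the symmetric quadratic form from above
    return fxx * a**2 + 2 * fxy * a * b + fyy * b**2

def R(fxx, fxy, fyy, a, b):
    # the completed-square form (requires fxx != 0)
    det = fxx * fyy - fxy**2
    return fxx * ((a + (fxy / fxx) * b)**2 + det * b**2 / fxx**2)

# (1) the identity quad_form = R holds for arbitrary sample values
for (fxx, fxy, fyy) in [(2.0, 1.0, 3.0), (-1.0, 2.0, -5.0)]:
    for (a, b) in [(1.0, 0.0), (0.5, -2.0), (-3.0, 4.0)]:
        assert abs(quad_form(fxx, fxy, fyy, a, b) - R(fxx, fxy, fyy, a, b)) < 1e-9

# (2) with DET > 0: fxx > 0 forces R > 0, fxx < 0 forces R < 0
assert quad_form(2.0, 1.0, 3.0, 0.5, -2.0) > 0    # DET = 5 > 0, fxx = 2 > 0
assert quad_form(-1.0, 2.0, -5.0, 0.5, -2.0) < 0  # DET = 1 > 0, fxx = -1 < 0
```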
Let us now prove the theorem above,
As we know from calc1, g(t) = f(x(t), y(t)) (where the path satisfies (x(0), y(0)) = (a,b)) will have a local max at t=0 provided that g'(0) = 0 and g"(0) < 0. From the chain rule we get,
> Dg := diff(f(x(t),y(t)),t);
     Dg := D[1](f)(x(t), y(t)) (d/dt x(t)) + D[2](f)(x(t), y(t)) (d/dt y(t))
where D[i](f) is Maple's notation for the partial derivative of f w.r.t. the ith variable. Here, i=1 means w.r.t. x and i=2 means w.r.t. y. If we denote by fx, fy, fxx, fxy, fyy the first and second order partial derivatives of f at (a,b), then the above expression at t=0 simplifies to,
> Dg0 := fx*u + fy*v;
Dg0 := fx u + fy v
where x'(0) = u and y'(0) = v. The only way in which g'(0) = 0 FOR ALL paths is that Dg0 = 0 for all (u,v)'s; in particular, when (u,v) = (fx,fy) we must have fx^2 + fy^2 = |(fx,fy)|^2 = 0 and thus fx = fy = 0. Let us now turn to the computation of g"(0). Taking the derivative of Dg w.r.t. t we get,
> D2g := diff(Dg,t);
     D2g :=
        (D[1,1](f)(x(t), y(t)) (d/dt x(t)) + D[1,2](f)(x(t), y(t)) (d/dt y(t))) (d/dt x(t))
        + D[1](f)(x(t), y(t)) (d^2/dt^2 x(t))
        + (D[1,2](f)(x(t), y(t)) (d/dt x(t)) + D[2,2](f)(x(t), y(t)) (d/dt y(t))) (d/dt y(t))
        + D[2](f)(x(t), y(t)) (d^2/dt^2 y(t))
this looks complicated but it is nothing but the product rule and the chain rule applied to the expression Dg. Now when t=0, the above expression simplifies (with our notation) to:
> D2g0 := (fxx*u + fxy*v)*u + (fxy*u+fyy*v)*v;
D2g0 := (fxx u + fxy v) u + (fxy u + fyy v) v
hey, what happened to the other two terms with the second derivatives of x(t) and y(t)? Well... recall that at t=0 we have fx = fy = 0, so the two terms multiplying those second derivatives are just 0. Notice also that since we are assuming that the second order partials are continuous at (a,b), we have fxy = fyx. Moreover, D2g0 is nothing but the expression R (above) since,
> sort(expand(D2g0),[u,v]);
     fxx u^2 + 2 fxy u v + fyy v^2
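The claim that g"(0) reduces to this quadratic form can be checked numerically. A sketch, using a hypothetical example of my own, f(x,y) = x^2 + 3xy + 2y^2, which has a critical point at (0,0) with fxx = 2, fxy = 3, fyy = 4: a finite-difference g"(0) along straight-line paths should match the quadratic form in (u,v).

```python
def f(x, y):
    # hypothetical example with a critical point at (0, 0)
    return x**2 + 3 * x * y + 2 * y**2

fxx, fxy, fyy = 2.0, 3.0, 4.0  # second-order partials of f at (0, 0), by hand

def g(t, u, v):
    # f restricted to the straight-line path x(t) = u t, y(t) = v t
    return f(u * t, v * t)

def second_derivative_at_0(u, v, h=1e-4):
    # central finite difference for g''(0)
    return (g(h, u, v) - 2 * g(0.0, u, v) + g(-h, u, v)) / h**2

for (u, v) in [(1.0, 0.0), (0.0, 1.0), (1.0, 2.0), (-2.0, 5.0)]:
    quad = fxx * u**2 + 2 * fxy * u * v + fyy * v**2
    assert abs(second_derivative_at_0(u, v) - quad) < 1e-5
```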
Hence, the sign of the above expression controls whether we are
at a local max, local min or a saddle point on the surface z=f(x,y).
This expression is nothing but R and thus, its sign is by definition
the sign of the matrix of second derivatives A (see above). So if
you understand "sign of the second derivative" to mean the sign of the
Hessian (by the way, this is the usual name for the matrix of
second derivatives of a function of several variables) then the calc1
theorem reads the same as the calc3 theorem! But in the calc3 version
there is the possibility that the Hessian is INDEFINITE, i.e. neither
positive, nor negative, nor zero. When the Hessian is not definite
there are two possibilities: either DET < 0, in which case we are
in the presence of a saddle point, or DET = 0, in which case we can't use
this theorem to find out local maxes, mins or saddle points.
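What we have shown can be sketched as a small Python routine (a sketch of mine, not part of the worksheet): given the second-order partials at a critical point, decide max, min, saddle, or inconclusive.

```python
def second_derivative_test(fxx, fxy, fyy):
    """Classify a critical point of f(x, y) from its second-order partials."""
    det = fxx * fyy - fxy**2       # determinant of the Hessian
    if det > 0:
        return "local min" if fxx > 0 else "local max"
    if det < 0:
        return "saddle point"
    return "inconclusive"          # det = 0: the test gives no information

# f(x, y) = x^2 + y^2 at (0, 0): fxx = fyy = 2, fxy = 0
assert second_derivative_test(2, 0, 2) == "local min"
# f(x, y) = -(x^2 + y^2) at (0, 0)
assert second_derivative_test(-2, 0, -2) == "local max"
# f(x, y) = x^2 - y^2 at (0, 0)
assert second_derivative_test(2, 0, -2) == "saddle point"
# f(x, y) = x^4 + y^4 at (0, 0): all second partials vanish
assert second_derivative_test(0, 0, 0) == "inconclusive"
```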
We can summarize what we have shown above in the following useful,