$$ \newcommand{\RR}{\mathbb{R}} \newcommand{\QQ}{\mathbb{Q}} \newcommand{\CC}{\mathbb{C}} \newcommand{\NN}{\mathbb{N}} \newcommand{\ZZ}{\mathbb{Z}} \newcommand{\EE}{\mathbb{E}} \newcommand{\HH}{\mathbb{H}} \newcommand{\SO}{\operatorname{SO}} \newcommand{\dist}{\operatorname{dist}} \newcommand{\length}{\operatorname{length}} \newcommand{\uppersum}[1]{{\textstyle\sum^+_{#1}}} \newcommand{\lowersum}[1]{{\textstyle\sum^-_{#1}}} \newcommand{\upperint}[1]{{\textstyle\smallint^+_{#1}}} \newcommand{\lowerint}[1]{{\textstyle\smallint^-_{#1}}} \newcommand{\rsum}[1]{{\textstyle\sum_{#1}}} \newcommand{\partitions}[1]{\mathcal{P}_{#1}} \newcommand{\erf}{\operatorname{erf}} \newcommand{\ihat}{\hat{\imath}} \newcommand{\jhat}{\hat{\jmath}} \newcommand{\khat}{\hat{k}} \newcommand{\pmat}[1]{\begin{pmatrix}#1\end{pmatrix}} \newcommand{\smat}[1]{\left(\begin{smallmatrix}#1\end{smallmatrix}\right)} $$

11  Linearization

11.1 The Fundamental Strategy of Calculus

Take a complicated function, zoom in, and replace it with something linear. For a function of two variables, this means replacing its graph with a plane. We have two formulas for planes (implicit and parametric), so we have two ways to do this.

First we will look at the more “implicit” approach and produce an equation of a plane of the form \(z=ax+by+c\) (this is just a rearrangement of the most general form from the previous chapters).

To start, look back at the tangent line formula from single variable calculus:

\[y=y_0+f^\prime(x_0)(x-x_0)\]

Our extension to multiple dimensions is just…to add more variables! We need to adjust \(z\) not just for changes in \(x\) anymore, but for changes in every input.

Theorem 11.1 (Tangent Plane) \[z=z_0+f_x(x_0,y_0)(x-x_0)+f_y(x_0,y_0)(y-y_0)\]

The same formula works in higher dimensions, by just adding more terms. Since this gives us an implicit plane, we can rearrange it into “standard form” and find the normal vector to the graph.

Theorem 11.2 (Normal Vector to a Graph) At the point \((x_0,y_0)\) the plane \[f_x(x_0,y_0)(x-x_0)+f_y(x_0,y_0)(y-y_0)-z=-z_0\] is tangent to the graph of \(f\). Thus, the normal vector is the coefficient vector \[n=\langle f_x(x_0,y_0),f_y(x_0,y_0),-1\rangle\]

Note that any scalar multiple of this vector is also normal to the graph; the formula just provides one such vector. And since its \(z\)-component is \(-1\), this is the downward-pointing normal vector.

The downward-facing normal, on two graphs.

Depending on the application, sometimes we want the upward-facing normal: that’s what you get by multiplying this vector by \(-1\), so that the last coordinate is \(1\):

The upward-facing normal.

We will use normal vectors to surfaces a lot in the last portion of this course, on Vector Analysis. There we will often need to be careful and think about whether we want the upward- or downward-pointing normal in a given application.

Example 11.1 Find the tangent plane to \(z=x^2+2y^2\) above the point \((x,y)=(2,3)\).
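As a quick numeric sketch of Example 11.1 (everything below follows directly from the example’s statement): compute \(z_0\) and the two partial derivatives at \((2,3)\), build the plane, and compare it with the graph at a nearby point.

```python
# Numeric sketch of Example 11.1: tangent plane to z = x^2 + 2y^2 at (2, 3).

def f(x, y):
    return x**2 + 2 * y**2

x0, y0 = 2.0, 3.0
z0 = f(x0, y0)      # f(2, 3) = 4 + 18 = 22
fx = 2 * x0         # partial derivative f_x = 2x, so f_x(2, 3) = 4
fy = 4 * y0         # partial derivative f_y = 4y, so f_y(2, 3) = 12

def tangent_plane(x, y):
    return z0 + fx * (x - x0) + fy * (y - y0)

# Near (2, 3), the plane closely matches the graph:
print(f(2.01, 3.01), tangent_plane(2.01, 3.01))
```

Shrinking the offset from \((2,3)\) makes the two printed values agree even more closely, which is exactly what “tangent” means.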

The second method we could use is to try to find a parametric equation for the plane. Again, we start by considering the one-dimensional case. A parametric line has the form \(vt+p\) for \(v\) the direction and \(p\) a point. How do we find the tangent line to a function \(y=f(x)\)? The derivative measures the slope, so if we go over by 1 unit in the \(x\)-direction we go up by \(f^\prime(x_0)\) units in the \(y\)-direction. Thus, the direction vector is

\[v=\langle 1, f^\prime(x_0)\rangle\]

Now all we need is the position: but that’s just \((x_0,f(x_0))\) on the graph! Putting these together we get the tangent line \[L(t)=\pmat{x_0\\ f(x_0)}+t\pmat{1\\ f^\prime(x_0)}\]
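This recipe is short enough to sketch in code. The example function \(f(x)=x^2\) at \(x_0=1\) below is an assumed illustration, not an example from the text:

```python
# Sketch of the parametric tangent line L(t) = (x0, f(x0)) + t * (1, f'(x0)),
# illustrated with the assumed example f(x) = x^2 at x0 = 1.

def f(x):
    return x**2

def fprime(x):
    return 2 * x

x0 = 1.0

def L(t):
    # point on the graph plus t times the direction vector (1, f'(x0))
    return (x0 + t, f(x0) + fprime(x0) * t)

# At t = 0 the line passes through the point of tangency:
print(L(0.0))   # (1.0, 1.0)
```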

What do we do for a function of two or more variables? We just…add more variables!

Theorem 11.3 (Parametric Tangent Plane) \[\vec{L}(s,t)=\begin{pmatrix}x_0\\y_0\\f(x_0,y_0)\end{pmatrix}+s\begin{pmatrix}1\\0\\f_x(x_0,y_0)\end{pmatrix}+t\begin{pmatrix}0\\1\\f_y(x_0,y_0)\end{pmatrix}\]

Example 11.2 Find a parametric equation for the tangent plane to the saddle \(x^2-y^2\) at the point \((1,1,0)\).
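A numeric sketch of Example 11.2, following the theorem’s formula: the partial derivatives of the saddle at \((1,1)\) fill in the third components of the two direction vectors, and for small \(s,t\) the plane stays close to the graph.

```python
# Numeric sketch of Example 11.2: parametric tangent plane to the
# saddle z = x^2 - y^2 at the point (1, 1, 0).

def f(x, y):
    return x**2 - y**2

x0, y0 = 1.0, 1.0
fx = 2 * x0     # f_x = 2x, so f_x(1, 1) = 2
fy = -2 * y0    # f_y = -2y, so f_y(1, 1) = -2

def L(s, t):
    # (x0, y0, f(x0, y0)) + s * (1, 0, fx) + t * (0, 1, fy)
    return (x0 + s, y0 + t, f(x0, y0) + fx * s + fy * t)

# For small s, t the plane's height is close to the graph's height:
s, t = 0.01, -0.02
x, y, z = L(s, t)
print(z, f(x, y))
```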

11.1.1 Differentiability

Partial derivatives give us a way to test whether a multivariate function is differentiable.

Definition 11.1 (Multivariate Differentiability) A function \(f\) is differentiable at a point \(p\) if its tangent plane at \(p\) is a good approximation to its graph near \(p\).

Most functions we will see are differentiable, but be warned: there are functions that are not. Luckily, there’s an easy way to check.

Theorem 11.4 (Multivariate Differentiability) If all of the partial derivatives of a multivariate function exist and are continuous at a point, then the function is differentiable at that point. (The converse is not true: a function can be differentiable at a point where its partial derivatives are not continuous.)

11.2 Differentials

The Fundamental Strategy of Calculus is to take a complicated nonlinear object (like a function you encounter in some real-world problem) and zoom in until it looks linear. Here, this zooming-in process is realized by finding the tangent plane. Close to the point \((x_0,y_0)\), the graph of the function \(z=f(x,y)\) looks like \[L(x,y)=z_0+f_x(x_0,y_0)(x-x_0)+f_y(x_0,y_0)(y-y_0)\]

where \(z_0=f(x_0,y_0)\). This is just a rehash of our definition of the tangent plane, of course; but one use for it is approximating the value of \(f(x,y)\) when you know the value of \(f\) and its partial derivatives at a nearby point \((x_0,y_0)\).

Example 11.3 Find the approximate value of \(x^2+3xy-y^2\) at the point \((2.05,2.96)\).
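Since \(2.05\) and \(2.96\) are close to \(2\) and \(3\), a natural linearization point is \((x_0,y_0)=(2,3)\); here is a numeric sketch under that assumption.

```python
# Numeric sketch of Example 11.3: linearize x^2 + 3xy - y^2 at the
# assumed nearby nice point (x0, y0) = (2, 3).

def f(x, y):
    return x**2 + 3 * x * y - y**2

x0, y0 = 2.0, 3.0
fx = 2 * x0 + 3 * y0    # f_x = 2x + 3y, so f_x(2, 3) = 13
fy = 3 * x0 - 2 * y0    # f_y = 3x - 2y, so f_y(2, 3) = 0

def L(x, y):
    return f(x0, y0) + fx * (x - x0) + fy * (y - y0)

# Linear estimate vs. the exact value:
print(L(2.05, 2.96), f(2.05, 2.96))
```

The estimate \(13 + 13(0.05) + 0(-0.04) = 13.65\) is within about \(0.005\) of the exact value \(13.6449\), with no hard arithmetic needed.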

Using linearization to estimate changes in a value is fundamental to physics and engineering. In one dimension, we define a variable \(dx\) that we think of as measuring small changes in the input, and \(dy=y-y_0\), which measures small changes in the output. These are related by

\[dy=f^\prime(x)dx\]

So, any change in the input is multiplied by the derivative to give a change in the output. We can do a similar thing in more variables. For a function \(f(x,y)\), we have the tangent plane above \[z=z_0+f_x(x_0,y_0)(x-x_0)+f_y(x_0,y_0)(y-y_0)\]

Subtracting \(z_0\), setting \(dz=z-z_0\), and doing similarly for \(dx\) and \(dy\), we can rewrite this as follows:

Definition 11.2 (Differentials) \[dz=f_x(x,y)dx+f_y(x,y)dy\]

This lets us easily estimate how much \(z\) can change if we know how much \(x\) and \(y\) can change. This is of fundamental importance in error analysis, which underlies every experimental science’s ability to compare measurements with theoretical predictions.

Example 11.4 The volume of a cone is given by \(V=\pi r^2 h/3\). We measure a cone’s height to be 10 cm and its radius to be 25 cm, but our measuring device can be off by up to 1 mm, or \(0.1\) cm, in each measurement. What is the estimated maximal error in the volume we compute?

This estimate can also be interpreted geometrically: it is the approximate volume of a thin shell of thickness 1 mm around a cone of radius 25 cm and height 10 cm.
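A numeric sketch of Example 11.4, applying \(dV=V_r\,dr+V_h\,dh\) with the stated measurements and errors:

```python
import math

# Numeric sketch of Example 11.4: error estimate for V = pi * r^2 * h / 3
# with r = 25 cm, h = 10 cm, and measurement error dr = dh = 0.1 cm.

r, h = 25.0, 10.0
dr = dh = 0.1

V_r = 2 * math.pi * r * h / 3    # partial derivative of V with respect to r
V_h = math.pi * r**2 / 3         # partial derivative of V with respect to h

dV = V_r * dr + V_h * dh         # differential estimate of the volume error
print(dV)                        # 37.5 * pi, about 117.8 cubic cm
```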

Example 11.5 The dimensions of a rectangular box are measured to be 75 cm, 60 cm, and 40 cm, each correct to within \(0.05\) cm. What is the maximal error we might expect in the computed volume?
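A numeric sketch of Example 11.5, using the three-variable differential \(dV=V_x\,dx+V_y\,dy+V_z\,dz\) for \(V=xyz\):

```python
# Numeric sketch of Example 11.5: error in V = x * y * z for a box
# measured at 75 cm x 60 cm x 40 cm, each dimension within 0.05 cm.

x, y, z = 75.0, 60.0, 40.0
dx = dy = dz = 0.05

# Partial derivatives of V = xyz:
V_x = y * z
V_y = x * z
V_z = x * y

dV = V_x * dx + V_y * dy + V_z * dz   # differential estimate of the error
print(dV)                             # about 495 cubic cm
```

A possible error of roughly 495 cm³ sounds large, but it is small relative to the total volume of 180,000 cm³: about a quarter of a percent.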

11.3 Videos

11.3.1 Calculus Blue

11.3.2 Khan Academy

11.3.3 Example Problems