# Kantorovich theorem

The Kantorovich theorem is a mathematical statement on the convergence of Newton's method. It was first stated by Leonid Kantorovich in 1940.

Newton's method constructs a sequence of points that, under favorable conditions, converges to a solution ${\displaystyle x}$ of an equation ${\displaystyle f(x)=0}$ or to a vector solution of a system of equations ${\displaystyle F(x)=0}$. The Kantorovich theorem gives conditions on the initial point of this sequence. If those conditions are satisfied, then a solution exists close to the initial point and the sequence converges to that solution.

## Assumptions

Let ${\displaystyle X\subset \mathbb {R} ^{n}}$ be an open subset and ${\displaystyle F:\mathbb {R} ^{n}\supset X\to \mathbb {R} ^{n}}$ a differentiable function with a Jacobian ${\displaystyle F^{\prime }(x)}$ that is locally Lipschitz continuous (for instance if it is twice continuously differentiable). That is, it is assumed that for any open subset ${\displaystyle U\subset X}$ there exists a constant ${\displaystyle L>0}$ such that for any ${\displaystyle \mathbf {x} ,\mathbf {y} \in U}$

${\displaystyle \|F'(\mathbf {x} )-F'(\mathbf {y} )\|\leq L\;\|\mathbf {x} -\mathbf {y} \|}$

holds. The norm on the left is some operator norm that is compatible with the vector norm on the right. This inequality can be rewritten to only use the vector norm. Then for any vector ${\displaystyle v\in \mathbb {R} ^{n}}$ the inequality

${\displaystyle \|F'(\mathbf {x} )(v)-F'(\mathbf {y} )(v)\|\leq L\;\|\mathbf {x} -\mathbf {y} \|\,\|v\|}$

must hold.
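The Lipschitz constant is rarely available in closed form. As a rough illustration only, the sketch below samples random pairs of points in a box and records the largest observed ratio ${\displaystyle \|F'(\mathbf {x} )-F'(\mathbf {y} )\|/\|\mathbf {x} -\mathbf {y} \|}$. The function name `sampled_lipschitz_estimate`, the example Jacobian `J`, and the choice of the spectral norm (which is compatible with the Euclidean vector norm) are assumptions made here, not part of the theorem. Sampling can only give a lower bound on ${\displaystyle L}$, so it may suggest a plausible value but does not prove one.

```python
import numpy as np

def sampled_lipschitz_estimate(J, low, high, n_pairs=200, seed=0):
    """Largest sampled ratio ||J(x) - J(y)||_2 / ||x - y||_2 over a box.

    This is only a lower bound on the true Lipschitz constant.
    """
    rng = np.random.default_rng(seed)
    dim = len(low)
    best = 0.0
    for _ in range(n_pairs):
        x = rng.uniform(low, high, size=dim)
        y = rng.uniform(low, high, size=dim)
        dist = np.linalg.norm(x - y)
        if dist == 0.0:
            continue  # skip degenerate pairs
        ratio = np.linalg.norm(J(x) - J(y), 2) / dist
        best = max(best, ratio)
    return best

# Hypothetical example: Jacobian of F(x, y) = (x^2 + y^2 - 4, x - y).
def J(v):
    x, y = v
    return np.array([[2.0 * x, 2.0 * y], [1.0, -1.0]])

est = sampled_lipschitz_estimate(J, [1.0, 1.0], [2.0, 2.0])
```

For this particular Jacobian the difference ${\displaystyle F'(\mathbf {x} )-F'(\mathbf {y} )}$ has a single nonzero row, so every sampled ratio equals 2 up to rounding and the estimate coincides with the exact constant.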

Now choose any initial point ${\displaystyle \mathbf {x} _{0}\in X}$. Assume that ${\displaystyle F'(\mathbf {x} _{0})}$ is invertible and construct the Newton step ${\displaystyle \mathbf {h} _{0}=-F'(\mathbf {x} _{0})^{-1}F(\mathbf {x} _{0}).}$

The next assumption is that not only the next point ${\displaystyle \mathbf {x} _{1}=\mathbf {x} _{0}+\mathbf {h} _{0}}$ but the entire ball ${\displaystyle B(\mathbf {x} _{1},\|\mathbf {h} _{0}\|)}$ is contained inside the set ${\displaystyle X}$. Let ${\displaystyle M\leq L}$ be the Lipschitz constant for the Jacobian over this ball.

As a last preparation, construct recursively, as long as it is possible, the sequences ${\displaystyle (\mathbf {x} _{k})_{k}}$, ${\displaystyle (\mathbf {h} _{k})_{k}}$, ${\displaystyle (\alpha _{k})_{k}}$ according to

${\displaystyle {\begin{alignedat}{2}\mathbf {h} _{k}&=-F'(\mathbf {x} _{k})^{-1}F(\mathbf {x} _{k})\\[0.4em]\alpha _{k}&=M\,\|F'(\mathbf {x} _{k})^{-1}\|\,\|\mathbf {h} _{k}\|\\[0.4em]\mathbf {x} _{k+1}&=\mathbf {x} _{k}+\mathbf {h} _{k}.\end{alignedat}}}$
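The recursion above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the helper name `kantorovich_sequences`, the test system `F` with Jacobian `J`, and the starting point are all chosen here for demonstration, the constant `M` must be supplied by the caller, and the operator norm is taken to be the spectral norm paired with the Euclidean vector norm.

```python
import numpy as np

def kantorovich_sequences(F, J, x0, M, steps=6):
    """Return the lists (x_k), (h_k), (alpha_k) for `steps` Newton steps.

    F, J : callables returning F(x) and the Jacobian F'(x)
    M    : Lipschitz constant of the Jacobian, supplied by the caller
    """
    x = np.asarray(x0, dtype=float)
    xs, hs, alphas = [x], [], []
    for _ in range(steps):
        Jx = J(x)
        h = -np.linalg.solve(Jx, F(x))                   # Newton step h_k
        inv_norm = np.linalg.norm(np.linalg.inv(Jx), 2)  # ||F'(x_k)^{-1}||
        alphas.append(M * inv_norm * np.linalg.norm(h))  # alpha_k
        hs.append(h)
        x = x + h                                        # x_{k+1}
        xs.append(x)
    return xs, hs, alphas

# Hypothetical test system: circle of radius 2 intersected with the line x = y.
def F(v):
    x, y = v
    return np.array([x**2 + y**2 - 4.0, x - y])

def J(v):
    x, y = v
    return np.array([[2.0 * x, 2.0 * y], [1.0, -1.0]])

# For this Jacobian the Lipschitz constant is 2 in the spectral norm.
xs, hs, alphas = kantorovich_sequences(F, J, [1.5, 1.5], M=2.0)
```

For this example the first value is ${\displaystyle \alpha _{0}=1/6}$, and the iterates approach ${\displaystyle ({\sqrt {2}},{\sqrt {2}})}$. The explicit inverse is formed only to evaluate the norm; the step itself is computed with a linear solve.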

## Statement

Now if ${\displaystyle \alpha _{0}\leq {\tfrac {1}{2}}}$, then

1. a solution ${\displaystyle \mathbf {x} ^{*}}$ of ${\displaystyle F(\mathbf {x} ^{*})=0}$ exists inside the closed ball ${\displaystyle {\bar {B}}(\mathbf {x} _{1},\|\mathbf {h} _{0}\|)}$ and
2. the Newton iteration starting in ${\displaystyle \mathbf {x} _{0}}$ converges to ${\displaystyle \mathbf {x} ^{*}}$ with at least linear order of convergence.
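
As a small worked illustration, with a function and starting point chosen here rather than taken from the text, consider the scalar case ${\displaystyle f(x)=x^{2}-2}$ with ${\displaystyle x_{0}={\tfrac {3}{2}}}$ and the usual Kantorovich hypothesis ${\displaystyle \alpha _{0}\leq {\tfrac {1}{2}}}$. Since ${\displaystyle f'(x)=2x}$, the derivative is Lipschitz continuous with constant ${\displaystyle M=2}$, and

${\displaystyle {\begin{aligned}h_{0}&=-{\frac {f(x_{0})}{f'(x_{0})}}=-{\frac {1/4}{3}}=-{\tfrac {1}{12}},\\[0.4em]x_{1}&=x_{0}+h_{0}={\tfrac {17}{12}},\\[0.4em]\alpha _{0}&=M\,|f'(x_{0})^{-1}|\,|h_{0}|=2\cdot {\tfrac {1}{3}}\cdot {\tfrac {1}{12}}={\tfrac {1}{18}}\leq {\tfrac {1}{2}}.\end{aligned}}}$

The hypothesis holds, and the closed ball ${\displaystyle {\bar {B}}(x_{1},|h_{0}|)=[{\tfrac {4}{3}},{\tfrac {3}{2}}]}$ does contain the root ${\displaystyle {\sqrt {2}}\approx 1.4142}$.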

A statement that is more precise but slightly more difficult to prove uses the roots ${\displaystyle t^{\ast }\leq t^{**}}$ of the quadratic polynomial

${\displaystyle p(t)=\left({\tfrac {1}{2}}M\|F'(\mathbf {x} _{0})^{-1}\|\right)t^{2}-t+\|\mathbf {h} _{0}\|}$,

namely, with ${\displaystyle \alpha =\alpha _{0}=M\,\|F'(\mathbf {x} _{0})^{-1}\|\,\|\mathbf {h} _{0}\|}$,

${\displaystyle t^{\ast /**}={\frac {2\|\mathbf {h} _{0}\|}{1\pm {\sqrt {1-2\alpha }}}}}$

and their ratio

${\displaystyle \theta ={\frac {t^{*}}{t^{**}}}={\frac {1-{\sqrt {1-2\alpha }}}{1+{\sqrt {1-2\alpha }}}}.}$

Then

1. a solution ${\displaystyle \mathbf {x} ^{*}}$ exists inside the closed ball ${\displaystyle {\bar {B}}(\mathbf {x} _{1},\theta \|\mathbf {h} _{0}\|)\subset {\bar {B}}(\mathbf {x} _{0},t^{*})}$
2. it is unique inside the bigger ball ${\displaystyle B(\mathbf {x} _{0},t^{*\ast })}$
3. and the convergence of the Newton iteration for ${\displaystyle F}$ is dominated by the convergence of the Newton iteration of the quadratic polynomial ${\displaystyle p(t)}$ towards its smallest root ${\displaystyle t^{\ast }}$:[1] if ${\displaystyle t_{0}=0,\,t_{k+1}=t_{k}-{\tfrac {p(t_{k})}{p'(t_{k})}}}$, then for all ${\displaystyle k,m\geq 0}$
${\displaystyle \|\mathbf {x} _{k+m}-\mathbf {x} _{k}\|\leq t_{k+m}-t_{k}.}$
4. The quadratic convergence is obtained from the error estimate[2]
${\displaystyle \|\mathbf {x} _{n+1}-\mathbf {x} ^{*}\|\leq \theta ^{2^{n}}\|\mathbf {x} _{n+1}-\mathbf {x} _{n}\|\leq {\frac {\theta ^{2^{n}}}{2^{n}}}\|\mathbf {h} _{0}\|.}$
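
The domination of the Newton steps by the scalar iteration can be checked numerically. The sketch below is an illustration under assumptions made here: the test system `F` with Jacobian `J`, the starting point, the value `M = 2.0`, and the spectral norm are all chosen for demonstration. The leading coefficient of the quadratic is set to ${\displaystyle \alpha _{0}/(2\|\mathbf {h} _{0}\|)}$, which is the value that makes its roots equal to ${\displaystyle t^{\ast }}$ and ${\displaystyle t^{**}}$ as given by the root formula above.

```python
import numpy as np

# Hypothetical test system: circle of radius 2 intersected with the line x = y.
def F(v):
    x, y = v
    return np.array([x**2 + y**2 - 4.0, x - y])

def J(v):
    x, y = v
    return np.array([[2.0 * x, 2.0 * y], [1.0, -1.0]])

M = 2.0                                   # Lipschitz constant of J (spectral norm)
x0 = np.array([1.5, 1.5])

inv_norm0 = np.linalg.norm(np.linalg.inv(J(x0)), 2)
h0_norm = np.linalg.norm(np.linalg.solve(J(x0), F(x0)))
alpha0 = M * inv_norm0 * h0_norm

root = np.sqrt(1.0 - 2.0 * alpha0)        # real because alpha0 <= 1/2
t_star = 2.0 * h0_norm / (1.0 + root)     # smaller root of p
theta = (1.0 - root) / (1.0 + root)

# Quadratic whose roots are t_star and 2*h0_norm/(1 - root).
a = alpha0 / (2.0 * h0_norm)
p = lambda t: a * t * t - t + h0_norm
dp = lambda t: 2.0 * a * t - 1.0

x, t = x0.copy(), 0.0
dominated = []
for _ in range(6):
    x_next = x - np.linalg.solve(J(x), F(x))   # Newton step for F
    t_next = t - p(t) / dp(t)                  # Newton step for p
    # Compare step lengths; the small tolerance absorbs rounding error.
    dominated.append(np.linalg.norm(x_next - x) <= (t_next - t) + 1e-12)
    x, t = x_next, t_next
```

In this example every entry of `dominated` is true, the scalar iterates approach `t_star`, and the limit of the vector iteration lies within distance `t_star` of the starting point, in line with the statements above. A single example is a sanity check, not a proof.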

## Notes

1. {{#invoke:Citation/CS1|citation |CitationClass=journal }}
2. {{#invoke:Citation/CS1|citation |CitationClass=journal }}

## Literature

• Kantorovich, L. (1948): Functional analysis and applied mathematics (in Russian). UMN 3, 6 (28), 89–185.
• Kantorovich, L. V.; Akilov, G. P. (1964): Functional analysis in normed spaces.
• Deuflhard, P. (2004): Newton Methods for Nonlinear Problems. Affine Invariance and Adaptive Algorithms. Springer, Berlin, ISBN 3-540-21099-7 (Springer Series in Computational Mathematics, Vol. 35).