# Difference between revisions of "Diagonalizable matrix"

en>Giftlite m (mv elinks down) |
|||

Line 1: | Line 1: | ||

− | In [[linear algebra]], a [[square matrix]] ''A'' is called '''diagonalizable''' if it is [[similar (linear algebra)|similar]] to a [[diagonal matrix]], i.e., if there exists an [[invertible matrix]] ''P'' such that ''P''<sup> | + | In [[linear algebra]], a [[square matrix]] ''A'' is called '''diagonalizable''' if it is [[similar (linear algebra)|similar]] to a [[diagonal matrix]], i.e., if there exists an [[invertible matrix]] ''P'' such that ''P''<sup>−1</sup>''AP'' is a diagonal matrix. If ''V'' is a finite-[[dimension (linear algebra)|dimension]]al [[vector space]], then a [[linear operator|linear map]] ''T'' : ''V'' → ''V'' is called '''diagonalizable''' if there exists an [[Basis (linear algebra)#Ordered bases and coordinates|ordered basis]] of ''V'' with respect to which ''T'' is represented by a diagonal matrix. '''Diagonalization''' is the process of finding a corresponding [[diagonal matrix]] for a diagonalizable matrix or linear map.<ref>Horn & Johnson 1985</ref> A square matrix which is not diagonalizable is called ''[[Defective matrix|defective]].'' |

− | Diagonalizable matrices and maps are of interest because diagonal matrices are especially easy to handle: | + | Diagonalizable matrices and maps are of interest because diagonal matrices are especially easy to handle: [[Transformation_matrix#Eigenbasis_and_diagonal_matrix|their eigenvalues and eigenvectors are known]] and one can raise a diagonal matrix to a power by simply raising the diagonal entries to that same power. Geometrically, a diagonalizable matrix is an '''[[inhomogeneous dilation]]''' (or ''anisotropic scaling'') — it [[Scaling (geometry)|scales]] the space, as does a ''[[homogeneous dilation]]'', but by a different factor in each direction, determined by the scale factors on each axis (diagonal entries). |

− | == | + | == Characterization == |

The fundamental fact about diagonalizable maps and matrices is expressed by the following: | The fundamental fact about diagonalizable maps and matrices is expressed by the following: | ||

− | * An ''n'' | + | * An ''n''×''n'' matrix ''A'' over the [[field (mathematics)|field]] ''F'' is diagonalizable [[if and only if]] the sum of the [[dimension (linear algebra)|dimension]]s of its eigenspaces is equal to ''n'', which is the case if and only if there exists a [[basis (linear algebra)|basis]] of ''F''<sup>''n''</sup> consisting of eigenvectors of ''A''. If such a basis has been found, one can form the matrix ''P'' having these basis vectors as columns, and ''P''<sup>−1</sup>''AP'' will be a diagonal matrix. The diagonal entries of this matrix are the eigenvalues of ''A''. |

− | * A linear map ''T'': ''V'' → ''V'' is diagonalizable if and only if the sum of the [[dimension (linear algebra)|dimension]]s of its eigenspaces is equal to dim(''V''), which is the case if and only if there exists a basis of ''V'' consisting of eigenvectors of ''T''. With respect to such a basis, ''T'' will be represented by a diagonal matrix. The diagonal entries of this matrix are the eigenvalues of ''T''. | + | * A linear map ''T'' : ''V'' → ''V'' is diagonalizable if and only if the sum of the [[dimension (linear algebra)|dimension]]s of its eigenspaces is equal to dim(''V''), which is the case if and only if there exists a basis of ''V'' consisting of eigenvectors of ''T''. With respect to such a basis, ''T'' will be represented by a diagonal matrix. The diagonal entries of this matrix are the eigenvalues of ''T''. |

Another characterization: A matrix or linear map is diagonalizable over the field ''F'' if and only if its [[minimal polynomial (linear algebra)|minimal polynomial]] is a product of distinct linear factors over ''F''. (Put in another way, a matrix is diagonalizable if and only if all of its [[elementary divisor]]s are linear.) | Another characterization: A matrix or linear map is diagonalizable over the field ''F'' if and only if its [[minimal polynomial (linear algebra)|minimal polynomial]] is a product of distinct linear factors over ''F''. (Put in another way, a matrix is diagonalizable if and only if all of its [[elementary divisor]]s are linear.) | ||

Line 13: | Line 13: | ||

The following sufficient (but not necessary) condition is often useful. | The following sufficient (but not necessary) condition is often useful. | ||

− | * An ''n'' | + | * An ''n''×''n'' matrix ''A'' is diagonalizable over the field ''F'' if it has ''n'' distinct eigenvalues in ''F'', i.e. if its [[characteristic polynomial]] has ''n'' distinct roots in ''F''; however, the converse may be false. Let us consider |

:: <math>\begin{bmatrix} -1 & 3 & -1 \\ -3 & 5 & -1 \\ -3 & 3 & 1 \end{bmatrix},</math> | :: <math>\begin{bmatrix} -1 & 3 & -1 \\ -3 & 5 & -1 \\ -3 & 3 & 1 \end{bmatrix},</math> | ||

− | : which has eigenvalues 1, 2, 2 (not all distinct) and is diagonalizable with diagonal form ( [[similar (linear algebra)|similar]] to A) | + | : which has eigenvalues 1, 2, 2 (not all distinct) and is diagonalizable with diagonal form ( [[similar (linear algebra)|similar]] to ''A'') |

:: <math>\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}</math> | :: <math>\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}</math> | ||

− | : and [[change of basis|change of basis matrix]] P | + | : and [[change of basis|change of basis matrix]] ''P'' |

:: <math>\begin{bmatrix} 1 & 1 & -1 \\ 1 & 1 & 0 \\ 1 & 0 & 3 \end{bmatrix}.</math> | :: <math>\begin{bmatrix} 1 & 1 & -1 \\ 1 & 1 & 0 \\ 1 & 0 & 3 \end{bmatrix}.</math> | ||

+ | : The converse fails when ''A'' has an eigenspace of dimension higher than 1. In this example, the eigenspace of ''A'' associated with the eigenvalue 2 has dimension 2. | ||

− | * A linear map ''T'': ''V'' → ''V'' with ''n'' = dim(''V'') is diagonalizable if it has ''n'' distinct eigenvalues, i.e. if its characteristic polynomial has ''n'' distinct roots in ''F''. | + | * A linear map ''T'' : ''V'' → ''V'' with ''n'' = dim(''V'') is diagonalizable if it has ''n'' distinct eigenvalues, i.e. if its characteristic polynomial has ''n'' distinct roots in ''F''. |

− | Let ''A'' be a matrix over ''F''. If ''A'' is diagonalizable, then so is any power of it. Conversely, if ''A'' is invertible, ''F'' is algebraically closed, and ''A<sup>n</sup>'' is diagonalizable for some ''n'' that is not an integer multiple of the characteristic of ''F'', then ''A'' is diagonalizable. Proof: If < | + | Let ''A'' be a matrix over ''F''. If ''A'' is diagonalizable, then so is any power of it. Conversely, if ''A'' is invertible, ''F'' is algebraically closed, and ''A<sup>n</sup>'' is diagonalizable for some ''n'' that is not an integer multiple of the characteristic of ''F'', then ''A'' is diagonalizable. Proof: If ''A<sup>n</sup>'' is diagonalizable, then ''A'' is annihilated by some polynomial <math>(x^n - \lambda_1) \cdots (x^n - \lambda_k)</math>, which has no multiple root (since <math>\lambda_j \ne 0</math>) and is divided by the minimal polynomial of ''A''. |

− | As a rule of thumb, over '''C''' almost every matrix is diagonalizable. More precisely: the set of complex ''n'' | + | As a rule of thumb, over '''C''' almost every matrix is diagonalizable. More precisely: the set of complex ''n''×''n'' matrices that are ''not'' diagonalizable over '''C''', considered as a [[subset]] of '''C'''<sup>''n''×''n''</sup>, has [[Lebesgue measure]] zero. One can also say that the diagonalizable matrices form a dense subset with respect to the [[Zariski topology]]: the complement lies inside the set where the [[discriminant]] of the characteristic polynomial vanishes, which is a [[hypersurface]]. From that follows also density in the usual (''strong'') topology given by a [[norm (mathematics)|norm]]. The same is not true over '''R'''. |

− | The [[Jordan–Chevalley decomposition]] expresses an operator as the sum of its semisimple (i.e., diagonalizable) part and its [[nilpotent]] part. Hence, a matrix is diagonalizable if and only if its nilpotent part is zero. Put in another way, a matrix is diagonalizable if each block in its [[Jordan form]] has no nilpotent part; i.e., one-by-one matrix. | + | The [[Jordan–Chevalley decomposition]] expresses an operator as the sum of its semisimple (i.e., diagonalizable) part and its [[nilpotent]] part. Hence, a matrix is diagonalizable if and only if its nilpotent part is zero. Put in another way, a matrix is diagonalizable if each block in its [[Jordan form]] has no nilpotent part; i.e., each "block" is a one-by-one matrix. |

== Diagonalization == | == Diagonalization == | ||

If a matrix ''A'' can be diagonalized, that is, | If a matrix ''A'' can be diagonalized, that is, | ||

+ | |||

:<math>P^{-1}AP=\begin{pmatrix}\lambda_{1}\\ | :<math>P^{-1}AP=\begin{pmatrix}\lambda_{1}\\ | ||

& \lambda_{2}\\ | & \lambda_{2}\\ | ||

& & \ddots\\ | & & \ddots\\ | ||

− | & & & \lambda_{n}\end{pmatrix} | + | & & & \lambda_{n}\end{pmatrix},</math> |

− | ,</math> | + | |

then: | then: | ||

+ | |||

:<math>AP=P\begin{pmatrix}\lambda_{1}\\ | :<math>AP=P\begin{pmatrix}\lambda_{1}\\ | ||

& \lambda_{2}\\ | & \lambda_{2}\\ | ||

& & \ddots\\ | & & \ddots\\ | ||

& & & \lambda_{n}\end{pmatrix} .</math> | & & & \lambda_{n}\end{pmatrix} .</math> | ||

− | Writing ''P'' as a [[block matrix]] of its column vectors | + | |

+ | Writing ''P'' as a [[block matrix]] of its column vectors <math>\vec{\alpha}_{i}</math> | ||

+ | |||

:<math>P=\begin{pmatrix}\vec{\alpha}_{1} & \vec{\alpha}_{2} & \cdots & \vec{\alpha}_{n}\end{pmatrix},</math> | :<math>P=\begin{pmatrix}\vec{\alpha}_{1} & \vec{\alpha}_{2} & \cdots & \vec{\alpha}_{n}\end{pmatrix},</math> | ||

+ | |||

the above equation can be rewritten as | the above equation can be rewritten as | ||

+ | |||

:<math>A\vec{\alpha}_{i}=\lambda_{i}\vec{\alpha}_{i}\qquad(i=1,2,\cdots,n).</math> | :<math>A\vec{\alpha}_{i}=\lambda_{i}\vec{\alpha}_{i}\qquad(i=1,2,\cdots,n).</math> | ||

− | |||

− | When the matrix ''A'' is a [[Hermitian matrix]] (resp. [[symmetric matrix]]), eigenvectors of ''A'' can be chosen to form an [[orthonormal basis]] of '''C'''<sup>n</sup> (resp. '''R'''<sup>n</sup>). Under such circumstance ''P'' will be a [[unitary matrix]] (resp. [[orthogonal matrix]]) and ''P''<sup>−1</sup> equals the [[conjugate transpose]] (resp. [[transpose]]) of ''P''. | + | So the column vectors of ''P'' are [[right eigenvector]]s of ''A'', and the corresponding diagonal entry is the corresponding [[eigenvalue]]. The invertibility of ''P'' also suggests that the eigenvectors are [[linearly independent]] and form a basis of ''F''<sup>''n''</sup>. This is the necessary and sufficient condition for diagonalizability and the canonical approach of diagonalization. The row vectors of ''P''<sup>−1</sup> are the [[left eigenvector]]s of ''A''. |

+ | |||

+ | When the matrix ''A'' is a [[Hermitian matrix]] (resp. [[symmetric matrix]]), eigenvectors of ''A'' can be chosen to form an [[orthonormal basis]] of '''C'''<sup>''n''</sup> (resp. '''R'''<sup>''n''</sup>). Under such circumstance ''P'' will be a [[unitary matrix]] (resp. [[orthogonal matrix]]) and ''P''<sup>−1</sup> equals the [[conjugate transpose]] (resp. [[transpose]]) of ''P''. | ||

== Simultaneous diagonalization == | == Simultaneous diagonalization == | ||

− | {{See also|Triangular matrix#Simultaneous triangularisability|l1=Simultaneous triangularisability|Weight (representation theory)}} | + | {{See also|Triangular matrix#Simultaneous triangularisability|l1=Simultaneous triangularisability|Weight (representation theory)|Positive definite matrix#Simultaneous_diagonalization|l3=Positive definite matrix}} |

− | A set of matrices are said to be ''simultaneously diagonalisable'' if there exists a single invertible matrix ''P'' such that < | + | A set of matrices are said to be ''simultaneously diagonalisable'' if there exists a single invertible matrix ''P'' such that ''P''<sup>−1</sup>''AP'' is a diagonal matrix for every ''A'' in the set. The following theorem characterises simultaneously diagonalisable matrices: A set of diagonalizable [[Commuting matrices|matrices commutes]] if and only if the set is simultaneously diagonalisable.<ref>Horn & Johnson 1985, pp. 51–53</ref> |

+ | |||

+ | The set of all ''n''×''n'' diagonalisable matrices (over '''C''') with ''n'' > 1 is not simultaneously diagonalisable. For instance, the matrices | ||

− | |||

:<math> \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} </math> | :<math> \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} </math> | ||

+ | |||

are diagonalizable but not simultaneously diagonalizable because they do not commute. | are diagonalizable but not simultaneously diagonalizable because they do not commute. | ||

− | A set consists of commuting [[normal matrix|normal matrices]] if and only if it is simultaneously diagonalisable by a [[unitary matrix]]; that is, there exists a unitary matrix ''U'' such that | + | A set consists of commuting [[normal matrix|normal matrices]] if and only if it is simultaneously diagonalisable by a [[unitary matrix]]; that is, there exists a unitary matrix ''U'' such that ''U*AU'' is diagonal for every ''A'' in the set. |

In the language of [[Lie theory]], a set of simultaneously diagonalisable matrices generate a [[toral Lie algebra]]. | In the language of [[Lie theory]], a set of simultaneously diagonalisable matrices generate a [[toral Lie algebra]]. | ||

Line 63: | Line 73: | ||

== Examples == | == Examples == | ||

=== Diagonalizable matrices === | === Diagonalizable matrices === | ||

− | * [[Involution (mathematics)|Involution]]s are diagonalisable over the reals (and indeed any field of characteristic not 2), with | + | * [[Involution (mathematics)|Involution]]s are diagonalisable over the reals (and indeed any field of characteristic not 2), with ±1 on the diagonal |

− | * Finite order [[endomorphism]]s are diagonalisable over | + | * Finite order [[endomorphism]]s are diagonalisable over '''C''' (or any algebraically closed field where the characteristic of the field does not divide the order of the endomorphism) with [[roots of unity]] on the diagonal. This follows since the minimal polynomial is [[separable polynomial|separable]], because the roots of unity are distinct. |

* [[Projection (linear algebra)|Projections]] are diagonalizable, with 0's and 1's on the diagonal. | * [[Projection (linear algebra)|Projections]] are diagonalizable, with 0's and 1's on the diagonal. | ||

− | * Real [[symmetric matrices]] are diagonalizable by [[orthogonal matrix|orthogonal matrices]]; i.e., given a real symmetric matrix | + | * Real [[symmetric matrices]] are diagonalizable by [[orthogonal matrix|orthogonal matrices]]; i.e., given a real symmetric matrix ''A'', ''Q<sup>T</sup>AQ'' is diagonal for some orthogonal matrix ''Q''. More generally, matrices are diagonalizable by [[unitary matrix|unitary matrices]] if and only if they are [[normal matrix|normal]]. In the case of the real symmetric matrix, we see that <math>A=A^T</math>, so clearly <math>AA^T=A^TA</math> holds. Examples of normal matrices are real symmetric (or skew-symmetric) matrices (e.g. covariance matrices) and [[Hermitian matrix|Hermitian matrices]] (or skew-Hermitian matrices). See [[spectral theorem]]s for generalizations to infinite-dimensional vector spaces. |

=== Matrices that are not diagonalizable === | === Matrices that are not diagonalizable === | ||

+ | In general, a [[rotation matrix]] is not diagonalizable over the reals, but all [[rotation matrix#Independent planes|rotation matrices]] are diagonalizable over the complex field. Even if a matrix is not diagonalizable, it is always possible to "do the best one can", and find a matrix with the same properties consisting of eigenvalues on the leading diagonal, and either ones or zeroes on the superdiagonal - known as [[Jordan Normal Form|Jordan normal form]]. | ||

Some matrices are not diagonalizable over any field, most notably nonzero [[nilpotent matrix|nilpotent matrices]]. This happens more generally if the [[Eigenvalue,_eigenvector_and_eigenspace#Algebraic and geometric multiplicities|algebraic and geometric multiplicities]] of an eigenvalue do not coincide. For instance, consider | Some matrices are not diagonalizable over any field, most notably nonzero [[nilpotent matrix|nilpotent matrices]]. This happens more generally if the [[Eigenvalue,_eigenvector_and_eigenspace#Algebraic and geometric multiplicities|algebraic and geometric multiplicities]] of an eigenvalue do not coincide. For instance, consider | ||

+ | |||

:<math> C = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}. </math> | :<math> C = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}. </math> | ||

− | This matrix is not diagonalizable: there is no matrix ''U'' such that < | + | |

+ | This matrix is not diagonalizable: there is no matrix ''U'' such that ''U''<sup>−1</sup>''CU'' is a diagonal matrix. Indeed, ''C'' has one eigenvalue (namely zero) and this eigenvalue has algebraic multiplicity 2 and geometric multiplicity 1. | ||

Some real matrices are not diagonalizable over the reals. Consider for instance the matrix | Some real matrices are not diagonalizable over the reals. Consider for instance the matrix | ||

+ | |||

:<math> B = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}. </math> | :<math> B = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}. </math> | ||

− | The matrix ''B'' does not have any real eigenvalues, so there is no '''real''' matrix ''Q'' such that < | + | |

+ | The matrix ''B'' does not have any real eigenvalues, so there is no '''real''' matrix ''Q'' such that ''Q''<sup>−1</sup>''BQ'' is a diagonal matrix. However, we can diagonalize ''B'' if we allow complex numbers. Indeed, if we take | ||

+ | |||

:<math> Q = \begin{bmatrix} 1 & \textrm{i} \\ \textrm{i} & 1 \end{bmatrix}, </math> | :<math> Q = \begin{bmatrix} 1 & \textrm{i} \\ \textrm{i} & 1 \end{bmatrix}, </math> | ||

− | then < | + | |

+ | then ''Q''<sup>−1</sup>''BQ'' is diagonal. | ||

Note that the above examples show that the sum of diagonalizable matrices need not be diagonalizable. | Note that the above examples show that the sum of diagonalizable matrices need not be diagonalizable. | ||

=== How to diagonalize a matrix === | === How to diagonalize a matrix === | ||

+ | Consider a matrix | ||

− | |||

:<math>A=\begin{bmatrix} | :<math>A=\begin{bmatrix} | ||

− | 1 & 2 & 0 \\ | + | 1& 2 & 0 \\ |

0 & 3 & 0 \\ | 0 & 3 & 0 \\ | ||

2 & -4 & 2 \end{bmatrix}.</math> | 2 & -4 & 2 \end{bmatrix}.</math> | ||

This matrix has [[eigenvalues]] | This matrix has [[eigenvalues]] | ||

+ | |||

: <math> \lambda_1 = 3, \quad \lambda_2 = 2, \quad \lambda_3= 1. </math> | : <math> \lambda_1 = 3, \quad \lambda_2 = 2, \quad \lambda_3= 1. </math> | ||

− | |||

− | These eigenvalues are the values that will appear in the diagonalized form of matrix | + | ''A'' is a 3×3 matrix with 3 different eigenvalues; therefore, it is diagonalizable. Note that if there are exactly ''n'' distinct eigenvalues in an ''n''×''n'' matrix then this matrix is diagonalizable. |

+ | |||

+ | These eigenvalues are the values that will appear in the diagonalized form of matrix ''A'', so by finding the eigenvalues of ''A'' we have diagonalized it. We could stop here, but it is a good check to use the [[eigenvectors]] to diagonalize ''A''. | ||

The [[eigenvectors]] of ''A'' are | The [[eigenvectors]] of ''A'' are | ||

− | : <math> v_1 = \begin{bmatrix} -1 \\ -1 \\ 2 \end{bmatrix}, \quad v_2 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \quad v_3 = \begin{bmatrix} -1 \\ 0 \\ 2 \end{bmatrix}. </math> | + | |

+ | :<math>v_1 = \begin{bmatrix} -1 \\ -1 \\ 2 \end{bmatrix}, \quad v_2 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \quad v_3 = \begin{bmatrix} -1 \\ 0 \\ 2 \end{bmatrix}.</math> | ||

+ | |||

One can easily check that <math>A v_k = \lambda_k v_k.</math> | One can easily check that <math>A v_k = \lambda_k v_k.</math> | ||

Now, let ''P'' be the matrix with these eigenvectors as its columns: | Now, let ''P'' be the matrix with these eigenvectors as its columns: | ||

− | :<math>P= | + | |

− | \begin{bmatrix} | + | :<math>P= \begin{bmatrix} |

-1 & 0 & -1 \\ | -1 & 0 & -1 \\ | ||

-1 & 0 & 0 \\ | -1 & 0 & 0 \\ | ||

Line 109: | Line 130: | ||

Note there is no preferred order of the eigenvectors in ''P''; changing the order of the [[eigenvectors]] in ''P'' just changes the order of the [[eigenvalues]] in the diagonalized form of ''A''.<ref>{{cite book| last1=Anton |first1=H. |last2= Rorres|first2= C. |title=Elementary Linear Algebra (Applications Version) |publisher=John Wiley & Sons|edition=8th|date=22 Feb 2000| ISBN= 978-0-471-17052-5}}</ref> | Note there is no preferred order of the eigenvectors in ''P''; changing the order of the [[eigenvectors]] in ''P'' just changes the order of the [[eigenvalues]] in the diagonalized form of ''A''.<ref>{{cite book| last1=Anton |first1=H. |last2= Rorres|first2= C. |title=Elementary Linear Algebra (Applications Version) |publisher=John Wiley & Sons|edition=8th|date=22 Feb 2000| ISBN= 978-0-471-17052-5}}</ref> | ||

− | Then ''P'' diagonalizes ''A'', as a simple computation confirms: | + | Then ''P'' diagonalizes ''A'', as a simple computation confirms, having calculated ''P''<sup> −1</sup> using any [[Invertible_matrix#Methods_of_matrix_inversion|suitable method]]: |

+ | |||

:<math>P^{-1}AP = | :<math>P^{-1}AP = | ||

\begin{bmatrix} | \begin{bmatrix} | ||

Line 127: | Line 149: | ||

0 & 2 & 0 \\ | 0 & 2 & 0 \\ | ||

0 & 0 & 1\end{bmatrix}.</math> | 0 & 0 & 1\end{bmatrix}.</math> | ||

+ | |||

Note that the eigenvalues <math>\lambda_k</math> appear in the diagonal matrix. | Note that the eigenvalues <math>\lambda_k</math> appear in the diagonal matrix. | ||
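The check described above can be sketched in a few lines of Python. This is an illustrative sketch, not part of the cited source: the helper names `matvec`, `matmul` and `det3` are made up here, and instead of inverting ''P'' it verifies the equivalent identity ''AP'' = ''PD'' together with det(''P'') ≠ 0.

```python
# Matrix, eigenvalues and eigenvectors as given in the text.
A = [[1, 2, 0], [0, 3, 0], [2, -4, 2]]
eigenvalues = [3, 2, 1]
eigenvectors = [[-1, -1, 2], [0, 0, 1], [-1, 0, 2]]


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# P has the eigenvectors as columns; D carries the eigenvalues in the same order.
P = [list(row) for row in zip(*eigenvectors)]
D = [[eigenvalues[i] if i == j else 0 for j in range(3)] for i in range(3)]

# A v_k = lambda_k v_k for every k.
eigen_checks = [matvec(A, v) == [lam * x for x in v]
                for lam, v in zip(eigenvalues, eigenvectors)]

# AP = PD with P invertible is equivalent to P^{-1} A P = D.
AP = matmul(A, P)
PD = matmul(P, D)
P_is_invertible = det3(P) != 0
```

Working with ''AP'' = ''PD'' keeps everything in integer arithmetic, so the comparison is exact.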


== An application == | == An application == | ||

+ | Diagonalization can be used to compute the powers of a matrix ''A'' efficiently, provided the matrix is diagonalizable. Suppose we have found that | ||

− | + | :<math>P^{-1}AP = D</math> | |

− | |||

− | |||

− | :<math>P^{-1}AP = D | ||

is a diagonal matrix. Then, as the matrix product is associative, | is a diagonal matrix. Then, as the matrix product is associative, | ||

− | :<math>\begin{align} A^k &= (PDP^{-1})^k = (PDP^{-1}) \cdot (PDP^{-1}) \cdots (PDP^{-1}) \\ | + | :<math>\begin{align} |

+ | A^k &= (PDP^{-1})^k = (PDP^{-1}) \cdot (PDP^{-1}) \cdots (PDP^{-1}) \\ | ||

&= PD(P^{-1}P) D (P^{-1}P) \cdots (P^{-1}P) D P^{-1} \\ | &= PD(P^{-1}P) D (P^{-1}P) \cdots (P^{-1}P) D P^{-1} \\ | ||

&= PD^kP^{-1} \end{align} </math> | &= PD^kP^{-1} \end{align} </math> | ||

Line 174: | Line 170: | ||

=== Particular application === | === Particular application === | ||

For example, consider the following matrix: | For example, consider the following matrix: | ||

+ | |||

:<math>M =\begin{bmatrix}a & b-a \\ 0 &b \end{bmatrix}.</math> | :<math>M =\begin{bmatrix}a & b-a \\ 0 &b \end{bmatrix}.</math> | ||

+ | |||

Calculating the various powers of ''M'' reveals a surprising pattern: | Calculating the various powers of ''M'' reveals a surprising pattern: | ||

− | :<math> | + | |

− | M^2 = \begin{bmatrix}a^2 & b^2-a^2 \\ 0 &b^2 \end{bmatrix},\quad | + | :<math>M^2 = \begin{bmatrix}a^2 & b^2-a^2 \\ 0 &b^2 \end{bmatrix},\quad |

M^3 = \begin{bmatrix}a^3 & b^3-a^3 \\ 0 &b^3 \end{bmatrix},\quad | M^3 = \begin{bmatrix}a^3 & b^3-a^3 \\ 0 &b^3 \end{bmatrix},\quad | ||

− | M^4 = \begin{bmatrix}a^4 & b^4-a^4 \\ 0 &b^4 \end{bmatrix},\quad \ldots | + | M^4 = \begin{bmatrix}a^4 & b^4-a^4 \\ 0 &b^4 \end{bmatrix},\quad \ldots</math> |

− | </math> | + | |

+ | The above phenomenon can be explained by diagonalizing ''M''. To accomplish this, we need a basis of '''R'''<sup>2</sup> consisting of eigenvectors of ''M''. One such eigenvector basis is given by | ||

+ | |||

+ | :<math>\mathbf{u}=\begin{bmatrix} 1 \\ 0 \end{bmatrix}=\mathbf{e}_1,\quad \mathbf{v}=\begin{bmatrix} 1 \\ 1 \end{bmatrix}=\mathbf{e}_1+\mathbf{e}_2,</math> | ||

+ | |||

+ | where '''e'''<sub>''i''</sub> denotes the standard basis of '''R'''<sup>2</sup>. The reverse change of basis is given by | ||

− | |||

− | |||

− | |||

− | |||

− | |||

− | |||

:<math> \mathbf{e}_1 = \mathbf{u},\qquad \mathbf{e}_2 = \mathbf{v}-\mathbf{u}.</math> | :<math> \mathbf{e}_1 = \mathbf{u},\qquad \mathbf{e}_2 = \mathbf{v}-\mathbf{u}.</math> | ||

Straightforward calculations show that | Straightforward calculations show that | ||

+ | |||

:<math>M\mathbf{u} = a\mathbf{u},\qquad M\mathbf{v}=b\mathbf{v}.</math> | :<math>M\mathbf{u} = a\mathbf{u},\qquad M\mathbf{v}=b\mathbf{v}.</math> | ||

− | Thus, ''a'' and ''b'' are the eigenvalues corresponding to '''u''' and '''v''', respectively. | + | |

− | By linearity of matrix multiplication, we have that | + | Thus, ''a'' and ''b'' are the eigenvalues corresponding to '''u''' and '''v''', respectively. By linearity of matrix multiplication, we have that |

+ | |||

:<math> M^n \mathbf{u} = a^n\, \mathbf{u},\qquad M^n \mathbf{v}=b^n\,\mathbf{v}.</math> | :<math> M^n \mathbf{u} = a^n\, \mathbf{u},\qquad M^n \mathbf{v}=b^n\,\mathbf{v}.</math> | ||

+ | |||

Switching back to the standard basis, we have | Switching back to the standard basis, we have | ||

+ | |||

:<math> M^n \mathbf{e}_1 = M^n \mathbf{u} = a^n \mathbf{e}_1,</math> | :<math> M^n \mathbf{e}_1 = M^n \mathbf{u} = a^n \mathbf{e}_1,</math> | ||

:<math> M^n \mathbf{e}_2 = M^n (\mathbf{v}-\mathbf{u}) = b^n \mathbf{v} - a^n\mathbf{u} = (b^n-a^n) \mathbf{e}_1+b^n\mathbf{e}_2.</math> | :<math> M^n \mathbf{e}_2 = M^n (\mathbf{v}-\mathbf{u}) = b^n \mathbf{v} - a^n\mathbf{u} = (b^n-a^n) \mathbf{e}_1+b^n\mathbf{e}_2.</math> | ||

+ | |||

The preceding relations, expressed in matrix form, are | The preceding relations, expressed in matrix form, are | ||

− | :<math> | + | |

− | M^n = \begin{bmatrix}a^n & b^n-a^n \\ 0 &b^n \end{bmatrix}, | + | :<math> M^n = \begin{bmatrix}a^n & b^n-a^n \\ 0 &b^n \end{bmatrix}, </math> |

− | </math> | + | |

thereby explaining the above phenomenon. | thereby explaining the above phenomenon. | ||
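As a sanity check, the closed form for ''M''<sup>''n''</sup> can be compared with repeated multiplication and with ''PD''<sup>''n''</sup>''P''<sup>−1</sup>. The sketch below assumes the sample values ''a'' = 2 and ''b'' = 3, which are not from the text; the function names are illustrative only.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def matpow(M, n):
    """Compute M^n by n successive multiplications (n >= 0)."""
    size = len(M)
    result = [[1 if i == j else 0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, M)
    return result


# Sample values, assumed for illustration.
a, b = 2, 3
M = [[a, b - a], [0, b]]

# Columns of P are the eigenvectors u = (1, 0) and v = (1, 1).
P = [[1, 1], [0, 1]]
P_inv = [[1, -1], [0, 1]]  # inverse of P, worked out by hand


def power_via_diagonalization(n):
    """Compute M^n as P D^n P^{-1}, raising only the diagonal entries to the n-th power."""
    D_n = [[a ** n, 0], [0, b ** n]]
    return matmul(matmul(P, D_n), P_inv)


def closed_form(n):
    """The pattern stated in the text for M^n."""
    return [[a ** n, b ** n - a ** n], [0, b ** n]]
```

Only the two diagonal entries are raised to the ''n''-th power in `power_via_diagonalization`, which is the point of the technique.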

Line 227: | Line 229: | ||

[[Category:Matrices]] | [[Category:Matrices]] | ||


## Revision as of 18:39, 16 January 2014

In linear algebra, a square matrix *A* is called **diagonalizable** if it is similar to a diagonal matrix, i.e., if there exists an invertible matrix *P* such that *P*^{−1}*AP* is a diagonal matrix. If *V* is a finite-dimensional vector space, then a linear map *T* : *V* → *V* is called **diagonalizable** if there exists an ordered basis of *V* with respect to which *T* is represented by a diagonal matrix. **Diagonalization** is the process of finding a corresponding diagonal matrix for a diagonalizable matrix or linear map.^{[1]} A square matrix which is not diagonalizable is called *defective.*

Diagonalizable matrices and maps are of interest because diagonal matrices are especially easy to handle: their eigenvalues and eigenvectors are known and one can raise a diagonal matrix to a power by simply raising the diagonal entries to that same power. Geometrically, a diagonalizable matrix is an **inhomogeneous dilation** (or *anisotropic scaling*) — it scales the space, as does a *homogeneous dilation*, but by a different factor in each direction, determined by the scale factors on each axis (diagonal entries).

## Characterization

The fundamental fact about diagonalizable maps and matrices is expressed by the following:

- An *n*×*n* matrix *A* over the field *F* is diagonalizable if and only if the sum of the dimensions of its eigenspaces is equal to *n*, which is the case if and only if there exists a basis of *F*^{n} consisting of eigenvectors of *A*. If such a basis has been found, one can form the matrix *P* having these basis vectors as columns, and *P*^{−1}*AP* will be a diagonal matrix. The diagonal entries of this matrix are the eigenvalues of *A*.
- A linear map *T* : *V* → *V* is diagonalizable if and only if the sum of the dimensions of its eigenspaces is equal to dim(*V*), which is the case if and only if there exists a basis of *V* consisting of eigenvectors of *T*. With respect to such a basis, *T* will be represented by a diagonal matrix. The diagonal entries of this matrix are the eigenvalues of *T*.

Another characterization: A matrix or linear map is diagonalizable over the field *F* if and only if its minimal polynomial is a product of distinct linear factors over *F*. (Put in another way, a matrix is diagonalizable if and only if all of its elementary divisors are linear.)

The following sufficient (but not necessary) condition is often useful.

- An *n*×*n* matrix *A* is diagonalizable over the field *F* if it has *n* distinct eigenvalues in *F*, i.e. if its characteristic polynomial has *n* distinct roots in *F*; however, the converse may be false. Let us consider

  $$\begin{bmatrix} -1 & 3 & -1 \\ -3 & 5 & -1 \\ -3 & 3 & 1 \end{bmatrix},$$

  which has eigenvalues 1, 2, 2 (not all distinct) and is diagonalizable with diagonal form (similar to *A*)

  $$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}$$

  and change of basis matrix *P*

  $$\begin{bmatrix} 1 & 1 & -1 \\ 1 & 1 & 0 \\ 1 & 0 & 3 \end{bmatrix}.$$

  The converse fails when *A* has an eigenspace of dimension higher than 1. In this example, the eigenspace of *A* associated with the eigenvalue 2 has dimension 2.

- A linear map *T* : *V* → *V* with *n* = dim(*V*) is diagonalizable if it has *n* distinct eigenvalues, i.e. if its characteristic polynomial has *n* distinct roots in *F*.
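The eigenspace criterion can be checked mechanically once the eigenvalues are known. The following is a minimal sketch, assuming the eigenvalues are supplied by hand: it computes each eigenspace dimension as *n* minus the rank of *A* − λ*I*, using exact fractions. The helper names `rank`, `eigenspace_dimension` and `is_diagonalizable` are illustrative and do not come from any library.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rank_count = 0
    for col in range(len(rows[0])):
        # Find a row at or below the current pivot position with a nonzero entry.
        pivot = next((r for r in range(rank_count, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank_count], rows[pivot] = rows[pivot], rows[rank_count]
        for r in range(rank_count + 1, len(rows)):
            factor = rows[r][col] / rows[rank_count][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank_count])]
        rank_count += 1
    return rank_count


def eigenspace_dimension(matrix, eigenvalue):
    """Dimension of the null space of (matrix - eigenvalue * I)."""
    n = len(matrix)
    shifted = [[matrix[i][j] - (eigenvalue if i == j else 0) for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)


def is_diagonalizable(matrix, distinct_eigenvalues):
    """True if the eigenspace dimensions of the given eigenvalues add up to n."""
    total = sum(eigenspace_dimension(matrix, lam) for lam in distinct_eigenvalues)
    return total == len(matrix)


# The 3x3 example from the text, with distinct eigenvalues 1 and 2.
A = [[-1, 3, -1], [-3, 5, -1], [-3, 3, 1]]

# A nonzero nilpotent matrix, whose only eigenvalue is 0.
C = [[0, 1], [0, 0]]
```

For *A* the two eigenspace dimensions add up to 3, so `is_diagonalizable(A, [1, 2])` holds even though the eigenvalue 2 is repeated, while `is_diagonalizable(C, [0])` fails.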

Let *A* be a matrix over *F*. If *A* is diagonalizable, then so is any power of it. Conversely, if *A* is invertible, *F* is algebraically closed, and *A*^{n} is diagonalizable for some *n* that is not an integer multiple of the characteristic of *F*, then *A* is diagonalizable. Proof: If *A*^{n} is diagonalizable with distinct eigenvalues $\lambda_1, \ldots, \lambda_k$, then *A* is annihilated by the polynomial $(x^n - \lambda_1) \cdots (x^n - \lambda_k)$, which has no multiple root (since $\lambda_j \ne 0$) and is divisible by the minimal polynomial of *A*.
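Both hypotheses of the converse are needed. The two standard counterexamples below are added as an illustration and are not taken from the cited source; the names *C* and *J* are chosen here.

```latex
% Invertibility: C is nonzero and nilpotent, hence not diagonalizable,
% yet C^2 is the zero matrix, which is diagonal.
C = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix},
\qquad
C^2 = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}.

% Condition on n: over a field of characteristic p > 0, J is invertible and
% not diagonalizable, yet J^p is the identity because p = 0 in the field.
J = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix},
\qquad
J^p = \begin{bmatrix} 1 & p \\ 0 & 1 \end{bmatrix} = I.
```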

As a rule of thumb, over **C** almost every matrix is diagonalizable. More precisely: the set of complex *n*×*n* matrices that are *not* diagonalizable over **C**, considered as a subset of **C**^{n×n}, has Lebesgue measure zero. One can also say that the diagonalizable matrices form a dense subset with respect to the Zariski topology: the complement lies inside the set where the discriminant of the characteristic polynomial vanishes, which is a hypersurface. From that follows also density in the usual (*strong*) topology given by a norm. The same is not true over **R**.

The Jordan–Chevalley decomposition expresses an operator as the sum of its semisimple (i.e., diagonalizable) part and its nilpotent part. Hence, a matrix is diagonalizable if and only if its nilpotent part is zero. Put in another way, a matrix is diagonalizable if each block in its Jordan form has no nilpotent part; i.e., each "block" is a one-by-one matrix.

## Diagonalization

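If a matrix *A* can be diagonalized, that is,

$$P^{-1}AP=\begin{pmatrix}\lambda_{1}\\ & \lambda_{2}\\ & & \ddots\\ & & & \lambda_{n}\end{pmatrix},$$

then:

$$AP=P\begin{pmatrix}\lambda_{1}\\ & \lambda_{2}\\ & & \ddots\\ & & & \lambda_{n}\end{pmatrix}.$$

Writing *P* as a block matrix of its column vectors $\vec{\alpha}_{i}$

$$P=\begin{pmatrix}\vec{\alpha}_{1} & \vec{\alpha}_{2} & \cdots & \vec{\alpha}_{n}\end{pmatrix},$$

the above equation can be rewritten as

$$A\vec{\alpha}_{i}=\lambda_{i}\vec{\alpha}_{i}\qquad(i=1,2,\cdots,n).$$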

So the column vectors of *P* are right eigenvectors of *A*, and each diagonal entry is the eigenvalue belonging to the corresponding column. Since *P* is invertible, these eigenvectors are linearly independent and form a basis of *F*^{n}. The existence of such a basis is the necessary and sufficient condition for diagonalizability, and finding one is the canonical approach to diagonalization. The row vectors of *P*^{−1} are the left eigenvectors of *A*.
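The statements about right and left eigenvectors can be illustrated on a small case. The sketch below uses a 2×2 matrix with eigenvalues 2 and 3, picked here for illustration, and a hand-computed inverse; the helper `matmul` is not from any library.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


# Illustrative matrix with eigenvalues 2 and 3.
A = [[2, 1], [0, 3]]
P = [[1, 1], [0, 1]]        # columns are the right eigenvectors (1, 0) and (1, 1)
P_inv = [[1, -1], [0, 1]]   # inverse of P, worked out by hand
D = [[2, 0], [0, 3]]

# Column form: A P = P D, i.e. A times each column equals its eigenvalue times that column.
AP = matmul(A, P)
PD = matmul(P, D)

# Row form: P^{-1} A = D P^{-1}, i.e. each row w of P^{-1} satisfies w A = lambda w.
left = matmul(P_inv, A)
scaled_rows = matmul(D, P_inv)
```

Comparing `left` with `scaled_rows` row by row shows that (1, −1) and (0, 1) are left eigenvectors for the eigenvalues 2 and 3 respectively.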

When the matrix *A* is a Hermitian matrix (resp. real symmetric matrix), eigenvectors of *A* can be chosen to form an orthonormal basis of **C**^{n} (resp. **R**^{n}). With that choice, *P* is a unitary matrix (resp. orthogonal matrix) and *P*^{−1} equals the conjugate transpose (resp. transpose) of *P*.
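For a concrete real symmetric case, the inverse of the eigenvector matrix can be replaced by its transpose. The example below is a sketch with a matrix chosen here for illustration (eigenvalues 3 and 1); it uses floating-point arithmetic, so comparisons are made up to a small tolerance.

```python
import math


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*X)]


def close(X, Y, tol=1e-12):
    """Entrywise comparison of two matrices up to an absolute tolerance."""
    return all(abs(x - y) <= tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))


s = 1 / math.sqrt(2)
A = [[2.0, 1.0], [1.0, 2.0]]   # real symmetric, chosen for illustration
Q = [[s, s], [s, -s]]          # columns: orthonormal eigenvectors of A

QtQ = matmul(transpose(Q), Q)             # should be the identity
D = matmul(matmul(transpose(Q), A), Q)    # should be diag(3, 1)
```

No matrix inversion is needed here, which is one practical reason orthogonal and unitary diagonalization are preferred when they are available.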

## Simultaneous diagonalization


A set of matrices is said to be *simultaneously diagonalisable* if there exists a single invertible matrix *P* such that *P*^{−1}*AP* is a diagonal matrix for every *A* in the set. The following theorem characterises simultaneously diagonalisable matrices: A set of diagonalizable matrices commutes if and only if the set is simultaneously diagonalisable.^{[2]}

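The set of all *n*×*n* diagonalisable matrices (over **C**) with *n* > 1 is not simultaneously diagonalisable. For instance, the matrices

$$\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$$

are diagonalizable but not simultaneously diagonalizable because they do not commute.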
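The commutation criterion can be sketched in plain Python. All four matrices below are illustrative assumptions: `A` and `B` are a diagonalizable pair that fails to commute, while `S` and `T` commute and are diagonalized by the same hypothetical matrix `P`.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

# A diagonalizable pair that does not commute (assumed example).
A = [[1, 0], [0, 0]]
B = [[1, 1], [0, 0]]
commute_AB = matmul(A, B) == matmul(B, A)

# A commuting pair (assumed example): S = 2I + T.
S = [[2, 1], [1, 2]]
T = [[0, 1], [1, 0]]
commute_ST = matmul(S, T) == matmul(T, S)

# One matrix P whose columns (1, 1) and (1, -1) are eigenvectors of both S and T.
P = [[1, 1], [1, -1]]
half = Fraction(1, 2)
P_inv = [[half, half], [half, -half]]

D_S = matmul(matmul(P_inv, S), P)   # diagonal form of S
D_T = matmul(matmul(P_inv, T), P)   # diagonal form of T
print(commute_AB, commute_ST, D_S, D_T)
```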

A set consists of commuting normal matrices if and only if it is simultaneously diagonalisable by a unitary matrix; that is, there exists a unitary matrix *U* such that *U*^{∗}*AU* is diagonal for every *A* in the set.

In the language of Lie theory, a set of simultaneously diagonalisable matrices generates a toral Lie algebra.

## Examples

### Diagonalizable matrices

- Involutions are diagonalisable over the reals (and indeed over any field of characteristic not 2), with ±1 on the diagonal.
- Finite order endomorphisms are diagonalisable over **C** (or any algebraically closed field where the characteristic of the field does not divide the order of the endomorphism), with roots of unity on the diagonal. This follows since the minimal polynomial is separable, because the roots of unity are distinct.
- Projections are diagonalizable, with 0's and 1's on the diagonal.
- Real symmetric matrices are diagonalizable by orthogonal matrices; i.e., given a real symmetric matrix *A*, *Q*^{T}*AQ* is diagonal for some orthogonal matrix *Q*. More generally, matrices are diagonalizable by unitary matrices if and only if they are normal. In the case of the real symmetric matrix, we see that $A = A^{T}$, so clearly $AA^{T} = A^{T}A$ holds. Examples of normal matrices are real symmetric (or skew-symmetric) matrices (e.g. covariance matrices) and Hermitian matrices (or skew-Hermitian matrices). See spectral theorems for generalizations to infinite-dimensional vector spaces.
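The first three items share one argument, which may be summarized as follows. In each case the defining relation gives a polynomial that annihilates the matrix; the symbols *m* (the order of the endomorphism) and ζ (a primitive *m*-th root of unity) are notation introduced here for illustration.

```latex
\begin{align*}
A^2 = I &\implies A \text{ is annihilated by } x^2 - 1 = (x - 1)(x + 1), \\
A^2 = A &\implies A \text{ is annihilated by } x^2 - x = x(x - 1), \\
A^m = I &\implies A \text{ is annihilated by } x^m - 1 = \prod_{j=0}^{m-1} \left(x - \zeta^{j}\right).
\end{align*}
```

Each of these polynomials has distinct roots under the stated hypotheses (characteristic not 2 in the first case, characteristic not dividing *m* in the third). The minimal polynomial divides the annihilating polynomial, so it is a product of distinct linear factors, and the matrix is diagonalizable with those roots as the only possible diagonal entries.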

### Matrices that are not diagonalizable

In general, a rotation matrix is not diagonalizable over the reals, but all rotation matrices are diagonalizable over the complex field. Even if a matrix is not diagonalizable, it is always possible to "do the best one can", and find a matrix with the same properties consisting of eigenvalues on the leading diagonal, and either ones or zeroes on the superdiagonal - known as Jordan normal form.

Some matrices are not diagonalizable over any field, most notably nonzero nilpotent matrices. This happens more generally if the algebraic and geometric multiplicities of an eigenvalue do not coincide. For instance, consider

$$C = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.$$

This matrix is not diagonalizable: there is no matrix *U* such that *U*^{−1}*CU* is a diagonal matrix. Indeed, *C* has one eigenvalue (namely zero) and this eigenvalue has algebraic multiplicity 2 and geometric multiplicity 1.
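A short direct argument, independent of multiplicities, also shows why a nonzero nilpotent matrix cannot be diagonalized. The sketch below takes *C* to be the 2×2 matrix with a single 1 in the upper-right corner and zeros elsewhere (these entries are an assumption of the sketch) and supposes, for contradiction, that *D* = *U*^{−1}*CU* is diagonal.

```latex
\begin{align*}
C^2 &= \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}^{2} = 0, \\
D^2 &= \left(U^{-1} C U\right)^{2} = U^{-1} C^{2} U = 0
      \implies D = 0 \quad \text{(since $D$ is diagonal)}, \\
C &= U D U^{-1} = 0, \quad \text{contradicting } C \neq 0.
\end{align*}
```

The same reasoning applies to any nonzero nilpotent matrix, with the exponent 2 replaced by the index of nilpotency.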

Some real matrices are not diagonalizable over the reals. Consider for instance the matrix

$$B = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$

The matrix *B* does not have any real eigenvalues, so there is no **real** matrix *Q* such that *Q*^{−1}*BQ* is a diagonal matrix. However, we can diagonalize *B* if we allow complex numbers. Indeed, if we take

$$Q = \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix},$$

then *Q*^{−1}*BQ* is diagonal.
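Diagonalization over the complex numbers can be checked with Python's built-in complex type. In this sketch the rotation-by-90° matrix `B` and the complex matrix `Q` are assumed example values; the eigenvalues of this `B` are *i* and −*i*.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

# Assumed example: a real matrix with no real eigenvalues.
B = [[0, 1],
     [-1, 0]]

# Columns (1, i) and (i, 1) are complex eigenvectors for i and -i.
Q = [[1, 1j],
     [1j, 1]]

# Inverse of a 2x2 matrix via the adjugate formula; det = 1 - i*i = 2.
det = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
Q_inv = [[ Q[1][1] / det, -Q[0][1] / det],
         [-Q[1][0] / det,  Q[0][0] / det]]

# Q^{-1} B Q should be diag(i, -i).
D = matmul(matmul(Q_inv, B), Q)
print(D)
```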

Note that the above examples show that the sum of diagonalizable matrices need not be diagonalizable.

### How to diagonalize a matrix

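Consider a matrix

$$A = \begin{pmatrix} 1 & 2 & 0 \\ 0 & 3 & 0 \\ 2 & -4 & 2 \end{pmatrix}.$$

This matrix has eigenvalues

$$\lambda_1 = 3, \quad \lambda_2 = 2, \quad \lambda_3 = 1.$$

*A* is a 3×3 matrix with 3 different eigenvalues; therefore, it is diagonalizable. Note that if there are exactly *n* distinct eigenvalues in an *n*×*n* matrix then this matrix is diagonalizable.

These eigenvalues are the values that will appear in the diagonalized form of matrix *A*, so by finding the eigenvalues of *A* we have diagonalized it. We could stop here, but it is a good check to use the eigenvectors to diagonalize *A*.

The eigenvectors of *A* are

$$v_1 = \begin{pmatrix} -1 \\ -1 \\ 2 \end{pmatrix}, \quad v_2 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}, \quad v_3 = \begin{pmatrix} -1 \\ 0 \\ 2 \end{pmatrix}.$$

Now, let *P* be the matrix with these eigenvectors as its columns:

$$P = \begin{pmatrix} -1 & 0 & -1 \\ -1 & 0 & 0 \\ 2 & 1 & 2 \end{pmatrix}.$$

Note there is no preferred order of the eigenvectors in *P*; changing the order of the eigenvectors in *P* just changes the order of the eigenvalues in the diagonalized form of *A*.^{[3]}

Then *P* diagonalizes *A*, as a simple computation confirms, having calculated *P*^{−1} using any suitable method:

$$P^{-1}AP = \begin{pmatrix} 0 & -1 & 0 \\ 2 & 0 & 1 \\ -1 & 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 2 & 0 \\ 0 & 3 & 0 \\ 2 & -4 & 2 \end{pmatrix} \begin{pmatrix} -1 & 0 & -1 \\ -1 & 0 & 0 \\ 2 & 1 & 2 \end{pmatrix} = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

Note that the eigenvalues appear in the diagonal matrix.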
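The final check, computing *P*^{−1} and then *P*^{−1}*AP*, can be sketched in plain Python with exact rational arithmetic. The matrices `A` and `P` below are assumed example values with eigenvalues 3, 2 and 1, and the Gauss-Jordan helper `inverse` is one of several suitable methods, not a prescribed one.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse(M):
    """Invert a square matrix by Gauss-Jordan elimination over the rationals.

    Raises StopIteration if M is singular (no nonzero pivot is found).
    """
    n = len(M)
    # Augment M with the identity matrix.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        # Find a row with a nonzero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row, then clear this column in every other row.
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Assumed example matrix and eigenvector matrix (columns are eigenvectors).
A = [[1, 2, 0],
     [0, 3, 0],
     [2, -4, 2]]
P = [[-1, 0, -1],
     [-1, 0, 0],
     [2, 1, 2]]

P_inv = inverse(P)
D = matmul(matmul(P_inv, A), P)   # diagonal, eigenvalues in column order of P
print(P_inv, D)
```

Swapping two columns of `P` and rerunning the computation swaps the corresponding diagonal entries of `D`, which illustrates that the order of the eigenvectors is a free choice.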

## An application

Diagonalization can be used to compute the powers of a matrix *A* efficiently, provided the matrix is diagonalizable. Suppose we have found that

$$P^{-1}AP = D$$

is a diagonal matrix. Then, as the matrix product is associative,

$$A^k = \left(PDP^{-1}\right)^k = PD\left(P^{-1}P\right)D\left(P^{-1}P\right)\cdots\left(P^{-1}P\right)DP^{-1} = PD^kP^{-1}$$

and the latter is easy to calculate since it only involves the powers of a diagonal matrix. This approach can be generalized to the matrix exponential and other matrix functions since they can be defined as power series.
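As a minimal sketch of this idea, the following plain-Python code computes a matrix power as *PD*^{k}*P*^{−1} and compares it with repeated multiplication. The matrix `A`, its eigenvalues and the matrices `P` and `P_inv` are hypothetical example values, and the function names are illustrative.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def power_by_diagonalization(P, eigenvalues, P_inv, k):
    """Return P * D^k * P^{-1}, where D = diag(eigenvalues)."""
    n = len(eigenvalues)
    # Powers of a diagonal matrix are taken entry by entry on the diagonal.
    D_k = [[eigenvalues[i] ** k if i == j else 0 for j in range(n)] for i in range(n)]
    return matmul(matmul(P, D_k), P_inv)

def power_by_repeated_multiplication(A, k):
    """Return A^k using k successive matrix multiplications."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, A)
    return result

# Assumed example: A has eigenvalues 5 and 2 with eigenvectors (1, 1) and (1, -2).
A = [[4, 1], [2, 3]]
eigenvalues = [5, 2]
P = [[1, 1], [1, -2]]
P_inv = [[Fraction(2, 3), Fraction(1, 3)],
         [Fraction(1, 3), Fraction(-1, 3)]]

print(power_by_diagonalization(P, eigenvalues, P_inv, 5))
print(power_by_repeated_multiplication(A, 5))
```

Once `P` and `P_inv` are known, the cost of the diagonalization route is two matrix products plus *n* scalar powers, regardless of *k*.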

This is particularly useful in finding closed form expressions for terms of linear recursive sequences, such as the Fibonacci numbers.
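For the Fibonacci numbers, diagonalizing the 2×2 matrix with rows (1, 1) and (1, 0) leads to the closed form commonly known as Binet's formula. The sketch below evaluates that formula in floating point and compares it with the recurrence; the function names are illustrative, and rounding to the nearest integer is only reliable for moderate *n*.

```python
import math

def fib_closed(n):
    """n-th Fibonacci number from the eigenvalues of [[1, 1], [1, 0]]."""
    sqrt5 = math.sqrt(5)
    phi = (1 + sqrt5) / 2   # larger eigenvalue
    psi = (1 - sqrt5) / 2   # smaller eigenvalue
    # Floating-point evaluation; round() removes the small numerical error.
    return round((phi ** n - psi ** n) / sqrt5)

def fib_iter(n):
    """n-th Fibonacci number from the recurrence, with F(0) = 0 and F(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print([fib_closed(n) for n in range(10)])
print([fib_iter(n) for n in range(10)])
```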

### Particular application

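For example, consider the following matrix:

$$M = \begin{pmatrix} a & b - a \\ 0 & b \end{pmatrix}.$$

Calculating the various powers of *M* reveals a surprising pattern:

$$M^2 = \begin{pmatrix} a^2 & b^2 - a^2 \\ 0 & b^2 \end{pmatrix}, \quad M^3 = \begin{pmatrix} a^3 & b^3 - a^3 \\ 0 & b^3 \end{pmatrix}, \quad M^4 = \begin{pmatrix} a^4 & b^4 - a^4 \\ 0 & b^4 \end{pmatrix}, \quad \ldots$$

The above phenomenon can be explained by diagonalizing *M*. To accomplish this, we need a basis of **R**^{2} consisting of eigenvectors of *M*. One such eigenvector basis is given by

$$\mathbf{u} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \mathbf{e}_1, \quad \mathbf{v} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \mathbf{e}_1 + \mathbf{e}_2,$$

where **e**_{i} denotes the standard basis of **R**^{n}. The reverse change of basis is given by

$$\mathbf{e}_1 = \mathbf{u}, \quad \mathbf{e}_2 = \mathbf{v} - \mathbf{u}.$$

Straightforward calculations show that

$$M\mathbf{u} = a\mathbf{u}, \quad M\mathbf{v} = b\mathbf{v}.$$

Thus, *a* and *b* are the eigenvalues corresponding to **u** and **v**, respectively. By linearity of matrix multiplication, we have that

$$M^n\mathbf{u} = a^n\mathbf{u}, \quad M^n\mathbf{v} = b^n\mathbf{v}.$$

Switching back to the standard basis, we have

$$M^n\mathbf{e}_1 = M^n\mathbf{u} = a^n\mathbf{e}_1,$$

$$M^n\mathbf{e}_2 = M^n(\mathbf{v} - \mathbf{u}) = b^n\mathbf{v} - a^n\mathbf{u} = (b^n - a^n)\mathbf{e}_1 + b^n\mathbf{e}_2.$$

The preceding relations, expressed in matrix form, are

$$M^n = \begin{pmatrix} a^n & b^n - a^n \\ 0 & b^n \end{pmatrix},$$

thereby explaining the above phenomenon.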
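A pattern of this kind can be confirmed numerically. The sketch below assumes that *M* is the upper triangular matrix with rows (*a*, *b* − *a*) and (0, *b*), so that *a* and *b* are its eigenvalues for the eigenvectors (1, 0) and (1, 1); the concrete values `a = 2`, `b = 5` and the function names are illustrative choices.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def repeated_power(M, n):
    """Return M^n using n successive matrix multiplications."""
    result = [[1, 0], [0, 1]]
    for _ in range(n):
        result = matmul(result, M)
    return result

def closed_form_power(a, b, n):
    """Closed form for M^n obtained from the eigenvector basis (assumed shape of M)."""
    return [[a ** n, b ** n - a ** n],
            [0,      b ** n]]

# Assumed example values for the two eigenvalues.
a, b = 2, 5
M = [[a, b - a],
     [0, b]]

for n in range(5):
    print(n, repeated_power(M, n), closed_form_power(a, b, n))
```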

## Quantum mechanical application

In quantum mechanical and quantum chemical computations, matrix diagonalization is one of the most frequently applied numerical processes. The basic reason is that the time-independent Schrödinger equation is an eigenvalue equation, albeit in most physical situations on an infinite-dimensional space (a Hilbert space). A very common approximation is to truncate the Hilbert space to finite dimension, after which the Schrödinger equation can be formulated as an eigenvalue problem of a real symmetric, or complex Hermitian, matrix. Formally this approximation is founded on the variational principle, valid for Hamiltonians that are bounded from below. First-order perturbation theory for degenerate states also leads to a matrix eigenvalue problem.

## See also

- Defective matrix
- Scaling (geometry)
- Triangular matrix
- Semisimple operator
- Diagonalizable group
- Jordan normal form
- Weight module – associative algebra generalization

## Notes

## References
