# Talk:Cayley–Hamilton theorem

## Request

Can somebody add a proof outline for the commutative ring case? The proofs I know work only over fields (or integral domains). AxelBoldt 00:52 Mar 30, 2003 (UTC)

### General case

I was going to add this to the article, but it may be a bit too technical to include there. Furthermore, it is poorly written. (PLEASE correct and include in the article.)

I don't think there's such a thing as "too technical", as long as the more technical stuff is at the end of an article. The first few sentences should be written for a high school student, and then progressively more advanced. PAR 03:21, 4 May 2005 (UTC)
I agree. One should not try to knock the heck out of college students by starting with a terribly technical thing. But an article can (and should in many circumstances) gradually develop the subject and end up being quite advanced. Oleg Alexandrov 03:27, 4 May 2005 (UTC)
Okay, I'll move it into the article. It still may need some corrections, though.
I've corrected it (I'm a professional mathematician): the proof was incomplete, so I have filled in the gaps, and I tried to reflect the development from the elementary to the advanced. Geometry guy 12:51, 7 February 2007 (UTC)

Can somebody check the correctness of the example? I think that when you subtract the matrix {[t,0][0,t]} from the original one, you DO NOT negate the second and third numbers!

## I do not think the given proof works

This is proved using matrices over B = R[t].
However it is an equation in S = all such matrices.
You then apply this to m in M. However you have given no definition of such an action.
Also you have given no evaluation homomorphism for such a thing.

## problem

The action definition is not given. You have to assign some meaning to Am. If you want to imitate the proof in Atiyah-MacDonald you need to operate on a vector with components from m. You can then drop all reference to the evaluation operation.

## I may be stupid but...

I don't understand why this theorem is not a triviality...

${\displaystyle p(A)=\det(A-AI)=\det(A-A)=\det(0)=0}$

Isn't it?

I think the introduction to this article should show why this is not so. 131.220.68.177 09:47, 9 September 2005 (UTC)

Your "proof" only holds for 1×1 matrices. If you read carefully example 1, the idea is that you start with p(t) where t is a number, and then do the calculation of det(A-tI) according to the properties of the determinant. Then, plug in formally instead of t the matrix A, and you will get zero.
The error in your proof is the following. The quantity tI is supposed to be interpreted as
${\displaystyle {\begin{bmatrix}t&0&0\\0&t&0\\0&0&t\end{bmatrix}}}$.

If you plug right in here instead of the number t the matrix A, you get a matrix of size n². But then the subtraction A-AI can't be done in your proof, as the sizes don't match. Oleg Alexandrov 16:58, 9 September 2005 (UTC)
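A minimal sketch of the distinction drawn above, assuming a hypothetical 2×2 matrix A chosen only for illustration (it is not necessarily the matrix used in the article's example). The determinant is expanded first with t a scalar, and only then is A substituted; the last lines show what plugging A into tI would produce instead.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2], [3, 4]]                           # hypothetical example matrix
I = [[1, 0], [0, 1]]
trace = A[0][0] + A[1][1]                      # 5
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]    # -2

# Step 1: expand det(tI - A) with t a scalar: p(t) = t^2 - trace*t + det.
# Step 2: only now substitute A for t, reading the constant term as det * I.
A2 = matmul(A, A)
p_of_A = [[A2[i][j] - trace * A[i][j] + det * I[i][j] for j in range(2)]
          for i in range(2)]
print(p_of_A)  # [[0, 0], [0, 0]]

# Substituting A into tI directly gives a 2x2 array whose entries are
# themselves 2x2 matrices (a 4x4 array once flattened), so "A - AI" in the
# bogus proof would subtract objects of different sizes.
Z = [[0, 0], [0, 0]]
tI_with_t_replaced = [[A, Z], [Z, A]]
```

The result p(A) is the 2×2 zero matrix, not the scalar 0 that the bogus computation arrives at.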

Yes I understood that already. In fact that's not that easy. I really see the point. Wouldn't it be possible to make this point clear from the beginning of the article, so that nobody falls into this trap again? 131.220.44.10 14:49, 12 September 2005 (UTC)

I don't see a good way of writing that while still maintaining the nice style of the article. I think things are most clear if you look at the first example, they show how to interpret that t. Oleg Alexandrov 15:56, 12 September 2005 (UTC)

I patched up the proof for you. It is just a matter of defining an evaluation map between modules over the polynomial ring. I also thought it would be nicer (given the idea to proceed from the novice to the advanced) to give the proof for matrices before discussing the generalizations. This provides an opportunity to discuss the famous incorrect proof as well. Geometry guy 12:52, 7 February 2007 (UTC)

I agree that the intro is poorly done. I'd love to see someone who understands the intro, alone, who hasn't already seen this theorem. Regardless, I suggest changing "The Cayley–Hamilton theorem states that substituting the matrix A in the characteristic polynomial (which involves multiplying its constant term by I_n, since that is the zeroth power of A) results in the..." to "The Cayley–Hamilton theorem states that substituting the matrix A in the characteristic polynomial after the determinant has been taken (which involves matrix exponentiation instead of scalar exponentiation) results in the..." The current parentheses are a bit pointless, and imply what I've written. "after the determinant has been taken" gets around the point above. 67.158.43.41 (talk) 10:07, 22 December 2009 (UTC)

I agree that the formulation was not optimal, but I do not agree with the suggested replacement, since (1) there is a real difficulty with the constant term, and (2) "after the determinant has been taken" is confusing to me (the characteristic polynomial is already the determinant, so you cannot "take the determinant" when you've already got the characteristic polynomial). I've changed the formulation to address (IMHO) most issues. The important thing is that one must view the characteristic polynomial as a true polynomial, not a polynomial function, and that the substitution is a somewhat unusual one in that the result is not a value in the same set as where the polynomial coefficients live (but instead in the set of matrices). In formal algebraic terms, the matrices form a k-algebra M (where k is the base field, say the complex numbers) and an element of M can be substituted into a polynomial in k[λ] to produce another element of M. This involves embedding k into M, and thus interpreting constants (in k) as the corresponding multiples of the identity. All this need not be said in the lede, but the reinterpretation of the constant term is a real change every reader should be aware of (one could maintain that the constant term is already multiplied by λ^0 which changes into I_n by the substitution, but I'd bet very few readers will implicitly realize that). Marc van Leeuwen (talk) 18:07, 23 December 2009 (UTC)
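The substitution described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `eval_poly_at_matrix` and the example matrix are hypothetical, and matrices are plain lists of rows. The point it shows is that each scalar coefficient enters as a multiple of the identity.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def eval_poly_at_matrix(coeffs, A):
    """Evaluate c0 + c1*t + ... + cd*t^d at t = A by Horner's rule.

    coeffs lists the scalars c0..cd in order of increasing degree; each
    scalar c enters as c * I, which is the embedding of k into M.
    """
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = matmul(result, A)      # multiply the running value by A
        for i in range(n):
            result[i][i] += c           # add c * I, not c to every entry
    return result

A = [[2, 1], [0, 3]]                    # hypothetical example matrix
p = [6, -5, 1]                          # p(t) = t^2 - 5t + 6 = det(tI - A)
print(eval_poly_at_matrix(p, A))        # [[0, 0], [0, 0]]

# Misreading the constant term as "add 6 to every entry" does not give zero.
A2 = matmul(A, A)
wrong = [[A2[i][j] - 5 * A[i][j] + 6 for j in range(2)] for i in range(2)]
print(wrong)                            # [[0, 6], [6, 0]]
```

The second computation is the reason the reinterpretation of the constant term deserves a mention: without it the identity p(A) = 0 simply fails.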

## Accessibility

I fixed up this article about a month ago, but I would like to edit it again to make the proof (for matrices) more accessible (for instance to undergraduate math students). If anyone has comments or suggestions, please let me know. Geometry guy 21:19, 21 February 2007 (UTC)

## Undefined term

Hi, can the term "N-partitioned matrices" be explained or be linked to an appropriate article? Thanks. Randomblue (talk) 20:43, 17 January 2008 (UTC)

Link introduced. User:PhilDWraight 11:30, 11 February 2008 (UTC)

## Fun with page names

Hmm, this is neat: http://en.wikipedia.org/wiki/Cayley-Hamilton_theorem vs. http://en.wikipedia.org/wiki/Cayley%E2%80%93Hamilton_theorem But maybe you'll like to change it... —Preceding unsigned comment added by 141.84.69.20 (talk) 00:25, 30 January 2008 (UTC)

Yes, I see the former redirects to the latter one, which has a bizarre page name. I'm not aware of when or how such a thing happened, or how it can be undone, but obviously it should be. Marc van Leeuwen (talk) 10:55, 30 January 2008 (UTC)

Nope, by now I found that this "bizarre" page name just contained an "en dash", and that the wikipedia manual of style says this is the right symbol to use here, since Cayley and Hamilton are separate persons (the manual gives the Michelson–Morley experiment as an example of correct use of an "en dash"). Marc van Leeuwen (talk) 11:34, 30 January 2008 (UTC)

## What's that about N-partitioned matrices?

The lead says the C-H theorem also holds for N-partitioned matrices. Can anybody explain what that means and why it is any different from the usual case? Or is somebody claiming the theorem makes sense (and holds) for matrices over another ring of matrices? If not, I'm tempted to delete that mention. Marc van Leeuwen (talk) 10:47, 7 March 2008 (UTC)

C-H is important in assessing system stability - re: position of eigenvalues etc - in multidimensional systems this becomes a complex issue. Partitioning the A/system matrix can make the problem much simpler - essentially looking at the problem as sub-systems - particularly useful for on-line analysis. Proof the C-H theorem holds for such matrices is vital. User:PhilDWraight 10.22, 22 April 2008 (UTC)

I understand (somewhat) the concern, but I still don't understand what the statement that C-H holds for partitioned/block matrices means. Suppose for concreteness that I have a block matrix

${\displaystyle M={\begin{pmatrix}A&B\\C&D\end{pmatrix}}}$

where A, B, C, D are 10×10 matrices, say. First question: is there a characteristic polynomial of this matrix that is not of degree 20? The usual characteristic polynomial of a 2×2 matrix would be

${\displaystyle P=X^{2}-(A+D)X+AD-BC=X^{2}-(A+D)X+AD-CB,}$

but as polynomials with matrix coefficients, the two versions are different (and in any case the unwritten coefficient 1 of ${\displaystyle X^{2}}$ should be interpreted as a 10×10 identity matrix). So what exactly would the Cayley–Hamilton theorem say for this block matrix? Marc van Leeuwen (talk) 07:33, 25 April 2008 (UTC)
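A small sketch of why the two versions of the constant term differ. The blocks here are hypothetical 2×2 matrices (rather than 10×10) chosen only so that B and C do not commute.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matsub(X, Y):
    """Entrywise difference X - Y."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

# Hypothetical blocks of M = [[A, B], [C, D]].
A = [[1, 0], [0, 2]]
B = [[0, 1], [0, 0]]
C = [[0, 0], [1, 0]]
D = [[3, 0], [0, 4]]

AD = matmul(A, D)
const_bc = matsub(AD, matmul(B, C))     # candidate constant term AD - BC
const_cb = matsub(AD, matmul(C, B))     # candidate constant term AD - CB
print(const_bc)                         # [[2, 0], [0, 8]]
print(const_cb)                         # [[3, 0], [0, 7]]
```

Since the two candidates disagree, "the" 2×2 determinant formula applied blockwise is not well defined unless the blocks commute.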

## Changes to proof section in July 2008

On 4 July 2008 the section on proofs was considerably rewritten. The following discussion is copied from respective talk pages.

Marc, I've completely changed the section (originally mostly due to you, it looks like) in Cayley-Hamilton theorem concerning proofs. I've tried to retain the philosophical points you made, but some of what you said was simply not true, and it was overly opinionated. I'm afraid a lot of the pedagogy of comparing and correcting incorrect proofs has gone by the wayside as a result of the last one. Since you seem to feel strongly about the issue, I thought you would like to know so you could take a look. By the way: do you know of a published source for the proof you wrote (now tidied up a bit and included as the "First proof")? I gave a second proof from Atiyah-MacDonald, a third proof based on one which had appeared on that page some time ago, and a fourth based on some of the comments you made, but these latter two are likewise unreferenced (unreferenceable?). Ryan Reich (talk) 14:22, 4 July 2008 (UTC)

(from User talk:Ryan Reich)

Ryan, I've seen your edits on the Cayley–Hamilton article, and I appreciate that you informed me on my talk page, since indeed I had invested quite a bit of time in the part that you replaced. The text you replaced it with is interesting, but in my opinion not an improvement, even though it seems essentially correct (I have a few gripes but these are not so serious). But I'm probably not the most neutral person to judge, so we'll see how your change fares by other editors' opinions. Let me just say a few things that come to mind.

• Your text frequently (at least four times) mentions the "defining property" of adjugates. It is important, but not "defining". The definition of the adjugate is that its entries are certain minors, and the mentioned property follows from that. If the property were "defining" any zero matrix could have any matrix of the same size as adjugate.
• You say "a lot of the pedagogy of comparing and correcting incorrect proofs has gone by the wayside as a result of the last one". I don't understand the phrase, which one is the last one?
• Your first edit summary mentions a didactic diatribe, but I did not want to push any didactic point. It is just that I think it is really a singular property of the Cayley–Hamilton theorem to inspire false proofs, and it is valid for the article to say so. I've seen many false proofs, some in print; before I edited the article all the proofs given there were false (rest assured, yours are not). Most false proofs are more sophisticated than bluntly substituting A for t in the defining equation of the characteristic polynomial, but they are false anyway. The thing that worries me most about the text you substituted is not that it will mislead people, but that it will soon get replaced by people honestly convinced that they can do better than this.
• While interesting, none of the proofs given gives me the impression that it touches the essence of why Cayley–Hamilton holds (well the final argument invoking Euclidean division comes close, but phrases like the one starting "This incorporates the evaluation map" put me off (frankly I don't understand what it is affirming) and seem to have little to do with Euclidean division). The first three proofs are all quite long, and apart from the first one I doubt there are many wikipedia users that can actually understand them (given that their average level seems to be high school). The second took me long to absorb, and the essential point, that the determinant of B is p(A) needs more thought than is suggested.
• You say some of what I wrote is simply not true. I would appreciate if you were more specific. And then, you could have corrected (and added the proofs you did) without throwing away everything.
• The proof I wrote was loosely based on a book I found in our library (being dazzled by all the wrong proofs I looked for some solid ground), which happened to be an algebra course by Patrice Tauvel (in French); I undid it of some of what seemed to me unnecessary generality for the context at hand (it arrives at the Cayley–Hamilton theorem as a corollary to something more general). The consideration of Euclidean division was a result of discussions with colleagues at our math institute. But later I found much of it also on the French wikipedia (in some indirectly related article I cannot trace right now), so there is no point in claiming (or being accused of) original research here. In fact somebody sent me a paper reviewing some 20 different proofs... It seems like that many people have been thinking about this.
• I regret the disappearance of some points:
  • the observation that the initial naive method not only gives a wrong argument, but also leads to the wrong conclusion (a scalar rather than matrix 0).
  • the observation that confusion arises from confusing unwritten (matrix and scalar) multiplication operations
  • the example that shows how naive substitution leads to genuinely false identities
  • the (before last) expression that shows that the adjugate of A is in fact a polynomial in A (with coefficients taken from the characteristic polynomial of A). This is an important and very general fact, which implies Cayley–Hamilton immediately, without being as easily implied by it.

Marc van Leeuwen (talk) 20:21, 4 July 2008 (UTC)
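A rough sketch of the adjugate points raised in this comment, under stated assumptions: the 3×3 matrix A and the helper names are hypothetical, and the determinant is computed by naive Laplace expansion. It checks the cofactor definition against the property A·adj(A) = det(A)·I, checks on this one matrix that adj(A) is a polynomial in A with coefficients from the characteristic polynomial, and shows why the property alone cannot serve as a definition for the zero matrix.

```python
def minor(M, i, j):
    """M with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]

def det(M):
    """Determinant by Laplace expansion along the first row."""
    if not M:
        return 1                        # determinant of the empty matrix
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j))
               for j in range(len(M)))

def adjugate(M):
    """Entry (i, j) is the cofactor of M at position (j, i)."""
    n = len(M)
    return [[(-1) ** (i + j) * det(minor(M, j, i)) for j in range(n)]
            for i in range(n)]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[2, 0, 1], [1, 3, 0], [0, 1, 4]]   # hypothetical example matrix
d = det(A)
adjA = adjugate(A)
d_times_I = [[d if i == j else 0 for j in range(3)] for i in range(3)]
print(matmul(A, adjA) == d_times_I)     # True

# For n = 3, p(t) = t^3 - tr*t^2 + c1*t - det, and adj(A) = A^2 - tr*A + c1*I,
# where c1 is the sum of the principal 2x2 minors.
tr = sum(A[i][i] for i in range(3))
c1 = sum(A[i][i] * A[j][j] - A[i][j] * A[j][i]
         for i in range(3) for j in range(i + 1, 3))
A2 = matmul(A, A)
poly_in_A = [[A2[i][j] - tr * A[i][j] + (c1 if i == j else 0)
              for j in range(3)] for i in range(3)]
print(poly_in_A == adjA)                # True

# For the zero matrix Z, every X satisfies Z X = det(Z) I = 0, so the
# property does not single out adj(Z); the cofactor definition does.
Z = [[0] * 3 for _ in range(3)]
X = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(matmul(Z, X) == Z, adjugate(Z) == Z)
```

Multiplying the polynomial expression for adj(A) by A and using A·adj(A) = det(A)·I gives p(A) = 0 directly, which is the implication mentioned in the last bullet.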

The reason I keep saying "defining property" is that it's hard to number equations in Wikipedia, and I need some other memorable device to refer back to them. Perhaps I will do as you did and insert (*) next to this one.
"The last one" referred to the last item in the previous sentence (which you didn't copy here), namely, that your text was sometimes opinionated. Basically, it seemed to me that your main goal was to correctly instruct the reader in the art of proving Cayley-Hamilton, and in particular, to push the point that trying to use the evaluation map directly could never work. All of your examples, including one example false proof, made this point; this is the thing that I thought was not right in what you wrote, since in fact it is possible to formulate a proof correctly using the evaluation map (you just have to be more careful, as in the third proof I wrote). The whole effort seemed "didactic" in that it was primarily concerned with correcting a misconception and instructing the reader through numerous but subtly different arguments that the only path to a proof of the Cayley-Hamilton theorem is through "real work" (that phrase really did have to go, by the way).
Concerning the loss of these subtly different arguments: looking back at the last version of the page before I edited it, the two big points you made were: there is no evaluation homomorphism from M[t] to M (in my notation), because A is not in the center of M; and, direct evaluation of the equation
${\displaystyle p(t)=\det(tI_{n}-A)\,}$
at A leads to equating a matrix, p(A), with a scalar, det(0); a sub-point of this is that even replacing t by A on the right requires one to reconcile three things:
1. That t I_n is a diagonal matrix with t on the diagonal, so substituting A gives a matrix with matrix entries;
2. That A itself, as it appears in that expression, is a matrix with scalar entries;
3. That if we interpret t I_n as the "quantity" t multiplied by the matrix I_n, substitution of A for t transforms a scalar multiplication into a matrix multiplication.
You also think that these are among your main points. The one about there being no evaluation homomorphism is the one I think is wrong (given the proper context, and this distinction did make it into my version). As for the others, I actually think that they have been partially retained in my text, although perhaps in an excessively terse form. The first numbered point and its comparison with the second are explicitly in my text right before the first proof. I did miss the opportunity to give the "matrix equals scalar" dichotomy, but there is an obvious place to do so also right before the first proof, and I will make that correction. The point about multiplication is also implicit in the juxtaposition of the second and third proofs, though as you observe, the proofs as a whole are long and detailed, and perhaps extracting their "meaning" is not easy. I will expand the discussion before the proofs in order to reincorporate these points.
The reasons I replaced your analysis of erroneous arguments with just some proofs are that first, I felt that the existence of my third proof invalidated your frequently-expressed assertion that there could be no proof based on the evaluation homomorphism; second, although the above points are worthy ones, they could be easily expressed more briefly in the context of correct proofs, rather than as criticisms of incorrect ones; and third, that what was there concerned itself at least as much with educating the reader as with informing them. The second and third reasons are both related to the nature of the medium here: since an article is not a discussion, the false arguments you shoot down are more of the nature of a straw man than a real opposing position; and since it is also not a page in a textbook, the instruction you provide doesn't reside in the proper context for it to be received as intended.
I didn't mention original research because I think that any attempt to be philosophical about the proof of any theorem borders on it (comparison of proofs is not a major mathematical activity, although you say that for this theorem, it may be). I believe that a discussion such as you wrote is a good idea, but that to have the discussion in full requires a more scholarly medium.
This also goes for the proofs I included. I liked the third one much more before I started to write it up, at which time I realized that the story for Z is not as nice as I had thought, since it is not commutative. The whole thing ends up being a little technical, whereas the concept, which is to restrict the evaluation homomorphism to a context in which it is a homomorphism but still does the job intended for it, is quite simple. The fourth proof based on your Euclidean division idea is much more elegant. You say that you don't feel like any of the proofs gets at the "why" of the theorem, except maybe the fourth, but I think that the second one is really the best (this is somehow to be expected, given the author). My reason is that since (as we both agree is essential) p(A) is to be the zero matrix, and since matrices are naturally endomorphisms of vector spaces, this should be verified by considering its action on a vector space, and not simply by having p(A) = 0 pop out of a piece of algebraic machinery. Using the evaluation homomorphism is an elegant trick, but polynomial algebras are at their core a piece of algebraic machinery, a formal device; in the second proof, the matrix B is literally the matrix-with-matrix-entries which is t I_n - A, evaluated at A, and with the A already appearing considered as having matrix entries (which are all scalar matrices). This interpretation is consistent with the idea of using "actions on vector spaces" in the proof. That det(B) = p(A) is clear once this point is made; perhaps it needs to be made better, but I think this proof (which, unlike the last two, is sourced) is an important part of the philosophy of this theorem.
As for the fact that Adj(A) is a polynomial in A, of course, I did make that point in the fourth proof. I didn't mention that the coefficients are those of the characteristic polynomial, though that would of course give still a third way of using the Euclidean division technique to prove the theorem (the first two are: do division, observe that the remainder is p(A), and also that the remainder must be zero; do division, observe that the quotient is in k[A][t], and then show that the evaluation map is a homomorphism).
The main thing I'm getting out of this discussion is that the theorem is even more interesting than I had thought. What is this paper with the 20 proofs in it? I'd like to see it. Ryan Reich (talk) 17:13, 6 July 2008 (UTC)

(end of copied discussion) Marc van Leeuwen (talk) 12:31, 8 July 2008 (UTC)

I think we have sufficient common ground to cooperate productively. I may do some editing to restore some of the previous version without pushing any point you don't like; I see that version had plenty of defects, among them being too verbose. But I think we more or less agree that most algebraic proofs have the curious property of looking like attempts to justify by some elaborate arguments a simple but incorrect argument. I don't think though that the incorrect argument is necessarily the straight substitution into the definition of the characteristic polynomial. The second proof does seem to try to make sense out of that, but it also has to invoke adjugate matrices and the original punchline seems to get lost in the details. But I think a greater temptation is to proceed as in the third proof, and cut short by using evaluation without the necessary precaution. The current text does warn against that, but the formulation is still somewhat confused: "At this point, it is tempting to consider...likewise not equal" makes it seem like there is one natural map ev_A (which fails to be a homomorphism); however there are two equally natural maps, left and right evaluation, neither of which is a homomorphism. The given ev_A is right evaluation, probably due to the fact of thinking of M[t] as polynomials in t with matrix coefficients written to the left, but one could equally well write the powers of t on the left before replacing them by A, giving left evaluation.
But anyway, I think the reason it is so tempting to think evaluation is well defined (and a homomorphism) is that most of us learn about polynomials in a commutative context, and worse, learn about polynomial functions before learning about polynomials as algebraic objects; the very notation p(A) illustrates the mental identification of polynomials with polynomial functions, and all of this makes a very bad preparation for handling polynomials with non-commuting coefficients with the proper caution. Note that even though one could define "polynomial functions" of a non-commutative ring to itself as those defined by left- or right-evaluation of a polynomial with coefficients in the ring, such functions do not behave algebraically as the polynomials do (and in fact a product of such "polynomial functions" is usually not a polynomial function), so this is really an idea to put out of one's thoughts when considering polynomials over a non-commutative ring. I think some warning about this danger before setting up a proof would be in place, though maybe not with as much emphasis on possible wrong proofs as in the old version of the article.
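A minimal sketch of the failure described above, assuming polynomials in M[t] are stored as lists of matrix coefficients in increasing degree (so t commutes with every coefficient). The names `right_eval` and `left_eval` and the matrices A and N are hypothetical choices for illustration.

```python
def zeros(n):
    return [[0] * n for _ in range(n)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def poly_mul(P, Q):
    """Product in M[t]: the coefficient of t^m is the sum of P_i Q_j, i+j=m."""
    n = len(P[0])
    R = [zeros(n) for _ in range(len(P) + len(Q) - 1)]
    for i, Pi in enumerate(P):
        for j, Qj in enumerate(Q):
            R[i + j] = matadd(R[i + j], matmul(Pi, Qj))
    return R

def right_eval(P, A):
    """Sum of P_k A^k (powers of A written to the right of coefficients)."""
    total, power = zeros(len(A)), identity(len(A))
    for coeff in P:
        total = matadd(total, matmul(coeff, power))
        power = matmul(power, A)
    return total

def left_eval(P, A):
    """Sum of A^k P_k (powers of A written to the left of coefficients)."""
    total, power = zeros(len(A)), identity(len(A))
    for coeff in P:
        total = matadd(total, matmul(power, coeff))
        power = matmul(power, A)
    return total

A = [[0, 1], [0, 0]]
N = [[0, 0], [1, 0]]                    # does not commute with A
P = [zeros(2), identity(2)]             # the polynomial I*t
Q = [N]                                 # the constant polynomial N

PQ = poly_mul(P, Q)                     # equals N*t
print(right_eval(PQ, A))                           # N A = [[0, 0], [0, 1]]
print(matmul(right_eval(P, A), right_eval(Q, A)))  # A N = [[1, 0], [0, 0]]

QP = poly_mul(Q, P)                     # also N*t
print(left_eval(QP, A))                            # A N = [[1, 0], [0, 0]]
print(matmul(left_eval(Q, A), left_eval(P, A)))    # N A = [[0, 0], [0, 1]]
```

Both maps fail to be multiplicative on this pair, and the failure disappears as soon as the coefficients are restricted to matrices commuting with A.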

Actually I would prefer changing the order of the proofs to first, third, second, as I think the (current) second proof is really the hardest on the neurons, with all the respect I have for its authors. Here are some reasons why I think it is actually the most abstract of all. It considers a matrix with matrix entries, but this is not to be considered as a block matrix, since the determinant of a block matrix is not the usual determinantal expression in terms of the blocks (which does not even have a well defined meaning). So the matrix B must be thought of as a matrix in abstract indivisible objects (say endomorphisms of our vector space), so that the determinant is just the determinant of those abstract values. However that only makes sense if the values live in a commutative ring, so one must observe that coefficients live in a commutative subring of endomorphisms before the notion of determinant even makes sense (the current text does make this point, but for the existence of the adjugate; what I would want to add is that in order for det(B) to even make sense this is a prerequisite). In fact I have the impression the motivation for introducing this strange matrix B is not so much to give a justification to the apparently bizarre operation of substituting a matrix into a matrix coefficient, as to define some abstract matrix whose determinant evaluates to p(A). Then the relation ${\displaystyle \sum _{i}B_{i,j}e_{i}=0}$ looks like it is a 1-line array (e_1 ... e_n) of abstract vectors right-multiplied by a column of B, that is a 1-column array of endomorphisms, through the usual matrix multiplication rule, but with products of "coefficients" e_i and B_{i,j} evaluated by applying (the endomorphism) B_{i,j} to (the vector) e_i, which is quite a mental exercise. I know, one does not have to interpret it like that, it is just a sum of endomorphism-applied-to-vector values, but in that view the fact that the B_{i,j} are arranged into a matrix seems irrelevant.
In any case I think trying to see the "sense" of the manipulations of that proof is not easy. Marc van Leeuwen (talk) 04:15, 9 July 2008 (UTC)
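To make the matrix B a little more concrete, here is a sketch under an assumed index convention, which may differ from the article's by a transpose: B_{i,j} = δ_{i,j}·A − a_{i,j}·I, so that the sum over i of B_{i,j} applied to e_i is zero for each j. The 2×2 matrix A is hypothetical. All entries of B are polynomials in A, so they commute and the usual 2×2 determinant formula is legitimate.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(M, v):
    """Apply the endomorphism M to the vector v."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]

n = 2
A = [[1, 2], [3, 4]]                    # hypothetical example matrix
I = [[1, 0], [0, 1]]
e = [[1, 0], [0, 1]]                    # standard basis vectors e_0, e_1

def entry(i, j):
    """B[i][j] = delta_ij * A - a_ij * I (assumed convention)."""
    return [[(A[r][c] if i == j else 0) - A[i][j] * I[r][c]
             for c in range(n)] for r in range(n)]

B = [[entry(i, j) for j in range(n)] for i in range(n)]

# det(B) = B00*B11 - B01*B10, computed in the commutative ring k[A].
P1 = matmul(B[0][0], B[1][1])
P2 = matmul(B[0][1], B[1][0])
det_B = [[P1[r][c] - P2[r][c] for c in range(n)] for r in range(n)]
print(det_B)                            # [[0, 0], [0, 0]], i.e. p(A) = 0

# The relation: for each column j, the sum over i of B[i][j] applied to e_i.
relations = []
for j in range(n):
    total = [0] * n
    for i in range(n):
        total = [x + y for x, y in zip(total, apply(B[i][j], e[i]))]
    relations.append(total)
print(relations)                        # [[0, 0], [0, 0]]
```

Expanding det(B) by hand gives A² − 5A − 2I, which is p(A) for this A, so the determinant of B is indeed the evaluated characteristic polynomial.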

The second proof (and the third, for that matter) turned out quite a bit more involved when I wrote it than I had anticipated. The version given in Atiyah-MacDonald neglects to mention the commutativity controversy, and therefore gives something of a false impression of completeness (I guess it's something they can assume any experienced mathematician would fill in automatically). The current third proof really does work better right after the first one, since it employs the same basic object. As for the erroneous arguments: it would probably work well to place a brief discussion of what could go wrong and how it is averted before each proof, or within it (for example, I think the existing discussion of the evaluation map in the third proof works well where it is). I may not have time to work on this page for a few days, though. Ryan Reich (talk) 02:35, 10 July 2008 (UTC)

## The topological proof

given here is actually valid for matrices over any field, not just the complex numbers, since over any algebraically closed field, the diagonalizable matrices always form a dense subset in the Zariski topology. Liransh Talk 15:30, 10 April 2009 (UTC)

## "Determinant-free" proof

Sheldon Axler, in his (very nice, it won an award) American Mathematical Monthly article "Down with Determinants!", and presumably also in his later book Linear Algebra Done Right, gives a short proof of the Cayley-Hamilton theorem using a definition of the characteristic polynomial that avoids determinants entirely. Is it equivalent to one of the proofs here? If not, I wonder (but doubt) whether it's possible to incorporate the proof somehow, without having to re-prove all the five pages of theorems beforehand. Shreevatsa (talk) 08:41, 29 September 2009 (UTC)

I've just rapidly read the article. While the paper is interesting, I cannot consider the result it calls "Cayley-Hamilton theorem" to be the same as the one this wikipedia article is about. That is because the notion of characteristic polynomial it uses, being defined without mentioning determinants, is very hard to match with the formal algebraic object defined here to be the characteristic polynomial. For one thing the article depends quite essentially on the complex numbers. It would be rather hard to deduce from its approach what the characteristic polynomial of a matrix with entries in, for instance, a finite field (or in the integers, or in a commutative ring with zero divisors) would have to be, and why it takes coefficients in the same field (or commutative ring) as the matrix; all this is immediately obvious from the "determinant" definition. Also one has to read all the way to the end of the article, where the usual definition of determinants is finally "deduced", to understand why the coefficients of the characteristic polynomial depend in a polynomial (or even continuous) fashion on the coefficients of the matrix. In fact this is the fundamental reason why the characteristic polynomial, rather than the minimal polynomial (which the article naturally comes upon first), is of crucial importance for the theory. In the paper the characteristic polynomial comes across as a somewhat artificially "blown up" version of the minimal polynomial (by replacing the numbers ${\displaystyle \alpha _{i}}$ by the dimensions of generalized eigenspaces), making the fact that it is divisible by the minimal polynomial (i.e., the "Cayley-Hamilton theorem") more or less built into the definition. So in short, the answers to your questions are no and no (even if one does re-prove those five pages). Marc van Leeuwen (talk) 21:02, 29 September 2009 (UTC)
Hmm, good point. Elsewhere, he says:

Another of the comments on this blog asks how to deal with linear operators on vector spaces over non-complete fields. There are two ways to do this: (1) Embed the non-complete field in its algebraic completion, as I do in the paper or (2) Give up on factoring a polynomial into linear factors and work directly with the non-complete field; the techniques for doing this are illustrated in Chapter 9 (titled “Operators on Real Vector Spaces”) of my book.

So it may be possible to extend the approach, but it doesn't seem utopian anymore. :-) Shreevatsa (talk) 02:28, 30 September 2009 (UTC)

## one more proof?

One proof that is easy to follow is based on the similarity of any complex matrix to an upper-triangular one. This is the one I would choose to lecture engineers or scientists on this subject.

Thanks! Gold gerry (talk) 18:53, 18 January 2010 (UTC)
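A small sketch of the key step in such a proof, assuming a hypothetical upper-triangular matrix T: its characteristic polynomial is the product of (t − T_ii), and the corresponding product of matrices (T − T_ii·I) vanishes. The general statement then follows because similar matrices share a characteristic polynomial; the code only checks one example.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shifted(T, lam):
    """Return T - lam * I."""
    n = len(T)
    return [[T[i][j] - (lam if i == j else 0) for j in range(n)]
            for i in range(n)]

T = [[1, 2, 3], [0, 4, 5], [0, 0, 6]]   # hypothetical upper-triangular matrix
n = len(T)

# p(T) = (T - 1*I)(T - 4*I)(T - 6*I), one factor per diagonal entry.
product = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
for lam in (T[i][i] for i in range(n)):
    product = matmul(product, shifted(T, lam))
print(product)                          # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```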

## Diagonalisable matrices *are* dense in GL(n,R)

In the Preliminaries section, it is said that "for matrices with real entries the diagonizable ones do not form a dense set". This is wrong. If ${\displaystyle M=PJP^{-1}}$ is a Jordan decomposition of a real square matrix ${\displaystyle M}$, one can always perturb ${\displaystyle J}$ by ${\displaystyle D={\textrm {diag}}(\epsilon ,\epsilon ^{2},...,\epsilon ^{n})}$ for an arbitrarily small ${\displaystyle \epsilon >0}$ to obtain a matrix ${\displaystyle J+D}$ with distinct eigenvalues, and obtain in turn a diagonalizable real square matrix ${\displaystyle P(J+D)P^{-1}}$. The suffocated (talk) 14:09, 22 April 2010 (UTC)

This is not true. Rotation of R² by ninety degrees gives an example of a real matrix that cannot be approximated by real diagonalizable matrices. Algebraist 16:03, 22 April 2010 (UTC)
But it is complex-diagonalisable. I mean, the original text is confusing because the set of complex-diagonalisable real matrices is indeed dense in GL(n,R). You just cannot perform diagonalisation over the real field, but the proof can still work without modification. Even if you insist on doing arithmetic over R, you can still follow the same idea and approximate the matrix by a direct sum of real scalars and 2x2 blocks (modulo a similarity transform). So you just need an additional step to prove the theorem for the 2x2 case. The suffocated (talk) 19:31, 22 April 2010 (UTC)
That's what that line in the article says, isn't it? You can prove Cayley–Hamilton for real matrices by extending to the complexes and diagonalizing, but it's a rather unsatisfactory proof. Algebraist 20:03, 22 April 2010 (UTC)
What I'm saying is: (1) the line is imprecise as one may (actually I did) mistake it as saying "the subset of all complex-diagonalisable matrices in GL(n,R) is not dense in GL(n,R)"; (2) the proof is unsatisfactory, but not because it does not work on R. It does work on R without extending to the complex field. All you need is to add one small step to verify the theorem for scalar multiples of 2x2 rotation matrices (which is easy). On second thought, I also think that dismissing the proof for complex matrices as "strange" is ... strange. I′ll bet most people will find this proof more natural and intuitive than the others, because it reduces the whole theorem to the (trivial) scalar case. Also, continuity arguments are very useful and common tricks in matrix analysis. Transforming one problem into another easier one is also a usual practice in mathematics. If one solves a differential equation by using Fourier transform, shall we say that working on the frequency domain is strange? ;-D The suffocated (talk) 21:57, 22 April 2010 (UTC)
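Both sides of this exchange can be checked numerically on the 2×2 case. The sketch below uses the discriminant tr² − 4·det of the characteristic polynomial, which equals (a − d)² + 4bc: when it is negative the matrix has no real eigenvalues and so is not diagonalizable over R. The perturbation size 0.1 and the scale factor 3 are arbitrary illustrative choices.

```python
from itertools import product

def discriminant(M):
    """Discriminant of the characteristic polynomial of a 2x2 matrix."""
    (a, b), (c, d) = M
    trace = a + d
    det = a * d - b * c
    return trace * trace - 4 * det

R = [[0, -1], [1, 0]]                   # rotation of the plane by 90 degrees
print(discriminant(R))                  # -4

# Perturb every entry by -0.1, 0 or +0.1: the discriminant stays negative,
# so none of these nearby real matrices is diagonalizable over R.
eps = 0.1
worst = max(
    discriminant([[R[0][0] + e1, R[0][1] + e2], [R[1][0] + e3, R[1][1] + e4]])
    for e1, e2, e3, e4 in product((-eps, 0.0, eps), repeat=4)
)
print(worst < 0)                        # True

# The "one small step": Cayley-Hamilton for a scaled rotation s*R, whose
# characteristic polynomial is t^2 + s^2.
s = 3
M = [[s * x for x in row] for row in R]
M2 = [[sum(M[i][k] * M[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]
p_of_M = [[M2[i][j] + (s * s if i == j else 0) for j in range(2)]
          for i in range(2)]
print(p_of_M)                           # [[0, 0], [0, 0]]
```

The grid only samples a few perturbations; the bound (a − d)² + 4bc ≤ 0.04 − 3.24 for all perturbations of size at most 0.1 is what actually shows the whole neighbourhood avoids real diagonalizability.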

## A 'Non-proof'?

What is this ludicrous thing doing in the article? —Preceding unsigned comment added by 95.181.12.52 (talk) 09:29, 22 May 2010 (UTC)

I am not sure if 'ludicrous' refers to the phrase 'non-proof' or the wrong proof itself. If you are asking what purpose does the wrong proof serve, I believe it is here because its validity is the most confusing part for novices. See, for example, the discussion thread ″I_may_be_stupid_but...″ in the above. The suffocated (talk) 13:09, 26 May 2010 (UTC)

## Unclear explanation of the bogus proof

There seem to be people who have already spent a lot of time on this, so I won't change anything and set off an edit war, but the explanation of the bogus "proof" seems to be needlessly obfuscated. This is really a simple issue with a simple solution: people don't understand the notation p(A), so we should write out p(A) explicitly. As currently written, this section never points out what p(A) actually is. This seems absurd to me. Note that p(A) is currently the fifth displayed line of this section, but the accompanying text gives the impression that this line is some sort of error. (More precisely, that line would be p(A), if the given text didn't re-interpret the matrix as a 4x4 matrix instead of a 2x2 matrix with entries in R[A].) Some of the other stuff here is worthwhile -- I like pointing out that the bogus proof tries to compare a scalar to a matrix, since I've seen too many of my students try to equate things that aren't even the same type of object -- but we should lead with the simple explanation "p(A) is THIS, not THAT." 165.123.215.110 (talk) 04:36, 10 February 2013 (UTC)

I suppose it is always hard to clearly steer the reader to take the received wrong steps correctly! The correct 2x2 matrix definition of p(A) is already in the last displayed eqn of the Example, section 1, at the top of the article, indeed, the takeout message of the whole article. I would argue against encouraging the confused reader to take the "right two left turns" to get to the proper p(A) here--it might be a good idea to not mix the right with the wrong in the bogus section. If the reader knew how to properly and safely compute and interpret non-scalar-component matrices, as you assume, she/he need not be here, in the first place! Any discussion of entries in R[A] would compound the confusion. Cuzkatzimhut (talk) 13:12, 10 February 2013 (UTC)