Rather than redefining evaluation for each of these cases,
we should map our polynomial into a structure compatible with how we want to evaluate it.
Essentially, this means that from a polynomial in the base structure,
we can derive polynomials in these other structures.
In particular, we can either have a matrix of polynomials or a polynomial in matrices.
<!-- TODO: notes about functoriality of `fmap`ping eval vs -->
:::: {layout-ncol="2"}
::: {}
$$
\begin{align*}
p &: K[x]
\\
p(x) &= x^n + p_{n-1}x^{n-1} + ...
\\
\phantom{= p} & + p_1 x + p_0
\end{align*}
$$
:::
::: {}
$x$ is a scalar indeterminate
```haskell
p :: Polynomial K
```
:::
::::
:::: {layout-ncol="2"}
::: {}
$$
\begin{align*}
P &: (K[x])^{m \times m}
\\
P(x I) &= (x I)^n + (p_{n-1})(x I)^{n-1} + ...
\\
& + p_1(x I)+ p_0 I
\end{align*}
$$
:::
::: {}
$x$ is a scalar indeterminate, $P(x I)= p(x) I$ is a matrix of polynomials in $x$
```haskell
asPolynomialMatrix
:: Polynomial K -> Matrix (Polynomial K)
pMat :: Matrix (Polynomial K)
pMat = asPolynomialMatrix p
```
:::
::::
:::: {layout-ncol="2"}
::: {}
$$
\begin{align*}
\hat P &: K^{m \times m}[X]
\\
\hat P(X) &= X^n + (p_{n-1}I)X^{n-1} + ...
\\
& + (p_1 I) X + (p_0 I)
\end{align*}
$$
:::
::: {}
$X$ is a matrix indeterminate, $\hat P(X)$ is a polynomial over matrices
```haskell
asMatrixPolynomial
:: Polynomial K -> Polynomial (Matrix K)
pHat :: Polynomial (Matrix K)
pHat = asMatrixPolynomial p
```
:::
::::
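As a minimal sketch of the two derived structures, the conversions might look like the following. The concrete representations here (a polynomial as its coefficient list, a matrix as a list of rows, and an explicit size argument `m`) are assumptions for illustration, not the article's actual types:

```haskell
-- Stand-in types, assumed for this sketch: a polynomial is its coefficient
-- list [p0, p1, ..., pn], and a square matrix is a list of rows.
newtype Polynomial a = Polynomial [a] deriving (Eq, Show)
newtype Matrix a = Matrix [[a]] deriving (Eq, Show)

-- The scalar matrix a*I of size m.
scalarMatrix :: Num a => Int -> a -> Matrix a
scalarMatrix m a =
  Matrix [ [ if i == j then a else 0 | j <- [1 .. m] ] | i <- [1 .. m] ]

-- P = p * I: an m-by-m diagonal matrix whose diagonal entries are p itself.
asPolynomialMatrix :: Num a => Int -> Polynomial a -> Matrix (Polynomial a)
asPolynomialMatrix m p =
  Matrix [ [ if i == j then p else Polynomial [0] | j <- [1 .. m] ]
         | i <- [1 .. m] ]

-- p-hat: each scalar coefficient p_k is lifted to the scalar matrix p_k * I.
asMatrixPolynomial :: Num a => Int -> Polynomial a -> Polynomial (Matrix a)
asMatrixPolynomial m (Polynomial cs) = Polynomial (map (scalarMatrix m) cs)
```

For $p(x) = x^2 + x + 1$ and $m = 2$, `asMatrixPolynomial 2 p` produces the coefficient list $[I, I, I]$, while `asPolynomialMatrix 2 p` produces the diagonal matrix with $p$ in both diagonal entries.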
### Cayley-Hamilton Theorem
When evaluating the characteristic polynomial of a matrix *with* that matrix,
something strange happens.
Continuing from the previous article, using $x^2 + x + 1$ and its companion matrix, we have:
$$
\begin{gather*}
p(x) = x^2 + x + 1 \qquad C_{p} = C
= \left( \begin{matrix}
0 & 1 \\
-1 & -1
\end{matrix} \right)
\\ \\
\hat P(C) = C^2 + C + (1 \cdot I)
= \left( \begin{matrix}
-1 & -1 \\
1 & 0
\end{matrix} \right)
+ \left( \begin{matrix}
0 & 1 \\
-1 & -1
\end{matrix} \right)
+ \left( \begin{matrix}
1 & 0 \\
0 & 1
\end{matrix} \right)
\\ \\
= \left( \begin{matrix}
0 & 0 \\
0 & 0
\end{matrix} \right)
\end{gather*}
$$
The result is the zero matrix.
This tells us that, at least in this case, the matrix *C* is a root of its own characteristic polynomial.
By the [Cayley-Hamilton theorem](https://en.wikipedia.org/wiki/Cayley%E2%80%93Hamilton_theorem),
this is true in general, no matter the degree of *p*, no matter its coefficients,
and importantly, no matter the choice of field.
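The worked example can be checked numerically. The row-list matrix helpers below are a bare-bones sketch (not the article's library), with just enough arithmetic to evaluate $\hat P(C) = C^2 + C + I$:

```haskell
import Data.List (transpose)

-- A bare-bones sketch: a 2x2 matrix as a list of integer rows.
type M = [[Integer]]

mAdd, mMul :: M -> M -> M
mAdd = zipWith (zipWith (+))
mMul a b = [ [ sum (zipWith (*) row col) | col <- transpose b ] | row <- a ]

iden, c :: M
iden = [[1, 0], [0, 1]]
c    = [[0, 1], [-1, -1]]   -- companion matrix of x^2 + x + 1

-- p-hat(C) = C^2 + C + 1*I, evaluated directly; this is the zero matrix.
pHatAtC :: M
pHatAtC = (c `mMul` c) `mAdd` c `mAdd` iden
```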
This is more powerful than it would otherwise seem.
For one, factoring a polynomial "inside" a matrix turns out to give the same answer
as factoring a polynomial over matrices.
:::: {layout-ncol="2"}
::: {}
$$
\begin{gather*}
P(xI) = \left( \begin{matrix}
x^2 + x + 1 & 0 \\
0 & x^2 + x + 1
\end{matrix}\right)
\\ \\
= (xI - C)(xI - C')
\\ \\
= \left( \begin{matrix}
x & -1 \\
1 & x + 1
\end{matrix} \right)
\left( \begin{matrix}
x - a & -b \\
-c & x - d
\end{matrix} \right)
\\ \\
\begin{align*}
x(x - a) + c &= x^2 + x + 1
\\
\textcolor{green}{x(-b) - (x - d)} &\textcolor{green}{= 0}
\\
\textcolor{blue}{(x - a) + (x + 1)(-c)} &\textcolor{blue}{= 0}
\\
(-b) + (x + 1)(x - d) &= x^2 + x + 1
\end{align*}
\\ \\
\textcolor{green}{(-b -1)x +d = 0} \implies b = -1, ~ d = 0 \\
\textcolor{blue}{(1 - c)x - a - c = 0} \implies c = 1, ~ a = -1
\\ \\
C' =
\left( \begin{matrix}
-1 & -1 \\
1 & 0
\end{matrix} \right)
\end{gather*}
$$
:::
::: {}
$$
\begin{gather*}
\hat P(X) = X^2 + X + 1 \cdot I
\\[10pt]
= (X - C)(X - C')
\\[10pt]
= X^2 - (C + C')X + CC'
\\[10pt]
\implies
\\[10pt]
C + C' = -I, ~ C' = -I - C
\\[10pt]
CC' = I, ~ C^{-1} = C'
\\[10pt]
C' = \left( \begin{matrix}
-1 & -1 \\
1 & 0
\end{matrix} \right)
\end{gather*}
$$
:::
::::
It's important to note that a matrix factorization is not unique.
*Any* matrix with a given characteristic polynomial can be used as a root of that polynomial.
Of course, choosing one root affects the other matrix roots.
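The coefficient-matching on the right-hand side is easy to verify directly. The helpers below are the same minimal row-list sketch used for illustration, not the article's library:

```haskell
import Data.List (transpose)

type M = [[Integer]]

mMul :: M -> M -> M
mMul a b = [ [ sum (zipWith (*) row col) | col <- transpose b ] | row <- a ]

c, c', iden, negI :: M
c    = [[ 0,  1], [-1, -1]]
c'   = [[-1, -1], [ 1,  0]]
iden = [[ 1,  0], [ 0,  1]]
negI = [[-1,  0], [ 0, -1]]

-- Matching coefficients gave C + C' = -I and CC' = I (so C' = C^-1).
sumIsNegI, productIsIden :: Bool
sumIsNegI     = zipWith (zipWith (+)) c c' == negI
productIsIden = mMul c c' == iden
```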
### Moving Roots
All matrices commute with the identity and zero matrices.
A less obvious fact is that all of the matrix roots *also* commute with one another.
By the Fundamental Theorem of Algebra,
[Vieta's formulas](https://en.wikipedia.org/wiki/Vieta%27s_formulas) state:
$$
\begin{gather*}
\hat P(X)
= \prod_{[i]_n} (X - \Xi_i)
= (X - \Xi_0) (X - \Xi_1)...(X - \Xi_{n-1})
\\
= \left\{ \begin{align*}
& \phantom{+} X^n
\\
& - (\Xi_0 + \Xi_1 + ... + \Xi_{n-1}) X^{n-1}
\\
& + (\Xi_0 \Xi_1+ \Xi_0 \Xi_2 + ... + \Xi_0 \Xi_{n-1} + \Xi_1 \Xi_2 + ... + \Xi_{n-2} \Xi_{n-1})X^{n-2}
\\
& \qquad \vdots
\\
& + (-1)^n \Xi_0 \Xi_1 \Xi_2...\Xi_{n-1}
\end{align*} \right.
\\
= X^n -\sigma_1([\Xi]_n)X^{n-1} + \sigma_2([\Xi]_n)X^{n-2} + ... + (-1)^n \sigma_n([\Xi]_n)
\end{gather*}
$$
The product range \[*i*\]~*n*~ means that the factors are taken in order over the given index, from 0 to *n* - 1.
On the bottom line, *σ* are
[elementary symmetric polynomials](https://en.wikipedia.org/wiki/Elementary_symmetric_polynomial)
and \[*Ξ*\]~*n*~ is the list of root matrices from *Ξ*~*0*~ to Ξ~*n-1*~.
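A direct way to compute these *σ*~*k*~ while respecting the order of \[*Ξ*\]~*n*~ (which matters when the roots are matrices that need not commute) is to sum the products over index-increasing subsequences. This helper is a sketch, not code from the article:

```haskell
-- sigma k xs: the k-th elementary symmetric polynomial of the list xs,
-- with every product taken in list order -- important when the elements
-- are matrices that need not commute. A sketch, not the article's code.
sigma :: Num a => Int -> [a] -> a
sigma k xs = sum [ product ys | ys <- subseqs k xs ]
  where
    subseqs 0 _        = [[]]
    subseqs _ []       = []
    subseqs n (y : ys) = [ y : zs | zs <- subseqs (n - 1) ys ] ++ subseqs n ys
```

For scalars, `sigma 2 [a, b, c]` is `a*b + a*c + b*c`, exactly the pairwise products singled out above.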
By factoring the matrix with the roots in a different order, we get another factorization.
It suffices to focus only on *σ*~2~, which contains all pairwise products.
$$
\begin{gather*}
\pi \in S_n
\\
\qquad
\pi \circ \hat P(X) = \prod_{\pi ([i]_n)} (X - \Xi_i)
\\ \\
= X^n
- \sigma_1 \left(\pi ([\Xi]_n) \vphantom{^{1}} \right)X^{n-1}
+ \sigma_2 \left(\pi ([\Xi]_n) \vphantom{^{1}} \right)X^{n-2} + ...
+ (-1)^n \sigma_n \left(\pi ([\Xi]_n) \vphantom{^{1}} \right)
\\ \\
\\ \\
(0 ~ 1) \circ \hat P(X) = (X - \Xi_{1}) (X - \Xi_0)(X - \Xi_2)...(X - \Xi_{n-1})
\\
= X^n + ... + \sigma_2(\Xi_1, \Xi_0, \Xi_2, ...,\Xi_{n-1})X^{n-2} + ...
\\ \\ \\ \\
\begin{array}{}
e & (0 ~ 1) & (1 ~ 2) & ... & (n-2 ~~ n-1)
\\ \hline
\textcolor{red}{\Xi_0 \Xi_1} & \textcolor{red}{\Xi_1 \Xi_0} & \Xi_0 \Xi_1 & & \Xi_0 \Xi_1
\\
\Xi_0 \Xi_2 & \Xi_0 \Xi_2 & \Xi_0 \Xi_2 & & \Xi_0 \Xi_2
\\
\Xi_0 \Xi_3 & \Xi_0 \Xi_3 & \Xi_0 \Xi_3 & & \Xi_0 \Xi_3
\\
\vdots & \vdots & \vdots & & \vdots
\\
\Xi_0 \Xi_{n-1} & \Xi_0 \Xi_{n-1} & \Xi_{0} \Xi_{n-1} & & \Xi_0 \Xi_{n-1}
\\
\textcolor{green}{\Xi_1 \Xi_2} & \Xi_1 \Xi_2 & \textcolor{green}{\Xi_2 \Xi_1} & & \Xi_1 \Xi_2
\\
\vdots & \vdots & \vdots & & \vdots
\\
\textcolor{blue}{\Xi_{n-2} \Xi_{n-1}} & \Xi_{n-2} \Xi_{n-1} & \Xi_{n-2} \Xi_{n-1} & & \textcolor{blue}{\Xi_{n-1} \Xi_{n-2}}
\end{array}
\end{gather*}
$$
<!-- TODO: permutation -->
The "[path swaps](/posts/permutations/1/)" shown commute only the adjacent elements.
By contrast, the permutation (0 2) commutes *Ξ*~0~ past both *Ξ*~1~ and *Ξ*~2~.
But since we already know *Ξ*~0~ and *Ξ*~1~ commute by the above list,
we learn at this step that *Ξ*~0~ and *Ξ*~2~ commute.
This can be repeated until we reach the permutation (0 *n*-1) to prove commutativity between all pairs.
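For the running 2×2 example the claim is easy to check numerically: the two roots *C* and *C'* of $x^2 + x + 1$ commute. The row-list helpers are again a sketch, not the article's library:

```haskell
import Data.List (transpose)

type M = [[Integer]]

mMul :: M -> M -> M
mMul a b = [ [ sum (zipWith (*) row col) | col <- transpose b ] | row <- a ]

c, c' :: M
c  = [[ 0,  1], [-1, -1]]   -- companion matrix of x^2 + x + 1
c' = [[-1, -1], [ 1,  0]]   -- the other matrix root found above

-- Both products equal the identity, so in particular they are equal.
rootsCommute :: Bool
rootsCommute = mMul c c' == mMul c' c
```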