Rather than redefining evaluation for each of these cases,
we should map our polynomial into a structure compatible with how we want to evaluate it.
Essentially, this means that from a polynomial in the base structure,
we can derive polynomials in these other structures.
In particular, we can either have a matrix of polynomials or a polynomial in matrices.

<!-- TODO: notes about functoriality of `fmap`ping eval vs -->

:::: {layout-ncol="2"}
::: {}
$$
\begin{align*}
p &: K[x]
\\
p(x) &= x^n + p_{n-1}x^{n-1} + ...
\\
\phantom{= p} & + p_1 x + p_0
\end{align*}
$$
:::

::: {}
$x$ is a scalar indeterminate

```haskell
p :: Polynomial K
```
:::
::::

:::: {layout-ncol="2"}
::: {}
$$
\begin{align*}
P &: (K[x])^{m \times m}
\\
P(x I) &= (x I)^n + (p_{n-1})(x I)^{n-1} + ...
\\
& + p_1 (x I) + p_0 I
\end{align*}
$$
:::

::: {}
$x$ is a scalar indeterminate, $P(x I) = p(x) I$ is a matrix of polynomials in $x$

```haskell
asPolynomialMatrix
  :: Polynomial K -> Matrix (Polynomial K)

pMat :: Matrix (Polynomial K)
pMat = asPolynomialMatrix p
```
:::
::::

:::: {layout-ncol="2"}
::: {}
$$
\begin{align*}
\hat P &: K^{m \times m}[X]
\\
\hat P(X) &= X^n + (p_{n-1}I)X^{n-1} + ...
\\
& + (p_1 I) X + (p_0 I)
\end{align*}
$$
:::

::: {}
$X$ is a matrix indeterminate, $\hat P(X)$ is a polynomial over matrices

```haskell
asMatrixPolynomial
  :: Polynomial K -> Polynomial (Matrix K)

pHat :: Polynomial (Matrix K)
pHat = asMatrixPolynomial p
```
:::
::::

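
The text leaves `Polynomial` and `Matrix` abstract. As a rough sketch of the two mappings, assuming hypothetical concrete representations (coefficient lists stored lowest-degree first, matrices as lists of rows, with the dimension passed explicitly; none of these choices come from the text above):

```haskell
-- Hypothetical concrete representations, for illustration only.
type Polynomial k = [k]   -- [p_0, p_1, ..., p_n], lowest degree first
type Matrix k     = [[k]] -- list of rows, assumed square

-- The m-by-m matrix c*I.
scalarMatrix :: Num k => Int -> k -> Matrix k
scalarMatrix m c =
  [ [ if i == j then c else 0 | j <- [1 .. m] ] | i <- [1 .. m] ]

-- A matrix of polynomials: p on the diagonal, the zero polynomial
-- (the empty coefficient list) elsewhere, so that P(xI) = p(x) I.
asPolynomialMatrix :: Int -> Polynomial k -> Matrix (Polynomial k)
asPolynomialMatrix m p =
  [ [ if i == j then p else [] | j <- [1 .. m] ] | i <- [1 .. m] ]

-- A polynomial over matrices: each scalar coefficient p_i
-- becomes the matrix coefficient p_i * I.
asMatrixPolynomial :: Num k => Int -> Polynomial k -> Polynomial (Matrix k)
asMatrixPolynomial m = map (scalarMatrix m)
```

Note that `asMatrixPolynomial` is literally an `fmap` of the coefficient embedding `scalarMatrix m :: k -> Matrix k` over the list of coefficients, which is one concrete instance of the functoriality hinted at in the TODO note.
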
### Cayley-Hamilton Theorem

When evaluating the characteristic polynomial of a matrix *with* that matrix,
something strange happens.
Continuing from the previous article, using $x^2 + x + 1$ and its companion matrix, we have:

$$
\begin{gather*}
p(x) = x^2 + x + 1 \qquad C_{p} = C
= \left( \begin{matrix}
0 & 1 \\
-1 & -1
\end{matrix} \right)
\\ \\
\hat P(C) = C^2 + C + (1 \cdot I)
= \left( \begin{matrix}
-1 & -1 \\
1 & 0
\end{matrix} \right)
+ \left( \begin{matrix}
0 & 1 \\
-1 & -1
\end{matrix} \right)
+ \left( \begin{matrix}
1 & 0 \\
0 & 1
\end{matrix} \right)
\\ \\
= \left( \begin{matrix}
0 & 0 \\
0 & 0
\end{matrix} \right)
\end{gather*}
$$

The result is the zero matrix.
This tells us that, at least in this case, the matrix *C* is a root of its own characteristic polynomial.
By the [Cayley-Hamilton theorem](https://en.wikipedia.org/wiki/Cayley%E2%80%93Hamilton_theorem),
this is true in general, no matter the degree of *p*, no matter its coefficients,
and importantly, no matter the choice of field.
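
This computation is easy to verify mechanically. A minimal sketch with hand-rolled 2×2 integer matrices (the `M2` type and helper names are ad hoc, not from any library):

```haskell
-- Ad hoc 2x2 integer matrices as nested pairs of rows.
type M2 = ((Integer, Integer), (Integer, Integer))

mmul :: M2 -> M2 -> M2
mmul ((a11, a12), (a21, a22)) ((b11, b12), (b21, b22)) =
  ( (a11 * b11 + a12 * b21, a11 * b12 + a12 * b22)
  , (a21 * b11 + a22 * b21, a21 * b12 + a22 * b22) )

madd :: M2 -> M2 -> M2
madd ((a11, a12), (a21, a22)) ((b11, b12), (b21, b22)) =
  ((a11 + b11, a12 + b12), (a21 + b21, a22 + b22))

identity2, c :: M2
identity2 = ((1, 0), (0, 1))
c         = ((0, 1), (-1, -1)) -- companion matrix of x^2 + x + 1

-- \hat P(C) = C^2 + C + I
phatC :: M2
phatC = (c `mmul` c) `madd` c `madd` identity2
-- phatC == ((0, 0), (0, 0)), the zero matrix
```
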

This is more powerful than it would otherwise seem.
For one, factoring a polynomial "inside" a matrix turns out to give the same answer
as factoring a polynomial over matrices.

:::: {layout-ncol="2"}
::: {}

$$
\begin{gather*}
P(xI) = \left( \begin{matrix}
x^2 + x + 1 & 0 \\
0 & x^2 + x + 1
\end{matrix}\right)
\\ \\
= (xI - C)(xI - C')
\\ \\
= \left( \begin{matrix}
x & -1 \\
1 & x + 1
\end{matrix} \right)
\left( \begin{matrix}
x - a & -b \\
-c & x - d
\end{matrix} \right)
\\ \\
\begin{aligned}
x(x - a) + c &= x^2 + x + 1
\\
\textcolor{green}{x(-b) - (x - d)} &\textcolor{green}{= 0}
\\
\textcolor{blue}{(x - a) + (x + 1)(-c)} &\textcolor{blue}{= 0}
\\
(-b) + (x + 1)(x - d) &= x^2 + x + 1
\end{aligned}
\\ \\
\textcolor{green}{(-b - 1)x + d = 0} \implies b = -1, ~ d = 0 \\
\textcolor{blue}{(1 - c)x - a - c = 0} \implies c = 1, ~ a = -1
\\ \\
C' =
\left( \begin{matrix}
-1 & -1 \\
1 & 0
\end{matrix} \right)
\end{gather*}
$$
:::

::: {}
$$
\begin{gather*}
\hat P(X) = X^2 + X + 1 \cdot I
\\[10pt]
= (X - C)(X - C')
\\[10pt]
= X^2 - (C + C')X + CC'
\\[10pt]
\implies
\\[10pt]
C + C' = -I, ~ C' = -I - C
\\[10pt]
CC' = I, ~ C^{-1} = C'
\\[10pt]
C' = \left( \begin{matrix}
-1 & -1 \\
1 & 0
\end{matrix} \right)
\end{gather*}
$$
:::
::::
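
Both columns arrive at the same $C'$, and the two relations $C + C' = -I$ and $CC' = I$ can be confirmed directly. A quick check with ad hoc 2×2 helpers (illustrative names, nothing from a library):

```haskell
-- Ad hoc 2x2 integer matrices as nested pairs of rows.
type M2 = ((Integer, Integer), (Integer, Integer))

mmul :: M2 -> M2 -> M2
mmul ((a11, a12), (a21, a22)) ((b11, b12), (b21, b22)) =
  ( (a11 * b11 + a12 * b21, a11 * b12 + a12 * b22)
  , (a21 * b11 + a22 * b21, a21 * b12 + a22 * b22) )

madd :: M2 -> M2 -> M2
madd ((a11, a12), (a21, a22)) ((b11, b12), (b21, b22)) =
  ((a11 + b11, a12 + b12), (a21 + b21, a22 + b22))

c, c' :: M2
c  = (( 0,  1), (-1, -1)) -- companion matrix of x^2 + x + 1
c' = ((-1, -1), ( 1,  0)) -- the second matrix root

sumIsNegI, productIsI :: Bool
sumIsNegI  = (c `madd` c') == ((-1, 0), (0, -1)) -- C + C' = -I
productIsI = (c `mmul` c') == (( 1, 0), (0,  1)) -- C C'   = I
```
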

It's important to note that a matrix factorization is not unique.
*Any* matrix with a given characteristic polynomial can be used as a root of that polynomial.
Of course, choosing one root affects the other matrix roots.

### Moving Roots

All matrices commute with the identity and zero matrices.
A less obvious fact is that all of the matrix roots *also* commute with one another.
Assuming a full factorization into linear factors, as in the Fundamental Theorem of Algebra,
[Vieta's formulas](https://en.wikipedia.org/wiki/Vieta%27s_formulas) state:

$$
\begin{gather*}
\hat P(X)
= \prod_{[i]_n} (X - \Xi_i)
= (X - \Xi_0) (X - \Xi_1)...(X - \Xi_{n-1})
\\
= \left\{ \begin{aligned}
& \phantom{+} X^n
\\
& - (\Xi_0 + \Xi_1 + ... + \Xi_{n-1}) X^{n-1}
\\
& + (\Xi_0 \Xi_1 + \Xi_0 \Xi_2 + ... + \Xi_0 \Xi_{n-1} + \Xi_1 \Xi_2 + ... + \Xi_{n-2} \Xi_{n-1})X^{n-2}
\\
& \qquad \vdots
\\
& + (-1)^n \Xi_0 \Xi_1 \Xi_2...\Xi_{n-1}
\end{aligned} \right.
\\
= X^n - \sigma_1([\Xi]_n)X^{n-1} + \sigma_2([\Xi]_n)X^{n-2} + ... + (-1)^n \sigma_n([\Xi]_n)
\end{gather*}
$$

The product range \[*i*\]~*n*~ means that the terms are ordered from 0 to *n* - 1 over the index given.
On the bottom line, the *σ* are
[elementary symmetric polynomials](https://en.wikipedia.org/wiki/Elementary_symmetric_polynomial)
and \[*Ξ*\]~*n*~ is the list of root matrices from *Ξ*~0~ to *Ξ*~*n*-1~.
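
For concreteness, the elementary symmetric polynomials have a short recursive definition. A sketch over an ordinary commutative `Num` type (for the matrix roots, the order inside each product only stops mattering once the roots are known to commute, which is what the rest of this section establishes):

```haskell
-- sigma k xs: the sum, over all size-k sublists of xs taken in order,
-- of the products of their elements. sigma 1 is the plain sum,
-- sigma (length xs) the full product.
sigma :: Num a => Int -> [a] -> a
sigma 0 _        = 1
sigma _ []       = 0
sigma k (x : xs) = x * sigma (k - 1) xs + sigma k xs
```

For the roots 1, 2, 3 of $(x-1)(x-2)(x-3) = x^3 - 6x^2 + 11x - 6$, this gives `sigma 1` = 6, `sigma 2` = 11, `sigma 3` = 6, matching the alternating coefficients in the expansion above.
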

By factoring the matrix with the roots in a different order, we get another factorization.
It suffices to focus only on *σ*~2~, which has all pairwise products.

$$
\begin{gather*}
\pi \in S_n
\\
\qquad
\pi \circ \hat P(X) = \prod_{\pi ([i]_n)} (X - \Xi_i)
\\ \\
= X^n
- \sigma_1 \left(\pi ([\Xi]_n) \vphantom{^{1}} \right)X^{n-1}
+ \sigma_2 \left(\pi ([\Xi]_n) \vphantom{^{1}} \right)X^{n-2} + ...
+ (-1)^n \sigma_n \left(\pi ([\Xi]_n) \vphantom{^{1}} \right)
\\ \\
(0 ~ 1) \circ \hat P(X) = (X - \Xi_{1}) (X - \Xi_0)(X - \Xi_2)...(X - \Xi_{n-1})
\\
= X^n + ... + \sigma_2(\Xi_1, \Xi_0, \Xi_2, ...,\Xi_{n-1})X^{n-2} + ...
\\ \\ \\ \\
\begin{array}{ccccc}
e & (0 ~ 1) & (1 ~ 2) & ... & (n-2 ~~ n-1)
\\ \hline
\textcolor{red}{\Xi_0 \Xi_1} & \textcolor{red}{\Xi_1 \Xi_0} & \Xi_0 \Xi_1 & & \Xi_0 \Xi_1
\\
\Xi_0 \Xi_2 & \Xi_0 \Xi_2 & \Xi_0 \Xi_2 & & \Xi_0 \Xi_2
\\
\Xi_0 \Xi_3 & \Xi_0 \Xi_3 & \Xi_0 \Xi_3 & & \Xi_0 \Xi_3
\\
\vdots & \vdots & \vdots & & \vdots
\\
\Xi_0 \Xi_{n-1} & \Xi_0 \Xi_{n-1} & \Xi_{0} \Xi_{n-1} & & \Xi_0 \Xi_{n-1}
\\
\textcolor{green}{\Xi_1 \Xi_2} & \Xi_1 \Xi_2 & \textcolor{green}{\Xi_2 \Xi_1} & & \Xi_1 \Xi_2
\\
\vdots & \vdots & \vdots & & \vdots
\\
\textcolor{blue}{\Xi_{n-2} \Xi_{n-1}} & \Xi_{n-2} \Xi_{n-1} & \Xi_{n-2} \Xi_{n-1} & & \textcolor{blue}{\Xi_{n-1} \Xi_{n-2}}
\end{array}
\end{gather*}
$$

<!-- TODO: permutation -->
The "[path swaps](/posts/permutations/1/)" shown commute only the adjacent elements.
By contrast, the permutation (0 2) commutes *Ξ*~0~ past both *Ξ*~1~ and *Ξ*~2~.
But since we already know *Ξ*~0~ and *Ξ*~1~ commute by the above list,
we learn at this step that *Ξ*~0~ and *Ξ*~2~ commute.
This can be repeated until we reach the permutation (0 *n*-1) to prove commutativity between all pairs.
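
For the running example this is easy to confirm: since $C' = -I - C$ is itself a polynomial in $C$, the two roots must commute, and a direct check agrees (ad hoc 2×2 helpers again; names are illustrative only):

```haskell
-- Ad hoc 2x2 integer matrices as nested pairs of rows.
type M2 = ((Integer, Integer), (Integer, Integer))

mmul :: M2 -> M2 -> M2
mmul ((a11, a12), (a21, a22)) ((b11, b12), (b21, b22)) =
  ( (a11 * b11 + a12 * b21, a11 * b12 + a12 * b22)
  , (a21 * b11 + a22 * b21, a21 * b12 + a22 * b22) )

c, c' :: M2
c  = (( 0,  1), (-1, -1)) -- companion matrix of x^2 + x + 1
c' = ((-1, -1), ( 1,  0)) -- the other root found earlier

rootsCommute :: Bool
rootsCommute = (c `mmul` c') == (c' `mmul` c) -- both products equal I
```
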
|