Rather than redefining evaluation for each of these cases,
we should map our polynomial into a structure compatible with how we want to evaluate it.
Essentially, this means that from a polynomial in the base structure,
we can derive polynomials in these other structures.
In particular, we can either have a matrix of polynomials or a polynomial in matrices.
<!-- TODO: notes about functoriality of `fmap`ping eval vs -->
:::: {layout-ncol="2"}
::: {}
$$
\begin{align*}
p &: K[x]
\\
p(x) &= x^n + p_{n-1}x^{n-1} + ...
\\
\phantom{= p} & + p_1 x + p_0
\end{align*}
$$
:::
::: {}
$x$ is a scalar indeterminate
```haskell
p :: Polynomial K
```
:::
::::
:::: {layout-ncol="2"}
::: {}
$$
\begin{align*}
P &: (K[x])^{m \times m}
\\
P(x I) &= (x I)^n + (p_{n-1})(x I)^{n-1} + ...
\\
& + p_1(x I)+ p_0 I
\end{align*}
$$
:::
::: {}
$x$ is a scalar indeterminate, $P(x I)= p(x) I$ is a matrix of polynomials in $x$
```haskell
asPolynomialMatrix
:: Polynomial K -> Matrix (Polynomial K)
pMat :: Matrix (Polynomial K)
pMat = asPolynomialMatrix p
```
:::
::::
:::: {layout-ncol="2"}
::: {}
$$
\begin{align*}
\hat P &: K^{m \times m}[X]
\\
\hat P(X) &= X^n + (p_{n-1}I)X^{n-1} + ...
\\
& + (p_1 I) X + (p_0 I)
\end{align*}
$$
:::
::: {}
$X$ is a matrix indeterminate, $\hat P(X)$ is a polynomial over matrices
```haskell
asMatrixPolynomial
:: Polynomial K -> Polynomial (Matrix K)
pHat :: Polynomial (Matrix K)
pHat = asMatrixPolynomial p
```
:::
::::
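As a minimal sketch of the two derived structures, the conversions might look like the following. The concrete representations here (a polynomial as its coefficient list, a matrix as a list of rows, and an explicit size argument `m`) are assumptions for illustration, not the article's actual types:

```haskell
-- Stand-in types, assumed for this sketch: a polynomial is its coefficient
-- list [p0, p1, ..., pn], and a square matrix is a list of rows.
newtype Polynomial a = Polynomial [a] deriving (Eq, Show)
newtype Matrix a = Matrix [[a]] deriving (Eq, Show)

-- The scalar matrix a*I of size m.
scalarMatrix :: Num a => Int -> a -> Matrix a
scalarMatrix m a =
  Matrix [ [ if i == j then a else 0 | j <- [1 .. m] ] | i <- [1 .. m] ]

-- P = p * I: an m-by-m diagonal matrix whose diagonal entries are p itself.
asPolynomialMatrix :: Num a => Int -> Polynomial a -> Matrix (Polynomial a)
asPolynomialMatrix m p =
  Matrix [ [ if i == j then p else Polynomial [0] | j <- [1 .. m] ]
         | i <- [1 .. m] ]

-- p-hat: each scalar coefficient p_k is lifted to the scalar matrix p_k * I.
asMatrixPolynomial :: Num a => Int -> Polynomial a -> Polynomial (Matrix a)
asMatrixPolynomial m (Polynomial cs) = Polynomial (map (scalarMatrix m) cs)
```

For $p(x) = x^2 + x + 1$ and $m = 2$, `asMatrixPolynomial 2 p` produces the coefficient list $[I, I, I]$, while `asPolynomialMatrix 2 p` produces the diagonal matrix with $p$ in both diagonal entries.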
### Cayley-Hamilton Theorem
When evaluating the characteristic polynomial of a matrix *with* that matrix,
something strange happens.
Continuing from the previous article, using $x^2 + x + 1$ and its companion matrix, we have:
$$
\begin{gather*}
p(x) = x^2 + x + 1 \qquad C_{p} = C
= \left( \begin{matrix}
0 & 1 \\
-1 & -1
\end{matrix} \right)
\\ \\
\hat P(C) = C^2 + C + (1 \cdot I)
= \left( \begin{matrix}
-1 & -1 \\
1 & 0
\end{matrix} \right)
+ \left( \begin{matrix}
0 & 1 \\
-1 & -1
\end{matrix} \right)
+ \left( \begin{matrix}
1 & 0 \\
0 & 1
\end{matrix} \right)
\\ \\
= \left( \begin{matrix}
0 & 0 \\
0 & 0
\end{matrix} \right)
\end{gather*}
$$
The result is the zero matrix.
This tells us that, at least in this case, the matrix *C* is a root of its own characteristic polynomial.
By the [Cayley-Hamilton theorem](https://en.wikipedia.org/wiki/Cayley%E2%80%93Hamilton_theorem),
this is true in general, no matter the degree of *p*, no matter its coefficients,
and importantly, no matter the choice of field.
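The worked example can be checked numerically. The row-list matrix helpers below are a bare-bones sketch (not the article's library), with just enough arithmetic to evaluate $\hat P(C) = C^2 + C + I$:

```haskell
import Data.List (transpose)

-- A bare-bones sketch: a 2x2 matrix as a list of integer rows.
type M = [[Integer]]

mAdd, mMul :: M -> M -> M
mAdd = zipWith (zipWith (+))
mMul a b = [ [ sum (zipWith (*) row col) | col <- transpose b ] | row <- a ]

iden, c :: M
iden = [[1, 0], [0, 1]]
c    = [[0, 1], [-1, -1]]   -- companion matrix of x^2 + x + 1

-- p-hat(C) = C^2 + C + 1*I, evaluated directly; this is the zero matrix.
pHatAtC :: M
pHatAtC = (c `mMul` c) `mAdd` c `mAdd` iden
```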
This is more powerful than it would otherwise seem.
For one, factoring a polynomial "inside" a matrix turns out to give the same answer
as factoring a polynomial over matrices.
:::: {layout-ncol="2"}
::: {}
$$
\begin{gather*}
P(xI) = \left( \begin{matrix}
x^2 + x + 1 & 0 \\
0 & x^2 + x + 1
\end{matrix}\right)
\\ \\
= (xI - C)(xI - C')
\\ \\
= \left( \begin{matrix}
x & -1 \\
1 & x + 1
\end{matrix} \right)
\left( \begin{matrix}
x - a & -b \\
-c & x - d
\end{matrix} \right)
\\ \\
\begin{align*}
x(x - a) + c &= x^2 + x + 1
\\
\textcolor{green}{x(-b) - (x - d)} &\textcolor{green}{= 0}
\\
\textcolor{blue}{(x - a) + (x + 1)(-c)} &\textcolor{blue}{= 0}
\\
(-b) + (x + 1)(x - d) &= x^2 + x + 1
\end{align*}
\\ \\
\textcolor{green}{(-b -1)x +d = 0} \implies b = -1, ~ d = 0 \\
\textcolor{blue}{(1 - c)x - a - c = 0} \implies c = 1, ~ a = -1
\\ \\
C' =
\left( \begin{matrix}
-1 & -1 \\
1 & 0
\end{matrix} \right)
\end{gather*}
$$
:::
::: {}
$$
\begin{gather*}
\hat P(X) = X^2 + X + 1 \cdot I
\\[10pt]
= (X - C)(X - C')
\\[10pt]
= X^2 - (C + C')X + CC'
\\[10pt]
\implies
\\[10pt]
C + C' = -I, ~ C' = -I - C
\\[10pt]
CC' = I, ~ C^{-1} = C'
\\[10pt]
C' = \left( \begin{matrix}
-1 & -1 \\
1 & 0
\end{matrix} \right)
\end{gather*}
$$
:::
::::
It's important to note that a matrix factorization is not unique.
*Any* matrix with a given characteristic polynomial can be used as a root of that polynomial.
Of course, choosing one root affects the other matrix roots.
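The coefficient-matching on the right-hand side is easy to verify directly. The helpers below are the same minimal row-list sketch used for illustration, not the article's library:

```haskell
import Data.List (transpose)

type M = [[Integer]]

mMul :: M -> M -> M
mMul a b = [ [ sum (zipWith (*) row col) | col <- transpose b ] | row <- a ]

c, c', iden, negI :: M
c    = [[ 0,  1], [-1, -1]]
c'   = [[-1, -1], [ 1,  0]]
iden = [[ 1,  0], [ 0,  1]]
negI = [[-1,  0], [ 0, -1]]

-- Matching coefficients gave C + C' = -I and CC' = I (so C' = C^-1).
sumIsNegI, productIsIden :: Bool
sumIsNegI     = zipWith (zipWith (+)) c c' == negI
productIsIden = mMul c c' == iden
```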
### Moving Roots
All matrices commute with the identity and zero matrices.
A less obvious fact is that all of the matrix roots *also* commute with one another.
By the Fundamental Theorem of Algebra,
[Vieta's formulas](https://en.wikipedia.org/wiki/Vieta%27s_formulas) state:
$$
\begin{gather*}
\hat P(X)
= \prod_{[i]_n} (X - \Xi_i)
= (X - \Xi_0) (X - \Xi_1)...(X - \Xi_{n-1})
\\
= \left\{ \begin{align*}
& \phantom{+} X^n
\\
& - (\Xi_0 + \Xi_1 + ... + \Xi_{n-1}) X^{n-1}
\\
& + (\Xi_0 \Xi_1+ \Xi_0 \Xi_2 + ... + \Xi_0 \Xi_{n-1} + \Xi_1 \Xi_2 + ... + \Xi_{n-2} \Xi_{n-1})X^{n-2}
\\
& \qquad \vdots
\\
& + (-1)^n \Xi_0 \Xi_1 \Xi_2...\Xi_{n-1}
\end{align*} \right.
\\
= X^n -\sigma_1([\Xi]_n)X^{n-1} + \sigma_2([\Xi]_n)X^{n-2} + ... + (-1)^n \sigma_n([\Xi]_n)
\end{gather*}
$$
The product range \[*i*\]~*n*~ means that the factors are taken in order over the given index, from 0 to *n* - 1.
On the bottom line, *σ* are
[elementary symmetric polynomials](https://en.wikipedia.org/wiki/Elementary_symmetric_polynomial)
and \[*Ξ*\]~*n*~ is the list of root matrices from *Ξ*~*0*~ to Ξ~*n-1*~.
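A direct way to compute these *σ*~*k*~ while respecting the order of \[*Ξ*\]~*n*~ (which matters when the roots are matrices that need not commute) is to sum the products over index-increasing subsequences. This helper is a sketch, not code from the article:

```haskell
-- sigma k xs: the k-th elementary symmetric polynomial of the list xs,
-- with every product taken in list order -- important when the elements
-- are matrices that need not commute. A sketch, not the article's code.
sigma :: Num a => Int -> [a] -> a
sigma k xs = sum [ product ys | ys <- subseqs k xs ]
  where
    subseqs 0 _        = [[]]
    subseqs _ []       = []
    subseqs n (y : ys) = [ y : zs | zs <- subseqs (n - 1) ys ] ++ subseqs n ys
```

For scalars, `sigma 2 [a, b, c]` is `a*b + a*c + b*c`, exactly the pairwise products singled out above.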
By factoring the matrix with the roots in a different order, we get another factorization.
It suffices to focus only on *σ*~2~, which contains all pairwise products.
$$
\begin{gather*}
\pi \in S_n
\\
\qquad
\pi \circ \hat P(X) = \prod_{\pi ([i]_n)} (X - \Xi_i)
\\ \\
= X^n
- \sigma_1 \left(\pi ([\Xi]_n) \vphantom{^{1}} \right)X^{n-1}
+ \sigma_2 \left(\pi ([\Xi]_n) \vphantom{^{1}} \right)X^{n-2} + ...
+ (-1)^n \sigma_n \left(\pi ([\Xi]_n) \vphantom{^{1}} \right)
\\ \\
\\ \\
(0 ~ 1) \circ \hat P(X) = (X - \Xi_{1}) (X - \Xi_0)(X - \Xi_2)...(X - \Xi_{n-1})
\\
= X^n + ... + \sigma_2(\Xi_1, \Xi_0, \Xi_2, ...,\Xi_{n-1})X^{n-2} + ...
\\ \\ \\ \\
\begin{array}{}
e & (0 ~ 1) & (1 ~ 2) & ... & (n-2 ~~ n-1)
\\ \hline
\textcolor{red}{\Xi_0 \Xi_1} & \textcolor{red}{\Xi_1 \Xi_0} & \Xi_0 \Xi_1 & & \Xi_0 \Xi_1
\\
\Xi_0 \Xi_2 & \Xi_0 \Xi_2 & \Xi_0 \Xi_2 & & \Xi_0 \Xi_2
\\
\Xi_0 \Xi_3 & \Xi_0 \Xi_3 & \Xi_0 \Xi_3 & & \Xi_0 \Xi_3
\\
\vdots & \vdots & \vdots & & \vdots
\\
\Xi_0 \Xi_{n-1} & \Xi_0 \Xi_{n-1} & \Xi_{0} \Xi_{n-1} & & \Xi_0 \Xi_{n-1}
\\
\textcolor{green}{\Xi_1 \Xi_2} & \Xi_1 \Xi_2 & \textcolor{green}{\Xi_2 \Xi_1} & & \Xi_1 \Xi_2
\\
\vdots & \vdots & \vdots & & \vdots
\\
\textcolor{blue}{\Xi_{n-2} \Xi_{n-1}} & \Xi_{n-2} \Xi_{n-1} & \Xi_{n-2} \Xi_{n-1} & & \textcolor{blue}{\Xi_{n-1} \Xi_{n-2}}
\end{array}
\end{gather*}
$$
<!-- TODO: permutation -->
The "[path swaps](/posts/permutations/1/)" shown commute only the adjacent elements.
By contrast, the permutation (0 2) commutes *Ξ*~0~ past both *Ξ*~1~ and *Ξ*~2~.
But since we already know *Ξ*~0~ and *Ξ*~1~ commute by the above list,
we learn at this step that *Ξ*~0~ and *Ξ*~2~ commute.
This can be repeated until we reach the permutation (0 *n*-1) to prove commutativity between all pairs.
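For the running 2×2 example the claim is easy to check numerically: the two roots *C* and *C'* of $x^2 + x + 1$ commute. The row-list helpers are again a sketch, not the article's library:

```haskell
import Data.List (transpose)

type M = [[Integer]]

mMul :: M -> M -> M
mMul a b = [ [ sum (zipWith (*) row col) | col <- transpose b ] | row <- a ]

c, c' :: M
c  = [[ 0,  1], [-1, -1]]   -- companion matrix of x^2 + x + 1
c' = [[-1, -1], [ 1,  0]]   -- the other matrix root found above

-- Both products equal the identity, so in particular they are equal.
rootsCommute :: Bool
rootsCommute = mMul c c' == mMul c' c
```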