--- title: "Exploring Finite Fields, Part 2 Appendix" description: | Additional notes about polynomial evaluation. format: html: html-math-method: katex date: "2024-01-15" date-modified: "2025-07-16" categories: - algebra - finite field - haskell --- In the [second post in this series](../), we briefly discussed alternate means of evaluating polynomials by "plugging in" different structures. Different Kinds of Polynomials ------------------------------ Rather than redefining evaluation for each of these cases, we can map the polynomial into a structure compatible with it should be evaluated. Essentially, this means that from a polynomial in the base structure, we can derive polynomials in these other structures. In particular, there is a distinction between a matrix of polynomials or a polynomial in matrices: :::: {layout-ncol="2"} ::: {} $$ \begin{align*} p &: K[x] \\ p(x) &= x^n + p_{n-1}x^{n-1} + ... \\ \phantom{= p} & + p_1 x + p_0 \end{align*} $$ ::: ::: {} $x$ is a scalar indeterminate ```haskell p :: Polynomial k ``` ::: :::: :::: {layout-ncol="2"} ::: {} $$ \begin{align*} P &: (K[x])^{m \times m} \\ P(x I) &= (x I)^n + (p_{n-1})(x I)^{n-1} + ... \\ & + p_1(x I)+ p_0 I \end{align*} $$ ::: ::: {} $x$ is a scalar indeterminate, $P(x I)= p(x) I$ is a matrix of polynomials in $x$ ```haskell asPolynomialMatrix :: Polynomial k -> Matrix (Polynomial k) pMat :: Matrix (Polynomial k) pMat = asPolynomialMatrix p ``` ::: :::: :::: {layout-ncol="2"} ::: {} $$ \begin{align*} \hat P &: K^{m \times m}[X] \\ \hat P(X) &= X^n + (p_{n-1}I)X^{n-1} + ... \\ & + (p_1 I) X + (p_0 I) \end{align*} $$ ::: ::: {} $X$ is a matrix indeterminate, $\hat P(X)$ is a polynomial over matrices ```haskell asMatrixPolynomial :: Polynomial k -> Polynomial (Matrix k) pHat :: Polynomial (Matrix k) pHat = asMatrixPolynomial p ``` ::: :::: It's easy to confuse the latter two, but the Haskell makes the difference in types clearer. There exists a natural isomorphism between the two, which is discussed further in the [fourth post in this series](../../4/). Cayley-Hamilton Theorem, Revisited ---------------------------------- As a reminder, the [Cayley-Hamilton theorem](https://en.wikipedia.org/wiki/Cayley%E2%80%93Hamilton_theorem) says that a matrix satisfies its own characteristic polynomial. In a type-stricter sense, it says the following relationship holds: ```haskell evalPoly :: a -> Polynomial a -> a mA :: Matrix a charpolyA :: Polynomial a charpolyA = charpoly mA charpolyA :: Polynomial (Matrix a) matCharpolyA = asMatrixPolynomial charPolyA evalPoly mA matCharpolyA == (0 :: Matrix a) ``` Due to the aformentioned isomorphism, factoring a polynomial "inside" a matrix turns out to give the same answer as factoring a polynomial over matrices. 
As a worked example, factor $p(x) = x^2 + x + 1$ over $2 \times 2$ matrices, in both senses:

:::: {layout-ncol="2"}
::: {}
$$
\begin{gather*}
P(xI) = \left( \begin{matrix} x^2 + x + 1 & 0 \\ 0 & x^2 + x + 1 \end{matrix} \right) \\
\\
= (xI - C)(xI - C') \\
\\
= \left( \begin{matrix} x & -1 \\ 1 & x + 1 \end{matrix} \right)
  \left( \begin{matrix} x - a & -b \\ -c & x - d \end{matrix} \right) \\
\\
\begin{align*}
x(x - a) + c &= x^2 + x + 1 \\
\textcolor{green}{x(-b) - (x - d)} &\textcolor{green}{= 0} \\
\textcolor{blue}{(x - a) + (x + 1)(-c)} &\textcolor{blue}{= 0} \\
(-b) + (x + 1)(x - d) &= x^2 + x + 1
\end{align*} \\
\\
\textcolor{green}{(-b - 1)x + d = 0} \implies b = -1, ~ d = 0 \\
\textcolor{blue}{(1 - c)x - a - c = 0} \implies c = 1, ~ a = -1 \\
\\
C' = \left( \begin{matrix} -1 & -1 \\ 1 & 0 \end{matrix} \right)
\end{gather*}
$$
:::

::: {}
$$
\begin{gather*}
\hat P(X) = X^2 + X + I \\[10pt]
= (X - C)(X - C') \\[10pt]
= X^2 - (C + C')X + CC' \\[10pt]
\implies \\[10pt]
C + C' = -I, ~ C' = -I - C \\[10pt]
CC' = I, ~ C^{-1} = C' \\[10pt]
C' = \left( \begin{matrix} -1 & -1 \\ 1 & 0 \end{matrix} \right)
\end{gather*}
$$
:::
::::

It's important to note that a matrix factorization is not unique. *Any* matrix with a given characteristic polynomial can be used as a root of that polynomial. Of course, choosing one root constrains what the remaining roots can be.
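Sticking with the hypothetical types from the sketch above, we can spot-check the worked example: both $C$ and $C'$ are roots of $\hat P(X) = X^2 + X + I$, and the two relations on the right-hand column hold. The names `pHat2`, `c`, and `c'` are illustrative.

```haskell
-- The polynomial X^2 + X + I, over 2x2 integer matrices.
pHat2 :: Polynomial (Matrix Integer)
pHat2 = asMatrixPolynomial (Polynomial [1, 1, 1])

-- C is read off from xI - C above; C' is the root we solved for.
c, c' :: Matrix Integer
c  = Matrix 0 1 (-1) (-1)
c' = Matrix (-1) (-1) 1 0

factorizationChecks :: Bool
factorizationChecks =
     evalPoly c  pHat2 == 0   -- C  is a root
  && evalPoly c' pHat2 == 0   -- C' is a root
  && c + c' == negate 1       -- C + C' = -I
  && c * c' == 1              -- CC' = I, so C' = C^(-1)
```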
### Moving Roots

All matrices commute with the identity and zero matrices. A less obvious fact is that in a matrix factorization, the roots *also* all commute with one another.

Expanding a complete factorization into linear factors (which the Fundamental Theorem of Algebra guarantees exists for scalar polynomials) yields [Vieta's formulas](https://en.wikipedia.org/wiki/Vieta%27s_formulas):

$$
\begin{gather*}
\hat P(X) = \prod_{[i]_n} (X - \Xi_i) = (X - \Xi_0)(X - \Xi_1)...(X - \Xi_{n-1}) \\
= \left\{
\begin{align*}
& \phantom{+} X^n \\
& - (\Xi_0 + \Xi_1 + ... + \Xi_{n-1}) X^{n-1} \\
& + (\Xi_0 \Xi_1 + \Xi_0 \Xi_2 + ... + \Xi_0 \Xi_{n-1} + \Xi_1 \Xi_2 + ... + \Xi_{n-2} \Xi_{n-1})X^{n-2} \\
& \qquad \vdots \\
& + (-1)^n \Xi_0 \Xi_1 \Xi_2...\Xi_{n-1}
\end{align*}
\right. \\
= X^n - \sigma_1([\Xi]_n)X^{n-1} + \sigma_2([\Xi]_n)X^{n-2} + ... + (-1)^n \sigma_n([\Xi]_n)
\end{gather*}
$$

The product range \[*i*\]~*n*~ means that the terms are ordered from 0 to *n* - 1 over the index given. On the bottom line, the *σ*~*k*~ are [elementary symmetric polynomials](https://en.wikipedia.org/wiki/Elementary_symmetric_polynomial) and \[*Ξ*\]~*n*~ is the list of root matrices from *Ξ*~0~ to *Ξ*~*n*-1~.

Factoring with the roots in a different order gives another factorization of the same polynomial, so every coefficient must stay the same. It suffices to focus on *σ*~2~, which contains all pairwise products.

$$
\begin{gather*}
\pi \in S_n \\
\pi \circ \hat P(X) = \prod_{\pi([i]_n)} (X - \Xi_i) \\
\\
= X^n - \sigma_1 \left( \pi([\Xi]_n) \vphantom{^{1}} \right)X^{n-1} + \sigma_2 \left( \pi([\Xi]_n) \vphantom{^{1}} \right)X^{n-2} + ... + (-1)^n \sigma_n \left( \pi([\Xi]_n) \vphantom{^{1}} \right) \\[10pt]
\pi_{(0 ~ 1)} \circ \hat P(X) = (X - \Xi_1)(X - \Xi_0)(X - \Xi_2)...(X - \Xi_{n-1}) \\
= X^n + ... + \sigma_2(\Xi_1, \Xi_0, \Xi_2, ..., \Xi_{n-1})X^{n-2} + ... \\[10pt]
\begin{array}{ccccc}
e & (0 ~ 1) & (1 ~ 2) & ... & (n-2 ~~ n-1) \\
\hline
\textcolor{red}{\Xi_0 \Xi_1} & \textcolor{red}{\Xi_1 \Xi_0} & \Xi_0 \Xi_1 & & \Xi_0 \Xi_1 \\
\Xi_0 \Xi_2 & \Xi_0 \Xi_2 & \Xi_0 \Xi_2 & & \Xi_0 \Xi_2 \\
\Xi_0 \Xi_3 & \Xi_0 \Xi_3 & \Xi_0 \Xi_3 & & \Xi_0 \Xi_3 \\
\vdots & \vdots & \vdots & & \vdots \\
\Xi_0 \Xi_{n-1} & \Xi_0 \Xi_{n-1} & \Xi_0 \Xi_{n-1} & & \Xi_0 \Xi_{n-1} \\
\textcolor{green}{\Xi_1 \Xi_2} & \Xi_1 \Xi_2 & \textcolor{green}{\Xi_2 \Xi_1} & & \Xi_1 \Xi_2 \\
\vdots & \vdots & \vdots & & \vdots \\
\textcolor{blue}{\Xi_{n-2} \Xi_{n-1}} & \Xi_{n-2} \Xi_{n-1} & \Xi_{n-2} \Xi_{n-1} & & \textcolor{blue}{\Xi_{n-1} \Xi_{n-2}}
\end{array}
\end{gather*}
$$

The "[path swaps](/posts/math/permutations/1/)" shown commute only adjacent elements: each flips exactly one pairwise product in *σ*~2~, and since the sum must be unchanged, the flipped product must equal the original, so those two roots commute. By contrast, the permutation $(0 ~ 2)$ commutes *Ξ*~0~ past both *Ξ*~1~ and *Ξ*~2~. But since we already know *Ξ*~0~ and *Ξ*~1~ commute by the above list, we learn at this step that *Ξ*~0~ and *Ξ*~2~ commute. This can be repeated until we reach the permutation $(0 ~ n-1)$ to prove commutativity between all pairs.
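For the two-root example above, the whole argument reduces to the single swap $(0 ~ 1)$, which we can again check against the sketch types. The `mulPoly` and `xMinus` helpers below are assumptions of this appendix: a naive convolution product and a linear-factor constructor.

```haskell
-- Naive convolution product of polynomials. The indeterminate X is
-- central, so only the (matrix) coefficients can fail to commute;
-- as written, this multiplies coefficients in left-to-right order.
mulPoly :: Num a => Polynomial a -> Polynomial a -> Polynomial a
mulPoly (Polynomial as) (Polynomial bs) =
  Polynomial
    [ sum [ (as !! i) * (bs !! (k - i))
          | i <- [0 .. k], i < length as, k - i < length bs ]
    | k <- [0 .. length as + length bs - 2] ]

-- The linear factor X - m (here 1 is the identity matrix).
xMinus :: Num a => Matrix a -> Polynomial (Matrix a)
xMinus m = Polynomial [negate m, 1]

-- Swapping the two factors leaves the product unchanged, exactly
-- because the roots commute.
rootsCommute :: Bool
rootsCommute =
     c * c' == c' * c
  && mulPoly (xMinus c) (xMinus c') == mulPoly (xMinus c') (xMinus c)
```

Both products expand to $X^2 + X + I$, precisely because $CC' = C'C$.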