---
title: "Exploring Finite Fields, Part 2 Appendix"
description: |
  Additional notes about polynomial evaluation.
format:
  html:
    html-math-method: katex
date: "2024-01-15"
date-modified: "2025-07-16"
categories:
  - algebra
  - finite field
  - haskell
---
In the [second post in this series](../), we briefly discussed alternate means
of evaluating polynomials by "plugging in" different structures.

Different Kinds of Polynomials
------------------------------
Rather than redefining evaluation for each of these cases,
we can map the polynomial into a structure compatible with the one in which it should be evaluated.
Essentially, this means that from a polynomial in the base structure,
we can derive polynomials in these other structures.
In particular, there is a distinction between a matrix of polynomials and a polynomial in matrices:
:::: {layout-ncol="2"}
::: {}
$$
\begin{align*}
p &: K[x]
\\
p(x) &= x^n + p_{n-1}x^{n-1} + ...
\\
\phantom{= p} & + p_1 x + p_0
\end{align*}
$$
:::
::: {}
$x$ is a scalar indeterminate
```haskell
p :: Polynomial k
```
:::
::::
:::: {layout-ncol="2"}
::: {}
$$
\begin{align*}
P &: (K[x])^{m \times m}
\\
P(x I) &= (x I)^n + (p_{n-1})(x I)^{n-1} + ...
\\
& + p_1(x I)+ p_0 I
\end{align*}
$$
:::
::: {}
$x$ is a scalar indeterminate, $P(x I)= p(x) I$ is a matrix of polynomials in $x$
```haskell
asPolynomialMatrix
:: Polynomial k -> Matrix (Polynomial k)
pMat :: Matrix (Polynomial k)
pMat = asPolynomialMatrix p
```
:::
::::
:::: {layout-ncol="2"}
::: {}
$$
\begin{align*}
\hat P &: K^{m \times m}[X]
\\
\hat P(X) &= X^n + (p_{n-1}I)X^{n-1} + ...
\\
& + (p_1 I) X + (p_0 I)
\end{align*}
$$
:::
::: {}
$X$ is a matrix indeterminate, $\hat P(X)$ is a polynomial over matrices
```haskell
asMatrixPolynomial
:: Polynomial k -> Polynomial (Matrix k)
pHat :: Polynomial (Matrix k)
pHat = asMatrixPolynomial p
```
:::
::::
It's easy to confuse the latter two, but the Haskell types make the difference clearer.
There exists a natural isomorphism between the two, which is discussed further
in the [fourth post in this series](../../4/).
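Concretely, the two conversions might be sketched as follows. The coefficient-list polynomial (constant term first, with `Polynomial []` as the zero polynomial), the list-of-rows matrix, and the explicit size argument are all assumptions of this sketch, not the representation used in this series:

```haskell
-- Illustrative stand-ins for the series' types (assumed, not actual):
-- a polynomial is its coefficient list, constant term first,
-- and a square matrix is a list of rows.
newtype Polynomial k = Polynomial [k] deriving (Eq, Show)
newtype Matrix k = Matrix [[k]] deriving (Eq, Show)

-- The scalar matrix c*I of size m.
scalar :: Num k => Int -> k -> Matrix k
scalar m c =
  Matrix [ [ if i == j then c else 0 | j <- [1 .. m] ] | i <- [1 .. m] ]

-- p(x)*I: the polynomial on the diagonal, the zero polynomial elsewhere.
asPolynomialMatrix :: Int -> Polynomial k -> Matrix (Polynomial k)
asPolynomialMatrix m p =
  Matrix [ [ if i == j then p else Polynomial [] | j <- [1 .. m] ] | i <- [1 .. m] ]

-- Each scalar coefficient c becomes the matrix coefficient c*I.
asMatrixPolynomial :: Num k => Int -> Polynomial k -> Polynomial (Matrix k)
asMatrixPolynomial m (Polynomial cs) = Polynomial (map (scalar m) cs)
```

The size argument is needed here only because a list-based matrix must know its dimension; a fixed-size matrix type would not require it.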

Cayley-Hamilton Theorem, Revisited
----------------------------------
As a reminder, the
[Cayley-Hamilton theorem](https://en.wikipedia.org/wiki/Cayley%E2%80%93Hamilton_theorem)
says that a matrix satisfies its own characteristic polynomial.
In a more type-strict sense, it says the following relationship holds:
```haskell
evalPoly :: a -> Polynomial a -> a
mA :: Matrix a
charpolyA :: Polynomial a
charpolyA = charpoly mA
matCharpolyA :: Polynomial (Matrix a)
matCharpolyA = asMatrixPolynomial charpolyA
evalPoly mA matCharpolyA == (0 :: Matrix a)
```
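This relationship can be checked concretely. Below is a self-contained sketch for 2×2 integer matrices; the list-of-rows representation, the trace-and-determinant `charpoly`, and the Horner-style `evalPoly` are all assumptions of this sketch rather than the series' actual definitions:

```haskell
import Data.List (transpose)

-- 2x2 integer matrices as lists of rows (an assumption of this sketch).
type M2 = [[Integer]]

madd, mmul :: M2 -> M2 -> M2
madd = zipWith (zipWith (+))
mmul a b = [ [ sum (zipWith (*) row col) | col <- transpose b ] | row <- a ]

smul :: Integer -> M2 -> M2
smul c = map (map (c *))

identity, zero :: M2
identity = [[1, 0], [0, 1]]
zero     = [[0, 0], [0, 0]]

-- Horner evaluation of a matrix polynomial (coefficients constant-first).
evalPoly :: M2 -> [M2] -> M2
evalPoly x = foldr (\c acc -> madd c (mmul acc x)) zero

-- Characteristic polynomial of a 2x2 matrix: x^2 - tr(A) x + det(A).
charpoly :: M2 -> [Integer]
charpoly [[a, b], [c, d]] = [a * d - b * c, negate (a + d), 1]
charpoly _                = error "charpoly: expected a 2x2 matrix"

mA :: M2
mA = [[1, 2], [3, 4]]

-- Lift each scalar coefficient c to c*I, then evaluate at A itself;
-- by Cayley-Hamilton the result is the zero matrix.
cayleyHamilton :: M2
cayleyHamilton = evalPoly mA (map (`smul` identity) (charpoly mA))
```

Since the lifted coefficients are scalar multiples of $I$, they commute with everything, so the Horner order of multiplication is harmless here.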
Due to the aforementioned isomorphism, factoring a polynomial "inside" a matrix turns
out to give the same answer as factoring a polynomial over matrices.
:::: {layout-ncol="2"}
::: {}
$$
\begin{gather*}
P(xI) = \left( \begin{matrix}
x^2 + x + 1 & 0 \\
0 & x^2 + x + 1
\end{matrix}\right)
\\ \\
= (xI - C)(xI - C')
\\ \\
= \left( \begin{matrix}
x & -1 \\
1 & x + 1
\end{matrix} \right)
\left( \begin{matrix}
x - a & -b \\
-c & x - d
\end{matrix} \right)
\\ \\
\begin{align*}
x(x - a) + c &= x^2 + x + 1
\\
\textcolor{green}{x(-b) - (x - d)} &\textcolor{green}{= 0}
\\
\textcolor{blue}{(x - a) + (x + 1)(-c)} &\textcolor{blue}{= 0}
\\
(-b) + (x + 1)(x - d) &= x^2 + x + 1
\end{align*}
\\ \\
\textcolor{green}{(-b -1)x +d = 0} \implies b = -1, ~ d = 0 \\
\textcolor{blue}{(1 - c)x - a - c = 0} \implies c = 1, ~ a = -1
\\ \\
C' =
\left( \begin{matrix}
-1 & -1 \\
1 & 0
\end{matrix} \right)
\end{gather*}
$$
:::
::: {}
$$
\begin{gather*}
\hat P(X) = X^2 + X + 1I
\\[10pt]
= (X - C)(X - C')
\\[10pt]
= X^2 - (C + C')X + CC'
\\[10pt]
\implies
\\[10pt]
C + C' = -I, ~ C' = -I - C
\\[10pt]
CC' = I, ~ C^{-1} = C'
\\[10pt]
C' = \left( \begin{matrix}
-1 & -1 \\
1 & 0
\end{matrix} \right)
\end{gather*}
$$
:::
::::
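Both factorizations can be verified numerically. In the following small sketch (again using list-of-rows 2×2 matrices as an assumed representation), $C$ and $C'$ each satisfy $X^2 + X + I = 0$, and their product is the constant term $I$:

```haskell
import Data.List (transpose)

type M2 = [[Integer]]

madd, mmul :: M2 -> M2 -> M2
madd = zipWith (zipWith (+))
mmul a b = [ [ sum (zipWith (*) row col) | col <- transpose b ] | row <- a ]

identity :: M2
identity = [[1, 0], [0, 1]]

-- The two roots of x^2 + x + 1 found above:
-- C read off from xI - C, and C' = -I - C = C^{-1}.
c, c' :: M2
c  = [[ 0,  1], [-1, -1]]
c' = [[-1, -1], [ 1,  0]]

-- X^2 + X + I evaluated at a matrix root should vanish.
satisfies :: M2 -> Bool
satisfies x = madd (mmul x x) (madd x identity) == [[0, 0], [0, 0]]
```

Here `satisfies c`, `satisfies c'`, and `mmul c c' == identity` all hold, matching $CC' = I$ from the right-hand derivation.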
It's important to note that a matrix factorization is not unique.
*Any* matrix with a given characteristic polynomial can be used as a root of that polynomial.
Of course, choosing one root affects the other matrix roots.
### Moving Roots
All matrices commute with the identity and zero matrices.
A less obvious fact is that for a matrix factorization, all roots *also* commute with one another.
By the Fundamental Theorem of Algebra, the polynomial factors completely into linear factors, and
[Vieta's formulas](https://en.wikipedia.org/wiki/Vieta%27s_formulas) state:
$$
\begin{gather*}
\hat P(X)
= \prod_{[i]_n} (X - \Xi_i)
= (X - \Xi_0) (X - \Xi_1)...(X - \Xi_{n-1})
\\
= \left\{ \begin{align*}
& \phantom{+} X^n
\\
& - (\Xi_0 + \Xi_1 + ... + \Xi_{n-1}) X^{n-1}
\\
& + (\Xi_0 \Xi_1+ \Xi_0 \Xi_2 + ... + \Xi_0 \Xi_{n-1} + \Xi_1 \Xi_2 + ... + \Xi_{n-2} \Xi_{n-1})X^{n-2}
\\
& \qquad \vdots
\\
& + (-1)^n \Xi_0 \Xi_1 \Xi_2...\Xi_{n-1}
\end{align*} \right.
\\
= X^n -\sigma_1([\Xi]_n)X^{n-1} + \sigma_2([\Xi]_n)X^{n-2} + ... + (-1)^n \sigma_n([\Xi]_n)
\end{gather*}
$$
The product range \[*i*\]~*n*~ means that the terms are ordered from 0 to *n* - 1 over the index given.
On the bottom line, *σ* are
[elementary symmetric polynomials](https://en.wikipedia.org/wiki/Elementary_symmetric_polynomial)
and \[*Ξ*\]~*n*~ is the list of root matrices from *Ξ*~0~ to *Ξ*~*n*-1~.
By factoring the matrix with the roots in a different order, we get another factorization.
It suffices to focus only on *σ*~2~, which contains all pairwise products.
$$
\begin{gather*}
\pi \in S_n
\\
\qquad
\pi \circ \hat P(X) = \prod_{\pi ([i]_n)} (X - \Xi_i)
\\ \\
= X^n
- \sigma_1 \left(\pi ([\Xi]_n) \vphantom{^{1}} \right)X^{n-1}
+ \sigma_2 \left(\pi ([\Xi]_n) \vphantom{^{1}} \right)X^{n-2} + ...
+ (-1)^n \sigma_n \left(\pi ([\Xi]_n) \vphantom{^{1}} \right)
\\[10pt]
\pi_{(0 ~ 1)} \circ \hat P(X) = (X - \Xi_{1}) (X - \Xi_0)(X - \Xi_2)...(X - \Xi_{n-1})
\\
= X^n + ... + \sigma_2(\Xi_1, \Xi_0, \Xi_2, ...,\Xi_{n-1})X^{n-2} + ...
\\[10pt]
\begin{array}{ccccc}
e & (0 ~ 1) & (1 ~ 2) & ... & (n-2 ~~ n-1)
\\ \hline
\textcolor{red}{\Xi_0 \Xi_1} & \textcolor{red}{\Xi_1 \Xi_0} & \Xi_0 \Xi_1 & & \Xi_0 \Xi_1
\\
\Xi_0 \Xi_2 & \Xi_0 \Xi_2 & \Xi_0 \Xi_2 & & \Xi_0 \Xi_2
\\
\Xi_0 \Xi_3 & \Xi_0 \Xi_3 & \Xi_0 \Xi_3 & & \Xi_0 \Xi_3
\\
\vdots & \vdots & \vdots & & \vdots
\\
\Xi_0 \Xi_{n-1} & \Xi_0 \Xi_{n-1} & \Xi_{0} \Xi_{n-1} & & \Xi_0 \Xi_{n-1}
\\
\textcolor{green}{\Xi_1 \Xi_2} & \Xi_1 \Xi_2 & \textcolor{green}{\Xi_2 \Xi_1} & & \Xi_1 \Xi_2
\\
\vdots & \vdots & \vdots & & \vdots
\\
\textcolor{blue}{\Xi_{n-2} \Xi_{n-1}} & \Xi_{n-2} \Xi_{n-1} & \Xi_{n-2} \Xi_{n-1} & & \textcolor{blue}{\Xi_{n-1} \Xi_{n-2}}
\end{array}
\end{gather*}
$$
The "[path swaps](/posts/math/permutations/1/)" shown commute only adjacent elements.
By contrast, the permutation $(0 ~ 2)$ commutes *Ξ*~0~ past both *Ξ*~1~ and *Ξ*~2~.
But since we already know that *Ξ*~0~ and *Ξ*~1~ commute from the table above,
this step shows that *Ξ*~0~ and *Ξ*~2~ commute as well.
Repeating this process up to the permutation $(0 ~ n-1)$ proves commutativity between all pairs.
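For the 2×2 example above, this commutativity is easy to check directly: the two roots commute, so reordering the factors leaves the expanded polynomial unchanged (same assumed list-of-rows representation as before):

```haskell
import Data.List (transpose)

type M2 = [[Integer]]

mmul :: M2 -> M2 -> M2
mmul a b = [ [ sum (zipWith (*) row col) | col <- transpose b ] | row <- a ]

-- The two matrix roots of x^2 + x + 1 from the worked example.
c, c' :: M2
c  = [[ 0,  1], [-1, -1]]
c' = [[-1, -1], [ 1,  0]]

-- C and C' commute, so (X - C)(X - C') and (X - C')(X - C)
-- expand to the same coefficients.
rootsCommute :: Bool
rootsCommute = mmul c c' == mmul c' c
```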