---
title: "Exploring Finite Fields, Part 2: Matrix Boogaloo"
description: |
  "..."
format:
  html:
    html-math-method: katex
jupyter: python3
date: "2024-01-15"
date-modified: "2025-07-16"
categories:
  - algebra
  - finite field
  - haskell
---
|
||
|
||
<style>
.red {
  color: red;
}
.orange {
  color: orange;
}
.yellow {
  color: yellow;
}
.green {
  color: green;
}
.blue {
  color: blue;
}
.purple {
  color: purple;
}
</style>
|
||
|
||
<!--
|
||
TODO: data for this post should be built by an external Haskell program
|
||
TODO: half-post about organizing "data" vs full post about graphs and irreducibles
|
||
-->
|
||
|
||
In the [last post](../1), we discussed finite fields, polynomials and matrices over them, and the typical,
|
||
symbolic way of extending fields with polynomials.
|
||
This post will focus on circumventing symbolic means with numeric ones.
|
||
|
||
|
||
More about Matrices (and Polynomials)
|
||
-------------------------------------
|
||
|
||
Recall the definition of polynomial evaluation.
|
||
Since a polynomial is defined with respect to a field or ring, we expect only to be able to evaluate the
|
||
polynomial at values *in* that field or ring.
|
||
|
||
$$
|
||
\begin{gather*}
|
||
K[x] \times K \overset{\text{eval}}{\longrightarrow} K
|
||
\\
|
||
(p(x), n) \overset{\text{eval}}{\mapsto} p(n)
|
||
\end{gather*}
|
||
$$
|
||
|
||
However, there's nothing wrong with evaluating polynomials with another polynomial,
|
||
as long as they're defined over the same structure.
|
||
After all, we can take powers of polynomials, scalar-multiply them with coefficients from *K*,
|
||
and add them together.
|
||
The same holds for matrices, or any "collection" structure *F* over *K* which has those properties.
|
||
|
||
$$
|
||
\begin{align*}
|
||
K[x] \times K[x]
|
||
&\overset{\text{eval}_{poly}}{\longrightarrow} K[x]
|
||
\\
|
||
(p(x), q(x)) \mapsto p(q(x))
|
||
\\[10pt]
|
||
K[x] \times K^{n \times n}
&\overset{\text{eval}_{mat}}{\longrightarrow} K^{n \times n}
\\
(p(x), A) \overset{?}{\mapsto} p(A)
|
||
\\[10pt]
|
||
K[x] \times F(K)
|
||
&\overset{\text{eval}_F}{\longrightarrow} F(K)
|
||
\end{align*}
|
||
$$
|
||
|
||
Rather than redefining evaluation for each of these cases,
|
||
we should map our polynomial into a structure compatible with how we want to evaluate it.
|
||
Essentially, this means that from a polynomial in the base structure,
|
||
we can derive polynomials in these other structures.
|
||
In particular, we can either have a matrix of polynomials or a polynomial in matrices.
|
||
|
||
<!-- TODO: notes about functoriality of `fmap`ping eval vs -->
|
||
:::: {layout-ncol="2"}
|
||
::: {}
|
||
$$
|
||
\begin{align*}
|
||
p &: K[x]
|
||
\\
|
||
p(x) &= x^n + p_{n-1}x^{n-1} + ...
|
||
\\
|
||
\phantom{= p} & + p_1 x + p_0
|
||
\end{align*}
|
||
$$
|
||
:::
|
||
|
||
::: {}
|
||
$x$ is a scalar indeterminate
|
||
|
||
```{.haskell}
|
||
p :: Polynomial K
|
||
```
|
||
:::
|
||
::::
|
||
|
||
:::: {layout-ncol="2"}
|
||
::: {}
|
||
$$
|
||
\begin{align*}
|
||
P &: (K[x])^{m \times m}
|
||
\\
|
||
P(x I) &= (x I)^n + (p_{n-1})(x I)^{n-1} + ...
|
||
\\
|
||
& + p_1(x I)+ p_0 I
|
||
\end{align*}
|
||
$$
|
||
:::
|
||
|
||
::: {}
|
||
$x$ is a scalar indeterminate, $P(x I)= p(x) I$ is a matrix of polynomials in $x$
|
||
|
||
```{.haskell}
|
||
asPolynomialMatrix
|
||
:: Polynomial K -> Matrix (Polynomial K)
|
||
|
||
pMat :: Matrix (Polynomial K)
|
||
pMat = asPolynomialMatrix p
|
||
```
|
||
:::
|
||
::::
|
||
|
||
:::: {layout-ncol="2"}
|
||
::: {}
|
||
$$
|
||
\begin{align*}
|
||
\hat P &: K^{m \times m}[X]
|
||
\\
|
||
\hat P(X) &= X^n + (p_{n-1}I)X^{n-1} + ...
|
||
\\
|
||
& + (p_1 I) X + (p_0 I)
|
||
\end{align*}
|
||
$$
|
||
:::
|
||
|
||
::: {}
|
||
$X$ is a matrix indeterminate, $\hat P(X)$ is a polynomial over matrices
|
||
|
||
```{.haskell}
|
||
asMatrixPolynomial
|
||
:: Polynomial K -> Polynomial (Matrix K)
|
||
|
||
pHat :: Polynomial (Matrix K)
|
||
pHat = asMatrixPolynomial p
|
||
```
|
||
:::
|
||
::::
|
||
|
||
|
||
### Cayley-Hamilton Theorem
|
||
|
||
When evaluating the characteristic polynomial of a matrix *with* that matrix,
|
||
something strange happens.
|
||
Continuing from the previous article, using $x^2 + x + 1$ and its companion matrix, we have:
|
||
|
||
$$
|
||
\begin{gather*}
|
||
p(x) = x^2 + x + 1 \qquad C_{p} = C
|
||
= \left( \begin{matrix}
|
||
0 & 1 \\
|
||
-1 & -1
|
||
\end{matrix} \right)
|
||
\\ \\
|
||
\hat P(C) = C^2 + C + (1 \cdot I)
|
||
= \left( \begin{matrix}
|
||
-1 & -1 \\
|
||
1 & 0
|
||
\end{matrix} \right)
|
||
+ \left( \begin{matrix}
|
||
0 & 1 \\
|
||
-1 & -1
|
||
\end{matrix} \right)
|
||
+ \left( \begin{matrix}
|
||
1 & 0 \\
|
||
0 & 1
|
||
\end{matrix} \right)
|
||
\\ \\
|
||
= \left( \begin{matrix}
|
||
0 & 0 \\
|
||
0 & 0
|
||
\end{matrix} \right)
|
||
\end{gather*}
|
||
$$
|
||
|
||
The result is the zero matrix.
|
||
This tells us that, at least in this case, the matrix *C* is a root of its own characteristic polynomial.
|
||
By the [Cayley-Hamilton theorem](https://en.wikipedia.org/wiki/Cayley%E2%80%93Hamilton_theorem),
|
||
this is true in general, no matter the degree of *p*, no matter its coefficients,
|
||
and importantly, no matter the choice of field.
|
||
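As a quick sanity check of the computation above, here is a minimal Python sketch (my own illustration, separate from the post's Haskell types) that evaluates a polynomial at a square matrix using Horner's rule.
The names `mat_mul` and `eval_poly_at_matrix` are made up for this example, and matrices are plain nested lists of integers.

```python
def mat_mul(a, b):
    """Product of two square integer matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def eval_poly_at_matrix(coeffs, m):
    """Evaluate a polynomial at the square matrix m with Horner's rule.

    coeffs run from the leading coefficient down to the constant term;
    each scalar coefficient c is treated as the matrix c * I.
    """
    n = len(m)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = mat_mul(result, m)   # multiply the running value by m
        for i in range(n):
            result[i][i] += c         # then add c * I
    return result

# Companion matrix of x^2 + x + 1, as in the worked example
C = [[0, 1], [-1, -1]]
print(eval_poly_at_matrix([1, 1, 1], C))  # [[0, 0], [0, 0]]
```

Nothing here reduces mod a prime, so the check holds over the integers, and therefore also after reducing mod any *p*.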
|
||
This is more powerful than it would otherwise seem.
|
||
For one, factoring a polynomial "inside" a matrix turns out to give the same answer
|
||
as factoring a polynomial over matrices.
|
||
|
||
:::: {layout-ncol="2"}
|
||
::: {}
|
||
|
||
$$
|
||
\begin{gather*}
|
||
P(xI) = \left( \begin{matrix}
|
||
x^2 + x + 1 & 0 \\
|
||
0 & x^2 + x + 1
|
||
\end{matrix}\right)
|
||
\\ \\
|
||
= (xI - C)(xI - C')
|
||
\\ \\
|
||
= \left( \begin{matrix}
|
||
x & -1 \\
|
||
1 & x + 1
|
||
\end{matrix} \right)
|
||
\left( \begin{matrix}
|
||
x - a & -b \\
|
||
-c & x - d
|
||
\end{matrix} \right)
|
||
\\ \\
|
||
\begin{align*}
|
||
x(x - a) + c &= x^2 + x + 1
|
||
\\
|
||
\textcolor{green}{x(-b) - (x - d)} &\textcolor{green}{= 0}
|
||
\\
|
||
\textcolor{blue}{(x - a) + (x + 1)(-c)} &\textcolor{blue}{= 0}
|
||
\\
|
||
(-b) + (x + 1)(x - d) &= x^2 + x + 1
|
||
\end{align*}
|
||
\\ \\
|
||
\textcolor{green}{(-b -1)x +d = 0} \implies b = -1, ~ d = 0 \\
|
||
\textcolor{blue}{(1 - c)x - a - c = 0} \implies c = 1, ~ a = -1
|
||
\\ \\
|
||
C' =
|
||
\left( \begin{matrix}
|
||
-1 & -1 \\
|
||
1 & 0
|
||
\end{matrix} \right)
|
||
\end{gather*}
|
||
$$
|
||
:::
|
||
|
||
::: {}
|
||
$$
|
||
\begin{gather*}
|
||
\hat P(X) = X^2 + X + 1I
|
||
\\[10pt]
|
||
= (X - C)(X - C')
|
||
\\[10pt]
|
||
= X^2 - (C + C')X + CC'
|
||
\\[10pt]
|
||
\implies
|
||
\\[10pt]
|
||
C + C' = -I, ~ C' = -I - C
|
||
\\[10pt]
|
||
CC' = I, ~ C^{-1} = C'
|
||
\\[10pt]
|
||
C' = \left( \begin{matrix}
|
||
-1 & -1 \\
|
||
1 & 0
|
||
\end{matrix} \right)
|
||
\end{gather*}
|
||
$$
|
||
:::
|
||
::::
|
||
|
||
It's important to note that a matrix factorization is not unique.
|
||
*Any* matrix with a given characteristic polynomial can be used as a root of that polynomial.
|
||
Of course, choosing one root affects the other matrix roots.
|
||
|
||
|
||
### Moving Roots
|
||
|
||
All matrices commute with the identity and zero matrices.
|
||
A less obvious fact is that all of the matrix roots *also* commute with one another.
|
||
By the Fundamental Theorem of Algebra,
|
||
[Vieta's formulas](https://en.wikipedia.org/wiki/Vieta%27s_formulas) state:
|
||
|
||
$$
|
||
\begin{gather*}
|
||
\hat P(X)
|
||
= \prod_{[i]_n} (X - \Xi_i)
|
||
= (X - \Xi_0) (X - \Xi_1)...(X - \Xi_{n-1})
|
||
\\
|
||
= \left\{ \begin{align*}
|
||
& \phantom{+} X^n
|
||
\\
|
||
& - (\Xi_0 + \Xi_1 + ... + \Xi_{n-1}) X^{n-1}
|
||
\\
|
||
& + (\Xi_0 \Xi_1+ \Xi_0 \Xi_2 + ... + \Xi_0 \Xi_{n-1} + \Xi_1 \Xi_2 + ... + \Xi_{n-2} \Xi_{n-1})X^{n-2}
|
||
\\
|
||
& \qquad \vdots
|
||
\\
|
||
& + (-1)^n \Xi_0 \Xi_1 \Xi_2...\Xi_{n-1}
|
||
\end{align*} \right.
|
||
\\
|
||
= X^n -\sigma_1([\Xi]_n)X^{n-1} + \sigma_2([\Xi]_n)X^{n-2} + ... + (-1)^n \sigma_n([\Xi]_n)
|
||
\end{gather*}
|
||
$$
|
||
|
||
The product range \[*i*\]~*n*~ means that the terms are ordered from 0 to *n* - 1 over the index given.
|
||
On the bottom line, *σ* are
|
||
[elementary symmetric polynomials](https://en.wikipedia.org/wiki/Elementary_symmetric_polynomial)
|
||
and \[*Ξ*\]~*n*~ is the list of root matrices from *Ξ*~*0*~ to Ξ~*n-1*~.
|
||
|
||
By factoring the matrix with the roots in a different order, we get another factorization.
|
||
It suffices to only focus on *σ*~2~, which has all pairwise products.
|
||
|
||
$$
|
||
\begin{gather*}
|
||
\pi \in S_n
|
||
\\
|
||
\qquad
|
||
\pi \circ \hat P(X) = \prod_{\pi ([i]_n)} (X - \Xi_i)
|
||
\\ \\
|
||
= X^n
|
||
- \sigma_1 \left(\pi ([\Xi]_n) \vphantom{^{1}} \right)X^{n-1}
|
||
+ \sigma_2 \left(\pi ([\Xi]_n) \vphantom{^{1}} \right)X^{n-2} + ...
|
||
+ (-1)^n \sigma_n \left(\pi ([\Xi]_n) \vphantom{^{1}} \right)
|
||
\\ \\
|
||
\\ \\
|
||
(0 ~ 1) \circ \hat P(X) = (X - \Xi_{1}) (X - \Xi_0)(X - \Xi_2)...(X - \Xi_{n-1})
|
||
\\
|
||
= X^n + ... + \sigma_2(\Xi_1, \Xi_0, \Xi_2, ...,\Xi_{n-1})X^{n-2} + ...
|
||
\\ \\ \\ \\
|
||
\begin{array}{}
|
||
e & (0 ~ 1) & (1 ~ 2) & ... & (n-2 ~~ n-1)
|
||
\\ \hline
|
||
\textcolor{red}{\Xi_0 \Xi_1} & \textcolor{red}{\Xi_1 \Xi_0} & \Xi_0 \Xi_1 & & \Xi_0 \Xi_1
|
||
\\
|
||
\Xi_0 \Xi_2 & \Xi_0 \Xi_2 & \Xi_0 \Xi_2 & & \Xi_0 \Xi_2
|
||
\\
|
||
\Xi_0 \Xi_3 & \Xi_0 \Xi_3 & \Xi_0 \Xi_3 & & \Xi_0 \Xi_3
|
||
\\
|
||
\vdots & \vdots & \vdots & & \vdots
|
||
\\
|
||
\Xi_0 \Xi_{n-1} & \Xi_0 \Xi_{n-1} & \Xi_{0} \Xi_{n-1} & & \Xi_0 \Xi_{n-1}
|
||
\\
|
||
\textcolor{green}{\Xi_1 \Xi_2} & \Xi_1 \Xi_2 & \textcolor{green}{\Xi_2 \Xi_1} & & \Xi_1 \Xi_2
|
||
\\
|
||
\vdots & \vdots & \vdots & & \vdots
|
||
\\
|
||
\textcolor{blue}{\Xi_{n-2} \Xi_{n-1}} & \Xi_{n-2} \Xi_{n-1} & \Xi_{n-2} \Xi_{n-1} & & \textcolor{blue}{\Xi_{n-1} \Xi_{n-2}}
|
||
\end{array}
|
||
\end{gather*}
|
||
$$
|
||
|
||
<!-- TODO: permutation -->
|
||
The "[path swaps]()" shown commute only the adjacent elements.
|
||
By contrast, the permutation (0 2) commutes *Ξ*~0~ past both *Ξ*~1~ and *Ξ*~2~.
|
||
But since we already know *Ξ*~0~ and *Ξ*~1~ commute by the above list,
|
||
we learn at this step that *Ξ*~0~ and *Ξ*~2~ commute.
|
||
This can be repeated until we reach the permutation (0 *n*-1) to prove commutativity between all pairs.
|
||
|
||
|
||
### Matrix Fields?
|
||
|
||
The above arguments tell us that if *p* is irreducible, we can take its companion matrix *C*~*p*~
|
||
and work with its powers in the same way we would a typical root.
|
||
Irreducible polynomials cannot have a constant term 0, otherwise *x* could be factored out.
|
||
The constant term is equal to the determinant of the companion matrix (up to sign),
|
||
so *C*~*p*~ is invertible.
|
||
We get commutativity for free, since it follows from associativity
|
||
that all powers of *C*~*p*~ commute.
|
||
|
||
This narrows the ring of matrices to a full-on field.
|
||
Importantly, it absolves us from the need to symbolically render elements using a power of the root.
|
||
Instead, they can be adjoined by going from scalars to matrices.
|
||
We can also find every element in the field arithmetically.
|
||
Starting with a root, produce new elements by taking its matrix powers.
|
||
Then, scalar-multiply them and add them to elements of the field which are already known.
|
||
For finite fields, we can repeat this process with the new matrices
|
||
until we have all *p*^*d*^ elements.
|
||
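The closure process described above can be sketched in a few lines of Python.
This is only an illustration under my own assumptions: the helper names are hypothetical, matrices are tuples of tuples so they can live in a set, and scalar multiplication is obtained implicitly through repeated addition.
The example root is the companion matrix of $x^2 + x + 1$ reduced mod 2.

```python
def mat_add(a, b, p):
    """Entrywise sum of two square matrices, reduced mod p."""
    return tuple(tuple((x + y) % p for x, y in zip(ra, rb))
                 for ra, rb in zip(a, b))

def mat_mul(a, b, p):
    """Product of two square matrices, reduced mod p."""
    n = len(a)
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(n)) % p
                       for j in range(n)) for i in range(n))

def generate_field(root, p):
    """Close {root} under matrix addition and multiplication mod p."""
    known = {root}
    while True:
        new = {op(a, b, p) for a in known for b in known
               for op in (mat_add, mat_mul)}
        if new <= known:      # nothing new was produced: we are done
            return known
        known |= new

# Companion matrix of x^2 + x + 1 over GF(2) (note -1 = 1 mod 2)
C = ((0, 1), (1, 1))
print(len(generate_field(C, 2)))  # 4
```

The brute-force loop is quadratic in the size of the field at every pass, so it is only sensible for very small fields.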
|
||
|
||
GF(8)
|
||
-----
|
||
|
||
This is all rather abstract, so let's look at an example before we proceed any further.
|
||
The next smallest field of characteristic 2 is GF(8).
|
||
We can construct this field from the two irreducible polynomials of degree 3 over GF(2):
|
||
|
||
$$
|
||
\begin{gather*}
|
||
q(x) = x^3 + x + 1 = 1011_x \sim {}_2 11 \qquad
|
||
C_q = \left( \begin{matrix}
|
||
0 & 1 & 0 \\
|
||
0 & 0 & 1 \\
|
||
1 & 1 & 0
|
||
\end{matrix} \right) \mod 2
|
||
\\ \\
|
||
r(x) = x^3 + x^2 + 1 =1101_x \sim {}_2 13 \qquad
|
||
C_r = \left( \begin{matrix}
|
||
0 & 1 & 0 \\
|
||
0 & 0 & 1 \\
|
||
1 & 0 & 1
|
||
\end{matrix} \right) \mod 2
|
||
\end{gather*}
|
||
$$
|
||
|
||
Notice how the bit string for either of these polynomials is the other's, reversed.
Arbitrarily, let's work with *C*~*r*~.
|
||
The powers of this matrix, mod 2, are as follows:
|
||
|
||
$$
|
||
\begin{gather*}
|
||
(C_r)^1 = \left( \begin{matrix}
|
||
0 & 1 & 0 \\
|
||
0 & 0 & 1 \\
|
||
1 & 0 & 1
|
||
\end{matrix} \right)
|
||
\quad
|
||
(C_r)^2 = \left( \begin{matrix}
|
||
0 & 0 & 1 \\
|
||
1 & 0 & 1 \\
|
||
1 & 1 & 1
|
||
\end{matrix} \right)
|
||
\quad
|
||
(C_r)^3 = \left( \begin{matrix}
|
||
1 & 0 & 1 \\
|
||
1 & 1 & 1 \\
|
||
1 & 1 & 0
|
||
\end{matrix} \right)
|
||
\\
|
||
(C_r)^4 = \left( \begin{matrix}
|
||
1 & 1 & 1 \\
|
||
1 & 1 & 0 \\
|
||
0 & 1 & 1
|
||
\end{matrix} \right) \quad
|
||
(C_r)^5 = \left( \begin{matrix}
|
||
1 & 1 & 0 \\
|
||
0 & 1 & 1 \\
|
||
1 & 0 & 0
|
||
\end{matrix} \right) \quad
|
||
(C_r)^6 = \left( \begin{matrix}
|
||
0 & 1 & 1 \\
|
||
1 & 0 & 0 \\
|
||
0 & 1 & 0
|
||
\end{matrix} \right)
|
||
\\
|
||
(C_r)^7 = \left( \begin{matrix}
|
||
1 & 0 & 0 \\
|
||
0 & 1 & 0 \\
|
||
0 & 0 & 1
|
||
\end{matrix} \right) = I
|
||
= (C_r)^0 \quad
|
||
(C_r)^8 = \left( \begin{matrix}
|
||
0 & 1 & 0 \\
|
||
0 & 0 & 1 \\
|
||
1 & 0 & 1
|
||
\end{matrix} \right) = C_r
|
||
\end{gather*}
|
||
$$
|
||
|
||
As a reminder, these matrices are taken mod 2, so the elements can only be 0 or 1.
|
||
The seventh power of *C*~*r*~ is just the identity matrix,
|
||
meaning that the eighth power is the original matrix.
|
||
This means that *C*~*r*~ is cyclic of order 7 with respect to self-multiplication mod 2.
|
||
Along with the zero matrix, this fully characterizes GF(8).
|
||
|
||
If we picked *C*~*q*~ instead, we would have gotten different matrices.
|
||
I'll omit writing them here, but we get the same result: *C*~*q*~ is also cyclic of order 7.
|
||
Since every nonzero element of the field can be written as a power of the root,
|
||
the root (and the polynomial) is termed
|
||
[primitive](https://en.wikipedia.org/wiki/Primitive_polynomial_%28field_theory%29).
|
||
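The order of each companion matrix can be checked by brute force.
The following is a small sketch, assuming matrices as nested lists; `matrix_order` is a name I made up, and the iteration cap is just a guard against non-invertible input.

```python
def mat_mul(a, b, p):
    """Product of two square matrices, reduced mod p."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) % p for j in range(n)]
            for i in range(n)]

def matrix_order(m, p):
    """Smallest k >= 1 with m^k = I mod p (m must be invertible mod p)."""
    n = len(m)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    base = [[x % p for x in row] for row in m]
    power, k = base, 1
    while power != identity:
        if k > p ** (n * n):  # more steps than there are matrices mod p
            raise ValueError("matrix is not invertible mod p")
        power = mat_mul(power, base, p)
        k += 1
    return k

C_q = [[0, 1, 0], [0, 0, 1], [1, 1, 0]]
C_r = [[0, 1, 0], [0, 0, 1], [1, 0, 1]]
print(matrix_order(C_q, 2), matrix_order(C_r, 2))  # 7 7
```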
|
||
|
||
### Condensing
|
||
|
||
Working with matrices directly, as a human, is very cumbersome.
|
||
While it makes computation explicit, it makes presentation difficult.
|
||
One of the things in which we know we should be interested is the characteristic polynomial,
|
||
since it is central to the definition and behavior of the matrices.
|
||
Let's focus only on the characteristic polynomial for successive powers of *C*~*r*~:
|
||
|
||
$$
|
||
\begin{gather*}
|
||
C_r = \left( \begin{matrix}
|
||
0 & 1 & 0 \\
|
||
0 & 0 & 1 \\
|
||
1 & 0 & 1
|
||
\end{matrix} \right) \mod 2
|
||
\\ ~ \\
|
||
\begin{array}{}
|
||
\text{charpoly}((C_r)^1)
|
||
&=& \color{blue} x^3 + x^2 + 1
|
||
&=& \color{blue} 1101_x \sim {}_2 13 = r
|
||
\\
|
||
\text{charpoly}((C_r)^2)
|
||
&=& \color{blue} x^3 + x^2 + 1
|
||
&=& \color{blue} 1101_x \sim {}_2 13 = r
|
||
\\
|
||
\text{charpoly}((C_r)^3)
|
||
&=& \color{red} x^3 + x + 1
|
||
&=& \color{red} 1011_x \sim {}_2 11 = q
|
||
\\
|
||
\text{charpoly}((C_r)^4)
|
||
&=& \color{blue} x^3 + x^2 + 1
|
||
&=& \color{blue} 1101_x \sim {}_2 13 = r
|
||
\\
|
||
\text{charpoly}((C_r)^5)
|
||
&=& \color{red} x^3 + x + 1
|
||
&=& \color{red} 1011_x \sim {}_2 11 = q
|
||
\\
|
||
\text{charpoly}((C_r)^6)
|
||
&=& \color{red} x^3 + x + 1
|
||
&=& \color{red} 1011_x \sim {}_2 11 = q
|
||
\\
|
||
\text{charpoly}((C_r)^7)
|
||
&=& x^3 + x^2 + x + 1
|
||
&=& 1111_x \sim {}_2 15 = (x+1)^3
|
||
\end{array}
|
||
\end{gather*}
|
||
$$
|
||
|
||
Somehow, even though we start with one characteristic polynomial, the other manages to work its way in here.
|
||
Both polynomials are of degree 3 and have 3 matrix roots (distinguished in red and blue).
|
||
|
||
If we chose to use *C*~*q*~, we'd actually get the same sequence backwards (starting with ~2~11).
|
||
It's beneficial to remember that 6, 5, and 3 can also be written as 7 - 1, 7 - 2, and 7 - 4.
|
||
This makes it clear that the powers of 2 (the field characteristic) less than 8 (the order of the field) play a role with respect to both the initial and terminal items.
|
||
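One way to reproduce the sequence above is sketched below.
This is not necessarily how the table was generated: I am assuming the Faddeev-LeVerrier recurrence, run over the integers (where each division by *k* is exact) and reduced mod *p* only at the end.
All function names are hypothetical, and `encode` reads the coefficient list as a base-*p* number, matching the ~2~13 notation.

```python
def mat_mul(a, b):
    """Plain integer matrix product (no modular reduction)."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def charpoly_mod(m, p):
    """Coefficients of det(xI - m) mod p, leading coefficient first."""
    n = len(m)
    coeffs = [1]
    aux = [[int(i == j) for j in range(n)] for i in range(n)]  # M_1 = I
    for k in range(1, n + 1):
        prod = mat_mul(m, aux)                          # A * M_k
        c = -sum(prod[i][i] for i in range(n)) // k     # exact division
        coeffs.append(c)
        aux = [[prod[i][j] + (c if i == j else 0) for j in range(n)]
               for i in range(n)]                       # M_(k+1)
    return [c % p for c in coeffs]

def encode(coeffs, p):
    """Read a coefficient list as the digits of a base-p integer."""
    value = 0
    for c in coeffs:
        value = value * p + c
    return value

def charpoly_sequence(m, p, count):
    """Encoded characteristic polynomials of m^1, ..., m^count mod p."""
    power, out = m, []
    for _ in range(count):
        out.append(encode(charpoly_mod(power, p), p))
        power = [[x % p for x in row] for row in mat_mul(power, m)]
    return out

C_r = [[0, 1, 0], [0, 0, 1], [1, 0, 1]]
print(charpoly_sequence(C_r, 2, 7))  # [13, 13, 11, 13, 11, 11, 15]
```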
|
||
|
||
### Factoring
|
||
|
||
Intuitively, you may try factoring each polynomial over matrices, using the corresponding powers of *C*~*r*~ as its roots.
|
||
This turns out to work:
|
||
|
||
$$
|
||
\begin{gather*}
|
||
\hat R(X) \overset?= (X - C_r)(X - (C_r)^2)(X - (C_r)^4)
|
||
\\
|
||
\hat Q(X) \overset?= (X - (C_r)^3)(X - (C_r)^5)(X - (C_r)^6)
|
||
\\ \\
|
||
\textcolor{red}{ \sigma_1([(C_r)^i]_{i \in [1,2,4]}) } = C_r + (C_r)^2 + (C_r)^4 = \textcolor{red}I
|
||
\\
|
||
\textcolor{brown}{ \sigma_1([(C_r)^i]_{i \in [3,5,6]}) } = (C_r)^3 + (C_r)^5 + (C_r)^6 = \textcolor{brown}0
|
||
\\ \\
|
||
\begin{align*}
|
||
\color{blue} \sigma_2([(C_r)^i]_{i \in [1,2,4]})
|
||
&= (C_r)(C_r)^2 + (C_r)(C_r)^4 + (C_r)^2(C_r)^4
|
||
\\
|
||
&= (C_r)^3 + (C_r)^5 + (C_r)^6 = \color{blue}0
|
||
\\
|
||
\color{cyan} \sigma_2([(C_r)^i]_{i \in [3,5,6]})
|
||
&= (C_r)^3(C_r)^5 + (C_r)^3(C_r)^6 + (C_r)^5(C_r)^6
|
||
\\
|
||
&= (C_r)^8 + (C_r)^9 + (C_r)^{11}
|
||
\\
|
||
&= (C_r)^1 + (C_r)^2 + (C_r)^4 = \color{cyan} I
|
||
\end{align*}
|
||
\\ \\
|
||
\textcolor{green}{ \sigma_3([(C_r)^i]_{i \in [1,2,4]}) } = (C_r)(C_r)^2(C_r)^4 = \textcolor{green}I
|
||
\\
|
||
\textcolor{lightgreen}{ \sigma_3([(C_r)^i]_{i \in [3,5,6]}) }= (C_r)^3(C_r)^5(C_r)^6 = \textcolor{lightgreen}I
|
||
\\ \\
|
||
\hat R(X) = X^3 + \textcolor{red}IX^2 + \textcolor{blue}0X + \textcolor{green}I
|
||
\\
|
||
\hat Q(X) = X^3 + \textcolor{brown}0X^2 + \textcolor{cyan}IX + \textcolor{lightgreen}I
|
||
\end{gather*}
|
||
$$
|
||
|
||
We could have factored our polynomials differently if we used *C*~*q*~ instead.
|
||
However, the effect of splitting both polynomials into linear factors is the same.
|
||
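The two factorizations above can also be expanded mechanically.
The sketch below multiplies out the linear factors (*X* - root) with matrix coefficients mod *p*; it assumes the indeterminate *X* commutes with every coefficient, and all names are my own.

```python
def mat_mul(a, b, p):
    """Product of two square matrices, reduced mod p."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) % p for j in range(n)]
            for i in range(n)]

def mat_sub(a, b, p):
    """Entrywise difference of two matrices, reduced mod p."""
    return [[(x - y) % p for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_pow(m, e, p):
    """m^e mod p by repeated multiplication (m^0 is the identity)."""
    n = len(m)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(e):
        result = mat_mul(result, m, p)
    return result

def expand_linear_factors(roots, p):
    """Matrix coefficients (leading first) of the product of (X - root)."""
    n = len(roots[0])
    zero = [[0] * n for _ in range(n)]
    coeffs = [mat_pow(roots[0], 0, p)]        # start from the polynomial I
    for r in roots:
        shifted = coeffs + [zero]             # multiply by X
        coeffs = [shifted[i] if i == 0        # then subtract (old poly) * r
                  else mat_sub(shifted[i], mat_mul(coeffs[i - 1], r, p), p)
                  for i in range(len(shifted))]
    return coeffs

C_r = [[0, 1, 0], [0, 0, 1], [1, 0, 1]]
I = mat_pow(C_r, 0, 2)
Z = [[0] * 3 for _ in range(3)]
R_hat = expand_linear_factors([mat_pow(C_r, e, 2) for e in (1, 2, 4)], 2)
Q_hat = expand_linear_factors([mat_pow(C_r, e, 2) for e in (3, 5, 6)], 2)
print(R_hat == [I, I, Z, I], Q_hat == [I, Z, I, I])  # True True
```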
|
||
|
||
GF(16)
|
||
------
|
||
|
||
GF(8) is simple to study, but too simple to study the sequence of characteristic polynomials alone.
|
||
Let's widen our scope to GF(16).
|
||
There are three irreducible polynomials of degree 4 over GF(2).
|
||
|
||
$$
|
||
\begin{gather*}
|
||
s(x) = x^4 + x + 1 = 10011_x \sim {}_2 19 \quad
|
||
C_s = \left( \begin{matrix}
|
||
0 & 1 & 0 & 0 \\
|
||
0 & 0 & 1 & 0 \\
|
||
0 & 0 & 0 & 1 \\
|
||
1 & 1 & 0 & 0
|
||
\end{matrix} \right) \mod 2
|
||
\\
|
||
t(x) = x^4 + x^3 + 1 = 11001_x \sim {}_2 25 \quad
|
||
C_t = \left( \begin{matrix}
|
||
0 & 1 & 0 & 0 \\
|
||
0 & 0 & 1 & 0 \\
|
||
0 & 0 & 0 & 1 \\
|
||
1 & 0 & 0 & 1
|
||
\end{matrix} \right) \mod 2
|
||
\\
|
||
u(x) = x^4 + x^3 + x^2 + x + 1 = 11111_x \sim {}_2 31 \quad
|
||
C_u = \left( \begin{matrix}
|
||
0 & 1 & 0 & 0 \\
|
||
0 & 0 & 1 & 0 \\
|
||
0 & 0 & 0 & 1 \\
|
||
1 & 1 & 1 & 1
|
||
\end{matrix} \right) \mod 2
|
||
\end{gather*}
|
||
$$
|
||
|
||
Again, *s* and *t* form a pair under the reversal of their bit strings, while *u* is palindromic.
|
||
Both *C*~*s*~ and *C*~*t*~ are cyclic of order 15, so *s* and *t* are primitive polynomials.
|
||
Using *s* = ~2~19 to generate the field, the powers of its companion matrix *C*~*s*~
|
||
have the following characteristic polynomials:
|
||
|
||
```{python}
|
||
#| echo: false
|
||
|
||
from IPython.display import Markdown
|
||
from tabulate import tabulate
|
||
|
||
charpolys = [19, 19, 31, 19, 21, 31, 25, 19, 31, 21, 25, 31, 25, 25, 17]
|
||
charpolyformat = lambda x: f"<span class=\"{'blue' if x == 19 else 'red' if x == 25 else ''}\">~2~{x}</span>"
|
||
|
||
Markdown(tabulate(
|
||
[[
|
||
"charpoly((*C*~*s*~)^*m*^)",
|
||
*[charpolyformat(charpoly) for charpoly in charpolys]
|
||
]],
|
||
headers=["*m*", *[str(i + 1) for i in range(15)]],
|
||
))
|
||
```
|
||
|
||
The polynomial ~2~19 occurs at positions 1, 2, 4, and 8.
|
||
These are obviously powers of 2, the characteristic of the field.
|
||
Similarly, the polynomial *t* = ~2~25 occurs at positions 14 (= 15 - 1), 13 (= 15 - 2),
|
||
11 (= 15 - 4), and 7 (= 15 - 8).
|
||
We'd get the same sequence backwards if we used *C*~*t*~ instead, just like in GF(8).
|
||
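The positions noted above are exactly the orbits of "multiply the exponent by 2, mod 15".
This is the standard notion of a cyclotomic coset: over GF(*p*), raising a root to the *p*-th power gives another root of the same polynomial.
The following sketch (with a hypothetical function name) lists those orbits; exponent 0 plays the role of position 15, the identity.

```python
def cyclotomic_cosets(p, n):
    """Partition exponents 0..n-1 into orbits of k -> k * p mod n.

    Assumes gcd(p, n) == 1, which holds for n = p^m - 1.
    """
    seen, cosets = set(), []
    for start in range(n):
        if start in seen:
            continue
        coset, k = [], start
        while k not in coset:
            coset.append(k)
            k = (k * p) % n
        seen.update(coset)
        cosets.append(coset)
    return cosets

print(cyclotomic_cosets(2, 15))
# [[0], [1, 2, 4, 8], [3, 6, 12, 9], [5, 10], [7, 14, 13, 11]]
```

Each coset lines up with one polynomial in the table: ~2~19 on {1, 2, 4, 8}, ~2~31 on {3, 6, 9, 12}, ~2~21 on {5, 10}, and ~2~25 on {7, 11, 13, 14}.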
|
||
|
||
### Non-primitive
|
||
|
||
The polynomial *u* = ~2~31 occurs at positions 3, 6, 9, and 12
|
||
-- multiples of 3, which is a factor of *15*.
|
||
It follows that the roots of *u* are cyclic of order 5, so this polynomial is irreducible,
|
||
but *not* primitive.
|
||
|
||
Naturally, $\hat U(X)$ can be factored as powers of (*C*~*s*~)^3^.
|
||
We can also factor it more naively as powers of *C*~*u*~. Either way, we get the same sequence.
|
||
|
||
:::: {layout-ncol = "2"}
|
||
::: {}
|
||
```{python}
|
||
#| echo: false
|
||
upowers = [31, 31, 31, 31, 17]
|
||
|
||
Markdown(tabulate(
|
||
[[
|
||
"charpoly((*C*~*s*~)^*3m*^)",
|
||
*[f"~2~{charpoly}" for charpoly in charpolys[2::3]]
|
||
], [
|
||
"charpoly((*C*~*u*~)^*m*^)",
|
||
*[f"~2~{upower}" for upower in upowers]
|
||
]],
|
||
headers=["*m*", *[str(i + 1) for i in range(5)]],
|
||
))
|
||
```
|
||
|
||
Both of the matrices in column 5 happen to be the identity matrix.
|
||
It follows that this root is only cyclic of order 5.
|
||
|
||
The polynomials ~2~19 and ~2~25 are reversals of one another and the sequences that their companion matrices
|
||
generate end one with another -- in this regard, they are dual.
|
||
However, ~2~31 = 11111~*x*~ is a palindrome and its sequence ends where it begins, so it is self-dual.
|
||
:::
|
||
|
||
::: {width = "33%"}
|
||
$$
|
||
\begin{gather*}
|
||
(C_u)^1 =\left( \begin{matrix}
|
||
0 & 1 & 0 & 0 \\
|
||
0 & 0 & 1 & 0 \\
|
||
0 & 0 & 0 & 1 \\
|
||
1 & 1 & 1 & 1
|
||
\end{matrix} \right)
|
||
\\ \\
|
||
(C_u)^2 =\left( \begin{matrix}
|
||
0 & 0 & 1 & 0 \\
|
||
0 & 0 & 0 & 1 \\
|
||
1 & 1 & 1 & 1 \\
|
||
1 & 0 & 0 & 0
|
||
\end{matrix} \right)
|
||
\\ \\
|
||
(C_u)^3 =\left( \begin{matrix}
|
||
0 & 0 & 0 & 1 \\
|
||
1 & 1 & 1 & 1 \\
|
||
1 & 0 & 0 & 0 \\
|
||
0 & 1 & 0 & 0
|
||
\end{matrix} \right)
|
||
\\ \\
|
||
(C_u)^4 =\left( \begin{matrix}
|
||
1 & 1 & 1 & 1 \\
|
||
1 & 0 & 0 & 0 \\
|
||
0 & 1 & 0 & 0 \\
|
||
0 & 0 & 1 & 0 \\
|
||
\end{matrix} \right)
|
||
\\ \\
|
||
(C_u)^5 =\left( \begin{matrix}
|
||
1 & 0 & 0 & 0 \\
|
||
0 & 1 & 0 & 0 \\
|
||
0 & 0 & 1 & 0 \\
|
||
0 & 0 & 0 & 1 \\
|
||
\end{matrix} \right)
|
||
\\
|
||
= I = (C_u)^0
|
||
\end{gather*}
|
||
$$
|
||
:::
|
||
::::
|
||
|
||
|
||
### Non-irreducible
|
||
|
||
In addition to the three irreducibles, a fourth polynomial, ~2~21 = 10101~*x*~,
|
||
also appears in the sequence on entries 5 and 10 -- multiples of 5, which is also a factor of 15.
|
||
Like ~2~31, this polynomial is palindromic.
|
||
This polynomial is *not* irreducible mod 2, and factors as:
|
||
|
||
$$
|
||
\begin{gather*}
|
||
{}_2 21 \sim 10101_x = x^4 + x^2 + 1 = (x^2 + x + 1)^2 \mod 2
|
||
\\[10pt]
|
||
(X - (C_s)^5)(X - (C_s)^{10}) = X^2 + ((C_s)^5 + (C_s)^{10})X + (C_s)^{15}
|
||
\\
|
||
= X^2 + IX + I
|
||
\end{gather*}
|
||
$$
|
||
|
||
Just like how the fields we construct are powers of a prime, this extra element is a power
|
||
of a smaller irreducible.
|
||
This is unexpected, but perhaps not surprising.
|
||
|
||
Something a little more surprising is that the companion matrix is cyclic of order *6*,
rather than of order 3 like (*C*~*s*~)^5^, which has the same characteristic polynomial.
|
||
The powers of its companion matrix are:
|
||
|
||
<!--
|
||
TODO: assemble this table
|
||
::: {}
|
||
| *m* | 1 | 2 | 3 | 4 | 5 | 6 |
|
||
|--------------------------|-------|-------|-------|-------|-------|-------|
|
||
| charpoly((*C*~*s*~)^5m^) | ~2~21 | ~2~21 | ~2~17 ((*C*~*s*~)^15^ is the identity matrix) | ~2~21 | ~2~21 | ~2~17 (identity) |
|
||
| charpoly((*C*~*21*~)^m^) | <span class="red">~2~21</span> | <span class="blue">~2~21</span> | ~2~17 ((*C*~*21*~)^3^ is the identity matrix) | <span class="blue">~2~21</span> | <span class="red">~2~21</span> | ~2~17 (identity) |
|
||
:::
|
||
-->
|
||
|
||
We can think of the repeated sequence as ensuring that there are enough roots of ~2~21.
|
||
The Fundamental Theorem of Algebra states that there must be 4 roots.
|
||
For *numbers*, we'd allow duplicate roots with multiplicities greater than 1, but the matrix roots are all distinct.
|
||
|
||
Basic group theory tells us that, in a cyclic group of order 6, the matrix's first and fifth powers
(in red) are a pair of inverses.
|
||
The constant term of the characteristic polynomial is the product of all four roots and,
|
||
as a polynomial over matrices, must be some nonzero multiple of the identity matrix.
|
||
Since the red roots are a pair of inverses, the blue roots are, too.
|
||
|
||
|
||
GF(32)
|
||
------
|
||
|
||
GF(32) turns out to be special.
|
||
There are six irreducible polynomials of degree 5 over GF(2).
|
||
Picking one of them at random, ~2~37, and looking at the polynomial sequence it generates, we see:
|
||
|
||
```{python}
#| echo: false
gf32powers = [
    37, 37, 61, 37, 55, 61, 47, 37, 55, 55, 59, 61, 59, 47, 41,
    37, 61, 55, 47, 55, 59, 59, 41, 61, 47, 59, 41, 47, 41, 41, 51,
]
gf32colors = {
    37: "red",
    61: "blue",
    55: "yellow",
    47: "orange",
    59: "purple",
    41: "green",
}
gf32format = lambda x: f"<span class=\"{gf32colors.get(x, '')}\">~2~{x}</span>"

# First half of the sequence: m = 0 is a placeholder, then m = 1..15
Markdown(tabulate(
    [[
        "charpoly((*C*~*37*~)^*m*^)",
        "-",
        *[gf32format(gf32power) for gf32power in gf32powers[:15]]
    ]],
    headers=["*m*", *[str(i) for i in range(16)]],
))
```
|
||
```{python}
#| echo: false
# Second half of the sequence, listed backwards: m = 31 down to 16
Markdown(tabulate(
    [[
        "charpoly((*C*~*37*~)^*m*^)",
        *[gf32format(gf32power) for gf32power in gf32powers[:-17:-1]]
    ]],
    headers=["*m*", *[str(i) for i in reversed(range(16, 32))]],
))
```
|
||
|
||
31 is prime, so we don't have any sub-patterns that appear on multiples of factors.
|
||
In fact, all six irreducible polynomials are present in this table.
|
||
The polynomials in complementary colors form pairs under reversal:
|
||
<span class="red">~2~37</span> and <span class="green">~2~41</span>,
|
||
<span class="blue">~2~61</span> and <span class="orange">~2~47</span>,
|
||
and <span class="yellow">~2~55</span> and <span class="purple">~2~59</span>.
|
||
|
||
Since their roots have order 31, these polynomials are actually
|
||
the distinct factors of *x*^31^ - 1 mod 2:
|
||
|
||
$$
|
||
\begin{gather*}
|
||
x^{31} -1 = (x-1)(x^{30} +x^{29} + ... + x + 1)
|
||
\\
|
||
(x^{30} +x^{29} + ... + x + 1) =
|
||
\left\{ \begin{align*}
|
||
&\phantom\cdot (x^5 + x^2 + 1) &\sim \quad {}_2 37
|
||
\\
|
||
&\cdot (x^5 + x^3 + 1) &\sim \quad {}_2 41 \\
|
||
&\cdot (x^5 + x^4 + x^3 + x^2 + 1) &\sim \quad {}_2 61
|
||
\\
|
||
&\cdot (x^5 + x^3 + x^2 + x + 1) &\sim \quad {}_2 47
|
||
\\
|
||
&\cdot (x^5 + x^4 + x^2 + x + 1) &\sim \quad {}_2 55
|
||
\\
|
||
&\cdot (x^5 + x^4 + x^3 + x + 1) &\sim \quad {}_2 59
|
||
\end{align*} \right.
|
||
\end{gather*}
|
||
$$
|
||
|
||
This is a feature special to fields of characteristic 2.
|
||
2 is the only prime number whose powers can be one more than another prime,
|
||
since all other prime powers are one more than even numbers.
|
||
31 is a [Mersenne prime](https://en.wikipedia.org/wiki/Mersenne_prime),
|
||
so all integers less than 31 are coprime to it.
|
||
Thus, there is no room for the "extra" entries we observed in GF(16) which occurred
|
||
on factors of 15 = 16 - 1.
|
||
No entry can be irreducible (but not primitive) or the power of an irreducible of lower degree.
|
||
In other words, *every irreducible polynomial of degree* p *over GF(2) is primitive if 2^p^ - 1 is a Mersenne prime*.
|
||
|
||
|
||
### Counting Irreducibles
|
||
|
||
The remark about coprimes to 31 may inspire you to think of the
|
||
[totient function](https://en.wikipedia.org/wiki/Euler%27s_totient_function).
|
||
We have *φ*(2^5^ - 1) = 30 = 5⋅6, where 5 is the degree and 6 is the number of primitive polynomials.
|
||
We also have *φ*(2^4^ - 1) = 8 = 4⋅2 and *φ*(2^3^ - 1) = 6 = 3⋅2.
In general, it is true that there are *φ*(*p*^*m*^ - 1) / *m* primitive polynomials of degree *m* over GF(*p*).
|
||
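The counting formula is easy to tabulate.
Here is a minimal sketch with a naive totient (fine for small inputs); both function names are my own.

```python
from math import gcd

def totient(n):
    """Euler's totient: how many k in 1..n are coprime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def count_primitive_polynomials(p, m):
    """Number of primitive polynomials of degree m over GF(p)."""
    return totient(p**m - 1) // m

for m in (3, 4, 5):
    print(m, count_primitive_polynomials(2, m))
# 3 2
# 4 2
# 5 6
```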
|
||
|
||
Polynomial Reversal
|
||
-------------------
|
||
|
||
We've only been looking at fields of characteristic 2, where the meaning of
|
||
"palindrome" and "reversed polynomial" is intuitive.
|
||
Let's look at an example over characteristic 3.
|
||
One primitive of degree 2 is ~3~14, which gives rise to the following sequence over GF(9):
|
||
|
||
```{python}
|
||
#| echo: false
|
||
gf9powers = [14, 10, 14, 16, 17, 10, 17, 13]
|
||
gf9format = lambda x: f"<span class=\"{'red' if x == 14 else 'blue' if x == 17 else ''}\">~3~{x}</span>"
|
||
|
||
Markdown(tabulate(
|
||
[[
|
||
"charpoly((*C*~*14*~)^*m*^)",
|
||
*[gf9format(gf9power) for gf9power in gf9powers]
|
||
]],
|
||
headers=["*m*", *[str(i + 1) for i in range(8)]],
|
||
))
|
||
```
|
||
|
||
The table suggests that ~3~14 = 112~*x*~ = *x*^2^ + *x* + 2 and ~3~17 = 122~*x*~ = *x*^2^ + 2*x* + 2
|
||
are reversals of one another.
|
||
More naturally, you'd think that 112~*x*~ reversed is 211~*x*~.
|
||
But remember that we prefer to work with monic polynomials.
|
||
By multiplying the polynomial by the multiplicative inverse of the leading coefficient (in this case, 2),
|
||
we get 422~*x*~ ≡ 122~*x*~ mod 3.
|
||
This is a rule that applies over larger characteristics in general.
|
||
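The reversal rule can be written out as a short function.
This is a sketch under the encoding used throughout the post (coefficients as base-*p* digits); `digits` and `monic_reversal` are hypothetical names, and the three-argument `pow` with exponent -1 needs Python 3.8 or newer.

```python
def digits(value, p):
    """Base-p digits of value, most significant first."""
    out = []
    while value:
        value, d = divmod(value, p)
        out.append(d)
    return out[::-1]

def monic_reversal(value, p):
    """Reverse the coefficients, then rescale so the result is monic.

    Requires a nonzero constant term; otherwise the new leading
    coefficient is 0 and pow(0, -1, p) raises ValueError.
    """
    coeffs = digits(value, p)[::-1]   # reversed coefficient list
    inv = pow(coeffs[0], -1, p)       # inverse of the new leading coefficient
    result = 0
    for c in coeffs:
        result = result * p + (c * inv) % p
    return result

print(monic_reversal(14, 3))  # 17
print(monic_reversal(19, 2))  # 25
```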
|
||
Note that ~3~16 = 121~*x*~ = *x*^2^ + 2*x* + 1 and ~3~13 = 111~*x*~ = *x*^2^ + *x* + 1 = *x*^2^ - 2*x* + 1,
|
||
both of which have factors over GF(3).
|
||
|
||
|
||
Power Graphs
|
||
------------
|
||
|
||
We can study the interplay of primitives, irreducibles, and their powers by converting
|
||
our sequences into (directed) graphs.
|
||
Each node in the graph represents a characteristic polynomial that appears over the field;
|
||
call the one under consideration *a*.
|
||
If the sequence of polynomials generated by *C*~*a*~ contains another polynomial *b*,
|
||
then there is an edge from *a* to *b*.
|
||
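The edge rule above can be turned into a small program.
What follows is a rough sketch and not the code behind the figures: polynomials are encoded as base-*p* integers, characteristic polynomials are computed with the Faddeev-LeVerrier recurrence over the integers, and every helper name is an assumption of mine.
Self-loops are left out, since an edge goes to *another* polynomial.

```python
def mat_mul(a, b, p):
    """Product of two square matrices, reduced mod p."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) % p for j in range(n)]
            for i in range(n)]

def charpoly_code(m, p):
    """Characteristic polynomial of m mod p, as a base-p integer."""
    n = len(m)
    aux = [[int(i == j) for j in range(n)] for i in range(n)]
    code = 1                                  # leading coefficient
    for k in range(1, n + 1):
        prod = [[sum(m[i][t] * aux[t][j] for t in range(n))
                 for j in range(n)] for i in range(n)]
        c = -sum(prod[i][i] for i in range(n)) // k   # exact over integers
        code = code * p + c % p
        aux = [[prod[i][j] + (c if i == j else 0) for j in range(n)]
               for i in range(n)]
    return code

def companion(code, p):
    """Companion matrix of the monic polynomial encoded by code."""
    low = []
    while code:
        code, d = divmod(code, p)
        low.append(d)                         # low-order coefficients first
    low = low[:-1]                            # drop the leading 1
    n = len(low)
    rows = [[int(j == i + 1) for j in range(n)] for i in range(n - 1)]
    rows.append([(-c) % p for c in low])
    return rows

def charpoly_sequence(code, p):
    """Charpolys of the powers of the companion matrix, up to identity."""
    c = companion(code, p)
    ident = [[int(i == j) for j in range(len(c))] for i in range(len(c))]
    power, seq = c, []
    while True:
        seq.append(charpoly_code(power, p))
        if power == ident:
            return seq
        power = mat_mul(power, c, p)

def power_graph(start, p):
    """Map each reachable polynomial to the set of its out-neighbours."""
    edges, todo = {}, [start]
    while todo:
        a = todo.pop()
        if a not in edges:
            edges[a] = set(charpoly_sequence(a, p)) - {a}
            todo.extend(edges[a])
    return edges

print(power_graph(13, 2) == {13: {11, 15}, 11: {13, 15}, 15: set()})  # True
```

Starting from any primitive polynomial reaches every node of the graph, so `power_graph(19, 2)` covers all of GF(16).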
|
||
We can do this for every GF(*p*^*m*^).
|
||
Let's start with the first few fields of characteristic 2.
|
||
We get the following graphs:
|
||
|
||

|
||
|
||
All nodes connect to the node corresponding to the identity matrix, since all roots are cyclic.
|
||
Also, since all primitive polynomials are interchangeable with one another,
|
||
they are all interconnected and form a [complete](https://en.wikipedia.org/wiki/Complete_graph) clique.
|
||
This means that, excluding the identity node, the graphs for fields of order one more
|
||
than a Mersenne prime are just the complete graphs.
|
||
|
||
Since all of the graphs share the identity node as a feature
|
||
-- a node with incoming edges from every other node -- it's convenient to omit it.
|
||
Here are a few more of these graphs after doing so, over fields of other characteristics:
|
||
|
||
<!-- TODO: these are graphviz diagrams and could be generated via code -->
|
||
::: {layout="[[1,1], [1,1], [1,1,1]]"}
|
||
![
|
||
GF(9)
|
||
]()
|
||
|
||
![
|
||
GF(25)
|
||
]()
|
||
|
||
![
|
||
GF(49)
|
||
]()
|
||
|
||
![
|
||
GF(121)
|
||
]()
|
||
|
||
![
|
||
GF(27)
|
||
]()
|
||
|
||
![
|
||
GF(125)
|
||
]()
|
||
|
||
![
|
||
GF(343)
|
||
]()
|
||
:::
|
||
|
||
|
||
### Spectra
|
||
|
||
Again, since visually interpreting graphs is difficult, we can study an invariant.
|
||
From these graphs of polynomials, we can compute *their* characteristic polynomials
|
||
(to add another layer to this algebraic cake) and look at their spectra.
|
||
|
||
It turns out that removing a fully-connected node (like the one for the identity matrix)
has a simple effect on the characteristic polynomial of a graph: it just removes a factor of *x*.
|
||
Here are a few of the (identity-reduced) spectra, arranged into a table.
|
||
|
||
| Characteristic | Order | Spectrum | Remark |
|
||
|----------------|-------|--------------------------------------|--------------------------|
|
||
| 2 | 4 | 0 | |
|
||
| | 8 | -1, 1 | Mersenne |
|
||
| | 16 | 0^2^, -1, 1 | |
|
||
| | 32 | -1^5^, 5 | Mersenne |
|
||
| 3 | 9 | 0^2^, -1, 1 | |
|
||
| | 27 | 0, -1^6^, 3^2^ | Pseudo-Mersenne? |
|
||
| 5 | 25 | 0^3^, -1^6^, 1^3^, 3 | |
|
||
| | 125 | 0, -1^38^, 1, 9^2^, 19 | Prime power in spectrum |
|
||
| 7 | 49 | 0^2^, -1^17^, 1^4^, 3^2^, 7 | |
|
||
| | 343 | 0, -1^106^, 1^4^, 5^2^, 11^2^, 35^2^ | Composite in spectrum |
|
||
| 11 | 121 | 0^4^, -1^49^, 1^2^, 3^6^, 7^2^, 15 | Composite in spectrum |
|
||
|
||
Incredibly, all spectra shown are composed exclusively of integers, and thus,
|
||
each of these graphs is an integral graph.
|
||
Moreover, it does not appear that any integer sequences that one may try extracting from this table
|
||
(for example, the multiplicity of -1) can be found in the
|
||
[Online Encyclopedia of Integer Sequences](https://oeis.org/).
|
||
|
||
From what I was able to tell, the following subgraphs were *also* integral over the range I tested:
|
||
|
||
- the induced subgraph of vertices corresponding to non-primitives
|
||
- the complement of the previous graph with respect to the whole graph
|
||
- the induced subgraph of vertices corresponding only to irreducibles
|
||
|
||
Unfortunately, proving any such relationship is out of the scope of this post (and my abilities).
|
||
|
||
|
||
Closing
|
||
-------
|
||
|
||
This concludes the first foray into using matrices as elements of prime power fields.
|
||
It is a subject which, using the tools of linear algebra, makes certain aspects of field theory
|
||
more palatable and constructs some objects with fairly interesting properties.
|
||
|
||
One of the most intriguing parts to me is the sequence of polynomials generated by a companion matrix.
|
||
Though I haven't proven it, I suspect that it suffices to study only the sequence generated
|
||
by a primitive polynomial.
|
||
It seems to be possible to get the non-primitive sequences by looking at the subsequences
|
||
where the indices are multiples of a factor of the length of the sequence.
|
||
But this means that the entire story about polynomials and finite fields can be foregone entirely,
|
||
and the problem instead becomes one of number theory.
|
||
|
||
The [next post](../3) will focus on an "application" of matrix roots to other areas of abstract algebra.
|
||
Diagrams made with Geogebra and NetworkX (GraphViz).
|