---
title: "Exploring Finite Fields, Part 2: Matrix Boogaloo"
description: |
  "..."
format:
  html:
    html-math-method: katex
jupyter: python3
date: "2024-01-15"
date-modified: "2025-07-16"
categories:
  - algebra
  - finite field
  - haskell
---

<style>
.red {
  color: red;
}
.orange {
  color: orange;
}
.yellow {
  color: yellow;
}
.green {
  color: green;
}
.blue {
  color: blue;
}
.purple {
  color: purple;
}
</style>

<!--
  TODO: data for this post should be built by an external Haskell program
  TODO: half-post about organizing "data" vs full post about graphs and irreducibles
-->

In the [last post](../1), we discussed finite fields, polynomials and matrices over them, and the typical,
  symbolic way of extending fields with polynomials.
This post will focus on circumventing symbolic means with numeric ones.


More about Matrices (and Polynomials)
-------------------------------------

Recall the definition of polynomial evaluation.
Since a polynomial is defined with respect to a field or ring, we expect only to be able to evaluate the
  polynomial at values *in* that field or ring.

$$
\begin{gather*}
  K[x] \times K \overset{\text{eval}}{\longrightarrow} K
  \\
  (p(x), n) \overset{\text{eval}}{\mapsto} p(n)
\end{gather*}
$$

However, there's nothing wrong with evaluating polynomials with another polynomial,
  as long as they're defined over the same structure.
After all, we can take powers of polynomials, scalar-multiply them with coefficients from *K*,
  and add them together.
The same holds for matrices, or any "collection" structure *F* over *K* which has those properties.

$$
\begin{align*}
  K[x] \times K[x]
    &\overset{\text{eval}_{poly}}{\longrightarrow} K[x]
  \\
  (p(x), q(x)) &\mapsto p(q(x))
  \\[10pt]
  K[x] \times K^{n \times n}
    &\overset{\text{eval}_{mat}}{\longrightarrow} K^{n \times n}
  \\
  (p(x), A) &\overset{?}{\mapsto} p(A)
  \\[10pt]
  K[x] \times F(K)
    &\overset{\text{eval}_F}{\longrightarrow} F(K)
\end{align*}
$$

Rather than redefining evaluation for each of these cases,
  we should map our polynomial into a structure compatible with how we want to evaluate it.
Essentially, this means that from a polynomial in the base structure,
  we can derive polynomials in these other structures.
In particular, we can either have a matrix of polynomials or a polynomial in matrices.

<!-- TODO: notes about functoriality of `fmap`ping eval vs -->
:::: {layout-ncol="2"}
::: {}
$$
\begin{align*}
  p &: K[x]
  \\
  p(x) &= x^n + p_{n-1}x^{n-1} + ...
  \\
  \phantom{= p} & + p_1 x + p_0
\end{align*}
$$
:::

::: {}
$x$ is a scalar indeterminate

```{.haskell}
p :: Polynomial K
```
:::
::::

:::: {layout-ncol="2"}
::: {}
$$
\begin{align*}
  P &: (K[x])^{m \times m}
  \\
  P(x I) &= (x I)^n + (p_{n-1})(x I)^{n-1} + ...
  \\
  & + p_1(x I)+ p_0 I
\end{align*}
$$
:::

::: {}
$x$ is a scalar indeterminate, $P(x I)= p(x) I$ is a matrix of polynomials in $x$

```{.haskell}
asPolynomialMatrix
  :: Polynomial K -> Matrix (Polynomial K)

pMat :: Matrix (Polynomial K)
pMat = asPolynomialMatrix p
```
:::
::::

:::: {layout-ncol="2"}
::: {}
$$
\begin{align*}
  \hat P &: K^{m \times m}[X]
  \\
  \hat P(X) &= X^n + (p_{n-1}I)X^{n-1} + ...
  \\
  & + (p_1 I) X + (p_0 I)
\end{align*}
$$
:::

::: {}
$X$ is a matrix indeterminate, $\hat P(X)$ is a polynomial over matrices

```{.haskell}
asMatrixPolynomial
  :: Polynomial K -> Polynomial (Matrix K)

pHat :: Polynomial (Matrix K)
pHat = asMatrixPolynomial p
```
:::
::::


### Cayley-Hamilton Theorem

When evaluating the characteristic polynomial of a matrix *with* that matrix,
  something strange happens.
Continuing from the previous article, using $x^2 + x + 1$ and its companion matrix, we have:

$$
\begin{gather*}
  p(x) = x^2 + x + 1 \qquad C_{p} = C
  = \left( \begin{matrix}
       0 &  1 \\
      -1 & -1
    \end{matrix} \right)
  \\ \\
  \hat P(C) = C^2 + C + (1 \cdot I)
    = \left( \begin{matrix}
      -1 & -1 \\
       1 &  0
    \end{matrix} \right)
    + \left( \begin{matrix}
       0 &  1 \\
      -1 & -1
    \end{matrix} \right)
    + \left( \begin{matrix}
      1 & 0 \\
      0 & 1
    \end{matrix} \right)
  \\ \\
  = \left( \begin{matrix}
      0 & 0 \\
      0 & 0
    \end{matrix} \right)
\end{gather*}
$$

The result is the zero matrix.
This tells us that, at least in this case, the matrix *C* is a root of its own characteristic polynomial.
By the [Cayley-Hamilton theorem](https://en.wikipedia.org/wiki/Cayley%E2%80%93Hamilton_theorem),
  this is true in general, no matter the degree of *p*, no matter its coefficients,
  and importantly, no matter the choice of field.
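
The theorem is easy to spot-check numerically.
What follows is a minimal sketch, assuming matrices are plain Python lists of rows; the helper names `mat_mul`, `companion`, and `eval_at_matrix` are hypothetical and not part of this post's own code.
It builds a companion matrix with the negated coefficients in the last row (the layout used above) and evaluates the polynomial at that matrix by Horner's rule.

```python
def mat_mul(a, b, mod=None):
    """Multiply square matrices (lists of rows), optionally reducing mod `mod`."""
    n = len(a)
    out = [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
           for i in range(n)]
    if mod is not None:
        out = [[x % mod for x in row] for row in out]
    return out


def companion(coeffs):
    """Companion matrix of the monic polynomial x^n + ... + coeffs[1]*x + coeffs[0].

    `coeffs` holds the non-leading coefficients, constant term first.
    """
    n = len(coeffs)
    rows = [[int(j == i + 1) for j in range(n)] for i in range(n - 1)]
    rows.append([-c for c in coeffs])
    return rows


def eval_at_matrix(coeffs, a, mod=None):
    """Evaluate x^n + ... + coeffs[0] at the matrix `a` by Horner's rule."""
    n = len(a)
    acc = [[int(i == j) for j in range(n)] for i in range(n)]  # leading coefficient 1
    for c in reversed(coeffs):
        acc = mat_mul(acc, a, mod)
        for i in range(n):
            acc[i][i] += c  # add c * I
            if mod is not None:
                acc[i][i] %= mod
    return acc


# x^2 + x + 1 over the integers, then x^3 + x^2 + 1 over GF(2)
print(eval_at_matrix([1, 1], companion([1, 1])))
print(eval_at_matrix([1, 0, 1], companion([1, 0, 1]), mod=2))
```

Both calls should print a zero matrix, in line with the theorem.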

This is more powerful than it would otherwise seem.
For one, factoring a polynomial "inside" a matrix turns out to give the same answer
  as factoring a polynomial over matrices.

:::: {layout-ncol="2"}
::: {}

$$
\begin{gather*}
  P(xI) = \left( \begin{matrix}
      x^2 + x + 1 &           0 \\
                0 & x^2 + x + 1
    \end{matrix}\right)
  \\ \\
  = (xI - C)(xI - C')
  \\ \\
  = \left( \begin{matrix}
      x &    -1 \\
      1 & x + 1
    \end{matrix} \right)
    \left( \begin{matrix}
      x - a &    -b \\
         -c & x - d
    \end{matrix} \right)
  \\ \\
  \begin{align*}
    x(x - a) + c &= x^2 + x + 1
    \\
    \textcolor{green}{x(-b) - (x - d)} &\textcolor{green}{= 0}
    \\
    \textcolor{blue}{(x - a) + (x + 1)(-c)} &\textcolor{blue}{= 0}
    \\
    (-b) + (x + 1)(x - d) &= x^2 + x + 1
  \end{align*}
  \\ \\
    \textcolor{green}{(-b -1)x +d = 0} \implies b = -1, ~ d = 0 \\
    \textcolor{blue}{(1 - c)x - a - c = 0} \implies c = 1, ~ a = -1
  \\ \\
  C' =
    \left( \begin{matrix}
      -1 & -1 \\
       1 &  0
    \end{matrix} \right)
\end{gather*}
$$
:::

::: {}
$$
\begin{gather*}
  \hat P(X) = X^2 + X + 1I
  \\[10pt]
  = (X - C)(X - C')
  \\[10pt]
  = X^2 - (C + C')X + CC'
  \\[10pt]
  \implies
  \\[10pt]
  C + C' = -I, ~ C' = -I - C
  \\[10pt]
  CC' = I, ~ C^{-1} = C'
  \\[10pt]
  C' = \left( \begin{matrix}
      -1 & -1 \\
       1 &  0
    \end{matrix} \right)
\end{gather*}
$$
:::
::::

It's important to note that a matrix factorization is not unique.
*Any* matrix with a given characteristic polynomial can be used as a root of that polynomial.
Of course, choosing one root affects the other matrix roots.
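
As a small illustration of this non-uniqueness, the sketch below starts from two different roots of $\hat P(X) = X^2 + X + I$: the companion matrix *C* and its transpose (which has the same characteristic polynomial).
The names `p_hat` and `partner` are hypothetical helpers; `partner` reads the second root off the relation $C + C' = -I$ derived above.

```python
def mat_mul(a, b):
    """Multiply two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def p_hat(x):
    """Evaluate X^2 + X + I at a 2x2 integer matrix."""
    sq = mat_mul(x, x)
    return [[sq[i][j] + x[i][j] + int(i == j) for j in range(2)] for i in range(2)]


def partner(x):
    """The other root, -I - X, from the coefficient of X."""
    return [[-int(i == j) - x[i][j] for j in range(2)] for i in range(2)]


C = [[0, 1], [-1, -1]]
C_T = [[0, -1], [1, -1]]  # transpose of C

for root in (C, C_T):
    other = partner(root)
    # Each root and its partner evaluate to zero, and their product is I.
    print(root, other, p_hat(root), p_hat(other), mat_mul(root, other))
```

Choosing the transpose as the first root changes the second root accordingly, but the product of the pair is still the identity.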


### Moving Roots

All matrices commute with the identity and zero matrices.
A less obvious fact is that all of the matrix roots *also* commute with one another.
By the Fundamental Theorem of Algebra,
  [Vieta's formulas](https://en.wikipedia.org/wiki/Vieta%27s_formulas) state:

$$
\begin{gather*}
  \hat P(X)
    = \prod_{[i]_n} (X - \Xi_i)
    = (X - \Xi_0) (X - \Xi_1)...(X - \Xi_{n-1})
  \\
  = \left\{ \begin{align*}
      & \phantom{+} X^n
      \\
      & - (\Xi_0 + \Xi_1 + ... + \Xi_{n-1}) X^{n-1}
      \\
      & + (\Xi_0 \Xi_1+ \Xi_0 \Xi_2 + ... + \Xi_0 \Xi_{n-1} + \Xi_1 \Xi_2 + ... + \Xi_{n-2} \Xi_{n-1})X^{n-2}
      \\
      & \qquad \vdots
      \\
      & + (-1)^n \Xi_0 \Xi_1 \Xi_2...\Xi_{n-1}
    \end{align*} \right.
  \\
  = X^n -\sigma_1([\Xi]_n)X^{n-1} + \sigma_2([\Xi]_n)X^{n-2} + ... + (-1)^n \sigma_n([\Xi]_n)
\end{gather*}
$$

The product range \[*i*\]~*n*~ means that the terms are ordered from 0 to *n* - 1 over the index given.
On the bottom line, *σ* are
  [elementary symmetric polynomials](https://en.wikipedia.org/wiki/Elementary_symmetric_polynomial)
  and \[*Ξ*\]~*n*~ is the list of root matrices from *Ξ*~0~ to *Ξ*~*n*-1~.

By factoring the polynomial with the roots in a different order, we get another factorization.
It suffices to only focus on *σ*~2~, which has all pairwise products.

$$
\begin{gather*}
  \pi \in S_n
  \\
  \qquad
    \pi \circ \hat P(X) = \prod_{\pi ([i]_n)} (X - \Xi_i)
  \\ \\
  = X^n
    - \sigma_1 \left(\pi ([\Xi]_n) \vphantom{^{1}} \right)X^{n-1}
    + \sigma_2 \left(\pi ([\Xi]_n) \vphantom{^{1}} \right)X^{n-2} + ...
    + (-1)^n \sigma_n \left(\pi ([\Xi]_n) \vphantom{^{1}} \right)
  \\ \\
  \\ \\
  (0 ~ 1) \circ \hat P(X) = (X - \Xi_{1}) (X - \Xi_0)(X - \Xi_2)...(X - \Xi_{n-1})
  \\
  = X^n + ... + \sigma_2(\Xi_1, \Xi_0, \Xi_2, ...,\Xi_{n-1})X^{n-2} + ...
  \\ \\ \\ \\
  \begin{array}{}
    e & (0 ~ 1) & (1 ~ 2) & ... & (n-2 ~~ n-1)
    \\ \hline
    \textcolor{red}{\Xi_0 \Xi_1} & \textcolor{red}{\Xi_1 \Xi_0} & \Xi_0 \Xi_1  & & \Xi_0 \Xi_1
    \\
    \Xi_0 \Xi_2 & \Xi_0 \Xi_2 & \Xi_0 \Xi_2 & & \Xi_0 \Xi_2
    \\
    \Xi_0 \Xi_3  & \Xi_0 \Xi_3 & \Xi_0 \Xi_3 & & \Xi_0 \Xi_3
    \\
    \vdots & \vdots & \vdots & & \vdots
    \\
    \Xi_0 \Xi_{n-1}  & \Xi_0 \Xi_{n-1} & \Xi_{0} \Xi_{n-1} & & \Xi_0 \Xi_{n-1}
    \\
    \textcolor{green}{\Xi_1 \Xi_2}  & \Xi_1 \Xi_2 & \textcolor{green}{\Xi_2 \Xi_1}  & & \Xi_1 \Xi_2
    \\
    \vdots & \vdots & \vdots & & \vdots
    \\
    \textcolor{blue}{\Xi_{n-2} \Xi_{n-1}}  & \Xi_{n-2} \Xi_{n-1} & \Xi_{n-2} \Xi_{n-1} & & \textcolor{blue}{\Xi_{n-1} \Xi_{n-2}}
  \end{array}
\end{gather*}
$$

<!-- TODO: permutation -->
The "[path swaps]()" shown commute only the adjacent elements.
By contrast, the permutation (0 2) commutes *Ξ*~0~ past both *Ξ*~1~ and *Ξ*~2~.
But since we already know *Ξ*~0~ and *Ξ*~1~ commute by the above list,
  we learn at this step that *Ξ*~0~ and *Ξ*~2~ commute.
This can be repeated until we reach the permutation (0 *n*-1) to prove commutativity between all pairs.


### Matrix Fields?

The above arguments tell us that if *p* is irreducible, we can take its companion matrix *C*~*p*~
  and work with its powers in the same way we would a typical root.
Irreducible polynomials cannot have a constant term 0, otherwise *x* could be factored out.
The constant term is equal to the determinant of the companion matrix (up to sign),
  so *C*~*p*~ is invertible.
We get commutativity for free, since it follows from associativity
  that all powers of *C*~*p*~ commute.

This narrows the ring of matrices to a full-on field.
Importantly, it absolves us from the need to symbolically render elements using a power of the root.
Instead, they can be adjoined by going from scalars to matrices.
We can also find every element in the field arithmetically.
Starting with a root, produce new elements by taking its matrix powers.
Then, scalar-multiply them and add them to elements of the field which are already known.
For finite fields, we can repeat this process with the new matrices
  until we have all *p*^*d*^ elements.
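
The procedure can be sketched as a closure computation.
This is only an illustration under stated assumptions: matrices are stored as tuples of tuples so that they can live in a set, and `generate_field` (a hypothetical name) repeatedly adds and multiplies known elements until nothing new appears.
Scalar multiplication is covered by repeated addition.

```python
def freeze(rows, p):
    """Reduce a matrix mod p and make it hashable."""
    return tuple(tuple(x % p for x in row) for row in rows)


def add(a, b, p):
    return tuple(tuple((x + y) % p for x, y in zip(ra, rb)) for ra, rb in zip(a, b))


def mul(a, b, p):
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) % p for j in range(n))
        for i in range(n)
    )


def generate_field(root, p):
    """Close {root} under matrix addition and multiplication mod p."""
    known = {freeze(root, p)}
    while True:
        new = {op(a, b, p) for a in known for b in known for op in (add, mul)}
        if new <= known:
            return known
        known |= new


gf8 = generate_field([[0, 1, 0], [0, 0, 1], [1, 0, 1]], 2)  # x^3 + x^2 + 1 over GF(2)
gf9 = generate_field([[0, 1], [-2, -1]], 3)                 # x^2 + x + 2 over GF(3)
print(len(gf8), len(gf9))
```

The two sets should have 8 and 9 elements, matching *p*^*d*^ in each case.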


GF(8)
-----

This is all rather abstract, so let's look at an example before we proceed any further.
The next smallest field of characteristic 2 is GF(8).
We can construct this field from the two irreducible polynomials of degree 3 over GF(2):

$$
\begin{gather*}
  q(x) = x^3 + x + 1 = 1011_x \sim {}_2  11 \qquad
    C_q = \left( \begin{matrix}
      0 & 1 & 0 \\
      0 & 0 & 1 \\
      1 & 1 & 0
    \end{matrix} \right) \mod 2
  \\ \\
  r(x) = x^3 + x^2 + 1 =1101_x \sim {}_2 13 \qquad
    C_r = \left( \begin{matrix}
      0 & 1 & 0 \\
      0 & 0 & 1 \\
      1 & 0 & 1
    \end{matrix} \right) \mod 2
\end{gather*}
$$

Notice how the bit string for either of these polynomials is the other's, reversed.
Arbitrarily, let's work with *C*~*r*~.
The powers of this matrix, mod 2, are as follows:

$$
\begin{gather*}
  (C_r)^1 = \left( \begin{matrix}
      0 & 1 & 0 \\
      0 & 0 & 1 \\
      1 & 0 & 1
    \end{matrix} \right)
  \quad
  (C_r)^2 = \left( \begin{matrix}
      0 & 0 & 1 \\
      1 & 0 & 1 \\
      1 & 1 & 1
    \end{matrix} \right)
  \quad
  (C_r)^3 = \left( \begin{matrix}
      1 & 0 & 1 \\
      1 & 1 & 1 \\
      1 & 1 & 0
    \end{matrix} \right)
  \\
  (C_r)^4 = \left( \begin{matrix}
      1 & 1 & 1 \\
      1 & 1 & 0 \\
      0 & 1 & 1
    \end{matrix} \right) \quad
  (C_r)^5 = \left( \begin{matrix}
      1 & 1 & 0 \\
      0 & 1 & 1 \\
      1 & 0 & 0
    \end{matrix} \right) \quad
  (C_r)^6 = \left( \begin{matrix}
      0 & 1 & 1 \\
      1 & 0 & 0 \\
      0 & 1 & 0
    \end{matrix} \right)
  \\
  (C_r)^7 = \left( \begin{matrix}
      1 & 0 & 0 \\
      0 & 1 & 0 \\
      0 & 0 & 1
    \end{matrix} \right) = I
    = (C_r)^0 \quad
  (C_r)^8 = \left( \begin{matrix}
      0 & 1 & 0 \\
      0 & 0 & 1 \\
      1 & 0 & 1
    \end{matrix} \right) = C_r
\end{gather*}
$$

As a reminder, these matrices are taken mod 2, so the elements can only be 0 or 1.
The seventh power of *C*~*r*~ is just the identity matrix,
  meaning that the eighth power is the original matrix.
This means that *C*~*r*~ is cyclic of order 7 with respect to self-multiplication mod 2.
Along with the zero matrix, this fully characterizes GF(8).

If we picked *C*~*q*~ instead, we would have gotten different matrices.
I'll omit writing them here, but we get the same result: *C*~*q*~ is also cyclic of order 7.
Since every nonzero element of the field can be written as a power of the root,
  the root (and the polynomial) is termed
  [primitive](https://en.wikipedia.org/wiki/Primitive_polynomial_%28field_theory%29).
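
For readers who would like to see the omitted powers of *C*~*q*~, here is a short sketch that lists them.
The function name `powers_until_identity` is hypothetical, and the loop bound is just a guard against non-invertible input.

```python
def mat_mul(a, b, p):
    """Multiply two square matrices mod p."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) % p for j in range(n)]
            for i in range(n)]


def powers_until_identity(c, p):
    """Return [c^1, c^2, ..., c^k] where c^k is the first power equal to I."""
    n = len(c)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    out = [[[x % p for x in row] for row in c]]
    while out[-1] != identity:
        if len(out) > p ** (n * n):  # more steps than there are matrices
            raise ValueError("matrix is not invertible mod p")
        out.append(mat_mul(out[-1], c, p))
    return out


C_q = [[0, 1, 0], [0, 0, 1], [1, 1, 0]]  # companion of x^3 + x + 1
for m, power in enumerate(powers_until_identity(C_q, 2), start=1):
    print(m, power)
```

The length of the returned list is the multiplicative order of the matrix.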


### Condensing

Working with matrices directly, as a human, is very cumbersome.
While it makes computation explicit, it makes presentation difficult.
One of the things in which we know we should be interested is the characteristic polynomial,
  since it is central to the definition and behavior of the matrices.
Let's focus only on the characteristic polynomial for successive powers of *C*~*r*~

$$
\begin{gather*}
  C_r = \left( \begin{matrix}
      0 & 1 & 0 \\
      0 & 0 & 1 \\
      1 & 0 & 1
    \end{matrix} \right) \mod 2
  \\ ~ \\
  \begin{array}{}
    \text{charpoly}((C_r)^1)
      &=& \color{blue} x^3 + x^2 + 1
      &=& \color{blue} 1101_x \sim {}_2 13 = r
    \\
    \text{charpoly}((C_r)^2)
      &=& \color{blue} x^3 + x^2 + 1
      &=& \color{blue} 1101_x \sim {}_2 13 = r
    \\
    \text{charpoly}((C_r)^3)
      &=& \color{red} x^3 + x + 1
      &=& \color{red} 1011_x \sim {}_2 11 = q
    \\
    \text{charpoly}((C_r)^4)
      &=& \color{blue} x^3 + x^2 + 1
      &=& \color{blue} 1101_x \sim {}_2 13 = r
    \\
    \text{charpoly}((C_r)^5)
      &=& \color{red} x^3 + x + 1
      &=& \color{red} 1011_x \sim {}_2 11 = q
    \\
    \text{charpoly}((C_r)^6)
      &=& \color{red} x^3 + x + 1
      &=& \color{red} 1011_x \sim {}_2 11 = q
    \\
    \text{charpoly}((C_r)^7)
      &=& x^3 + x^2 + x + 1
      &=& 1111_x \sim {}_2 15 = (x+1)^3
  \end{array}
\end{gather*}
$$

Somehow, even though we start with one characteristic polynomial, the other manages to work its way in here.
Both polynomials are of degree 3 and have 3 matrix roots (distinguished in red and blue).

If we chose to use *C*~*q*~, we'd actually get the same sequence backwards (starting with ~2~11).
It's beneficial to remember that 6, 5, and 3 can also be written as 7 - 1, 7 - 2, and 7 - 4.
This makes it clear that the powers of 2 (the field characteristic) less than 8 (the order of the field) play a role with respect to both the initial and terminal items.
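
The sequence above can be reproduced with a few lines of code.
The sketch below is one possible approach, not necessarily how the table was produced: it computes each characteristic polynomial over the integers with the Faddeev-LeVerrier recurrence (where the division by *k* is exact), then reduces the coefficients mod *p* and packs them into the base-*p* integer used throughout this post.
All function names are hypothetical.

```python
def mat_mul(a, b, p=None):
    """Multiply square matrices; reduce mod p only when p is given."""
    n = len(a)
    out = [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
           for i in range(n)]
    return out if p is None else [[x % p for x in row] for row in out]


def charpoly_code(a, p):
    """Characteristic polynomial of `a` mod p, encoded as a base-p integer."""
    n = len(a)
    m = [[0] * n for _ in range(n)]
    c = 1     # current coefficient, starting with the leading 1
    code = 1  # base-p digits, leading coefficient first
    for k in range(1, n + 1):
        m = mat_mul(a, m)
        for i in range(n):
            m[i][i] += c  # M_k = A * M_{k-1} + c * I
        am = mat_mul(a, m)
        c = -sum(am[i][i] for i in range(n)) // k  # exact over the integers
        code = code * p + c % p
    return code


def charpoly_sequence(c, p):
    """Encoded characteristic polynomials of c^1, c^2, ... up to the identity."""
    n = len(c)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    power = [[x % p for x in row] for row in c]
    out = []
    while True:
        out.append(charpoly_code(power, p))
        if power == identity:
            return out
        power = mat_mul(power, c, p)


C_r = [[0, 1, 0], [0, 0, 1], [1, 0, 1]]
C_q = [[0, 1, 0], [0, 0, 1], [1, 1, 0]]
print(charpoly_sequence(C_r, 2))
print(charpoly_sequence(C_q, 2))
```

The first list should match the table above, and the second should be the same list with ~2~11 and ~2~13 exchanged.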


### Factoring

Intuitively, you may try to factor the polynomials over matrices, $\hat R$ and $\hat Q$, using powers of *C*~*r*~ as roots.
This turns out to work:

$$
\begin{gather*}
  \hat R(X) \overset?= (X - C_r)(X - (C_r)^2)(X - (C_r)^4)
  \\
  \hat Q(X) \overset?= (X - (C_r)^3)(X - (C_r)^5)(X - (C_r)^6)
  \\ \\
  \textcolor{red}{ \sigma_1([(C_r)^i]_{i \in [1,2,4]}) } = C_r + (C_r)^2 + (C_r)^4 = \textcolor{red}I
  \\
  \textcolor{brown}{ \sigma_1([(C_r)^i]_{i \in [3,5,6]}) } = (C_r)^3 + (C_r)^5 + (C_r)^6 = \textcolor{brown}0
  \\ \\
  \begin{align*}
    \color{blue} \sigma_2([(C_r)^i]_{i \in [1,2,4]})
      &= (C_r)(C_r)^2 + (C_r)(C_r)^4 + (C_r)^2(C_r)^4
    \\
    &= (C_r)^3 + (C_r)^5 + (C_r)^6 = \color{blue}0
    \\
    \color{cyan} \sigma_2([(C_r)^i]_{i \in [3,5,6]})
      &= (C_r)^3(C_r)^5 + (C_r)^3(C_r)^6 + (C_r)^5(C_r)^6
    \\
    &= (C_r)^8 + (C_r)^9 + (C_r)^{11}
    \\
    &= (C_r)^1 + (C_r)^2 + (C_r)^4 = \color{cyan} I
  \end{align*}
  \\ \\
  \textcolor{green}{ \sigma_3([(C_r)^i]_{i \in [1,2,4]}) } = (C_r)(C_r)^2(C_r)^4 = \textcolor{green}I
  \\
  \textcolor{lightgreen}{ \sigma_3([(C_r)^i]_{i \in [3,5,6]}) }= (C_r)^3(C_r)^5(C_r)^6 = \textcolor{lightgreen}I
  \\ \\
  \hat R(X) = X^3 + \textcolor{red}IX^2 + \textcolor{blue}0X + \textcolor{green}I
  \\
  \hat Q(X) = X^3 + \textcolor{brown}0X^2 + \textcolor{cyan}IX + \textcolor{lightgreen}I
\end{gather*}
$$

We could have factored our polynomials differently if we used *C*~*q*~ instead.
However, the effect of splitting both polynomials into linear factors is the same.
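
The elementary symmetric polynomials in the derivation can also be checked mechanically.
The following sketch assumes the roots commute (as argued earlier), so the order within each product does not matter; `sigma` and the other helpers are hypothetical names.

```python
from functools import reduce
from itertools import combinations

P = 2  # field characteristic


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) % P for j in range(n)]
            for i in range(n)]


def mat_add(a, b):
    return [[(x + y) % P for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_pow(a, e):
    """a^e for e >= 1 by repeated multiplication."""
    out = a
    for _ in range(e - 1):
        out = mat_mul(out, a)
    return out


def sigma(k, roots):
    """k-th elementary symmetric polynomial of a list of commuting matrices."""
    n = len(roots[0])
    total = [[0] * n for _ in range(n)]
    for combo in combinations(roots, k):
        total = mat_add(total, reduce(mat_mul, combo))
    return total


C_r = [[0, 1, 0], [0, 0, 1], [1, 0, 1]]
r_roots = [mat_pow(C_r, e) for e in (1, 2, 4)]
q_roots = [mat_pow(C_r, e) for e in (3, 5, 6)]

for name, roots in (("R", r_roots), ("Q", q_roots)):
    print(name, [sigma(k, roots) for k in (1, 2, 3)])
```

Since signs do not matter mod 2, the three matrices printed for each polynomial are exactly its coefficients of $X^2$, $X$, and the constant term.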


GF(16)
------

GF(8) is simple to study, but too simple to study the sequence of characteristic polynomials alone.
Let's widen our scope to GF(16).
There are three irreducible polynomials of degree 4 over GF(2).

$$
\begin{gather*}
  s(x) = x^4 + x + 1  = 10011_x \sim {}_2 19 \quad
    C_s = \left( \begin{matrix}
      0 & 1 & 0 & 0 \\
      0 & 0 & 1 & 0 \\
      0 & 0 & 0 & 1 \\
      1 & 1 & 0 & 0
    \end{matrix} \right) \mod 2
  \\
  t(x) = x^4 + x^3 + 1 = 11001_x \sim {}_2 25 \quad
    C_t = \left( \begin{matrix}
      0 & 1 & 0 & 0 \\
      0 & 0 & 1 & 0 \\
      0 & 0 & 0 & 1 \\
      1 & 0 & 0 & 1
    \end{matrix} \right) \mod 2
  \\
  u(x) = x^4 + x^3 + x^2 + x + 1 = 11111_x \sim {}_2 31 \quad
    C_u = \left( \begin{matrix}
      0 & 1 & 0 & 0 \\
      0 & 0 & 1 & 0 \\
      0 & 0 & 0 & 1 \\
      1 & 1 & 1 & 1
    \end{matrix} \right) \mod 2
\end{gather*}
$$

Again, *s* and *t* form a pair under the reversal of their bit strings, while *u* is palindromic.
Both *C*~*s*~ and *C*~*t*~ are cyclic of order 15, so *s* and *t* are primitive polynomials.
Using *s* = ~2~19 to generate the field, the powers of its companion matrix *C*~*s*~
  have the following characteristic polynomials:

```{python}
#| echo: false

from IPython.display import Markdown
from tabulate import tabulate

charpolys = [19, 19, 31, 19, 21, 31, 25, 19, 31, 21, 25, 31, 25, 25, 17]
charpolyformat = lambda x: f"<span class=\"{'blue' if x == 19 else 'red' if x == 25 else ''}\">~2~{x}</span>"

Markdown(tabulate(
  [[
    "charpoly((*C*~*s*~)^*m*^)",
    *[charpolyformat(charpoly) for charpoly in charpolys]
  ]],
  headers=["*m*", *[str(i + 1) for i in range(15)]],
))
```

The polynomial ~2~19 occurs at positions 1, 2, 4, and 8.
These are obviously powers of 2, the characteristic of the field.
Similarly, the polynomial *t* = ~2~25 occurs at positions 14 (= 15 - 1), 13 (= 15 - 2),
  11 (= 15 - 4), and 7 (= 15 - 8).
We'd get the same sequence backwards if we used *C*~*t*~ instead, just like in GF(8).
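
One way to anticipate these positions, offered as a sketch: squaring a root (the Frobenius map) gives another root of the same polynomial, so positions *m* and 2*m* mod 15 must share an entry.
The orbits of "multiply by 2 mod 15" are the cyclotomic cosets, computed below by the hypothetical helper `cyclotomic_cosets`; the `charpolys` list is copied from the table above.

```python
def cyclotomic_cosets(p, n):
    """Orbits of m -> p*m (mod n) on 1..n-1, assuming gcd(p, n) == 1."""
    seen, cosets = set(), []
    for start in range(1, n):
        if start in seen:
            continue
        coset, m = [], start
        while m not in coset:
            coset.append(m)
            m = m * p % n
        seen.update(coset)
        cosets.append(coset)
    return cosets


charpolys = [19, 19, 31, 19, 21, 31, 25, 19, 31, 21, 25, 31, 25, 25, 17]
for coset in cyclotomic_cosets(2, 15):
    # Each coset of positions should map to a single polynomial.
    print(coset, {charpolys[m - 1] for m in coset})
```

Position 15 (the identity matrix) is left out, since it corresponds to the exponent 0.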


### Non-primitive

The polynomial *u* = ~2~31 occurs at positions 3, 6, 9, and 12
  -- multiples of 3, which is a factor of *15*.
It follows that the roots of *u* are cyclic of order 5, so this polynomial is irreducible,
  but *not* primitive.

Naturally, $\hat U(X)$ can be factored as powers of (*C*~*s*~)^3^.
We can also factor it more naively as powers of *C*~*u*~. Either way, we get the same sequence.

:::: {layout-ncol="2"}
::: {}
```{python}
#| echo: false
upowers = [31, 31, 31, 31, 17]

Markdown(tabulate(
  [[
    "charpoly((*C*~*s*~)^*3m*^)",
    *[f"~2~{charpoly}" for charpoly in charpolys[2::3]]
  ], [
    "charpoly((*C*~*u*~)^*m*^)",
    *[f"~2~{upower}" for upower in upowers]
  ]],
  headers=["*m*", *[str(i + 1) for i in range(5)]],
))
```

Both of the matrices in column 5 happen to be the identity matrix.
It follows that this root is only cyclic of order 5.

The polynomials ~2~19 and ~2~25 are reversals of one another, and the sequence that each companion matrix
  generates ends with the other polynomial -- in this regard, they are dual.
However, ~2~31 = 11111~*x*~ is a palindrome and its sequence ends where it begins, so it is self-dual.
:::

::: {width="33%"}
$$
\begin{gather*}
  (C_u)^1 =\left( \begin{matrix}
      0 & 1 & 0 & 0 \\
      0 & 0 & 1 & 0 \\
      0 & 0 & 0 & 1 \\
      1 & 1 & 1 & 1
    \end{matrix} \right)
  \\ \\
  (C_u)^2 =\left( \begin{matrix}
      0 & 0 & 1 & 0 \\
      0 & 0 & 0 & 1 \\
      1 & 1 & 1 & 1 \\
      1 & 0 & 0 & 0
    \end{matrix} \right)
  \\ \\
  (C_u)^3 =\left( \begin{matrix}
      0 & 0 & 0 & 1 \\
      1 & 1 & 1 & 1 \\
      1 & 0 & 0 & 0 \\
      0 & 1 & 0 & 0
    \end{matrix} \right)
  \\ \\
  (C_u)^4 =\left( \begin{matrix}
      1 & 1 & 1 & 1 \\
      1 & 0 & 0 & 0 \\
      0 & 1 & 0 & 0 \\
      0 & 0 & 1 & 0 \\
    \end{matrix} \right)
  \\ \\
  (C_u)^5 =\left( \begin{matrix}
      1 & 0 & 0 & 0 \\
      0 & 1 & 0 & 0 \\
      0 & 0 & 1 & 0 \\
      0 & 0 & 0 & 1 \\
    \end{matrix} \right)
  \\
  = I = (C_u)^0
\end{gather*}
$$
:::
::::


### Non-irreducible

In addition to the three irreducibles, a fourth polynomial, ~2~21 = 10101~*x*~,
  also appears in the sequence on entries 5 and 10 -- multiples of 5, which is also a factor of 15.
Like ~2~31, this polynomial is palindromic.
This polynomial is *not* irreducible mod 2, and factors as:

$$
\begin{gather*}
  {}_2 21 \sim 10101_x = x^4 + x^2 + 1 = (x^2 + x + 1)^2 \mod 2
  \\[10pt]
  (X - (C_s)^5)(X - (C_s)^{10}) = X^2 + ((C_s)^5 + (C_s)^{10})X + (C_s)^{15}
  \\
  = X^2 + IX + I
\end{gather*}
$$

Just like how the fields we construct are powers of a prime, this extra element is a power
  of a smaller irreducible.
This is unexpected, but perhaps not surprising.

Something a little more surprising is that the companion matrix of ~2~21 is cyclic of order *6*,
  rather than of order 3 like the roots of *x*^2^ + *x* + 1 encountered in GF(4).
The powers of its companion matrix are:

<!--
TODO: assemble this table
::: {}
|                      *m* |     1 |     2 |     3 |     4 |     5 |     6 |
|--------------------------|-------|-------|-------|-------|-------|-------|
| charpoly((*C*~*s*~)^5m^) | ~2~21 | ~2~21 | ~2~17 ((*C*~*s*~)^15^ is the identity matrix) | ~2~21 | ~2~21 | ~2~17 (identity) |
| charpoly((*C*~*21*~)^m^) | <span class="red">~2~21</span> | <span class="blue">~2~21</span> | ~2~17 ((*C*~*21*~)^3^ is the identity matrix) | <span class="blue">~2~21</span> | <span class="red">~2~21</span> | ~2~17 (identity) |
:::
-->

We can think of the repeated sequence as ensuring that there are enough roots of ~2~21.
The Fundamental Theorem of Algebra states that there must be 4 roots.
For *numbers*, we'd allow duplicate roots with multiplicities greater than 1, but the matrix roots are all distinct.

Basic group theory tells us that, in a cyclic group of order 6, the matrix's first and fifth powers
  (in red) are a pair of inverses.
The constant term of the characteristic polynomial is the product of all four roots and,
  as a polynomial over matrices, must be some nonzero multiple of the identity matrix.
Since the red roots are a pair of inverses, the blue roots are, too.
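
The orders quoted in this section can be confirmed with a short sketch.
It assumes polynomials are given by their base-*p* integer encoding (as in ~2~19), and the helper names `companion` and `order` are hypothetical.

```python
def companion(code, p):
    """Companion matrix (mod p) of the monic polynomial encoded by `code` in base p."""
    digits = []
    while code:
        digits.append(code % p)  # constant term first
        code //= p
    coeffs = digits[:-1]  # drop the leading 1
    n = len(coeffs)
    rows = [[int(j == i + 1) for j in range(n)] for i in range(n - 1)]
    rows.append([-c % p for c in coeffs])
    return rows


def mat_mul(a, b, p):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) % p for j in range(n)]
            for i in range(n)]


def order(c, p):
    """Smallest m >= 1 with c^m = I, assuming c is invertible mod p."""
    n = len(c)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    power, m = c, 1
    while power != identity:
        power = mat_mul(power, c, p)
        m += 1
    return m


for code in (19, 25, 31, 21):
    print(code, order(companion(code, 2), 2))
```

The primitive polynomials ~2~19 and ~2~25 should give 15, the irreducible ~2~31 should give 5, and the reducible ~2~21 should give 6.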


GF(32)
------

GF(32) turns out to be special.
There are six irreducible polynomials of degree 5 over GF(2).
Picking one of them at random, ~2~37, and looking at the polynomial sequence it generates, we see:

```{python}
#| echo: false
gf32powers = [
  37, 37, 61, 37, 55, 61, 47, 37, 55, 55, 59, 61, 59, 47, 41,
  37, 61, 55, 47, 55, 59, 59, 41, 61, 47, 59, 41, 47, 41, 41, 51,
]
gf32colors = {
  37: "red",
  61: "blue",
  55: "yellow",
  47: "orange",
  59: "purple",
  41: "green",
}
gf32format = lambda x: f"<span class=\"{gf32colors.get(x, '')}\">~2~{x}</span>"

Markdown(tabulate(
  [[
    "charpoly((*C*~*37*~)^*m*^)",
    "-",
    *[gf32format(gf32power) for gf32power in gf32powers[:15]]
  ]],
  headers=["*m*", *[str(i) for i in range(15 + 1)]],
))
```
```{python}
#| echo: false
Markdown(tabulate(
  [[
    "charpoly((*C*~*37*~)^*m*^)",
    *[gf32format(gf32power) for gf32power in gf32powers[:-17:-1]]
  ]],
  headers=["*m*", *[str(i) for i in reversed(range(16, 32))]],
))
```

31 is prime, so we don't have any sub-patterns that appear on multiples of factors.
In fact, all six irreducible polynomials are present in this table.
The polynomials in complementary colors form pairs under reversal:
  <span class="red">~2~37</span> and <span class="green">~2~41</span>,
  <span class="blue">~2~61</span> and <span class="orange">~2~47</span>,
  and <span class="yellow">~2~55</span> and <span class="purple">~2~59</span>.

Since their roots have order 31, these polynomials are actually
  the distinct factors of *x*^31^ - 1 mod 2:

$$
\begin{gather*}
  x^{31} -1 = (x-1)(x^{30} +x^{29} + ... + x + 1)
  \\
  (x^{30} +x^{29} + ... + x + 1) =
    \left\{ \begin{align*}
      &\phantom\cdot (x^5 + x^2 + 1) &\sim \quad {}_2 37
      \\
      &\cdot (x^5 + x^3 + 1) &\sim \quad {}_2 41 \\
      &\cdot (x^5 + x^4 + x^3 + x^2 + 1) &\sim \quad {}_2 61
      \\
      &\cdot (x^5 + x^3 + x^2 + x + 1) &\sim \quad {}_2 47
      \\
      &\cdot (x^5 + x^4 + x^2 + x + 1) &\sim \quad {}_2 55
      \\
      &\cdot (x^5 + x^4 + x^3 + x + 1) &\sim \quad {}_2 59
    \end{align*} \right.
\end{gather*}
$$

This is a feature special to fields of characteristic 2.
2 is the only prime number whose powers can be one more than another prime,
  since all other prime powers are one more than even numbers.
31 is a [Mersenne prime](https://en.wikipedia.org/wiki/Mersenne_prime),
  so all positive integers less than 31 are coprime to it.
Thus, there is no room for the "extra" entries we observed in GF(16) which occurred
  on factors of 15 = 16 - 1.
No entry can be irreducible (but not primitive) or the power of an irreducible of lower degree.
In other words, *every irreducible polynomial of degree* p *over GF(2) is primitive if 2^p^ - 1 is a Mersenne prime*.


### Counting Irreducibles

The remark about coprimes to 31 may inspire you to think of the
  [totient function](https://en.wikipedia.org/wiki/Euler%27s_totient_function).
We have *φ*(2^5^ - 1) = 30 = 5⋅6, where 5 is the degree and 6 is the number of primitive polynomials.
We also have *φ*(2^4^ - 1) = 8 = 4⋅2 and *φ*(2^3^ - 1) = 6 = 3⋅2.
In general, it is true that there are *φ*(*p*^*m*^ - 1) / *m* primitive polynomials of degree *m* over GF(*p*).
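
The counting formula is simple to evaluate.
Here is a minimal sketch using trial division for the totient; `totient` and `count_primitive` are hypothetical names.

```python
def totient(n):
    """Euler's totient function by trial division."""
    result, m, q = n, n, 2
    while q * q <= m:
        if m % q == 0:
            while m % q == 0:
                m //= q
            result -= result // q
        q += 1
    if m > 1:  # one prime factor left over
        result -= result // m
    return result


def count_primitive(p, m):
    """Number of primitive polynomials of degree m over GF(p)."""
    return totient(p ** m - 1) // m


for p, m in ((2, 3), (2, 4), (2, 5), (3, 2)):
    print(f"GF({p ** m}):", count_primitive(p, m))
```

The first three counts agree with the polynomials seen so far: two for GF(8), two for GF(16), and six for GF(32).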


Polynomial Reversal
-------------------

We've only been looking at fields of characteristic 2, where the meaning of
  "palindrome" and "reversed polynomial" is intuitive.
Let's look at an example over characteristic 3.
One primitive of degree 2 is ~3~14, which gives rise to the following sequence over GF(9):

```{python}
#| echo: false
gf9powers = [14, 10, 14, 16, 17, 10, 17, 13]
gf9format = lambda x: f"<span class=\"{'red' if x == 14 else 'blue' if x == 17 else ''}\">~3~{x}</span>"

Markdown(tabulate(
  [[
    "charpoly((*C*~*14*~)^*m*^)",
    *[gf9format(gf9power) for gf9power in gf9powers]
  ]],
  headers=["*m*", *[str(i + 1) for i in range(8)]],
))
```

The table suggests that ~3~14 = 112~*x*~ = *x*^2^ + *x* + 2 and ~3~17 = 122~*x*~ = *x*^2^ + 2*x* + 2
  are reversals of one another.
More naturally, you'd think that 112~*x*~ reversed is 211~*x*~.
But remember that we prefer to work with monic polynomials.
By multiplying the polynomial by the multiplicative inverse of the leading coefficient (in this case, 2),
  we get 422~*x*~ ≡ 122~*x*~ mod 3.
This is a rule that applies over larger characteristics in general.
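
The rule can be written down compactly.
The sketch below works on the base-*p* integer encoding used in this post; `monic_reverse` is a hypothetical name, and the three-argument `pow` with exponent -1 (a modular inverse) assumes Python 3.8 or later.

```python
def digits(code, p):
    """Base-p digits of `code`, constant term first."""
    out = []
    while code:
        out.append(code % p)
        code //= p
    return out


def monic_reverse(code, p):
    """Reverse the coefficients, then rescale so the result is monic."""
    coeffs = digits(code, p)
    if coeffs[0] == 0:
        raise ValueError("constant term must be nonzero")
    inv = pow(coeffs[0], -1, p)  # inverse of the new leading coefficient
    out = 0
    # Reading constant-term-first digits as leading-first reverses the polynomial.
    for c in coeffs:
        out = out * p + c * inv % p
    return out


print(monic_reverse(14, 3), monic_reverse(17, 3))
print(monic_reverse(19, 2), monic_reverse(31, 2))
```

Over GF(2) the rescaling step does nothing, which is why plain bit reversal sufficed in the earlier sections.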

Note that ~3~16 = 121~*x*~ = *x*^2^ + 2*x* + 1 and ~3~13 = 111~*x*~ = *x*^2^ + *x* + 1 = *x*^2^ - 2*x* + 1,
  both of which factor over GF(3), as (*x* + 1)^2^ and (*x* - 1)^2^ respectively.


Power Graphs
------------

We can study the interplay of primitives, irreducibles, and their powers by converting
  our sequences into (directed) graphs.
Each node in the graph represents a characteristic polynomial that appears over the field;
  call the one under consideration *a*.
If the sequence of polynomials generated by *C*~*a*~ contains another polynomial *b*,
  then there is an edge from *a* to *b*.
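
As a rough sketch of this construction (not the code used to draw the figures in this post), the snippet below builds the edge sets as a dictionary.
It assumes the node set is everything appearing in the sequence of one primitive polynomial, leaves out self-loops, and computes characteristic polynomials over the integers with the Faddeev-LeVerrier recurrence before reducing mod *p*.
All names are hypothetical.

```python
def companion(code, p):
    """Companion matrix (mod p) of the monic polynomial encoded in base p."""
    digits = []
    while code:
        digits.append(code % p)
        code //= p
    coeffs = digits[:-1]  # drop the leading 1
    n = len(coeffs)
    rows = [[int(j == i + 1) for j in range(n)] for i in range(n - 1)]
    rows.append([-c % p for c in coeffs])
    return rows


def mat_mul(a, b, p=None):
    n = len(a)
    out = [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
           for i in range(n)]
    return out if p is None else [[x % p for x in row] for row in out]


def charpoly_code(a, p):
    """Characteristic polynomial of `a` mod p as a base-p integer."""
    n = len(a)
    m = [[0] * n for _ in range(n)]
    c, code = 1, 1
    for k in range(1, n + 1):
        m = mat_mul(a, m)
        for i in range(n):
            m[i][i] += c
        am = mat_mul(a, m)
        c = -sum(am[i][i] for i in range(n)) // k  # exact integer division
        code = code * p + c % p
    return code


def charpoly_sequence(code, p):
    """Encoded charpolys of the powers of the companion matrix, up to the identity."""
    c = companion(code, p)
    n = len(c)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    power, out = c, []
    while True:
        out.append(charpoly_code(power, p))
        if power == identity:
            return out
        power = mat_mul(power, c, p)


def power_graph(primitive_code, p):
    """Map each node a to the set of other polynomials in the sequence of C_a."""
    nodes = set(charpoly_sequence(primitive_code, p))
    return {a: set(charpoly_sequence(a, p)) - {a} for a in nodes}


graph = power_graph(19, 2)  # GF(16), generated by s = x^4 + x + 1
for a in sorted(graph):
    print(a, "->", sorted(graph[a]))
```

A dictionary of sets like this can be handed to a graph library for drawing or for computing spectra.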

We can do this for every GF(*p*^*m*^).
Let's start with the first few fields of characteristic 2.
We get the following graphs:

![](./char_2_irreducibles_graphs.png)

All nodes connect to the node corresponding to the identity matrix, since all roots are cyclic.
Also, since all primitive polynomials are interchangeable with one another,
  they are all interconnected and form a [complete](https://en.wikipedia.org/wiki/Complete_graph) clique.
This means that, excluding the identity node, the graphs for fields of order one more
  than a Mersenne prime are just the complete graphs.

Since all of the graphs share the identity node as a feature
  -- a node with incoming edges from every other node -- it's convenient to omit it.
Here are a few more of these graphs after doing so, over fields of other characteristics:

<!-- TODO: these are graphviz diagrams and could be generated via code -->
::: {layout="[[1,1], [1,1], [1,1,1]]"}
![
  GF(9)
]()

![
  GF(25)
]()

![
  GF(49)
]()

![
  GF(121)
]()

![
  GF(27)
]()

![
  GF(125)
]()

![
  GF(343)
]()
:::


### Spectra

Again, since visually interpreting graphs is difficult, we can study an invariant.
From these graphs of polynomials, we can compute *their* characteristic polynomials
  (to add another layer to this algebraic cake) and look at their spectra.

It turns out that removing a fully-connected node (like the one for the identity matrix)
  has a simple effect on the characteristic polynomial of a graph: it just removes a factor of *x*.
Here are a few of the (identity-reduced) spectra, arranged into a table.

| Characteristic | Order |               Spectrum               | Remark                   |
|----------------|-------|--------------------------------------|--------------------------|
|              2 |     4 |                                    0 |                          |
|                |     8 |                                -1, 1 | Mersenne                 |
|                |    16 |                          0^2^, -1, 1 |                          |
|                |    32 |                             -1^5^, 5 | Mersenne                 |
|              3 |     9 |                          0^2^, -1, 1 |                          |
|                |    27 |                       0, -1^6^, 3^2^ | Pseudo-Mersenne?         |
|              5 |    25 |                 0^3^, -1^6^, 1^3^, 3 |                          |
|                |   125 |               0, -1^38^, 1, 9^2^, 19 | Prime power in spectrum  |
|              7 |    49 |          0^2^, -1^17^, 1^4^, 3^2^, 7 |                          |
|                |   343 | 0, -1^106^, 1^4^, 5^2^, 11^2^, 35^2^ | Composite in spectrum    |
|             11 |   121 |   0^4^, -1^49^, 1^2^, 3^6^, 7^2^, 15 | Composite in spectrum    |

Incredibly, all spectra shown are composed exclusively of integers, and thus,
  each of these graphs is an integral graph.
Moreover, it does not appear that any integer sequences that one may try extracting from this table
  (for example, the multiplicity of -1) can be found in the
  [Online Encyclopedia of Integer Sequences](https://oeis.org/).

From what I was able to tell, the following subgraphs were *also* integral over the range I tested:

- the induced subgraph of vertices corresponding to non-primitives
- the complement of the previous graph with respect to the whole graph
- the induced subgraph of vertices corresponding only to irreducibles

Unfortunately, proving any such relationship is out of the scope of this post (and my abilities).


Closing
-------

This concludes the first foray into using matrices as elements of prime power fields.
It is a subject which, using the tools of linear algebra, makes certain aspects of field theory
  more palatable and constructs some objects with fairly interesting properties.

One of the most intriguing parts to me is the sequence of polynomials generated by a companion matrix.
Though I haven't proven it, I suspect that it suffices to study only the sequence generated
  by a primitive polynomial.
It seems to be possible to get the non-primitive sequences by looking at the subsequences
  where the indices are multiples of a factor of the length of the sequence.
If so, the entire story about polynomials and finite fields could be forgone entirely,
  and the problem would instead become one of number theory.

The [next post](../3) will focus on an "application" of matrix roots to other areas of abstract algebra.

Diagrams made with GeoGebra and NetworkX (GraphViz).