---
title: "Exploring Finite Fields, Part 4: The Power of Forgetting"
description: |
  ...
format:
  html:
    html-math-method: katex
jupyter: python3
date: "2024-02-03"
date-modified: "2025-07-20"
categories:
  - algebra
  - finite field
  - haskell
---

The [last post](../3) in this series focused on understanding some small linear groups
and implementing them on the computer, over both a prime field and a prime power field.

The prime power case was particularly interesting.
First, we adjoined the roots of a polynomial to the base field, GF(2).
Rather than the traditional means of adding new symbols like *α*, we used companion matrices,
which behave the same arithmetically.
For example, for the smallest prime power field, GF(4), we use the polynomial $p(x) = x^2 + x + 1$
and map its symbolic roots (*α* and *α*^2^) to matrices over GF(2):

$$
\begin{gather*}
f : \mathbb{F}_4 \longrightarrow \mathbb{F}_2 {}^{2 \times 2}
\\ \\
\begin{gather*}
f(0) = {\bf 0} =
\left(\begin{matrix} 0 & 0 \\ 0 & 0 \end{matrix}\right)
& f(1) = I
= \left(\begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix}\right)
\\
f(\alpha) = C_p
= \left(\begin{matrix} 0 & 1 \\ 1 & 1 \end{matrix}\right)
& f(\alpha^2) = C_p {}^2
= \left(\begin{matrix} 1 & 1 \\ 1 & 0 \end{matrix}\right)
\end{gather*}
\\ \\
f(a + b) = f(a) + f(b), \quad f(ab) = f(a)f(b)
\end{gather*}
$$

Finally, we constructed GL(2, 4) using matrices of matrices
-- not [block matrices](https://en.wikipedia.org/wiki/Block_matrix)!
This post will focus on studying this method in slightly more detail.
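The representation above is easy to spot-check on the computer. A minimal numpy sketch (the names `O`, `I`, `C` are mine, not from the post's Haskell library), verifying that the companion matrix really behaves like *α*:

```python
import numpy as np

# Images of GF(4) under f, as 2x2 matrices over GF(2)
O = np.zeros((2, 2), dtype=int)
I = np.eye(2, dtype=int)
C = np.array([[0, 1], [1, 1]])   # f(alpha): companion matrix of p(x) = x^2 + x + 1
C2 = C @ C % 2                   # f(alpha^2)

# alpha is a root of p, so alpha^2 + alpha + 1 = 0 and alpha^3 = 1
assert ((C2 + C + I) % 2 == O).all()
assert ((C @ C2) % 2 == I).all()
```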


Reframing the Path Until Now
----------------------------

In the above description, we already mentioned larger structures over GF(2),
namely polynomials and matrices.
Since GF(4) can itself be described with matrices over GF(2),
we can generalize *f* to give us two more maps:

- $f^*$, which converts matrices over GF(4) to double-layered matrices over GF(2), and
- $f^\bullet$, which converts polynomials over GF(4) to polynomials of matrices over GF(2)


### Matrix Map

We examined the former map briefly in the previous post.
More explicitly, we looked at a matrix *B* in SL(2, 4) with the property
that it is cyclic of order five.
Then, to work with it without relying on symbols, we simply applied *f* over the contents of the matrix.

$$
\begin{gather*}
f^* : \mathbb{F}_4 {}^{2 \times 2}
\longrightarrow
(\mathbb{F}_2 {}^{2 \times 2})^{2 \times 2}
\\[10pt]
B = \left(\begin{matrix}
0 & \alpha \\
\alpha^2 & \alpha^2
\end{matrix} \right)
\\
B^* = f^*(B)
= \left(\begin{matrix}
f(0) & f(\alpha) \\
f(\alpha^2) & f(\alpha^2)
\end{matrix} \right)
= \left(\begin{matrix}
\left(\begin{matrix} 0 & 0 \\ 0 & 0 \end{matrix} \right)
& \left(\begin{matrix} 0 & 1 \\ 1 & 1 \end{matrix} \right)
\\
\left(\begin{matrix} 1 & 1 \\ 1 & 0 \end{matrix} \right)
& \left(\begin{matrix} 1 & 1 \\ 1 & 0 \end{matrix} \right)
\end{matrix} \right)
\end{gather*}
$$

We can do this because a matrix contains values in the domain of *f*, thus uniquely determining
a way to change the internal structure (what Haskell calls
a [functor](https://wiki.haskell.org/Functor)).
Furthermore, due to the properties of *f*, applying it and $f^*$ commutes with taking the determinant,
as shown by the following diagram:

$$
\begin{gather*}
f(\det(B)) = f(1) = I = \det(B^*) = \det(f^*(B))
\\[10pt]
\begin{CD}
\mathbb{F}_4 {}^{2 \times 2}
@>{\det}>>
\mathbb{F}_4
\\
@V{f^*}VV ~ @VV{f}V
\\
(\mathbb{F}_2 {}^{2 \times 2})^{2 \times 2}
@>>{\det}>
\mathbb{F}_2 {}^{2 \times 2}
\end{CD}
\end{gather*}
$$

It should be noted that the determinant strips off the *outer* matrix.
We could also consider the map **det**\*, where we apply the determinant
to the internal matrices (in Haskell terms, `fmap det`).
This map isn't as nice, though, since:

$$
\begin{align*}
\det {}^*(B^*)
&= \left(\begin{matrix}
\det \left(\begin{matrix} 0 & 0 \\ 0 & 0 \end{matrix} \right)
& \det \left(\begin{matrix} 0 & 1 \\ 1 & 1 \end{matrix} \right)
\\
\det \left(\begin{matrix} 1 & 1 \\ 1 & 0 \end{matrix} \right)
& \det \left(\begin{matrix} 1 & 1 \\ 1 & 0 \end{matrix} \right)
\end{matrix} \right)
= \left(\begin{matrix}
0 & 1 \\
1 & 1
\end{matrix} \right)
\\ \\
&\neq \left(\begin{matrix}
1 & 0 \\
0 & 1
\end{matrix} \right)
= \det(B^*)
\end{align*}
$$
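Both determinants can be spot-checked in a few lines of numpy (a sketch with names of my own choosing, not the post's Haskell code). The outer determinant is computed as $ad - bc$, which is legitimate here because the entries are all powers of $C_p$ and therefore commute:

```python
import numpy as np

O = np.zeros((2, 2), dtype=int)
I = np.eye(2, dtype=int)
C = np.array([[0, 1], [1, 1]])   # f(alpha)
C2 = C @ C % 2                   # f(alpha^2)

# B* as a 2x2 grid of 2x2 blocks: [[f(0), f(alpha)], [f(alpha^2), f(alpha^2)]]
blocks = [[O, C], [C2, C2]]
a, b, c, d = blocks[0][0], blocks[0][1], blocks[1][0], blocks[1][1]

# Outer determinant ad - bc over the (commutative) ring generated by C
det_outer = (a @ d - b @ c) % 2
assert (det_outer == I).all()    # det(B*) = f(det B) = f(1) = I

def det2(m):
    # 2x2 determinant over GF(2)
    return (m[0, 0] * m[1, 1] - m[0, 1] * m[1, 0]) % 2

# det*, the entrywise determinant, lands somewhere else entirely
det_star = np.array([[det2(m) for m in row] for row in blocks])
assert (det_star == C).all()     # [[0,1],[1,1]], not the identity
```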


### Polynomial Map

Much like how we can change the internal structure of matrices, we can do the same for polynomials.
For the purposes of demonstration, we'll work with $b = \lambda^2 + \alpha^2 \lambda + 1$,
the characteristic polynomial of *B*, since it has coefficients in the domain of *f*.
We define the extended map $f^\bullet$ as:

$$
\begin{gather*}
f^{\bullet} : \mathbb{F}_4[\lambda] \longrightarrow
\mathbb{F}_2 {}^{2 \times 2}[\Lambda]
\\
f^{\bullet} (\lambda) = \Lambda \qquad
f^{\bullet}(a) = f(a), \quad a \in \mathbb{F}_4
\\ \\
\begin{align*}
b^{\bullet}
= f^{\bullet}(b)
&= f^{\bullet}(\lambda^2)
&&+&& f^{\bullet}(\alpha^2)f^{\bullet}(\lambda)
&&+&& f^{\bullet}(1)
\\
&= \Lambda^2
&&+&& \left(\begin{matrix} 1 & 1 \\ 1 & 0\end{matrix}\right) \Lambda
&&+&& \left(\begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix}\right)
\end{align*}
\end{gather*}
$$

Since we're looking at the characteristic polynomial of *B*, we might as well also look
at the characteristic polynomial of *B*\*, its image under $f^*$.
We already looked at the determinant of this matrix, which is the constant term
of the characteristic polynomial (up to sign).
Therefore, it's probably not surprising that $f^\bullet$ and the characteristic polynomial commute
in a similar fashion to the determinant.

$$
\begin{gather*}
\begin{align*}
b^*
&= \text{charpoly}(f^*(B))
= \text{charpoly}
\left(\begin{matrix}
\left(\begin{matrix} 0 & 0 \\ 0 & 0 \end{matrix} \right) &
\left(\begin{matrix} 0 & 1 \\ 1 & 1 \end{matrix} \right) \\
\left(\begin{matrix} 1 & 1 \\ 1 & 0 \end{matrix} \right) &
\left(\begin{matrix} 1 & 1 \\ 1 & 0 \end{matrix} \right)
\end{matrix} \right)
\\
&= \Lambda^2 +
\left(\begin{matrix} 1 & 1 \\ 1 & 0 \end{matrix} \right) \Lambda +
\left(\begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix} \right)
= f^{\bullet}(\text{charpoly}(B))
= b^\bullet
\end{align*}
\\ \\
\begin{CD}
\mathbb{F}_4 {}^{2 \times 2}
@>{\text{charpoly}}>>
\mathbb{F}_4[\lambda]
\\
@V{f^*}VV ~ @VV{f^\bullet}V
\\
(\mathbb{F}_2 {}^{2 \times 2})^{2 \times 2}
@>>{\text{charpoly}}>
(\mathbb{F}_2 {}^{2 \times 2})[\Lambda]
\end{CD}
\end{gather*}
$$
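Cayley-Hamilton gives a concrete way to test this: *B*\* must be a root of its own characteristic polynomial. A numpy sketch (names mine), reading the matrix of matrices as an ordinary block matrix just to carry out the arithmetic -- the scalar coefficient $f(\alpha^2)$ acts blockwise, i.e., as a block-diagonal matrix:

```python
import numpy as np

O = np.zeros((2, 2), dtype=int)
C = np.array([[0, 1], [1, 1]])
C2 = C @ C % 2

# B* flattened block by block into a 4x4 matrix over GF(2)
Bhat = np.block([[O, C], [C2, C2]])

# b* = Lambda^2 + f(alpha^2) Lambda + f(1); substituting B* for Lambda
# must give zero, where f(alpha^2) acts blockwise (block-diagonal)
coeff = np.block([[C2, O], [O, C2]])
lhs = (Bhat @ Bhat + coeff @ Bhat + np.eye(4, dtype=int)) % 2
assert (lhs == 0).all()
```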

It should also be mentioned that **charpoly**\*, taking the characteristic polynomials
of the internal matrices, does *not* obey the same relationship.
For one, the type is wrong: the codomain is a matrix *containing* polynomials,
rather than a polynomial over matrices.

There *does* happen to be an isomorphism between the two structures
(one direction of which we'll discuss momentarily).
But even after converting to the proper type, we already have a counterexample in the constant term
from taking **det**\* earlier.

$$
\begin{align*}
\text{charpoly}^*(B^*)
&= \left(\begin{matrix}
\text{charpoly} \left(\begin{matrix} 0 & 0 \\ 0 & 0 \end{matrix} \right) &
\text{charpoly} \left(\begin{matrix} 0 & 1 \\ 1 & 1 \end{matrix} \right) \\
\text{charpoly} \left(\begin{matrix} 1 & 1 \\ 1 & 0 \end{matrix} \right) &
\text{charpoly} \left(\begin{matrix} 1 & 1 \\ 1 & 0 \end{matrix} \right)
\end{matrix} \right)
\\
&= \left(\begin{matrix}
\lambda^2 & \lambda^2 + \lambda + 1 \\
\lambda^2 + \lambda + 1 & \lambda^2 + \lambda + 1
\end{matrix} \right)
\\
&\cong
\left(\begin{matrix} 1 & 1 \\ 1 & 1 \end{matrix} \right) \Lambda^2
+ \left(\begin{matrix} 0 & 1 \\ 1 & 1 \end{matrix} \right) \Lambda
+ \left(\begin{matrix} 0 & 1 \\ 1 & 1 \end{matrix} \right)
\\ \\
&\neq f^{\bullet}(\text{charpoly}(B))
\end{align*}
$$


Forgetting
----------

Clearly, layering matrices has several advantages over how we usually interpret block matrices.
But what happens if we *do* "forget" about the internal structure?

$$
\begin{gather*}
\text{forget} : (\mathbb{F}_2 {}^{2 \times 2})^{2 \times 2}
\longrightarrow \mathbb{F}_2 {}^{4 \times 4}
\\ \\
\hat B = \text{forget}(B^*)
= \text{forget}\left(\begin{matrix}
\left(\begin{matrix} 0 & 0 \\ 0 & 0 \end{matrix} \right)
& \left(\begin{matrix} 0 & 1 \\ 1 & 1 \end{matrix} \right)
\\
\left(\begin{matrix} 1 & 1 \\ 1 & 0 \end{matrix} \right)
& \left(\begin{matrix} 1 & 1 \\ 1 & 0 \end{matrix} \right)
\end{matrix} \right)
= \left(\begin{matrix}
0 & 0 & 0 & 1 \\
0 & 0 & 1 & 1 \\
1 & 1 & 1 & 1 \\
1 & 0 & 1 & 0
\end{matrix} \right)
\end{gather*}
$$

<details>
<summary>
Haskell implementation of `forget`
</summary>

<!-- TODO: run in jupyter -->
```{.haskell}
forget :: Matrix (Matrix a) -> Matrix a
-- Massively complicated point-free way to forget matrices:
-- 1. Convert internal matrices to lists of lists
-- 2. Convert the external matrix to a list of lists
-- 3. There are now four layers of lists. Transpose the second and third.
-- 4. Concat the new third and fourth layers together
-- 5. Concat the first and second layers together
-- 6. Convert the list of lists back to a matrix
forget = toMatrix . concat . fmap (fmap concat . transpose) .
         fromMatrix . fmap fromMatrix
```

To see why this is the structure, remember that we need to work with rows
of the external matrix at the same time.
We'd like to read across the whole row, but this involves descending into two matrices.
The `transpose` (applied under the first `fmap`) allows us to collect rows in the way we expect.
For example, for the above matrix, we get `[[[0,0],[0,1]], [[0,0],[1,1]]]` after the transposition,
which are the first two rows, grouped by the matrix they belonged to.
Then, we can finally get the desired rows by `fmap (fmap concat)`ing each group together.
Finally, we `concat` once more to undo the outer grouping.
</details>

Like *f*, `forget` preserves addition and multiplication, a fact already familiar from block matrices.
Further, by *f*, the internal matrices multiply the same way as elements of GF(4).
Hence, this shows us directly that GL(2, 4) is a subgroup of GL(4, 2).
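Part of that claim is easy to see numerically: since forgetting preserves multiplication, $\hat B$ must inherit *B*'s order five inside GL(4, 2). A numpy sketch (names mine):

```python
import numpy as np

O = np.zeros((2, 2), dtype=int)
C = np.array([[0, 1], [1, 1]])
C2 = C @ C % 2

Bhat = np.block([[O, C], [C2, C2]])    # forget(B*)

# Track which powers of Bhat hit the identity in GL(4, 2)
I4 = np.eye(4, dtype=int)
P = I4
hits = []
for k in range(1, 6):
    P = P @ Bhat % 2
    hits.append(bool((P == I4).all()))

# Only the fifth power returns to the identity: order five, same as B
assert hits == [False, False, False, False, True]
```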

However, an obvious difference between layered and "forgotten" matrices lies in
the determinant and characteristic polynomial:

$$
\begin{align*}
\det {B^*} &= \left(\begin{matrix}1 & 0 \\ 0 & 1\end{matrix}\right)
\\ \\
\det {\hat B} &= 1
\end{align*}
\qquad
\begin{align*}
\text{charpoly}(B^*)
&= \Lambda^2 +
\left(\begin{matrix}1 & 1 \\ 1 & 0 \end{matrix}\right)\Lambda +
\left(\begin{matrix}1 & 0 \\ 0 & 1\end{matrix}\right)
\\ \\
\text{charpoly}(\hat B)
&= \lambda^4 + \lambda^3 + \lambda^2 + \lambda + 1
\end{align*}
$$


### Another Path to the Forgotten

It's a relatively simple matter to move between the determinants, since it's straightforward
to identify 1 with the identity matrix.
However, a natural question to ask is whether there's a way to reconcile or coerce
the matrix polynomial into the "forgotten" one.

<!-- TODO: reorganize parts of second post? -->
First, let's formally establish a path from matrix polynomials to matrices of polynomials.
We need only use our friend from the [second post](../2) -- polynomial evaluation.
Simply evaluating a matrix polynomial at *λI* converts our matrix indeterminate (*Λ*) into a scalar one (*λ*).

$$
\begin{align*}
\text{eval}_{\Lambda \mapsto \lambda I}
&: (\mathbb{F}_2 {}^{2 \times 2})[\Lambda]
\rightarrow (\mathbb{F}_2[\lambda]) {}^{2 \times 2}
\\
&:: \quad
r(\Lambda) \mapsto r(\lambda I)
\\ \\
\text{eval}_{\Lambda \mapsto \lambda I}(\text{charpoly}(B^*))
&= (\lambda I)^2
+ \left(\begin{matrix}1 & 1 \\ 1 & 0 \end{matrix}\right)(\lambda I)
+ \left(\begin{matrix}1 & 0 \\ 0 & 1\end{matrix}\right)
\\
&= \left(\begin{matrix}
\lambda^2 + \lambda + 1 & \lambda \\
\lambda & \lambda^2 + 1
\end{matrix}\right)
\end{align*}
$$

Since a matrix containing polynomials is still a matrix, we can then take its determinant.
What pops out is exactly what we were after...

$$
\begin{align*}
\det(\text{eval}_{\Lambda \mapsto \lambda I}(\text{charpoly}(B^*)))
&= (\lambda^2 + \lambda + 1)(\lambda^2 + 1) - \lambda^2
\\
&= \lambda^4 + \lambda^3 + \lambda^2 + \lambda + 1
\\
&= \text{charpoly}(\hat B)
\end{align*}
$$

...and we can arrange our maps into another diagram:

$$
\begin{gather*}
\begin{CD}
(\mathbb{F}_2 {}^{2 \times 2})^{2 \times 2}
@>{\text{charpoly}}>>
(\mathbb{F}_2 {}^{2 \times 2})[\Lambda]
\\
@V{\text{id}}VV ~ @VV{\text{eval}_{\Lambda \mapsto \lambda I}}V
\\
-
@. (\mathbb{F}_2 [\lambda])^{2 \times 2}
\\
@V{\text{forget}}VV ~ @VV{\det}V
\\
\mathbb{F}_2 {}^{4 \times 4}
@>>{\text{charpoly}}>
\mathbb{F}_2[\lambda]
\end{CD}
\\ \\
\text{charpoly} \circ \text{forget}
= \det \circ ~\text{eval}_{\Lambda \mapsto \lambda I} \circ \text{charpoly}
\end{gather*}
$$
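Both legs of this diagram can be cross-checked with plain numpy, without any polynomial-over-a-ring machinery. The right leg is just coefficient arithmetic; for the left leg, rather than recomputing the characteristic polynomial, we check (via Cayley-Hamilton) that $\hat B$ is a root of the claimed result. A consistency check rather than a proof, with names of my own choosing:

```python
import numpy as np

Bhat = np.array([[0, 0, 0, 1],
                 [0, 0, 1, 1],
                 [1, 1, 1, 1],
                 [1, 0, 1, 0]])

# Right leg: det of the evaluated matrix, (l^2+l+1)(l^2+1) - l^2,
# via coefficient convolution (coefficients listed low degree first)
p1 = np.array([1, 1, 1])          # lambda^2 + lambda + 1
p2 = np.array([1, 0, 1])          # lambda^2 + 1
det_poly = np.convolve(p1, p2)
det_poly[2] -= 1                  # subtract lambda^2
det_poly %= 2
assert det_poly.tolist() == [1, 1, 1, 1, 1]

# Left leg, indirectly: Bhat must satisfy its own characteristic
# polynomial, so I + Bhat + Bhat^2 + Bhat^3 + Bhat^4 = 0 (mod 2)
P = np.eye(4, dtype=int)
acc = np.zeros((4, 4), dtype=int)
for _ in range(5):
    acc = (acc + P) % 2
    P = P @ Bhat % 2
assert (acc == 0).all()
```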

<details>
<summary>
Haskell demonstration of this commutation
</summary>
Fortunately, the implementation of `charpoly` using Laplace expansion already works with numeric matrices.
Therefore, we need only define the special eval:

```{.haskell}
toMatrixPolynomial :: Num a => Polynomial (Matrix a) -> Matrix (Polynomial a)
-- Collect our coefficient matrices into a single matrix of polynomials
toMatrixPolynomial (Poly ps) = Mat $ array rs values where
  -- Technically, we're always working with square matrices, but we should
  -- always use the largest bounds available.
  (is,js) = unzip $ map mDims ps
  rs = ((0,0),(maximum is - 1,maximum js - 1))
  -- Address a matrix. This needs defaulting to zero to be fully correct
  -- with respect to the range given by `rs`
  access b (Mat m) = m!b
  -- Build the value at an address by addressing over the coefficients.
  -- ps is already in rising coefficient order, so our values are too.
  values = map (\r -> (r, Poly $ map (access r) ps)) (range rs)
```

Now we can simply observe:

<!-- TODO: run in jupyter -->
```{.haskell}
field4 = [zero 2, eye 2, toMatrix [[0,1],[1,1]], toMatrix [[1,1],[1,0]]]

mB = toMatrix [[field4!!0, field4!!2], [field4!!3, field4!!3]]

-- >>> mapM_ print $ fromMatrix $ forget mB
-- -- [0,0,0,1]
-- -- [0,0,1,1]
-- -- [1,1,1,1]
-- -- [1,0,1,0]

-- >>> fmap (`mod` 2) $ charpoly $ forget mB
-- -- 1x^4 + 1x^3 + 1x^2 + 1x + 1
-- >>> fmap (`mod` 2) $ determinant $ toMatrixPolynomial $ charpoly mB
-- -- 1x^4 + 1x^3 + 1x^2 + 1x + 1
```
</details>

It should be noted that we do *not* get the same result by taking the determinant after
applying **charpoly**\*, indicating that the above method is the "correct" one.

$$
\begin{align*}
\text{charpoly}^*(B^*) &= \left(\begin{matrix}
\lambda^2 & \lambda^2 + \lambda + 1 \\
\lambda^2 + \lambda + 1 & \lambda^2 + \lambda + 1
\end{matrix}\right)
\\ ~ \\
\det( \text{charpoly}^*(B^*))
&= \lambda^2(\lambda^2 + \lambda + 1) - (\lambda^2 + \lambda + 1)^2
\\
&= \lambda^3 + 1 \pmod 2
\end{align*}
$$


### Cycles and Cycles

Since we can get $\lambda^4 + \lambda^3 + \lambda^2 + \lambda + 1$ in two ways,
it's natural to assume this polynomial is significant in some way.
In the language of the second post, the polynomial can also be written as ~2~31,
whose root we determined was cyclic of order 5.
This happens to match the order of *B* in GL(2, 4).

Perhaps this is unsurprising, since there are only so many polynomials of degree 4 over GF(2).
However, the reason we see it becomes more obvious if we look at the powers of scalar multiples of *B*.
First, recall that *f*\* takes us from a matrix over GF(4) to a matrix of matrices over GF(2).
Then define a map *g* that gives us degree 4 polynomials:

::: {layout="[[1],[1,1,1]]"}
$$
\begin{gather*}
g : \mathbb{F}_4^{2 \times 2} \rightarrow \mathbb{F}_2[\lambda]
\\
g = \text{charpoly} \circ \text{forget} \circ f^*
\end{gather*}
$$

$$
\begin{array}{}
& \scriptsize \left(\begin{matrix}
0 & \alpha \\
\alpha^2 & \alpha^2
\end{matrix}\right)
\\
B & \overset{g}{\mapsto} & 11111_\lambda
\\
B^2 & \overset{g}{\mapsto} & 11111_\lambda
\\
B^3 & \overset{g}{\mapsto} & 11111_\lambda
\\
B^4 & \overset{g}{\mapsto} & 11111_\lambda
\\
B^5 & \overset{g}{\mapsto} & 10001_\lambda
\end{array}
$$

$$
\begin{array}{}
& \scriptsize \left(\begin{matrix}
0 & \alpha^2 \\
1 & 1
\end{matrix}\right)
\\
\alpha B & \overset{g}{\mapsto} & 10011_\lambda
\\
(\alpha B)^2 & \overset{g}{\mapsto} & 10011_\lambda
\\
(\alpha B)^3 & \overset{g}{\mapsto} & 11111_\lambda
\\
(\alpha B)^4 & \overset{g}{\mapsto} & 10011_\lambda
\\
(\alpha B)^5 & \overset{g}{\mapsto} & 10101_\lambda
\end{array}
$$

$$
\begin{array}{}
& \scriptsize \left(\begin{matrix}
0 & 1 \\
\alpha & \alpha
\end{matrix}\right)
\\
\alpha^2 B & \overset{g}{\mapsto} & 11001_\lambda
\\
(\alpha^2 B)^2 & \overset{g}{\mapsto} & 11001_\lambda
\\
(\alpha^2 B)^3 & \overset{g}{\mapsto} & 11111_\lambda
\\
(\alpha^2 B)^4 & \overset{g}{\mapsto} & 11001_\lambda
\\
(\alpha^2 B)^5 & \overset{g}{\mapsto} & 10101_\lambda
\end{array}
$$
:::

The matrices in the middle and rightmost columns both have order 15 inside GL(2, 4).
Correspondingly, both 10011~λ~ = ~2~19 and 11001~λ~ = ~2~25 are primitive,
and so have roots of order 15 over GF(2).
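The order-15 claim is quick to confirm on the forgotten side. A numpy sketch (the name `aBhat` is mine) computing the order of the image of *αB* in GL(4, 2):

```python
import numpy as np

O = np.zeros((2, 2), dtype=int)
I2 = np.eye(2, dtype=int)
C2 = np.array([[1, 1], [1, 0]])        # f(alpha^2)

# alpha*B = [[0, alpha^2], [1, 1]] maps under forget . f* to:
aBhat = np.block([[O, C2], [I2, I2]])

# Find the order of aBhat in GL(4, 2) by repeated multiplication mod 2
I4 = np.eye(4, dtype=int)
P = I4
order = None
for k in range(1, 16):
    P = P @ aBhat % 2
    if (P == I4).all():
        order = k
        break
assert order == 15
```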


### A Field?

Since we have 15 matrices generated by the powers of one, you might wonder whether
they can correspond to the nonzero elements of GF(16).
And they can!
In a sense, we've "borrowed" the order-15 elements from this "field" within GL(4, 2).
However, none of the powers of this matrix are the companion matrix of either ~2~19 or ~2~25.

<details>
<summary>
Haskell demonstration of the field-like-ness of these matrices
</summary>

All we really need to do is test additive closure, since the powers trivially commute and include the identity matrix.

<!-- TODO: run in jupyter -->
```{.haskell}
hasAdditiveClosure :: Integral a => Int -> a -> [Matrix a] -> Bool
-- Check whether n x n matrices (mod p) have additive closure
-- Supplement the zero matrix, even if it is not already present
hasAdditiveClosure n p xs = all (`elem` xs') sums where
  -- Add in the zero matrix
  xs' = zero n:xs
  -- Calculate all possible sums of pairs (mod p)
  sums = map (fmap (`mod` p)) $ (+) <$> xs' <*> xs'


generatesField :: Integral a => Int -> a -> Matrix a -> Bool
-- Generate the powers of x, then test if they form a field (mod p)
generatesField n p x = hasAdditiveClosure n p xs where
  xs = map (fmap (`mod` p) . (x^)) [1..p^n-1]

alphaB = toMatrix [[zero 2, field4!!3],[eye 2, eye 2]]

-- >>> mapM_ print $ fromMatrix $ forget alphaB
-- -- [0,0,1,1]
-- -- [0,0,1,0]
-- -- [1,0,1,0]
-- -- [0,1,0,1]
--
-- >>> generatesField 4 2 $ forget alphaB
-- -- True
```
</details>

More directly, we might also observe that *α*^2^*B* is the companion matrix of
an irreducible polynomial over GF(4), namely $q(x) = x^2 - \alpha x - \alpha$.
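This is another claim that can be checked on the forgotten side, since forgetting preserves the arithmetic: $q(\alpha^2 B) = 0$ should survive as a 4×4 identity over GF(2), with *α* acting blockwise. A numpy sketch (names mine; recall that in characteristic 2, subtraction is addition):

```python
import numpy as np

O = np.zeros((2, 2), dtype=int)
I2 = np.eye(2, dtype=int)
C = np.array([[0, 1], [1, 1]])    # f(alpha)

# alpha^2 * B = [[0, 1], [alpha, alpha]] maps under forget . f* to:
M = np.block([[O, I2], [C, C]])
A = np.block([[C, O], [O, C]])    # alpha acting as a blockwise scalar

# q(x) = x^2 - alpha x - alpha, so M^2 + A M + A = 0 (mod 2)
assert ((M @ M + A @ M + A) % 2 == 0).all()
```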

Both the "forgotten" matrices and the aforementioned companion matrices lie within GL(4, 2).
A natural question to ask is whether we can make fields by the following process:

1. Filter out all order-15 elements of GL(4, 2)
2. Partition the elements and their powers into their respective order-15 subgroups
3. Add the zero matrix into each class
4. Check whether all classes are additively closed (and are therefore fields)

In this case, it happens to be true, but proving this in general is difficult, and I haven't done so.
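The process can at least be spot-checked for a single class: the 15 powers of the image of *αB*, plus the zero matrix, should be 16 distinct matrices closed under addition mod 2. A numpy sketch (names mine):

```python
import numpy as np

O = np.zeros((2, 2), dtype=int)
I2 = np.eye(2, dtype=int)
C2 = np.array([[1, 1], [1, 0]])        # f(alpha^2)

aBhat = np.block([[O, C2], [I2, I2]])  # image of alpha*B, order 15

# The candidate copy of GF(16): zero plus the fifteen powers
elems = [np.zeros((4, 4), dtype=int)]
P = np.eye(4, dtype=int)
for _ in range(15):
    P = P @ aBhat % 2
    elems.append(P.copy())
assert len({e.tobytes() for e in elems}) == 16   # all distinct

# Step 4 of the process: additive closure mod 2
closed = all(any(((x + y) % 2 == z).all() for z in elems)
             for x in elems for y in elems)
assert closed
```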


Expanding Dimensions
--------------------

Of course, we need not focus only on GF(4) -- we can just as easily work over GL(2, 2^*r*^) for *r* other than 2.
In this case, the internal matrices will be *r*×*r* while the external one remains 2×2.
But neither do we have to work exclusively with 2×2 matrices -- we can work over GL(*n*, 2^*r*^).
In either circumstance, the "borrowing" of elements of larger order still occurs.
This is summarized by the following diagram:

$$
\begin{CD}
\underset{
\scriptsize S \text{ (order $k$)}
}{
\text{SL}(n,2^r)
}
@>>>
\underset{
\scriptsize
\begin{matrix}
S \text{ (order $k$)} \\
T \text{ (order $2^{nr}-1$)}
\end{matrix}
}{
\text{GL}(n, 2^r)
}
@>{\text{forget} \circ f_{r}^*}>>
{\text{GL}(nr, 2)}
@<{f_{nr}}<<
\underset{
\scriptsize
\begin{matrix}
s \text{ (order $k$)} \\
t \text{ (order $2^{nr}-1$)}
\end{matrix}
}{
\mathbb{F}_{2^{nr}}
}
\end{CD}
$$

Here, *f*~*r*~ is our map from GF(2^*r*^) to *r*×*r* matrices, and *f*~*nr*~ is the analogous map for GF(2^*nr*^).
*r* must be greater than 1 for us to properly make use of matrix arithmetic.
Similarly, *n* must be greater than 1 for the leftmost GL.
Thus, *nr* is a composite number.
Here, *k* is a proper factor of 2^*nr*^ - 1.
In the prior discussion, *k* was 5 and 2^*nr*^ - 1 was 15.

Recall that primitive polynomials of degree *nr* over GF(2) have roots of order 2^*nr*^ - 1.
Since *nr* is composite, this number can *never* be prime: the only primes of the form
2^*m*^ - 1 (the Mersenne primes) require *m* itself to be prime.
Thus, a GL of prime dimension can never borrow from a GL over a field
of larger order with the same characteristic.
Conversely, GL(*nr* + 1, 2) trivially contains GL(*nr*, 2) by fixing a subspace.
So we do eventually see elements of order 2^*m*^ - 1 for either prime or composite *m*.
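The divisibility fact behind this is that 2^*d*^ - 1 divides 2^*m*^ - 1 whenever *d* divides *m*, so a composite exponent forces a composite value. A small self-contained check (the helper `is_prime` is mine):

```python
def is_prime(n):
    # Trial-division primality test; fine for small n
    return n > 1 and all(n % d for d in range(2, int(n**0.5) + 1))

# If m is composite, so is 2^m - 1: 2^d - 1 divides it for any d | m
for m in range(2, 30):
    if not is_prime(m):
        assert not is_prime(2**m - 1)

# The converse fails: a prime exponent does not guarantee a Mersenne prime
assert is_prime(11) and not is_prime(2**11 - 1)   # 2047 = 23 * 89
```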


### Other Primes

This concern about prime dimensions is unique to characteristic 2.
For any other prime *p*, *p*^*m*^ - 1 is composite, since it is at the very least even.
All other remarks about the above diagram should still hold for any other prime *p*.

In addition, our earlier diagram, where we corresponded the order of an element in GL(2, 2^2^)
with the order of an element in GF(2^2×2^) via the characteristic polynomial, also generalizes.
Though I have not proven it, I strongly suspect the following diagram commutes,
at least in the case where *K* is a finite field:

$$
\begin{CD}
(K^{r \times r})^{n \times n}
@>{\text{charpoly}}>>
(K^{r \times r})[\Lambda]
\\
@V{\text{id}}VV ~ @VV{\text{eval}_{\Lambda \mapsto \lambda I}}V
\\
-
@. (K [\lambda])^{r \times r}
\\
@V{\text{forget}}VV ~ @VV{\det}V
\\
K^{nr \times nr}
@>>{\text{charpoly}}>
K[\lambda]
\end{CD}
$$

Over larger primes, the gap between GL and SL may grow ever larger,
but SL over a prime power field seems to inject into SL over a prime field.
If the above diagram does commute, then the prior statement follows.


### Monadicity and Injections

The action of forgetting the internal structure may sound somewhat familiar if you know your Haskell.
Remember that for lists, we can do something similar
-- converting `[[1,2,3],[4,5,6]]` to `[1,2,3,4,5,6]` is just a matter of applying `concat`.
But this is an instance in which we know lists to behave like a [monad](https://wiki.haskell.org/Monad).
Despite being an indecipherable bit of jargon to newcomers, it just means we:

1. can apply functions inside the structure (for example, to the elements of a list),
2. have a sensible injection into the structure (creating singleton lists, called `return`), and
3. can reduce two layers to one (`concat`, or `join` for monads in general).
   - Monads are traditionally defined using the operator `>>=`, but `join = (>>= id)`

Just comparing the types of `join :: Monad m => m (m a) -> m a`
and `forget :: Matrix (Matrix a) -> Matrix a` suggests that `Matrix` (meaning square matrices)
could be a monad, and further, one which respects addition and multiplication.
Of course, **this is only true when our internal matrices are all the same size**.
In the above diagrams, this restriction has applied, but it should be stated explicitly,
since no dimension is specified by `Matrix a`.

However, we run into difficulty at condition 2.
For one, only "numbers" (elements of a ring) can go inside matrices.
This restricts where monadicity can hold.
More importantly, we have a *lot* of freedom in what dimension we choose to inject into.
For example, we might pick a `return` that uses 1×1 matrices (which add no additional structure).
We might also pick `return2`, which scalar-multiplies its argument into a 2×2 identity matrix instead.

Unfortunately, there's no good answer.
At the very least, we can close our eyes and pretend that we have a nice diagram:

$$
\begin{gather*}
\begin{matrix}
& L\underset{\text{degree } r}{/} K
\\ \\
\small f
& \begin{matrix} | \\ \downarrow \end{matrix}
\\ \\
& K^{r \times r}
\end{matrix}
& \quad & \quad
& \begin{matrix}
& (L\underset{\text{degree } r}{/} K)^{n \times n}
\\ \\
\small f^* &
\begin{matrix} | \\ \downarrow \end{matrix}
& \searrow & \small \texttt{>>=} ~ f \qquad
\\ \\
& (K^{r \times r})^{n \times n}
& \underset{\text{forget}} {\longrightarrow}
& K {}^{nr \times nr}
\end{matrix}
\end{gather*}
$$

As one last note on the monadicity of matrices, I *have* played around with an alternative `Matrix`
type which includes scalars alongside proper matrices, which would allow for
a simple canonical injection.
Unfortunately, it complicates `join` -- we just move the responsibility of sizing the internal matrices
front-and-center, since we can correspond internal scalars with identity matrices.


Closing
-------

At this point, I've gone on far too long about algebra.
One nagging curiosity makes me wonder whether there are any diagrams like the following:

$$
\begin{matrix}
& (L\underset{\text{degree } r}{/} K)^{n \times n}
& & & & (L\underset{\text{degree } n}{/} K)^{r \times r}
\\ \\
\small f_1^*
& \begin{matrix} | \\ \downarrow \end{matrix}
& \searrow & & \swarrow
& \begin{matrix} | \\ \downarrow \end{matrix}
& f_2^*
\\ \\
& (K^{r \times r})^{n \times n}
& \underset{\text{forget}} {\longrightarrow}
& K {}^{nr \times nr}
& \underset{\text{forget}}{\longleftarrow}
& (K^{n \times n})^{r \times r}
\end{matrix}
$$

Or in English: whether "rebracketing" certain *nr*×*nr* matrices can be traced back to
not only a degree *r* field extension, but also one of degree *n*.

The mathematician in me tells me to believe in well-defined structures.
Matrices are one such structure, with myriad applications.
However, the computer scientist in me laments that the application of these structures is
buried in symbols and that layering them is at most glossed over.
There is clear utility and interest in doing so; otherwise, the diagrams shown above would not exist.

Of course, there's plenty of reason *not* to go down this route.
For one, it's plainly inefficient -- GPUs are *built* on matrix operations being as efficient as possible,
i.e., without the layering.
It's also inefficient to learn for people *just* learning matrices.
I'd still argue that the method is efficient for learning about more complex topics, like field extensions.