zenzicubi.co/posts/math/finite-field/1/index.qmd

---
title: "Exploring Finite Fields: Preliminaries"
description: |
  How to do arithmetic when there is a finite number of numbers
format:
  html:
    html-math-method: katex
date: "2024-01-09"
date-modified: "2025-07-16"
categories:
  - algebra
  - finite field
  - haskell
---

<style>
.cayley-table .header,
.cayley-table td:nth-child(1),
.cayley-table th:nth-child(1) {
  color: green;
}
</style>

```{haskell}
--|echo: false

import Data.List (intercalate)
import Data.Profunctor (rmap)

import Colonnade
import qualified Colonnade.Encode as CE
import IHaskell.Display (markdown)

markdownTable col rows = unlines $ h:h':r where
  toColumns = ("| " ++) . (++ " |") . intercalate " | " . foldr (:) []
  h  = toColumns $ CE.header id col
  h' = [if x == '|' then x else '-' | x <- h]
  r  = map (toColumns . CE.row id col) rows
```


[Fields](https://en.wikipedia.org/wiki/Field_%28mathematics%29) are a basic structure in abstract algebra.
Roughly, a field is a collection of elements paired with two operations,
  addition and multiplication, along with particular rules about their interactions.
The most important elements of a field are 0 (the additive identity),
  1 (the multiplicative identity), and -1 (which forms additive inverses).
Moreover, multiplicative inverses must also exist for everything but 0.

Certain fields are widely-known, such as the rational numbers $\mathbb{Q}$
  and complex numbers $\mathbb{C}$.
Finite fields also exist, such as the field with two elements.
This field only contains the elements 0 and 1[^1].
The addition and multiplication tables are consequently the simplest possible according to familiar rules.

[^1]: The elements -1 and 1 are identical, unlike in most fields.

:::: {layout-ncol="2"}
::: {.cayley-table}
| + | 0 | 1 |
|---|---|---|
| 0 | 0 | 1 |
| 1 | 1 | 0 |
:::

::: {.cayley-table}
| × | 0 | 1 |
|---|---|---|
| 0 | 0 | 0 |
| 1 | 0 | 1 |
:::
::::

This field expresses the parity of sums and products of two integers since,
  replacing "0" with "even" and "1" with odd:

:::: {layout-ncol="2"}
::: {}
- even + even = even
- even + odd = odd
- odd + even = odd
- odd + odd = even
:::

::: {}
- even × even = even
- even × odd = even
- odd × even = even
- odd × odd = odd
:::
::::


In One's Prime
--------------

Two is not unique as the only possible size for a finite field -- all prime numbers are also candidates.
Such fields are referred to as GF(*p*) or $\mathbb{F}_p$,
  where the prime *p* is the *order* of the field.

These fields inherit the properties of integer arithmetic.
Addition is cyclic mod *p*: 0 and *p* are equivalent to one another.
The role of -1 taken up by *p* - 1, and the additive inverse of an element *x* can be viewed in two ways:

- Multiplying -1 in the field with x, (i.e., $(p - 1)x \mod p$)
- Counting backwards from zero (i.e., $p - x$)

For this to be a field, the product of any two elements in the field cannot be a multiple of *p*,
  since that would be congruent to 0.
If there were such a product, then *p* would share factors (other than 1) with one of the two terms.
But the order is prime, so this is impossible.
More strongly, multiplicative inverses [can be found algorithmically](
    https://en.wikipedia.org/wiki/Modular_multiplicative_inverse#Extended_Euclidean_algorithm
  ), although it is a somewhat tricky task.


Polynomials
-----------

For a given field *K*, we can also consider polynomials with coefficients
  from elements in the field, K\[*x*\].
The mathematical structure that polynomials belong to is called a
  [*ring*](https://en.wikipedia.org/wiki/Ring_%28mathematics%29).
Rings are slightly weaker than fields in the sense that there are not generally multiplicative inverses.
Additive and multiplicative identities must still exist however, corresponding to
  the zero polynomial and constant polynomial 1, respectively.

Since GF(*p*) has a finite number of elements to consider, there are only so many choices
  for polynomial coefficients for a given degree.
Contrast this with for polynomials over the integers, where there are infinitely many monomials $mx + n$.
Looking at polynomials over GF(2), we have:

| Degree | Polynomial *q(x)* | List of coefficients of *q(x)* (ascending) | *q*(2) | *q*(2) (Binary) |
|--------|-------------------|--------------------------------------------|--------|-----------------|
| 1      |               $x$ |                                     [0, 1] |      2 | 10              |
| 1      |           $1 + x$ |                                     [1, 1] |      3 | 11              |
| 2      |             $x^2$ |                                  [0, 0, 1] |      4 | 100             |
| 2      |         $1 + x^2$ |                                  [1, 0, 1] |      5 | 101             |
| 2      |         $x + x^2$ |                                  [0, 1, 1] |      6 | 110             |
| 2      |     $1 + x + x^2$ |                                  [1, 1, 1] |      7 | 111             |
| 3      |             $x^3$ |                               [0, 0, 0, 1] |      8 | 1000            |
| 3      |         $1 + x^3$ |                               [1, 0, 0, 1] |      9 | 1001            |
| 3      |         $x + x^3$ |                               [0, 1, 0, 1] |     10 | 1010            |
| 3      |     $1 + x + x^3$ |                               [1, 1, 0, 1] |     11 | 1011            |
| ...    |               ... |                                        ... |    ... | ...             |


### The Base-ics

There is a very close correspondence between binary expansions and polynomials over GF(2).
This is evident by comparing the list of coefficients in the polynomial (column 3)
  with the binary expansions of the polynomial evaluated at 2 (column 5).
This gives a handy way of referring to polynomials (mod *p*) without having to write out
  each individual "×" or "+".
In fact, this is commonly used to compactly compute with (and refer to)
  [polynomials used in cyclic redundancy checks](
    https://en.wikipedia.org/wiki/Mathematics_of_cyclic_redundancy_checks
  ).

Again, 2 is not unique among primes.
Polynomials over any prime field GF(*p*) can be expressed as integers in base *p*.
To make the correspondence more explicit, I'll use the $_p m$ to denote a decimal integer *m*
  whose base *p* expansion should be interpreted as a polynomial (mod *p*).
*m* is also the value of the polynomial at *p*.
For example, $_2 11 = 1 + x + x^3$, and 1 + 2 + 2^3^ = 11.

<details>
<summary>
Haskell implementation of duality between polynomials mod *p* and base *p* expansions of integers
</summary>
<br>
This implementation actually works for any base *b*, which is not necessarily prime.
  The only difference is that the coefficients lose "field-ness" for composite *b*.

```{haskell}
import Data.List (unfoldr)
import Data.Tuple (swap)

-- A polynomial is its ascending list of coefficients (of type a)
newtype Polynomial a = Poly { coeffs :: [a] } deriving Functor

-- Interpret a number's base-b expansion as a polynomial
asPoly :: Int -> Int -> Polynomial Int
-- Build a list with f, which returns either Nothing
-- or Just (next element of list, next argument to f)
asPoly b = Poly . unfoldr f where
  -- Divide x by b. Emit the remainder and recurse with the quotient.
  f x | x /= 0    = Just $ swap $ divMod x b
  -- If there's nothing left to divide out, terminate
      | otherwise = Nothing

-- Horner evaluation of a polynomial at the integer b
evalPoly :: Int -> Polynomial Int -> Int
-- Start with the highest coefficient
-- Multiply by b at each step and add the coefficient of the next term
evalPoly b (Poly p) = foldr (\y acc -> acc*b + y) 0 p

-- evalPoly n . asPoly n = id :: Int -> Int
```

An interesting detail here is that the duality is expressed through `foldr` using multiplication
  and addition and `unfoldr` using divMod.
</details>


### Mono, not Stereo

With respect to their roots (which will soon become of primary interest), polynomials are *projective*.
That is, any scalar multiple of the polynomial has the same roots.
For GF(2), this is insignificant since 1 is the only nonzero element, but over GF(5),
  the following polynomials have the same roots:

$$
\begin{align*}
  x^2 + 2 \text{ over GF$(5)$} \quad
    &\longleftrightarrow \quad {_5 27}
  \\
  2x^2 + 4 \text{ over GF$(5)$} \quad
    &\longleftrightarrow \quad {_5 54}
  \\
  3x^2 + 1 \text{ over GF$(5)$} \quad
    &\longleftrightarrow \quad {_5 76}
  \\
  4x^2 + 3 \text{ over GF$(5)$} \quad
    &\longleftrightarrow \quad {_5 103}
\end{align*}
$$

Only the first polynomial is a *monic* polynomial, or has a leading coefficient of 1.
It is preferable to work with these since the product of two monic polynomials is also monic.

An equivalent condition to this is that when evaluated with the order of the field,
  the polynomial has a value between 5^2^ and 2×5^2^ - 1 = 49.
In general, monic polynomials over GF(*p*) as integers fall in the range *p*^2^ and 2*p*^2^ - 1.

<details>
<summary>
Haskell implementation of monic polynomials over GF(*p*)
</summary>

Again, nothing about this definition depends on the base being prime.

```{haskell}
-- All monic polynomials of degree d with coefficients mod n
monics :: Int -> Int -> [Polynomial Int]
monics n d = map (asPoly n) [n^d..2*(n^d) - 1]

-- All monic polynomials with coefficients mod n, ordered by degree
allMonics :: Int -> [Polynomial Int]
allMonics n = concat [monics n d | d <- [1..]]
```
</details>

As an aside, one can also read out monics by counting normally by using the digit alphabet
  {1, 0, -1, ..., -*p* + 2}.
Unfortunately, these base-*p* expansions are more difficult to obtain algorithmically,
  and I'll leave this as an exercise to the reader.


### Sieving out Irreducibles

Over the integers, we can factor a number into primes.
To decide if a number is prime, we just divide it (using an algorithm like long division)
  by numbers less than it and see if we get a nonzero remainder.

Similarly, we can factor polynomials into *irreducible* polynomials,
  which have no "smaller" polynomial factors other than 1.
More precisely, by "smaller", we mean those of lesser degree.
For example, over the integers, the polynomial $x^2 - 1$ (degree 2) factors into
  $(x + 1)(x - 1)$ (both degree 1), but $x^2 + 1$ is irreducible.

In general, a factorization of a polynomial over the integers implies a factorization of one over GF(*p*),
  since the coefficients for each factor may be taken mod *p*.
However, the converse does not hold.
Over GF(2),

$$
(x + 1)^2 = x^2 + 2x + 1 \equiv x^2 + 1 \text{ over GF$(2)$}
$$

...but as just mentioned, the right-hand side is irreducible over the integers.

Just like integers, we can use [polynomial long division](
    https://en.wikipedia.org/wiki/Polynomial_long_division
  ) with these objects to decide if a polynomial is irreducible.
[Synthetic division](https://en.wikipedia.org/wiki/Synthetic_division) is an alternative which is
  slightly easier to implement (especially over GF(2), where it is, again, used in CRCs).
It only works for monic polynomials, but this is all we need.

<details>
<summary>
Haskell implementation of synthetic division
</summary>

The algorithm is similar to table-less algorithms for CRCs, but we don't have the luxury
  of working at the bit level with XOR for addition.
We also have to watch out for negation and coefficients other than 1 for when not working over GF(2).

```{haskell}
zipAdd :: Num a => [a] -> [a] -> [a]
zipAdd []     ds     = ds
zipAdd cs     []     = cs
zipAdd (c:cs) (d:ds) = (c + d):zipAdd cs ds

-- Divide the polynomial ps by qs (coefficients in descending degree order)
synthDiv' :: (Eq a, Num a) => [a] -> [a] -> ([a], [a])
synthDiv' ps qs
  | head qs /= 1 = error "Cannot divide by non-monic polynomial"
  | otherwise   = splitAt deg $ doDiv ps deg
  where
    -- Negate the denominator and ignore leading term
    qNeg = map negate $ tail qs
    -- The degree of the result, based on the degrees of the numerator and denominator
    deg   = max 0 (length ps - length qs + 1)
    -- Pluck off the head of the list and add a shifted and scaled version of
    -- qs to the tail of the list. Repeat this d times
    doDiv xs    0  = xs
    doDiv (x:xs) d = x:doDiv (zipAdd xs $ map (*x) qNeg) (d - 1)

-- Use Polynomial (coefficients in ascending degree order) instead of lists
synthDiv :: (Eq a, Num a) => Polynomial a -> Polynomial a
  -> (Polynomial a, Polynomial a)
synthDiv (Poly p) (Poly q) = (Poly $ reverse quot, Poly $ reverse rem)
  where (quot, rem)        = synthDiv' (reverse p) (reverse q)
```
</details>

Then, using our list of monic polynomials, we can use the same strategy
  for sieving out primes to find (monic) irreducibles.

<details>
<summary>
Haskell implementation of an irreducible polynomial sieve over GF(*p*)
</summary>

```{haskell}
-- All irreducible monic polynomials with coefficients mod n
irreducibles :: Int -> [Polynomial Int]
irreducibles n = go [] $ allMonics n where
  -- Divide the polynomial x by i, then take the remainder mod n
  remModN x i      = fmap (`mod` n) $ snd $ synthDiv x i
  -- Find remainders of x divided by every irreducible in "is".
  -- If any give the zero polynomial, then x is a multiple of an irreducible
  notMultiple x is = and [not $ all (==0) $ coeffs $ remModN x i | i <- is]
  -- Sieve out by notMultiple
  go is (x:xs)
    | notMultiple x is = x:go (x:is) xs
    | otherwise        = go is xs
```
</details>

Since we can denote polynomials by numbers, it may be tempting to freely switch
  between primes and irreducibles.
However, irreducibles depend on the chosen field and do not generally correspond
  to the base-*p* expansion of a prime.

```{haskell}
--|code-fold: true
--|classes: plain

texifyPoly :: (Num a, Eq a, Show a) => Polynomial a -> String
texifyPoly (Poly xs) = ("$" ++) $ (++ "$") $ texify' $ zip xs [0..] where
  texify' [] = "0"
  texify' ((c, n):xs)
    | all ((==0) . fst) xs = showPow c n
    | c == 0               = texify' xs
    | otherwise            = showPow c n ++ " + " ++ texify' xs
  showPow c 0 = show c
  showPow 1 1 = "x"
  showPow c 1 = show c ++ showPow 1 1
  showPow 1 n = "x^{" ++ show n ++ "}"
  showPow c n = show c ++ showPow 1 n

-- Simple prime sieve
primes = primes' [] 2 where
  primes' ps n
    | and [n `mod` p /= 0 | p <- ps] = n:primes' (n:ps) (n+1)
    | otherwise                      = primes' ps (n+1)

somePrimes = takeWhile (<50) primes
someIrr2 = takeWhile (<50) $ map (evalPoly 2) $ irreducibles 2

irr2PrimeTable = columns (\(_, f) r -> f r) (\(c, _) -> Headed c) [
    ("Irreducible over GF(2), *q*(x)", texifyPoly . (irreducibles 2 !!)),
    ("*q*(2), [OEIS A014580](https://oeis.org/A014580)",
      (\x -> if x `notElem` somePrimes then redSpan $ show x else show x) .
        (someIrr2 !!)),
    ("Prime",
      (\x -> if x `notElem` someIrr2 then greenSpan $ show x else show x) .
        (primes !!))
  ] where
  greenSpan x = "<span style=\"color: green\">" ++ x ++ "</span>"
  redSpan x = "<span style=\"color: red\">" ++ x ++ "</span>"

markdown $ markdownTable irr2PrimeTable [0..10]
```

The red entry in column 2 is not prime.
Dually, the green entries in column 3 do not have binary expansions which correspond
  to irreducible polynomials over GF(2).


Matrices
--------

Along with polynomials over a finite field, we can also look at matrices.
The most interesting matrices are square ones, since the product of two square matrices
  is another square matrix.
With the zero matrix ($\bf 0_n$) as the additive identity and the identity matrix
  ($\bf 1_n$) as the multiplicative, square matrices also form a ring over the field *K*,
  denoted $K^{n \times n}$.

Square matrices are associated to a [determinant](https://en.wikipedia.org/wiki/Determinant),
  which is an element from the underlying field.
Determinants are nice, since the determinant of the product of two matrices is
  the product of the determinants.
The determinant can be implemented using [Laplace expansion](
    https://en.wikipedia.org/wiki/Laplace_expansion
  ), which is also useful for inductive proofs.

<details>
<summary>
Haskell implementation of Laplace expansion
</summary>

Laplace expansion is ludicrously inefficient compared to other algorithms,
  and is only shown here due to its "straightforward" implementation and use in proof.
Numeric computation will not be used to keep the arithmetic exact.

```{haskell}
import Data.Array
newtype Matrix a = Mat { unMat :: Array (Int, Int) a } deriving Functor

-- Simple function for building a Matrix from lists
toMatrix :: [[a]] -> Matrix a
toMatrix l = Mat $ listArray ((0,0),(n-1,m-1)) $ concat l where
  m = length $ head l
  n = length l

determinant :: (Num a, Eq a) => Matrix a -> a
determinant (Mat xs) = determinant' xs where
  -- Evaluate (-1)^i without repeated multiplication
  parity i = if even i then 1 else -1
  -- Map old array addresses to new ones when eliminating row 0, column i
  rowMap i (x,y) = (x+1, if y >= i then y+1 else y)
  -- Recursive determinant Array
  determinant' xs
    -- Base case: 1x1 matrix
    | n == 0    = xs!(0,0)
    -- Sum of cofactor expansions
    | otherwise = sum $ map cofactor [0..n] where
      -- Produce the cofactor of row 0, column i
      cofactor i
        | xs!(0,i) == 0 = 0
        | otherwise     = (parity i) * xs!(0,i) * determinant' (minor i)
      -- Furthest extent of the bounds, i.e., the size of the matrix
      (_,(n,_)) = bounds xs
      -- Build a new Array by eliminating row 0 and column i
      minor i = ixmap ((0,0),(n-1,n-1)) (rowMap i) xs
```
</details>


### Back to Polynomials

The [characteristic polynomial](https://en.wikipedia.org/wiki/Characteristic_polynomial)
  is a stronger invariant which follows from the determinant.
It is defined as, for *λ* a scalar variable:

$$
\begin{gather*}
  \text{charpoly}(A) = p_A(\lambda)
    = \left| \lambda I - A  \right|
  \\ \\
  = \left| \begin{matrix*}
    \lambda - a_{00} &          -a_{01} &    ... &          -a_{0n} \\
             -a_{10} & \lambda - a_{11} &    ... &          -a_{1n} \\
              \vdots &           \vdots & \ddots &           \vdots \\
             -a_{n0} &          -a_{n1} &    ... & \lambda - a_{nn} \\
  \end{matrix*} \right|
\end{gather*}
$$

Laplace expansion never gives *λ* a coefficient before recursing,
  so the characteristic polynomial is always monic.
Also, the characteristic polynomial is over the same field that the entries of the matrix were from.

<details>
<summary>
Haskell implementation of the characteristic polynomial
</summary>
Since `determinant` was defined for all `Num` and `Eq`, it can immediately be applied
  if these instances are defined for polynomials.

```{haskell}
instance (Eq a, Num a) => (Num (Polynomial a)) where
  (+) x@(Poly as) y@(Poly bs) = Poly $ zipAdd as bs

  --convolution
  (*) (Poly []) (Poly bs) = Poly []
  (*) (Poly as) (Poly []) = Poly []
  (*) (Poly as) (Poly bs) = Poly $ zeroize $ finish $ foldl convolve' ([], []) as
    where convolve' (xs, ys) a = (a:xs, sum (zipWith (*) (a:xs) bs):ys)
          finish (xs, ys) = (reverse ys ++) $ finish' xs $ tail bs
          finish' xs [] = []
          finish' xs ys = sum (zipWith (*) xs ys):finish' xs (tail ys)
          zeroize xs    = if all (==0) xs then [] else xs

  -- this definition works
  negate = fmap negate
  -- but these two might run into some problems
  abs    = fmap abs
  signum = fmap signum
  fromInteger 0    = Poly []
  fromInteger a    = Poly [fromInteger a]

instance Eq a => Eq (Polynomial a) where
  (==) (Poly xs) (Poly ys) = xs == ys
```

Then, along with some helper matrix functions, we can build a function for the characteristic polynomial.

```{haskell}
-- Zero matrix
zero :: Num a => Int -> Matrix a
zero n = Mat $ listArray ((0,0),(n-1,n-1)) $ repeat 0

-- Identity matrix
eye :: Num a => Int -> Matrix a
eye n = Mat $ unMat (zero n) // take n [((i,i), 1) | i <- [0..]]

-- Maps
mapRange :: Ix i => (i -> e) -> (i, i) -> [(i, e)]
mapRange g r = map (\x -> (x, g x)) $ range r

-- Pointwise application
zipWithArr :: Ix i => (a -> b -> c)
  -> Array i a -> Array i b -> Array i c
zipWithArr f a b
  | ab == bb  = array ab $ map (\x -> (x, f (a!x) (b!x))) $ indices a
  | otherwise = error "Array dimension mismatch" where
    ab = bounds a
    bb = bounds b

(|+|) (Mat x) (Mat y) = Mat $ zipWithArr (+) x y

-- Characteristic Polynomial
charpoly :: Matrix Int -> Polynomial Int
charpoly xs = determinant $ eyeLambda |+| negPolyXs where
  -- Furthest extent of the bounds, i.e., the size of the matrix
  (_,(n,_)) = bounds $ unMat xs
  -- Negative of input matrix, after being converted to polynomials
  negPolyXs :: Matrix (Polynomial Int)
  negPolyXs = fmap (\x -> Poly [-x]) xs
  -- Identity matrix times lambda (encoded as Poly [0, 1])
  eyeLambda :: Matrix (Polynomial Int)
  eyeLambda = (\x -> Poly [x] * Poly [0, 1]) <$> eye (n+1)

markdown $ texifyPoly $ charpoly $ toMatrix [[1,0],[0,1]]
```
</details>

Computation using this definition is only good for demonstrative purposes.
The [Faddeev-LeVerrier algorithm](
    https://en.wikipedia.org/wiki/Faddeev%E2%80%93LeVerrier_algorithm
  ) circumvents Laplace expansion entirely and happens to generate the determinant along the way.
However, it has some problems:

- It inverts the order in which the determinant and characteristic polynomial are defined
- It introduces division, which makes it unsuitable for direct use with matrices with entries mod *p*

Fortunately, we can just work with a matrix over the integers and mod out at the end instead,
  as the following diagram conveys:

$$
\begin{array}{}
  \mathbb{F}_p ^{n \times n} & \textcolor{green}{\hookrightarrow} &
    \normalsize  \mathbb{Z}^{n \times n} &
    \overset{\mod p ~~}{\longrightarrow} &
    \mathbb{F}_p^{n \times n} & \scriptsize \phantom{text{charpoly}}
  \\[10pt]
  & \scriptsize \textcolor{green}{\text{charpoly (FL)}}  &
    \textcolor{green}{\downarrow} & &
    \textcolor{red}{\downarrow} & \scriptsize \textcolor{red}{\text{charpoly (LE)}}
  \\[12pt]
  & & \mathbb{Z}[\lambda] &
    \textcolor{green}{\underset{\mod p ~~}{\longrightarrow}} &
    \mathbb{F}_p[\lambda] & \scriptsize \phantom{text{charpoly}}
\end{array}
$$

The top row are matrices and the bottom row are polynomials.
To get to the bottom-right, which contains the characteristic polynomial over GF(*p*),
  we can avoid the red arrow and follow the path in green instead.


### Friends Among Matrices

In the reverse direction, a matrix with a specific characteristic polynomial
  can be constructed from a polynomal.
The matrix is called the [companion matrix](https://en.wikipedia.org/wiki/Companion_matrix),
  and is defined as

$$
\begin{gather*}
  p(\lambda) = \lambda^n + p_{n-1}\lambda^{n-1} + ... + p_1 \lambda + p_0
  \\[10pt]
  C_{p(\lambda)} = \left( \begin{matrix}
         0 &      1 &      0 &    ... &        0 \\
         0 &      0 &      1 &    ... &        0 \\
    \vdots & \vdots & \vdots & \ddots &   \vdots \\
         0 &      0 &      0 &    ... &        1 \\
      -p_0 &   -p_1 &   -p_2 &    ... & -p_{n-1}
  \end{matrix} \right)
  = \left( \begin{matrix}
    \overrightharpoon 0_{n-1} & \bold{1}_{n-1} \\
    -p_0 & -(\overrightharpoon{p}_{1:n-1})^T
  \end{matrix} \right)
  \\ \\[1pt]
  \text{charpoly}(C_{p}) = p_{C_{p}}(\lambda) = p(\lambda)
\end{gather*}
$$

The definition of the companion matrix only depends on elements having an additive inverse,
  which is always true in a field.
Therefore, there are always matrices over a field that have a monic polynomial
  as their characteristic polynomial.

Proving that the companion matrix has the characteristic polynomial it was constructed from
  can be done via Laplace expansion:

$$
\begin{gather*}
  p_{0:n-1}(\lambda)
    = \left| \begin{matrix}
      \textcolor{red}{\lambda} &      -1 &      0 &    ... &                 0 \\
                             0 & \lambda &     -1 &    ... &                 0 \\
                        \vdots &  \vdots & \vdots & \ddots &            \vdots \\
                             0 &       0 &      0 &    ... &                -1 \\
      \textcolor{green}{p_0}   &     p_1 &    p_2 &    ... & \lambda + p_{n-1}
    \end{matrix} \right|
  \\ \\[1pt]
  = \textcolor{green}{p_0} \cdot (-1)^{n-1}
    \left| \begin{matrix}
           -1 &      0 &    ... &      0 \\
      \lambda &     -1 &    ... &      0 \\
      \vdots  & \vdots & \ddots & \vdots \\
            0 &      0 &    ... &     -1
    \end{matrix} \right|
    + \textcolor{red}{\lambda}
    \left| \begin{matrix}
      \lambda &     -1 &    ... &                 0 \\
       \vdots & \vdots & \ddots &            \vdots \\
            0 &      0 &    ... &                -1 \\
          p_1 &    p_2 &    ... & \lambda + p_{n-1}
    \end{matrix} \right|
  \\ \\[1pt]
  = \textcolor{green}{p_0} \cdot (-1)^{n-1} \cdot (-1)^{n-1}
    + \textcolor{red}{\lambda} \cdot p_{1:n-1}(\lambda)
  \\
  = p_0 + \lambda(p_1 + \lambda ( \cdots (p_{n-1} + \lambda ) \cdots )))
\end{gather*}
$$

Pleasantly, this yields the Horner form, which was used above to evaluate polynomials.

<details>
<summary>
Haskell implementation of the companion matrix
</summary>

```{haskell}
companion :: Polynomial Int -> Matrix Int
companion (Poly ps)
  | last ps' /= 1 = error "Cannot find companion matrix of non-monic polynomial"
  | otherwise     = Mat $ array ((0,0), (n-1,n-1)) $ lastRow ++ shiftI where
    -- The degree of the polynomial, as well as the size of the matrix
    n       = length ps' - 1
    -- Remove trailing 0s from ps
    ps'     = reverse $ dropWhile (==0) $ reverse ps
    -- Address/value tuples for a shifted identity matrix:
    -- 1s on the diagonal just above the main diagonal, 0s elsewhere
    shiftI  = map (\p@(x,y) -> (p, if y == x + 1 then 1 else 0)) $
      range ((0,0),(n-2,n-1))
    -- Address/value tuples for the last row of the companion matrix:
    -- ascending powers of the polynomial
    lastRow = zipWith (\x y -> ((n-1, x), y)) [0..n-1] $ map negate ps'

-- (charpoly . companion) = id :: Polynomial Int -> Polynomial Int
```

Applying this to $1 - 2x + x^2$ gives us:

```{haskell}
--| code-fold: true

reshape :: Int -> [a] -> [[a]]
reshape n = unfoldr (reshape' n) where
  reshape' n x = if null x then Nothing else Just $ splitAt n x

fromMatrix :: Matrix a -> [[a]]
fromMatrix (Mat m) = let (_,(_,n)) = bounds m in reshape (n+1) $ elems m

texifyMatrix mat = surround mat' where
  mat' = intercalate " \\\\ " $ map (intercalate " & " . map show) $
    fromMatrix mat
  surround = ("\\left( \\begin{matrix}" ++) . (++ "\\end{matrix} \\right)")

markdown $ ("$$" ++) $ (++ "$$") $ texifyMatrix $ companion $ Poly [1, -2, 1]
```
</details>


Field Extensions
----------------

Aside from those of degree 1, the irreducible polynomials over a field cannot be factored
  into monomials over the field.
In other words, irreducibles have roots which do not exist as elements of the field.
A *field extension* formalizes the notion by which one can make a larger field from another
  by adding roots of a polynomial.

$x^2 + 1$ is irreducible, both over the integers and over an actual field like $\mathbb{R}$.
On the other hand, it can be factored into $(x + i)(x - i)$ over $\mathbb{C}$.
We can construct the latter field from the former if an extra number *i* exists alongside everything
  in $\mathbb{R}$ such that *i*^2^ = -1.
Elements of this new field are linear combinations of multiples of powers of *i*
  less than the degree (in this case, 0 and 1; i.e., $a + bi$).

The equation that *i* obeys can be rewritten as $i^2 + 1 = 0$, which is the original polynomial,
  evaluated at *i*.
In order to refer explicitly to the construction of the bigger field from the polynomial,
  we write[^2] $\mathbb{R}[x] / (x^2 + 1) \cong \mathbb{C}$.

[^2]: *Technically*, the left hand side refers to something else
    (cosets of polynomials, from which we extract the canonical member *i*), but this description is good enough.


### The Power of Primes

We can extend a finite field in the same way.
Over GF(2), the smallest irreducible of degree 2 is $x^2 + x + 1$.
Using the same logic as before, we construct
  $\mathbb{F}_2[x] / (x^2 + x + 1) \cong \mathbb{F}_2[\alpha]$.
The new element *α* is a root of the polynomial and obeys the relations:

$$
\begin{gather*}
  \alpha^2 + \alpha + 1 = 0
  \\
  \alpha^2 = -\alpha - 1 \equiv \alpha + 1 \mod 2
  \\
  \alpha^3 = \alpha^2 + \alpha = (\alpha + 1) + \alpha \equiv 1 \mod 2
\end{gather*}
$$

Just like *i*, only powers of *α* less than 2 (again, 0 and 1) are necessary
  to express elements of the field.
Skipping a few steps, we can accumulate all possible sums and products over this new field
  into two new tables:

:::: {layout-ncol="2"}
::: {.cayley-table}
| +       | 0       | 1       | *α*     | *α* + 1 |
|---------|---------|---------|---------|---------|
| 0       | 0       | 1       | *α*     | *α* + 1 |
| 1       | 1       | 0       | *α* + 1 | *α*     |
| *α*     | *α*     | *α* + 1 | 0       | 1       |
| *α* + 1 | *α* + 1 | *α*     | 1       | 0       |
:::

::: {.cayley-table}
| ×       | 0       | 1       | *α*     | *α* + 1 |
|---------|---------|---------|---------|---------|
| 0       | 0       | 0       | 0       | 0       |
| 1       | 0       | 1       | *α*     | *α* + 1 |
| *α*     | 0       | *α*     | *α* + 1 | 1       |
| *α* + 1 | 0       | *α* + 1 | 1       | *α*     |
:::
::::

As you might expect, the resulting field has 4 elements, so it's called GF(4) (or $\mathbb{F}_4$).
In general, when adjoining an irreducible of degree *d* to GF(*p*), the resulting field has *p*^*d*^ elements,
  naturally denoted GF(p^d^) (or $\mathbb{F}_{p^d}$).
*p* is still special, and is called the *characteristic* of the field.
It denotes how many repeated additions are needed to get to 0.
From the above table, it's clear that the characteristic is 2 since
  $1 + 1 = \alpha + \alpha = (\alpha + 1) + (\alpha + 1) = 0$.


### ...and beyond?

All of this is manageable when you're adjoining a root of a degree 2 polynomial like *α* or *i*,
  but things get difficult when you start to work with higher degrees.
The powers of the root form the basis for a *d*-dimensional vector space over GF(*p*)
  (hence the order of the field being *p*^*d*^).
Proceeding as before, we'd have to be able to:

- recognize equality in the new field based on sums of powers of roots (times elements of the field)
- have a canonical method of expressing other elements after adjoining a root
    - ideally, handle both with an algorithm that gives canonical forms from noncanonical ones
- know when we've found every element of the new field

These problems make it difficult to study prime power fields on a computer without the use of
  a CAS like Maple or Mathematica.
They're capable of taking care of these issues symbolically, working with the expressions
  in the same a mathematician would (or at least appearing to do so).
As someone who likes to do things himself, implementing a CAS from scratch seemed a little too cumbersome.
Furthermore, even a more direct approach using the previously-mentioned
  "canonical members of cosets of polynomials" was more annoying than I was willing to put up with.

Fortunately, there's a detour that makes it much easier to dodge all of these problems,
  and it has some interesting consequences.
Join me in [the next post](../2) for a direct, non-symbolic way to work with prime power fields.