--- title: "Exploring Finite Fields, Part 2 Appendix" description: | Additional notes about polynomial evaluation. format: html: html-math-method: katex date: "2024-01-15" date-modified: "2025-07-16" categories: - algebra - finite field - haskell --- In the [second post in this series](../), we briefly discussed alternate means of evaluating polynomials by "plugging in" different structures. Different Kinds of Polynomials ------------------------------ Rather than redefining evaluation for each of these cases, we can map the polynomial into a structure compatible with it should be evaluated. Essentially, this means that from a polynomial in the base structure, we can derive polynomials in these other structures. In particular, there is a distinction between a matrix of polynomials or a polynomial in matrices: :::: {layout-ncol="2"} ::: {} $$ \begin{align*} p &: K[x] \\ p(x) &= x^n + p_{n-1}x^{n-1} + ... \\ \phantom{= p} & + p_1 x + p_0 \end{align*} $$ ::: ::: {} $x$ is a scalar indeterminate ```haskell p :: Polynomial k ``` ::: :::: :::: {layout-ncol="2"} ::: {} $$ \begin{align*} P &: (K[x])^{m \times m} \\ P(x I) &= (x I)^n + (p_{n-1})(x I)^{n-1} + ... \\ & + p_1(x I)+ p_0 I \end{align*} $$ ::: ::: {} $x$ is a scalar indeterminate, $P(x I)= p(x) I$ is a matrix of polynomials in $x$ ```haskell asPolynomialMatrix :: Polynomial k -> Matrix (Polynomial k) pMat :: Matrix (Polynomial k) pMat = asPolynomialMatrix p ``` ::: :::: :::: {layout-ncol="2"} ::: {} $$ \begin{align*} \hat P &: K^{m \times m}[X] \\ \hat P(X) &= X^n + (p_{n-1}I)X^{n-1} + ... \\ & + (p_1 I) X + (p_0 I) \end{align*} $$ ::: ::: {} $X$ is a matrix indeterminate, $\hat P(X)$ is a polynomial over matrices ```haskell asMatrixPolynomial :: Polynomial k -> Polynomial (Matrix k) pHat :: Polynomial (Matrix k) pHat = asMatrixPolynomial p ``` ::: :::: It's easy to confuse the latter two, but the Haskell makes the difference in types clearer. There exists a natural isomorphism between the two, which is discussed further in the [fourth post in this series](../../4/). Cayley-Hamilton Theorem, Revisited ---------------------------------- As a reminder, the [Cayley-Hamilton theorem](https://en.wikipedia.org/wiki/Cayley%E2%80%93Hamilton_theorem) says that a matrix satisfies its own characteristic polynomial. In a type-stricter sense, it says the following relationship holds: ```haskell evalPoly :: a -> Polynomial a -> a mA :: Matrix a charpolyA :: Polynomial a charpolyA = charpoly mA charpolyA :: Polynomial (Matrix a) matCharpolyA = asMatrixPolynomial charPolyA evalPoly mA matCharpolyA == (0 :: Matrix a) ``` Due to the aformentioned isomorphism, factoring a polynomial "inside" a matrix turns out to give the same answer as factoring a polynomial over matrices. 
As a worked example, factor $p(x) = x^2 + x + 1$ over $2 \times 2$ matrices, in both senses:

:::: {layout-ncol="2"}
::: {}
$$
\begin{gather*}
P(xI) = \left( \begin{matrix} x^2 + x + 1 & 0 \\ 0 & x^2 + x + 1 \end{matrix} \right) \\
\\
= (xI - C)(xI - C') \\
\\
= \left( \begin{matrix} x & -1 \\ 1 & x + 1 \end{matrix} \right)
  \left( \begin{matrix} x - a & -b \\ -c & x - d \end{matrix} \right) \\
\\
\begin{align*}
x(x - a) + c &= x^2 + x + 1 \\
\textcolor{green}{x(-b) - (x - d)} &\textcolor{green}{= 0} \\
\textcolor{blue}{(x - a) + (x + 1)(-c)} &\textcolor{blue}{= 0} \\
(-b) + (x + 1)(x - d) &= x^2 + x + 1
\end{align*} \\
\\
\textcolor{green}{(-b - 1)x + d = 0} \implies b = -1, ~ d = 0 \\
\textcolor{blue}{(1 - c)x - a - c = 0} \implies c = 1, ~ a = -1 \\
\\
C' = \left( \begin{matrix} -1 & -1 \\ 1 & 0 \end{matrix} \right)
\end{gather*}
$$
:::

::: {}
$$
\begin{gather*}
\hat P(X) = X^2 + X + I \\[10pt]
= (X - C)(X - C') \\[10pt]
= X^2 - (C + C')X + CC' \\[10pt]
\implies \\[10pt]
C + C' = -I, ~ C' = -I - C \\[10pt]
CC' = I, ~ C^{-1} = C' \\[10pt]
C' = \left( \begin{matrix} -1 & -1 \\ 1 & 0 \end{matrix} \right)
\end{gather*}
$$
:::
::::

It's important to note that a matrix factorization is not unique. *Any* matrix with a given characteristic polynomial can be used as a root of that polynomial. Of course, choosing one root constrains what the remaining roots can be.
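Sticking with the hypothetical types from the sketch above, we can spot-check the worked example: both $C$ and $C'$ are roots of $\hat P(X) = X^2 + X + I$, and the two relations on the right-hand column hold. The names `pHat2`, `c`, and `c'` are illustrative.

```haskell
-- The polynomial X^2 + X + I, over 2x2 integer matrices.
pHat2 :: Polynomial (Matrix Integer)
pHat2 = asMatrixPolynomial (Polynomial [1, 1, 1])

-- C is read off from xI - C above; C' is the root we solved for.
c, c' :: Matrix Integer
c  = Matrix 0 1 (-1) (-1)
c' = Matrix (-1) (-1) 1 0

factorizationChecks :: Bool
factorizationChecks =
     evalPoly c  pHat2 == 0   -- C  is a root
  && evalPoly c' pHat2 == 0   -- C' is a root
  && c + c' == negate 1       -- C + C' = -I
  && c * c' == 1              -- CC' = I, so C' = C^(-1)
```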
### Moving Roots

All matrices commute with the identity and zero matrices. A less obvious fact is that in a matrix factorization, the roots *also* all commute with one another.

Expanding a complete factorization into linear factors (which the Fundamental Theorem of Algebra guarantees exists for scalar polynomials) yields [Vieta's formulas](https://en.wikipedia.org/wiki/Vieta%27s_formulas):

$$
\begin{gather*}
\hat P(X) = \prod_{[i]_n} (X - \Xi_i) = (X - \Xi_0)(X - \Xi_1)...(X - \Xi_{n-1}) \\
= \left\{
\begin{align*}
& \phantom{+} X^n \\
& - (\Xi_0 + \Xi_1 + ... + \Xi_{n-1}) X^{n-1} \\
& + (\Xi_0 \Xi_1 + \Xi_0 \Xi_2 + ... + \Xi_0 \Xi_{n-1} + \Xi_1 \Xi_2 + ... + \Xi_{n-2} \Xi_{n-1})X^{n-2} \\
& \qquad \vdots \\
& + (-1)^n \Xi_0 \Xi_1 \Xi_2...\Xi_{n-1}
\end{align*}
\right. \\
= X^n - \sigma_1([\Xi]_n)X^{n-1} + \sigma_2([\Xi]_n)X^{n-2} + ... + (-1)^n \sigma_n([\Xi]_n)
\end{gather*}
$$

The product range \[*i*\]~*n*~ means that the terms are ordered from 0 to *n* - 1 over the index given. On the bottom line, the *σ*~*k*~ are [elementary symmetric polynomials](https://en.wikipedia.org/wiki/Elementary_symmetric_polynomial) and \[*Ξ*\]~*n*~ is the list of root matrices from *Ξ*~0~ to *Ξ*~*n*-1~.

Factoring with the roots in a different order gives another factorization of the same polynomial, so every coefficient must stay the same. It suffices to focus on *σ*~2~, which contains all pairwise products.

$$
\begin{gather*}
\pi \in S_n \\
\pi \circ \hat P(X) = \prod_{\pi([i]_n)} (X - \Xi_i) \\
\\
= X^n - \sigma_1 \left( \pi([\Xi]_n) \vphantom{^{1}} \right)X^{n-1} + \sigma_2 \left( \pi([\Xi]_n) \vphantom{^{1}} \right)X^{n-2} + ... + (-1)^n \sigma_n \left( \pi([\Xi]_n) \vphantom{^{1}} \right) \\[10pt]
\pi_{(0 ~ 1)} \circ \hat P(X) = (X - \Xi_1)(X - \Xi_0)(X - \Xi_2)...(X - \Xi_{n-1}) \\
= X^n + ... + \sigma_2(\Xi_1, \Xi_0, \Xi_2, ..., \Xi_{n-1})X^{n-2} + ... \\[10pt]
\begin{array}{ccccc}
e & (0 ~ 1) & (1 ~ 2) & ... & (n-2 ~~ n-1) \\
\hline
\textcolor{red}{\Xi_0 \Xi_1} & \textcolor{red}{\Xi_1 \Xi_0} & \Xi_0 \Xi_1 & & \Xi_0 \Xi_1 \\
\Xi_0 \Xi_2 & \Xi_0 \Xi_2 & \Xi_0 \Xi_2 & & \Xi_0 \Xi_2 \\
\Xi_0 \Xi_3 & \Xi_0 \Xi_3 & \Xi_0 \Xi_3 & & \Xi_0 \Xi_3 \\
\vdots & \vdots & \vdots & & \vdots \\
\Xi_0 \Xi_{n-1} & \Xi_0 \Xi_{n-1} & \Xi_0 \Xi_{n-1} & & \Xi_0 \Xi_{n-1} \\
\textcolor{green}{\Xi_1 \Xi_2} & \Xi_1 \Xi_2 & \textcolor{green}{\Xi_2 \Xi_1} & & \Xi_1 \Xi_2 \\
\vdots & \vdots & \vdots & & \vdots \\
\textcolor{blue}{\Xi_{n-2} \Xi_{n-1}} & \Xi_{n-2} \Xi_{n-1} & \Xi_{n-2} \Xi_{n-1} & & \textcolor{blue}{\Xi_{n-1} \Xi_{n-2}}
\end{array}
\end{gather*}
$$

The "[path swaps](/posts/math/permutations/1/)" shown commute only adjacent elements: each flips exactly one pairwise product in *σ*~2~, and since the sum must be unchanged, the flipped product must equal the original, so those two roots commute. By contrast, the permutation $(0 ~ 2)$ commutes *Ξ*~0~ past both *Ξ*~1~ and *Ξ*~2~. But since we already know *Ξ*~0~ and *Ξ*~1~ commute by the above list, we learn at this step that *Ξ*~0~ and *Ξ*~2~ commute. This can be repeated until we reach the permutation $(0 ~ n-1)$ to prove commutativity between all pairs.
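For the two-root example above, the whole argument reduces to the single swap $(0 ~ 1)$, which we can again check against the sketch types. The `mulPoly` and `xMinus` helpers below are assumptions of this appendix: a naive convolution product and a linear-factor constructor.

```haskell
-- Naive convolution product of polynomials. The indeterminate X is
-- central, so only the (matrix) coefficients can fail to commute;
-- as written, this multiplies coefficients in left-to-right order.
mulPoly :: Num a => Polynomial a -> Polynomial a -> Polynomial a
mulPoly (Polynomial as) (Polynomial bs) =
  Polynomial
    [ sum [ (as !! i) * (bs !! (k - i))
          | i <- [0 .. k], i < length as, k - i < length bs ]
    | k <- [0 .. length as + length bs - 2] ]

-- The linear factor X - m (here 1 is the identity matrix).
xMinus :: Num a => Matrix a -> Polynomial (Matrix a)
xMinus m = Polynomial [negate m, 1]

-- Swapping the two factors leaves the product unchanged, exactly
-- because the roots commute.
rootsCommute :: Bool
rootsCommute =
     c * c' == c' * c
  && mulPoly (xMinus c) (xMinus c') == mulPoly (xMinus c') (xMinus c)
```

Both products expand to $X^2 + X + I$, precisely because $CC' = C'C$.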