From 40e3eb831a7ca54809bd6e4c7dafac5410468bdd Mon Sep 17 00:00:00 2001 From: queue-miscreant Date: Fri, 8 Aug 2025 04:25:27 -0500 Subject: [PATCH] render site again --- .../posts/math/chebyshev/1/index/execute-results/html.json | 6 +++--- .../math/finite-field/3/index/execute-results/html.json | 6 +++--- .../math/finite-field/4/index/execute-results/html.json | 6 +++--- .../math/number-number/1/index/execute-results/html.json | 6 +++--- .../polycount/4/appendix/index/execute-results/html.json | 6 +++--- _freeze/posts/math/stereo/2/index/execute-results/html.json | 6 +++--- 6 files changed, 18 insertions(+), 18 deletions(-) diff --git a/_freeze/posts/math/chebyshev/1/index/execute-results/html.json b/_freeze/posts/math/chebyshev/1/index/execute-results/html.json index 3914367..abf3023 100644 --- a/_freeze/posts/math/chebyshev/1/index/execute-results/html.json +++ b/_freeze/posts/math/chebyshev/1/index/execute-results/html.json @@ -1,10 +1,10 @@ { - "hash": "f0d1956695c45b1ca4b803d88c0a4bf5", + "hash": "2a7732da8eeb3c7a2eec0ae77f9ce94d", "result": { "engine": "jupyter", - "markdown": "---\ntitle: \"Generating Polynomials, Part 1: Regular Constructibility\"\ndescription: |\n What kinds of regular polygons are constructible with compass and straightedge?\nformat:\n html:\n html-math-method: katex\ndate: \"2021-08-18\"\ndate-modified: \"2025-06-17\"\ncategories:\n - geometry\n - generating functions\n - algebra\n - python\n---\n\n\n\n\n\n[Recently](/posts/misc/platonic-volume), I used coordinate-free geometry to derive\n the volumes of the Platonic solids, a problem which was very accessible to the ancient Greeks.\nOn the other hand, they found certain problems regarding which figures can be constructed via\n compass and straightedge to be very difficult. 
For example, they struggled with problems\n like [doubling the cube](https://en.wikipedia.org/wiki/Doubling_the_cube)\n or [squaring the circle](https://en.wikipedia.org/wiki/Squaring_the_circle),\n which are known (through circa 19th century mathematics) to be impossible.\nHowever, before even extending planar geometry by a third dimension or\n calculating the areas of circles, a simpler problem becomes apparent.\nNamely, what kinds of regular polygons are constructible?\n\n\nRegular Geometry and a Complex Series\n-------------------------------------\n\nWhen constructing a regular polygon, one wants a ratio between the length of a edge\n and the distance from a vertex to the center of the figure.\n\n![\n Regular triangle, square, and pentagons inscribed in a unit circle.\n Note the right triangle formed by the apothem, half of an edge, and circumradius.\n](./central_angle_figures.png){.wide}\n\nIn a convex polygon, the total central angle is always one full turn, or 2π radians.\nThe central angle of a regular *n*-gon is ${2\\pi \\over n}$ radians,\n and the green angle above (which we'll call *θ*) is half of that.\nThis means that the ratio we're looking for is $\\sin(\\theta) = \\sin(\\pi / n)$.\nWe can multiply by *n* inside the function on both sides to give\n $\\sin(n\\theta) = \\sin(\\pi) = 0$.\nTherefore, constructing a polygon is actually equivalent to solving this equation,\n and we can rephrase the question as how to express $\\sin(n\\theta)$ (and $\\cos(n\\theta)$).\n\n\n### Complex Recursion\n\nThanks to [Euler's formula](https://en.wikipedia.org/wiki/Euler%27s_formula)\n and [de Moivre's formula](https://en.wikipedia.org/wiki/De_Moivre%27s_formula),\n the expressions we're looking for can be phrased in terms of the complex exponential.\n\n$$\n\\begin{align*}\n e^{i\\theta}\n &= \\text{cis}(\\theta) = \\cos(\\theta) + i\\sin(\\theta)\n & \\text{ Euler's formula}\n \\\\\n \\text{cis}(n \\theta) = e^{i(n\\theta)}\n &= e^{(i\\theta)n} = {(e^{i\\theta})}^n = \\text{cis}(\\theta)^n\n \\\\\n \\cos(n \\theta) + i\\sin(n \\theta)\n &= (\\cos(\\theta) + i\\sin(\\theta))^n\n & \\text{ de Moivre's formula}\n\\end{align*}\n$$\n\nDe Moivre's formula for $n = 2$ gives\n\n$$\n\\begin{align*}\n \\text{cis}(\\theta)^2\n &= (\\text{c} + i\\text{s})^2\n \\\\\n &= \\text{c}^2 + 2i\\text{cs} - \\text{s}^2 + (0 = \\text{c}^2 + \\text{s}^2 - 1)\n \\\\\n &= 2\\text{c}^2 + 2i\\text{cs} - 1\n \\\\\n &= 2\\text{c}(\\text{c} + i\\text{s}) - 1\n \\\\\n &= 2\\cos(\\theta)\\text{cis}(\\theta) - 1\n\\end{align*}\n$$\n\nThis can easily be massaged into a recurrence relation.\n\n$$\n\\begin{align*}\n \\text{cis}(\\theta)^2\n &= 2\\cos(\\theta)\\text{cis}(\\theta) - 1\n \\\\\n \\text{cis}(\\theta)^{n+2}\n &= 2\\cos(\\theta)\\text{cis}(\\theta)^{n+1} - \\text{cis}(\\theta)^n\n \\\\\n \\text{cis}((n+2)\\theta)\n &= 2\\cos(\\theta)\\text{cis}((n+1)\\theta) - \\text{cis}(n\\theta)\n\\end{align*}\n$$\n\nRecurrence relations like this one are powerful.\nThrough some fairly straightforward summatory manipulations,\n the sequence can be interpreted as the coefficients in a Taylor series,\n giving a [generating function](https://en.wikipedia.org/wiki/Generating_function).\nCall this function *F*. 
Then,\n\n$$\n\\begin{align*}\n \\sum_{n=0}^\\infty \\text{cis}((n+2)\\theta)x^n\n &= 2\\cos(\\theta) \\sum_{n=0}^\\infty \\text{cis}((n+1)\\theta) x^n\n - \\sum_{n=0}^\\infty \\text{cis}(n\\theta) x^n\n \\\\\n {F(x; \\text{cis}(\\theta)) - 1 - x\\text{cis}(\\theta) \\over x^2}\n &= 2\\cos(\\theta) {F(x; \\text{cis}(\\theta)) - 1 \\over x}\n - F(x; \\text{cis}(\\theta))\n \\\\[10pt]\n F - 1 - x\\text{cis}(\\theta)\n &= 2\\cos(\\theta) x (F - 1)\n - x^2 F\n \\\\\n F - 2\\cos(\\theta) x F + x^2 F\n &= 1 + x(\\text{cis}(\\theta) - 2\\cos(\\theta))\n \\\\[10pt]\n F(x; \\text{cis}(\\theta))\n &= {1 + x(\\text{cis}(\\theta) - 2\\cos(\\theta)) \\over\n 1 - 2\\cos(\\theta)x + x^2}\n\\end{align*}\n$$\n\nSince $\\text{cis}$ is a complex function, we can separate *F* into real and imaginary parts.\nConveniently, these correspond to $\\cos(n\\theta)$ and $\\sin(n\\theta)$, respectively.\n\n$$\n\\begin{align*}\n \\Re[ F(x; \\text{cis}(\\theta)) ]\n &= {1 + x(\\cos(\\theta) - 2\\cos(\\theta)) \\over 1 - 2\\cos(\\theta)x + x^2}\n \\\\\n &= {1 - x\\cos(\\theta) \\over 1 - 2\\cos(\\theta)x + x^2} = A(x; \\cos(\\theta))\n \\\\\n \\Im[ F(x; \\text{cis}(\\theta)) ]\n &= {x \\sin(\\theta) \\over 1 - 2\\cos(\\theta)x + x^2} = B(x; \\cos(\\theta))\\sin(\\theta)\n\\end{align*}\n$$\n\nIn this form, it becomes obvious that the even though the generating function *F* was originally\n parametrized by $\\text{cis}(\\theta)$, *A* and *B* are parametrized only by $\\cos(\\theta)$.\nExtracting the coefficients of *x* yields an expression for $\\cos(n\\theta)$ and $\\sin(n\\theta)$\n in terms of $\\cos(\\theta)$ (and in the latter case, a common factor of $\\sin(\\theta)$).\n\nIf $\\cos(\\theta)$ in *A* and *B* is replaced with the parameter *z*, then all trigonometric functions\n are removed from the equation, and we are left with only polynomials[^1].\nThese polynomials are [*Chebyshev polynomials*](https://en.wikipedia.org/wiki/Chebyshev_polynomial)\n *of the first (A) and second (B) kind*.\nIn actuality, the polynomials of the second kind are typically offset by 1\n (the x in the numerator of *B* is omitted).\nHowever, retaining this term makes indexing consistent between *A* and *B*\n (and will make things clearer later).\n\n[^1]:\n This can actually be observed as early as the recurrence relation.\n\n $$\n \\begin{align*}\n \\text{cis}(\\theta)^{n+2}\n &= 2\\cos(\\theta)\\text{cis}(\\theta)^{n+1} - \\text{cis}(\\theta)^n\n \\\\\n a_{n+2}\n &= 2 z a_{n+1} - a_n\n \\\\\n \\Re[ a_0 ]\n &= 1,~~ \\Im[ a_0 ] = 0\n \\\\\n \\Re[ a_1 ]\n &= z,~~ \\Im[ a_1 ] = 1 \\cdot \\sin(\\theta)\n \\end{align*}\n $$\n\n\nWe were primarily interested in $\\sin(n\\theta)$, so let's tabulate\n the first few polynomials of the second kind (at $z / 2$).\n\n::: {#tbl-chebyshevu .cell .plain tbl-cap='[OEIS A049310](http://oeis.org/A049310)' execution_count=3}\n\n::: {.cell-output .cell-output-display .cell-output-markdown execution_count=2}\n*n* $[x^n]B(x; z / 2) = U_{n - 1}(z / 2)$ Factored\n----- --------------------------------------------- -------------------------------------------------------------------------------------------------\n0 $0$ $0$\n1 $1$ $1$\n2 $z$ $z$\n3 $z^{2} - 1$ $\\left(z - 1\\right) \\left(z + 1\\right)$\n4 $z^{3} - 2 z$ $z \\left(z^{2} - 2\\right)$\n5 $z^{4} - 3 z^{2} + 1$ $\\left(z^{2} - z - 1\\right) \\left(z^{2} + z - 1\\right)$\n6 $z^{5} - 4 z^{3} + 3 z$ $z \\left(z - 1\\right) \\left(z + 1\\right) \\left(z^{2} - 3\\right)$\n7 $z^{6} - 5 z^{4} + 6 z^{2} - 1$ $\\left(z^{3} - z^{2} - 2 z + 1\\right) \\left(z^{3} + 
z^{2} - 2 z - 1\\right)$\n8 $z^{7} - 6 z^{5} + 10 z^{3} - 4 z$ $z \\left(z^{2} - 2\\right) \\left(z^{4} - 4 z^{2} + 2\\right)$\n9 $z^{8} - 7 z^{6} + 15 z^{4} - 10 z^{2} + 1$ $\\left(z - 1\\right) \\left(z + 1\\right) \\left(z^{3} - 3 z - 1\\right) \\left(z^{3} - 3 z + 1\\right)$\n10 $z^{9} - 8 z^{7} + 21 z^{5} - 20 z^{3} + 5 z$ $z \\left(z^{2} - z - 1\\right) \\left(z^{2} + z - 1\\right) \\left(z^{4} - 5 z^{2} + 5\\right)$\n:::\n:::\n\n\nEvaluating the polynomials at $z / 2$ cancels the 2 in the denominator (and recurrence),\n making these expressions much simpler.\nThis evaluation has an interpretation in terms of the previous diagram --\n recall we used *half* the length of a side as a leg of the right triangle.\nFor a unit circumradius, the side length itself is then $2\\sin( {\\pi / n} )$.\nTo compensate for this doubling, the Chebyshev polynomial must be evaluated at half its normal argument.\n\n\n### Back on the Plane\n\nThe constructibility criterion is deeply connected to the Chebyshev polynomials.\nIn compass and straightedge constructions, one only has access to linear forms (lines)\n and quadratic forms (circles).\nThis means that a figure is constructible if and only if the root can be expressed using\n normal arithmetic (which is linear) and square roots (which are quadratic).\n\n\n#### Pentagons\n\nLet's look at a regular pentagon.\nThe relevant polynomial is\n\n$$\n[x^5]B ( x; z / 2 )\n = z^4 - 3z^2 + 1\n = (z^2 - z - 1) (z^2 + z - 1)\n$$\n\nAccording to how we derived this series, when $z = 2\\cos(\\theta)$, the roots of this polynomial\n correspond to when $\\sin(5\\theta) / \\sin(\\theta) = 0$.\nThis relation itself is true when $\\theta = \\pi / 5$, since $\\sin(5 \\pi / 5) = 0$.\n\nOne of the factors must therefore be the minimal polynomial of $2\\cos(\\pi / 5 )$.\nThe former happens to be correct correct, since $2\\cos( \\pi / 5 ) = \\varphi$, the golden ratio.\nNote that the second factor is the first evaluated at -*z*.\n\n\n#### Heptagons\n\nAn example of where constructability fails is for $2\\cos( \\pi / 7 )$.\n\n$$\n\\begin{align*}\n [x^7]B ( x; z / 2 )\n &= z^6 - 5 z^4 + 6 z^2 - 1\n \\\\\n &= ( z^3 - z^2 - 2 z + 1 ) ( z^3 + z^2 - 2 z - 1 )\n\\end{align*}\n$$\n\nWhichever is the minimal polynomial (the former), it is a cubic, and constructing\n a regular heptagon is equivalent to solving it for *z*.\nBut there are no (nondegenerate) cubics that one can produce via compass and straightedge,\n and all constructions necessarily fail.\n\n\n#### Decagons\n\nOne might think the same of $2\\cos(\\pi /10 )$\n\n$$\n\\begin{align*}\n [x^{10}]B ( x; z / 2 )\n &= z^9 - 8 z^7 + 21 z^5 - 20 z^3 + 5 z\n \\\\\n &= z ( z^2 - z - 1 )( z^2 + z - 1 )( z^4 - 5 z^2 + 5 )\n\\end{align*}\n$$\n\nThis expression also contains the polynomials for $2\\cos( \\pi / 5 )$.\nThis is because a regular decagon would contain two disjoint regular pentagons,\n produced by connecting every other vertex.\n\n![\n  \n](./decagon_divisible.png)\n\nThe polynomial which actually corresponds to $2\\cos( \\pi / 10 )$ is the quartic,\n which seems to suggest that it will require a fourth root and somehow decagons are not constructible.\nHowever, it can be solved by completing the square...\n\n$$\n\\begin{align*}\n z^4 - 5z^2 &= -5\n \\\\\n z^4 - 5z^2 + (5/2)^2 &= -5 + (5/2)^2\n \\\\\n ( z^2 - 5/2)^2 &= {25 - 20 \\over 4}\n \\\\\n ( z^2 - 5/2) &= {\\sqrt 5 \\over 2}\n \\\\\n z^2 &= {5 \\over 2} + {\\sqrt 5 \\over 2}\n \\\\\n z &= \\sqrt{ {5 + \\sqrt 5 \\over 2} }\n\\end{align*}\n$$\n\n...and we can breathe a 
sigh of relief.\n\n\nThe Triangle behind Regular Polygons\n------------------------------------\n\nPreferring *z* to be halved in $B(x; z/2)$ makes something else more evident.\nObserve these four rows of the Chebyshev polynomials\n\n::: {#273cfec0 .cell .plain execution_count=4}\n\n::: {.cell-output .cell-output-display .cell-output-markdown execution_count=3}\n*n* $[x^n]B(x; z / 2)$ *k* $[z^{k}][x^n]B(x; z / 2)$\n----- ------------------------------- ----- ---------------------------\n4 $z^{3} - 2 z$ 3 1\n5 $z^{4} - 3 z^{2} + 1$ 2 -3\n6 $z^{5} - 4 z^{3} + 3 z$ 1 3\n7 $z^{6} - 5 z^{4} + 6 z^{2} - 1$ 0 -1\n:::\n:::\n\n\nThe last column looks like an alternating row of Pascal's triangle\n (namely, ${n - \\lfloor {k / 2} \\rfloor - 1 \\choose k}(-1)^k$).\nThis resemblance can be made more apparent by listing the coefficients of the polynomials in a table.\n\n::: {#92d420a6 .cell .plain execution_count=5}\n\n::: {.cell-output .cell-output-display .cell-output-markdown execution_count=4}\n n $z^9$ $z^8$ $z^7$ $z^6$ $z^5$ $z^4$ $z^3$ $z^2$ $z$ $1$\n--- ------------------------------ ------------------------------------ ------------------------------------- ----------------------------------- ----------------------------------- ----------------------------------- ------------------------------------ ------------------------------------- ------------------------------------- -------------------------------------\n 1 1\n 2 1 0\n 3 1 0 -1\n 4 1 0 -2 0\n 5 1 0 -3 0 1\n 6 1 0 -4 0 3 0\n 7 1 0 -5 0 6 0 -1\n 8 1 0 -6 0 10 0 -4 0\n 9 1 0 -7 0 15 0 -10 0 1\n 10 1 0 -8 0 21 0 -20 0 5 0\n:::\n:::\n\n\nThough they alternate in sign, the rows of Pascal's triangle appear along diagonals,\n which I have marked in rainbow.\nMeanwhile, alternating versions of the naturals (1, 2, 3, 4...),\n the triangular numbers (1, 3, 6, 10...),\n the tetrahedral numbers (1, 4, 10, 20...), etc.\n are present along the columns, albeit spaced out by 0's.\n\nThe relationship of the Chebyshev polynomials to the triangle is easier to see if\n the coefficient extraction of $B(x; z / 2)$ is reversed.\nIn other words, we extract *z* before extracting *x*.\n\n$$\n\\begin{align*}\n B(x; z / 2) &= {x \\over 1 - zx + x^2}\n = {x \\over 1 + x^2 - zx}\n = {x \\over 1 + x^2}\n \\cdot {1 \\over {1 + x^2 \\over 1 + x^2} - z{x \\over 1 + x^2}}\n \\\\[10pt]\n [z^n]B(x; z / 2) &= {x \\over 1 + x^2} [z^n] {1 \\over 1 - z{x \\over 1 + x^2}}\n = {x \\over 1 + x^2} \\left( {x \\over 1 + x^2} \\right)^n\n \\\\\n &= \\left( {x \\over 1 + x^2} \\right)^{n+1}\n = x^{n+1} (1 + x^2)^{-n - 1}\n \\\\\n &= x^{n+1} \\sum_{k=0}^\\infty {-n - 1 \\choose k}(x^2)^k\n \\quad \\text{Binomial theorem}\n\\end{align*}\n$$\n\nWhile the use of the binomial theorem is more than enough to justify\n the appearance of Pascal's triangle (along with explaining the 0's),\n I'll simplify further to explicitly show the alternating signs.\n\n$$\n\\begin{align*}\n {(-n - 1)_k} &= (-n - 1)(-n - 2) \\cdots (-n - k)\n \\\\\n &= (-1)^k (n + k)(n + k - 1) \\cdots (n + 1)\n \\\\\n &= (-1)^k (n + k)_k\n \\\\\n \\implies {-n - 1 \\choose k}\n &= {n + k \\choose k}(-1)^k\n \\\\[10pt]\n [z^n]B(x; z / 2)\n &= x^{n+1} \\sum_{k=0}^\\infty {n + k \\choose k} (-1)^k x^{2k}\n\\end{align*}\n$$\n\nSquinting hard enough, the binomial coefficient is similar to the earlier\n which gave the third row of Pascal's triangle.\nIf k is fixed, then this expression actually generates the antidiagonal entries\n of the coefficient table, which are the columns with uniform sign.\nThe alternation instead occurs 
between antidiagonals (one is all positive,\n the next is 0's, the next is all negative, etc.).\nThe initial $x^{n+1}$ lags these sequences so that they reproduce the triangle.\n\n\n### Imagined Transmutation\n\nThe generating function of the Chebyshev polynomials resembles other two term recurrences.\nFor example, the Fibonacci numbers have generating function\n\n$$\n\\sum_{n = 0}^\\infty \\text{Fib}_n x^n = {1 \\over 1 - x - x^2}\n$$\n\nThis resemblance can be made explicit with a simple algebraic manipulation.\n\n$$\n\\begin{align*}\n B(ix; -iz / 2)\n &= {1 \\over 1 -\\ (-i z)(ix) + (ix)^2}\n = {1 \\over 1 -\\ (-i^2) z x + (i^2)(x^2)}\n \\\\\n &= {1 \\over 1 -\\ z x -\\ x^2}\n\\end{align*}\n$$\n\nIf $z = 1$, these two generating functions are equal.\nThe same can be said for $z = 2$ with the generating function of the Pell numbers,\n and so on for higher recurrences (corresponding to metallic means) for higher integral *z*.\n\nIn terms of the Chebyshev polynomials, this series manipulation removes the alternation in\n the coefficients of $U_n$, restoring Pascal's triangle to its nonalternating form.\nRelated to the previous point, it is possible to find the Fibonacci numbers (Pell numbers, etc.)\n in Pascal's triangle, which you can read more about\n [here](http://users.dimi.uniud.it/~giacomo.dellariccia/Glossary/Pascal/Koshy2011.pdf).\n\n\nManipulating the Series\n-----------------------\n\nLook back to the table of $U_{n - 1}(z / 2)$ (@tbl-chebyshevu).\nWhen I brought up $U_{10 - 1}(z / 2)$ and decagons, I pointed out their relationship to pentagons\n as an explanation for why $U_{5 -\\ 1}(z / 2)$ appears as a factor.\nConveniently, $U_{2 -\\ 1}(z / 2) = z$ is also a factor, and 2 is likewise a factor of 10.\n\nThis pattern is present throughout the table; $n = 6$ contains factors for\n $n = 2 \\text{ and } 3$ and the prime numbers have no smaller factors.\nIf this observation is legitimate, call the newest term $f_n(z)$\n and denote $p_n(z) = U_{n -\\ 1}( z / 2 )$.\n\n\n### Factorization Attempts\n\nThe relationship between $p_n$ and the intermediate $f_d$, where *d* is a divisor of *n*,\n can be made explicit by a [Möbius inversion](https://en.wikipedia.org/wiki/M%C3%B6bius_inversion_formula).\n\n$$\n\\begin{align*}\n p_n(z) &= \\prod_{d|n} f_n(z)\n \\\\\n \\log( p_n(z) )\n &= \\log \\left( \\prod_{d|n} f_d(z) \\right)\n = \\sum_{d|n} \\log( f_d(z) )\n \\\\\n \\log( f_n(z) ) &= \\sum_{d|n} { \\mu \\left({n \\over d} \\right)}\n \\log( p_d(z) )\n \\\\\n f_n(z) &= \\prod_{d|n} p_d(z)^{ \\mu (n / d) }\n \\\\[10pt]\n f_6(z) = g_6(z)\n &= p_6(z)^{\\mu(1)}\n p_3(z)^{\\mu(2)}\n p_2(z)^{\\mu(3)}\n \\\\\n &= {p_6(z) \\over p_3(z) p_2(z)}\n\\end{align*}\n$$\n\nUnfortunately, it's difficult to apply this technique across our whole series.\nMöbius inversion over series typically uses more advanced generating functions such as\n [Dirichlet series](https://en.wikipedia.org/wiki/Dirichlet_series#Formal_Dirichlet_series)\n or [Lambert series](https://en.wikipedia.org/wiki/Lambert_series).\nHowever, naively reaching for these fails for two reasons:\n\n- We built our series of polynomials on a recurrence relation, and these series\n are opaque to such manipulations.\n- To do a proper Möbius inversion, we need these kinds of series over the *logarithm*\n of each polynomial (*B* is a series over the polynomials themselves).\n\nIgnoring these (and if you're in the mood for awful-looking math) you may note\n the Lambert equivalence[^2]:\n\n[^2]:\n This equivalence applies to other polynomial series 
obeying the same factorization rule\n such as the [cyclotomic polynomials](https://en.wikipedia.org/wiki/Cyclotomic_polynomial).\n\n$$\n\\begin{align*}\n \\log( p_n(z) )\n &= \\sum_{d|n} \\log( f_d(z) )\n \\\\\n \\sum_{n = 1}^\\infty \\log( p_n ) x^n\n &= \\sum_{n = 1}^\\infty \\sum_{d|n} \\log( f_d ) x^n\n \\\\\n &= \\sum_{k = 1}^\\infty \\sum_{m = 1}^\\infty \\log( f_m ) x^{m k}\n \\\\\n &= \\sum_{m = 1}^\\infty \\log( f_m ) \\sum_{k = 1}^\\infty (x^m)^k\n \\\\\n &= \\sum_{m = 1}^\\infty \\log( f_m ) {x^m \\over 1 - x^m}\n\\end{align*}\n$$\n\nEither way, the number-theoretic properties of this sequence are difficult to ascertain\n without advanced techniques.\nIf research has been done, it is not easily available in the OEIS.\n\n\n### Total Degrees\n\nIt can be also be observed that the new term is symmetric ($f(z) = f(-z)$), and is therefore\n either irreducible or the product of polynomial and its reflection (potentially negated).\nFor example,\n\n$$\np_9(z) = \\left\\{\n\\begin{matrix}\n (z - 1)(z + 1)\n & \\cdot\n & (z^3 - 3z - 1)(z^3 - 3z + 1)\n \\\\\n \\shortparallel && \\shortparallel\n \\\\\n f_3(z)\n & \\cdot\n & f_9(z)\n \\\\\n \\shortparallel && \\shortparallel\n \\\\\n g_3(z) \\cdot g_3(-z)\n & \\cdot\n & g_9(z) \\cdot -g_9(-z)\n\\end{matrix}\n\\right.\n$$\n\nThese factor polynomials $g_n$ are the minimal polynomials of $2\\cos( \\pi / n )$.\n\nMultiplying these minimal polynomials by their reflection can be observed in the Chebyshev polynomials\n for $n = 3, 5, 7, 9$, strongly implying that it occurs on the odd terms.\nAssuming this is true, we have\n\n$$\nf_n(z) = \\begin{cases}\n g_n(z) & \\text{$n$ is even}\n \\\\\n g_n(z)g_n(-z)\n & \\text{$n$ is odd and ${\\deg(f_n) \\over 2}$ is even}\n \\\\\n -g_n(z)g_n(-z)\n & \\text{$n$ is odd and ${\\deg(f_n) \\over 2}$ is odd}\n\\end{cases}\n$$\n\nWithout resorting to any advanced techniques, the degrees of $f_n$ are\n not too difficult to work out.\nThe degree of $p_n(z)$ is $n -\\ 1$, which is also the degree of $f_n(z)$ if *n* is prime.\nIf *n* is composite, then the degree of $f_n(z)$ is $n -\\ 1$ minus the degrees\n of the divisors of $n -\\ 1$.\nThis leaves behind how many numbers less than *n* are coprime to *n*.\nTherefore $\\deg(f_n) = \\phi(n)$, the\n [Euler totient function](https://en.wikipedia.org/wiki/Euler_totient_function) of the index.\n\nThe totient function can be used to examine the parity of *n*.\nIf *n* is odd, it is coprime to 2 and all even numbers.\nThe introduced factor of 2 to 2*n* removes the evens from the totient, but this is compensated by\n the addition of the odd multiples of old numbers coprime to *n* and new primes.\nThis means that $\\phi(2n) = \\phi(n)$ for odd *n* (other than 1).\n\nThe same argument can be used for even *n*: there are as many odd numbers from 0 to *n* as there are\n from *n* to 2*n*, and there are an equal number of numbers coprime to 2*n* in either interval.\nTherefore, $\\phi(2n) = 2\\phi(n)$ for even *n*.\n\nThis collapses all cases of the conditional factorization of $f_n$ into one,\n and the degrees of $g_n$ are\n\n$$\n\\begin{align*}\n \\deg( g_n(z) )\n &= \\begin{cases}\n \\deg( f_n(z) )\n = \\phi(n)\n & n \\text{ is even} & \\implies \\phi(n) = \\phi(2n) / 2\n \\\\\n \\deg( f_n(z) ) / 2\n = \\phi(n) / 2\n & n \\text{ is odd} & \\implies \\phi(n) / 2 = \\phi(2n) / 2\n \\end{cases}\n \\\\\n &= \\varphi(2n) / 2\n\\end{align*}\n$$\n\nThough they were present in the earlier Chebyshev table,\n the $g_n$ themselves are presented again, along with the expression for their 
degree\n\n::: {#835a5369 .cell .plain execution_count=6}\n\n::: {.cell-output .cell-output-display .cell-output-markdown execution_count=5}\nn $\\varphi(2n)/2$ $g_n(z)$ Coefficient list, rising powers\n--- --------------------------------------- ------------------------- ---------------------------------------\n2 1 $z$ [0, 1]\n3 1 $z - 1$ [-1, 1]\n4 2 $z^{2} - 2$ [-2, 0, 1]\n5 2 $z^{2} - z - 1$ [-1, -1, 1]\n6 2 $z^{2} - 3$ [-3, 0, 1]\n7 3 $z^{3} - z^{2} - 2 z + 1$ [1, -2, -1, 1]\n8 4 $z^{4} - 4 z^{2} + 2$ [2, 0, -4, 0, 1]\n9 3 $z^{3} - 3 z - 1$ [-1, -3, 0, 1]\n9 3 $z^{3} - 3 z + 1$ [1, -3, 0, 1]\n10 4 $z^{4} - 5 z^{2} + 5$ [5, 0, -5, 0, 1]\n- [OEIS A055034](http://oeis.org/A055034) - [OEIS A187360](http://oeis.org/A187360)\n:::\n:::\n\n\nClosing\n-------\n\nMy initial jumping off point for writing this article was completely different.\nHowever, in the process of writing, its share of the article shrank and shrank until its\n introduction was only vaguely related to what preceded it.\nBut alas, the introduction via geometric constructions flows better coming off my\n [post about the Platonic solids](/posts/misc/platonic-volume).\nAlso, it reads better if I rely less on \"if you search for this sequence of numbers\"\n and more on how to interpret the definition.\n\nConsider reading [the follow-up](../2) to this post if you're interested in another way\n one can obtain the Chebyshev polynomials.\n\nDiagrams created with GeoGebra.\n\n\n\n", + "markdown": "---\ntitle: \"Generating Polynomials, Part 1: Regular Constructibility\"\ndescription: |\n What kinds of regular polygons are constructible with compass and straightedge?\nformat:\n html:\n html-math-method: katex\ndate: \"2021-08-18\"\ndate-modified: \"2025-06-17\"\ncategories:\n - geometry\n - generating functions\n - algebra\n - python\n---\n\n\n\n\n\n[Recently](/posts/math/misc/platonic-volume), I used coordinate-free geometry to derive\n the volumes of the Platonic solids, a problem which was very accessible to the ancient Greeks.\nOn the other hand, they found certain problems regarding which figures can be constructed via\n compass and straightedge to be very difficult. 
For example, they struggled with problems\n like [doubling the cube](https://en.wikipedia.org/wiki/Doubling_the_cube)\n or [squaring the circle](https://en.wikipedia.org/wiki/Squaring_the_circle),\n which are known (through circa 19th century mathematics) to be impossible.\nHowever, before even extending planar geometry by a third dimension or\n calculating the areas of circles, a simpler problem becomes apparent.\nNamely, what kinds of regular polygons are constructible?\n\n\nRegular Geometry and a Complex Series\n-------------------------------------\n\nWhen constructing a regular polygon, one wants a ratio between the length of a edge\n and the distance from a vertex to the center of the figure.\n\n![\n Regular triangle, square, and pentagons inscribed in a unit circle.\n Note the right triangle formed by the apothem, half of an edge, and circumradius.\n](./central_angle_figures.png){.wide}\n\nIn a convex polygon, the total central angle is always one full turn, or 2π radians.\nThe central angle of a regular *n*-gon is ${2\\pi \\over n}$ radians,\n and the green angle above (which we'll call *θ*) is half of that.\nThis means that the ratio we're looking for is $\\sin(\\theta) = \\sin(\\pi / n)$.\nWe can multiply by *n* inside the function on both sides to give\n $\\sin(n\\theta) = \\sin(\\pi) = 0$.\nTherefore, constructing a polygon is actually equivalent to solving this equation,\n and we can rephrase the question as how to express $\\sin(n\\theta)$ (and $\\cos(n\\theta)$).\n\n\n### Complex Recursion\n\nThanks to [Euler's formula](https://en.wikipedia.org/wiki/Euler%27s_formula)\n and [de Moivre's formula](https://en.wikipedia.org/wiki/De_Moivre%27s_formula),\n the expressions we're looking for can be phrased in terms of the complex exponential.\n\n$$\n\\begin{align*}\n e^{i\\theta}\n &= \\text{cis}(\\theta) = \\cos(\\theta) + i\\sin(\\theta)\n & \\text{ Euler's formula}\n \\\\\n \\text{cis}(n \\theta) = e^{i(n\\theta)}\n &= e^{(i\\theta)n} = {(e^{i\\theta})}^n = \\text{cis}(\\theta)^n\n \\\\\n \\cos(n \\theta) + i\\sin(n \\theta)\n &= (\\cos(\\theta) + i\\sin(\\theta))^n\n & \\text{ de Moivre's formula}\n\\end{align*}\n$$\n\nDe Moivre's formula for $n = 2$ gives\n\n$$\n\\begin{align*}\n \\text{cis}(\\theta)^2\n &= (\\text{c} + i\\text{s})^2\n \\\\\n &= \\text{c}^2 + 2i\\text{cs} - \\text{s}^2 + (0 = \\text{c}^2 + \\text{s}^2 - 1)\n \\\\\n &= 2\\text{c}^2 + 2i\\text{cs} - 1\n \\\\\n &= 2\\text{c}(\\text{c} + i\\text{s}) - 1\n \\\\\n &= 2\\cos(\\theta)\\text{cis}(\\theta) - 1\n\\end{align*}\n$$\n\nThis can easily be massaged into a recurrence relation.\n\n$$\n\\begin{align*}\n \\text{cis}(\\theta)^2\n &= 2\\cos(\\theta)\\text{cis}(\\theta) - 1\n \\\\\n \\text{cis}(\\theta)^{n+2}\n &= 2\\cos(\\theta)\\text{cis}(\\theta)^{n+1} - \\text{cis}(\\theta)^n\n \\\\\n \\text{cis}((n+2)\\theta)\n &= 2\\cos(\\theta)\\text{cis}((n+1)\\theta) - \\text{cis}(n\\theta)\n\\end{align*}\n$$\n\nRecurrence relations like this one are powerful.\nThrough some fairly straightforward summatory manipulations,\n the sequence can be interpreted as the coefficients in a Taylor series,\n giving a [generating function](https://en.wikipedia.org/wiki/Generating_function).\nCall this function *F*. 
Then,\n\n$$\n\\begin{align*}\n \\sum_{n=0}^\\infty \\text{cis}((n+2)\\theta)x^n\n &= 2\\cos(\\theta) \\sum_{n=0}^\\infty \\text{cis}((n+1)\\theta) x^n\n - \\sum_{n=0}^\\infty \\text{cis}(n\\theta) x^n\n \\\\\n {F(x; \\text{cis}(\\theta)) - 1 - x\\text{cis}(\\theta) \\over x^2}\n &= 2\\cos(\\theta) {F(x; \\text{cis}(\\theta)) - 1 \\over x}\n - F(x; \\text{cis}(\\theta))\n \\\\[10pt]\n F - 1 - x\\text{cis}(\\theta)\n &= 2\\cos(\\theta) x (F - 1)\n - x^2 F\n \\\\\n F - 2\\cos(\\theta) x F + x^2 F\n &= 1 + x(\\text{cis}(\\theta) - 2\\cos(\\theta))\n \\\\[10pt]\n F(x; \\text{cis}(\\theta))\n &= {1 + x(\\text{cis}(\\theta) - 2\\cos(\\theta)) \\over\n 1 - 2\\cos(\\theta)x + x^2}\n\\end{align*}\n$$\n\nSince $\\text{cis}$ is a complex function, we can separate *F* into real and imaginary parts.\nConveniently, these correspond to $\\cos(n\\theta)$ and $\\sin(n\\theta)$, respectively.\n\n$$\n\\begin{align*}\n \\Re[ F(x; \\text{cis}(\\theta)) ]\n &= {1 + x(\\cos(\\theta) - 2\\cos(\\theta)) \\over 1 - 2\\cos(\\theta)x + x^2}\n \\\\\n &= {1 - x\\cos(\\theta) \\over 1 - 2\\cos(\\theta)x + x^2} = A(x; \\cos(\\theta))\n \\\\\n \\Im[ F(x; \\text{cis}(\\theta)) ]\n &= {x \\sin(\\theta) \\over 1 - 2\\cos(\\theta)x + x^2} = B(x; \\cos(\\theta))\\sin(\\theta)\n\\end{align*}\n$$\n\nIn this form, it becomes obvious that the even though the generating function *F* was originally\n parametrized by $\\text{cis}(\\theta)$, *A* and *B* are parametrized only by $\\cos(\\theta)$.\nExtracting the coefficients of *x* yields an expression for $\\cos(n\\theta)$ and $\\sin(n\\theta)$\n in terms of $\\cos(\\theta)$ (and in the latter case, a common factor of $\\sin(\\theta)$).\n\nIf $\\cos(\\theta)$ in *A* and *B* is replaced with the parameter *z*, then all trigonometric functions\n are removed from the equation, and we are left with only polynomials[^1].\nThese polynomials are [*Chebyshev polynomials*](https://en.wikipedia.org/wiki/Chebyshev_polynomial)\n *of the first (A) and second (B) kind*.\nIn actuality, the polynomials of the second kind are typically offset by 1\n (the x in the numerator of *B* is omitted).\nHowever, retaining this term makes indexing consistent between *A* and *B*\n (and will make things clearer later).\n\n[^1]:\n This can actually be observed as early as the recurrence relation.\n\n $$\n \\begin{align*}\n \\text{cis}(\\theta)^{n+2}\n &= 2\\cos(\\theta)\\text{cis}(\\theta)^{n+1} - \\text{cis}(\\theta)^n\n \\\\\n a_{n+2}\n &= 2 z a_{n+1} - a_n\n \\\\\n \\Re[ a_0 ]\n &= 1,~~ \\Im[ a_0 ] = 0\n \\\\\n \\Re[ a_1 ]\n &= z,~~ \\Im[ a_1 ] = 1 \\cdot \\sin(\\theta)\n \\end{align*}\n $$\n\n\nWe were primarily interested in $\\sin(n\\theta)$, so let's tabulate\n the first few polynomials of the second kind (at $z / 2$).\n\n::: {#tbl-chebyshevu .cell .plain tbl-cap='[OEIS A049310](http://oeis.org/A049310)' execution_count=3}\n\n::: {.cell-output .cell-output-display .cell-output-markdown execution_count=2}\n*n* $[x^n]B(x; z / 2) = U_{n - 1}(z / 2)$ Factored\n----- --------------------------------------------- -------------------------------------------------------------------------------------------------\n0 $0$ $0$\n1 $1$ $1$\n2 $z$ $z$\n3 $z^{2} - 1$ $\\left(z - 1\\right) \\left(z + 1\\right)$\n4 $z^{3} - 2 z$ $z \\left(z^{2} - 2\\right)$\n5 $z^{4} - 3 z^{2} + 1$ $\\left(z^{2} - z - 1\\right) \\left(z^{2} + z - 1\\right)$\n6 $z^{5} - 4 z^{3} + 3 z$ $z \\left(z - 1\\right) \\left(z + 1\\right) \\left(z^{2} - 3\\right)$\n7 $z^{6} - 5 z^{4} + 6 z^{2} - 1$ $\\left(z^{3} - z^{2} - 2 z + 1\\right) \\left(z^{3} + 
z^{2} - 2 z - 1\\right)$\n8 $z^{7} - 6 z^{5} + 10 z^{3} - 4 z$ $z \\left(z^{2} - 2\\right) \\left(z^{4} - 4 z^{2} + 2\\right)$\n9 $z^{8} - 7 z^{6} + 15 z^{4} - 10 z^{2} + 1$ $\\left(z - 1\\right) \\left(z + 1\\right) \\left(z^{3} - 3 z - 1\\right) \\left(z^{3} - 3 z + 1\\right)$\n10 $z^{9} - 8 z^{7} + 21 z^{5} - 20 z^{3} + 5 z$ $z \\left(z^{2} - z - 1\\right) \\left(z^{2} + z - 1\\right) \\left(z^{4} - 5 z^{2} + 5\\right)$\n:::\n:::\n\n\nEvaluating the polynomials at $z / 2$ cancels the 2 in the denominator (and recurrence),\n making these expressions much simpler.\nThis evaluation has an interpretation in terms of the previous diagram --\n recall we used *half* the length of a side as a leg of the right triangle.\nFor a unit circumradius, the side length itself is then $2\\sin( {\\pi / n} )$.\nTo compensate for this doubling, the Chebyshev polynomial must be evaluated at half its normal argument.\n\n\n### Back on the Plane\n\nThe constructibility criterion is deeply connected to the Chebyshev polynomials.\nIn compass and straightedge constructions, one only has access to linear forms (lines)\n and quadratic forms (circles).\nThis means that a figure is constructible if and only if the root can be expressed using\n normal arithmetic (which is linear) and square roots (which are quadratic).\n\n\n#### Pentagons\n\nLet's look at a regular pentagon.\nThe relevant polynomial is\n\n$$\n[x^5]B ( x; z / 2 )\n = z^4 - 3z^2 + 1\n = (z^2 - z - 1) (z^2 + z - 1)\n$$\n\nAccording to how we derived this series, when $z = 2\\cos(\\theta)$, the roots of this polynomial\n correspond to when $\\sin(5\\theta) / \\sin(\\theta) = 0$.\nThis relation itself is true when $\\theta = \\pi / 5$, since $\\sin(5 \\pi / 5) = 0$.\n\nOne of the factors must therefore be the minimal polynomial of $2\\cos(\\pi / 5 )$.\nThe former happens to be correct correct, since $2\\cos( \\pi / 5 ) = \\varphi$, the golden ratio.\nNote that the second factor is the first evaluated at -*z*.\n\n\n#### Heptagons\n\nAn example of where constructability fails is for $2\\cos( \\pi / 7 )$.\n\n$$\n\\begin{align*}\n [x^7]B ( x; z / 2 )\n &= z^6 - 5 z^4 + 6 z^2 - 1\n \\\\\n &= ( z^3 - z^2 - 2 z + 1 ) ( z^3 + z^2 - 2 z - 1 )\n\\end{align*}\n$$\n\nWhichever is the minimal polynomial (the former), it is a cubic, and constructing\n a regular heptagon is equivalent to solving it for *z*.\nBut there are no (nondegenerate) cubics that one can produce via compass and straightedge,\n and all constructions necessarily fail.\n\n\n#### Decagons\n\nOne might think the same of $2\\cos(\\pi /10 )$\n\n$$\n\\begin{align*}\n [x^{10}]B ( x; z / 2 )\n &= z^9 - 8 z^7 + 21 z^5 - 20 z^3 + 5 z\n \\\\\n &= z ( z^2 - z - 1 )( z^2 + z - 1 )( z^4 - 5 z^2 + 5 )\n\\end{align*}\n$$\n\nThis expression also contains the polynomials for $2\\cos( \\pi / 5 )$.\nThis is because a regular decagon would contain two disjoint regular pentagons,\n produced by connecting every other vertex.\n\n![\n  \n](./decagon_divisible.png)\n\nThe polynomial which actually corresponds to $2\\cos( \\pi / 10 )$ is the quartic,\n which seems to suggest that it will require a fourth root and somehow decagons are not constructible.\nHowever, it can be solved by completing the square...\n\n$$\n\\begin{align*}\n z^4 - 5z^2 &= -5\n \\\\\n z^4 - 5z^2 + (5/2)^2 &= -5 + (5/2)^2\n \\\\\n ( z^2 - 5/2)^2 &= {25 - 20 \\over 4}\n \\\\\n ( z^2 - 5/2) &= {\\sqrt 5 \\over 2}\n \\\\\n z^2 &= {5 \\over 2} + {\\sqrt 5 \\over 2}\n \\\\\n z &= \\sqrt{ {5 + \\sqrt 5 \\over 2} }\n\\end{align*}\n$$\n\n...and we can breathe a 
sigh of relief.\n\n\nThe Triangle behind Regular Polygons\n------------------------------------\n\nPreferring *z* to be halved in $B(x; z/2)$ makes something else more evident.\nObserve these four rows of the Chebyshev polynomials\n\n::: {#fa18ec11 .cell .plain execution_count=4}\n\n::: {.cell-output .cell-output-display .cell-output-markdown execution_count=3}\n*n* $[x^n]B(x; z / 2)$ *k* $[z^{k}][x^n]B(x; z / 2)$\n----- ------------------------------- ----- ---------------------------\n4 $z^{3} - 2 z$ 3 1\n5 $z^{4} - 3 z^{2} + 1$ 2 -3\n6 $z^{5} - 4 z^{3} + 3 z$ 1 3\n7 $z^{6} - 5 z^{4} + 6 z^{2} - 1$ 0 -1\n:::\n:::\n\n\nThe last column looks like an alternating row of Pascal's triangle\n (namely, ${n - \\lfloor {k / 2} \\rfloor - 1 \\choose k}(-1)^k$).\nThis resemblance can be made more apparent by listing the coefficients of the polynomials in a table.\n\n::: {#cc29c0f1 .cell .plain execution_count=5}\n\n::: {.cell-output .cell-output-display .cell-output-markdown execution_count=4}\n n $z^9$ $z^8$ $z^7$ $z^6$ $z^5$ $z^4$ $z^3$ $z^2$ $z$ $1$\n--- ------------------------------ ------------------------------------ ------------------------------------- ----------------------------------- ----------------------------------- ----------------------------------- ------------------------------------ ------------------------------------- ------------------------------------- -------------------------------------\n 1 1\n 2 1 0\n 3 1 0 -1\n 4 1 0 -2 0\n 5 1 0 -3 0 1\n 6 1 0 -4 0 3 0\n 7 1 0 -5 0 6 0 -1\n 8 1 0 -6 0 10 0 -4 0\n 9 1 0 -7 0 15 0 -10 0 1\n 10 1 0 -8 0 21 0 -20 0 5 0\n:::\n:::\n\n\nThough they alternate in sign, the rows of Pascal's triangle appear along diagonals,\n which I have marked in rainbow.\nMeanwhile, alternating versions of the naturals (1, 2, 3, 4...),\n the triangular numbers (1, 3, 6, 10...),\n the tetrahedral numbers (1, 4, 10, 20...), etc.\n are present along the columns, albeit spaced out by 0's.\n\nThe relationship of the Chebyshev polynomials to the triangle is easier to see if\n the coefficient extraction of $B(x; z / 2)$ is reversed.\nIn other words, we extract *z* before extracting *x*.\n\n$$\n\\begin{align*}\n B(x; z / 2) &= {x \\over 1 - zx + x^2}\n = {x \\over 1 + x^2 - zx}\n = {x \\over 1 + x^2}\n \\cdot {1 \\over {1 + x^2 \\over 1 + x^2} - z{x \\over 1 + x^2}}\n \\\\[10pt]\n [z^n]B(x; z / 2) &= {x \\over 1 + x^2} [z^n] {1 \\over 1 - z{x \\over 1 + x^2}}\n = {x \\over 1 + x^2} \\left( {x \\over 1 + x^2} \\right)^n\n \\\\\n &= \\left( {x \\over 1 + x^2} \\right)^{n+1}\n = x^{n+1} (1 + x^2)^{-n - 1}\n \\\\\n &= x^{n+1} \\sum_{k=0}^\\infty {-n - 1 \\choose k}(x^2)^k\n \\quad \\text{Binomial theorem}\n\\end{align*}\n$$\n\nWhile the use of the binomial theorem is more than enough to justify\n the appearance of Pascal's triangle (along with explaining the 0's),\n I'll simplify further to explicitly show the alternating signs.\n\n$$\n\\begin{align*}\n {(-n - 1)_k} &= (-n - 1)(-n - 2) \\cdots (-n - k)\n \\\\\n &= (-1)^k (n + k)(n + k - 1) \\cdots (n + 1)\n \\\\\n &= (-1)^k (n + k)_k\n \\\\\n \\implies {-n - 1 \\choose k}\n &= {n + k \\choose k}(-1)^k\n \\\\[10pt]\n [z^n]B(x; z / 2)\n &= x^{n+1} \\sum_{k=0}^\\infty {n + k \\choose k} (-1)^k x^{2k}\n\\end{align*}\n$$\n\nSquinting hard enough, the binomial coefficient is similar to the earlier\n which gave the third row of Pascal's triangle.\nIf k is fixed, then this expression actually generates the antidiagonal entries\n of the coefficient table, which are the columns with uniform sign.\nThe alternation instead occurs 
between antidiagonals (one is all positive,\n the next is 0's, the next is all negative, etc.).\nThe initial $x^{n+1}$ lags these sequences so that they reproduce the triangle.\n\n\n### Imagined Transmutation\n\nThe generating function of the Chebyshev polynomials resembles other two term recurrences.\nFor example, the Fibonacci numbers have generating function\n\n$$\n\\sum_{n = 0}^\\infty \\text{Fib}_n x^n = {1 \\over 1 - x - x^2}\n$$\n\nThis resemblance can be made explicit with a simple algebraic manipulation.\n\n$$\n\\begin{align*}\n B(ix; -iz / 2)\n &= {1 \\over 1 -\\ (-i z)(ix) + (ix)^2}\n = {1 \\over 1 -\\ (-i^2) z x + (i^2)(x^2)}\n \\\\\n &= {1 \\over 1 -\\ z x -\\ x^2}\n\\end{align*}\n$$\n\nIf $z = 1$, these two generating functions are equal.\nThe same can be said for $z = 2$ with the generating function of the Pell numbers,\n and so on for higher recurrences (corresponding to metallic means) for higher integral *z*.\n\nIn terms of the Chebyshev polynomials, this series manipulation removes the alternation in\n the coefficients of $U_n$, restoring Pascal's triangle to its nonalternating form.\nRelated to the previous point, it is possible to find the Fibonacci numbers (Pell numbers, etc.)\n in Pascal's triangle, which you can read more about\n [here](http://users.dimi.uniud.it/~giacomo.dellariccia/Glossary/Pascal/Koshy2011.pdf).\n\n\nManipulating the Series\n-----------------------\n\nLook back to the table of $U_{n - 1}(z / 2)$ (@tbl-chebyshevu).\nWhen I brought up $U_{10 - 1}(z / 2)$ and decagons, I pointed out their relationship to pentagons\n as an explanation for why $U_{5 -\\ 1}(z / 2)$ appears as a factor.\nConveniently, $U_{2 -\\ 1}(z / 2) = z$ is also a factor, and 2 is likewise a factor of 10.\n\nThis pattern is present throughout the table; $n = 6$ contains factors for\n $n = 2 \\text{ and } 3$ and the prime numbers have no smaller factors.\nIf this observation is legitimate, call the newest term $f_n(z)$\n and denote $p_n(z) = U_{n -\\ 1}( z / 2 )$.\n\n\n### Factorization Attempts\n\nThe relationship between $p_n$ and the intermediate $f_d$, where *d* is a divisor of *n*,\n can be made explicit by a [Möbius inversion](https://en.wikipedia.org/wiki/M%C3%B6bius_inversion_formula).\n\n$$\n\\begin{align*}\n p_n(z) &= \\prod_{d|n} f_n(z)\n \\\\\n \\log( p_n(z) )\n &= \\log \\left( \\prod_{d|n} f_d(z) \\right)\n = \\sum_{d|n} \\log( f_d(z) )\n \\\\\n \\log( f_n(z) ) &= \\sum_{d|n} { \\mu \\left({n \\over d} \\right)}\n \\log( p_d(z) )\n \\\\\n f_n(z) &= \\prod_{d|n} p_d(z)^{ \\mu (n / d) }\n \\\\[10pt]\n f_6(z) = g_6(z)\n &= p_6(z)^{\\mu(1)}\n p_3(z)^{\\mu(2)}\n p_2(z)^{\\mu(3)}\n \\\\\n &= {p_6(z) \\over p_3(z) p_2(z)}\n\\end{align*}\n$$\n\nUnfortunately, it's difficult to apply this technique across our whole series.\nMöbius inversion over series typically uses more advanced generating functions such as\n [Dirichlet series](https://en.wikipedia.org/wiki/Dirichlet_series#Formal_Dirichlet_series)\n or [Lambert series](https://en.wikipedia.org/wiki/Lambert_series).\nHowever, naively reaching for these fails for two reasons:\n\n- We built our series of polynomials on a recurrence relation, and these series\n are opaque to such manipulations.\n- To do a proper Möbius inversion, we need these kinds of series over the *logarithm*\n of each polynomial (*B* is a series over the polynomials themselves).\n\nIgnoring these (and if you're in the mood for awful-looking math) you may note\n the Lambert equivalence[^2]:\n\n[^2]:\n This equivalence applies to other polynomial series 
obeying the same factorization rule\n such as the [cyclotomic polynomials](https://en.wikipedia.org/wiki/Cyclotomic_polynomial).\n\n$$\n\\begin{align*}\n \\log( p_n(z) )\n &= \\sum_{d|n} \\log( f_d(z) )\n \\\\\n \\sum_{n = 1}^\\infty \\log( p_n ) x^n\n &= \\sum_{n = 1}^\\infty \\sum_{d|n} \\log( f_d ) x^n\n \\\\\n &= \\sum_{k = 1}^\\infty \\sum_{m = 1}^\\infty \\log( f_m ) x^{m k}\n \\\\\n &= \\sum_{m = 1}^\\infty \\log( f_m ) \\sum_{k = 1}^\\infty (x^m)^k\n \\\\\n &= \\sum_{m = 1}^\\infty \\log( f_m ) {x^m \\over 1 - x^m}\n\\end{align*}\n$$\n\nEither way, the number-theoretic properties of this sequence are difficult to ascertain\n without advanced techniques.\nIf research has been done, it is not easily available in the OEIS.\n\n\n### Total Degrees\n\nIt can be also be observed that the new term is symmetric ($f(z) = f(-z)$), and is therefore\n either irreducible or the product of polynomial and its reflection (potentially negated).\nFor example,\n\n$$\np_9(z) = \\left\\{\n\\begin{matrix}\n (z - 1)(z + 1)\n & \\cdot\n & (z^3 - 3z - 1)(z^3 - 3z + 1)\n \\\\\n \\shortparallel && \\shortparallel\n \\\\\n f_3(z)\n & \\cdot\n & f_9(z)\n \\\\\n \\shortparallel && \\shortparallel\n \\\\\n g_3(z) \\cdot g_3(-z)\n & \\cdot\n & g_9(z) \\cdot -g_9(-z)\n\\end{matrix}\n\\right.\n$$\n\nThese factor polynomials $g_n$ are the minimal polynomials of $2\\cos( \\pi / n )$.\n\nMultiplying these minimal polynomials by their reflection can be observed in the Chebyshev polynomials\n for $n = 3, 5, 7, 9$, strongly implying that it occurs on the odd terms.\nAssuming this is true, we have\n\n$$\nf_n(z) = \\begin{cases}\n g_n(z) & \\text{$n$ is even}\n \\\\\n g_n(z)g_n(-z)\n & \\text{$n$ is odd and ${\\deg(f_n) \\over 2}$ is even}\n \\\\\n -g_n(z)g_n(-z)\n & \\text{$n$ is odd and ${\\deg(f_n) \\over 2}$ is odd}\n\\end{cases}\n$$\n\nWithout resorting to any advanced techniques, the degrees of $f_n$ are\n not too difficult to work out.\nThe degree of $p_n(z)$ is $n -\\ 1$, which is also the degree of $f_n(z)$ if *n* is prime.\nIf *n* is composite, then the degree of $f_n(z)$ is $n -\\ 1$ minus the degrees\n of the divisors of $n -\\ 1$.\nThis leaves behind how many numbers less than *n* are coprime to *n*.\nTherefore $\\deg(f_n) = \\phi(n)$, the\n [Euler totient function](https://en.wikipedia.org/wiki/Euler_totient_function) of the index.\n\nThe totient function can be used to examine the parity of *n*.\nIf *n* is odd, it is coprime to 2 and all even numbers.\nThe introduced factor of 2 to 2*n* removes the evens from the totient, but this is compensated by\n the addition of the odd multiples of old numbers coprime to *n* and new primes.\nThis means that $\\phi(2n) = \\phi(n)$ for odd *n* (other than 1).\n\nThe same argument can be used for even *n*: there are as many odd numbers from 0 to *n* as there are\n from *n* to 2*n*, and there are an equal number of numbers coprime to 2*n* in either interval.\nTherefore, $\\phi(2n) = 2\\phi(n)$ for even *n*.\n\nThis collapses all cases of the conditional factorization of $f_n$ into one,\n and the degrees of $g_n$ are\n\n$$\n\\begin{align*}\n \\deg( g_n(z) )\n &= \\begin{cases}\n \\deg( f_n(z) )\n = \\phi(n)\n & n \\text{ is even} & \\implies \\phi(n) = \\phi(2n) / 2\n \\\\\n \\deg( f_n(z) ) / 2\n = \\phi(n) / 2\n & n \\text{ is odd} & \\implies \\phi(n) / 2 = \\phi(2n) / 2\n \\end{cases}\n \\\\\n &= \\varphi(2n) / 2\n\\end{align*}\n$$\n\nThough they were present in the earlier Chebyshev table,\n the $g_n$ themselves are presented again, along with the expression for their 
degree\n\n::: {#94ab708c .cell .plain execution_count=6}\n\n::: {.cell-output .cell-output-display .cell-output-markdown execution_count=5}\nn $\\varphi(2n)/2$ $g_n(z)$ Coefficient list, rising powers\n--- --------------------------------------- ------------------------- ---------------------------------------\n2 1 $z$ [0, 1]\n3 1 $z - 1$ [-1, 1]\n4 2 $z^{2} - 2$ [-2, 0, 1]\n5 2 $z^{2} - z - 1$ [-1, -1, 1]\n6 2 $z^{2} - 3$ [-3, 0, 1]\n7 3 $z^{3} - z^{2} - 2 z + 1$ [1, -2, -1, 1]\n8 4 $z^{4} - 4 z^{2} + 2$ [2, 0, -4, 0, 1]\n9 3 $z^{3} - 3 z - 1$ [-1, -3, 0, 1]\n9 3 $z^{3} - 3 z + 1$ [1, -3, 0, 1]\n10 4 $z^{4} - 5 z^{2} + 5$ [5, 0, -5, 0, 1]\n- [OEIS A055034](http://oeis.org/A055034) - [OEIS A187360](http://oeis.org/A187360)\n:::\n:::\n\n\nClosing\n-------\n\nMy initial jumping off point for writing this article was completely different.\nHowever, in the process of writing, its share of the article shrank and shrank until its\n introduction was only vaguely related to what preceded it.\nBut alas, the introduction via geometric constructions flows better coming off my\n [post about the Platonic solids](/posts/math/misc/platonic-volume).\nAlso, it reads better if I rely less on \"if you search for this sequence of numbers\"\n and more on how to interpret the definition.\n\nConsider reading [the follow-up](../2) to this post if you're interested in another way\n one can obtain the Chebyshev polynomials.\nI have since rederived the Chebyshev polynomials without the complex exponential,\n which you can read about in [this post](/posts/math/stereo/2).\n\nDiagrams created with GeoGebra.\n\n", "supporting": [ - "index_files" + "index_files/figure-html" ], "filters": [], "includes": {} diff --git a/_freeze/posts/math/finite-field/3/index/execute-results/html.json b/_freeze/posts/math/finite-field/3/index/execute-results/html.json index cb2964b..0e35ee7 100644 --- a/_freeze/posts/math/finite-field/3/index/execute-results/html.json +++ b/_freeze/posts/math/finite-field/3/index/execute-results/html.json @@ -1,10 +1,10 @@ { - "hash": "7266b08e16375ecbf03ed24b484d745b", + "hash": "df9fef54ee2049cdaf90a0487a97b59e", "result": { "engine": "jupyter", - "markdown": "---\ntitle: \"Exploring Finite Fields, Part 3: Roll a d20\"\ndescription: |\n When we extend fields with matrices, what other structures do we encounter?\nformat:\n html:\n html-math-method: katex\ndate: \"2024-02-03\"\ndate-modified: \"2025-08-04\"\ncategories:\n - algebra\n - finite field\n - haskell\n---\n\n\n\n\n\nIn the [previous post](../2), we focused on constructing finite fields using *n*×*n* matrices.\nThese matrices came from from primitive polynomials of degree *n* over GF(*p*),\n and could be used to do explicit arithmetic over GF(*p*^*n*^).\nIn this post, we'll look at a way to apply this in describing certain groups.\n\n\nWeakening the Field\n-------------------\n\nRecall the way we defined GF(4) in the first post.\nWe took the irreducible polynomial *p*(*x*) = *x*^2^ + *x* + 1, called its root *α*,\n and created addition and multiplication tables spanning the four elements.\nAfter the second post, we can do this more cleverly by mapping *α*\n to the companion matrix *C*~*p*~ over GF(2).\n\n$$\n\\begin{gather*}\n f : \\mathbb{F_4} \\longrightarrow \\mathbb{F}_2 {}^{2 \\times 2}\n \\\\[10pt]\n 0 \\mapsto \\left(\\begin{matrix}\n 0 & 0 \\\\\n 0 & 0\n \\end{matrix}\\right)\n ~~\n 1 \\mapsto \\left(\\begin{matrix}\n 1 & 0 \\\\\n 0 & 1\n \\end{matrix}\\right) = I\n ~~\n \\alpha \\mapsto \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 1\n 
\\end{matrix}\\right) = C_p\n \\\\ \\\\\n \\textcolor{red}{\\alpha} + \\textcolor{blue}{1} = \\alpha^2 \\mapsto\n \\left(\\begin{matrix}\n 1 & 1 \\\\\n 1 & 0\n \\end{matrix}\\right) =\n \\textcolor{red} {\n \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 1\n \\end{matrix}\\right)\n }\n + \\textcolor{blue}{\n \\left(\\begin{matrix}\n 1 & 0 \\\\\n 0 & 1\n \\end{matrix}\\right)\n }\\mod 2\n\\end{gather*}\n$$\n\nIn the images of *f*, the zero matrix has determinant 0 and all other elements have determinant 1.\nTherefore, the product of any two nonzero matrices always has determinant 1,\n and a nonzero determinant means the matrix is invertible.\nPer the definition of the field, the non-zero elements form a group with respect to multiplication.\nHere, they form a cyclic group of order 3, since *C*~*p*~^3^ = *I* mod 2.\nThis is also true using symbols, since *α*^3^ = 1.\n\n\n### Other Matrices\n\nHowever, there are more 2×2 matrices over GF(2) than just these.\nThere are two possible values in four locations, so there are 24 = 16 matrices,\n or 12 more than we've identified.\n\n$$\n\\begin{array}{c|c}\n \\#\\{a_{ij} = 1\\} & \\det = 0 & \\det = 1\n \\\\ \\hline\n 1 &\n \\scriptsize\n \\left(\\begin{matrix}\n 0 & 1 \\\\\n 0 & 0\n \\end{matrix}\\right)\n ~~\n \\left(\\begin{matrix}\n 1 & 0 \\\\\n 0 & 0\n \\end{matrix}\\right)\n ~~\n \\left(\\begin{matrix}\n 0 & 0 \\\\\n 0 & 1\n \\end{matrix}\\right)\n ~~\n \\left(\\begin{matrix}\n 0 & 0 \\\\\n 1 & 0\n \\end{matrix}\\right)\n \\\\\n 2 &\n \\scriptsize\n \\left(\\begin{matrix}\n 1 & 1 \\\\\n 0 & 0\n \\end{matrix}\\right)\n ~~\n \\left(\\begin{matrix}\n 0 & 0 \\\\\n 1 & 1\n \\end{matrix}\\right)\n ~~\n \\left(\\begin{matrix}\n 0 & 1 \\\\\n 0 & 1\n \\end{matrix}\\right)\n ~~\n \\left(\\begin{matrix}\n 1 & 1 \\\\\n 0 & 0\n \\end{matrix}\\right)\n &\n \\scriptsize\n \\textcolor{red}{\n \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 0\n \\end{matrix}\\right)\n }\n \\\\\n 3 & &\n \\scriptsize\n \\textcolor{red}{\n \\left(\\begin{matrix}\n 1 & 1 \\\\\n 0 & 1\n \\end{matrix}\\right)\n }\n ~~\n \\textcolor{red}{\n \\left(\\begin{matrix}\n 1 & 0 \\\\\n 1 & 1\n \\end{matrix}\\right)\n }\n \\\\\n 4 &\n \\scriptsize\n \\left(\\begin{matrix}\n 1 & 1 \\\\\n 1 & 1\n \\end{matrix}\\right)\n\\end{array}\n$$\n\nThe matrices in the right column (in red) have determinant 1, which means they can *also* multiply\n with our field-like elements without producing a singular matrix.\nThis forms a larger group, of which our field's multiplication group is a subgroup.\nHowever, it is *not* commutative, since matrix multiplication is not commutative in general.\n\nThe group of all six matrices with nonzero determinant is called the\n [*general linear group*](https://en.wikipedia.org/wiki/General_linear_group)\n of degree 2 over GF(2), written[^1] GL(2, 2).\nWe can sort the elements into classes by their order, or the number of times we have\n to multiply them before getting to the identity matrix (mod 2):\n\n[^1]: Unfortunately, it's rather easy to confuse \"GF\" with \"GL\".\n Remember that \"F\" is for \"field\", with the former standing for \"Galois field\".\n\n$$\n\\begin{array}{}\n \\text{Order 1} & \\text{Order 2} & \\text{Order 3}\n \\\\ \\hline\n \\left(\\begin{matrix}\n 1 & 0 \\\\\n 0 & 1\n \\end{matrix}\\right)\n &\n \\begin{align*}\n \\left(\\begin{matrix}\n 1 & 1 \\\\\n 0 & 1\n \\end{matrix}\\right)\n \\\\\n \\left(\\begin{matrix}\n 1 & 0 \\\\\n 1 & 1\n \\end{matrix}\\right)\n \\\\\n \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 0\n \\end{matrix}\\right)\n \\end{align*}\n &\n 
\\begin{align*}\n \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 1\n \\end{matrix}\\right)\n \\\\\n \\left(\\begin{matrix}\n 1 & 1 \\\\\n 1 & 0\n \\end{matrix}\\right)\n \\end{align*}\n\\end{array}\n$$\n\nIf you've studied enough group theory, you know that there are two groups of order 6:\n the cyclic group of order 6, *C*~6~, and the symmetric group on three elements, *S*~3~.\nSince the former group has order-6 elements, but none of these matrices are of order 6,\n the matrix group must be isomorphic to the latter.\nSince the group is small, it's not too difficult to construct an isomorphism between the two.\nWriting the elements of *S*~3~ in [cycle notation](/posts/permutations/1/), we have:\n\n$$\n\\begin{gather*}\n e \\mapsto \\left(\\begin{matrix}\n 1 & 0 \\\\\n 0 & 1\n \\end{matrix}\\right)\n \\\\ \\\\\n (1 ~ 2) \\mapsto \\left(\\begin{matrix}\n 1 & 1 \\\\\n 0 & 1\n \\end{matrix}\\right)\n \\qquad\n (1 ~ 3) \\mapsto \\left(\\begin{matrix}\n 1 & 0 \\\\\n 1 & 1\n \\end{matrix}\\right)\n \\qquad\n (2 ~ 3) \\mapsto \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 0\n \\end{matrix}\\right)\n \\\\ \\\\\n (1 ~ 2 ~ 3) \\mapsto \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 1\n \\end{matrix}\\right)\n \\qquad\n (3 ~ 2 ~ 1) \\mapsto \\left(\\begin{matrix}\n 1 & 1 \\\\\n 1 & 0\n \\end{matrix}\\right)\n\\end{gather*}\n$$\n\n\nBigger Linear Groups\n--------------------\n\nOf course, there is nothing special about GF(2) in this definition.\nFor any field *K*, the general linear group GL(*n*, *K*) is composed of invertible\n *n*×*n* matrices under matrix multiplication.\n\nFor fields other than GF(2), a matrix can have a determinant other than 1.\nSince the determinant is multiplicative, the product of two determinant 1 matrices also has determinant 1.\nTherefore, the general linear group has a subgroup,\n the [*special linear group*](https://en.wikipedia.org/wiki/Special_linear_group)\n SL(*n*, *K*), consisting of these matrices.\n\n\n
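When *K* is a finite field GF(*q*), these groups are finite, and their orders follow from a standard counting argument: the first row of an invertible matrix can be any of the $q^n - 1$ nonzero vectors, the second row any of the $q^n - q$ vectors outside the span of the first, and so on. Since the determinant maps GL(*n*, *q*) onto the $q - 1$ nonzero scalars, SL(*n*, *q*) is smaller by exactly that factor.

$$
\begin{align*}
 |\text{GL}(n, q)| &= (q^n - 1)(q^n - q) \cdots (q^n - q^{n - 1})
 \\
 |\text{SL}(n, q)| &= {|\text{GL}(n, q)| \over q - 1}
 \\[10pt]
 |\text{GL}(2, 2)| &= (4 - 1)(4 - 2) = 6
 \\
 |\text{GL}(2, 5)| &= (25 - 1)(25 - 5) = 480,
 \qquad
 |\text{SL}(2, 5)| = {480 \over 4} = 120
\end{align*}
$$

The first count agrees with the six invertible matrices tabulated above.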
\n\nHaskell implementation of GL and SL for prime fields\n\nThis implementation will be based on the `Matrix` type from the first post.\nAssume we have already defined matrix multiplication and addition.\n\n::: {#d70bf3ee .cell execution_count=3}\n``` {.haskell .cell-code}\nimport Data.Array (listArray, bounds, elems)\nimport Data.List (unfoldr)\n\n-- Partition a list into lists of length n\nreshape :: Int -> [a] -> [[a]]\nreshape n = unfoldr (reshape' n) where\n reshape' n x = if null x then Nothing else Just $ splitAt n x\n\n-- Convert list of lists to Matrix\n-- Abuses listArray working across rows, then columns\ntoMatrix :: [[a]] -> Matrix a\ntoMatrix l = Mat $ listArray ((0,0),(n-1,m-1)) $ concat l where\n m = length $ head l\n n = length l\n\n-- Convert Matrix to list of lists\nfromMatrix :: Matrix a -> [[a]]\nfromMatrix (Mat m) = let (_,(_,n)) = bounds m in reshape (n+1) $ elems m\n```\n:::\n\n\nWith helper functions out of the way, we can move on to generating all matrices (mod *n*).\nThen, we filter for matrices with nonzero determinant (in the case of GL) and determinant 1\n (in the case of SL).\n\n::: {#6249f6f4 .cell execution_count=4}\n``` {.haskell .cell-code}\nimport Control.Monad (replicateM)\n\n-- All m x m matrices (mod n)\nallMatrices :: Int -> Int -> [Matrix Int]\nallMatrices m n = map toMatrix $ replicateM m vectors where\n -- Construct all vectors mod n using base-n expansions and padding\n vectors = [pad $ coeffs $ asPoly n l | l <- [1..n^m-1]]\n -- Pad xs to length m with zero\n pad xs = xs ++ replicate (m - length xs) 0\n\n-- All matrices, but paired with their determinants\nmatsWithDets :: Int -> Int -> [(Matrix Int, Int)]\nmatsWithDets m n = map (\\x -> (x, determinant x `mod` n)) $ allMatrices m n\n\n-- Nonzero determinants\nmGL m n = map fst $ filter (\\(x,d) -> d /= 0) $ matsWithDets m n\n-- Determinant is 1\nmSL m n = map fst $ filter (\\(x,d) -> d == 1) $ matsWithDets m n\n```\n:::\n\n\n
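As a small usage sketch, assuming the `Matrix` machinery from the earlier posts (`determinant`, `asPoly`, `coeffs`) is in scope, the sizes of `mGL` and `mSL` line up with the counting formulas above. Here `glslOrders` is a throwaway helper introduced only for this check, not part of the original code:

```haskell
-- Collect the orders of GL(m, n) and SL(m, n) as produced by the filters above.
-- (Invertible matrices never contain a zero row, so the restriction to
-- nonzero rows in `allMatrices` loses nothing for these groups.)
glslOrders :: Int -> Int -> (Int, Int)
glslOrders m n = (length $ mGL m n, length $ mSL m n)

-- λ> map (glslOrders 2) [2, 3, 5]
-- [(6,6),(48,24),(480,120)]
```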
\n\n\n### Projectivity\n\nAnother important matrix group is the\n [*projective general linear group*](https://en.wikipedia.org/wiki/Projective_linear_group),\n PGL(*n*, *K*).\nIn this group, two matrices are considered equal if one is a scalar multiple of the other[^2].\nBoth this and the determinant 1 constraint can apply at the same time,\n forming the *projective special linear group*, PSL(*n*, *K*).\n\n[^2]: Equivalently, the elements *are* these equivalence classes.\n The product of two classes is the set of all possible products between the two classes,\n which is another class.\n\nFor GF(2), all of these groups are the same, since the only nonzero determinant and scalar multiple is 1.\nTherefore, it's beneficial to contrast SL and PGL with another example.\n\nLet's arbitrarily examine GL(2, 5).\nSince 4 squares to 1 (mod 5) and we're working with 2×2 matrices, the determinant is unchanged\n when a matrix is scalar-multiplied by 4.\nThese multiples are identified in PSL.\nOn the other hand, in PGL, there are classes of matrices with determinant 2 and 3, which do not square to 1.\nThese classes are exactly the ones which are \"left out\" of PSL.\n\n$$\n\\begin{matrix}\n \\boxed{ \\begin{gather*}\n \\large \\text{GL}(2, 5)\n \\\\\n \\underset{\\det = 4}{\n \\scriptsize\n \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 1\n \\end{matrix} \\right) },\n \\textcolor{red}{ \\underset{\\det = 1}{\n \\scriptsize\n \\left(\\begin{matrix}\n 0 & 2 \\\\\n 2 & 2\n \\end{matrix} \\right)\n }},\n \\underset{\\det = 2}{\n \\scriptsize\n \\left(\\begin{matrix}\n 1 & 0 \\\\\n 0 & 2\n \\end{matrix} \\right)\n },\n \\underset{\\det = 3}{\n \\scriptsize\n \\left(\\begin{matrix}\n 2 & 0 \\\\\n 0 & 4\n \\end{matrix} \\right)\n },\n ...\n \\end{gather*} }\n & \\twoheadrightarrow &\n \\boxed{ \\begin{gather*}\n \\large \\text{PGL}(2,5)\n \\\\\n \\underset{\\det = 1, ~4}{\n \\scriptsize\n \\textcolor{red}{\\left\\{\n \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 1\n \\end{matrix} \\right),\n \\left(\\begin{matrix}\n 0 & 2 \\\\\n 2 & 2\n \\end{matrix} \\right),\n ...\n \\right\\}\n }}\n \\\\\n \\underset{\\det = 2, ~ 3}{\n \\scriptsize\n \\left\\{ \\left(\\begin{matrix}\n 1 & 0 \\\\\n 0 & 2\n \\end{matrix} \\right),\n \\left(\\begin{matrix}\n 2 & 0 \\\\\n 0 & 4\n \\end{matrix} \\right),\n ...\n \\right\\}\n }\n \\\\\n ...\n \\end{gather*}\n }\n \\\\ \\\\\n \\boxed{ \\begin{gather*}\n \\large \\text{SL}(2,5)\n \\\\\n \\textcolor{red}{\n \\scriptsize\n \\left(\\begin{matrix}\n 0 & 2 \\\\\n 2 & 2\n \\end{matrix} \\right)\n },\n \\scriptsize\n \\left(\\begin{matrix}\n 0 & 3 \\\\\n 3 & 3\n \\end{matrix} \\right),\n ...\n \\end{gather*} }\n & \\twoheadrightarrow &\n \\boxed{ \\begin{gather*}\n \\large \\text{PSL}(2,5)\n \\\\\n \\textcolor{red}{ \\left\\{\n \\scriptsize\n \\left(\\begin{matrix}\n 0 & 2 \\\\\n 2 & 2\n \\end{matrix} \\right),\n \\left(\\begin{matrix}\n 0 & 3 \\\\\n 3 & 3\n \\end{matrix} \\right),\n ...\n \\right\\} }\n ...\n \\end{gather*} }\n\\end{matrix}\n$$\n\n::: {#7533e591 .cell execution_count=5}\n``` {.haskell .cell-code code-fold=\"true\" code-summary=\"Haskell implementation of PGL and PSL for prime fields\"}\nimport Data.List (nubBy)\n\n-- PGL and PSL require special equality.\n-- It's certainly possible to write a definition which makes the classes explicit, as its own new type.\n-- We could then define equality on this type through `Eq`.\n-- This is rather inefficient, though, so I'll choose to work with the representatives instead.\n\n-- Scalar-multiply a matrix (mod p)\nscalarTimes :: Int -> Int -> 
Matrix Int -> Matrix Int\nscalarTimes n k = fmap ((`mod` n) . (*k))\n\n-- Construct all scalar multiples mod n, then check if ys is any of them.\n-- This is ludicrously inefficient, and only works for fields.\nprojEq :: Int -> Matrix Int -> Matrix Int -> Bool\nprojEq n xs ys = ys `elem` [scalarTimes n k xs | k <- [1..n-1]]\n\n-- Strip out duplicates in GL and SL with projective equality\nmPGL m n = nubBy (projEq n) $ mGL m n\nmPSL m n = nubBy (projEq n) $ mSL m n\n```\n:::\n\n\n### Exceptional Isomorphisms\n\nWhen *K* is a finite field, the smaller PSLs turn out specify some interesting groups.\nWe've studied the case of PSL(2, 2) being isomorphic to *S*~3~ already, but it is also the case that:\n\n$$\n\\begin{align*}\n &\\text{PSL}(2,3) \\cong A_4 & & \\text{(order 24)}\n \\\\ \\\\\n &\\text{PSL}(2,4) \\cong \\text{PSL}(2,5) \\cong A_5 & & \\text{(order 60)}\n \\\\ \\\\\n &\\text{PSL}(2,7) \\cong \\text{PSL}(3,2) & & \\text{(order 168)}\n\\end{align*}\n$$\n\nThese relationships can be proven abstractly (and frequently are!).\nHowever, I always found myself wanting.\nFor PSL(2, 3) and *A*~4~, it's trivial to assign elements to one another by hand.\nBut *A*~5~ is getting untenable, to say nothing of PSL(2, 7).\nIn these circumstances, it's a good idea to leverage the computer.\n\n\nWarming Up: *A*~5~ and PSL(2, 5)\n--------------------------------\n\n*A*~5~, the alternating group on 5 elements, is composed of the\n [even](https://en.wikipedia.org/wiki/Parity_of_a_permutation) permutations of 5 elements.\nIt also happens to describe the rotations of an icosahedron.\nWithin the group, there are three kinds of elements:\n\n- The product of two 2-cycles, such as *a* = (1 2)(3 4)\n - On an icosahedron, this corresponds to a 180 degree rotation\n (or more precisely, 1/2 of a turn) about an edge\n- 5-cycles, such as *b* = (1 2 3 4 5)\n - This corresponds to a 72 degree rotation (1/5 of a turn)\n around a vertex\n- 3-cycles, such as *ab* = (2 4 5)\n - This corresponds to a 120 degree rotation (1/3 of a turn)\n around the center of a face\n\nIt happens to be the case that all elements of the group can be expressed\n as a product between *a* and *b* -- they generate the group.\n\n\n### Mapping to Matrices\n\nTo create a correspondence with PSL(2, 5), we need to identify permutations with matrices.\nObviously, the identity permutation goes to the identity matrix.\nThen, since *a* and *b* generate the group, we can search for two matrices which\n obey the same relations (under projective equality, since we're working in PSL).\n\nFortunately, we have a computer, so we can search for candidates rather quickly.\nFirst, let's note a matrix *B* which is cyclic of order 5 to correspond with *b*:\n\n::: {#66d77e13 .cell execution_count=6}\n``` {.haskell .cell-code code-fold=\"true\" code-summary=\"Haskell implementation of finding candidates for B\"}\n-- Repeatedly apply f to p, until the predicate z\n-- (usually equality to some quantity) becomes True.\n-- Get the length of the resulting list\norderWith :: Eq a => (a -> a -> a) -> (a -> Bool) -> a -> Int\norderWith f z p = (+1) $ length $ takeWhile (not . z) $ iterate (f p) p\n\n-- Order with respect to PSL(2, 5): using matrix multiplication (mod 5)\n-- and projective equality to the identity matrix\norderPSL25 = orderWith (\\x -> fmap (`mod` 5) . (x |*|)) (projEq 5 $ eye 2)\n\n-- Only order 5 elements of PSL(2, 5)\npsl25_order5 = filter ((==5) . orderPSL25) $ mPSL 2 5\n\nmarkdown $ (\"$$B = \" ++) $ (++ \"... 
$$\") $ intercalate \" ~,~ \" $\n take 5 $ map texifyMatrix psl25_order5\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$B = \\left( \\begin{matrix}2 & 0 \\\\ 1 & 2\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}2 & 0 \\\\ 2 & 2\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}2 & 0 \\\\ 3 & 2\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}2 & 0 \\\\ 4 & 2\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}0 & 1 \\\\ 1 & 1\\end{matrix} \\right)... $$\n:::\n:::\n\n\nArbitrarily, let's pick the last entry on this list.\nNow, we can search for order-2 elements in PSL(2, 5) whose product with *B* has order 3.\nThis matrix (*A*) matches exactly with *a* in *A*~5~.\n\n::: {#0ed9f298 .cell execution_count=7}\n``` {.haskell .cell-code code-fold=\"true\" code-summary=\"Haskell implementation using B as a generator to find candidates for A\"}\n-- Start with B as a generator\npsl25_gen_B = toMatrix [[0,2],[2,2]]\n\n-- Only order 2 elements of PSL(2, 5)\npsl25_order2 = filter ((==2) . orderPSL25) $ mPSL 2 5\n\n-- Find an order 2 element whose product with `psl25_gen_B` has order 3\npsl25_gen_A_candidates = filter ((==3) . orderPSL25 . (psl25_gen_B |*|))\n psl25_order2\n\nmarkdown $ (\"$$A = \" ++) $ (++ \"$$\") $ intercalate \" ~,~ \" $\n map texifyMatrix psl25_gen_A_candidates\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$A = \\left( \\begin{matrix}1 & 0 \\\\ 0 & 4\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}1 & 0 \\\\ 2 & 4\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}2 & 1 \\\\ 2 & 3\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}1 & 2 \\\\ 0 & 4\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}2 & 2 \\\\ 1 & 3\\end{matrix} \\right)$$\n:::\n:::\n\n\nAgain, arbitrarily, we'll pick the last entry from this list.\nLet's also peek at what the matrix *AB* looks like.\n\n::: {#4645affa .cell execution_count=8}\n``` {.haskell .cell-code code-fold=\"true\"}\npsl25_gen_AB = (`mod` 5) <$> (psl25_gen_A_candidates !! 4) |*| psl25_gen_B\n\nmarkdown $ (\"$$\" ++) $ (++ \"$$\") $ intercalate \" \\\\quad \" [\n \"(AB) = \" ++ texifyMatrix psl25_gen_AB,\n \"(AB)^3 = \" ++ texifyMatrix ((`mod` 5) <$> (psl25_gen_AB^3))\n ]\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$(AB) = \\left( \\begin{matrix}4 & 3 \\\\ 1 & 3\\end{matrix} \\right) \\quad (AB)^3 = \\left( \\begin{matrix}2 & 0 \\\\ 0 & 2\\end{matrix} \\right)$$\n:::\n:::\n\n\nWe now have a correspondence between three elements of *A*~5~ and PSL(2, 5).\nWe can \"run\" both sets of the generators until we associate all elements to one another.\nThis is most visually appealing to see as a Cayley graph[^3]:\n\n[\n ![\n Cayley graph showing an isomorphism between A5 and PSL(2, 5).
\n Order-2 elements are red, order-3 elements are green, and order-5 elements are blue.\n Purple arrows are order-5 generators, orange arrows are order-2 generators.\n ](./a5_psl25_cayley.png){.narrow}\n](./a5_psl24_cayley.png)\n\n[^3]: Different generators appear to be used for *A* and *B* due to some\n self-imposed turbulence when writing the original post.\n Under projective equality, both are the same as our choices of *A* and *B*.\n\n\nPSL(2, 4)\n---------\n\nWe could do the same for PSL(2, 4), but we can't just work modulo 4\n -- remember, the elements of GF(4) are 0, 1, *α*, and *α*^2^.\nIt follows that GL(2, 4) is composed of (invertible) matrices of those elements,\n and SL(2, 4) is composed of matrices with determinant 1.\n\n$$\n\\begin{matrix}\n \\boxed{ \\begin{gather*}\n \\large \\text{GL}(2, 4)\n \\\\\n \\textcolor{red}{ \\underset{\\det = 1}{\n \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 1\n \\end{matrix} \\right)\n }},\n \\underset{\\det = \\alpha + 1}{\n \\left(\\begin{matrix}\n 0 & \\alpha \\\\\n \\alpha & \\alpha\n \\end{matrix} \\right)\n },\n \\underset{\\det = \\alpha}{\n \\left(\\begin{matrix}\n 1 & 0 \\\\\n 0 & \\alpha\n \\end{matrix} \\right)\n },\n \\textcolor{red}{\n \\underset{\\det = 1}{\n \\left(\\begin{matrix}\n \\alpha & 0 \\\\\n 0 & \\alpha^2\n \\end{matrix} \\right)\n }},\n ...\n \\end{gather*} }\n \\\\ \\\\\n \\boxed{ \\begin{gather*}\n \\large \\text{SL}(2,4)\n \\\\\n \\textcolor{red}{\n \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 1\n \\end{matrix} \\right)\n },\n \\textcolor{red}{\n \\left(\\begin{matrix}\n \\alpha & 0 \\\\\n 0 & \\alpha^2\n \\end{matrix} \\right)\n },\n ...\n \\end{gather*} }\n\\end{matrix}\n$$\n\nScalar multiplication by *α* multiplies the determinant by *α*^2^;\n by *α*^2^ multiplies the determinant by *α*^4^ = *α*.\nThus, SL(2, 4) is also PSL(2, 4), since a scalar multiple has determinant 1.\n\nLet's start by looking at an order-5 matrix over PSL(2, 4).\nWe'll call this matrix *B*' to correspond with our order-5 generator in PSL(2, 5).\n\n$$\n\\begin{gather*}\n B' = \\left(\\begin{matrix}\n 0 & \\alpha \\\\\n \\alpha^2 & \\alpha^2\n \\end{matrix} \\right)\n \\qquad\n (B')^2 = \\left(\\begin{matrix}\n 1 & 1 \\\\\n \\alpha & \\alpha^2\n \\end{matrix}\\right)\n \\qquad\n (B')^3 = \\left(\\begin{matrix}\n \\alpha^2 & 1 \\\\\n \\alpha & 1\n \\end{matrix}\\right)\n \\\\\n (B')^4 = \\left(\\begin{matrix}\n \\alpha^2 & \\alpha \\\\\n \\alpha^2 & 0\n \\end{matrix}\\right)\n \\qquad\n (B')^5 = \\left(\\begin{matrix}\n 1 & 0 \\\\\n 0 & 1\n \\end{matrix}\\right)\n \\\\ \\\\\n \\det B' = 0\\alpha^2 - \\alpha^3 = 1\n\\end{gather*}\n$$\n\n\nWe need to be able to do three things over GL(2, 4) on a computer:\n\n- multiply matrices over GF(4),\n- compute their determinant,\n- visually distinguish between each of them, and\n- be able to systematically write down all of them\n\nIt would then follow for us to repeat what we did with with SL(2, 5).\nBut as I've said, working symbolically is hard for computers, and the methods described for prime fields\n do not work in general with prime power fields.\nFortunately, we're amply prepared to find a solution.\n\n\n### Bootstrapping Matrices\n\nRecall that the elements of GF(4) can also be written as the zero matrix, the identity matrix,\n *C*~*p*~, and *C*~*p*~^2^ (where *C*~*p*~ is the companion matrix of *p*(x)\n and again, *p*(x) = *x*^2^ + *x* + 1).\nThis means we can also write elements of GL(2, 4) as matrices *of matrices*.\nArithmetic works exactly the same as it does symbolically\n -- we just replace all 
instances of *α* in *B*' with *C*~*p*~.\n\n$$\n\\begin{gather*}\n f^* : \\mathbb{F}_4 {}^{2 \\times 2} \\rightarrow (\\mathbb{F}_2 {}^{2 \\times 2})^{2 \\times 2}\n \\\\ \\\\\n \\begin{align*}\n \\bar {B'} = f^*(B') &= \\left(\\begin{matrix}\n f(0) & f(\\alpha) \\\\\n f(\\alpha^2) & f(\\alpha^2)\n \\end{matrix} \\right) = \\left(\\begin{matrix}\n {\\bf 0} & C_p \\\\\n C_p {}^2 & C_p {}^2\n \\end{matrix} \\right) \\\\\n &= \\left(\\begin{matrix}\n \\left(\\begin{matrix} 0 & 0 \\\\ 0 & 0 \\end{matrix} \\right) &\n \\left(\\begin{matrix} 0 & 1 \\\\ 1 & 1 \\end{matrix} \\right) \\\\\n \\left(\\begin{matrix} 1 & 1 \\\\ 1 & 0 \\end{matrix} \\right) &\n \\left(\\begin{matrix} 1 & 1 \\\\ 1 & 0 \\end{matrix} \\right)\n \\end{matrix} \\right)\n \\\\ ~ \\\\\n (f^*(\\bar B'))^2 &= \\left(\\begin{matrix}\n ({\\bf 0})({\\bf 0}) + C_p {}^3 & ({\\bf 0})C_p +C_p {}^3 \\\\\n ({\\bf 0})C_p {}^2 + C_p {}^4 & C_p {}^3 + C_p {}^4\n \\end{matrix} \\right) \\\\\n &= \\left(\\begin{matrix}\n I & I \\\\\n C_p {} & C_p {}^2\n \\end{matrix} \\right) = \\left(\\begin{matrix}\n f(1) & f(1) \\\\\n f(\\alpha) & f(\\alpha^2)\n \\end{matrix} \\right) =\n f^*((B')^2)\n \\end{align*}\n\\end{gather*}\n$$\n\nMake no mistake, this is *not* a [block matrix](https://en.wikipedia.org/wiki/Block_matrix),\n at least not a typical one.\nNamely, the layering means that the determinant (which signifies its membership in SL) is another matrix:\n\n$$\n\\begin{align*}\n \\det( f^*(B') ) &= {\\bf 0} (C_p {}^2) - (C_p)(C_p {}^2)\n \\\\\n &= \\left(\\begin{matrix}\n 0 & 0 \\\\\n 0 & 0\n \\end{matrix} \\right)\n \\left(\\begin{matrix}\n 1 & 1 \\\\\n 1 & 0\n \\end{matrix} \\right)\n - \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 1\n \\end{matrix} \\right)\n \\left(\\begin{matrix}\n 1 & 1 \\\\\n 1 & 0 \\end{matrix}\n \\right)\n \\\\\n &= \\left(\\begin{matrix}\n 1 & 0 \\\\\n 0 & 1\n \\end{matrix} \\right) \\mod 2\n \\\\\n &= I = f(\\det(B'))\n\\end{align*}\n$$\n\nSince *B*' is in SL(2, 4), the determinant is unsurprisingly *f*(1) = I.\nThe (matrix) determinants of *f*\\* applied to other elements of GL(2, 4)\n could just as well be *f*(*α*) = *C*~*p*~ or *f*(*α*^2^) = *C*~*p*~^2^.\n\n\n### Implementation\n\nUsing this method, we can implement PSL(2, 4) directly.\nAll we need to do is find all possible 4-tuples of **0**, *I*, *C*~*p*~, and *C*~*p*~^2^,\n then arrange each into a 2x2 matrix.\nMultiplication follows from the typical definition and the multiplicative identity is just *f*\\*(*I*).\n\n::: {#55360e06 .cell execution_count=9}\n``` {.haskell .cell-code code-fold=\"true\" code-summary=\"Haskell implementation of PSL(2, 4)\"}\nimport Data.List (elemIndex)\n\n-- Matrices which obey the same relations as the elements of GF(4)\nzero_f4 = zero 2\none_f4 = eye 2\nalpha_f4 = toMatrix [[0,1],[1,1]]\nalpha2_f4 = toMatrix [[1,1],[1,0]]\n\n-- Gathered into a list\nfield4 = [zero_f4, one_f4, alpha_f4, alpha2_f4]\n\n-- Convenient show function for these matrices\nshowF4 x = case elemIndex x field4 of\n Just 0 -> \"0\"\n Just 1 -> \"1\"\n Just 2 -> \"α\"\n Just 3 -> \"α^2\"\n Nothing -> \"N/A\"\n\n-- Identity matrix over GF(4)\npsl_24_identity = toMatrix [[one_f4, zero_f4], [zero_f4, one_f4]]\n\n-- All possible matrices over GF(4)\n-- Create a list of 4-lists of elements from GF(4), then\n-- Shape them into 2x2 matrices\nf4_matrices = map (toMatrix . reshape 2) $ replicateM 4 field4\n\n-- Sieve out those which have a determinant of 1 in the field\nmPSL24 = filter ((==one_f4) . fmap (`mod` 2) . 
laplaceDet) f4_matrices\n```\n:::\n\n\nNow that we can generate the group, we can finally repeat what we did with PSL(2, 5).\nAll we have to do is filter out order-2 elements, then further filter\n for those which have an order-3 product with *B*'.\n\n::: {#891047c6 .cell execution_count=10}\n``` {.haskell .cell-code code-fold=\"true\" code-summary=\"Haskell implementation using *B*' as a generator to find candidates for *A*'\"}\n-- Order with respect to PSL(2, 4): using matrix multiplication (mod 2)\n-- and projective equality to the identity matrix\norderPSL24 = orderWith (\\x -> fmap (fmap (`mod` 2)) . (x*)) (== psl_24_identity)\n\n-- Only order 2 elements of PSL(2, 4)\npsl24_order2 = filter ((==2) . orderPSL24) mPSL24\n\n-- Start with B as a generator\npsl24_gen_B = toMatrix [[zero_f4, alpha_f4], [alpha2_f4, alpha2_f4]]\n\n-- Find an order 2 element whose product with `psl24_gen_B` has order 3\npsl24_gen_A_candidates = filter ((==3) . orderPSL24 . (psl24_gen_B |*|))\n psl24_order2\n\nmarkdown $ (\"$$ A' = \" ++) $ (++ \"$$\") $ intercalate \" ~,~ \" $\n map (texifyMatrix' showF4) psl24_gen_A_candidates\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$ A' = \\left( \\begin{matrix}0 & 1 \\\\ 1 & 0\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}0 & α^2 \\\\ α & 0\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}1 & 0 \\\\ 1 & 1\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}1 & α^2 \\\\ 0 & 1\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}α & 1 \\\\ α & α\\end{matrix} \\right)$$\n:::\n:::\n\n\nWe'll pick the second entry as our choice of *A*'.\nWe note that the product *A'B'*, does indeed have order 3.\n\n::: {#8ad49800 .cell execution_count=11}\n``` {.haskell .cell-code code-fold=\"true\"}\npsl24_gen_AB = fmap (`mod` 2) <$> (psl24_gen_A_candidates !! 1) |*| psl24_gen_B\n\nmarkdown $ (\"$$\" ++) $ (++ \"$$\") $ intercalate \" \\\\quad \" [\n \"(A'B') = \" ++ texifyMatrix' showF4 psl24_gen_AB,\n \"(A'B')^3 = \" ++ texifyMatrix' showF4 (fmap (`mod` 2) <$> (psl24_gen_AB^3))\n ]\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$(A'B') = \\left( \\begin{matrix}α & α \\\\ 0 & α^2\\end{matrix} \\right) \\quad (A'B')^3 = \\left( \\begin{matrix}1 & 0 \\\\ 0 & 1\\end{matrix} \\right)$$\n:::\n:::\n\n\nFinally, we can arrange these matrices on a Cayley graph in the same way as PSL(2, 5):\n\n[\n ![\n Cayley graph showing an isomorphism between *A*~5~ and PSL(2, 4).
\n Colors indicate the same thing as in the previous diagram.\n ](./a5_psl24_cayley.png){.narrow}\n](./a5_psl24_cayley.png)\n\n\nClosing\n-------\n\nThis post addresses my original goal in implementing finite fields,\n namely computationally finding an explicit map between *A*~5~ and PSL(2, 4).\nI believe the results are a little more satisfying than attempting to wrap your head\n around group-theoretic proofs.\nThat's not to discount the power and incredible logic that goes into the latter method.\nIt does tend to leave things rather opaque, however.\n\nIf you'd prefer a more interactive diagram showing the above isomorphisms,\n I've gone to the liberty of creating a hoverable SVG:\n\n[\n ![\n Click to open interactive version\n ](./a5_psl24_psl25_isomorphism.svg){.narrow}\n](./a5_psl24_psl25_isomorphism.svg)\n\nThis post slightly diverts our course from the previous one's focus on fields.\nThe [next one](../4) will focus on more results regarding the treatment of layered matrices.\nThe algebraic consequences of this structure are notable in and of themselves,\n and are entirely obfuscated by the usual interpretation of block matrices.\n\nDiagrams created with Geogebra and Inkscape.\n\n", + "markdown": "---\ntitle: \"Exploring Finite Fields, Part 3: Roll a d20\"\ndescription: |\n When we extend fields with matrices, what other structures do we encounter?\nformat:\n html:\n html-math-method: katex\ndate: \"2024-02-03\"\ndate-modified: \"2025-08-04\"\ncategories:\n - algebra\n - finite field\n - haskell\n---\n\n\n\n\n\nIn the [previous post](../2), we focused on constructing finite fields using *n*×*n* matrices.\nThese matrices came from from primitive polynomials of degree *n* over GF(*p*),\n and could be used to do explicit arithmetic over GF(*p*^*n*^).\nIn this post, we'll look at a way to apply this in describing certain groups.\n\n\nWeakening the Field\n-------------------\n\nRecall the way we defined GF(4) in the first post.\nWe took the irreducible polynomial *p*(*x*) = *x*^2^ + *x* + 1, called its root *α*,\n and created addition and multiplication tables spanning the four elements.\nAfter the second post, we can do this more cleverly by mapping *α*\n to the companion matrix *C*~*p*~ over GF(2).\n\n$$\n\\begin{gather*}\n f : \\mathbb{F_4} \\longrightarrow \\mathbb{F}_2 {}^{2 \\times 2}\n \\\\[10pt]\n 0 \\mapsto \\left(\\begin{matrix}\n 0 & 0 \\\\\n 0 & 0\n \\end{matrix}\\right)\n ~~\n 1 \\mapsto \\left(\\begin{matrix}\n 1 & 0 \\\\\n 0 & 1\n \\end{matrix}\\right) = I\n ~~\n \\alpha \\mapsto \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 1\n \\end{matrix}\\right) = C_p\n \\\\ \\\\\n \\textcolor{red}{\\alpha} + \\textcolor{blue}{1} = \\alpha^2 \\mapsto\n \\left(\\begin{matrix}\n 1 & 1 \\\\\n 1 & 0\n \\end{matrix}\\right) =\n \\textcolor{red} {\n \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 1\n \\end{matrix}\\right)\n }\n + \\textcolor{blue}{\n \\left(\\begin{matrix}\n 1 & 0 \\\\\n 0 & 1\n \\end{matrix}\\right)\n }\\mod 2\n\\end{gather*}\n$$\n\nIn the images of *f*, the zero matrix has determinant 0 and all other elements have determinant 1.\nTherefore, the product of any two nonzero matrices always has determinant 1,\n and a nonzero determinant means the matrix is invertible.\nPer the definition of the field, the non-zero elements form a group with respect to multiplication.\nHere, they form a cyclic group of order 3, since *C*~*p*~^3^ = *I* mod 2.\nThis is also true using symbols, since *α*^3^ = 1.\n\n\n### Other Matrices\n\nHowever, there are more 2×2 matrices over GF(2) than just 
these.\nThere are two possible values in four locations, so there are 24 = 16 matrices,\n or 12 more than we've identified.\n\n$$\n\\begin{array}{c|c}\n \\#\\{a_{ij} = 1\\} & \\det = 0 & \\det = 1\n \\\\ \\hline\n 1 &\n \\scriptsize\n \\left(\\begin{matrix}\n 0 & 1 \\\\\n 0 & 0\n \\end{matrix}\\right)\n ~~\n \\left(\\begin{matrix}\n 1 & 0 \\\\\n 0 & 0\n \\end{matrix}\\right)\n ~~\n \\left(\\begin{matrix}\n 0 & 0 \\\\\n 0 & 1\n \\end{matrix}\\right)\n ~~\n \\left(\\begin{matrix}\n 0 & 0 \\\\\n 1 & 0\n \\end{matrix}\\right)\n \\\\\n 2 &\n \\scriptsize\n \\left(\\begin{matrix}\n 1 & 1 \\\\\n 0 & 0\n \\end{matrix}\\right)\n ~~\n \\left(\\begin{matrix}\n 0 & 0 \\\\\n 1 & 1\n \\end{matrix}\\right)\n ~~\n \\left(\\begin{matrix}\n 0 & 1 \\\\\n 0 & 1\n \\end{matrix}\\right)\n ~~\n \\left(\\begin{matrix}\n 1 & 1 \\\\\n 0 & 0\n \\end{matrix}\\right)\n &\n \\scriptsize\n \\textcolor{red}{\n \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 0\n \\end{matrix}\\right)\n }\n \\\\\n 3 & &\n \\scriptsize\n \\textcolor{red}{\n \\left(\\begin{matrix}\n 1 & 1 \\\\\n 0 & 1\n \\end{matrix}\\right)\n }\n ~~\n \\textcolor{red}{\n \\left(\\begin{matrix}\n 1 & 0 \\\\\n 1 & 1\n \\end{matrix}\\right)\n }\n \\\\\n 4 &\n \\scriptsize\n \\left(\\begin{matrix}\n 1 & 1 \\\\\n 1 & 1\n \\end{matrix}\\right)\n\\end{array}\n$$\n\nThe matrices in the right column (in red) have determinant 1, which means they can *also* multiply\n with our field-like elements without producing a singular matrix.\nThis forms a larger group, of which our field's multiplication group is a subgroup.\nHowever, it is *not* commutative, since matrix multiplication is not commutative in general.\n\nThe group of all six matrices with nonzero determinant is called the\n [*general linear group*](https://en.wikipedia.org/wiki/General_linear_group)\n of degree 2 over GF(2), written[^1] GL(2, 2).\nWe can sort the elements into classes by their order, or the number of times we have\n to multiply them before getting to the identity matrix (mod 2):\n\n[^1]: Unfortunately, it's rather easy to confuse \"GF\" with \"GL\".\n Remember that \"F\" is for \"field\", with the former standing for \"Galois field\".\n\n$$\n\\begin{array}{}\n \\text{Order 1} & \\text{Order 2} & \\text{Order 3}\n \\\\ \\hline\n \\left(\\begin{matrix}\n 1 & 0 \\\\\n 0 & 1\n \\end{matrix}\\right)\n &\n \\begin{align*}\n \\left(\\begin{matrix}\n 1 & 1 \\\\\n 0 & 1\n \\end{matrix}\\right)\n \\\\\n \\left(\\begin{matrix}\n 1 & 0 \\\\\n 1 & 1\n \\end{matrix}\\right)\n \\\\\n \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 0\n \\end{matrix}\\right)\n \\end{align*}\n &\n \\begin{align*}\n \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 1\n \\end{matrix}\\right)\n \\\\\n \\left(\\begin{matrix}\n 1 & 1 \\\\\n 1 & 0\n \\end{matrix}\\right)\n \\end{align*}\n\\end{array}\n$$\n\nIf you've studied enough group theory, you know that there are two groups of order 6:\n the cyclic group of order 6, *C*~6~, and the symmetric group on three elements, *S*~3~.\nSince the former group has order-6 elements, but none of these matrices are of order 6,\n the matrix group must be isomorphic to the latter.\nSince the group is small, it's not too difficult to construct an isomorphism between the two.\nWriting the elements of *S*~3~ in [cycle notation](/posts/math/permutations/1/), we have:\n\n$$\n\\begin{gather*}\n e \\mapsto \\left(\\begin{matrix}\n 1 & 0 \\\\\n 0 & 1\n \\end{matrix}\\right)\n \\\\ \\\\\n (1 ~ 2) \\mapsto \\left(\\begin{matrix}\n 1 & 1 \\\\\n 0 & 1\n \\end{matrix}\\right)\n \\qquad\n (1 ~ 3) \\mapsto \\left(\\begin{matrix}\n 
1 & 0 \\\\\n 1 & 1\n \\end{matrix}\\right)\n \\qquad\n (2 ~ 3) \\mapsto \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 0\n \\end{matrix}\\right)\n \\\\ \\\\\n (1 ~ 2 ~ 3) \\mapsto \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 1\n \\end{matrix}\\right)\n \\qquad\n (3 ~ 2 ~ 1) \\mapsto \\left(\\begin{matrix}\n 1 & 1 \\\\\n 1 & 0\n \\end{matrix}\\right)\n\\end{gather*}\n$$\n\n\nBigger Linear Groups\n--------------------\n\nOf course, there is nothing special about GF(2) in this definition.\nFor any field *K*, the general linear group GL(*n*, *K*) is composed of invertible\n *n*×*n* matrices under matrix multiplication.\n\nFor fields other than GF(2), a matrix can have a determinant other than 1.\nSince the determinant is multiplicative, the product of two determinant 1 matrices also has determinant 1.\nTherefore, the general linear group has a subgroup,\n the [*special linear group*](https://en.wikipedia.org/wiki/Special_linear_group)\n SL(*n*, *K*), consisting of these matrices.\n\n\n
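For a quick size check, note the standard counting argument (not spelled out above, but worth keeping in mind): the first row of an invertible 2×2 matrix over GF(*q*) can be any of the *q*^2^ − 1 nonzero vectors, the second row any of the *q*^2^ − *q* vectors that are not a multiple of the first, and restricting to determinant 1 divides the count by the *q* − 1 possible determinant values.

$$
|\text{GL}(2, q)| = (q^2 - 1)(q^2 - q)
\qquad
|\text{SL}(2, q)| = \frac{|\text{GL}(2, q)|}{q - 1} = q(q^2 - 1)
$$

For *q* = 2 both formulas give 6, matching the six matrices sorted by order above.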
\n\nHaskell implementation of GL and SL for prime fields\n\nThis implementation will be based on the `Matrix` type from the first post.\nAssume we have already defined matrix multiplication and addition.\n\n::: {#d70bf3ee .cell execution_count=3}\n``` {.haskell .cell-code}\nimport Data.Array (listArray, bounds, elems)\nimport Data.List (unfoldr)\n\n-- Partition a list into lists of length n\nreshape :: Int -> [a] -> [[a]]\nreshape n = unfoldr (reshape' n) where\n reshape' n x = if null x then Nothing else Just $ splitAt n x\n\n-- Convert list of lists to Matrix\n-- Abuses listArray working across rows, then columns\ntoMatrix :: [[a]] -> Matrix a\ntoMatrix l = Mat $ listArray ((0,0),(n-1,m-1)) $ concat l where\n m = length $ head l\n n = length l\n\n-- Convert Matrix to list of lists\nfromMatrix :: Matrix a -> [[a]]\nfromMatrix (Mat m) = let (_,(_,n)) = bounds m in reshape (n+1) $ elems m\n```\n:::\n\n\nWith helper functions out of the way, we can move on to generating all matrices (mod *n*).\nThen, we filter for matrices with nonzero determinant (in the case of GL) and determinant 1\n (in the case of SL).\n\n::: {#6249f6f4 .cell execution_count=4}\n``` {.haskell .cell-code}\nimport Control.Monad (replicateM)\n\n-- All m x m matrices (mod n)\nallMatrices :: Int -> Int -> [Matrix Int]\nallMatrices m n = map toMatrix $ replicateM m vectors where\n -- Construct all vectors mod n using base-n expansions and padding\n vectors = [pad $ coeffs $ asPoly n l | l <- [1..n^m-1]]\n -- Pad xs to length m with zero\n pad xs = xs ++ replicate (m - length xs) 0\n\n-- All matrices, but paired with their determinants\nmatsWithDets :: Int -> Int -> [(Matrix Int, Int)]\nmatsWithDets m n = map (\\x -> (x, determinant x `mod` n)) $ allMatrices m n\n\n-- Nonzero determinants\nmGL m n = map fst $ filter (\\(x,d) -> d /= 0) $ matsWithDets m n\n-- Determinant is 1\nmSL m n = map fst $ filter (\\(x,d) -> d == 1) $ matsWithDets m n\n```\n:::\n\n\n
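As a rough sanity check on `mGL` and `mSL` (a sketch, not output from this post -- it assumes the `Matrix` type, `determinant`, `asPoly`, and `coeffs` from the earlier posts are in scope, and the `glSizeOk`/`slSizeOk` names are just for illustration), the list lengths should agree with the counting formulas above. For *q* = 3 that means (3^2^ − 1)(3^2^ − 3) = 48 invertible matrices, of which 48 / (3 − 1) = 24 have determinant 1.

```haskell
-- Sanity-check sketch: compare the generated group sizes with the
-- standard formulas |GL(2,q)| = (q^2 - 1)(q^2 - q) and |SL(2,q)| = q(q^2 - 1).
glSizeOk :: Bool
glSizeOk = length (mGL 2 3) == 48   -- (9 - 1) * (9 - 3)

slSizeOk :: Bool
slSizeOk = length (mSL 2 3) == 24   -- 48 / (3 - 1)
```

(Skipping matrices with a zero row, as `allMatrices` does, is harmless here: any such matrix is singular and would be filtered out anyway.)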
\n\n\n### Projectivity\n\nAnother important matrix group is the\n [*projective general linear group*](https://en.wikipedia.org/wiki/Projective_linear_group),\n PGL(*n*, *K*).\nIn this group, two matrices are considered equal if one is a scalar multiple of the other[^2].\nBoth this and the determinant 1 constraint can apply at the same time,\n forming the *projective special linear group*, PSL(*n*, *K*).\n\n[^2]: Equivalently, the elements *are* these equivalence classes.\n The product of two classes is the set of all possible products between the two classes,\n which is another class.\n\nFor GF(2), all of these groups are the same, since the only nonzero determinant and scalar multiple is 1.\nTherefore, it's beneficial to contrast SL and PGL with another example.\n\nLet's arbitrarily examine GL(2, 5).\nSince 4 squares to 1 (mod 5) and we're working with 2×2 matrices, the determinant is unchanged\n when a matrix is scalar-multiplied by 4.\nThese multiples are identified in PSL.\nOn the other hand, in PGL, there are classes of matrices with determinant 2 and 3, which do not square to 1.\nThese classes are exactly the ones which are \"left out\" of PSL.\n\n$$\n\\begin{matrix}\n \\boxed{ \\begin{gather*}\n \\large \\text{GL}(2, 5)\n \\\\\n \\underset{\\det = 4}{\n \\scriptsize\n \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 1\n \\end{matrix} \\right) },\n \\textcolor{red}{ \\underset{\\det = 1}{\n \\scriptsize\n \\left(\\begin{matrix}\n 0 & 2 \\\\\n 2 & 2\n \\end{matrix} \\right)\n }},\n \\underset{\\det = 2}{\n \\scriptsize\n \\left(\\begin{matrix}\n 1 & 0 \\\\\n 0 & 2\n \\end{matrix} \\right)\n },\n \\underset{\\det = 3}{\n \\scriptsize\n \\left(\\begin{matrix}\n 2 & 0 \\\\\n 0 & 4\n \\end{matrix} \\right)\n },\n ...\n \\end{gather*} }\n & \\twoheadrightarrow &\n \\boxed{ \\begin{gather*}\n \\large \\text{PGL}(2,5)\n \\\\\n \\underset{\\det = 1, ~4}{\n \\scriptsize\n \\textcolor{red}{\\left\\{\n \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 1\n \\end{matrix} \\right),\n \\left(\\begin{matrix}\n 0 & 2 \\\\\n 2 & 2\n \\end{matrix} \\right),\n ...\n \\right\\}\n }}\n \\\\\n \\underset{\\det = 2, ~ 3}{\n \\scriptsize\n \\left\\{ \\left(\\begin{matrix}\n 1 & 0 \\\\\n 0 & 2\n \\end{matrix} \\right),\n \\left(\\begin{matrix}\n 2 & 0 \\\\\n 0 & 4\n \\end{matrix} \\right),\n ...\n \\right\\}\n }\n \\\\\n ...\n \\end{gather*}\n }\n \\\\ \\\\\n \\boxed{ \\begin{gather*}\n \\large \\text{SL}(2,5)\n \\\\\n \\textcolor{red}{\n \\scriptsize\n \\left(\\begin{matrix}\n 0 & 2 \\\\\n 2 & 2\n \\end{matrix} \\right)\n },\n \\scriptsize\n \\left(\\begin{matrix}\n 0 & 3 \\\\\n 3 & 3\n \\end{matrix} \\right),\n ...\n \\end{gather*} }\n & \\twoheadrightarrow &\n \\boxed{ \\begin{gather*}\n \\large \\text{PSL}(2,5)\n \\\\\n \\textcolor{red}{ \\left\\{\n \\scriptsize\n \\left(\\begin{matrix}\n 0 & 2 \\\\\n 2 & 2\n \\end{matrix} \\right),\n \\left(\\begin{matrix}\n 0 & 3 \\\\\n 3 & 3\n \\end{matrix} \\right),\n ...\n \\right\\} }\n ...\n \\end{gather*} }\n\\end{matrix}\n$$\n\n::: {#7533e591 .cell execution_count=5}\n``` {.haskell .cell-code code-fold=\"true\" code-summary=\"Haskell implementation of PGL and PSL for prime fields\"}\nimport Data.List (nubBy)\n\n-- PGL and PSL require special equality.\n-- It's certainly possible to write a definition which makes the classes explicit, as its own new type.\n-- We could then define equality on this type through `Eq`.\n-- This is rather inefficient, though, so I'll choose to work with the representatives instead.\n\n-- Scalar-multiply a matrix (mod p)\nscalarTimes :: Int -> Int -> 
Matrix Int -> Matrix Int\nscalarTimes n k = fmap ((`mod` n) . (*k))\n\n-- Construct all scalar multiples mod n, then check if ys is any of them.\n-- This is ludicrously inefficient, and only works for fields.\nprojEq :: Int -> Matrix Int -> Matrix Int -> Bool\nprojEq n xs ys = ys `elem` [scalarTimes n k xs | k <- [1..n-1]]\n\n-- Strip out duplicates in GL and SL with projective equality\nmPGL m n = nubBy (projEq n) $ mGL m n\nmPSL m n = nubBy (projEq n) $ mSL m n\n```\n:::\n\n\n### Exceptional Isomorphisms\n\nWhen *K* is a finite field, the smaller PSLs turn out specify some interesting groups.\nWe've studied the case of PSL(2, 2) being isomorphic to *S*~3~ already, but it is also the case that:\n\n$$\n\\begin{align*}\n &\\text{PSL}(2,3) \\cong A_4 & & \\text{(order 24)}\n \\\\ \\\\\n &\\text{PSL}(2,4) \\cong \\text{PSL}(2,5) \\cong A_5 & & \\text{(order 60)}\n \\\\ \\\\\n &\\text{PSL}(2,7) \\cong \\text{PSL}(3,2) & & \\text{(order 168)}\n\\end{align*}\n$$\n\nThese relationships can be proven abstractly (and frequently are!).\nHowever, I always found myself wanting.\nFor PSL(2, 3) and *A*~4~, it's trivial to assign elements to one another by hand.\nBut *A*~5~ is getting untenable, to say nothing of PSL(2, 7).\nIn these circumstances, it's a good idea to leverage the computer.\n\n\nWarming Up: *A*~5~ and PSL(2, 5)\n--------------------------------\n\n*A*~5~, the alternating group on 5 elements, is composed of the\n [even](https://en.wikipedia.org/wiki/Parity_of_a_permutation) permutations of 5 elements.\nIt also happens to describe the rotations of an icosahedron.\nWithin the group, there are three kinds of elements:\n\n- The product of two 2-cycles, such as *a* = (1 2)(3 4)\n - On an icosahedron, this corresponds to a 180 degree rotation\n (or more precisely, 1/2 of a turn) about an edge\n- 5-cycles, such as *b* = (1 2 3 4 5)\n - This corresponds to a 72 degree rotation (1/5 of a turn)\n around a vertex\n- 3-cycles, such as *ab* = (2 4 5)\n - This corresponds to a 120 degree rotation (1/3 of a turn)\n around the center of a face\n\nIt happens to be the case that all elements of the group can be expressed\n as a product between *a* and *b* -- they generate the group.\n\n\n### Mapping to Matrices\n\nTo create a correspondence with PSL(2, 5), we need to identify permutations with matrices.\nObviously, the identity permutation goes to the identity matrix.\nThen, since *a* and *b* generate the group, we can search for two matrices which\n obey the same relations (under projective equality, since we're working in PSL).\n\nFortunately, we have a computer, so we can search for candidates rather quickly.\nFirst, let's note a matrix *B* which is cyclic of order 5 to correspond with *b*:\n\n::: {#66d77e13 .cell execution_count=6}\n``` {.haskell .cell-code code-fold=\"true\" code-summary=\"Haskell implementation of finding candidates for B\"}\n-- Repeatedly apply f to p, until the predicate z\n-- (usually equality to some quantity) becomes True.\n-- Get the length of the resulting list\norderWith :: Eq a => (a -> a -> a) -> (a -> Bool) -> a -> Int\norderWith f z p = (+1) $ length $ takeWhile (not . z) $ iterate (f p) p\n\n-- Order with respect to PSL(2, 5): using matrix multiplication (mod 5)\n-- and projective equality to the identity matrix\norderPSL25 = orderWith (\\x -> fmap (`mod` 5) . (x |*|)) (projEq 5 $ eye 2)\n\n-- Only order 5 elements of PSL(2, 5)\npsl25_order5 = filter ((==5) . orderPSL25) $ mPSL 2 5\n\nmarkdown $ (\"$$B = \" ++) $ (++ \"... 
$$\") $ intercalate \" ~,~ \" $\n take 5 $ map texifyMatrix psl25_order5\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$B = \\left( \\begin{matrix}2 & 0 \\\\ 1 & 2\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}2 & 0 \\\\ 2 & 2\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}2 & 0 \\\\ 3 & 2\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}2 & 0 \\\\ 4 & 2\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}0 & 1 \\\\ 1 & 1\\end{matrix} \\right)... $$\n:::\n:::\n\n\nArbitrarily, let's pick the last entry on this list.\nNow, we can search for order-2 elements in PSL(2, 5) whose product with *B* has order 3.\nThis matrix (*A*) matches exactly with *a* in *A*~5~.\n\n::: {#0ed9f298 .cell execution_count=7}\n``` {.haskell .cell-code code-fold=\"true\" code-summary=\"Haskell implementation using B as a generator to find candidates for A\"}\n-- Start with B as a generator\npsl25_gen_B = toMatrix [[0,2],[2,2]]\n\n-- Only order 2 elements of PSL(2, 5)\npsl25_order2 = filter ((==2) . orderPSL25) $ mPSL 2 5\n\n-- Find an order 2 element whose product with `psl25_gen_B` has order 3\npsl25_gen_A_candidates = filter ((==3) . orderPSL25 . (psl25_gen_B |*|))\n psl25_order2\n\nmarkdown $ (\"$$A = \" ++) $ (++ \"$$\") $ intercalate \" ~,~ \" $\n map texifyMatrix psl25_gen_A_candidates\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$A = \\left( \\begin{matrix}1 & 0 \\\\ 0 & 4\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}1 & 0 \\\\ 2 & 4\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}2 & 1 \\\\ 2 & 3\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}1 & 2 \\\\ 0 & 4\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}2 & 2 \\\\ 1 & 3\\end{matrix} \\right)$$\n:::\n:::\n\n\nAgain, arbitrarily, we'll pick the last entry from this list.\nLet's also peek at what the matrix *AB* looks like.\n\n::: {#4645affa .cell execution_count=8}\n``` {.haskell .cell-code code-fold=\"true\"}\npsl25_gen_AB = (`mod` 5) <$> (psl25_gen_A_candidates !! 4) |*| psl25_gen_B\n\nmarkdown $ (\"$$\" ++) $ (++ \"$$\") $ intercalate \" \\\\quad \" [\n \"(AB) = \" ++ texifyMatrix psl25_gen_AB,\n \"(AB)^3 = \" ++ texifyMatrix ((`mod` 5) <$> (psl25_gen_AB^3))\n ]\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$(AB) = \\left( \\begin{matrix}4 & 3 \\\\ 1 & 3\\end{matrix} \\right) \\quad (AB)^3 = \\left( \\begin{matrix}2 & 0 \\\\ 0 & 2\\end{matrix} \\right)$$\n:::\n:::\n\n\nWe now have a correspondence between three elements of *A*~5~ and PSL(2, 5).\nWe can \"run\" both sets of the generators until we associate all elements to one another.\nThis is most visually appealing to see as a Cayley graph[^3]:\n\n[\n ![\n Cayley graph showing an isomorphism between A5 and PSL(2, 5).
\n Order-2 elements are red, order-3 elements are green, and order-5 elements are blue.\n Purple arrows are order-5 generators, orange arrows are order-2 generators.\n ](./a5_psl25_cayley.png){.narrow}\n](./a5_psl24_cayley.png)\n\n[^3]: Different generators appear to be used for *A* and *B* due to some\n self-imposed turbulence when writing the original post.\n Under projective equality, both are the same as our choices of *A* and *B*.\n\n\nPSL(2, 4)\n---------\n\nWe could do the same for PSL(2, 4), but we can't just work modulo 4\n -- remember, the elements of GF(4) are 0, 1, *α*, and *α*^2^.\nIt follows that GL(2, 4) is composed of (invertible) matrices of those elements,\n and SL(2, 4) is composed of matrices with determinant 1.\n\n$$\n\\begin{matrix}\n \\boxed{ \\begin{gather*}\n \\large \\text{GL}(2, 4)\n \\\\\n \\textcolor{red}{ \\underset{\\det = 1}{\n \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 1\n \\end{matrix} \\right)\n }},\n \\underset{\\det = \\alpha + 1}{\n \\left(\\begin{matrix}\n 0 & \\alpha \\\\\n \\alpha & \\alpha\n \\end{matrix} \\right)\n },\n \\underset{\\det = \\alpha}{\n \\left(\\begin{matrix}\n 1 & 0 \\\\\n 0 & \\alpha\n \\end{matrix} \\right)\n },\n \\textcolor{red}{\n \\underset{\\det = 1}{\n \\left(\\begin{matrix}\n \\alpha & 0 \\\\\n 0 & \\alpha^2\n \\end{matrix} \\right)\n }},\n ...\n \\end{gather*} }\n \\\\ \\\\\n \\boxed{ \\begin{gather*}\n \\large \\text{SL}(2,4)\n \\\\\n \\textcolor{red}{\n \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 1\n \\end{matrix} \\right)\n },\n \\textcolor{red}{\n \\left(\\begin{matrix}\n \\alpha & 0 \\\\\n 0 & \\alpha^2\n \\end{matrix} \\right)\n },\n ...\n \\end{gather*} }\n\\end{matrix}\n$$\n\nScalar multiplication by *α* multiplies the determinant by *α*^2^;\n by *α*^2^ multiplies the determinant by *α*^4^ = *α*.\nThus, SL(2, 4) is also PSL(2, 4), since a scalar multiple has determinant 1.\n\nLet's start by looking at an order-5 matrix over PSL(2, 4).\nWe'll call this matrix *B*' to correspond with our order-5 generator in PSL(2, 5).\n\n$$\n\\begin{gather*}\n B' = \\left(\\begin{matrix}\n 0 & \\alpha \\\\\n \\alpha^2 & \\alpha^2\n \\end{matrix} \\right)\n \\qquad\n (B')^2 = \\left(\\begin{matrix}\n 1 & 1 \\\\\n \\alpha & \\alpha^2\n \\end{matrix}\\right)\n \\qquad\n (B')^3 = \\left(\\begin{matrix}\n \\alpha^2 & 1 \\\\\n \\alpha & 1\n \\end{matrix}\\right)\n \\\\\n (B')^4 = \\left(\\begin{matrix}\n \\alpha^2 & \\alpha \\\\\n \\alpha^2 & 0\n \\end{matrix}\\right)\n \\qquad\n (B')^5 = \\left(\\begin{matrix}\n 1 & 0 \\\\\n 0 & 1\n \\end{matrix}\\right)\n \\\\ \\\\\n \\det B' = 0\\alpha^2 - \\alpha^3 = 1\n\\end{gather*}\n$$\n\n\nWe need to be able to do three things over GL(2, 4) on a computer:\n\n- multiply matrices over GF(4),\n- compute their determinant,\n- visually distinguish between each of them, and\n- be able to systematically write down all of them\n\nIt would then follow for us to repeat what we did with with SL(2, 5).\nBut as I've said, working symbolically is hard for computers, and the methods described for prime fields\n do not work in general with prime power fields.\nFortunately, we're amply prepared to find a solution.\n\n\n### Bootstrapping Matrices\n\nRecall that the elements of GF(4) can also be written as the zero matrix, the identity matrix,\n *C*~*p*~, and *C*~*p*~^2^ (where *C*~*p*~ is the companion matrix of *p*(x)\n and again, *p*(x) = *x*^2^ + *x* + 1).\nThis means we can also write elements of GL(2, 4) as matrices *of matrices*.\nArithmetic works exactly the same as it does symbolically\n -- we just replace all 
instances of *α* in *B*' with *C*~*p*~.\n\n$$\n\\begin{gather*}\n f^* : \\mathbb{F}_4 {}^{2 \\times 2} \\rightarrow (\\mathbb{F}_2 {}^{2 \\times 2})^{2 \\times 2}\n \\\\ \\\\\n \\begin{align*}\n \\bar {B'} = f^*(B') &= \\left(\\begin{matrix}\n f(0) & f(\\alpha) \\\\\n f(\\alpha^2) & f(\\alpha^2)\n \\end{matrix} \\right) = \\left(\\begin{matrix}\n {\\bf 0} & C_p \\\\\n C_p {}^2 & C_p {}^2\n \\end{matrix} \\right) \\\\\n &= \\left(\\begin{matrix}\n \\left(\\begin{matrix} 0 & 0 \\\\ 0 & 0 \\end{matrix} \\right) &\n \\left(\\begin{matrix} 0 & 1 \\\\ 1 & 1 \\end{matrix} \\right) \\\\\n \\left(\\begin{matrix} 1 & 1 \\\\ 1 & 0 \\end{matrix} \\right) &\n \\left(\\begin{matrix} 1 & 1 \\\\ 1 & 0 \\end{matrix} \\right)\n \\end{matrix} \\right)\n \\\\ ~ \\\\\n (f^*(\\bar B'))^2 &= \\left(\\begin{matrix}\n ({\\bf 0})({\\bf 0}) + C_p {}^3 & ({\\bf 0})C_p +C_p {}^3 \\\\\n ({\\bf 0})C_p {}^2 + C_p {}^4 & C_p {}^3 + C_p {}^4\n \\end{matrix} \\right) \\\\\n &= \\left(\\begin{matrix}\n I & I \\\\\n C_p {} & C_p {}^2\n \\end{matrix} \\right) = \\left(\\begin{matrix}\n f(1) & f(1) \\\\\n f(\\alpha) & f(\\alpha^2)\n \\end{matrix} \\right) =\n f^*((B')^2)\n \\end{align*}\n\\end{gather*}\n$$\n\nMake no mistake, this is *not* a [block matrix](https://en.wikipedia.org/wiki/Block_matrix),\n at least not a typical one.\nNamely, the layering means that the determinant (which signifies its membership in SL) is another matrix:\n\n$$\n\\begin{align*}\n \\det( f^*(B') ) &= {\\bf 0} (C_p {}^2) - (C_p)(C_p {}^2)\n \\\\\n &= \\left(\\begin{matrix}\n 0 & 0 \\\\\n 0 & 0\n \\end{matrix} \\right)\n \\left(\\begin{matrix}\n 1 & 1 \\\\\n 1 & 0\n \\end{matrix} \\right)\n - \\left(\\begin{matrix}\n 0 & 1 \\\\\n 1 & 1\n \\end{matrix} \\right)\n \\left(\\begin{matrix}\n 1 & 1 \\\\\n 1 & 0 \\end{matrix}\n \\right)\n \\\\\n &= \\left(\\begin{matrix}\n 1 & 0 \\\\\n 0 & 1\n \\end{matrix} \\right) \\mod 2\n \\\\\n &= I = f(\\det(B'))\n\\end{align*}\n$$\n\nSince *B*' is in SL(2, 4), the determinant is unsurprisingly *f*(1) = I.\nThe (matrix) determinants of *f*\\* applied to other elements of GL(2, 4)\n could just as well be *f*(*α*) = *C*~*p*~ or *f*(*α*^2^) = *C*~*p*~^2^.\n\n\n### Implementation\n\nUsing this method, we can implement PSL(2, 4) directly.\nAll we need to do is find all possible 4-tuples of **0**, *I*, *C*~*p*~, and *C*~*p*~^2^,\n then arrange each into a 2x2 matrix.\nMultiplication follows from the typical definition and the multiplicative identity is just *f*\\*(*I*).\n\n::: {#55360e06 .cell execution_count=9}\n``` {.haskell .cell-code code-fold=\"true\" code-summary=\"Haskell implementation of PSL(2, 4)\"}\nimport Data.List (elemIndex)\n\n-- Matrices which obey the same relations as the elements of GF(4)\nzero_f4 = zero 2\none_f4 = eye 2\nalpha_f4 = toMatrix [[0,1],[1,1]]\nalpha2_f4 = toMatrix [[1,1],[1,0]]\n\n-- Gathered into a list\nfield4 = [zero_f4, one_f4, alpha_f4, alpha2_f4]\n\n-- Convenient show function for these matrices\nshowF4 x = case elemIndex x field4 of\n Just 0 -> \"0\"\n Just 1 -> \"1\"\n Just 2 -> \"α\"\n Just 3 -> \"α^2\"\n Nothing -> \"N/A\"\n\n-- Identity matrix over GF(4)\npsl_24_identity = toMatrix [[one_f4, zero_f4], [zero_f4, one_f4]]\n\n-- All possible matrices over GF(4)\n-- Create a list of 4-lists of elements from GF(4), then\n-- Shape them into 2x2 matrices\nf4_matrices = map (toMatrix . reshape 2) $ replicateM 4 field4\n\n-- Sieve out those which have a determinant of 1 in the field\nmPSL24 = filter ((==one_f4) . fmap (`mod` 2) . 
laplaceDet) f4_matrices\n```\n:::\n\n\nNow that we can generate the group, we can finally repeat what we did with PSL(2, 5).\nAll we have to do is filter out order-2 elements, then further filter\n for those which have an order-3 product with *B*'.\n\n::: {#891047c6 .cell execution_count=10}\n``` {.haskell .cell-code code-fold=\"true\" code-summary=\"Haskell implementation using *B*' as a generator to find candidates for *A*'\"}\n-- Order with respect to PSL(2, 4): using matrix multiplication (mod 2)\n-- and projective equality to the identity matrix\norderPSL24 = orderWith (\\x -> fmap (fmap (`mod` 2)) . (x*)) (== psl_24_identity)\n\n-- Only order 2 elements of PSL(2, 4)\npsl24_order2 = filter ((==2) . orderPSL24) mPSL24\n\n-- Start with B as a generator\npsl24_gen_B = toMatrix [[zero_f4, alpha_f4], [alpha2_f4, alpha2_f4]]\n\n-- Find an order 2 element whose product with `psl24_gen_B` has order 3\npsl24_gen_A_candidates = filter ((==3) . orderPSL24 . (psl24_gen_B |*|))\n psl24_order2\n\nmarkdown $ (\"$$ A' = \" ++) $ (++ \"$$\") $ intercalate \" ~,~ \" $\n map (texifyMatrix' showF4) psl24_gen_A_candidates\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$ A' = \\left( \\begin{matrix}0 & 1 \\\\ 1 & 0\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}0 & α^2 \\\\ α & 0\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}1 & 0 \\\\ 1 & 1\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}1 & α^2 \\\\ 0 & 1\\end{matrix} \\right) ~,~ \\left( \\begin{matrix}α & 1 \\\\ α & α\\end{matrix} \\right)$$\n:::\n:::\n\n\nWe'll pick the second entry as our choice of *A*'.\nWe note that the product *A'B'*, does indeed have order 3.\n\n::: {#8ad49800 .cell execution_count=11}\n``` {.haskell .cell-code code-fold=\"true\"}\npsl24_gen_AB = fmap (`mod` 2) <$> (psl24_gen_A_candidates !! 1) |*| psl24_gen_B\n\nmarkdown $ (\"$$\" ++) $ (++ \"$$\") $ intercalate \" \\\\quad \" [\n \"(A'B') = \" ++ texifyMatrix' showF4 psl24_gen_AB,\n \"(A'B')^3 = \" ++ texifyMatrix' showF4 (fmap (`mod` 2) <$> (psl24_gen_AB^3))\n ]\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$(A'B') = \\left( \\begin{matrix}α & α \\\\ 0 & α^2\\end{matrix} \\right) \\quad (A'B')^3 = \\left( \\begin{matrix}1 & 0 \\\\ 0 & 1\\end{matrix} \\right)$$\n:::\n:::\n\n\nFinally, we can arrange these matrices on a Cayley graph in the same way as PSL(2, 5):\n\n[\n ![\n Cayley graph showing an isomorphism between *A*~5~ and PSL(2, 4).
\n Colors indicate the same thing as in the previous diagram.\n ](./a5_psl24_cayley.png){.narrow}\n](./a5_psl24_cayley.png)\n\n\nClosing\n-------\n\nThis post addresses my original goal in implementing finite fields,\n namely computationally finding an explicit map between *A*~5~ and PSL(2, 4).\nI believe the results are a little more satisfying than attempting to wrap your head\n around group-theoretic proofs.\nThat's not to discount the power and incredible logic that goes into the latter method.\nIt does tend to leave things rather opaque, however.\n\nIf you'd prefer a more interactive diagram showing the above isomorphisms,\n I've gone to the liberty of creating a hoverable SVG:\n\n[\n ![\n Click to open interactive version\n ](./a5_psl24_psl25_isomorphism.svg){.narrow}\n](./a5_psl24_psl25_isomorphism.svg)\n\nThis post slightly diverts our course from the previous one's focus on fields.\nThe [next one](../4) will focus on more results regarding the treatment of layered matrices.\nThe algebraic consequences of this structure are notable in and of themselves,\n and are entirely obfuscated by the usual interpretation of block matrices.\n\nDiagrams created with Geogebra and Inkscape.\n\n", "supporting": [ - "index_files" + "index_files/figure-html" ], "filters": [], "includes": {} diff --git a/_freeze/posts/math/finite-field/4/index/execute-results/html.json b/_freeze/posts/math/finite-field/4/index/execute-results/html.json index 5387bee..eb4c0d2 100644 --- a/_freeze/posts/math/finite-field/4/index/execute-results/html.json +++ b/_freeze/posts/math/finite-field/4/index/execute-results/html.json @@ -1,10 +1,10 @@ { - "hash": "c0299dc1fc40e16502fc616ecf2a0425", + "hash": "6d14e6ba9847880e6db4924484d7e689", "result": { "engine": "jupyter", - "markdown": "---\ntitle: \"Exploring Finite Fields, Part 4: The Power of Forgetting\"\ndescription: |\n Or: how I stopped learned to worrying and appreciate the Monad.\nformat:\n html:\n html-math-method: katex\ndate: \"2024-02-03\"\ndate-modified: \"2025-08-05\"\ncategories:\n - algebra\n - finite field\n - haskell\n---\n\n\n\nThe [last post](../3) in this series focused on understanding some small linear groups\n and implementing them on the computer over both a prime field and prime power field.\n\nThe prime power case was particularly interesting.\nFirst, we adjoined the roots of a polynomial to the base field, GF(2).\nRather than the traditional means of adding new symbols like *α*, we used companion matrices,\n which behave the same arithmetically.\nFor example, for the smallest prime power field, GF(4), we use the polynomial $p(x) = x^2 + x + 1$,\n and map its symbolic roots (*α* and *α*^2^), to matrices over GF(2):\n\n$$\n\\begin{gather*}\n f : \\mathbb{F}_4 \\longrightarrow \\mathbb{F}_2 {}^{2 \\times 2}\n \\\\ \\\\\n \\begin{gather*}\n f(0) = {\\bf 0} =\n \\left(\\begin{matrix} 0 & 0 \\\\ 0 & 0 \\end{matrix}\\right)\n & f(1) = I\n = \\left(\\begin{matrix} 1 & 0 \\\\ 0 & 1 \\end{matrix}\\right)\n \\\\\n f(\\alpha) = C_p\n = \\left(\\begin{matrix} 0 & 1 \\\\ 1 & 1 \\end{matrix}\\right)\n & f(\\alpha^2) = C_p {}^2\n = \\left(\\begin{matrix} 1 & 1 \\\\ 1 & 0 \\end{matrix}\\right)\n \\end{gather*}\n \\\\ \\\\\n f(a + b)= f(a) + f(b), \\quad f(ab) = f(a)f(b)\n\\end{gather*}\n$$\n\n::: {#1249570e .cell execution_count=3}\n``` {.haskell .cell-code code-fold=\"true\" code-summary=\"Equivalent Haskell\"}\ndata F4 = ZeroF4 | OneF4 | AlphaF4 | Alpha2F4 deriving Eq\nfield4 = [ZeroF4, OneF4, AlphaF4, Alpha2F4]\n\ninstance Show F4 where\n show ZeroF4 = 
\"0\"\n show OneF4 = \"1\"\n show AlphaF4 = \"α\"\n show Alpha2F4 = \"α^2\"\n\n-- Addition and multiplication over F4\ninstance Num F4 where\n (+) ZeroF4 x = x\n (+) OneF4 AlphaF4 = Alpha2F4\n (+) OneF4 Alpha2F4 = AlphaF4\n (+) AlphaF4 Alpha2F4 = OneF4\n (+) x y = if x == y then ZeroF4 else y + x\n\n (*) ZeroF4 x = ZeroF4\n (*) x ZeroF4 = ZeroF4\n (*) OneF4 x = x\n (*) AlphaF4 AlphaF4 = Alpha2F4\n (*) Alpha2F4 Alpha2F4 = AlphaF4\n (*) AlphaF4 Alpha2F4 = OneF4\n (*) x y = y * x\n\n abs = id\n negate = id\n signum = id\n fromInteger = (cycle field4 !!) . fromInteger\n\n\n-- Companion matrix of `p`, an irreducible polynomial of degree 2 over GF(2)\ncP :: (Num a, Eq a, Integral a) => Matrix a\ncP = companion $ Poly [1, 1, 1]\n\nf ZeroF4 = zero 2\nf OneF4 = eye 2\nf AlphaF4 = cP\nf Alpha2F4 = (`mod` 2) <$> cP |*| cP\n\nfield4M = map f field4\n```\n:::\n\n\nFinally, we constructed GL(2, 4) using matrices of matrices\n -- not [block matrices](https://en.wikipedia.org/wiki/Block_matrix)!\nThis post will focus on studying this method in slightly more detail.\n\n\nReframing the Path Until Now\n----------------------------\n\nIn the above description, we already mentioned larger structures over GF(2),\n namely polynomials and matrices.\nSince GF(4) can itself be described with matrices over GF(2),\n we can generalize *f* to give us two more maps:\n\n- $f^*$, which converts matrices over GF(4) to double-layered matrices over GF(2), and\n- $f^\\bullet$, which converts polynomials over GF(4) to polynomials of matrices over GF(2)\n\n\n### Matrix Map\n\nWe examined the former map briefly in the previous post.\nMore explicitly, we looked at a matrix *B* in SL(2, 4) which had the property\n that it was cyclic of order five.\nThen, to work with it without relying on symbols, we simply applied *f* over the contents of the matrix.\n\n::: {#0554e224 .cell execution_count=4}\n``` {.haskell .cell-code code-fold=\"true\"}\n-- Starred maps are instances of fmap composed with modding out\n-- by the characteristic\n\nfStar :: (Eq a, Num a, Integral a) => Matrix F4 -> Matrix (Matrix a)\nfStar = fmap (fmap (`mod` 2) . 
f)\n\nmBOrig = toMatrix [[ZeroF4, AlphaF4], [Alpha2F4, Alpha2F4]]\nmBStar = fStar mBOrig\n\nmarkdown $ \"$$\\\\begin{gather*}\" ++ concat [\n -- First row, type of fStar\n \"f^* : \\\\mathbb{F}_4 {}^{2 \\\\times 2}\" ++\n \"\\\\longrightarrow\" ++\n \"(\\\\mathbb{F}_2 {}^{2 \\\\times 2})^{2 \\\\times 2}\" ++\n \"\\\\\\\\[10pt]\",\n -- Second row, B\n \"B = \" ++ texifyMatrix' show mBOrig ++\n \"\\\\\\\\\",\n -- Third row, B*\n \"B^* = f^*(B) = \" ++\n texifyMatrix' (\\x -> \"f(\" ++ show x ++ \")\") mBOrig ++ \" = \" ++\n texifyMatrix' (texifyMatrix' show) mBStar\n ] ++\n \"\\\\end{gather*}$$\"\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{gather*}f^* : \\mathbb{F}_4 {}^{2 \\times 2}\\longrightarrow(\\mathbb{F}_2 {}^{2 \\times 2})^{2 \\times 2}\\\\[10pt]B = \\left( \\begin{matrix}0 & α \\\\ α^2 & α^2\\end{matrix} \\right)\\\\B^* = f^*(B) = \\left( \\begin{matrix}f(0) & f(α) \\\\ f(α^2) & f(α^2)\\end{matrix} \\right) = \\left( \\begin{matrix}\\left( \\begin{matrix}0 & 0 \\\\ 0 & 0\\end{matrix} \\right) & \\left( \\begin{matrix}0 & 1 \\\\ 1 & 1\\end{matrix} \\right) \\\\ \\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right) & \\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right)\\end{matrix} \\right)\\end{gather*}$$\n:::\n:::\n\n\nWe can do this because a matrix contains values in the domain of *f*, thus uniquely determining\n a way to change the internal structure (what Haskell calls\n a [functor](https://wiki.haskell.org/Functor)).\nFurthermore, due to the properties of *f*, it and *f*\\* commute with the determinant,\n as shown by the following diagram:\n\n$$\n\\begin{gather*}\n f(\\det(B)) = f(1) = I =\\det(B^*)= \\det(f^*(B))\n \\\\[10pt]\n \\begin{CD}\n \\mathbb{F}_4 {}^{2 \\times 2}\n @>{\\det}>>\n \\mathbb{F}_4\n \\\\\n @V{f^*}VV ~ @VV{f}V\n \\\\\n (\\mathbb{F}_2 {}^{2 \\times 2})^{2 \\times 2}\n @>>{\\det}>\n \\mathbb{F}_2 {}^{2 \\times 2}\n \\end{CD}\n\\end{gather*}\n$$\n\nIt should be noted that the determinant strips off the *outer* matrix.\nWe could also consider the map **det**\\* , where we apply the determinant\n to the internal matrices (in Haskell terms, `fmap determinant`).\nThis map isn't as nice though, since:\n\n::: {#f2977c19 .cell execution_count=5}\n``` {.haskell .cell-code code-fold=\"true\"}\nmarkdown $ \"$$\\\\begin{align*}\" ++ concat [\n -- First row, det* of B\n \"\\\\det {}^*(B^*) &= \" ++\n texifyMatrix' ((\"\\\\det\" ++) . texifyMatrix' show) mBStar ++ \" = \" ++\n texifyMatrix ((`mod` 2) . 
determinant <$> mBStar) ++\n \"\\\\\\\\ \\\\\\\\\",\n -- Second row, determinant of B*\n -- Note how the commutation between `determinant` and <$> fails\n \"&\\\\neq\" ++\n texifyMatrix ((`mod` 2) <$> determinant mBStar) ++ \" = \" ++\n \"\\\\det(B^*)\",\n \"\"\n ] ++\n \"\\\\end{align*}$$\"\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{align*}\\det {}^*(B^*) &= \\left( \\begin{matrix}\\det\\left( \\begin{matrix}0 & 0 \\\\ 0 & 0\\end{matrix} \\right) & \\det\\left( \\begin{matrix}0 & 1 \\\\ 1 & 1\\end{matrix} \\right) \\\\ \\det\\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right) & \\det\\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right)\\end{matrix} \\right) = \\left( \\begin{matrix}0 & 1 \\\\ 1 & 1\\end{matrix} \\right)\\\\ \\\\&\\neq\\left( \\begin{matrix}1 & 0 \\\\ 0 & 1\\end{matrix} \\right) = \\det(B^*)\\end{align*}$$\n:::\n:::\n\n\n### Polynomial Map\n\nMuch like how we can change the internal structure of matrices, we can do the same for polynomials.\nFor the purposes of demonstration, we'll work with $b = \\lambda^2 + \\alpha^2 \\lambda + 1$,\n the characteristic polynomial of *B*, since it has coefficients in the domain of *f*.\nWe define the extended map $f^\\bullet$ as:\n\n::: {#58a36854 .cell execution_count=6}\n``` {.haskell .cell-code code-fold=\"true\"}\n-- Bulleted maps are also just instances of fmap, like the starred maps\n\nfBullet :: (Eq a, Num a, Integral a) => Polynomial F4 -> Polynomial (Matrix a)\nfBullet = fmap (fmap (`mod` 2) . f)\n```\n:::\n\n\n$$\n\\begin{gather*}\n f^{\\bullet} : \\mathbb{F}_4[\\lambda] \\longrightarrow\n \\mathbb{F}_2 {}^{2 \\times 2}[\\Lambda]\n \\\\\n f^{\\bullet} (\\lambda) = \\Lambda \\qquad\n f^{\\bullet}(a) = f(a), \\quad a \\in \\mathbb{F}_4\n \\\\ \\\\\n \\begin{align*}\n b^{\\bullet}\n = f^{\\bullet}(b)\n &= f^{\\bullet}(\\lambda^2)\n &&+&& f^{\\bullet}(\\alpha^2)f^{\\bullet}(\\lambda)\n &&+&& f^{\\bullet}(1)\n \\\\\n &= \\Lambda^2\n &&+&& \\left(\\begin{matrix} 1 & 1 \\\\ 1 & 0\\end{matrix}\\right) \\Lambda\n &&+&& \\left(\\begin{matrix} 1 & 0 \\\\ 0 & 1 \\end{matrix}\\right)\n \\end{align*}\n\\end{gather*}\n$$\n\nSince we're looking at the characteristic polynomial of *B*, we might as well also look\n at the characteristic polynomial of *B*\\*, its image under $f^*$.\nWe already looked at the determinant of this matrix, which is the constant term\n of the characteristic polynomial (up to sign).\nTherefore, it's probably not surprising that $f^\\bullet$ and the characteristic polynomial commute\n in a similar fashion to the determinant.\n\n::: {#9126ada7 .cell execution_count=7}\n``` {.haskell .cell-code code-fold=\"true\"}\nbStar = fmap (fmap (`mod` 2)) $ charpoly $ fStar mBOrig\nbBullet = fmap (fmap (`mod` 2)) $ fBullet $ charpoly mBOrig\n\nif bStar /= bBullet then\n markdown \"$b^\\\\star$ and $b^\\\\bullet$ are not equal!\"\n else\n markdown $ \"$$\\\\begin{align*}\" ++ concat [\n \"b^* &= \\\\text{charpoly}(f^*(B)) = \\\\text{charpoly} \" ++\n texifyMatrix' (texifyMatrix' show) mBStar ++\n \"\\\\\\\\\",\n \"&= \" ++\n texifyPoly' \"\\\\Lambda\" (texifyMatrix' show) bStar ++ \" = \" ++\n \"f^\\\\bullet(\\\\text{charpoly}(B)) = b^\\\\bullet\",\n \"\"\n ] ++\n \"\\\\end{align*}$$\"\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{align*}b^* &= \\text{charpoly}(f^*(B)) = \\text{charpoly} \\left( \\begin{matrix}\\left( \\begin{matrix}0 & 0 \\\\ 0 & 0\\end{matrix} \\right) & \\left( \\begin{matrix}0 & 1 \\\\ 1 & 1\\end{matrix} \\right) \\\\ \\left( 
\\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right) & \\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right)\\end{matrix} \\right)\\\\&= \\left( \\begin{matrix}1 & 0 \\\\ 0 & 1\\end{matrix} \\right) + \\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right)\\Lambda + \\Lambda^{2} = f^\\bullet(\\text{charpoly}(B)) = b^\\bullet\\end{align*}$$\n:::\n:::\n\n\n$$\n\\begin{CD}\n \\mathbb{F}_4 {}^{2 \\times 2}\n @>{\\text{charpoly}}>>\n \\mathbb{F}_4[\\lambda]\n \\\\\n @V{f^*}VV ~ @VV{f^\\bullet}V\n \\\\\n (\\mathbb{F}_2 {}^{2 \\times 2})^{2 \\times 2}\n @>>{\\text{charpoly}}>\n (\\mathbb{F}_2 {}^{2 \\times 2})[\\Lambda]\n\\end{CD}\n$$\n\nIt should also be mentioned that **charpoly**\\*, taking the characteristic polynomials\n of the internal matrices, does *not* obey the same relationship.\nFor one, the type is wrong: the codomain is a matrix *containing* polynomials,\n rather than a polynomial over matrices.\n\nThere *does* happen to be an isomorphism between the two structures\n (a direction of which we'll discuss momentarily).\nBut even by converting to the proper type, we already have a counterexample in the constant term\n from taking **det**\\* earlier.\n\n::: {#80572089 .cell execution_count=8}\n``` {.haskell .cell-code code-fold=\"true\"}\nmarkdown $ \"$$\\\\begin{align*}\" ++ concat [\n \"\\\\text{charpoly}^*(B^*) &= \" ++\n texifyMatrix' ((\"\\\\text{charpoly}\" ++) . texifyMatrix' show) mBStar ++\n \"\\\\\\\\\",\n \"&= \" ++\n texifyMatrix' (texifyPoly' \"\\\\lambda\" show)\n (fmap (fmap (`mod` 2) . charpoly) mBStar) ++\n \"\\\\\\\\\",\n \"&\\\\cong \" ++\n -- Not constructing this by isomorphism yet\n texifyPoly' \"\\\\Lambda\" texifyMatrix\n (Poly [\n toMatrix [[0,1], [1,1]],\n toMatrix [[0,1], [1,1]],\n toMatrix [[1,1], [1,1]]\n ]) ++\n \"\\\\\\\\ \\\\\\\\\",\n \"&\\\\neq f^\\\\bullet(\\\\text{charpoly}(B))\"\n ] ++\n \"\\\\end{align*}$$\"\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{align*}\\text{charpoly}^*(B^*) &= \\left( \\begin{matrix}\\text{charpoly}\\left( \\begin{matrix}0 & 0 \\\\ 0 & 0\\end{matrix} \\right) & \\text{charpoly}\\left( \\begin{matrix}0 & 1 \\\\ 1 & 1\\end{matrix} \\right) \\\\ \\text{charpoly}\\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right) & \\text{charpoly}\\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right)\\end{matrix} \\right)\\\\&= \\left( \\begin{matrix}\\lambda^{2} & 1 + \\lambda + \\lambda^{2} \\\\ 1 + \\lambda + \\lambda^{2} & 1 + \\lambda + \\lambda^{2}\\end{matrix} \\right)\\\\&\\cong \\left( \\begin{matrix}0 & 1 \\\\ 1 & 1\\end{matrix} \\right) + \\left( \\begin{matrix}0 & 1 \\\\ 1 & 1\\end{matrix} \\right)\\Lambda + \\left( \\begin{matrix}1 & 1 \\\\ 1 & 1\\end{matrix} \\right)\\Lambda^{2}\\\\ \\\\&\\neq f^\\bullet(\\text{charpoly}(B))\\end{align*}$$\n:::\n:::\n\n\nForgetting\n----------\n\nClearly, layering matrices has several advantages over how we usually interpret block matrices.\nBut what happens if we *do* \"forget\" about the internal structure?\n\n::: {#fd9b9f12 .cell execution_count=9}\n``` {.haskell .cell-code code-fold=\"true\" code-summary=\"Haskell implementation of `forget`\"}\nimport Data.List (transpose)\n\n-- Massively complicated point-free way to forget double matrices:\n-- 1. Convert internal matrices to lists of lists\n-- 2. Convert the external matrix to a list of lists\n-- 3. There are now four layers of lists. Transpose the second and third.\n-- 4. Concat the new third and fourth layers together\n-- 5. 
Concat the first and second layers together\n-- 6. Convert the list of lists back to a matrix\nforget :: Matrix (Matrix a) -> Matrix a\nforget = toMatrix . concatMap (fmap concat . transpose) .\n fromMatrix . fmap fromMatrix\n\n-- To see why this is the structure, remember that we need to work with rows\n-- of the external matrix at the same time.\n-- We'd like to read across the whole row, but this involves descending into two matrices.\n-- The `fmap transpose` allows us to collect rows in the way we expect.\n-- For example, for the above matrix, We get `[[[0,0],[0,1]], [[0,0],[1,1]]]` after the transposition,\n-- which are the first two rows, grouped by the matrix they belonged to.\n-- Then, we can finally get the desired row by `fmap (fmap concat)`ing the rows together.\n-- Finally, we `concat` once more to undo the column grouping.\n\nmBHat = forget mBStar\n\nmarkdown $ \"$$\\\\begin{gather*}\" ++ concat [\n \"\\\\text{forget} : (\\\\mathbb{F}_2 {}^{2 \\\\times 2})^{2 \\\\times 2}\" ++\n \"\\\\longrightarrow \\\\mathbb{F}_2 {}^{4 \\\\times 4}\" ++\n \"\\\\\\\\[10pt]\",\n \"\\\\hat B = \\\\text{forget}(B^*) = \\\\text{forget}\" ++\n texifyMatrix' (texifyMatrix' show) mBStar ++ \" = \" ++\n texifyMatrix mBHat,\n \"\"\n ] ++\n \"\\\\end{gather*}$$\"\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{gather*}\\text{forget} : (\\mathbb{F}_2 {}^{2 \\times 2})^{2 \\times 2}\\longrightarrow \\mathbb{F}_2 {}^{4 \\times 4}\\\\[10pt]\\hat B = \\text{forget}(B^*) = \\text{forget}\\left( \\begin{matrix}\\left( \\begin{matrix}0 & 0 \\\\ 0 & 0\\end{matrix} \\right) & \\left( \\begin{matrix}0 & 1 \\\\ 1 & 1\\end{matrix} \\right) \\\\ \\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right) & \\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right)\\end{matrix} \\right) = \\left( \\begin{matrix}0 & 0 & 0 & 1 \\\\ 0 & 0 & 1 & 1 \\\\ 1 & 1 & 1 & 1 \\\\ 1 & 0 & 1 & 0\\end{matrix} \\right)\\end{gather*}$$\n:::\n:::\n\n\nLike *f*, `forget` preserves addition and multiplication, a fact already appreciated by block matrices.\nFurther, by *f*, the internal matrices multiply the same as elements of GF(4).\nHence, this shows us directly that GL(2, 4) is a subgroup of GL(4, 2).\n\nHowever, an obvious difference between layered and \"forgotten\" matrices is\n the determinant and characteristic polynomial:\n\n::: {#623d4a04 .cell execution_count=10}\n``` {.haskell .cell-code code-fold=\"true\"}\nmarkdown $ \"$$\\\\begin{align*}\" ++ intercalate \" \\\\\\\\ \\\\\\\\ \" (\n map (intercalate \" & \") [\n [\n \"\\\\det B^* &= \" ++\n texifyMatrix ((`mod` 2) <$> determinant mBStar),\n \"\\\\text{charpoly} B^* &= \" ++\n texifyPoly' \"\\\\Lambda\" texifyMatrix (fmap (`mod` 2) <$> charpoly mBStar)\n ], [\n \"\\\\det \\\\hat B &= \" ++\n show ((`mod` 2) $ determinant mBHat),\n \"\\\\text{charpoly} \\\\hat B &= \" ++\n texifyPoly' \"\\\\lambda\" show ((`mod` 2) <$> charpoly mBHat)\n ]\n ]) ++\n \"\\\\end{align*}$$\"\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{align*}\\det B^* &= \\left( \\begin{matrix}1 & 0 \\\\ 0 & 1\\end{matrix} \\right) & \\text{charpoly} B^* &= \\left( \\begin{matrix}1 & 0 \\\\ 0 & 1\\end{matrix} \\right) + \\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right)\\Lambda + \\Lambda^{2} \\\\ \\\\ \\det \\hat B &= 1 & \\text{charpoly} \\hat B &= 1 + \\lambda + \\lambda^{2} + \\lambda^{3} + \\lambda^{4}\\end{align*}$$\n:::\n:::\n\n\n### Another Forgotten Path\n\nIt's a relatively simple matter to move between determinants, 
since it's straightforward\n to identify 1 and the identity matrix.\nHowever, a natural question to ask is whether there's a way to reconcile or coerce\n the matrix polynomial into the \"forgotten\" one.\n\nFirst, let's formally establish a path from matrix polynomials to a matrix of polynomials.\nWe need only use our friend from the [second post](../2) -- polynomial evaluation.\nSimply evaluating a matrix polynomial *r* at *λI* converts our matrix indeterminate (*Λ*)\n into a scalar one (*λ*).\n\n$$\n\\begin{align*}\n \\text{eval}_{\\Lambda \\mapsto \\lambda I}\n &: (\\mathbb{F}_2 {}^{2 \\times 2})[\\Lambda]\n \\rightarrow (\\mathbb{F}_2[\\lambda]) {}^{2 \\times 2}\n \\\\\n &:: \\quad\n r(\\Lambda) \\mapsto r(\\lambda I)\n\\end{align*}\n$$\n\n::: {#8e5ab00c .cell execution_count=11}\n``` {.haskell .cell-code code-fold=\"true\"}\n-- Function following from the evaluation definition above\n-- Note that `Poly . pure` is used to transform matrices of `a`\n-- into matrices of polynomials.\n\ntoMatrixPolynomial :: (Eq a, Num a) =>\n Polynomial (Matrix a) -> Matrix (Polynomial a)\ntoMatrixPolynomial xs = evalPoly eyeLambda $ fmap (fmap (Poly . pure)) xs where\n -- First dimensions of the coefficients\n (is, _) = unzip $ map (snd . bounds . unMat) $ coeffs xs\n -- Properly-sized identity matrix times a scalar lambda\n eyeLambda = eye (1 + maximum is) * toMatrix [[Poly [0, 1]]]\n\n\nmarkdown $ \"$$\\\\begin{align*}\" ++\n \"\\\\text{eval}_{\\\\Lambda \\\\mapsto \\\\lambda I}(\\\\text{charpoly}(B^*)) &=\" ++\n texifyPoly' \"(\\\\lambda I)\" texifyMatrix\n (fmap (`mod` 2) <$> charpoly mBStar) ++\n \"\\\\\\\\ &= \" ++\n texifyMatrix' (texifyPoly' \"\\\\lambda\" show)\n (toMatrixPolynomial $ fmap (`mod` 2) <$> charpoly mBStar) ++\n \"\\\\end{align*}$$\"\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{align*}\\text{eval}_{\\Lambda \\mapsto \\lambda I}(\\text{charpoly}(B^*)) &=\\left( \\begin{matrix}1 & 0 \\\\ 0 & 1\\end{matrix} \\right) + \\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right)(\\lambda I) + (\\lambda I)^{2}\\\\ &= \\left( \\begin{matrix}1 + \\lambda + \\lambda^{2} & \\lambda \\\\ \\lambda & 1 + \\lambda^{2}\\end{matrix} \\right)\\end{align*}$$\n:::\n:::\n\n\nSince a matrix containing polynomials is still a matrix, we can then take its determinant.\nWhat pops out is exactly what we were after...\n\n::: {#a9a302d9 .cell execution_count=12}\n``` {.haskell .cell-code code-fold=\"true\"}\nmarkdown $ \"$$\\\\begin{align*}\" ++\n \"\\\\det(\\\\text{eval}_{\\\\Lambda \\\\mapsto \\\\lambda I}(\" ++\n \"\\\\text{charpoly}(B^*))) &=\" ++\n \"(1 + \\\\lambda + \\\\lambda^2)(1 + \\\\lambda^2) - \\\\lambda^2\" ++\n \"\\\\\\\\ &=\" ++\n texifyPoly' \"\\\\lambda\" show\n (fmap (`mod` 2) <$> determinant $ toMatrixPolynomial $ charpoly mBStar) ++\n \"\\\\\\\\ &= \\\\text{charpoly}{\\\\hat B}\" ++\n \"\\\\end{align*}$$\"\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{align*}\\det(\\text{eval}_{\\Lambda \\mapsto \\lambda I}(\\text{charpoly}(B^*))) &=(1 + \\lambda + \\lambda^2)(1 + \\lambda^2) - \\lambda^2\\\\ &=1 + \\lambda + \\lambda^{2} + \\lambda^{3} + \\lambda^{4}\\\\ &= \\text{charpoly}{\\hat B}\\end{align*}$$\n:::\n:::\n\n\n...and we can arrange our maps into another diagram:\n\n$$\n\\begin{gather*}\n \\begin{CD}\n (\\mathbb{F}_2 {}^{2 \\times 2})^{2 \\times 2}\n @>{\\text{charpoly}}>>\n (\\mathbb{F}_2 {}^{2 \\times 2})[\\Lambda]\n \\\\\n @V{\\text{id}}VV ~ @VV{\\text{eval}_{\\Lambda \\mapsto \\lambda I}}V\n \\\\\n -\n @. 
(\\mathbb{F}_2 [\\lambda])^{2 \\times 2}\n \\\\\n @V{\\text{forget}}VV ~ @VV{\\det}V\n \\\\\n \\mathbb{F}_2 {}^{4 \\times 4}\n @>>{\\text{charpoly}}>\n \\mathbb{F}_2[\\lambda]\n \\end{CD}\n \\\\ \\\\\n \\text{charpoly} \\circ \\text{forget}\n = \\det \\circ ~\\text{eval}_{\\Lambda \\mapsto \\lambda I} \\circ\\text{charpoly}\n\\end{gather*}\n$$\n\nIt should be noted that we do *not* get the same results by taking the determinant after\n applying **charpoly**\\*, indicating that the above method is \"correct\".\n\n::: {#c9d284c9 .cell execution_count=13}\n``` {.haskell .cell-code code-fold=\"true\"}\nmarkdown $ \"$$\\\\begin{align*}\" ++\n \"\\\\text{charpoly}^*(B^*) &=\" ++\n texifyMatrix' (texifyPoly' \"\\\\lambda\" show)\n (fmap (`mod` 2) <$> fmap charpoly mBStar) ++\n \"\\\\\\\\ \\\\\\\\\" ++\n \"\\\\det(\\\\text{charpoly}^*(B^*)) &=\" ++\n \"\\\\lambda^2(1 + \\\\lambda + \\\\lambda^2) - (1 + \\\\lambda + \\\\lambda^2)^2\" ++\n \"\\\\\\\\ &= \" ++\n texifyPoly' \"\\\\lambda\" show\n (fmap (`mod` 2) <$> determinant $ fmap charpoly mBStar) ++\n \"\\\\end{align*}$$\"\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{align*}\\text{charpoly}^*(B^*) &=\\left( \\begin{matrix}\\lambda^{2} & 1 + \\lambda + \\lambda^{2} \\\\ 1 + \\lambda + \\lambda^{2} & 1 + \\lambda + \\lambda^{2}\\end{matrix} \\right)\\\\ \\\\\\det(\\text{charpoly}^*(B^*)) &=\\lambda^2(1 + \\lambda + \\lambda^2) - (1 + \\lambda + \\lambda^2)^2\\\\ &= 1 + \\lambda^{3}\\end{align*}$$\n:::\n:::\n\n\n### Cycles and Cycles\n\nSince we can get $\\lambda^4 + \\lambda^3 + \\lambda^2 + \\lambda + 1$ in two ways,\n it's natural to assume this polynomial is significant in some way.\nIn the language of the the second post, the polynomial can also be written as ~2~31,\n whose root we determined was cyclic of order 5.\nThis happens to match the order of *B* in GL(2, 4).\n\nPerhaps this is unsurprising, since there are only so many polynomials of degree 4 over GF(2).\nHowever, the reason we see it is more obvious if we look at the powers of scalar multiples of *B*.\nFirst, recall that *f*\\* takes us from a matrix over GF(4) to a matrix of matrices of GF(2).\nThen define a map *g* that gives us degree 4 polynomials:\n\n$$\n\\begin{gather*}\n g : \\mathbb{F}_4^{2 \\times 2} \\rightarrow \\mathbb{F}_2[\\lambda]\n \\\\\n g = \\text{charpoly} \\circ \\text{forget} \\circ f^*\n\\end{gather*}\n$$\n\n::: {#5e05ff31 .cell layout-ncol='3' execution_count=14}\n``` {.haskell .cell-code code-fold=\"true\"}\ng = fmap (`mod` 2) . charpoly . forget . 
fStar\n\nshowSeries varName var = \"$$\\\\begin{array}{}\" ++\n \" & \\\\scriptsize \" ++\n texifyMatrix var ++\n \"\\\\\\\\\" ++\n intercalate \" \\\\\\\\ \" [\n (if n == 1 then varName' else varName' ++ \"^{\" ++ show n ++ \"}\") ++\n \"& \\\\overset{g}{\\\\mapsto} &\" ++\n texPolyAsPositional' \"\\\\lambda\" (g $ var^n)\n | n <- [1..5]\n ] ++\n \"\\\\end{array}$$\" where\n varName' = if length varName == 1 then varName else \"(\" ++ varName ++ \")\"\n\nmarkdown $ showSeries \"B\" mBOrig\nmarkdown $ showSeries \"αB\" (fmap (AlphaF4*) mBOrig)\nmarkdown $ showSeries \"α^2 B\" (fmap (Alpha2F4*) mBOrig)\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{array}{} & \\scriptsize \\left( \\begin{matrix}0 & α \\\\ α^2 & α^2\\end{matrix} \\right)\\\\B& \\overset{g}{\\mapsto} &11111_{\\lambda} \\\\ B^{2}& \\overset{g}{\\mapsto} &11111_{\\lambda} \\\\ B^{3}& \\overset{g}{\\mapsto} &11111_{\\lambda} \\\\ B^{4}& \\overset{g}{\\mapsto} &11111_{\\lambda} \\\\ B^{5}& \\overset{g}{\\mapsto} &10001_{\\lambda}\\end{array}$$\n:::\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{array}{} & \\scriptsize \\left( \\begin{matrix}0 & α^2 \\\\ 1 & 1\\end{matrix} \\right)\\\\(αB)& \\overset{g}{\\mapsto} &10011_{\\lambda} \\\\ (αB)^{2}& \\overset{g}{\\mapsto} &10011_{\\lambda} \\\\ (αB)^{3}& \\overset{g}{\\mapsto} &11111_{\\lambda} \\\\ (αB)^{4}& \\overset{g}{\\mapsto} &10011_{\\lambda} \\\\ (αB)^{5}& \\overset{g}{\\mapsto} &10101_{\\lambda}\\end{array}$$\n:::\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{array}{} & \\scriptsize \\left( \\begin{matrix}0 & 1 \\\\ α & α\\end{matrix} \\right)\\\\(α^2 B)& \\overset{g}{\\mapsto} &11001_{\\lambda} \\\\ (α^2 B)^{2}& \\overset{g}{\\mapsto} &11001_{\\lambda} \\\\ (α^2 B)^{3}& \\overset{g}{\\mapsto} &11111_{\\lambda} \\\\ (α^2 B)^{4}& \\overset{g}{\\mapsto} &11001_{\\lambda} \\\\ (α^2 B)^{5}& \\overset{g}{\\mapsto} &10101_{\\lambda}\\end{array}$$\n:::\n:::\n\n\nThe matrices in the middle and rightmost columns both have order 15 inside GL(2, 4).\nCorrespondingly, both 10011~λ~ = ~2~19 and 11001~λ~ = ~2~25 are primitive,\n and so have roots of order 15 over GF(2).\n\n\n### A Field?\n\nSince we have 15 matrices generated by the powers of one, you might wonder whether or not\n they can correspond to the nonzero elements of GF(16).\nAnd they can!\nIn a sense, we've \"borrowed\" the order 15 elements from this \"field\" within GL(4, 2).\nHowever, none of the powers of this matrix are the companion matrix of either ~2~19 or ~2~25.\n\n
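That last claim is easy to spot-check.\nThe snippet below is only a minimal sketch (left unexecuted): it reuses `forget`, `fStar`, `companion`, and `Poly` from earlier, the names `mABHat`, `powersABHat`, and `cComps` are new throwaways, and it assumes ~2~19 and ~2~25 unpack to the ascending coefficient lists `[1,1,0,0,1]` and `[1,0,0,1,1]`.\n\n``` {.haskell}\n-- Sketch: no power of the \"forgotten\" αB should equal the companion matrix\n-- of either primitive quartic, x^4 + x + 1 (~2~19) or x^4 + x^3 + 1 (~2~25)\nmABHat = forget $ fStar $ fmap (AlphaF4*) mBOrig\n\n-- The fifteen powers, reduced mod 2\npowersABHat = map (fmap (`mod` 2) . (mABHat^)) [1..15]\n\n-- Companion matrices of the two primitive quartics\ncComps = map (companion . Poly) [[1, 1, 0, 0, 1], [1, 0, 0, 1, 1]]\n\nprint $ any (`elem` powersABHat) cComps   -- expected: False\n```\n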
\n\nA short Haskell demonstration of the field-like behavior of these matrices follows.\nAll we really need to do is test additive closure, since the powers trivially commute and include the identity matrix.\n\n::: {#027fb2e4 .cell execution_count=15}\n``` {.haskell .cell-code}\n-- Check whether n x n matrices (mod p) have additive closure\n-- Supplement the additive identity (the zero matrix), even if it is not already present\nhasAdditiveClosure :: Integral a => Int -> a -> [Matrix a] -> Bool\nhasAdditiveClosure n p xs = all (`elem` xs') sums where\n -- Add in the zero matrix\n xs' = zero n:xs\n -- Calculate all possible sums of pairs (mod p)\n sums = map (fmap (`mod` p)) $ (+) <$> xs' <*> xs'\n\n-- Generate the powers of x, then test if they form a field (mod p)\ngeneratesField :: Integral a => Int -> a -> Matrix a -> Bool\ngeneratesField n p x = hasAdditiveClosure n p xs where\n xs = map (fmap (`mod` p) . (x^)) [1..p^n-1]\n\n\nprint $ generatesField 4 2 $ forget $ fStar $ fmap (AlphaF4*) mBOrig\n```\n\n::: {.cell-output .cell-output-display}\n```\nTrue\n```\n:::\n:::\n\n\n
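As a quick usage note (again just a sketch, left unexecuted, using only definitions already introduced): the same test separates the order-15 scalar multiples from *B* itself, whose five powers cannot sweep out all fifteen nonzero elements.\n\n``` {.haskell}\n-- α^2 B also has order 15, so its powers should again fill out a field\nprint $ generatesField 4 2 $ forget $ fStar $ fmap (Alpha2F4*) mBOrig\n-- expected: True\n\n-- B only has order 5, so additive closure should fail\nprint $ generatesField 4 2 mBHat\n-- expected: False\n```\n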
\n\nMore directly, we might also observe that *α*^2^*B* is the companion matrix of\n an irreducible polynomial over GF(4), namely $q(x) = x^2 - \\alpha x - \\alpha$.\n\nBoth the \"forgotten\" matrices and the aforementioned companion matrices lie within GL(4, 2).\nA natural question to ask is whether we can make fields by the following process:\n\n1. Filter out all order-15 elements of GL(4, 2)\n2. Partition the elements and their powers into their respective order-15 subgroups\n3. Add the zero matrix into each class\n4. Check whether all classes are additively closed (and are therefore fields)\n\nIn this case, it happens to be true, but proving this in general is difficult, and I haven't done so.\n\n\nExpanding Dimensions\n--------------------\n\nOf course, we need not only focus on GF(4) -- we can just as easily work over GL(2, 2^*r*^) for *r* other than 2.\nIn this case, the internal matrices will be *r*×*r* while the external one remains 2×2.\nBut neither do we have to work exclusively with 2×2 matrices -- we can work over GL(*n*, 2^*r*^).\nIn either circumstance, the \"borrowing\" of elements of larger order still occurs.\nThis is summarized by the following diagram:\n\n$$\n\\begin{CD}\n \\underset{\n \\scriptsize S \\text{ (order $k$)}\n }{\n \\text{SL}(n,2^r)\n }\n @>>>\n \\underset{\n \\scriptsize\n \\begin{matrix}\n S \\text{ (order $k$)} \\\\\n T \\text{ (order $2^{nr}-1$)}\n \\end{matrix}\n }{\n \\text{GL}(n, 2^r)\n }\n @>{\\text{forget} \\circ f_{r}^*}>>\n {\\text{GL}(nr, 2)}\n @<{f_{nr}}<<\n \\underset{\n \\scriptsize\n \\begin{matrix}\n s \\text{ (order $k$)} \\\\\n t \\text{ (order $2^{nr}-1$)}\n \\end{matrix}\n }{\n \\mathbb{F}_{2^{nr}}\n }\n\\end{CD}\n$$\n\nHere, *f*~*r*~ is our map from GF(2^*r*^) to *r*×*r* matrices and *f*~*nr*~ is a similar map.\n*r* must be greater than 1 for us to properly make use of matrix arithmetic.\nSimilarly, *n* must be greater than 1 for the leftmost GL.\nThus, *nr* is a composite number.\nHere, *k* is a proper factor of 2^*nr*^ - 1.\nIn the prior discussion, *k* was 5 and 2^*nr*^ - 1 was 15.\n\nRecall that primitive polynomials over GF(2^*nr*^) have roots with order 2^*nr*^ - 1.\nThis number can *never* be prime, since the only primes of the form\n 2^*p*^ - 1 are Mersenne primes -- *p* itself must be prime.\nThus, in GL of prime dimensions, we can never loan to a GL over a field\n of larger order with the same characteristic.\nConversely, GL(*nr* + 1, 2) trivially contains GL(*nr*, 2) by fixing a subspace.\nSo we do eventually see elements of order 2^*m*^ - 1 for either prime or composite *m*.\n\n\n### Other Primes\n\nThis concern about prime dimensions is unique to characteristic 2.\nFor any other prime *p*, *p*^*m*^ - 1 is composite since it is at the very least even.\nAll other remarks about the above diagram should still hold for any other prime *p*.\n\nIn addition, the diagram where we found a correspondence between the orders of elements in\n GL(2, 2^2^) and GF(2^2×2^) via the characteristic polynomial also generalizes.\nThough I have not proven it, I strongly suspect the following diagram commutes,\n at least in the case where *K* is a finite field:\n\n$$\n\\begin{CD}\n (K^{r \\times r})^{n \\times n}\n @>{\\text{charpoly}}>>\n (K^{r \\times r})[\\Lambda]\n \\\\\n @V{\\text{id}}VV ~ @VV{\\text{eval}_{\\Lambda \\mapsto \\lambda I}}V\n \\\\\n -\n @. 
(K [\\lambda])^{r \\times r}\n \\\\\n @V{\\text{forget}}VV ~ @VV{\\det}V\n \\\\\n K^{nr \\times nr}\n @>>{\\text{charpoly}}>\n K[\\lambda]\n\\end{CD}\n$$\n\nOver larger primes, the gap between GL and SL may grow ever larger,\n but SL over a prime power field seems to inject into SL over a prime field.\nIf the above diagram is true, then the prior statement follows.\n\n\n### Monadicity and Injections\n\nThe action of forgetting the internal structure may sound somewhat familiar if you know your Haskell.\nRemember that for lists, we can do something similar\n -- converting `[[1,2,3],[4,5,6]]` to `[1,2,3,4,5,6]` is just a matter of applying `concat`.\nThis is an instance in which we know lists to behave like a [monad](https://wiki.haskell.org/Monad).\nDespite being an indecipherable bit of jargon to newcomers, it just means we:\n\n1. can apply functions inside the structure (for example, to the elements of a list),\n2. have a sensible injection into the structure (creating singleton lists, called `return`), and\n3. can reduce two layers to one (`concat`, or `join` for monads in general).\n - Monads are traditionally defined using the operator `>>=`, but `join = (>>= id)`\n\nJust comparing the types of `join :: Monad m => m (m a) -> m a`\n and `forget :: Matrix (Matrix a) -> Matrix a` suggests that `Matrix` (meaning square matrices)\n could be a monad, and further, one which respects addition and multiplication.\nOf course, **this is only true when our internal matrices are all the same size**.\nIn the above diagrams, this restriction has applied, but should be stated explicitly\n since no dimension is specified by `Matrix a`.\n\nCondition 2 gives us some trouble, though.\nFor one, only \"numbers\" (elements of a ring) can go inside matrices, which restricts\n where monadicity can hold.\nMore importantly, we have a *lot* of freedom in what dimension we choose to inject into.\nFor example, we might pick a `return` that uses 1×1 matrices (which add no additional structure).\nWe might also pick `return2`, which scalar-multiplies its argument to a 2×2 identity matrix instead.\n\nUnfortunately, there's no good answer.\nAt the very least, we can close our eyes and pretend that we have a nice diagram:\n\n$$\n\\begin{gather*}\n \\begin{matrix}\n & L\\underset{\\text{degree } r}{/} K\n \\\\ \\\\\n \\small f\n & \\begin{matrix} | \\\\ \\downarrow \\end{matrix}\n \\\\ \\\\\n & K^{r \\times r}\n \\end{matrix}\n & \\quad & \\quad\n & \\begin{matrix}\n & (L\\underset{\\text{degree } r}{/} K)^{n \\times n}\n \\\\ \\\\\n \\small f^* &\n \\begin{matrix} | \\\\ \\downarrow \\end{matrix}\n & \\searrow & \\small \\texttt{>>=} ~ f \\qquad\n \\\\ \\\\\n & (K^{r \\times r})^{n \\times n}\n & \\underset{\\text{forget}} {\\longrightarrow}\n & K {}^{nr \\times nr}\n \\end{matrix}\n\\end{gather*}\n$$\n\nAs one last note on the monadicity of matrices, I *have* played around with an alternative `Matrix`\n type which includes scalars alongside proper matrices, which would allow for\n a simple canonical injection.\nUnfortunately, it complicates `join` -- we just place the responsibility of sizing the internal matrices\n front-and-center since we can correspond internal scalars with identity matrices.\n\n\nClosing\n-------\n\nAt this point, I've gone on far too long about algebra.\nOne nagging curiosity makes me wonder whether the there are any diagrams like the following:\n\n$$\n\\begin{matrix}\n & (L\\underset{\\text{degree } r}{/} K)^{n \\times n}\n & & & & (L\\underset{\\text{degree } n}{/} K)^{r \\times r}\n \\\\ \\\\\n 
\\small f_1^*\n & \\begin{matrix} | \\\\ \\downarrow \\end{matrix}\n & \\searrow & & \\swarrow\n & \\begin{matrix} | \\\\ \\downarrow \\end{matrix}\n & \\small f_2^*\n \\\\ \\\\\n & (K^{r \\times r})^{n \\times n}\n & \\underset{\\text{forget}} {\\longrightarrow}\n & K {}^{nr \\times nr}\n & \\underset{\\text{forget}}{\\longleftarrow}\n & (K^{n \\times n})^{r \\times r}\n\\end{matrix}\n$$\n\nOr in English, whether \"rebracketing\" certain *nr* × *nr* matrices can be traced back to\n not only a degree *r* field extension, but also one of degree *n*.\n\nThe mathematician in me tells me to believe in well-defined structures.\nMatrices are one such structure, with myriad applications.\nHowever, the computer scientist in me laments that the application of these structures is\n buried in symbols and that layering them is at most glossed over.\nThere is clear utility and interest in doing so, otherwise the diagrams shown above would not exist.\n\nOf course, there's plenty of reason *not* to go down this route.\nFor one, it's plainly inefficient -- GPUs are *built* on matrix operations being\n as efficient as possible, i.e., without the layering.\nIt's also inefficient to learn for people *just* learning matrices.\nI'd still argue that the method is useful for learning about more complex topics, like field extensions.\n\n", + "markdown": "---\ntitle: \"Exploring Finite Fields, Part 4: The Power of Forgetting\"\ndescription: |\n Or: how I stopped learned to worrying and appreciate the Monad.\nformat:\n html:\n html-math-method: katex\ndate: \"2024-02-20\"\ndate-modified: \"2025-08-05\"\ncategories:\n - algebra\n - finite field\n - haskell\n---\n\n\n\nThe [last post](../3) in this series focused on understanding some small linear groups\n and implementing them on the computer over both a prime field and prime power field.\n\nThe prime power case was particularly interesting.\nFirst, we adjoined the roots of a polynomial to the base field, GF(2).\nRather than the traditional means of adding new symbols like *α*, we used companion matrices,\n which behave the same arithmetically.\nFor example, for the smallest prime power field, GF(4), we use the polynomial $p(x) = x^2 + x + 1$,\n and map its symbolic roots (*α* and *α*^2^), to matrices over GF(2):\n\n$$\n\\begin{gather*}\n f : \\mathbb{F}_4 \\longrightarrow \\mathbb{F}_2 {}^{2 \\times 2}\n \\\\ \\\\\n \\begin{gather*}\n f(0) = {\\bf 0} =\n \\left(\\begin{matrix} 0 & 0 \\\\ 0 & 0 \\end{matrix}\\right)\n & f(1) = I\n = \\left(\\begin{matrix} 1 & 0 \\\\ 0 & 1 \\end{matrix}\\right)\n \\\\\n f(\\alpha) = C_p\n = \\left(\\begin{matrix} 0 & 1 \\\\ 1 & 1 \\end{matrix}\\right)\n & f(\\alpha^2) = C_p {}^2\n = \\left(\\begin{matrix} 1 & 1 \\\\ 1 & 0 \\end{matrix}\\right)\n \\end{gather*}\n \\\\ \\\\\n f(a + b)= f(a) + f(b), \\quad f(ab) = f(a)f(b)\n\\end{gather*}\n$$\n\n::: {#1249570e .cell execution_count=3}\n``` {.haskell .cell-code code-fold=\"true\" code-summary=\"Equivalent Haskell\"}\ndata F4 = ZeroF4 | OneF4 | AlphaF4 | Alpha2F4 deriving Eq\nfield4 = [ZeroF4, OneF4, AlphaF4, Alpha2F4]\n\ninstance Show F4 where\n show ZeroF4 = \"0\"\n show OneF4 = \"1\"\n show AlphaF4 = \"α\"\n show Alpha2F4 = \"α^2\"\n\n-- Addition and multiplication over F4\ninstance Num F4 where\n (+) ZeroF4 x = x\n (+) OneF4 AlphaF4 = Alpha2F4\n (+) OneF4 Alpha2F4 = AlphaF4\n (+) AlphaF4 Alpha2F4 = OneF4\n (+) x y = if x == y then ZeroF4 else y + x\n\n (*) ZeroF4 x = ZeroF4\n (*) x ZeroF4 = ZeroF4\n (*) OneF4 x = x\n (*) AlphaF4 AlphaF4 = Alpha2F4\n (*) Alpha2F4 Alpha2F4 = AlphaF4\n 
(*) AlphaF4 Alpha2F4 = OneF4\n (*) x y = y * x\n\n abs = id\n negate = id\n signum = id\n fromInteger = (cycle field4 !!) . fromInteger\n\n\n-- Companion matrix of `p`, an irreducible polynomial of degree 2 over GF(2)\ncP :: (Num a, Eq a, Integral a) => Matrix a\ncP = companion $ Poly [1, 1, 1]\n\nf ZeroF4 = zero 2\nf OneF4 = eye 2\nf AlphaF4 = cP\nf Alpha2F4 = (`mod` 2) <$> cP |*| cP\n\nfield4M = map f field4\n```\n:::\n\n\nFinally, we constructed GL(2, 4) using matrices of matrices\n -- not [block matrices](https://en.wikipedia.org/wiki/Block_matrix)!\nThis post will focus on studying this method in slightly more detail.\n\n\nReframing the Path Until Now\n----------------------------\n\nIn the above description, we already mentioned larger structures over GF(2),\n namely polynomials and matrices.\nSince GF(4) can itself be described with matrices over GF(2),\n we can generalize *f* to give us two more maps:\n\n- $f^*$, which converts matrices over GF(4) to double-layered matrices over GF(2), and\n- $f^\\bullet$, which converts polynomials over GF(4) to polynomials of matrices over GF(2)\n\n\n### Matrix Map\n\nWe examined the former map briefly in the previous post.\nMore explicitly, we looked at a matrix *B* in SL(2, 4) which had the property\n that it was cyclic of order five.\nThen, to work with it without relying on symbols, we simply applied *f* over the contents of the matrix.\n\n::: {#0554e224 .cell execution_count=4}\n``` {.haskell .cell-code code-fold=\"true\"}\n-- Starred maps are instances of fmap composed with modding out\n-- by the characteristic\n\nfStar :: (Eq a, Num a, Integral a) => Matrix F4 -> Matrix (Matrix a)\nfStar = fmap (fmap (`mod` 2) . f)\n\nmBOrig = toMatrix [[ZeroF4, AlphaF4], [Alpha2F4, Alpha2F4]]\nmBStar = fStar mBOrig\n\nmarkdown $ \"$$\\\\begin{gather*}\" ++ concat [\n -- First row, type of fStar\n \"f^* : \\\\mathbb{F}_4 {}^{2 \\\\times 2}\" ++\n \"\\\\longrightarrow\" ++\n \"(\\\\mathbb{F}_2 {}^{2 \\\\times 2})^{2 \\\\times 2}\" ++\n \"\\\\\\\\[10pt]\",\n -- Second row, B\n \"B = \" ++ texifyMatrix' show mBOrig ++\n \"\\\\\\\\\",\n -- Third row, B*\n \"B^* = f^*(B) = \" ++\n texifyMatrix' (\\x -> \"f(\" ++ show x ++ \")\") mBOrig ++ \" = \" ++\n texifyMatrix' (texifyMatrix' show) mBStar\n ] ++\n \"\\\\end{gather*}$$\"\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{gather*}f^* : \\mathbb{F}_4 {}^{2 \\times 2}\\longrightarrow(\\mathbb{F}_2 {}^{2 \\times 2})^{2 \\times 2}\\\\[10pt]B = \\left( \\begin{matrix}0 & α \\\\ α^2 & α^2\\end{matrix} \\right)\\\\B^* = f^*(B) = \\left( \\begin{matrix}f(0) & f(α) \\\\ f(α^2) & f(α^2)\\end{matrix} \\right) = \\left( \\begin{matrix}\\left( \\begin{matrix}0 & 0 \\\\ 0 & 0\\end{matrix} \\right) & \\left( \\begin{matrix}0 & 1 \\\\ 1 & 1\\end{matrix} \\right) \\\\ \\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right) & \\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right)\\end{matrix} \\right)\\end{gather*}$$\n:::\n:::\n\n\nWe can do this because a matrix contains values in the domain of *f*, thus uniquely determining\n a way to change the internal structure (what Haskell calls\n a [functor](https://wiki.haskell.org/Functor)).\nFurthermore, due to the properties of *f*, it and *f*\\* commute with the determinant,\n as shown by the following diagram:\n\n$$\n\\begin{gather*}\n f(\\det(B)) = f(1) = I =\\det(B^*)= \\det(f^*(B))\n \\\\[10pt]\n \\begin{CD}\n \\mathbb{F}_4 {}^{2 \\times 2}\n @>{\\det}>>\n \\mathbb{F}_4\n \\\\\n @V{f^*}VV ~ @VV{f}V\n \\\\\n (\\mathbb{F}_2 {}^{2 
\\times 2})^{2 \\times 2}\n @>>{\\det}>\n \\mathbb{F}_2 {}^{2 \\times 2}\n \\end{CD}\n\\end{gather*}\n$$\n\nIt should be noted that the determinant strips off the *outer* matrix.\nWe could also consider the map **det**\\* , where we apply the determinant\n to the internal matrices (in Haskell terms, `fmap determinant`).\nThis map isn't as nice though, since:\n\n::: {#f2977c19 .cell execution_count=5}\n``` {.haskell .cell-code code-fold=\"true\"}\nmarkdown $ \"$$\\\\begin{align*}\" ++ concat [\n -- First row, det* of B\n \"\\\\det {}^*(B^*) &= \" ++\n texifyMatrix' ((\"\\\\det\" ++) . texifyMatrix' show) mBStar ++ \" = \" ++\n texifyMatrix ((`mod` 2) . determinant <$> mBStar) ++\n \"\\\\\\\\ \\\\\\\\\",\n -- Second row, determinant of B*\n -- Note how the commutation between `determinant` and <$> fails\n \"&\\\\neq\" ++\n texifyMatrix ((`mod` 2) <$> determinant mBStar) ++ \" = \" ++\n \"\\\\det(B^*)\",\n \"\"\n ] ++\n \"\\\\end{align*}$$\"\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{align*}\\det {}^*(B^*) &= \\left( \\begin{matrix}\\det\\left( \\begin{matrix}0 & 0 \\\\ 0 & 0\\end{matrix} \\right) & \\det\\left( \\begin{matrix}0 & 1 \\\\ 1 & 1\\end{matrix} \\right) \\\\ \\det\\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right) & \\det\\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right)\\end{matrix} \\right) = \\left( \\begin{matrix}0 & 1 \\\\ 1 & 1\\end{matrix} \\right)\\\\ \\\\&\\neq\\left( \\begin{matrix}1 & 0 \\\\ 0 & 1\\end{matrix} \\right) = \\det(B^*)\\end{align*}$$\n:::\n:::\n\n\n### Polynomial Map\n\nMuch like how we can change the internal structure of matrices, we can do the same for polynomials.\nFor the purposes of demonstration, we'll work with $b = \\lambda^2 + \\alpha^2 \\lambda + 1$,\n the characteristic polynomial of *B*, since it has coefficients in the domain of *f*.\nWe define the extended map $f^\\bullet$ as:\n\n::: {#58a36854 .cell execution_count=6}\n``` {.haskell .cell-code code-fold=\"true\"}\n-- Bulleted maps are also just instances of fmap, like the starred maps\n\nfBullet :: (Eq a, Num a, Integral a) => Polynomial F4 -> Polynomial (Matrix a)\nfBullet = fmap (fmap (`mod` 2) . 
f)\n```\n:::\n\n\n$$\n\\begin{gather*}\n f^{\\bullet} : \\mathbb{F}_4[\\lambda] \\longrightarrow\n \\mathbb{F}_2 {}^{2 \\times 2}[\\Lambda]\n \\\\\n f^{\\bullet} (\\lambda) = \\Lambda \\qquad\n f^{\\bullet}(a) = f(a), \\quad a \\in \\mathbb{F}_4\n \\\\ \\\\\n \\begin{align*}\n b^{\\bullet}\n = f^{\\bullet}(b)\n &= f^{\\bullet}(\\lambda^2)\n &&+&& f^{\\bullet}(\\alpha^2)f^{\\bullet}(\\lambda)\n &&+&& f^{\\bullet}(1)\n \\\\\n &= \\Lambda^2\n &&+&& \\left(\\begin{matrix} 1 & 1 \\\\ 1 & 0\\end{matrix}\\right) \\Lambda\n &&+&& \\left(\\begin{matrix} 1 & 0 \\\\ 0 & 1 \\end{matrix}\\right)\n \\end{align*}\n\\end{gather*}\n$$\n\nSince we're looking at the characteristic polynomial of *B*, we might as well also look\n at the characteristic polynomial of *B*\\*, its image under $f^*$.\nWe already looked at the determinant of this matrix, which is the constant term\n of the characteristic polynomial (up to sign).\nTherefore, it's probably not surprising that $f^\\bullet$ and the characteristic polynomial commute\n in a similar fashion to the determinant.\n\n::: {#9126ada7 .cell execution_count=7}\n``` {.haskell .cell-code code-fold=\"true\"}\nbStar = fmap (fmap (`mod` 2)) $ charpoly $ fStar mBOrig\nbBullet = fmap (fmap (`mod` 2)) $ fBullet $ charpoly mBOrig\n\nif bStar /= bBullet then\n markdown \"$b^\\\\star$ and $b^\\\\bullet$ are not equal!\"\n else\n markdown $ \"$$\\\\begin{align*}\" ++ concat [\n \"b^* &= \\\\text{charpoly}(f^*(B)) = \\\\text{charpoly} \" ++\n texifyMatrix' (texifyMatrix' show) mBStar ++\n \"\\\\\\\\\",\n \"&= \" ++\n texifyPoly' \"\\\\Lambda\" (texifyMatrix' show) bStar ++ \" = \" ++\n \"f^\\\\bullet(\\\\text{charpoly}(B)) = b^\\\\bullet\",\n \"\"\n ] ++\n \"\\\\end{align*}$$\"\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{align*}b^* &= \\text{charpoly}(f^*(B)) = \\text{charpoly} \\left( \\begin{matrix}\\left( \\begin{matrix}0 & 0 \\\\ 0 & 0\\end{matrix} \\right) & \\left( \\begin{matrix}0 & 1 \\\\ 1 & 1\\end{matrix} \\right) \\\\ \\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right) & \\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right)\\end{matrix} \\right)\\\\&= \\left( \\begin{matrix}1 & 0 \\\\ 0 & 1\\end{matrix} \\right) + \\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right)\\Lambda + \\Lambda^{2} = f^\\bullet(\\text{charpoly}(B)) = b^\\bullet\\end{align*}$$\n:::\n:::\n\n\n$$\n\\begin{CD}\n \\mathbb{F}_4 {}^{2 \\times 2}\n @>{\\text{charpoly}}>>\n \\mathbb{F}_4[\\lambda]\n \\\\\n @V{f^*}VV ~ @VV{f^\\bullet}V\n \\\\\n (\\mathbb{F}_2 {}^{2 \\times 2})^{2 \\times 2}\n @>>{\\text{charpoly}}>\n (\\mathbb{F}_2 {}^{2 \\times 2})[\\Lambda]\n\\end{CD}\n$$\n\nIt should also be mentioned that **charpoly**\\*, taking the characteristic polynomials\n of the internal matrices, does *not* obey the same relationship.\nFor one, the type is wrong: the codomain is a matrix *containing* polynomials,\n rather than a polynomial over matrices.\n\nThere *does* happen to be an isomorphism between the two structures\n (a direction of which we'll discuss momentarily).\nBut even by converting to the proper type, we already have a counterexample in the constant term\n from taking **det**\\* earlier.\n\n::: {#80572089 .cell execution_count=8}\n``` {.haskell .cell-code code-fold=\"true\"}\nmarkdown $ \"$$\\\\begin{align*}\" ++ concat [\n \"\\\\text{charpoly}^*(B^*) &= \" ++\n texifyMatrix' ((\"\\\\text{charpoly}\" ++) . 
texifyMatrix' show) mBStar ++\n \"\\\\\\\\\",\n \"&= \" ++\n texifyMatrix' (texifyPoly' \"\\\\lambda\" show)\n (fmap (fmap (`mod` 2) . charpoly) mBStar) ++\n \"\\\\\\\\\",\n \"&\\\\cong \" ++\n -- Not constructing this by isomorphism yet\n texifyPoly' \"\\\\Lambda\" texifyMatrix\n (Poly [\n toMatrix [[0,1], [1,1]],\n toMatrix [[0,1], [1,1]],\n toMatrix [[1,1], [1,1]]\n ]) ++\n \"\\\\\\\\ \\\\\\\\\",\n \"&\\\\neq f^\\\\bullet(\\\\text{charpoly}(B))\"\n ] ++\n \"\\\\end{align*}$$\"\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{align*}\\text{charpoly}^*(B^*) &= \\left( \\begin{matrix}\\text{charpoly}\\left( \\begin{matrix}0 & 0 \\\\ 0 & 0\\end{matrix} \\right) & \\text{charpoly}\\left( \\begin{matrix}0 & 1 \\\\ 1 & 1\\end{matrix} \\right) \\\\ \\text{charpoly}\\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right) & \\text{charpoly}\\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right)\\end{matrix} \\right)\\\\&= \\left( \\begin{matrix}\\lambda^{2} & 1 + \\lambda + \\lambda^{2} \\\\ 1 + \\lambda + \\lambda^{2} & 1 + \\lambda + \\lambda^{2}\\end{matrix} \\right)\\\\&\\cong \\left( \\begin{matrix}0 & 1 \\\\ 1 & 1\\end{matrix} \\right) + \\left( \\begin{matrix}0 & 1 \\\\ 1 & 1\\end{matrix} \\right)\\Lambda + \\left( \\begin{matrix}1 & 1 \\\\ 1 & 1\\end{matrix} \\right)\\Lambda^{2}\\\\ \\\\&\\neq f^\\bullet(\\text{charpoly}(B))\\end{align*}$$\n:::\n:::\n\n\nForgetting\n----------\n\nClearly, layering matrices has several advantages over how we usually interpret block matrices.\nBut what happens if we *do* \"forget\" about the internal structure?\n\n::: {#fd9b9f12 .cell execution_count=9}\n``` {.haskell .cell-code code-fold=\"true\" code-summary=\"Haskell implementation of `forget`\"}\nimport Data.List (transpose)\n\n-- Massively complicated point-free way to forget double matrices:\n-- 1. Convert internal matrices to lists of lists\n-- 2. Convert the external matrix to a list of lists\n-- 3. There are now four layers of lists. Transpose the second and third.\n-- 4. Concat the new third and fourth layers together\n-- 5. Concat the first and second layers together\n-- 6. Convert the list of lists back to a matrix\nforget :: Matrix (Matrix a) -> Matrix a\nforget = toMatrix . concatMap (fmap concat . transpose) .\n fromMatrix . 
fmap fromMatrix\n\n-- To see why this is the structure, remember that we need to work with rows\n-- of the external matrix at the same time.\n-- We'd like to read across the whole row, but this involves descending into two matrices.\n-- The `fmap transpose` allows us to collect rows in the way we expect.\n-- For example, for the above matrix, We get `[[[0,0],[0,1]], [[0,0],[1,1]]]` after the transposition,\n-- which are the first two rows, grouped by the matrix they belonged to.\n-- Then, we can finally get the desired row by `fmap (fmap concat)`ing the rows together.\n-- Finally, we `concat` once more to undo the column grouping.\n\nmBHat = forget mBStar\n\nmarkdown $ \"$$\\\\begin{gather*}\" ++ concat [\n \"\\\\text{forget} : (\\\\mathbb{F}_2 {}^{2 \\\\times 2})^{2 \\\\times 2}\" ++\n \"\\\\longrightarrow \\\\mathbb{F}_2 {}^{4 \\\\times 4}\" ++\n \"\\\\\\\\[10pt]\",\n \"\\\\hat B = \\\\text{forget}(B^*) = \\\\text{forget}\" ++\n texifyMatrix' (texifyMatrix' show) mBStar ++ \" = \" ++\n texifyMatrix mBHat,\n \"\"\n ] ++\n \"\\\\end{gather*}$$\"\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{gather*}\\text{forget} : (\\mathbb{F}_2 {}^{2 \\times 2})^{2 \\times 2}\\longrightarrow \\mathbb{F}_2 {}^{4 \\times 4}\\\\[10pt]\\hat B = \\text{forget}(B^*) = \\text{forget}\\left( \\begin{matrix}\\left( \\begin{matrix}0 & 0 \\\\ 0 & 0\\end{matrix} \\right) & \\left( \\begin{matrix}0 & 1 \\\\ 1 & 1\\end{matrix} \\right) \\\\ \\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right) & \\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right)\\end{matrix} \\right) = \\left( \\begin{matrix}0 & 0 & 0 & 1 \\\\ 0 & 0 & 1 & 1 \\\\ 1 & 1 & 1 & 1 \\\\ 1 & 0 & 1 & 0\\end{matrix} \\right)\\end{gather*}$$\n:::\n:::\n\n\nLike *f*, `forget` preserves addition and multiplication, a fact already appreciated by block matrices.\nFurther, by *f*, the internal matrices multiply the same as elements of GF(4).\nHence, this shows us directly that GL(2, 4) is a subgroup of GL(4, 2).\n\nHowever, an obvious difference between layered and \"forgotten\" matrices is\n the determinant and characteristic polynomial:\n\n::: {#623d4a04 .cell execution_count=10}\n``` {.haskell .cell-code code-fold=\"true\"}\nmarkdown $ \"$$\\\\begin{align*}\" ++ intercalate \" \\\\\\\\ \\\\\\\\ \" (\n map (intercalate \" & \") [\n [\n \"\\\\det B^* &= \" ++\n texifyMatrix ((`mod` 2) <$> determinant mBStar),\n \"\\\\text{charpoly} B^* &= \" ++\n texifyPoly' \"\\\\Lambda\" texifyMatrix (fmap (`mod` 2) <$> charpoly mBStar)\n ], [\n \"\\\\det \\\\hat B &= \" ++\n show ((`mod` 2) $ determinant mBHat),\n \"\\\\text{charpoly} \\\\hat B &= \" ++\n texifyPoly' \"\\\\lambda\" show ((`mod` 2) <$> charpoly mBHat)\n ]\n ]) ++\n \"\\\\end{align*}$$\"\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{align*}\\det B^* &= \\left( \\begin{matrix}1 & 0 \\\\ 0 & 1\\end{matrix} \\right) & \\text{charpoly} B^* &= \\left( \\begin{matrix}1 & 0 \\\\ 0 & 1\\end{matrix} \\right) + \\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right)\\Lambda + \\Lambda^{2} \\\\ \\\\ \\det \\hat B &= 1 & \\text{charpoly} \\hat B &= 1 + \\lambda + \\lambda^{2} + \\lambda^{3} + \\lambda^{4}\\end{align*}$$\n:::\n:::\n\n\n### Another Forgotten Path\n\nIt's a relatively simple matter to move between determinants, since it's straightforward\n to identify 1 and the identity matrix.\nHowever, a natural question to ask is whether there's a way to reconcile or coerce\n the matrix polynomial into the \"forgotten\" 
one.\n\nFirst, let's formally establish a path from matrix polynomials to a matrix of polynomials.\nWe need only use our friend from the [second post](../2) -- polynomial evaluation.\nSimply evaluating a matrix polynomial *r* at *λI* converts our matrix indeterminate (*Λ*)\n into a scalar one (*λ*).\n\n$$\n\\begin{align*}\n \\text{eval}_{\\Lambda \\mapsto \\lambda I}\n &: (\\mathbb{F}_2 {}^{2 \\times 2})[\\Lambda]\n \\rightarrow (\\mathbb{F}_2[\\lambda]) {}^{2 \\times 2}\n \\\\\n &:: \\quad\n r(\\Lambda) \\mapsto r(\\lambda I)\n\\end{align*}\n$$\n\n::: {#8e5ab00c .cell execution_count=11}\n``` {.haskell .cell-code code-fold=\"true\"}\n-- Function following from the evaluation definition above\n-- Note that `Poly . pure` is used to transform matrices of `a`\n-- into matrices of polynomials.\n\ntoMatrixPolynomial :: (Eq a, Num a) =>\n Polynomial (Matrix a) -> Matrix (Polynomial a)\ntoMatrixPolynomial xs = evalPoly eyeLambda $ fmap (fmap (Poly . pure)) xs where\n -- First dimensions of the coefficients\n (is, _) = unzip $ map (snd . bounds . unMat) $ coeffs xs\n -- Properly-sized identity matrix times a scalar lambda\n eyeLambda = eye (1 + maximum is) * toMatrix [[Poly [0, 1]]]\n\n\nmarkdown $ \"$$\\\\begin{align*}\" ++\n \"\\\\text{eval}_{\\\\Lambda \\\\mapsto \\\\lambda I}(\\\\text{charpoly}(B^*)) &=\" ++\n texifyPoly' \"(\\\\lambda I)\" texifyMatrix\n (fmap (`mod` 2) <$> charpoly mBStar) ++\n \"\\\\\\\\ &= \" ++\n texifyMatrix' (texifyPoly' \"\\\\lambda\" show)\n (toMatrixPolynomial $ fmap (`mod` 2) <$> charpoly mBStar) ++\n \"\\\\end{align*}$$\"\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{align*}\\text{eval}_{\\Lambda \\mapsto \\lambda I}(\\text{charpoly}(B^*)) &=\\left( \\begin{matrix}1 & 0 \\\\ 0 & 1\\end{matrix} \\right) + \\left( \\begin{matrix}1 & 1 \\\\ 1 & 0\\end{matrix} \\right)(\\lambda I) + (\\lambda I)^{2}\\\\ &= \\left( \\begin{matrix}1 + \\lambda + \\lambda^{2} & \\lambda \\\\ \\lambda & 1 + \\lambda^{2}\\end{matrix} \\right)\\end{align*}$$\n:::\n:::\n\n\nSince a matrix containing polynomials is still a matrix, we can then take its determinant.\nWhat pops out is exactly what we were after...\n\n::: {#a9a302d9 .cell execution_count=12}\n``` {.haskell .cell-code code-fold=\"true\"}\nmarkdown $ \"$$\\\\begin{align*}\" ++\n \"\\\\det(\\\\text{eval}_{\\\\Lambda \\\\mapsto \\\\lambda I}(\" ++\n \"\\\\text{charpoly}(B^*))) &=\" ++\n \"(1 + \\\\lambda + \\\\lambda^2)(1 + \\\\lambda^2) - \\\\lambda^2\" ++\n \"\\\\\\\\ &=\" ++\n texifyPoly' \"\\\\lambda\" show\n (fmap (`mod` 2) <$> determinant $ toMatrixPolynomial $ charpoly mBStar) ++\n \"\\\\\\\\ &= \\\\text{charpoly}{\\\\hat B}\" ++\n \"\\\\end{align*}$$\"\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{align*}\\det(\\text{eval}_{\\Lambda \\mapsto \\lambda I}(\\text{charpoly}(B^*))) &=(1 + \\lambda + \\lambda^2)(1 + \\lambda^2) - \\lambda^2\\\\ &=1 + \\lambda + \\lambda^{2} + \\lambda^{3} + \\lambda^{4}\\\\ &= \\text{charpoly}{\\hat B}\\end{align*}$$\n:::\n:::\n\n\n...and we can arrange our maps into another diagram:\n\n$$\n\\begin{gather*}\n \\begin{CD}\n (\\mathbb{F}_2 {}^{2 \\times 2})^{2 \\times 2}\n @>{\\text{charpoly}}>>\n (\\mathbb{F}_2 {}^{2 \\times 2})[\\Lambda]\n \\\\\n @V{\\text{id}}VV ~ @VV{\\text{eval}_{\\Lambda \\mapsto \\lambda I}}V\n \\\\\n -\n @. 
(\\mathbb{F}_2 [\\lambda])^{2 \\times 2}\n \\\\\n @V{\\text{forget}}VV ~ @VV{\\det}V\n \\\\\n \\mathbb{F}_2 {}^{4 \\times 4}\n @>>{\\text{charpoly}}>\n \\mathbb{F}_2[\\lambda]\n \\end{CD}\n \\\\ \\\\\n \\text{charpoly} \\circ \\text{forget}\n = \\det \\circ ~\\text{eval}_{\\Lambda \\mapsto \\lambda I} \\circ\\text{charpoly}\n\\end{gather*}\n$$\n\nIt should be noted that we do *not* get the same results by taking the determinant after\n applying **charpoly**\\*, indicating that the above method is \"correct\".\n\n::: {#c9d284c9 .cell execution_count=13}\n``` {.haskell .cell-code code-fold=\"true\"}\nmarkdown $ \"$$\\\\begin{align*}\" ++\n \"\\\\text{charpoly}^*(B^*) &=\" ++\n texifyMatrix' (texifyPoly' \"\\\\lambda\" show)\n (fmap (`mod` 2) <$> fmap charpoly mBStar) ++\n \"\\\\\\\\ \\\\\\\\\" ++\n \"\\\\det(\\\\text{charpoly}^*(B^*)) &=\" ++\n \"\\\\lambda^2(1 + \\\\lambda + \\\\lambda^2) - (1 + \\\\lambda + \\\\lambda^2)^2\" ++\n \"\\\\\\\\ &= \" ++\n texifyPoly' \"\\\\lambda\" show\n (fmap (`mod` 2) <$> determinant $ fmap charpoly mBStar) ++\n \"\\\\end{align*}$$\"\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{align*}\\text{charpoly}^*(B^*) &=\\left( \\begin{matrix}\\lambda^{2} & 1 + \\lambda + \\lambda^{2} \\\\ 1 + \\lambda + \\lambda^{2} & 1 + \\lambda + \\lambda^{2}\\end{matrix} \\right)\\\\ \\\\\\det(\\text{charpoly}^*(B^*)) &=\\lambda^2(1 + \\lambda + \\lambda^2) - (1 + \\lambda + \\lambda^2)^2\\\\ &= 1 + \\lambda^{3}\\end{align*}$$\n:::\n:::\n\n\n### Cycles and Cycles\n\nSince we can get $\\lambda^4 + \\lambda^3 + \\lambda^2 + \\lambda + 1$ in two ways,\n it's natural to assume this polynomial is significant in some way.\nIn the language of the the second post, the polynomial can also be written as ~2~31,\n whose root we determined was cyclic of order 5.\nThis happens to match the order of *B* in GL(2, 4).\n\nPerhaps this is unsurprising, since there are only so many polynomials of degree 4 over GF(2).\nHowever, the reason we see it is more obvious if we look at the powers of scalar multiples of *B*.\nFirst, recall that *f*\\* takes us from a matrix over GF(4) to a matrix of matrices of GF(2).\nThen define a map *g* that gives us degree 4 polynomials:\n\n$$\n\\begin{gather*}\n g : \\mathbb{F}_4^{2 \\times 2} \\rightarrow \\mathbb{F}_2[\\lambda]\n \\\\\n g = \\text{charpoly} \\circ \\text{forget} \\circ f^*\n\\end{gather*}\n$$\n\n::: {#5e05ff31 .cell layout-ncol='3' execution_count=14}\n``` {.haskell .cell-code code-fold=\"true\"}\ng = fmap (`mod` 2) . charpoly . forget . 
fStar\n\nshowSeries varName var = \"$$\\\\begin{array}{}\" ++\n \" & \\\\scriptsize \" ++\n texifyMatrix var ++\n \"\\\\\\\\\" ++\n intercalate \" \\\\\\\\ \" [\n (if n == 1 then varName' else varName' ++ \"^{\" ++ show n ++ \"}\") ++\n \"& \\\\overset{g}{\\\\mapsto} &\" ++\n texPolyAsPositional' \"\\\\lambda\" (g $ var^n)\n | n <- [1..5]\n ] ++\n \"\\\\end{array}$$\" where\n varName' = if length varName == 1 then varName else \"(\" ++ varName ++ \")\"\n\nmarkdown $ showSeries \"B\" mBOrig\nmarkdown $ showSeries \"αB\" (fmap (AlphaF4*) mBOrig)\nmarkdown $ showSeries \"α^2 B\" (fmap (Alpha2F4*) mBOrig)\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{array}{} & \\scriptsize \\left( \\begin{matrix}0 & α \\\\ α^2 & α^2\\end{matrix} \\right)\\\\B& \\overset{g}{\\mapsto} &11111_{\\lambda} \\\\ B^{2}& \\overset{g}{\\mapsto} &11111_{\\lambda} \\\\ B^{3}& \\overset{g}{\\mapsto} &11111_{\\lambda} \\\\ B^{4}& \\overset{g}{\\mapsto} &11111_{\\lambda} \\\\ B^{5}& \\overset{g}{\\mapsto} &10001_{\\lambda}\\end{array}$$\n:::\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{array}{} & \\scriptsize \\left( \\begin{matrix}0 & α^2 \\\\ 1 & 1\\end{matrix} \\right)\\\\(αB)& \\overset{g}{\\mapsto} &10011_{\\lambda} \\\\ (αB)^{2}& \\overset{g}{\\mapsto} &10011_{\\lambda} \\\\ (αB)^{3}& \\overset{g}{\\mapsto} &11111_{\\lambda} \\\\ (αB)^{4}& \\overset{g}{\\mapsto} &10011_{\\lambda} \\\\ (αB)^{5}& \\overset{g}{\\mapsto} &10101_{\\lambda}\\end{array}$$\n:::\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n$$\\begin{array}{} & \\scriptsize \\left( \\begin{matrix}0 & 1 \\\\ α & α\\end{matrix} \\right)\\\\(α^2 B)& \\overset{g}{\\mapsto} &11001_{\\lambda} \\\\ (α^2 B)^{2}& \\overset{g}{\\mapsto} &11001_{\\lambda} \\\\ (α^2 B)^{3}& \\overset{g}{\\mapsto} &11111_{\\lambda} \\\\ (α^2 B)^{4}& \\overset{g}{\\mapsto} &11001_{\\lambda} \\\\ (α^2 B)^{5}& \\overset{g}{\\mapsto} &10101_{\\lambda}\\end{array}$$\n:::\n:::\n\n\nThe matrices in the middle and rightmost columns both have order 15 inside GL(2, 4).\nCorrespondingly, both 10011~λ~ = ~2~19 and 11001~λ~ = ~2~25 are primitive,\n and so have roots of order 15 over GF(2).\n\n\n### A Field?\n\nSince we have 15 matrices generated by the powers of one, you might wonder whether or not\n they can correspond to the nonzero elements of GF(16).\nAnd they can!\nIn a sense, we've \"borrowed\" the order 15 elements from this \"field\" within GL(4, 2).\nHowever, none of the powers of this matrix are the companion matrix of either ~2~19 or ~2~25.\n\n
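That last claim is easy to spot-check.\nThe snippet below is only a minimal sketch (left unexecuted): it reuses `forget`, `fStar`, `companion`, and `Poly` from earlier, the names `mABHat`, `powersABHat`, and `cComps` are new throwaways, and it assumes ~2~19 and ~2~25 unpack to the ascending coefficient lists `[1,1,0,0,1]` and `[1,0,0,1,1]`.\n\n``` {.haskell}\n-- Sketch: no power of the \"forgotten\" αB should equal the companion matrix\n-- of either primitive quartic, x^4 + x + 1 (~2~19) or x^4 + x^3 + 1 (~2~25)\nmABHat = forget $ fStar $ fmap (AlphaF4*) mBOrig\n\n-- The fifteen powers, reduced mod 2\npowersABHat = map (fmap (`mod` 2) . (mABHat^)) [1..15]\n\n-- Companion matrices of the two primitive quartics\ncComps = map (companion . Poly) [[1, 1, 0, 0, 1], [1, 0, 0, 1, 1]]\n\nprint $ any (`elem` powersABHat) cComps   -- expected: False\n```\n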
\n\nA short Haskell demonstration of the field-like behavior of these matrices follows.\nAll we really need to do is test additive closure, since the powers trivially commute and include the identity matrix.\n\n::: {#027fb2e4 .cell execution_count=15}\n``` {.haskell .cell-code}\n-- Check whether n x n matrices (mod p) have additive closure\n-- Supplement the additive identity (the zero matrix), even if it is not already present\nhasAdditiveClosure :: Integral a => Int -> a -> [Matrix a] -> Bool\nhasAdditiveClosure n p xs = all (`elem` xs') sums where\n -- Add in the zero matrix\n xs' = zero n:xs\n -- Calculate all possible sums of pairs (mod p)\n sums = map (fmap (`mod` p)) $ (+) <$> xs' <*> xs'\n\n-- Generate the powers of x, then test if they form a field (mod p)\ngeneratesField :: Integral a => Int -> a -> Matrix a -> Bool\ngeneratesField n p x = hasAdditiveClosure n p xs where\n xs = map (fmap (`mod` p) . (x^)) [1..p^n-1]\n\n\nprint $ generatesField 4 2 $ forget $ fStar $ fmap (AlphaF4*) mBOrig\n```\n\n::: {.cell-output .cell-output-display}\n```\nTrue\n```\n:::\n:::\n\n\n
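As a quick usage note (again just a sketch, left unexecuted, using only definitions already introduced): the same test separates the order-15 scalar multiples from *B* itself, whose five powers cannot sweep out all fifteen nonzero elements.\n\n``` {.haskell}\n-- α^2 B also has order 15, so its powers should again fill out a field\nprint $ generatesField 4 2 $ forget $ fStar $ fmap (Alpha2F4*) mBOrig\n-- expected: True\n\n-- B only has order 5, so additive closure should fail\nprint $ generatesField 4 2 mBHat\n-- expected: False\n```\n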
\n\nMore directly, we might also observe that *α*^2^*B* is the companion matrix of\n an irreducible polynomial over GF(4), namely $q(x) = x^2 - \\alpha x - \\alpha$.\n\nBoth the \"forgotten\" matrices and the aforementioned companion matrices lie within GL(4, 2).\nA natural question to ask is whether we can make fields by the following process:\n\n1. Filter out all order-15 elements of GL(4, 2)\n2. Partition the elements and their powers into their respective order-15 subgroups\n3. Add the zero matrix into each class\n4. Check whether all classes are additively closed (and are therefore fields)\n\nIn this case, it happens to be true, but proving this in general is difficult, and I haven't done so.\n\n\nExpanding Dimensions\n--------------------\n\nOf course, we need not only focus on GF(4) -- we can just as easily work over GL(2, 2^*r*^) for *r* other than 2.\nIn this case, the internal matrices will be *r*×*r* while the external one remains 2×2.\nBut neither do we have to work exclusively with 2×2 matrices -- we can work over GL(*n*, 2^*r*^).\nIn either circumstance, the \"borrowing\" of elements of larger order still occurs.\nThis is summarized by the following diagram:\n\n$$\n\\begin{CD}\n \\underset{\n \\scriptsize S \\text{ (order $k$)}\n }{\n \\text{SL}(n,2^r)\n }\n @>>>\n \\underset{\n \\scriptsize\n \\begin{matrix}\n S \\text{ (order $k$)} \\\\\n T \\text{ (order $2^{nr}-1$)}\n \\end{matrix}\n }{\n \\text{GL}(n, 2^r)\n }\n @>{\\text{forget} \\circ f_{r}^*}>>\n {\\text{GL}(nr, 2)}\n @<{f_{nr}}<<\n \\underset{\n \\scriptsize\n \\begin{matrix}\n s \\text{ (order $k$)} \\\\\n t \\text{ (order $2^{nr}-1$)}\n \\end{matrix}\n }{\n \\mathbb{F}_{2^{nr}}\n }\n\\end{CD}\n$$\n\nHere, *f*~*r*~ is our map from GF(2^*r*^) to *r*×*r* matrices and *f*~*nr*~ is a similar map.\n*r* must be greater than 1 for us to properly make use of matrix arithmetic.\nSimilarly, *n* must be greater than 1 for the leftmost GL.\nThus, *nr* is a composite number.\nHere, *k* is a proper factor of 2^*nr*^ - 1.\nIn the prior discussion, *k* was 5 and 2^*nr*^ - 1 was 15.\n\nRecall that primitive polynomials over GF(2^*nr*^) have roots with order 2^*nr*^ - 1.\nThis number can *never* be prime, since the only primes of the form\n 2^*p*^ - 1 are Mersenne primes -- *p* itself must be prime.\nThus, in GL of prime dimensions, we can never loan to a GL over a field\n of larger order with the same characteristic.\nConversely, GL(*nr* + 1, 2) trivially contains GL(*nr*, 2) by fixing a subspace.\nSo we do eventually see elements of order 2^*m*^ - 1 for either prime or composite *m*.\n\n\n### Other Primes\n\nThis concern about prime dimensions is unique to characteristic 2.\nFor any other prime *p*, *p*^*m*^ - 1 is composite since it is at the very least even.\nAll other remarks about the above diagram should still hold for any other prime *p*.\n\nIn addition, the diagram where we found a correspondence between the orders of elements in\n GL(2, 2^2^) and GF(2^2×2^) via the characteristic polynomial also generalizes.\nThough I have not proven it, I strongly suspect the following diagram commutes,\n at least in the case where *K* is a finite field:\n\n$$\n\\begin{CD}\n (K^{r \\times r})^{n \\times n}\n @>{\\text{charpoly}}>>\n (K^{r \\times r})[\\Lambda]\n \\\\\n @V{\\text{id}}VV ~ @VV{\\text{eval}_{\\Lambda \\mapsto \\lambda I}}V\n \\\\\n -\n @. 
(K [\\lambda])^{r \\times r}\n \\\\\n @V{\\text{forget}}VV ~ @VV{\\det}V\n \\\\\n K^{nr \\times nr}\n @>>{\\text{charpoly}}>\n K[\\lambda]\n\\end{CD}\n$$\n\nOver larger primes, the gap between GL and SL may grow ever larger,\n but SL over a prime power field seems to inject into SL over a prime field.\nIf the above diagram is true, then the prior statement follows.\n\n\n### Monadicity and Injections\n\nThe action of forgetting the internal structure may sound somewhat familiar if you know your Haskell.\nRemember that for lists, we can do something similar\n -- converting `[[1,2,3],[4,5,6]]` to `[1,2,3,4,5,6]` is just a matter of applying `concat`.\nThis is an instance in which we know lists to behave like a [monad](https://wiki.haskell.org/Monad).\nDespite being an indecipherable bit of jargon to newcomers, it just means we:\n\n1. can apply functions inside the structure (for example, to the elements of a list),\n2. have a sensible injection into the structure (creating singleton lists, called `return`), and\n3. can reduce two layers to one (`concat`, or `join` for monads in general).\n - Monads are traditionally defined using the operator `>>=`, but `join = (>>= id)`\n\nJust comparing the types of `join :: Monad m => m (m a) -> m a`\n and `forget :: Matrix (Matrix a) -> Matrix a` suggests that `Matrix` (meaning square matrices)\n could be a monad, and further, one which respects addition and multiplication.\nOf course, **this is only true when our internal matrices are all the same size**.\nIn the above diagrams, this restriction has applied, but should be stated explicitly\n since no dimension is specified by `Matrix a`.\n\nCondition 2 gives us some trouble, though.\nFor one, only \"numbers\" (elements of a ring) can go inside matrices, which restricts\n where monadicity can hold.\nMore importantly, we have a *lot* of freedom in what dimension we choose to inject into.\nFor example, we might pick a `return` that uses 1×1 matrices (which add no additional structure).\nWe might also pick `return2`, which scalar-multiplies its argument to a 2×2 identity matrix instead.\n\nUnfortunately, there's no good answer.\nAt the very least, we can close our eyes and pretend that we have a nice diagram:\n\n$$\n\\begin{gather*}\n \\begin{matrix}\n & L\\underset{\\text{degree } r}{/} K\n \\\\ \\\\\n \\small f\n & \\begin{matrix} | \\\\ \\downarrow \\end{matrix}\n \\\\ \\\\\n & K^{r \\times r}\n \\end{matrix}\n & \\quad & \\quad\n & \\begin{matrix}\n & (L\\underset{\\text{degree } r}{/} K)^{n \\times n}\n \\\\ \\\\\n \\small f^* &\n \\begin{matrix} | \\\\ \\downarrow \\end{matrix}\n & \\searrow & \\small \\texttt{>>=} ~ f \\qquad\n \\\\ \\\\\n & (K^{r \\times r})^{n \\times n}\n & \\underset{\\text{forget}} {\\longrightarrow}\n & K {}^{nr \\times nr}\n \\end{matrix}\n\\end{gather*}\n$$\n\nAs one last note on the monadicity of matrices, I *have* played around with an alternative `Matrix`\n type which includes scalars alongside proper matrices, which would allow for\n a simple canonical injection.\nUnfortunately, it complicates `join` -- we just place the responsibility of sizing the internal matrices\n front-and-center since we can correspond internal scalars with identity matrices.\n\n\nClosing\n-------\n\nAt this point, I've gone on far too long about algebra.\nOne nagging curiosity makes me wonder whether the there are any diagrams like the following:\n\n$$\n\\begin{matrix}\n & (L\\underset{\\text{degree } r}{/} K)^{n \\times n}\n & & & & (L\\underset{\\text{degree } n}{/} K)^{r \\times r}\n \\\\ \\\\\n 
\\small f_1^*\n & \\begin{matrix} | \\\\ \\downarrow \\end{matrix}\n & \\searrow & & \\swarrow\n & \\begin{matrix} | \\\\ \\downarrow \\end{matrix}\n & \\small f_2^*\n \\\\ \\\\\n & (K^{r \\times r})^{n \\times n}\n & \\underset{\\text{forget}} {\\longrightarrow}\n & K {}^{nr \\times nr}\n & \\underset{\\text{forget}}{\\longleftarrow}\n & (K^{n \\times n})^{r \\times r}\n\\end{matrix}\n$$\n\nOr in English, whether \"rebracketing\" certain *nr* × *nr* matrices can be traced back to\n not only a degree *r* field extension, but also one of degree *n*.\n\nThe mathematician in me tells me to believe in well-defined structures.\nMatrices are one such structure, with myriad applications.\nHowever, the computer scientist in me laments that the application of these structures is\n buried in symbols and that layering them is at most glossed over.\nThere is clear utility and interest in doing so, otherwise the diagrams shown above would not exist.\n\nOf course, there's plenty of reason *not* to go down this route.\nFor one, it's plainly inefficient -- GPUs are *built* on matrix operations being\n as efficient as possible, i.e., without the layering.\nIt's also inefficient to learn for people *just* learning matrices.\nI'd still argue that the method is useful for learning about more complex topics, like field extensions.\n\n", "supporting": [ - "index_files" + "index_files/figure-html" ], "filters": [], "includes": {} diff --git a/_freeze/posts/math/number-number/1/index/execute-results/html.json b/_freeze/posts/math/number-number/1/index/execute-results/html.json index e6ff56b..f3f5581 100644 --- a/_freeze/posts/math/number-number/1/index/execute-results/html.json +++ b/_freeze/posts/math/number-number/1/index/execute-results/html.json @@ -1,10 +1,10 @@ { - "hash": "749e06a7af6447084f45c7dd55a91231", + "hash": "9db15bf9ac55734a382c20d347ef1a6d", "result": { "engine": "jupyter", - "markdown": "---\ntitle: \"Numbering Numbers: From 0 to ∞\"\ndescription: |\n How do we count an infinitude of numbers?\nformat:\n html:\n html-math-method: katex\ndate: \"2023-11-26\"\ndate-modified: \"2025-07-22\"\ncategories:\n - algebra\n - question-mark function\n - haskell\n---\n\n\n\nThe infinite is replete with paradoxes.\nSome of the best come from comparing sizes of infinite collections.\nFor example, every natural number can be mapped to a (nonnegative) even number and vice versa.\n\n$$\n\\begin{gather*}\n \\N \\rightarrow 2\\N\n \\\\\n n \\mapsto 2n\n \\\\ \\\\\n 0 \\mapsto 0,~ 1 \\mapsto2,~ 2 \\mapsto 4,~ 3 \\mapsto 6,~ 4 \\mapsto 8, ...\n\\end{gather*}\n$$\n\n(For the purposes of this post, $0 \\in \\N, ~ 0 \\notin \\N^+$)\n\nAll even numbers are \"hit\" by this map (by the definition of an even number),\n and no two natural numbers map to the same even number\n (again, more or less by definition, since $2m = 2n$ implies that $m = n$ over $\\N$).\nTherefore, the map is [one-to-one](https://en.wikipedia.org/wiki/Injective_function)\n and [onto](https://en.wikipedia.org/wiki/Surjective_function),\n so the map is a [bijection](https://en.wikipedia.org/wiki/Bijection).\nA consequence is that the map has an inverse, namely by reversing all of the arrows in the above block\n (i.e., the action of halving an even number).\n\nBijections with the natural numbers are easier to understand as a way to place things\n into a linear sequence.\nIn other words, they enumerate \"some sort of item\"; in this case, even numbers.\n\nIn the finite world, a bijection between two things implies that they have the same size.\nIt makes 
sense to extend the same logic to the infinite world, but there's a catch.\nThe nonnegative even numbers are clearly a strict subset of the natural numbers,\n but by this argument they have the same size.\n\n$$\n\\begin{matrix}\n 2\\N & \\longleftrightarrow & \\N & \\hookleftarrow & 2\\N\n \\\\\n 0 & \\mapsto & \\textcolor{red}0 & \\dashleftarrow & \\textcolor{red}0\n \\\\\n 2 & \\mapsto & 1 & &\n \\\\\n 4 & \\mapsto & \\textcolor{red}2 & \\dashleftarrow & \\textcolor{red}2\n \\\\\n 6 & \\mapsto & 3 & &\n \\\\\n 8 & \\mapsto & \\textcolor{red}4 & \\dashleftarrow & \\textcolor{red}4\n \\\\\n 10 & \\mapsto & 5 & &\n \\\\\n 12 & \\mapsto & \\textcolor{red}6 & \\dashleftarrow & \\textcolor{red}6\n \\\\\n 14 & \\mapsto & 7 & &\n \\\\\n 16 & \\mapsto & \\textcolor{red}8 & \\dashleftarrow & \\textcolor{red}8\n \\\\\n \\vdots & & \\vdots & & \\vdots\n\\end{matrix}\n$$\n\n\nAre we Positive?\n----------------\n\nThe confusion continues if we look at the integers and the naturals.\nIntegers are the natural numbers and their negatives, so it would be intuitive to assume that\n there are twice as many of them as there are naturals (more or less one to account for zero).\nBut since that logic fails for the naturals and the even numbers,\n it fails for the naturals and integers as well.\n\n$$\n\\begin{gather*}\n \\begin{align*}\n \\mathbb{N} &\\rightarrow \\mathbb{Z}\n \\\\\n n &\\mapsto \\left\\{ \\begin{matrix}\n n/2 & n \\text{ even}\n \\\\\n -(n+1)/2 & n \\text{ odd}\n \\end{matrix} \\right.\n \\end{align*}\n \\\\ \\\\\n 0 \\mapsto 0,\\quad 2 \\mapsto 1, \\quad 4 \\mapsto 2, \\quad 6 \\mapsto 3, \\quad 8 \\mapsto 4,~...\n \\\\\n 1 \\mapsto -1, \\quad 3 \\mapsto -2, \\quad 5 \\mapsto -3, \\quad 7 \\mapsto -4, \\quad 9 \\mapsto -5,~...\n\\end{gather*}\n$$\n\nOr, in Haskell[^1]:\n\n[^1]: That is, if you cover your eyes and pretend that `undefined` will never happen,\n and if you ignore that `Int` is bounded, unlike `Integer`.\n\n::: {#775b525e .cell execution_count=2}\n``` {.haskell .cell-code}\ntype Nat = Int\n\nlistIntegers :: Nat -> Int\nlistIntegers n\n | n < 0 = undefined\n | even n = n `div` 2\n | otherwise = -(n + 1) `div` 2\n```\n:::\n\n\nIn other words, this map sends even numbers to the naturals (the inverse of the doubling map)\n and the odds to the negatives.\nThe same arguments about the bijective nature of this map apply as before, and so the paradox persists,\n since naturals are also a strict subset of integers.\n\n\n### Rational Numbers\n\nRationals are a bit worse.\nTo make things a little easier, let's focus on the positive rationals (i.e., fractions excluding 0).\nUnlike the integers, there is no obvious \"next rational\" after (or even before) 1.\nIf there were, we could follow it with its reciprocal, like how an integer is followed\n by its negative in the map above.\n\nOn the other hand, the integers provide a sliver of hope that listing all rational numbers is possible.\nIntegers can be defined as pairs of natural numbers, along with a way of considering two pairs equal.\n\n$$\n\\begin{gather*}\n -1 = (0,1) \\sim_\\Z (1,2) \\sim_\\Z (2,3) \\sim_\\Z (3,4) \\sim_\\Z ...\n \\\\[10pt]\n (a,b) \\sim_\\mathbb{Z} (c,d) \\iff a+d = b+c \\quad a,b,c,d \\in \\mathbb{N}\n \\\\[10pt]\n \\mathbb{Z} := ( \\mathbb{N} \\times \\mathbb{N} ) / \\sim_\\mathbb{Z}\n\\end{gather*}\n$$\n\n::: {#97b6db11 .cell execution_count=3}\n``` {.haskell .cell-code}\nintEqual :: (Nat, Nat) -> (Nat, Nat) -> Bool\nintEqual (a, b) (c, d) = a + d == b + c\n```\n:::\n\n\nThis relation is the same as saying $a - b = c - 
d$ (i.e., that -1 = 0 - 1, etc.),\n but has the benefit of not requiring subtraction to be defined.\nThis is all the better, since, as grade-schoolers are taught, subtracting a larger natural number\n from a smaller one is impossible.\n\nThe same equivalence definition exists for positive rationals.\nIt is perhaps more familiar, because of the emphasis placed on simplifying fractions when learning them.\nWe can [cross-multiply](https://en.wikipedia.org/wiki/Cross-multiplication) fractions to get\n a similar equality condition to the one for integers.\n\n$$\n\\begin{gather*}\n {1 \\over 2} = (1,2) \\sim_\\mathbb{Q} \\overset{2/4}{(2,4)} \\sim_\\mathbb{Q}\n \\overset{3/6}{(3,6)} \\sim_\\mathbb{Q} \\overset{4/8}{(4,8)} \\sim_\\mathbb{Q} ...\n \\\\ \\\\\n (a,b) \\sim_\\mathbb{Q} (c,d) \\iff ad = bc \\quad a,b,c,d \\in \\mathbb{N}^+\n \\\\ ~ \\\\\n \\mathbb{Q^+} := ( \\mathbb{N^+} \\times \\mathbb{N^+} ) / \\sim_\\mathbb{Q}\n\\end{gather*}\n$$\n\n::: {#4c4d2c94 .cell execution_count=4}\n``` {.haskell .cell-code}\nratEqual :: (Nat, Nat) -> (Nat, Nat) -> Bool\nratEqual (a, b) (c, d) = a * d == b * c\n```\n:::\n\n\nWe specify that neither element of the pair can be zero, so this excludes divisions by zero\n (and the especially tricky case of 0/0, which would be equal to all fractions).\nEffectively, this just replaces where addition appears in the integer equivalence with multiplication.\n\n\n### Eliminating Repeats\n\nNaively, to tackle both of these cases, we might consider enumerating pairs of natural numbers.\nWe order them by sums and break ties by sorting on the first index.\n\n::: {#aa044caa .cell execution_count=5}\n``` {.haskell .cell-code}\n-- All pairs of natural numbers that sum to n\nlistPairs :: Nat -> [(Nat, Nat)]\nlistPairs n = [ (k, n - k) | k <- [0..n] ]\n\n-- \"Triangular\" enumeration of all pairs of positive integers\nallPairs :: [(Nat, Nat)]\nallPairs = concatMap listPairs [0..]\n\n-- Use a natural number to index the enumeration of all pairs\nallPairsMap :: Nat -> (Nat, Nat)\nallPairsMap n = allPairs !! n\n```\n:::\n\n\n::: {#6143624a .cell .plain execution_count=6}\n``` {.haskell .cell-code code-fold=\"true\"}\npairEnumeration = columns (\\(_, f) v -> f v) (\\(l, _) -> Headed l) [\n (\"Index\", show . fst),\n (\"Pair (a, b)\", show . snd),\n (\"Sum (a + b)\", show . uncurry (+) . snd),\n (\"Integer (a - b)\", show . uncurry (-) . snd),\n (\"Rational (a+1 / b+1)\", (\\(a, b) -> show (a + 1) ++ \"/\" ++ show (b + 1)) . snd)\n ]\n\nrenderTable (rmap stringCell pairEnumeration) $ take 10 $ zip [0..] allPairs\n```\n\n::: {.cell-output .cell-output-display}\n```{=html}\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
\n Index\n \n Pair (a, b)\n \n Sum (a + b)\n \n Integer (a - b)\n \n Rational (a+1 / b+1)\n
\n 0\n \n (0,0)\n \n 0\n \n 0\n \n 1/1\n
\n 1\n \n (0,1)\n \n 1\n \n -1\n \n 1/2\n
\n 2\n \n (1,0)\n \n 1\n \n 1\n \n 2/1\n
\n 3\n \n (0,2)\n \n 2\n \n -2\n \n 1/3\n
\n 4\n \n (1,1)\n \n 2\n \n 0\n \n 2/2\n
\n 5\n \n (2,0)\n \n 2\n \n 2\n \n 3/1\n
\n 6\n \n (0,3)\n \n 3\n \n -3\n \n 1/4\n
\n 7\n \n (1,2)\n \n 3\n \n -1\n \n 2/3\n
\n 8\n \n (2,1)\n \n 3\n \n 1\n \n 3/2\n
\n 9\n \n (3,0)\n \n 3\n \n 3\n \n 4/1\n
\n```\n:::\n:::\n\n\nThis certainly works to show that naturals and pairs of naturals can be put into bijection,\n but it when interpreting the results as integers or rationals, we double-count several of them.\nThis is easy to see in the case of the integers, but it will also happen in the rationals.\nFor example, the pair (3, 5) would correspond to 4/6 = 2/3, which has already been counted.\n\nIncidentally, Haskell comes with a function called `nubBy`.\nThis function eliminates duplicates according to another function of our choosing.\nWe can also just implement it ourselves and use it to create a naive enumeration of integers and rationals,\n based on the equalities defined earlier:\n\n::: {#7017b14d .cell execution_count=7}\n``` {.haskell .cell-code}\nnubBy :: (a -> a -> Bool) -> [a] -> [a]\nnubBy f = nubBy' [] where\n nubBy' ys [] = []\n nubBy' ys (z:zs)\n -- Ignore this element, something equivalent is in ys\n | any (f z) ys = nubBy' ys zs\n -- Append this element to the result and our internal list\n | otherwise = z:nubBy' (z:ys) zs\n\nallIntegers :: [(Nat, Nat)]\n-- Remove duplicates under integer equality\nallIntegers = nubBy intEqual allPairs\n\nallIntegersMap :: Nat -> (Nat, Nat)\nallIntegersMap n = allIntegers !! n\n\nallRationals :: [(Nat, Nat)]\n-- Add 1 to the numerator and denominator to get rid of 0,\n-- then remove duplicates under fraction equality\nallRationals = nubBy ratEqual $ map (\\(a,b) -> (a+1, b+1)) allPairs\n\nallRationalsMap :: Nat -> (Nat, Nat)\nallRationalsMap n = allRationals !! n\n```\n:::\n\n\nFor completeness's sake, the resulting pairs of each map are as follows\n\n::: {#95e58f2c .cell .plain execution_count=8}\n``` {.haskell .cell-code code-fold=\"true\"}\ncodeCell = htmlCell . Html.code . Html.string\n\nshowAsInteger p@(a,b) = show p ++ \" = \" ++ show (a - b)\nshowAsRational' p@(a,b) = show a ++ \"/\" ++ show b\nshowAsRational p@(a,b) = show p ++ \" = \" ++ showAsRational' p\n\nmapEnumeration = columns (\\(_, f) v -> f v) (\\(l, _) -> Headed l) [\n (stringCell \"n\", stringCell . show),\n (codeCell \"allIntegersMap n\",\n stringCell . showAsInteger . allIntegersMap),\n (codeCell \"allRationalsMap n\",\n stringCell . showAsRational . allRationalsMap)\n ]\n\nrenderTable mapEnumeration [0..9]\n```\n\n::: {.cell-output .cell-output-display}\n```{=html}\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
\n n\n \n \n allIntegersMap n\n \n \n \n allRationalsMap n\n \n
\n 0\n \n (0,0) = 0\n \n (1,1) = 1/1\n
\n 1\n \n (0,1) = -1\n \n (1,2) = 1/2\n
\n 2\n \n (1,0) = 1\n \n (2,1) = 2/1\n
\n 3\n \n (0,2) = -2\n \n (1,3) = 1/3\n
\n 4\n \n (2,0) = 2\n \n (3,1) = 3/1\n
\n 5\n \n (0,3) = -3\n \n (1,4) = 1/4\n
\n 6\n \n (3,0) = 3\n \n (2,3) = 2/3\n
\n 7\n \n (0,4) = -4\n \n (3,2) = 3/2\n
\n 8\n \n (4,0) = 4\n \n (4,1) = 4/1\n
\n 9\n \n (0,5) = -5\n \n (1,5) = 1/5\n
\n```\n:::\n:::\n\n\nNote that the tuples produced by `allIntegers`, when interpreted as integers, happen to coincide\n with the earlier enumeration given by `listIntegers`.\n\n\nTree of Fractions\n-----------------\n\nThere's an entirely separate structure which contains all rationals in least terms.\nIt relies on an operation between two fractions called the *mediant*.\nFor two rational numbers in least terms *p* and *q*, such that *p* < *q*, the mediant is designated *p* ⊕ *q* and will:\n\n1. also be in least terms (with some exceptions, see below),\n2. be larger than *p*, and\n3. be smaller than *q*\n\n$$\n\\begin{gather*}\n p = {a \\over b} < {c \\over d} = q, \\quad \\gcd(a,b) = \\gcd(c,d) = 1\n \\\\ \\\\\n p < p \\oplus q < q \\quad \\phantom{\\gcd(a+c, b+d) = 1}\n \\\\ \\\\\n {a \\over b} < {a+c \\over b+d} < {c \\over d}, \\quad \\gcd(a+c, b+d) = 1\n\\end{gather*}\n$$\n\nWe know our sequence of rationals starts with 1/1, 1/2, and 2/1.\nIf we start as before with 1/1 and want to get the other quantities,\n then we can take its mediants with 0/1 and 1/0, respectively\n (handwaving the fact that the latter isn't a legitimate fraction).\n\n$$\n\\begin{align*}\n && && \\large{1 \\over 1} && &&\n \\\\\n { \\oplus {0 \\over 1} } && \\large{/} && && \\large{\\backslash} ~ && \\oplus {1 \\over 0}\n \\\\\n && \\large{1 \\over 2} && && \\large{2 \\over 1} &&\n\\end{align*}\n$$\n\nWe might try continuing this pattern by doing the same thing to 1/2.\nWe can take its mediant with 0/1 to get 1/3.\nUnfortunately, the mediant of 1/2 and 1/0 is 2/2 (as is the mediant of 2/1 with 0/1),\n which isn't in least terms, and has already appeared as 1/1.\n\nWe could try another fraction that's appeared in the tree.\nUnfortunately, 2/1 suffers from the same issue as 1/0 -- 1/2 ⊕ 2/1 = 3/3, which is\n the same quantity as before, despite both fractions being in least terms.\nOn the other hand, 1/2 ⊕ 1/1 = 2/3, which is in least terms.\nSimilarly, 2/1 ⊕ 1/1 is 3/2, its reciprocal.\n\n$$\n\\begin{align*}\n && && \\large{1 \\over 2} && &&\n \\\\\n { \\oplus {0 \\over 1} } && \\large{/} && && \\large{\\backslash} ~ && \\oplus {1 \\over 1}\n \\\\\n && \\large{1 \\over 3} && && \\large{2 \\over 3} &&\n\\end{align*}\n\\qquad \\qquad\n\\begin{align*}\n && && \\large{2 \\over 1} && &&\n \\\\\n { \\oplus {1 \\over 1} } && \\large{/} && && \\large{\\backslash} ~ && \\oplus {1 \\over 0}\n \\\\\n && \\large{3 \\over 2} && && \\large{3 \\over 1} &&\n\\end{align*}\n$$\n\nThe trick is to notice that a step to the left \"updates\" what the next step to the right looks like.\nSteps to the right behave symmetrically.\nFor example, in the row we just generated, the left child of 2/3 is its mediant with 1/2,\n its right child is its mediant with 1/1.\n\nContinuing this iteration ad infinitum forms the\n [Stern-Brocot tree](https://en.wikipedia.org/wiki/Stern%E2%80%93Brocot_tree).\nA notable feature of this is that it is a\n [binary search tree](https://en.wikipedia.org/wiki/Binary_search_tree) (of infinite height).\nThis means that for any node, the value at the node is greater than all values in the left subtree\n and less than all values in the right subtree.\n\n![](./stern-brocot_tree.png)\n\nThere's a bit of a lie in presenting the tree like this.\nAs a binary tree, it's most convenient to show the nodes spaced evenly, but the distance between\n 1/1 and 2/1 is not typically seen as the same as the distance between 1/1 and 1/2.\n\nWe can implement this in Haskell using `Data.Tree`.\nThis package actually lets you describe 
trees with any number of child nodes,\n but we only need two for the sake of the Stern-Brocot tree.\n\n::: {#688b980d .cell execution_count=9}\n``` {.haskell .cell-code}\nimport Data.Tree\n\n-- Make a tree by applying the function `make` to each node\n-- Start with the root value (1, 1), along with\n-- its left and right steps, (0, 1) and (1, 0)\nsternBrocot = unfoldTree make ((1,1), (0,1), (1,0)) where\n -- Place the first value in the tree, then describe the next\n -- values for `make` in a list:\n make (v@(vn, vd), l@(ln, ld), r@(rn, rd))\n = (v, [\n -- the left value, and its left (unchanged) and right steps...\n ((ln + vn, ld + vd), l, v),\n -- and the right value, and its left and right (unchanged) steps\n ((vn + rn, vd + rd), v, r)\n ])\n```\n:::\n\n\n### Cutting the Tree Down\n\nWe're halfway there. All that remains is to read off every value in the tree as a sequence.\nPerhaps the most naive way would be to read off by always following the left or right child.\nUnfortunately, these give some fairly dull sequences.\n\n::: {#5911816b .cell layout-ncol='2' execution_count=10}\n``` {.haskell .cell-code}\ntreePath :: [Int] -> Tree a -> [a]\ntreePath xs (Node y ys)\n -- If we don't have any directions (xs), or the node\n -- has no children (ys), then there's nowhere to go\n | null xs || null ys = [y]\n -- Otherwise, go down subtree \"x\", then recurse with that tree\n -- and the rest of the directions (xs)\n | otherwise = y:treePath (tail xs) (ys !! head xs)\n\n-- Always go left (child 0)\n-- i.e., numbers with numerator 1\nmapM_ print $ take 10 $ treePath (repeat 0) sternBrocot\n\n-- Always go right (child 1)\n-- i.e., numbers with denominator 1\nmapM_ print $ take 10 $ treePath (repeat 1) sternBrocot\n```\n\n::: {.cell-output .cell-output-display}\n```\n(1,1)\n(1,2)\n(1,3)\n(1,4)\n(1,5)\n(1,6)\n(1,7)\n(1,8)\n(1,9)\n(1,10)\n```\n:::\n\n::: {.cell-output .cell-output-display}\n```\n(1,1)\n(2,1)\n(3,1)\n(4,1)\n(5,1)\n(6,1)\n(7,1)\n(8,1)\n(9,1)\n(10,1)\n```\n:::\n:::\n\n\nRather than by following paths in the tree, we can instead do a breadth-first search.\nIn other words, we read off each row individually, in order.\nThis gives us our sequence of rational numbers with no repeats.\n\n$$\n\\begin{gather*}\n \\begin{align*}\n \\mathbb{N^+}& ~\\rightarrow~ \\mathbb{Q}\n \\\\\n n & ~\\mapsto~ \\text{bfs}[n]\n \\end{align*}\n \\\\ \\\\\n 1 \\mapsto 1/1,~ \\\\\n 2 \\mapsto 1/2,\\quad 3 \\mapsto 2/1,~ \\\\\n 4 \\mapsto 1/3,\\quad 5 \\mapsto 2/3, \\quad 6 \\mapsto 3/2, \\quad 7 \\mapsto 3/1,~ ...\n\\end{gather*}\n$$\n\nFor convenience, this enumeration is given starting from 1 rather than from 0.\nThis numbering makes it clearer that each row starts with a power of 2,\n since the structure is a binary tree, and the complexity doubles with each row.\nThe enumeration could just as easily start from 0 by starting with $\\N$,\n then getting to $\\N^+$ with $n \\mapsto n+1$.\n\nWe can also write a breadth-first search in Haskell, for posterity:\n\n::: {#91920684 .cell execution_count=11}\n``` {.haskell .cell-code}\nbfs :: Tree a -> [a]\nbfs (Node root children) = bfs' root children where\n -- Place the current node in the list\n bfs' v [] = [v]\n -- Pluck one node off our list of trees, then recurse with\n -- the rest, along with that node's children\n bfs' v ((Node y ys):xs) = v:bfs' y (xs ++ ys)\n\nsternBrocotRationals = bfs sternBrocot\n\nmapM_ putStrLn $ take 10 $ map showAsRational sternBrocotRationals\n```\n\n::: {.cell-output .cell-output-display}\n```\n(1,1) = 1/1\n(1,2) = 1/2\n(2,1) = 
2/1\n(1,3) = 1/3\n(2,3) = 2/3\n(3,2) = 3/2\n(3,1) = 3/1\n(1,4) = 1/4\n(2,5) = 2/5\n(3,5) = 3/5\n```\n:::\n:::\n\n\nThe first entries of this enumeration match the ones given above.\n\n\n### Another Tree\n\nAnother tree of fractions to consider is the tree of binary fractions.\nThese fractions simply consist of odd numbers divided by powers of two.\nThe most convenient way to organize these into a tree is to keep denominators equal\n if the nodes have the same depth from the root.\nWe also stipulate that we arrange the nodes as a binary search tree, like the Stern-Brocot tree.\n\nThe tree starts from 1/1 as before.\nIts children have denominator 2, so we have 1/2 to the left and 3/2 to the right.\nThis is equivalent to subtracting 1/2 for the left step and adding 1/2 for the right step.\nAt the next layer, we want fractions with denominator 4, and do similarly.\nIn terms of adding and subtracting, we just use 1/4 instead of 1/2.\n\n![](./dyadic_fraction_tree.png)\n\nWe can describe this easily in Haskell:\n\n::: {#555c1dba .cell execution_count=12}\n``` {.haskell .cell-code}\n-- Start with 1/1 (i.e., (1, 1))\nbinFracTree = unfoldTree make (1,1) where\n -- Place the first value in the tree, then describe the next\n -- values for `make` in a list:\n make v@(vn, vd)\n = (v, [\n -- double the numerator and denominator, then subtract 1 from the numerator\n (2*vn - 1, 2*vd),\n -- same, but add 1 to the numerator instead\n (2*vn + 1, 2*vd)\n ])\n```\n:::\n\n\nThe entries of this tree have an additional interpretation when converted to their binary expansions.\nThese fractions always terminate in a \"1\" in binary, but ignoring this final entry, starting from the root\n and following \"left\" for 0 and \"right\" for 1 places us at that fraction in the tree.\nIn other words, the binary expansions encode the path from the root to the node.\n\n![](./binary_expansion_tree.png)\n\n\nWhy Bother?\n-----------\n\nThe tree of binary fractions and the Stern-Brocot tree are both infinite binary search trees,\n so we might imagine overlaying one tree over the other, pairing up the individual entries.\n\n![](./question_mark_tree.png)\n\nIn Haskell, we can pair up entries recursively:\n\n::: {#82577de7 .cell execution_count=13}\n``` {.haskell .cell-code}\nzipTree :: Tree a -> Tree b -> Tree (a,b)\n-- Pair the values in the nodes together, then recurse with the child trees\nzipTree (Node x xs) (Node y ys) = Node (x,y) $ zipWith zipTree xs ys\n\nbinarySBTree = zipTree sternBrocot binFracTree\n```\n:::\n\n\nConveniently, both left subtrees of the root fall in the interval (0, 1).\nThe pairing also sends 1 and 1/2 to themselves.\nDoing so establishes a bijection between the rationals and the binary rationals in that interval.\n(For example, 1/3 occupies the same position in the Stern-Brocot tree that 1/4 does in the binary tree,\n and 2/3 the same position as 3/4.)\nRationals are more continuous than integers, so it might be of some curiosity to plot this function.\nWe only have to look at a square over the unit interval. 
Doing so reveals a curious shape:\n\n::: {#66c88259 .cell layout-ncol='2' execution_count=14}\n``` {.haskell .cell-code code-fold=\"true\"}\nimport Data.Tuple (swap)\nimport Data.List (sort)\nimport Data.Bifunctor (bimap, first)\n\nleftSubtree (Node _ (x:_)) = x\n\n-- Divide entries of the (zipped) trees\n() (a,b) = fromIntegral a / fromIntegral b :: Double\nbinarySBDoubles n = take n $ map (bimap () ()) $ bfs $ leftSubtree binarySBTree\n\n(MPL.tightLayout <>) $ uncurry MPL.plot $ unzip $ sort $ map swap $ binarySBDoubles 250\n(MPL.tightLayout <>) $ uncurry MPL.plot $ unzip $ sort $ binarySBDoubles 250\n```\n\n::: {.cell-output .cell-output-display}\n![Binary rationals on the x-axis, rationals on the y-axis](index_files/figure-html/cell-15-output-1.svg){}\n:::\n\n::: {.cell-output .cell-output-display}\n![Rationals on the x-axis, binary rationals on the y-axis](index_files/figure-html/cell-15-output-2.svg){}\n:::\n:::\n\n\nThe plot on the right which maps the rationals to the binary rationals is known as\n [Minkowski's question mark function](https://en.wikipedia.org/wiki/Minkowski%27s_question-mark_function).\nNotice that this function is nearly 1/2 for values near 1/2\n (nearly 1/4 for values near 1/3, nearly 1/8 for values near 1/4, etc.).\n\n\n### I'm Repeating Myself\n\nThe inverse question mark map (which I'll call ¿ for short), besides mapping binary rationals to rationals,\n has an interesting relationship with other rational numbers.\nRecall that we only defined the function in terms of fractions\n which happen to have finite binary expansions.\nThose with infinite binary expansions, such as 1/3 (and indeed, any fraction whose denominator\n isn't a power of 2) aren't defined.\n\n$$\n\\begin{gather*}\n {1 \\over 2} = 0.1_2\n \\\\\n {1 \\over 3} = 0.\\overline{01} = 0.\\textcolor{red}{01}\\textcolor{green}{01}\\textcolor{blue}{01}...\n \\\\\n {1 \\over 4} = 0.01_2\n \\\\\n {1 \\over 5} = 0.\\overline{0011} = 0.\\textcolor{red}{0011}\\textcolor{green}{0011}\\textcolor{blue}{0011}...\n \\\\\n \\vdots\n\\end{gather*}\n$$\n\nWe can persevere if we continue to interpret the binary strings as a path in the tree.\nThis means that for 1/3, we go left initially, then alternate between going left and right.\nAs we do so, let's take note of the values we pass along the way:\n\n::: {#a0b1bf2f .cell execution_count=15}\n``` {.haskell .cell-code}\n-- Follow the path described by the binary expansion of 1/3\noneThirdPath = treePath (0:cycle [0,1]) $ zipTree sternBrocot binFracTree\n```\n:::\n\n\n::: {#f54ff78f .cell .plain execution_count=16}\n``` {.haskell .cell-code code-fold=\"true\"}\ntrimTo n x = if length x > n then \"(too big to show)\" else x\n\ntreePathColumns = columns (\\(_, f) v -> f v) (\\(l, _) -> Headed l) [\n (stringCell \"n\",\n stringCell . fromEither . fmap show),\n (stringCell \"Binary fraction\",\n stringCell . fromEither . fmap (trimTo 10 . showAsRational' . snd . (oneThirdPath !!))),\n (stringCell \"Binary fraction (decimal)\",\n stringCell . fromEither . fmap (show . () . snd . (oneThirdPath !!))),\n (stringCell \"Stern-Brocot rational\",\n stringCell . fromEither . fmap (trimTo 10 . showAsRational' . fst . (oneThirdPath !!))),\n (stringCell \"Stern-Brocot rational (decimal)\",\n stringCell . fromEither . fmap (show . () . fst . 
(oneThirdPath !!)))\n ] where\n fromEither = either id id\n\nrenderTable treePathColumns (map Right [0..8] ++ [Left \"...\", Right 100, Left \"...\"])\n```\n\n::: {.cell-output .cell-output-display}\n```{=html}\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
\n n\n \n Binary fraction\n \n Binary fraction (decimal)\n \n Stern-Brocot rational\n \n Stern-Brocot rational (decimal)\n
\n 0\n \n 1/1\n \n 1.0\n \n 1/1\n \n 1.0\n
\n 1\n \n 1/2\n \n 0.5\n \n 1/2\n \n 0.5\n
\n 2\n \n 1/4\n \n 0.25\n \n 1/3\n \n 0.3333333333333333\n
\n 3\n \n 3/8\n \n 0.375\n \n 2/5\n \n 0.4\n
\n 4\n \n 5/16\n \n 0.3125\n \n 3/8\n \n 0.375\n
\n 5\n \n 11/32\n \n 0.34375\n \n 5/13\n \n 0.38461538461538464\n
\n 6\n \n 21/64\n \n 0.328125\n \n 8/21\n \n 0.38095238095238093\n
\n 7\n \n 43/128\n \n 0.3359375\n \n 13/34\n \n 0.38235294117647056\n
\n 8\n \n 85/256\n \n 0.33203125\n \n 21/55\n \n 0.38181818181818183\n
\n ...\n \n ...\n \n ...\n \n ...\n \n ...\n
\n 100\n \n (too big to show)\n \n 0.3333333333333333\n \n (too big to show)\n \n 0.3819660112501052\n
\n ...\n \n ...\n \n ...\n \n ...\n \n ...\n
\n```\n:::\n:::\n\n\n::: {#9f7b9346 .cell layout-ncol='2' execution_count=17}\n``` {.haskell .cell-code code-fold=\"true\"}\nconvergentsOneThird = map (() . snd) oneThirdPath\nconvergentsSBNumber = map (() . fst) oneThirdPath\n\nplotSequence n = uncurry MPL.plot . unzip . take n . zip ([0..] :: [Int])\n\n(MPL.tightLayout <>) $ plotSequence 20 convergentsOneThird\n(MPL.tightLayout <>) $ plotSequence 20 convergentsSBNumber\n```\n\n::: {.cell-output .cell-output-display}\n![Binary convergents of 1/3](index_files/figure-html/cell-18-output-1.svg){}\n:::\n\n::: {.cell-output .cell-output-display}\n![¿ applied to binary convergents of 1/3, which also appear to converge](index_files/figure-html/cell-18-output-2.svg){}\n:::\n:::\n\n\nBoth sequences appear to converge to a number, with the binary fractions obviously converging to 1/3.\nThe rationals from the Stern-Brocot don't appear to be converging to a repeating decimal.\nLooking closer, the numerators and denominators of the fractions appear to come from the Fibonacci numbers.\nIn fact, the quantity that the fractions approach is $2 - \\varphi$, where φ is the golden ratio.\nThis number is the root of the polynomial $x^2 - 3x + 1$.\n\nIn fact, all degree 2 polynomials have roots that are encoded by a repeating path in the Stern-Brocot tree.\nPut another way, ¿ can be extended to map rationals other than binary fractions to quadratic roots\n (and ? maps quadratic roots to rational numbers).\nThis is easier to understand when writing the quantity as its\n [continued fraction expansion](https://en.wikipedia.org/wiki/Continued_fraction),\n but that's an entirely separate discussion.\n\nEither way, it tells us something interesting: not only can all rational numbers be enumerated,\n but so can quadratic *irrationals*.\n\n\n### The Other Side\n\nI'd like to briefly digress from talking about enumerations and mention the right subtree.\nThe question mark function, as defined here, is only defined on numbers between 0 and 1\n (and even then, technically only rational numbers).\nAccording to Wikipedia's definition, the question mark function is quasi-periodic --\n $?(x + 1) = ?(x) + 1$.\nOn the other hand, according to the definition by pairing up the two trees,\n rationals greater than 1 get mapped to binary fractions between 1 and 2.\n\n::: {#fig-question-mark-linlog .cell layout-ncol='2' execution_count=18}\n``` {.haskell .cell-code code-fold=\"true\"}\nbinarySBDoublesAll n = take n $ map (bimap () ()) $ bfs binarySBTree\n\n(MPL.tightLayout <>) $ uncurry MPL.plot $\n unzip $ sort $ binarySBDoublesAll 250\n(MPL.tightLayout <>) $ uncurry MPL.plot $\n unzip $ map (first log) $ sort $ binarySBDoublesAll 250\n```\n\n::: {.cell-output .cell-output-display}\n![linear x-axis](index_files/figure-html/fig-question-mark-linlog-output-1.svg){#fig-question-mark-linlog-1}\n:::\n\n::: {.cell-output .cell-output-display}\n![(base 2)-logarithmic x-axis](index_files/figure-html/fig-question-mark-linlog-output-2.svg){#fig-question-mark-linlog-2}\n:::\n\nQuestion mark function including right subtree\n:::\n\n\nHere are graphs describing *our* question mark function, on linear and logarithmic plots.\nInstead of repeating, the function continues its self-similar behavior\n as it proceeds onward to infinity (logarithmically).\nThe right graph stretches from -∞, where its value would be 0, to ∞, where its value would be 2.\n\nPersonally, I like this definition a bit better, if only because it matches other ways\n of thinking about the interval (0, 1).\nFor example,\n\n- In 
topology, it's common to show that this interval is homeomorphic to the entire real line\n- It's similar to the [rational functions which appear in stereography](/posts/stereo/1/),\n which continue to infinity instead of being periodic\n- It showcases how the Stern-Brocot tree sorts rational numbers by complexity better\n\nHowever, it's also true that different definitions are good for different things.\nFor example, periodicity matches the intuition that numbers can be decomposed\n into a fractional and integral part.\nIntegral parts grow without bound, while fractional parts are periodic,\n just like the function would be.\n\n\nClosing\n-------\n\nI'd like to draw this discussion of enumerating numbers to a close for now.\nI wrote this article to establish some preliminaries regarding *another* post that I have planned.\nOn the other hand, since I was describing the Stern-Brocot tree, I felt it also pertinent\n to show the question mark function, since it's a very interesting self-similar curve.\nEven then, I have shown them as a curiosity instead of giving them their time in the spotlight.\n\nI have omitted some things I would like to have discussed, such as\n [order type](https://en.wikipedia.org/wiki/Order_type),\n and enumerating things beyond just the quadratic irrationals.\nI may return to some of these topics in the future, such as to show a way to order integer polynomials.\n\nDiagrams created with GeoGebra (because trying to render them in LaTeX would have taken too long)\n and Matplotlib.\n\n", + "markdown": "---\ntitle: \"Numbering Numbers: From 0 to ∞\"\ndescription: |\n How do we count an infinitude of numbers?\nformat:\n html:\n html-math-method: katex\ndate: \"2023-11-26\"\ndate-modified: \"2025-07-22\"\ncategories:\n - algebra\n - question-mark function\n - haskell\n---\n\n\n\nThe infinite is replete with paradoxes.\nSome of the best come from comparing sizes of infinite collections.\nFor example, every natural number can be mapped to a (nonnegative) even number and vice versa.\n\n$$\n\\begin{gather*}\n \\N \\rightarrow 2\\N\n \\\\\n n \\mapsto 2n\n \\\\ \\\\\n 0 \\mapsto 0,~ 1 \\mapsto2,~ 2 \\mapsto 4,~ 3 \\mapsto 6,~ 4 \\mapsto 8, ...\n\\end{gather*}\n$$\n\n(For the purposes of this post, $0 \\in \\N, ~ 0 \\notin \\N^+$)\n\nAll even numbers are \"hit\" by this map (by the definition of an even number),\n and no two natural numbers map to the same even number\n (again, more or less by definition, since $2m = 2n$ implies that $m = n$ over $\\N$).\nTherefore, the map is [one-to-one](https://en.wikipedia.org/wiki/Injective_function)\n and [onto](https://en.wikipedia.org/wiki/Surjective_function),\n so the map is a [bijection](https://en.wikipedia.org/wiki/Bijection).\nA consequence is that the map has an inverse, namely by reversing all of the arrows in the above block\n (i.e., the action of halving an even number).\n\nBijections with the natural numbers are easier to understand as a way to place things\n into a linear sequence.\nIn other words, they enumerate \"some sort of item\"; in this case, even numbers.\n\nIn the finite world, a bijection between two things implies that they have the same size.\nIt makes sense to extend the same logic to the infinite world, but there's a catch.\nThe nonnegative even numbers are clearly a strict subset of the natural numbers,\n but by this argument they have the same size.\n\n$$\n\\begin{matrix}\n 2\\N & \\longleftrightarrow & \\N & \\hookleftarrow & 2\\N\n \\\\\n 0 & \\mapsto & \\textcolor{red}0 & \\dashleftarrow & 
\\textcolor{red}0\n \\\\\n 2 & \\mapsto & 1 & &\n \\\\\n 4 & \\mapsto & \\textcolor{red}2 & \\dashleftarrow & \\textcolor{red}2\n \\\\\n 6 & \\mapsto & 3 & &\n \\\\\n 8 & \\mapsto & \\textcolor{red}4 & \\dashleftarrow & \\textcolor{red}4\n \\\\\n 10 & \\mapsto & 5 & &\n \\\\\n 12 & \\mapsto & \\textcolor{red}6 & \\dashleftarrow & \\textcolor{red}6\n \\\\\n 14 & \\mapsto & 7 & &\n \\\\\n 16 & \\mapsto & \\textcolor{red}8 & \\dashleftarrow & \\textcolor{red}8\n \\\\\n \\vdots & & \\vdots & & \\vdots\n\\end{matrix}\n$$\n\n\nAre we Positive?\n----------------\n\nThe confusion continues if we look at the integers and the naturals.\nIntegers are the natural numbers and their negatives, so it would be intuitive to assume that\n there are twice as many of them as there are naturals (more or less one to account for zero).\nBut since that logic fails for the naturals and the even numbers,\n it fails for the naturals and integers as well.\n\n$$\n\\begin{gather*}\n \\begin{align*}\n \\mathbb{N} &\\rightarrow \\mathbb{Z}\n \\\\\n n &\\mapsto \\left\\{ \\begin{matrix}\n n/2 & n \\text{ even}\n \\\\\n -(n+1)/2 & n \\text{ odd}\n \\end{matrix} \\right.\n \\end{align*}\n \\\\ \\\\\n 0 \\mapsto 0,\\quad 2 \\mapsto 1, \\quad 4 \\mapsto 2, \\quad 6 \\mapsto 3, \\quad 8 \\mapsto 4,~...\n \\\\\n 1 \\mapsto -1, \\quad 3 \\mapsto -2, \\quad 5 \\mapsto -3, \\quad 7 \\mapsto -4, \\quad 9 \\mapsto -5,~...\n\\end{gather*}\n$$\n\nOr, in Haskell[^1]:\n\n[^1]: That is, if you cover your eyes and pretend that `undefined` will never happen,\n and if you ignore that `Int` is bounded, unlike `Integer`.\n\n::: {#775b525e .cell execution_count=3}\n``` {.haskell .cell-code}\ntype Nat = Int\n\nlistIntegers :: Nat -> Int\nlistIntegers n\n | n < 0 = undefined\n | even n = n `div` 2\n | otherwise = -(n + 1) `div` 2\n```\n:::\n\n\nIn other words, this map sends even numbers to the naturals (the inverse of the doubling map)\n and the odds to the negatives.\nThe same arguments about the bijective nature of this map apply as before, and so the paradox persists,\n since naturals are also a strict subset of integers.\n\n\n### Rational Numbers\n\nRationals are a bit worse.\nTo make things a little easier, let's focus on the positive rationals (i.e., fractions excluding 0).\nUnlike the integers, there is no obvious \"next rational\" after (or even before) 1.\nIf there were, we could follow it with its reciprocal, like how an integer is followed\n by its negative in the map above.\n\nOn the other hand, the integers provide a sliver of hope that listing all rational numbers is possible.\nIntegers can be defined as pairs of natural numbers, along with a way of considering two pairs equal.\n\n$$\n\\begin{gather*}\n -1 = (0,1) \\sim_\\Z (1,2) \\sim_\\Z (2,3) \\sim_\\Z (3,4) \\sim_\\Z ...\n \\\\[10pt]\n (a,b) \\sim_\\mathbb{Z} (c,d) \\iff a+d = b+c \\quad a,b,c,d \\in \\mathbb{N}\n \\\\[10pt]\n \\mathbb{Z} := ( \\mathbb{N} \\times \\mathbb{N} ) / \\sim_\\mathbb{Z}\n\\end{gather*}\n$$\n\n::: {#97b6db11 .cell execution_count=4}\n``` {.haskell .cell-code}\nintEqual :: (Nat, Nat) -> (Nat, Nat) -> Bool\nintEqual (a, b) (c, d) = a + d == b + c\n```\n:::\n\n\nThis relation is the same as saying $a - b = c - d$ (i.e., that -1 = 0 - 1, etc.),\n but has the benefit of not requiring subtraction to be defined.\nThis is all the better, since, as grade-schoolers are taught, subtracting a larger natural number\n from a smaller one is impossible.\n\nThe same equivalence definition exists for positive rationals.\nIt is perhaps more familiar, because of the 
emphasis placed on simplifying fractions when learning them.\nWe can [cross-multiply](https://en.wikipedia.org/wiki/Cross-multiplication) fractions to get\n a similar equality condition to the one for integers.\n\n$$\n\\begin{gather*}\n {1 \\over 2} = (1,2) \\sim_\\mathbb{Q} \\overset{2/4}{(2,4)} \\sim_\\mathbb{Q}\n \\overset{3/6}{(3,6)} \\sim_\\mathbb{Q} \\overset{4/8}{(4,8)} \\sim_\\mathbb{Q} ...\n \\\\ \\\\\n (a,b) \\sim_\\mathbb{Q} (c,d) \\iff ad = bc \\quad a,b,c,d \\in \\mathbb{N}^+\n \\\\ ~ \\\\\n \\mathbb{Q^+} := ( \\mathbb{N^+} \\times \\mathbb{N^+} ) / \\sim_\\mathbb{Q}\n\\end{gather*}\n$$\n\n::: {#4c4d2c94 .cell execution_count=5}\n``` {.haskell .cell-code}\nratEqual :: (Nat, Nat) -> (Nat, Nat) -> Bool\nratEqual (a, b) (c, d) = a * d == b * c\n```\n:::\n\n\nWe specify that neither element of the pair can be zero, so this excludes divisions by zero\n (and the especially tricky case of 0/0, which would be equal to all fractions).\nEffectively, this just replaces where addition appears in the integer equivalence with multiplication.\n\n\n### Eliminating Repeats\n\nNaively, to tackle both of these cases, we might consider enumerating pairs of natural numbers.\nWe order them by sums and break ties by sorting on the first index.\n\n::: {#aa044caa .cell execution_count=6}\n``` {.haskell .cell-code}\n-- All pairs of natural numbers that sum to n\nlistPairs :: Nat -> [(Nat, Nat)]\nlistPairs n = [ (k, n - k) | k <- [0..n] ]\n\n-- \"Triangular\" enumeration of all pairs of positive integers\nallPairs :: [(Nat, Nat)]\nallPairs = concatMap listPairs [0..]\n\n-- Use a natural number to index the enumeration of all pairs\nallPairsMap :: Nat -> (Nat, Nat)\nallPairsMap n = allPairs !! n\n```\n:::\n\n\n::: {#6143624a .cell .plain execution_count=7}\n``` {.haskell .cell-code code-fold=\"true\"}\npairEnumeration = columns (\\(_, f) v -> f v) (\\(l, _) -> Headed l) [\n (\"Index\", show . fst),\n (\"Pair (a, b)\", show . snd),\n (\"Sum (a + b)\", show . uncurry (+) . snd),\n (\"Integer (a - b)\", show . uncurry (-) . snd),\n (\"Rational (a+1 / b+1)\", (\\(a, b) -> show (a + 1) ++ \"/\" ++ show (b + 1)) . snd)\n ]\n\nrenderTable (rmap stringCell pairEnumeration) $ take 10 $ zip [0..] allPairs\n```\n\n::: {.cell-output .cell-output-display}\n```{=html}\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
\n Index\n \n Pair (a, b)\n \n Sum (a + b)\n \n Integer (a - b)\n \n Rational (a+1 / b+1)\n
\n 0\n \n (0,0)\n \n 0\n \n 0\n \n 1/1\n
\n 1\n \n (0,1)\n \n 1\n \n -1\n \n 1/2\n
\n 2\n \n (1,0)\n \n 1\n \n 1\n \n 2/1\n
\n 3\n \n (0,2)\n \n 2\n \n -2\n \n 1/3\n
\n 4\n \n (1,1)\n \n 2\n \n 0\n \n 2/2\n
\n 5\n \n (2,0)\n \n 2\n \n 2\n \n 3/1\n
\n 6\n \n (0,3)\n \n 3\n \n -3\n \n 1/4\n
\n 7\n \n (1,2)\n \n 3\n \n -1\n \n 2/3\n
\n 8\n \n (2,1)\n \n 3\n \n 1\n \n 3/2\n
\n 9\n \n (3,0)\n \n 3\n \n 3\n \n 4/1\n
\n```\n:::\n:::\n\n\nThis certainly works to show that naturals and pairs of naturals can be put into bijection,\n but it when interpreting the results as integers or rationals, we double-count several of them.\nThis is easy to see in the case of the integers, but it will also happen in the rationals.\nFor example, the pair (3, 5) would correspond to 4/6 = 2/3, which has already been counted.\n\nIncidentally, Haskell comes with a function called `nubBy`.\nThis function eliminates duplicates according to another function of our choosing.\nWe can also just implement it ourselves and use it to create a naive enumeration of integers and rationals,\n based on the equalities defined earlier:\n\n::: {#7017b14d .cell execution_count=8}\n``` {.haskell .cell-code}\nnubBy :: (a -> a -> Bool) -> [a] -> [a]\nnubBy f = nubBy' [] where\n nubBy' ys [] = []\n nubBy' ys (z:zs)\n -- Ignore this element, something equivalent is in ys\n | any (f z) ys = nubBy' ys zs\n -- Append this element to the result and our internal list\n | otherwise = z:nubBy' (z:ys) zs\n\nallIntegers :: [(Nat, Nat)]\n-- Remove duplicates under integer equality\nallIntegers = nubBy intEqual allPairs\n\nallIntegersMap :: Nat -> (Nat, Nat)\nallIntegersMap n = allIntegers !! n\n\nallRationals :: [(Nat, Nat)]\n-- Add 1 to the numerator and denominator to get rid of 0,\n-- then remove duplicates under fraction equality\nallRationals = nubBy ratEqual $ map (\\(a,b) -> (a+1, b+1)) allPairs\n\nallRationalsMap :: Nat -> (Nat, Nat)\nallRationalsMap n = allRationals !! n\n```\n:::\n\n\nFor completeness's sake, the resulting pairs of each map are as follows\n\n::: {#95e58f2c .cell .plain execution_count=9}\n``` {.haskell .cell-code code-fold=\"true\"}\ncodeCell = htmlCell . Html.code . Html.string\n\nshowAsInteger p@(a,b) = show p ++ \" = \" ++ show (a - b)\nshowAsRational' p@(a,b) = show a ++ \"/\" ++ show b\nshowAsRational p@(a,b) = show p ++ \" = \" ++ showAsRational' p\n\nmapEnumeration = columns (\\(_, f) v -> f v) (\\(l, _) -> Headed l) [\n (stringCell \"n\", stringCell . show),\n (codeCell \"allIntegersMap n\",\n stringCell . showAsInteger . allIntegersMap),\n (codeCell \"allRationalsMap n\",\n stringCell . showAsRational . allRationalsMap)\n ]\n\nrenderTable mapEnumeration [0..9]\n```\n\n::: {.cell-output .cell-output-display}\n```{=html}\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
\n n\n \n \n allIntegersMap n\n \n \n \n allRationalsMap n\n \n
\n 0\n \n (0,0) = 0\n \n (1,1) = 1/1\n
\n 1\n \n (0,1) = -1\n \n (1,2) = 1/2\n
\n 2\n \n (1,0) = 1\n \n (2,1) = 2/1\n
\n 3\n \n (0,2) = -2\n \n (1,3) = 1/3\n
\n 4\n \n (2,0) = 2\n \n (3,1) = 3/1\n
\n 5\n \n (0,3) = -3\n \n (1,4) = 1/4\n
\n 6\n \n (3,0) = 3\n \n (2,3) = 2/3\n
\n 7\n \n (0,4) = -4\n \n (3,2) = 3/2\n
\n 8\n \n (4,0) = 4\n \n (4,1) = 4/1\n
\n 9\n \n (0,5) = -5\n \n (1,5) = 1/5\n
\n```\n:::\n:::\n\n\nNote that the tuples produced by `allIntegers`, when interpreted as integers, happen to coincide\n with the earlier enumeration given by `listIntegers`.\n\n\nTree of Fractions\n-----------------\n\nThere's an entirely separate structure which contains all rationals in least terms.\nIt relies on an operation between two fractions called the *mediant*.\nFor two rational numbers in least terms *p* and *q*, such that *p* < *q*, the mediant is designated *p* ⊕ *q* and will:\n\n1. also be in least terms (with some exceptions, see below),\n2. be larger than *p*, and\n3. be smaller than *q*\n\n$$\n\\begin{gather*}\n p = {a \\over b} < {c \\over d} = q, \\quad \\gcd(a,b) = \\gcd(c,d) = 1\n \\\\ \\\\\n p < p \\oplus q < q \\quad \\phantom{\\gcd(a+c, b+d) = 1}\n \\\\ \\\\\n {a \\over b} < {a+c \\over b+d} < {c \\over d}, \\quad \\gcd(a+c, b+d) = 1\n\\end{gather*}\n$$\n\nWe know our sequence of rationals starts with 1/1, 1/2, and 2/1.\nIf we start as before with 1/1 and want to get the other quantities,\n then we can take its mediants with 0/1 and 1/0, respectively\n (handwaving the fact that the latter isn't a legitimate fraction).\n\n$$\n\\begin{align*}\n && && \\large{1 \\over 1} && &&\n \\\\\n { \\oplus {0 \\over 1} } && \\large{/} && && \\large{\\backslash} ~ && \\oplus {1 \\over 0}\n \\\\\n && \\large{1 \\over 2} && && \\large{2 \\over 1} &&\n\\end{align*}\n$$\n\nWe might try continuing this pattern by doing the same thing to 1/2.\nWe can take its mediant with 0/1 to get 1/3.\nUnfortunately, the mediant of 1/2 and 1/0 is 2/2 (as is the mediant of 2/1 with 0/1),\n which isn't in least terms, and has already appeared as 1/1.\n\nWe could try another fraction that's appeared in the tree.\nUnfortunately, 2/1 suffers from the same issue as 1/0 -- 1/2 ⊕ 2/1 = 3/3, which is\n the same quantity as before, despite both fractions being in least terms.\nOn the other hand, 1/2 ⊕ 1/1 = 2/3, which is in least terms.\nSimilarly, 2/1 ⊕ 1/1 is 3/2, its reciprocal.\n\n$$\n\\begin{align*}\n && && \\large{1 \\over 2} && &&\n \\\\\n { \\oplus {0 \\over 1} } && \\large{/} && && \\large{\\backslash} ~ && \\oplus {1 \\over 1}\n \\\\\n && \\large{1 \\over 3} && && \\large{2 \\over 3} &&\n\\end{align*}\n\\qquad \\qquad\n\\begin{align*}\n && && \\large{2 \\over 1} && &&\n \\\\\n { \\oplus {1 \\over 1} } && \\large{/} && && \\large{\\backslash} ~ && \\oplus {1 \\over 0}\n \\\\\n && \\large{3 \\over 2} && && \\large{3 \\over 1} &&\n\\end{align*}\n$$\n\nThe trick is to notice that a step to the left \"updates\" what the next step to the right looks like.\nSteps to the right behave symmetrically.\nFor example, in the row we just generated, the left child of 2/3 is its mediant with 1/2,\n its right child is its mediant with 1/1.\n\nContinuing this iteration ad infinitum forms the\n [Stern-Brocot tree](https://en.wikipedia.org/wiki/Stern%E2%80%93Brocot_tree).\nA notable feature of this is that it is a\n [binary search tree](https://en.wikipedia.org/wiki/Binary_search_tree) (of infinite height).\nThis means that for any node, the value at the node is greater than all values in the left subtree\n and less than all values in the right subtree.\n\n![](./stern-brocot_tree.png)\n\nThere's a bit of a lie in presenting the tree like this.\nAs a binary tree, it's most convenient to show the nodes spaced evenly, but the distance between\n 1/1 and 2/1 is not typically seen as the same as the distance between 1/1 and 1/2.\n\nWe can implement this in Haskell using `Data.Tree`.\nThis package actually lets you describe 
trees with any number of child nodes,\n but we only need two for the sake of the Stern-Brocot tree.\n\n::: {#688b980d .cell execution_count=10}\n``` {.haskell .cell-code}\nimport Data.Tree\n\n-- Make a tree by applying the function `make` to each node\n-- Start with the root value (1, 1), along with\n-- its left and right steps, (0, 1) and (1, 0)\nsternBrocot = unfoldTree make ((1,1), (0,1), (1,0)) where\n -- Place the first value in the tree, then describe the next\n -- values for `make` in a list:\n make (v@(vn, vd), l@(ln, ld), r@(rn, rd))\n = (v, [\n -- the left value, and its left (unchanged) and right steps...\n ((ln + vn, ld + vd), l, v),\n -- and the right value, and its left and right (unchanged) steps\n ((vn + rn, vd + rd), v, r)\n ])\n```\n:::\n\n\n### Cutting the Tree Down\n\nWe're halfway there. All that remains is to read off every value in the tree as a sequence.\nPerhaps the most naive way would be to read off by always following the left or right child.\nUnfortunately, these give some fairly dull sequences.\n\n::: {#5911816b .cell layout-ncol='2' execution_count=11}\n``` {.haskell .cell-code}\ntreePath :: [Int] -> Tree a -> [a]\ntreePath xs (Node y ys)\n -- If we don't have any directions (xs), or the node\n -- has no children (ys), then there's nowhere to go\n | null xs || null ys = [y]\n -- Otherwise, go down subtree \"x\", then recurse with that tree\n -- and the rest of the directions (xs)\n | otherwise = y:treePath (tail xs) (ys !! head xs)\n\n-- Always go left (child 0)\n-- i.e., numbers with numerator 1\nmapM_ print $ take 10 $ treePath (repeat 0) sternBrocot\n\n-- Always go right (child 1)\n-- i.e., numbers with denominator 1\nmapM_ print $ take 10 $ treePath (repeat 1) sternBrocot\n```\n\n::: {.cell-output .cell-output-display}\n```\n(1,1)\n(1,2)\n(1,3)\n(1,4)\n(1,5)\n(1,6)\n(1,7)\n(1,8)\n(1,9)\n(1,10)\n```\n:::\n\n::: {.cell-output .cell-output-display}\n```\n(1,1)\n(2,1)\n(3,1)\n(4,1)\n(5,1)\n(6,1)\n(7,1)\n(8,1)\n(9,1)\n(10,1)\n```\n:::\n:::\n\n\nRather than by following paths in the tree, we can instead do a breadth-first search.\nIn other words, we read off each row individually, in order.\nThis gives us our sequence of rational numbers with no repeats.\n\n$$\n\\begin{gather*}\n \\begin{align*}\n \\mathbb{N^+}& ~\\rightarrow~ \\mathbb{Q}\n \\\\\n n & ~\\mapsto~ \\text{bfs}[n]\n \\end{align*}\n \\\\ \\\\\n 1 \\mapsto 1/1,~ \\\\\n 2 \\mapsto 1/2,\\quad 3 \\mapsto 2/1,~ \\\\\n 4 \\mapsto 1/3,\\quad 5 \\mapsto 2/3, \\quad 6 \\mapsto 3/2, \\quad 7 \\mapsto 3/1,~ ...\n\\end{gather*}\n$$\n\nFor convenience, this enumeration is given starting from 1 rather than from 0.\nThis numbering makes it clearer that each row starts with a power of 2,\n since the structure is a binary tree, and the complexity doubles with each row.\nThe enumeration could just as easily start from 0 by starting with $\\N$,\n then getting to $\\N^+$ with $n \\mapsto n+1$.\n\nWe can also write a breadth-first search in Haskell, for posterity:\n\n::: {#91920684 .cell execution_count=12}\n``` {.haskell .cell-code}\nbfs :: Tree a -> [a]\nbfs (Node root children) = bfs' root children where\n -- Place the current node in the list\n bfs' v [] = [v]\n -- Pluck one node off our list of trees, then recurse with\n -- the rest, along with that node's children\n bfs' v ((Node y ys):xs) = v:bfs' y (xs ++ ys)\n\nsternBrocotRationals = bfs sternBrocot\n\nmapM_ putStrLn $ take 10 $ map showAsRational sternBrocotRationals\n```\n\n::: {.cell-output .cell-output-display}\n```\n(1,1) = 1/1\n(1,2) = 1/2\n(2,1) = 
2/1\n(1,3) = 1/3\n(2,3) = 2/3\n(3,2) = 3/2\n(3,1) = 3/1\n(1,4) = 1/4\n(2,5) = 2/5\n(3,5) = 3/5\n```\n:::\n:::\n\n\nThe first entries of this enumeration match the ones given above.\n\n\n### Another Tree\n\nAnother tree of fractions to consider is the tree of binary fractions.\nThese fractions simply consist of odd numbers divided by powers of two.\nThe most convenient way to organize these into a tree is to keep denominators equal\n if the nodes have the same depth from the root.\nWe also stipulate that we arrange the nodes as a binary search tree, like the Stern-Brocot tree.\n\nThe tree starts from 1/1 as before.\nIts children have denominator 2, so we have 1/2 to the left and 3/2 to the right.\nThis is equivalent to subtracting 1/2 for the left step and adding 1/2 for the right step.\nAt the next layer, we want fractions with denominator 4, and do similarly.\nIn terms of adding and subtracting, we just use 1/4 instead of 1/2.\n\n![](./dyadic_fraction_tree.png)\n\nWe can describe this easily in Haskell:\n\n::: {#555c1dba .cell execution_count=13}\n``` {.haskell .cell-code}\n-- Start with 1/1 (i.e., (1, 1))\nbinFracTree = unfoldTree make (1,1) where\n -- Place the first value in the tree, then describe the next\n -- values for `make` in a list:\n make v@(vn, vd)\n = (v, [\n -- double the numerator and denominator, then subtract 1 from the numerator\n (2*vn - 1, 2*vd),\n -- same, but add 1 to the numerator instead\n (2*vn + 1, 2*vd)\n ])\n```\n:::\n\n\nThe entries of this tree have an additional interpretation when converted to their binary expansions.\nThese fractions always terminate in a \"1\" in binary, but ignoring this final entry, starting from the root\n and following \"left\" for 0 and \"right\" for 1 places us at that fraction in the tree.\nIn other words, the binary expansions encode the path from the root to the node.\n\n![](./binary_expansion_tree.png)\n\n\nWhy Bother?\n-----------\n\nThe tree of binary fractions and the Stern-Brocot tree are both infinite binary search trees,\n so we might imagine overlaying one tree over the other, pairing up the individual entries.\n\n![](./question_mark_tree.png)\n\nIn Haskell, we can pair up entries recursively:\n\n::: {#82577de7 .cell execution_count=14}\n``` {.haskell .cell-code}\nzipTree :: Tree a -> Tree b -> Tree (a,b)\n-- Pair the values in the nodes together, then recurse with the child trees\nzipTree (Node x xs) (Node y ys) = Node (x,y) $ zipWith zipTree xs ys\n\nbinarySBTree = zipTree sternBrocot binFracTree\n```\n:::\n\n\nConveniently, both left subtrees of the root fall in the interval (0, 1).\nThe pairing also sends 1 and 1/2 to themselves.\nDoing so establishes a bijection between the rationals and the binary rationals in that interval.\n(For example, 1/3 occupies the same position in the Stern-Brocot tree that 1/4 does in the binary tree,\n and 2/3 the same position as 3/4.)\nRationals are more continuous than integers, so it might be of some curiosity to plot this function.\nWe only have to look at a square over the unit interval. 
Doing so reveals a curious shape:\n\n::: {#66c88259 .cell layout-ncol='2' execution_count=15}\n``` {.haskell .cell-code code-fold=\"true\"}\nimport Data.Tuple (swap)\nimport Data.List (sort)\nimport Data.Bifunctor (bimap, first)\n\nleftSubtree (Node _ (x:_)) = x\n\n-- Divide entries of the (zipped) trees\n() (a,b) = fromIntegral a / fromIntegral b :: Double\nbinarySBDoubles n = take n $ map (bimap () ()) $ bfs $ leftSubtree binarySBTree\n\n(MPL.tightLayout <>) $ uncurry MPL.plot $ unzip $ sort $ map swap $ binarySBDoubles 250\n(MPL.tightLayout <>) $ uncurry MPL.plot $ unzip $ sort $ binarySBDoubles 250\n```\n\n::: {.cell-output .cell-output-display}\n![Binary rationals on the x-axis, rationals on the y-axis](index_files/figure-html/cell-15-output-1.svg){}\n:::\n\n::: {.cell-output .cell-output-display}\n![Rationals on the x-axis, binary rationals on the y-axis](index_files/figure-html/cell-15-output-2.svg){}\n:::\n:::\n\n\nThe plot on the right which maps the rationals to the binary rationals is known as\n [Minkowski's question mark function](https://en.wikipedia.org/wiki/Minkowski%27s_question-mark_function).\nNotice that this function is nearly 1/2 for values near 1/2\n (nearly 1/4 for values near 1/3, nearly 1/8 for values near 1/4, etc.).\n\n\n### I'm Repeating Myself\n\nThe inverse question mark map (which I'll call ¿ for short), besides mapping binary rationals to rationals,\n has an interesting relationship with other rational numbers.\nRecall that we only defined the function in terms of fractions\n which happen to have finite binary expansions.\nThose with infinite binary expansions, such as 1/3 (and indeed, any fraction whose denominator\n isn't a power of 2) aren't defined.\n\n$$\n\\begin{gather*}\n {1 \\over 2} = 0.1_2\n \\\\\n {1 \\over 3} = 0.\\overline{01} = 0.\\textcolor{red}{01}\\textcolor{green}{01}\\textcolor{blue}{01}...\n \\\\\n {1 \\over 4} = 0.01_2\n \\\\\n {1 \\over 5} = 0.\\overline{0011} = 0.\\textcolor{red}{0011}\\textcolor{green}{0011}\\textcolor{blue}{0011}...\n \\\\\n \\vdots\n\\end{gather*}\n$$\n\nWe can persevere if we continue to interpret the binary strings as a path in the tree.\nThis means that for 1/3, we go left initially, then alternate between going left and right.\nAs we do so, let's take note of the values we pass along the way:\n\n::: {#a0b1bf2f .cell execution_count=16}\n``` {.haskell .cell-code}\n-- Follow the path described by the binary expansion of 1/3\noneThirdPath = treePath (0:cycle [0,1]) $ zipTree sternBrocot binFracTree\n```\n:::\n\n\n::: {#f54ff78f .cell .plain execution_count=17}\n``` {.haskell .cell-code code-fold=\"true\"}\ntrimTo n x = if length x > n then \"(too big to show)\" else x\n\ntreePathColumns = columns (\\(_, f) v -> f v) (\\(l, _) -> Headed l) [\n (stringCell \"n\",\n stringCell . fromEither . fmap show),\n (stringCell \"Binary fraction\",\n stringCell . fromEither . fmap (trimTo 10 . showAsRational' . snd . (oneThirdPath !!))),\n (stringCell \"Binary fraction (decimal)\",\n stringCell . fromEither . fmap (show . () . snd . (oneThirdPath !!))),\n (stringCell \"Stern-Brocot rational\",\n stringCell . fromEither . fmap (trimTo 10 . showAsRational' . fst . (oneThirdPath !!))),\n (stringCell \"Stern-Brocot rational (decimal)\",\n stringCell . fromEither . fmap (show . () . fst . 
\n::: {#f54ff78f .cell .plain execution_count=17}\n``` {.haskell .cell-code code-fold=\"true\"}\ntrimTo n x = if length x > n then \"(too big to show)\" else x\n\ntreePathColumns = columns (\\(_, f) v -> f v) (\\(l, _) -> Headed l) [\n (stringCell \"n\",\n stringCell . fromEither . fmap show),\n (stringCell \"Binary fraction\",\n stringCell . fromEither . fmap (trimTo 10 . showAsRational' . snd . (oneThirdPath !!))),\n (stringCell \"Binary fraction (decimal)\",\n stringCell . fromEither . fmap (show . (÷) . snd . (oneThirdPath !!))),\n (stringCell \"Stern-Brocot rational\",\n stringCell . fromEither . fmap (trimTo 10 . showAsRational' . fst . (oneThirdPath !!))),\n (stringCell \"Stern-Brocot rational (decimal)\",\n stringCell . fromEither . fmap (show . (÷) . fst . (oneThirdPath !!)))\n ] where\n fromEither = either id id\n\nrenderTable treePathColumns (map Right [0..8] ++ [Left \"...\", Right 100, Left \"...\"])\n```\n\n::: {.cell-output .cell-output-display}\n```{=html}\n<table>\n<thead>\n<tr><th>n</th><th>Binary fraction</th><th>Binary fraction (decimal)</th><th>Stern-Brocot rational</th><th>Stern-Brocot rational (decimal)</th></tr>\n</thead>\n<tbody>\n<tr><td>0</td><td>1/1</td><td>1.0</td><td>1/1</td><td>1.0</td></tr>\n<tr><td>1</td><td>1/2</td><td>0.5</td><td>1/2</td><td>0.5</td></tr>\n<tr><td>2</td><td>1/4</td><td>0.25</td><td>1/3</td><td>0.3333333333333333</td></tr>\n<tr><td>3</td><td>3/8</td><td>0.375</td><td>2/5</td><td>0.4</td></tr>\n<tr><td>4</td><td>5/16</td><td>0.3125</td><td>3/8</td><td>0.375</td></tr>\n<tr><td>5</td><td>11/32</td><td>0.34375</td><td>5/13</td><td>0.38461538461538464</td></tr>\n<tr><td>6</td><td>21/64</td><td>0.328125</td><td>8/21</td><td>0.38095238095238093</td></tr>\n<tr><td>7</td><td>43/128</td><td>0.3359375</td><td>13/34</td><td>0.38235294117647056</td></tr>\n<tr><td>8</td><td>85/256</td><td>0.33203125</td><td>21/55</td><td>0.38181818181818183</td></tr>\n<tr><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n<tr><td>100</td><td>(too big to show)</td><td>0.3333333333333333</td><td>(too big to show)</td><td>0.3819660112501052</td></tr>\n<tr><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n</tbody>\n</table>
\n```\n:::\n:::\n\n\n::: {#9f7b9346 .cell layout-ncol='2' execution_count=18}\n``` {.haskell .cell-code code-fold=\"true\"}\nconvergentsOneThird = map ((÷) . snd) oneThirdPath\nconvergentsSBNumber = map ((÷) . fst) oneThirdPath\n\nplotSequence n = uncurry MPL.plot . unzip . take n . zip ([0..] :: [Int])\n\n(MPL.tightLayout <>) $ plotSequence 20 convergentsOneThird\n(MPL.tightLayout <>) $ plotSequence 20 convergentsSBNumber\n```\n\n::: {.cell-output .cell-output-display}\n![Binary convergents of 1/3](index_files/figure-html/cell-18-output-1.svg){}\n:::\n\n::: {.cell-output .cell-output-display}\n![¿ applied to binary convergents of 1/3, which also appear to converge](index_files/figure-html/cell-18-output-2.svg){}\n:::\n:::\n\n\nBoth sequences appear to converge to a number, with the binary fractions obviously converging to 1/3.\nThe rationals from the Stern-Brocot tree don't appear to be converging to anything with a repeating decimal expansion.\nLooking closer, the numerators and denominators of the fractions appear to come from the Fibonacci numbers.\nIn fact, the quantity that the fractions approach is $2 - \\varphi$, where φ is the golden ratio.\nThis number is a root of the polynomial $x^2 - 3x + 1$.\n\nIn fact, every quadratic irrational is encoded by an eventually repeating path in the Stern-Brocot tree.\nPut another way, ¿ can be extended to map rationals other than binary fractions to quadratic roots\n (and ? maps quadratic roots to rational numbers).\nThis is easier to understand when writing the quantity as its\n [continued fraction expansion](https://en.wikipedia.org/wiki/Continued_fraction),\n but that's an entirely separate discussion.\n\nEither way, it tells us something interesting: not only can all rational numbers be enumerated,\n but so can quadratic *irrationals*.\n\n\n### The Other Side\n\nI'd like to briefly digress from talking about enumerations and mention the right subtree.\nThe question mark function, as defined here, is only defined on numbers between 0 and 1\n (and even then, technically only rational numbers).\nAccording to Wikipedia's definition, the question mark function is quasi-periodic --\n $?(x + 1) = ?(x) + 1$.\nOn the other hand, according to the definition by pairing up the two trees,\n rationals greater than 1 get mapped to binary fractions between 1 and 2.\n\n::: {#fig-question-mark-linlog .cell layout-ncol='2' execution_count=19}\n``` {.haskell .cell-code code-fold=\"true\"}\nbinarySBDoublesAll n = take n $ map (bimap (÷) (÷)) $ bfs binarySBTree\n\n(MPL.tightLayout <>) $ uncurry MPL.plot $\n unzip $ sort $ binarySBDoublesAll 250\n(MPL.tightLayout <>) $ uncurry MPL.plot $\n unzip $ map (first log) $ sort $ binarySBDoublesAll 250\n```\n\n::: {.cell-output .cell-output-display}\n![linear x-axis](index_files/figure-html/fig-question-mark-linlog-output-1.svg){#fig-question-mark-linlog-1}\n:::\n\n::: {.cell-output .cell-output-display}\n![(base 2)-logarithmic x-axis](index_files/figure-html/fig-question-mark-linlog-output-2.svg){#fig-question-mark-linlog-2}\n:::\n\nQuestion mark function including right subtree\n:::\n\n\nHere are graphs describing *our* question mark function, on linear and logarithmic plots.\nInstead of repeating, the function continues its self-similar behavior\n as it proceeds onward to infinity (logarithmically).\nThe right graph stretches from -∞, where its value would be 0, to ∞, where its value would be 2.\n\nPersonally, I like this definition a bit better, if only because it matches other ways\n of thinking about the interval (0, 1).\nFor example,\n\n- In 
topology, it's common to show that this interval is homeomorphic to the entire real line\n- It's similar to the [rational functions which appear in stereography](/posts/math/stereo/1/),\n which continue to infinity instead of being periodic\n- It showcases how the Stern-Brocot tree sorts rational numbers by complexity better\n\nHowever, it's also true that different definitions are good for different things.\nFor example, periodicity matches the intuition that numbers can be decomposed\n into a fractional and integral part.\nIntegral parts grow without bound, while fractional parts are periodic,\n just like the function would be.\n\n\nClosing\n-------\n\nI'd like to draw this discussion of enumerating numbers to a close for now.\nI wrote this article to establish some preliminaries regarding *another* post that I have planned.\nOn the other hand, since I was describing the Stern-Brocot tree, I felt it also pertinent\n to show the question mark function, since it's a very interesting self-similar curve.\nEven then, I have shown them as a curiosity instead of giving them their time in the spotlight.\n\nI have omitted some things I would like to have discussed, such as\n [order type](https://en.wikipedia.org/wiki/Order_type),\n and enumerating things beyond just the quadratic irrationals.\nI may return to some of these topics in the future, such as to show a way to order integer polynomials.\n\nDiagrams created with GeoGebra (because trying to render them in LaTeX would have taken too long)\n and Matplotlib.\n\n", "supporting": [ - "index_files" + "index_files/figure-html" ], "filters": [], "includes": { diff --git a/_freeze/posts/math/polycount/4/appendix/index/execute-results/html.json b/_freeze/posts/math/polycount/4/appendix/index/execute-results/html.json index 8420d57..0280590 100644 --- a/_freeze/posts/math/polycount/4/appendix/index/execute-results/html.json +++ b/_freeze/posts/math/polycount/4/appendix/index/execute-results/html.json @@ -1,10 +1,10 @@ { - "hash": "aa6be74cf2b25008676feea358e2a9ba", + "hash": "75955c8622a399b72ae2c26309a190aa", "result": { "engine": "jupyter", - "markdown": "---\ntitle: \"Polynomial Counting 4, Addendum\"\ndescription: |\n Additional notes on irrational -adic expansions, including complex embeddings thereof.\nformat:\n html:\n html-math-method: katex\ndate: \"2025-03-03\"\ncategories:\n - algebra\n - haskell\n - interactive\n---\n\n\n\nAfter converting my original [Two 2's post](../), I grew pleased with making its diagrams\n and content more reproducible.\nHowever, I noticed some things which required further examination.\n\nFirst, let's write out a double-width carry function more concretely in Haskell.\n\n::: {#0676e879 .cell execution_count=3}\n``` {.haskell .cell-code}\n-- Widened carry of a particular repeated amount\n-- i.e., for carry2 2, the carry is 22 = 100\ncarry2 b = carry2' []\n where carry2' zs (x:y:z:xs)\n | q == 0 = carry2' (x:zs) (y:z:xs) -- try carrying at a higher place value\n | otherwise = foldl (flip (:)) ys zs -- carry here\n where ys = r : y-x+r : z+q : xs\n (q, r) = x `quotRem` b\n```\n:::\n\n\nIn the parent post, it was discussed that the integer four has a non-repeating expansion\n when expressed as an *κ*-adic integer.\nLet's put this to the test by writing out each step for producing expansions of two and four.\nWe'll also test these expansions by evaluating them at an approximation of $\\kappa = \\sqrt 3 + 1$.\nWe should roughly get two and four at each step.\n\n::: {#3d716217 .cell layout-ncol='2' 
execution_count=4}\n``` {.haskell .cell-code code-fold=\"true\"}\n-- Directly expand the integer `n`, using the `carry` showing each step for `count` steps\nexpandSteps count carry n = take count $ iterate carry $ n:replicate count 0\n\n-- Horner evaluation on polynomials of ascending powers\nhornerEval x = foldr (\\c a -> a * x + c) 0\n-- Pair a polynomial with its evaluation at x\npairEval x = (,) <*> hornerEval x . map fromIntegral\n\nadicTwoSteps = map (pairEval (sqrt 3 + 1)) $ expandSteps 10 (carry2 2) 2\nadicFourSteps = map (pairEval (sqrt 3 + 1)) $ expandSteps 10 (carry2 2) 4\n\nmarkdown \"Iteratively carrying \\\"2\\\"\"\nmarkdown \"Iteratively carrying \\\"4\\\"\"\n\nputStrLn . unlines . map show $ adicTwoSteps\nputStrLn . unlines . map show $ adicFourSteps\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\nIteratively carrying \"2\"\n:::\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\nIteratively carrying \"4\"\n:::\n\n::: {.cell-output .cell-output-display}\n```\n([2,0,0,0,0,0,0,0,0,0,0],2.0)\n([0,-2,1,0,0,0,0,0,0,0,0],1.9999999999999996)\n([0,0,3,-1,0,0,0,0,0,0,0],2.0000000000000004)\n([0,0,1,-3,1,0,0,0,0,0,0],1.9999999999999982)\n([0,0,1,-1,3,-1,0,0,0,0,0],2.000000000000005)\n([0,0,1,-1,1,-3,1,0,0,0,0],1.9999999999999867)\n([0,0,1,-1,1,-1,3,-1,0,0,0],2.0000000000000364)\n([0,0,1,-1,1,-1,1,-3,1,0,0],1.9999999999999005)\n([0,0,1,-1,1,-1,1,-1,3,-1,0],2.000000000000272)\n([0,0,1,-1,1,-1,1,-1,1,-3,1],1.999999999999257)\n```\n:::\n\n::: {.cell-output .cell-output-display}\n```\n([4,0,0,0,0,0,0,0,0,0,0],4.0)\n([0,-4,2,0,0,0,0,0,0,0,0],3.999999999999999)\n([0,0,6,-2,0,0,0,0,0,0,0],4.000000000000001)\n([0,0,0,-8,3,0,0,0,0,0,0],4.000000000000003)\n([0,0,0,0,11,-4,0,0,0,0,0],4.000000000000021)\n([0,0,0,0,1,-14,5,0,0,0,0],3.9999999999998543)\n([0,0,0,0,1,0,19,-7,0,0,0],4.000000000000845)\n([0,0,0,0,1,0,1,-25,9,0,0],4.000000000000486)\n([0,0,0,0,1,0,1,-1,33,-12,0],3.9999999999982596)\n([0,0,0,0,1,0,1,-1,1,-44,16],3.999999999986544)\n```\n:::\n:::\n\n\nNote that when an list is displayed in this post, it should be interpreted as an expansion\n in increasing powers.\n\n\nExpansions of *κ*-adics\n-----------------------\n\nIn the original post, we ignored expansions other than the chaotic expansions of four.\nConsider that we can use *two different* expansions for three, since we are using a balanced alphabet:\n\n::: {#8b74477c .cell layout-ncol='2' execution_count=5}\n``` {.haskell .cell-code code-fold=\"true\"}\nmarkdown \"Increment two:\"\nmarkdown \"Decrement three:\"\n\nadicThree = (1:) $ tail $ fst $ last adicTwoSteps\nprint adicThree\n\nadicThree' = ((-1):) $ tail $ fst $ last adicFourSteps\nprint adicThree'\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\nIncrement two:\n:::\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\nDecrement three:\n:::\n\n::: {.cell-output .cell-output-display}\n```\n[1,0,1,-1,1,-1,1,-1,1,-3,1]\n```\n:::\n\n::: {.cell-output .cell-output-display}\n```\n[-1,0,0,0,1,0,1,-1,1,-44,16]\n```\n:::\n:::\n\n\nFor convenience (and correctness), we'll retain the carry heads rather than truncating them.\n\nHow can we be sure that both of these are expansions of three?\nEasy. Just negate every term of one of them, then add them together to see if they cancel.\n\n::: {#5b125f65 .cell execution_count=6}\n``` {.haskell .cell-code}\nmaybeZero = take 10 $ iterate (carry2 2) $ zipWith (+) adicThree (map negate adicThree')\nputStrLn . unlines . 
map show $ maybeZero\n```\n\n::: {.cell-output .cell-output-display}\n```\n[2,0,1,-1,0,-1,0,0,0,41,-15]\n[0,-2,2,-1,0,-1,0,0,0,41,-15]\n[0,0,4,-2,0,-1,0,0,0,41,-15]\n[0,0,0,-6,2,-1,0,0,0,41,-15]\n[0,0,0,0,8,-4,0,0,0,41,-15]\n[0,0,0,0,0,-12,4,0,0,41,-15]\n[0,0,0,0,0,0,16,-6,0,41,-15]\n[0,0,0,0,0,0,0,-22,8,41,-15]\n[0,0,0,0,0,0,0,0,30,30,-15]\n[0,0,0,0,0,0,0,0,0,0,0]\n```\n:::\n:::\n\n\nAs you can see, eventually we completely clear the number and just get list of zeros.\n\nWhich of these expansions is more valid?\nI would argue that we should prefer expansions obtained by incrementing, since the natural numbers\n are built in the same way.\nThis is flimsy, though. By doing this, negative one will never appear in the one's place.\n\nIt's also worth pointing out that we only have this choice at odd numbers.\nAt even numbers, the one's place is always zero.\nIt's easy to see why this is the case -- if there's a \"2\" in the one's place, it can be carried,\n just like in binary.\n\n\nBalanced vs. Binary *κ*-adics\n-----------------------------\n\nA more unusual consequence of the carry is that despite the initial choice of alphabet,\n [negative numerals can be cleared](../#all-positive)\n from the expansion by using an extra-greedy borrow.\n\nThere is only one significant difference in how the two are implemented:\n the choice of integer division function.\nIn Haskell, there exists both `quotRem` and `divMod`, with the two disagreeing on negative remainders.\n\n::: {#50408001 .cell layout-ncol='2' execution_count=7}\n``` {.haskell .cell-code code-fold=\"true\"}\nmarkdown \"`quotRem`\"\nmarkdown \"`divMod`\"\n\nprint $ quotRem (-27) 5\nprint $ divMod (-27) 5\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n`quotRem`\n:::\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n`divMod`\n:::\n\n::: {.cell-output .cell-output-display}\n```\n(-5,-2)\n```\n:::\n\n::: {.cell-output .cell-output-display}\n```\n(-6,3)\n```\n:::\n:::\n\n\nWe can factor our choice out of our two digit-wide carry function by passing it as an argument:\n\n::: {#f7b54625 .cell execution_count=8}\n``` {.haskell .cell-code}\ncarry2QR qr b = carry2QR' []\n where carry2QR' zs (x:y:z:xs)\n | q == 0 = carry2QR' (x:zs) (y:z:xs) -- try carrying at a higher place value\n | otherwise = foldl (flip (:)) ys zs -- carry here\n where ys = r : y-x+r : z+q : xs\n (q, r) = x `qr` b\n```\n:::\n\n\nNow, let's compare the iterates of the two options by applying them to \"4\":\n\n::: {#e8249842 .cell layout-ncol='2' execution_count=9}\n``` {.haskell .cell-code code-fold=\"true\"}\ncendree4QuotRem10Steps = map (pairEval (sqrt 3 + 1)) $ expandSteps 10 (carry2QR quotRem 2) 4\ncendree4DivMod10Steps = map (pairEval (sqrt 3 + 1)) $ expandSteps 10 (carry2QR divMod 2) 4\n\nmarkdown \"`quotRem`\"\nmarkdown \"`divMod`\"\n\nputStrLn . unlines . map show $ cendree4QuotRem10Steps\nputStrLn . unlines . 
map show $ cendree4DivMod10Steps\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n`quotRem`\n:::\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n`divMod`\n:::\n\n::: {.cell-output .cell-output-display}\n```\n([4,0,0,0,0,0,0,0,0,0,0],4.0)\n([0,-4,2,0,0,0,0,0,0,0,0],3.999999999999999)\n([0,0,6,-2,0,0,0,0,0,0,0],4.000000000000001)\n([0,0,0,-8,3,0,0,0,0,0,0],4.000000000000003)\n([0,0,0,0,11,-4,0,0,0,0,0],4.000000000000021)\n([0,0,0,0,1,-14,5,0,0,0,0],3.9999999999998543)\n([0,0,0,0,1,0,19,-7,0,0,0],4.000000000000845)\n([0,0,0,0,1,0,1,-25,9,0,0],4.000000000000486)\n([0,0,0,0,1,0,1,-1,33,-12,0],3.9999999999982596)\n([0,0,0,0,1,0,1,-1,1,-44,16],3.999999999986544)\n```\n:::\n\n::: {.cell-output .cell-output-display}\n```\n([4,0,0,0,0,0,0,0,0,0,0],4.0)\n([0,-4,2,0,0,0,0,0,0,0,0],3.999999999999999)\n([0,0,6,-2,0,0,0,0,0,0,0],4.000000000000001)\n([0,0,0,-8,3,0,0,0,0,0,0],4.000000000000003)\n([0,0,0,0,11,-4,0,0,0,0,0],4.000000000000021)\n([0,0,0,0,1,-14,5,0,0,0,0],3.9999999999998543)\n([0,0,0,0,1,0,19,-7,0,0,0],4.000000000000845)\n([0,0,0,0,1,0,1,-25,9,0,0],4.000000000000486)\n([0,0,0,0,1,0,1,1,35,-13,0],4.000000000005557)\n([0,0,0,0,1,0,1,1,1,-47,17],3.9999999999669607)\n```\n:::\n:::\n\n\nFortunately, regardless of which function we pick, the evaluation roughly gives four at each step.\nNote that since `quotRem` allows negative remainders, implementing the carry with it causes\n negative numbers to show up in our expansions.\nConversely, negative numbers *cannot* show up if we use `divMod`.\n\n\n### Chaos before Four\n\nRecall again the series for two in the *κ*-adics:\n\n$$\n2 = ...1\\bar{1}1\\bar{1}1\\bar{1}100_{\\kappa}\n$$\n\nSince the `divMod` implementation clears negative numbers from expansions, we can try using it on this series.\nThe result is another chaotic series:\n\n::: {#27acbab0 .cell execution_count=10}\n``` {.haskell .cell-code}\ncendree2DivModCycleExpansion = take 11 $\n iterate (carry2QR divMod 2) $ take 15 $ 0:0:cycle [1,-1]\nputStrLn . unlines . map show $ cendree2DivModCycleExpansion\n```\n\n::: {.cell-output .cell-output-display}\n```\n[0,0,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1]\n[0,0,1,1,3,-2,1,-1,1,-1,1,-1,1,-1,1]\n[0,0,1,1,1,-4,2,-1,1,-1,1,-1,1,-1,1]\n[0,0,1,1,1,0,6,-3,1,-1,1,-1,1,-1,1]\n[0,0,1,1,1,0,0,-9,4,-1,1,-1,1,-1,1]\n[0,0,1,1,1,0,0,1,14,-6,1,-1,1,-1,1]\n[0,0,1,1,1,0,0,1,0,-20,8,-1,1,-1,1]\n[0,0,1,1,1,0,0,1,0,0,28,-11,1,-1,1]\n[0,0,1,1,1,0,0,1,0,0,0,-39,15,-1,1]\n[0,0,1,1,1,0,0,1,0,0,0,1,55,-21,1]\n[0,0,1,1,1,0,0,1,0,0,0,1,1,-75,28]\n```\n:::\n:::\n\n\nNote that in this case, we're truncating the alternating series!\nThis means that naively evaluating the series as before will not give the correct value.\n\nTo check the validity of this series, we can check that we get the same series before the truncated elements\n by expanding 2 directly:\n\n::: {#e8d9f36e .cell execution_count=11}\n``` {.haskell .cell-code}\ncendree2DivMod15Steps = map (pairEval (sqrt 3 + 1)) $\n expandSteps 15 (carry2QR divMod 2) 2\nputStrLn . unlines . 
map show $ cendree2DivMod15Steps\n```\n\n::: {.cell-output .cell-output-display}\n```\n([2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],2.0)\n([0,-2,1,0,0,0,0,0,0,0,0,0,0,0,0,0],1.9999999999999996)\n([0,0,3,-1,0,0,0,0,0,0,0,0,0,0,0,0],2.0000000000000004)\n([0,0,1,-3,1,0,0,0,0,0,0,0,0,0,0,0],1.9999999999999982)\n([0,0,1,1,5,-2,0,0,0,0,0,0,0,0,0,0],2.0000000000000115)\n([0,0,1,1,1,-6,2,0,0,0,0,0,0,0,0,0],1.9999999999999758)\n([0,0,1,1,1,0,8,-3,0,0,0,0,0,0,0,0],1.9999999999999394)\n([0,0,1,1,1,0,0,-11,4,0,0,0,0,0,0,0],1.9999999999995541)\n([0,0,1,1,1,0,0,1,16,-6,0,0,0,0,0,0],1.999999999999047)\n([0,0,1,1,1,0,0,1,0,-22,8,0,0,0,0,0],1.999999999993247)\n([0,0,1,1,1,0,0,1,0,0,30,-11,0,0,0,0],2.000000000015452)\n([0,0,1,1,1,0,0,1,0,0,0,-41,15,0,0,0],2.0000000000454636)\n([0,0,1,1,1,0,0,1,0,0,0,1,57,-21,0,0],1.9999999998345677)\n([0,0,1,1,1,0,0,1,0,0,0,1,1,-77,28,0],1.9999999961807242)\n([0,0,1,1,1,0,0,1,0,0,0,1,1,1,106,-39],1.999999997642363)\n```\n:::\n:::\n\n\nUp to the carry head, the series are the same.\nWe can do the same thing to get a series for negative one, a number which has a terminating expansion\n in the balanced alphabet.\n\n::: {#7600aa54 .cell execution_count=12}\n``` {.haskell .cell-code}\ncendreeNeg1DivMod15Steps = map (pairEval (sqrt 3 + 1)) $\n expandSteps 15 (carry2QR divMod 2) (-1)\nputStrLn . unlines . map show $ cendreeNeg1DivMod15Steps\n```\n\n::: {.cell-output .cell-output-display}\n```\n([-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],-1.0)\n([1,2,-1,0,0,0,0,0,0,0,0,0,0,0,0,0],-0.9999999999999996)\n([1,0,-3,1,0,0,0,0,0,0,0,0,0,0,0,0],-1.0000000000000004)\n([1,0,1,5,-2,0,0,0,0,0,0,0,0,0,0,0],-0.9999999999999958)\n([1,0,1,1,-6,2,0,0,0,0,0,0,0,0,0,0],-1.0000000000000089)\n([1,0,1,1,0,8,-3,0,0,0,0,0,0,0,0,0],-1.0000000000000222)\n([1,0,1,1,0,0,-11,4,0,0,0,0,0,0,0,0],-1.000000000000163)\n([1,0,1,1,0,0,1,16,-6,0,0,0,0,0,0,0],-1.0000000000003486)\n([1,0,1,1,0,0,1,0,-22,8,0,0,0,0,0,0],-1.0000000000024718)\n([1,0,1,1,0,0,1,0,0,30,-11,0,0,0,0,0],-0.9999999999943441)\n([1,0,1,1,0,0,1,0,0,0,-41,15,0,0,0,0],-0.9999999999833591)\n([1,0,1,1,0,0,1,0,0,0,1,57,-21,0,0,0],-1.0000000000605525)\n([1,0,1,1,0,0,1,0,0,0,1,1,-77,28,0,0],-1.000000001397952)\n([1,0,1,1,0,0,1,0,0,0,1,1,1,106,-39,0],-1.000000000862955)\n([1,0,1,1,0,0,1,0,0,0,1,1,1,0,-145,53],-1.0000000211727116)\n```\n:::\n:::\n\n\nThe most natural property of negative one should be that if we add one to it, we get zero.\nIf we take the last iterate of this, increment the zeroth place value, and apply the carry,\n we find that everything clears properly.\n\n::: {#32d73dbb .cell execution_count=13}\n``` {.haskell .cell-code}\ncendree0FromNeg1DivMod = (\\(x:xs) -> (x + 1):xs) $ fst $ last cendreeNeg1DivMod15Steps\ncendree0IncrementSteps = map (pairEval (sqrt 3 + 1)) $ take 15 $\n iterate (carry2QR divMod 2) cendree0FromNeg1DivMod\n\nputStrLn . unlines . 
map show $ cendree0IncrementSteps\n```\n\n::: {.cell-output .cell-output-display}\n```\n([2,0,1,1,0,0,1,0,0,0,1,1,1,0,-145,53],-2.117271158397216e-8)\n([0,-2,2,1,0,0,1,0,0,0,1,1,1,0,-145,53],-2.117271183049399e-8)\n([0,0,4,0,0,0,1,0,0,0,1,1,1,0,-145,53],-2.1172712567825478e-8)\n([0,0,0,-4,2,0,1,0,0,0,1,1,1,0,-145,53],-2.1172716608068815e-8)\n([0,0,0,0,6,-2,1,0,0,0,1,1,1,0,-145,53],-2.11727015296754e-8)\n([0,0,0,0,0,-8,4,0,0,0,1,1,1,0,-145,53],-2.117275780300572e-8)\n([0,0,0,0,0,0,12,-4,0,0,1,1,1,0,-145,53],-2.1172363115313245e-8)\n([0,0,0,0,0,0,0,-16,6,0,1,1,1,0,-145,53],-2.1172322503707736e-8)\n([0,0,0,0,0,0,0,0,22,-8,1,1,1,0,-145,53],-2.1172474068282878e-8)\n([0,0,0,0,0,0,0,0,0,-30,12,1,1,0,-145,53],-2.1179440228212383e-8)\n([0,0,0,0,0,0,0,0,0,0,42,-14,1,0,-145,53],-2.1235751278905916e-8)\n([0,0,0,0,0,0,0,0,0,0,0,-56,22,0,-145,53],-2.1138031916672377e-8)\n([0,0,0,0,0,0,0,0,0,0,0,0,78,-28,-145,53],-1.9659634780718795e-8)\n([0,0,0,0,0,0,0,0,0,0,0,0,0,-106,-106,53],-2.0141670404689492e-8)\n([0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],0.0)\n```\n:::\n:::\n\n\nNaturally, it should be possible to use the the expansions of negative one and two in tandem on any\n series in the balanced alphabet to convert it to the binary alphabet.\nActually demonstrating this and proving it is left as an exercise.\n\n\nAre these really -adic?\n-----------------------\n\nPerhaps it is still unconvincing that expanding the integers in this way gives something\n indeed related to *p*-adics.\nIn fact, since the expansions are in binary or (balanced) ternary, the integers should just\n be a subset of the 2-adics or 3-adics.\n\nStill, I wanted to see what these numbers actually \"look\" like, so I whipped up an interactive diagram.\nYou should definitely see [this page](/interactive/adic/) for more information, but\n the gist is that *p*-adics can be sent into the complex plane in a fractal-like way.\n\n\n```{ojs}\n// | echo: false\n\n/*\nFileAttachments:\n ./cendree_DivMod_count_1024_256_digits.csv: ./cendree_DivMod_count_1024_256_digits.csv\n ./cendree_QuotRem_count_1024_256_digits.csv: ./cendree_QuotRem_count_1024_256_digits.csv\n*/\n\n// Import expansions from file.\n// Odd numbers are injected by replacing the first entry of each row with \"1\"\nasIntegers = (x) => {\n let xs = x.split(\"\\n\").map((y) => y.split(\",\").map((z) => +z))\n return [...xs, ...(xs.map((ys) => ys.with(0, 1)))]\n};\n\nadicExpansionsDivMod = FileAttachment(\n \"./cendree_DivMod_count_1024_256_digits.csv\"\n).text().then(asIntegers);\nadicExpansionsQuotRem = FileAttachment(\n \"./cendree_QuotRem_count_1024_256_digits.csv\"\n).text().then(asIntegers);\n\nimport { expansions as oldExpansions } with { base as base } from \"../../../../interactive/p-adics/showAdic.ojs\";\n\nexpansionsOrAdics = baseSelector == \"b-adic\"\n ? oldExpansions\n : baseSelector == \"κ-adic, balanced\"\n ? adicExpansionsQuotRem\n : baseSelector == \"κ-adic, binary\"\n ? 
adicExpansionsDivMod\n : d3.range(adicExpansionsQuotRem.length / 10).map(() => d3.range(15).map(() => +(Math.random() > 0.5)))\n\nimport { plot } with {\n expansionsOrAdics as expansions,\n embedBase as embedBase,\n geometric as geometric,\n} from \"../../../../interactive/p-adics/showAdic.ojs\";\n\nviewof baseSelector = Inputs.radio([\n \"b-adic\",\n \"κ-adic, balanced\",\n \"κ-adic, binary\",\n \"Random Binary\",\n], {\n value: \"b-adic\",\n label: \"Expansions\",\n});\n\nviewof base = Inputs.range([2, 5], {\n value: 2,\n step: 1,\n label: \"Base of expansions (b)\",\n disabled: baseSelector != \"b-adic\",\n});\n\nviewof embedBase = Inputs.range([2, 5], {\n value: 2,\n step: 0.1,\n label: \"Embedding base (p)\",\n});\n\nviewof geometric = Inputs.range([0.005, 0.995], {\n value: 0.9,\n step: 0.005,\n label: \"Geometric ratio (c)\",\n});\n\nplot\n\n```\n\nFirst, notice that with $b = 2$ and $p = 2$, switching between the \"b-adic\" option\n and the \"*κ*-adic\" option appears cause some points to appear and disappear.\nIt is easiest to see this when $c \\approx 0.5$.\nThis corresponds with the intuition that these are subsets of the 2- and 3-adics.\n\nNext, notice that when plotting the *κ*-adics, there is some self-similarity different\n from the 2- and 3-adics.\nTo see this, try setting $c \\approx 0.75$.\nThere appear to be four clusters, with the topmost and rightmost appearing to be similar to one another.\nWithin these two clusters, the rightmost portion of them appears to be the same shape\n as the larger figure.\nIf you try switching between the *κ*-adic options, you can even see the smaller\n and larger shapes changing in the same way as one another.\n\nThis is actually great news -- if you switch between the *κ*-adics and the \"random binary\" option,\n you can see that the latter option tends to the same pattern as the 2-adics.\nThus, even if the expansions for the integers are individually chaotic, together they possess a\n much different structure than pure randomness.\n\nIf you prefer not to use JavaScript, I also prepared a [Python script](./kadic.py) using Matplotlib.[^1]\nHere are a couple of screenshots from the script, which demonstrates the self-similarity mentioned above.\n\n:::: {.row}\n::: {layout-ncol=\"2\"}\n![`quotRem`](./cendree_quotrem_fractal.png)\n\n![`divMod`](./cendree_divmod_fractal.png)\n:::\n\nClusters of *κ*-adics, with self-similar patterns boxed in red.\n::::\n\n[^1]: You will also need the [`divMod`](./cendree_DivMod_count_1024_256_digits.csv)\n and [`quotRem`](./cendree_QuotRem_count_1024_256_digits.csv) data files.\n\n", + "markdown": "---\ntitle: \"Polynomial Counting 4, Addendum\"\ndescription: |\n Additional notes on irrational -adic expansions, including complex embeddings thereof.\nformat:\n html:\n html-math-method: katex\ndate: \"2025-03-03\"\ncategories:\n - algebra\n - haskell\n - interactive\n---\n\n\n\nAfter converting my original [Two 2's post](../), I grew pleased with making its diagrams\n and content more reproducible.\nHowever, I noticed some things which required further examination.\n\nFirst, let's write out a double-width carry function more concretely in Haskell.\n\n::: {#6043b2f0 .cell execution_count=3}\n``` {.haskell .cell-code}\n-- Widened carry of a particular repeated amount\n-- i.e., for carry2 2, the carry is 22 = 100\ncarry2 b = carry2' []\n where carry2' zs (x:y:z:xs)\n | q == 0 = carry2' (x:zs) (y:z:xs) -- try carrying at a higher place value\n | otherwise = foldl (flip (:)) ys zs -- carry here\n where ys = r : y-x+r : 
z+q : xs\n (q, r) = x `quotRem` b\n```\n:::\n\n\nIn the parent post, it was discussed that the integer four has a non-repeating expansion\n when expressed as an *κ*-adic integer.\nLet's put this to the test by writing out each step for producing expansions of two and four.\nWe'll also test these expansions by evaluating them at an approximation of $\\kappa = \\sqrt 3 + 1$.\nWe should roughly get two and four at each step.\n\n::: {#61215a8d .cell layout-ncol='2' execution_count=4}\n``` {.haskell .cell-code code-fold=\"true\"}\n-- Directly expand the integer `n`, using the `carry` showing each step for `count` steps\nexpandSteps count carry n = take count $ iterate carry $ n:replicate count 0\n\n-- Horner evaluation on polynomials of ascending powers\nhornerEval x = foldr (\\c a -> a * x + c) 0\n-- Pair a polynomial with its evaluation at x\npairEval x = (,) <*> hornerEval x . map fromIntegral\n\nadicTwoSteps = map (pairEval (sqrt 3 + 1)) $ expandSteps 10 (carry2 2) 2\nadicFourSteps = map (pairEval (sqrt 3 + 1)) $ expandSteps 10 (carry2 2) 4\n\nmarkdown \"Iteratively carrying \\\"2\\\"\"\nmarkdown \"Iteratively carrying \\\"4\\\"\"\n\nputStrLn . unlines . map show $ adicTwoSteps\nputStrLn . unlines . map show $ adicFourSteps\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\nIteratively carrying \"2\"\n:::\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\nIteratively carrying \"4\"\n:::\n\n::: {.cell-output .cell-output-display}\n```\n([2,0,0,0,0,0,0,0,0,0,0],2.0)\n([0,-2,1,0,0,0,0,0,0,0,0],1.9999999999999996)\n([0,0,3,-1,0,0,0,0,0,0,0],2.0000000000000004)\n([0,0,1,-3,1,0,0,0,0,0,0],1.9999999999999982)\n([0,0,1,-1,3,-1,0,0,0,0,0],2.000000000000005)\n([0,0,1,-1,1,-3,1,0,0,0,0],1.9999999999999867)\n([0,0,1,-1,1,-1,3,-1,0,0,0],2.0000000000000364)\n([0,0,1,-1,1,-1,1,-3,1,0,0],1.9999999999999005)\n([0,0,1,-1,1,-1,1,-1,3,-1,0],2.000000000000272)\n([0,0,1,-1,1,-1,1,-1,1,-3,1],1.999999999999257)\n```\n:::\n\n::: {.cell-output .cell-output-display}\n```\n([4,0,0,0,0,0,0,0,0,0,0],4.0)\n([0,-4,2,0,0,0,0,0,0,0,0],3.999999999999999)\n([0,0,6,-2,0,0,0,0,0,0,0],4.000000000000001)\n([0,0,0,-8,3,0,0,0,0,0,0],4.000000000000003)\n([0,0,0,0,11,-4,0,0,0,0,0],4.000000000000021)\n([0,0,0,0,1,-14,5,0,0,0,0],3.9999999999998543)\n([0,0,0,0,1,0,19,-7,0,0,0],4.000000000000845)\n([0,0,0,0,1,0,1,-25,9,0,0],4.000000000000486)\n([0,0,0,0,1,0,1,-1,33,-12,0],3.9999999999982596)\n([0,0,0,0,1,0,1,-1,1,-44,16],3.999999999986544)\n```\n:::\n:::\n\n\nNote that when an list is displayed in this post, it should be interpreted as an expansion\n in increasing powers.\n\n\nExpansions of *κ*-adics\n-----------------------\n\nIn the original post, we ignored expansions other than the chaotic expansions of four.\nConsider that we can use *two different* expansions for three, since we are using a balanced alphabet:\n\n::: {#342e241c .cell layout-ncol='2' execution_count=5}\n``` {.haskell .cell-code code-fold=\"true\"}\nmarkdown \"Increment two:\"\nmarkdown \"Decrement three:\"\n\nadicThree = (1:) $ tail $ fst $ last adicTwoSteps\nprint adicThree\n\nadicThree' = ((-1):) $ tail $ fst $ last adicFourSteps\nprint adicThree'\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\nIncrement two:\n:::\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\nDecrement three:\n:::\n\n::: {.cell-output .cell-output-display}\n```\n[1,0,1,-1,1,-1,1,-1,1,-3,1]\n```\n:::\n\n::: {.cell-output .cell-output-display}\n```\n[-1,0,0,0,1,0,1,-1,1,-44,16]\n```\n:::\n:::\n\n\nFor convenience (and 
correctness), we'll retain the carry heads rather than truncating them.\n\nHow can we be sure that both of these are expansions of three?\nEasy. Just negate every term of one of them, then add them together to see if they cancel.\n\n::: {#777b162e .cell execution_count=6}\n``` {.haskell .cell-code}\nmaybeZero = take 10 $ iterate (carry2 2) $ zipWith (+) adicThree (map negate adicThree')\nputStrLn . unlines . map show $ maybeZero\n```\n\n::: {.cell-output .cell-output-display}\n```\n[2,0,1,-1,0,-1,0,0,0,41,-15]\n[0,-2,2,-1,0,-1,0,0,0,41,-15]\n[0,0,4,-2,0,-1,0,0,0,41,-15]\n[0,0,0,-6,2,-1,0,0,0,41,-15]\n[0,0,0,0,8,-4,0,0,0,41,-15]\n[0,0,0,0,0,-12,4,0,0,41,-15]\n[0,0,0,0,0,0,16,-6,0,41,-15]\n[0,0,0,0,0,0,0,-22,8,41,-15]\n[0,0,0,0,0,0,0,0,30,30,-15]\n[0,0,0,0,0,0,0,0,0,0,0]\n```\n:::\n:::\n\n\nAs you can see, eventually we completely clear the number and just get list of zeros.\n\nWhich of these expansions is more valid?\nI would argue that we should prefer expansions obtained by incrementing, since the natural numbers\n are built in the same way.\nThis is flimsy, though. By doing this, negative one will never appear in the one's place.\n\nIt's also worth pointing out that we only have this choice at odd numbers.\nAt even numbers, the one's place is always zero.\nIt's easy to see why this is the case -- if there's a \"2\" in the one's place, it can be carried,\n just like in binary.\n\n\nBalanced vs. Binary *κ*-adics\n-----------------------------\n\nA more unusual consequence of the carry is that despite the initial choice of alphabet,\n [negative numerals can be cleared](../#all-positive)\n from the expansion by using an extra-greedy borrow.\n\nThere is only one significant difference in how the two are implemented:\n the choice of integer division function.\nIn Haskell, there exists both `quotRem` and `divMod`, with the two disagreeing on negative remainders.\n\n::: {#369099de .cell layout-ncol='2' execution_count=7}\n``` {.haskell .cell-code code-fold=\"true\"}\nmarkdown \"`quotRem`\"\nmarkdown \"`divMod`\"\n\nprint $ quotRem (-27) 5\nprint $ divMod (-27) 5\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n`quotRem`\n:::\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n`divMod`\n:::\n\n::: {.cell-output .cell-output-display}\n```\n(-5,-2)\n```\n:::\n\n::: {.cell-output .cell-output-display}\n```\n(-6,3)\n```\n:::\n:::\n\n\nWe can factor our choice out of our two digit-wide carry function by passing it as an argument:\n\n::: {#c93283a5 .cell execution_count=8}\n``` {.haskell .cell-code}\ncarry2QR qr b = carry2QR' []\n where carry2QR' zs (x:y:z:xs)\n | q == 0 = carry2QR' (x:zs) (y:z:xs) -- try carrying at a higher place value\n | otherwise = foldl (flip (:)) ys zs -- carry here\n where ys = r : y-x+r : z+q : xs\n (q, r) = x `qr` b\n```\n:::\n\n\nNow, let's compare the iterates of the two options by applying them to \"4\":\n\n::: {#50e75b41 .cell layout-ncol='2' execution_count=9}\n``` {.haskell .cell-code code-fold=\"true\"}\ncendree4QuotRem10Steps = map (pairEval (sqrt 3 + 1)) $ expandSteps 10 (carry2QR quotRem 2) 4\ncendree4DivMod10Steps = map (pairEval (sqrt 3 + 1)) $ expandSteps 10 (carry2QR divMod 2) 4\n\nmarkdown \"`quotRem`\"\nmarkdown \"`divMod`\"\n\nputStrLn . unlines . map show $ cendree4QuotRem10Steps\nputStrLn . unlines . 
map show $ cendree4DivMod10Steps\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n`quotRem`\n:::\n\n::: {.cell-output .cell-output-display .cell-output-markdown}\n`divMod`\n:::\n\n::: {.cell-output .cell-output-display}\n```\n([4,0,0,0,0,0,0,0,0,0,0],4.0)\n([0,-4,2,0,0,0,0,0,0,0,0],3.999999999999999)\n([0,0,6,-2,0,0,0,0,0,0,0],4.000000000000001)\n([0,0,0,-8,3,0,0,0,0,0,0],4.000000000000003)\n([0,0,0,0,11,-4,0,0,0,0,0],4.000000000000021)\n([0,0,0,0,1,-14,5,0,0,0,0],3.9999999999998543)\n([0,0,0,0,1,0,19,-7,0,0,0],4.000000000000845)\n([0,0,0,0,1,0,1,-25,9,0,0],4.000000000000486)\n([0,0,0,0,1,0,1,-1,33,-12,0],3.9999999999982596)\n([0,0,0,0,1,0,1,-1,1,-44,16],3.999999999986544)\n```\n:::\n\n::: {.cell-output .cell-output-display}\n```\n([4,0,0,0,0,0,0,0,0,0,0],4.0)\n([0,-4,2,0,0,0,0,0,0,0,0],3.999999999999999)\n([0,0,6,-2,0,0,0,0,0,0,0],4.000000000000001)\n([0,0,0,-8,3,0,0,0,0,0,0],4.000000000000003)\n([0,0,0,0,11,-4,0,0,0,0,0],4.000000000000021)\n([0,0,0,0,1,-14,5,0,0,0,0],3.9999999999998543)\n([0,0,0,0,1,0,19,-7,0,0,0],4.000000000000845)\n([0,0,0,0,1,0,1,-25,9,0,0],4.000000000000486)\n([0,0,0,0,1,0,1,1,35,-13,0],4.000000000005557)\n([0,0,0,0,1,0,1,1,1,-47,17],3.9999999999669607)\n```\n:::\n:::\n\n\nFortunately, regardless of which function we pick, the evaluation roughly gives four at each step.\nNote that since `quotRem` allows negative remainders, implementing the carry with it causes\n negative numbers to show up in our expansions.\nConversely, negative numbers *cannot* show up if we use `divMod`.\n\n\n### Chaos before Four\n\nRecall again the series for two in the *κ*-adics:\n\n$$\n2 = ...1\\bar{1}1\\bar{1}1\\bar{1}100_{\\kappa}\n$$\n\nSince the `divMod` implementation clears negative numbers from expansions, we can try using it on this series.\nThe result is another chaotic series:\n\n::: {#d9f2437c .cell execution_count=10}\n``` {.haskell .cell-code}\ncendree2DivModCycleExpansion = take 11 $\n iterate (carry2QR divMod 2) $ take 15 $ 0:0:cycle [1,-1]\nputStrLn . unlines . map show $ cendree2DivModCycleExpansion\n```\n\n::: {.cell-output .cell-output-display}\n```\n[0,0,1,-1,1,-1,1,-1,1,-1,1,-1,1,-1,1]\n[0,0,1,1,3,-2,1,-1,1,-1,1,-1,1,-1,1]\n[0,0,1,1,1,-4,2,-1,1,-1,1,-1,1,-1,1]\n[0,0,1,1,1,0,6,-3,1,-1,1,-1,1,-1,1]\n[0,0,1,1,1,0,0,-9,4,-1,1,-1,1,-1,1]\n[0,0,1,1,1,0,0,1,14,-6,1,-1,1,-1,1]\n[0,0,1,1,1,0,0,1,0,-20,8,-1,1,-1,1]\n[0,0,1,1,1,0,0,1,0,0,28,-11,1,-1,1]\n[0,0,1,1,1,0,0,1,0,0,0,-39,15,-1,1]\n[0,0,1,1,1,0,0,1,0,0,0,1,55,-21,1]\n[0,0,1,1,1,0,0,1,0,0,0,1,1,-75,28]\n```\n:::\n:::\n\n\nNote that in this case, we're truncating the alternating series!\nThis means that naively evaluating the series as before will not give the correct value.\n\nTo check the validity of this series, we can check that we get the same series before the truncated elements\n by expanding 2 directly:\n\n::: {#87820447 .cell execution_count=11}\n``` {.haskell .cell-code}\ncendree2DivMod15Steps = map (pairEval (sqrt 3 + 1)) $\n expandSteps 15 (carry2QR divMod 2) 2\nputStrLn . unlines . 
map show $ cendree2DivMod15Steps\n```\n\n::: {.cell-output .cell-output-display}\n```\n([2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],2.0)\n([0,-2,1,0,0,0,0,0,0,0,0,0,0,0,0,0],1.9999999999999996)\n([0,0,3,-1,0,0,0,0,0,0,0,0,0,0,0,0],2.0000000000000004)\n([0,0,1,-3,1,0,0,0,0,0,0,0,0,0,0,0],1.9999999999999982)\n([0,0,1,1,5,-2,0,0,0,0,0,0,0,0,0,0],2.0000000000000115)\n([0,0,1,1,1,-6,2,0,0,0,0,0,0,0,0,0],1.9999999999999758)\n([0,0,1,1,1,0,8,-3,0,0,0,0,0,0,0,0],1.9999999999999394)\n([0,0,1,1,1,0,0,-11,4,0,0,0,0,0,0,0],1.9999999999995541)\n([0,0,1,1,1,0,0,1,16,-6,0,0,0,0,0,0],1.999999999999047)\n([0,0,1,1,1,0,0,1,0,-22,8,0,0,0,0,0],1.999999999993247)\n([0,0,1,1,1,0,0,1,0,0,30,-11,0,0,0,0],2.000000000015452)\n([0,0,1,1,1,0,0,1,0,0,0,-41,15,0,0,0],2.0000000000454636)\n([0,0,1,1,1,0,0,1,0,0,0,1,57,-21,0,0],1.9999999998345677)\n([0,0,1,1,1,0,0,1,0,0,0,1,1,-77,28,0],1.9999999961807242)\n([0,0,1,1,1,0,0,1,0,0,0,1,1,1,106,-39],1.999999997642363)\n```\n:::\n:::\n\n\nUp to the carry head, the series are the same.\nWe can do the same thing to get a series for negative one, a number which has a terminating expansion\n in the balanced alphabet.\n\n::: {#38586303 .cell execution_count=12}\n``` {.haskell .cell-code}\ncendreeNeg1DivMod15Steps = map (pairEval (sqrt 3 + 1)) $\n expandSteps 15 (carry2QR divMod 2) (-1)\nputStrLn . unlines . map show $ cendreeNeg1DivMod15Steps\n```\n\n::: {.cell-output .cell-output-display}\n```\n([-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],-1.0)\n([1,2,-1,0,0,0,0,0,0,0,0,0,0,0,0,0],-0.9999999999999996)\n([1,0,-3,1,0,0,0,0,0,0,0,0,0,0,0,0],-1.0000000000000004)\n([1,0,1,5,-2,0,0,0,0,0,0,0,0,0,0,0],-0.9999999999999958)\n([1,0,1,1,-6,2,0,0,0,0,0,0,0,0,0,0],-1.0000000000000089)\n([1,0,1,1,0,8,-3,0,0,0,0,0,0,0,0,0],-1.0000000000000222)\n([1,0,1,1,0,0,-11,4,0,0,0,0,0,0,0,0],-1.000000000000163)\n([1,0,1,1,0,0,1,16,-6,0,0,0,0,0,0,0],-1.0000000000003486)\n([1,0,1,1,0,0,1,0,-22,8,0,0,0,0,0,0],-1.0000000000024718)\n([1,0,1,1,0,0,1,0,0,30,-11,0,0,0,0,0],-0.9999999999943441)\n([1,0,1,1,0,0,1,0,0,0,-41,15,0,0,0,0],-0.9999999999833591)\n([1,0,1,1,0,0,1,0,0,0,1,57,-21,0,0,0],-1.0000000000605525)\n([1,0,1,1,0,0,1,0,0,0,1,1,-77,28,0,0],-1.000000001397952)\n([1,0,1,1,0,0,1,0,0,0,1,1,1,106,-39,0],-1.000000000862955)\n([1,0,1,1,0,0,1,0,0,0,1,1,1,0,-145,53],-1.0000000211727116)\n```\n:::\n:::\n\n\nThe most natural property of negative one should be that if we add one to it, we get zero.\nIf we take the last iterate of this, increment the zeroth place value, and apply the carry,\n we find that everything clears properly.\n\n::: {#94e3bbca .cell execution_count=13}\n``` {.haskell .cell-code}\ncendree0FromNeg1DivMod = (\\(x:xs) -> (x + 1):xs) $ fst $ last cendreeNeg1DivMod15Steps\ncendree0IncrementSteps = map (pairEval (sqrt 3 + 1)) $ take 15 $\n iterate (carry2QR divMod 2) cendree0FromNeg1DivMod\n\nputStrLn . unlines . 
map show $ cendree0IncrementSteps\n```\n\n::: {.cell-output .cell-output-display}\n```\n([2,0,1,1,0,0,1,0,0,0,1,1,1,0,-145,53],-2.117271158397216e-8)\n([0,-2,2,1,0,0,1,0,0,0,1,1,1,0,-145,53],-2.117271183049399e-8)\n([0,0,4,0,0,0,1,0,0,0,1,1,1,0,-145,53],-2.1172712567825478e-8)\n([0,0,0,-4,2,0,1,0,0,0,1,1,1,0,-145,53],-2.1172716608068815e-8)\n([0,0,0,0,6,-2,1,0,0,0,1,1,1,0,-145,53],-2.11727015296754e-8)\n([0,0,0,0,0,-8,4,0,0,0,1,1,1,0,-145,53],-2.117275780300572e-8)\n([0,0,0,0,0,0,12,-4,0,0,1,1,1,0,-145,53],-2.1172363115313245e-8)\n([0,0,0,0,0,0,0,-16,6,0,1,1,1,0,-145,53],-2.1172322503707736e-8)\n([0,0,0,0,0,0,0,0,22,-8,1,1,1,0,-145,53],-2.1172474068282878e-8)\n([0,0,0,0,0,0,0,0,0,-30,12,1,1,0,-145,53],-2.1179440228212383e-8)\n([0,0,0,0,0,0,0,0,0,0,42,-14,1,0,-145,53],-2.1235751278905916e-8)\n([0,0,0,0,0,0,0,0,0,0,0,-56,22,0,-145,53],-2.1138031916672377e-8)\n([0,0,0,0,0,0,0,0,0,0,0,0,78,-28,-145,53],-1.9659634780718795e-8)\n([0,0,0,0,0,0,0,0,0,0,0,0,0,-106,-106,53],-2.0141670404689492e-8)\n([0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],0.0)\n```\n:::\n:::\n\n\nNaturally, it should be possible to use the the expansions of negative one and two in tandem on any\n series in the balanced alphabet to convert it to the binary alphabet.\nActually demonstrating this and proving it is left as an exercise.\n\n\nAre these really -adic?\n-----------------------\n\nPerhaps it is still unconvincing that expanding the integers in this way gives something\n indeed related to *p*-adics.\nIn fact, since the expansions are in binary or (balanced) ternary, the integers should just\n be a subset of the 2-adics or 3-adics.\n\nStill, I wanted to see what these numbers actually \"look\" like, so I whipped up an interactive diagram.\nYou should definitely see [this page](/interactive/p-adics/) for more information, but\n the gist is that *p*-adics can be sent into the complex plane in a fractal-like way.\n\n\n```{ojs}\n// | echo: false\n\n/*\nFileAttachments:\n ./cendree_DivMod_count_1024_256_digits.csv: ./cendree_DivMod_count_1024_256_digits.csv\n ./cendree_QuotRem_count_1024_256_digits.csv: ./cendree_QuotRem_count_1024_256_digits.csv\n*/\n\n// Import expansions from file.\n// Odd numbers are injected by replacing the first entry of each row with \"1\"\nasIntegers = (x) => {\n let xs = x.split(\"\\n\").map((y) => y.split(\",\").map((z) => +z))\n return [...xs, ...(xs.map((ys) => ys.with(0, 1)))]\n};\n\nadicExpansionsDivMod = FileAttachment(\n \"./cendree_DivMod_count_1024_256_digits.csv\"\n).text().then(asIntegers);\nadicExpansionsQuotRem = FileAttachment(\n \"./cendree_QuotRem_count_1024_256_digits.csv\"\n).text().then(asIntegers);\n\nimport { expansions as oldExpansions } with { base as base } from \"/interactive/p-adics/showAdic.ojs\";\n\nexpansionsOrAdics = baseSelector == \"b-adic\"\n ? oldExpansions\n : baseSelector == \"κ-adic, balanced\"\n ? adicExpansionsQuotRem\n : baseSelector == \"κ-adic, binary\"\n ? 
adicExpansionsDivMod\n : d3.range(adicExpansionsQuotRem.length / 10).map(() => d3.range(15).map(() => +(Math.random() > 0.5)))\n\nimport { plot } with {\n expansionsOrAdics as expansions,\n embedBase as embedBase,\n geometric as geometric,\n} from \"/interactive/p-adics/showAdic.ojs\";\n\nviewof baseSelector = Inputs.radio([\n \"b-adic\",\n \"κ-adic, balanced\",\n \"κ-adic, binary\",\n \"Random Binary\",\n], {\n value: \"b-adic\",\n label: \"Expansions\",\n});\n\nviewof base = Inputs.range([2, 5], {\n value: 2,\n step: 1,\n label: \"Base of expansions (b)\",\n disabled: baseSelector != \"b-adic\",\n});\n\nviewof embedBase = Inputs.range([2, 5], {\n value: 2,\n step: 0.1,\n label: \"Embedding base (p)\",\n});\n\nviewof geometric = Inputs.range([0.005, 0.995], {\n value: 0.9,\n step: 0.005,\n label: \"Geometric ratio (c)\",\n});\n\nplot\n\n```\n\nFirst, notice that with $b = 2$ and $p = 2$, switching between the \"b-adic\" option\n and the \"*κ*-adic\" option appears cause some points to appear and disappear.\nIt is easiest to see this when $c \\approx 0.5$.\nThis corresponds with the intuition that these are subsets of the 2- and 3-adics.\n\nNext, notice that when plotting the *κ*-adics, there is some self-similarity different\n from the 2- and 3-adics.\nTo see this, try setting $c \\approx 0.75$.\nThere appear to be four clusters, with the topmost and rightmost appearing to be similar to one another.\nWithin these two clusters, the rightmost portion of them appears to be the same shape\n as the larger figure.\nIf you try switching between the *κ*-adic options, you can even see the smaller\n and larger shapes changing in the same way as one another.\n\nThis is actually great news -- if you switch between the *κ*-adics and the \"random binary\" option,\n you can see that the latter option tends to the same pattern as the 2-adics.\nThus, even if the expansions for the integers are individually chaotic, together they possess a\n much different structure than pure randomness.\n\nIf you prefer not to use JavaScript, I also prepared a [Python script](./kadic.py) using Matplotlib.[^1]\nHere are a couple of screenshots from the script, which demonstrates the self-similarity mentioned above.\n\n:::: {.row}\n::: {layout-ncol=\"2\"}\n![`quotRem`](./cendree_quotrem_fractal.png)\n\n![`divMod`](./cendree_divmod_fractal.png)\n:::\n\nClusters of *κ*-adics, with self-similar patterns boxed in red.\n::::\n\n[^1]: You will also need the [`divMod`](./cendree_DivMod_count_1024_256_digits.csv)\n and [`quotRem`](./cendree_QuotRem_count_1024_256_digits.csv) data files.\n\n", "supporting": [ - "index_files" + "index_files/figure-html" ], "filters": [], "includes": {} diff --git a/_freeze/posts/math/stereo/2/index/execute-results/html.json b/_freeze/posts/math/stereo/2/index/execute-results/html.json index c657e0f..9615f07 100644 --- a/_freeze/posts/math/stereo/2/index/execute-results/html.json +++ b/_freeze/posts/math/stereo/2/index/execute-results/html.json @@ -1,10 +1,10 @@ { - "hash": "e723e799442b3e52f9543809a686e1f9", + "hash": "a233a2c3e506dde0e740c7aaf2014a18", "result": { "engine": "jupyter", - "markdown": "---\ntitle: \"Further Notes on Algebraic Stereography\"\ndescription: |\n How do you rotate in 2D and 3D without standard trigonometry?\nformat:\n html:\n html-math-method: katex\njupyter: python3\ndate: \"2021-10-10\"\ndate-modified: \"2025-06-30\"\ncategories:\n - algebra\n - complex analysis\n - polar roses\n - generating functions\n---\n\n\n\n\n\nIn my previous post, I discussed the stereographic 
projection of a circle as it pertains\n to complex numbers, as well as its applications in 2D and 3D rotation.\nIn an effort to document more interesting facts about this mathematical object\n (of which scarce information is immediately available online),\n I will now elaborate on more of its properties.\n\n\nChebyshev Polynomials\n---------------------\n\n[Previously](/posts/chebyshev/1), I derived the\n [Chebyshev polynomials](https://en.wikipedia.org/wiki/Chebyshev_polynomials)\n with the archetypal complex exponential.\nThese polynomials express the sines and cosines of a multiple of an angle from\n the sine and cosine of the base angle.\nWhere $T_n(t)$ are Chebyshev polynomials of the first kind and $U_n(t)$ are those of the second kind,\n\n$$\n\\begin{gather*}\n \\cos(n \\theta) = T_n(\\cos(\\theta))\n \\\\\n \\sin(n \\theta) = U_{n - 1}(\\cos(\\theta)) \\sin(\\theta)\n\\end{gather*}\n$$\n\nThe complex exponential derivation begins by squaring and developing a second-order recurrence.\n\n$$\n\\begin{align*}\n (e^{i\\theta})^2 &= (\\cos + i\\sin)^2\n \\\\\n &= \\cos^2 + 2i\\cos \\cdot \\sin - \\sin^2 + (0 = \\cos^2 + \\sin^2 - 1)\n \\\\\n &= 2\\cos^2 + 2i\\cos \\cdot \\sin - 1\n \\\\\n &= 2\\cos \\cdot (\\cos + i\\sin) - 1\n \\\\\n &= 2\\cos(\\theta)e^{i\\theta} - 1\n \\\\\n (e^{i\\theta})^{n+2} &= 2\\cos(\\theta)(e^{i\\theta})^{n+1} - (e^{i\\theta})^n\n\\end{align*}\n$$\n\nThis recurrence relation can then be used to obtain the Chebyshev polynomials, and hence,\n the expressions using sine and cosine above.\nPresented this way with such a simple derivation, it appears as though these relationships\n are inherently trigonometric.\n\nHowever, these polynomials actually have *nothing* to do with sine and cosine on their own.\nFor one, [they appear in graph theory](/posts/chebyshev/2), and for two,\n the importance of the complex exponential is overstated.\n$e^{i\\theta}$ really just specifies a point on the complex unit circle.\nThis property is used on the second line to coax the equation into a quadratic in $e^{i\\theta}$.\nThis is also the *only* property upon which the recurrence depends; all else is algebraic manipulation.\n\n\n### Back to the Stereograph\n\nKnowing this, let's start over with the stereographic projection of the circle:\n\n$$\no_1(t) = {1 + it \\over 1 - it}\n = {1 - t^2 \\over 1 + t^2} + i {2t \\over 1 + t^2}\n = \\text{c}_1 + i\\text{s}_1\n$$\n\nThe subscript \"1\" is because as *t* ranges over $(-\\infty, \\infty)$, the function loops once\n around the unit circle.\nTaking this to higher powers keeps points on the circle since all points on the circle\n have a norm of 1.\nIt also makes more loops around the circle, which we can denote by larger subscripts:\n\n$$\n\\begin{align*}\n o_n &= (o_1)^n\n = \\left( {1 + it \\over 1 - it} \\right)^n\n \\\\\n \\text{c}_n + i\\text{s}_n\n &= (\\text{c}_1 + i\\text{s}_1)^n\n\\end{align*}\n$$\n\nThis mirrors raising the complex exponential to a power\n (which loops over the range $(-\\pi, \\pi)$ instead).\nThe final line is analogous to de Moivre's formula, but in a form where everything is\n a ratio of polynomials in *t*.\nThis means that the Chebyshev polynomials can be obtained directly from these rational expressions:\n\n$$\n\\begin{align*}\n o_2 = (o_1)^2 &= (\\text{c}_1 + i\\text{s}_1)^2\n \\\\\n &= \\text{c}_1^2 + 2i\\text{c}_1\\text{s}_1 - \\text{s}_1^2\n + (0 = \\text{c}_1^2 + \\text{s}_1^2 - 1)\n \\\\\n &= 2\\text{c}_1^2 + 2i\\text{c}_1\\text{s}_1 - 1\n \\\\\n &= 2\\text{c}_1(\\text{c}_1 + i\\text{s}_1) - 1\n 
\\\\\n &= 2\\text{c}_1 o_1 - 1\n \\\\\n o_2 \\cdot (o_1)^n &= 2\\text{c}_1 o_1 \\cdot (o_1)^n - (o_1)^n\n \\\\\n o_{n+2} &= 2\\text{c}_1 o_{n+1} - o_n\n\\end{align*}\n$$\n\nThis matches the earlier recurrence relation with the complex exponential and therefore\n the recurrence relation of the Chebyshev polynomials.\nIt also means that the rational functions obey the same relationship as sine and cosine:\n\n$$\n\\begin{matrix}\n \\begin{gather*}\n \\text{c}_n = T_n(\\text{c}_1)\n \\\\\n \\text{s}_n = U_{n-1}(\\text{c}_1) \\text{s}_1\n \\end{gather*}\n & \\text{where }\n \\text{c}_1 = {1 - t^2 \\over 1 + t^2}, &\n \\text{s}_1 = {2t \\over 1 + t^2}\n\\end{matrix}\n$$\n\nThus, the Chebyshev polynomials are tied to (coordinates on) circles,\n rather than explicitly to the trigonometric functions.\nIt is a bit strange that these polynomials are in terms of rational functions, but no stranger\n than them being in terms of *ir*rational functions like sine and cosine.\n\n\nCalculus\n--------\n\nSince these functions behave similarly to sine and cosine, one might wonder about\n the nature of these expressions in the context of calculus.\n\nFor comparison, the complex exponential (as it is a parallel construction) has a simple derivative[^1].\nSince the exponential function is its own derivative, the expression acquires\n an imaginary coefficient through the chain rule.\n\n[^1]: This is forgoing the fact that complex derivatives require more care than their real counterparts.\n It matters slightly less in this case since this function is complex-valued, but has a real parameter.\n\n$$\n\\begin{align*}\n e^{it} &= \\cos(t) + i\\sin(t)\n \\\\\n {d \\over dt} e^{it}\n &= {d \\over dt} \\cos(t) + {d \\over dt} i\\sin(t)\n \\\\\n i e^{it} &= -\\sin(t) + i\\cos(t)\n \\\\\n i[\\cos(t) + i\\sin(t)]\n &\\stackrel{\\checkmark}{=} -\\sin(t) + i\\cos(t)\n\\end{align*}\n$$\n\nMeanwhile, the complex stereograph has derivative\n\n$$\n\\begin{align*}\n {d \\over dt} o_1(t) &= {d \\over dt} {1 + it \\over 1 - it}\n = {i(1 - it) + i(1 + it) \\over (1 - it)^2}\n \\\\\n &= {2i \\over (1 - it)^2}\n = {2i(1 + it)^2 \\over (1 + t^2)^2}\n = {2i(1 - t^2 + 2it) \\over (1 + t^2)^2}\n \\\\\n &= {-4t \\over (1 + t^2)^2} + i {2(1 - t^2) \\over (1 + t^2)^2}\n \\\\\n &= {-2 \\over 1 + t^2}s_1 + i {2 \\over 1 + t^2}c_1\n \\\\\n &= -(1 + c_1)s_1 + i(1 + c_1)c_1\n \\\\\n &= i(1 + c_1)o_1\n\\end{align*}\n$$\n\nJust like the complex exponential, an imaginary coefficient falls out.\nHowever, the expression also accrues a $1 + c_1$ term, almost like an adjustment factor\n for its failure to be the complex exponential.\nSine and cosine obey a simpler relationship with respect to the derivative,\n and thus need no adjustment.\n\n\n### Complex Analysis\n\nSince $o_n$ is a curve which loops around the unit circle *n* times, it is well suited\n to showing a simple result from complex analysis.\nIntegrating along a contour which wraps around a sufficiently nice function's pole\n (i.e., where its magnitude grows without bound) yields a familiar value.\nThis is easiest to see with $f(z) = 1 / z$:\n\n$$\n\\oint_\\Gamma {1 \\over z} dz\n = \\int_a^b {\\gamma'(t) \\over \\gamma(t)} dt\n = 2\\pi i\n$$\n\nIn this example, Γ is a counterclockwise curve parametrized by γ which loops once around\n the pole at *z* = 0.\nMore loops will scale this by a factor according to the number of loops.\n\nNormally this equality is demonstrated with the complex exponential, but will $o_1$ work just as well?\nIf Γ is the unit circle, the integral is:
\n\n$$\n\\oint_\\Gamma {1 \\over z} dz\n = \\int_{-\\infty}^\\infty {o_1'(t) \\over o_1(t)} dt\n = \\int_{-\\infty}^\\infty i(1 + c_1(t)) dt\n = 2i\\int_{-\\infty}^\\infty {1 \\over 1 + t^2} dt\n$$\n\nIf one has studied their integral identities, the indefinite version of the final integral\n will be obvious as $\\arctan(t)$, which has horizontal asymptotes of $\\pi / 2$ and $-\\pi / 2$.\nTherefore, the value of the integral is indeed $2\\pi i$.\n\nIf there are *n* loops, then naturally there are *n* of these $2\\pi i$s.\nSince powers of *o* are more loops around the circle, the chain and power rules show:\n\n$$\n\\begin{gather*}\n {d \\over dt} (o_1)^n = n(o_1)^{n-1} {d \\over dt} o_1\n \\\\[14pt]\n \\oint_\\Gamma {1 \\over z} dz\n = \\int_{-\\infty}^\\infty {n o_1(t)^{n-1} o_1'(t) \\over o_1(t)^n} dt\n = n \\int_{-\\infty}^\\infty {o_1'(t) \\over o_1(t)} dt\n = 2 \\pi i n\n\\end{gather*}\n$$\n\nIt is certainly possible to perform these contour integrals along straight lines;\n in fact, integrating along lines from 1 to *i* to -1 to -*i* deals with a\n similar integral involving arctangent.\nHowever, the best one can do to construct more loops with lines is to count each line\n multiple times, which isn't extraordinarily convincing.\n\nPerhaps the use of $\\infty$ in the integral bounds is also unconvincing.\nThe integral can be shifted back into the realm of plausibility by considering simpler bounds on $o_2$:\n\n$$\n\\begin{align*}\n \\oint_\\Gamma {1 \\over z} dz\n &= \\int_{-1}^1 {2 o_1(t) o_1'(t) \\over o_1(t)^2} dt\n \\\\\n &= 2 \\int_{-1}^1 {o_1'(t) \\over o_1(t)} dt\n \\\\\n &= 2(2i\\arctan(1) - 2i\\arctan(-1))\n \\\\\n &= 2\\pi i\n\\end{align*}\n$$\n\nThis has an additional benefit: using the series form of $1 / (1 + t^2)$ and integrating,\n one obtains the series form of the arctangent.\nThis series converges for $-1 \\le t \\le 1$, which happens to match the bounds of integration.\nThe convergence of this series is fairly important, since it is tied to formulas for π,\n in particular [Leibniz's formula](https://en.wikipedia.org/wiki/Leibniz_formula_for_%CF%80).\n\nWere one to integrate with the complex exponential, we would instead use the bounds $(0, 2\\pi)$,\n since at this point a full loop has been made.\nBut think to yourself -- how do you know the period of the complex exponential?\nHow do you know that 2π radians is equivalent to 0 radians?\nThe result using stereography relies on neither of these prior results and is directly pinned\n to a formula for π instead of an apparent detour through the number *e*.\n\n\nPolar Curves\n------------\n\nPolar coordinates are useful for expressing curves for which the distance from the origin is\n a function of the angle with respect to the positive *x*-axis.\nThey can also be readily converted to parametric forms:\n\n$$\n\\begin{gather*}\n r(\\theta) &\\Longleftrightarrow&\n \\begin{matrix}\n x(\\theta) = r \\cos(\\theta) \\\\\n y(\\theta) = r \\sin(\\theta)\n \\end{matrix}\n\\end{gather*}\n$$\n\nPolar curves frequently loop in on themselves, and so it is necessary to choose appropriate bounds\n for θ (usually as multiples of π) when plotting.\nEvidently, this is due to the use of sine and cosine in the above parametrization.\nFortunately, $s_n$ and $c_n$ (as shown by the calculus above) have much simpler bounds.\nSo what happens when one substitutes the rational functions in place of the trig ones?\n\n\n### Polar Roses\n\n[Polar roses](https://en.wikipedia.org/wiki/Rose_(mathematics)) are beautiful shapes which have\n a simple form when 
expressed in polar coordinates.\n\n$$\nr(\\theta) = \\cos \\left( {p \\over q} \\cdot \\theta \\right)\n$$\n\nThe ratio $p/q$ in least terms uniquely determines the shape of the curve.\n\nIf you weren't reading this post, you might assume this curve is transcendental since it uses cosine,\n but you probably know better at this point.\nThe Chebyshev examples above demonstrate the resemblance between $c_n$ and $\\cos(n\\theta)$.\nThe subscript of $c$ is easiest to work with as an integer, so let $q = 1$.\n\n$$\nx(t) = c_p(t) c_1(t) \\qquad y(t) = c_p(t) s_1(t)\n$$\n\nwill plot a $p/1$ polar rose as t ranges over $(-\\infty, \\infty)$.\n\n\n\n::: {#fig-polar-roses-1}\n{{< video \"./polar_roses_1.mp4\" >}}\n\np/1 polar roses as rational curves.\nSince *t* never reaches infinity, a bite appears to be taken out of the graphs near (-1, 0).\"\n:::\n\n$q = 1$ happens to match the subscript *c* term of *x* and *s* term of *y*, so one might wonder\n whether the other polar curves can be obtained by allowing it to vary as well.\nAnd you'd be right.\n\n$$\nx(t) = c_p(t) c_q(t) \\qquad y(t) = c_p(t) s_q(t)\n$$\n\nwill plot a $p/q$ polar rose as t ranges over $(-\\infty, \\infty)$.\n\n\n\n::: {#fig-polar-roses-2}\n{{< video \"./polar_roses_2.mp4\" >}}\n\np/q polar roses as rational curves\n:::\n\nJust as with the prior calculus examples, doubling all subscripts of *c* and *s* will\n only require *t* to range over $(-1, 1)$, which removes the ugly bite mark.\nPerhaps it is also slightly less satisfying, since the fraction $p/q$ directly appears in the\n typical polar incarnation with cosine.\nOn the other hand, it exposes an important property of these curves: they are all rational.\n\nThis approach lends additional precision to a prospective pseudo-polar coordinate system.\nIn the next few examples, I will be using the following notation for compactness:\n\n$$\n \\begin{gather*}\n R_n(t) = f(t) &\\Longleftrightarrow&\n \\begin{matrix}\n x(t) = f(t) c_n(t) \\\\\n y(t) = f(t) s_n(t)\n \\end{matrix}\n\\end{gather*}\n$$\n\n\n### Conic Sections\n\nThe polar equation for a conic section (with a particular unit length occurring somewhere)\n in terms of its eccentricity $\\varepsilon$ is:\n\n$$\nr(\\theta) = {1 \\over 1 - \\varepsilon \\cos(\\theta)}\n$$\n\nCorrespondingly, the rational polar form can be expressed as\n\n$$\nR_1(t) = {1 \\over 1 - \\varepsilon c_1}\n$$\n\nSince polynomial arithmetic is easier to work with than trigonometric identities,\n it is a matter of pencil-and-paper algebra to recover the implicit form from a parametric one.\n\n\n#### Parabola ($|\\varepsilon| = 1$)\n\nThe conic section with the simplest implicit equation is the parabola.\nSince $c_n$ is a simple ratio of polynomials in *t*, it is much simpler to recover the implicit equation.\nFor $\\varepsilon = 1$,\n\n:::: {layout-ncol=\"2\"}\n\n::: {#39eea2a2 .cell execution_count=5}\n\n::: {.cell-output .cell-output-display}\n![$x = {c_1 \\over 1 - c_1} \\quad y = {s_1 \\over 1 - c_1}$](index_files/figure-html/cell-5-output-1.png){}\n:::\n:::\n\n\n::: {}\n$$\n\\begin{align*}\n 1 - c_1 &= 1 - {1 - t^2 \\over 1 + t^2}\n = {2 t^2 \\over 1 + t^2}\n \\\\\n y &= {s_1 \\over 1 - c_1}\n = {2t \\over 1 + t^2} {1 + t^2 \\over 2 t^2}\n = {1 \\over t}\n \\\\\n x &= {c_1 \\over 1 - c_1}\n = {1 - t^2 \\over 1 + t^2} \\cdot {1 + t^2 \\over 2 t^2}\n = {1 - t^2 \\over 2t^2}\n \\\\\n &= {1 \\over 2t^2} - {1 \\over 2}\n = {y^2 \\over 2} - {1 \\over 2}\n\\end{align*}\n$$\n:::\n::::\n\n*x* is a quadratic polynomial in *y*, so trivially the figure formed is 
a parabola.\nTechnically it is missing the point where $y = 0 ~ (t = \\infty)$, and this is not a circumstance\n where using a higher $c_n$ would help.\nIt is however, similar to the situation where we allow $o_1(\\infty) = -1$, and an argument\n can be made to waive away any concerns one might have.\n\n\n#### Ellipse ($|\\varepsilon| < 1$)\n\nEllipses are next.\nThe simplest fraction between zero and one is 1/2, so for $\\varepsilon = 1/2$,\n\n:::: {layout-ncol = \"2\"}\n\n::: {#f9484e49 .cell execution_count=6}\n\n::: {.cell-output .cell-output-display}\n![$x = {c_1 \\over 1 - c_1 / 2} \\quad y = {s_1 \\over 1 - c_1 / 2}$](index_files/figure-html/cell-6-output-1.png){}\n:::\n:::\n\n\n::: {}\n$$\n\\begin{align*}\n 1 - {1 \\over 2}c_1 &= 1 - {1 \\over 2} \\cdot {1 - t^2 \\over 1 + t^2}\n = {3 t^2 + 1 \\over 2 + 2t^2}\n \\\\\n y &= {s_1 \\over 1 - {1 \\over 2}c_1}\n = {4t \\over 3t^2 + 1}\n \\\\\n x &= {c_1 \\over 1 - {1 \\over 2}c_1}\n = {2 - 2t^2 \\over 3t^2 + 1}\n\\end{align*}\n$$\n:::\n::::\n\nThere isn't an obvious way to combine products of *x* and *y* into a single equation.\nThe general form of a conic section is $Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0$, so\n we know that the implicit equation for the curve almost certainly involves $x^2$ and $y^2$.\n\n$$\nx^2 = {4 - 8t^2 + 4t^4 \\over (3t^2 + 1)^2} \\qquad\n y^2 = {16t^2 \\over (3t^2 + 1)^2}\n$$\n\nSquaring produces some $t^4$ terms which cannot exist outside of these terms and *xy*.\nA linear combination of $x^2$ and $y^2$ never includes any cubic terms in the numerator\n which would appear in *xy*, so $B = 0$.\nSince all remaining terms are linear in *x* and *y*, their denominator must appear as a factor\n in the numerator of $Ax^2 + Cy^2$, whatever *A* and *C* are.\n\nSince the coefficient of $t^4$ in $x^2$ is 4, *A* must be multiple of 3.\nThrough trial and error, $A = 3, C = 4$ gives:\n\n$$\n\\begin{align*}\n 3x^2 + 4y^2\n &= {12 - 24t^2 + 12t^4 + 64t^2 \\over (3t^2 + 1)^2}\n \\\\\n &= {12 - 40t^2 + 12t^4 \\over (3t^2 + 1)^2}\n \\\\\n &= {(4t^2 + 12) (3t^2 + 1) \\over (3t^2 + 1)^2}\n \\\\\n &= {4t^2 + 12 \\over 3t^2 + 1}\n\\end{align*}\n$$\n\nSince the numerator of *y* has a *t*, this is clearly some combination of *x* and a constant.\nBy the previous line of thought, the constant term must be a multiple of 4, and picking the smallest\n option finally results in the implicit form:\n\n$$\n\\begin{align*}\n 4 &= {4(3t^2 + 1) \\over 3t^2 + 1}\n = {12t^2 + 4 \\over 3t^2 + 1}\n \\\\\n {4t^2 + 12 \\over 3t^2 + 1} - 4\n &= {8 - 8t^2 \\over 3t^2 + 1} = 4x\n \\\\[14pt]\n 3x^2 + 4y^2 &= 4x + 4\n \\\\\n 3x^2 + 4y^2 - 4x - 4 &= 0\n\\end{align*}\n$$\n\nNotably, the coefficients of *x* and *y* are 3 and 4.\nSimultaneously, $o_1(\\varepsilon) = o_1(1/2) = {3 \\over 5} + i{4 \\over 5}$.\nThis binds together three concepts: the simplest case of the Pythagorean theorem,\n the 3-4-5 right triangle; the coefficients of the implicit form; and the role of eccentricity\n with respect to stereography.\n\n\n#### Hyperbola ($|\\varepsilon| > 1$)\n\nAs evidenced by the bound on the eccentricity above, hyperbolae are in some way the inverses of ellipses.\nSince $o_1(2)$ is a reflection of $o_1(1/2)$, you might think the implicit equation for\n $\\varepsilon = 2$ to be the same, but with a flipped sign or two.\nUnfortunately, you'd be wrong.\n\n:::: {layout-ncol=\"2\"}\n\n::: {#226c9935 .cell execution_count=7}\n\n::: {.cell-output .cell-output-display}\n![$x = {c_1 \\over 1 - 2c_1} \\quad y = {s_1 \\over 1 - 
2c_1}$](index_files/figure-html/cell-7-output-1.png){}\n:::\n:::\n\n\n::: {}\n$$\n\\begin{gather*}\n \\begin{align*}\n x &= {c_1 \\over 1 - 2c_1} = {1 - t^2 \\over 3t^2 - 1}\n \\\\\n y &= {s_1 \\over 1 - 2c_1} = {2t \\over 3t^2 - 1}\n \\\\[14pt]\n 3x^2 - y^2 &= {3 - 6t^2 + 3t^4 - 4t^2 \\over 3t^2 - 1}\n \\\\\n &= {(t^2 - 3)(3t^2 - 1) \\over (3t^2 - 1)^2 }\n \\\\\n &= {t^2 - 3 \\over 3t^2 - 1 }\n = ... = -4x - 1\n \\end{align*}\n \\\\[14pt]\n 3x^2 - y^2 + 4x + 1 = 0\n\\end{gather*}\n$$\n:::\n::::\n\nAt the very least, the occurrences of 1 in the place of 4 have a simple explanation: 1 = 4 - 3.\n\n\n### Archimedean Spiral\n\nArguably the simplest (non-circular) polar curve is $r(\\theta) = \\theta$, the unit\n [Archimedean spiral](https://en.wikipedia.org/wiki/Archimedean_spiral).\nSince the curve is defined by a constant turning, this is a natural application of the properties\n of sine and cosine.\nThe closest equivalent in rational polar coordinates is $R_1(t) = t$.\nBut this can be converted to an implicit form:\n\n$$\n\\begin{gather*}\n x = tc_1 \\qquad y = ts_1\n \\\\[14pt]\n x^2 + y^2 = t^2(c_1^2 + s_1^2) = t^2\n \\\\\n y = {2t^2 \\over 1 + t^2} = {2(x^2 + y^2) \\over 1 + (x^2 + y^2)}\n \\\\[14pt]\n (1 + x^2 + y^2)y = 2(x^2 + y^2)\n\\end{gather*}\n$$\n\nThe curve produced by this equation is a\n [right strophoid](https://mathworld.wolfram.com/RightStrophoid.html)\n with a node at (0, 1) and asymptote $y = 2$.\nThis form suggests something interesting about this curve: it approximates the Archimedean spiral\n (specifically the one with polar equation $r(\\theta) = \\theta/2$).\nIndeed, the sequence of curves with parametrization $R_n(t) = 2nt$ approximate the (unit) spiral\n for larger *n*, as can be seen in the following video.\n\n\n\n::: {#fig-approx-archimedes}\n{{< video ./approximate_archimedes.mp4 >}}\n\nApproximations to the Archimedean spiral\n:::\n\n\nSince R necessarily defines a rational curve, the curves will never be equal,\n just as any stretching of $c_n$ will never exactly become cosine.\n\n\nClosing\n-------\n\nSine, cosine, and the exponential function, are useful in a calculus setting precisely\n because of their constant \"velocity\" around the circle.\nAlso, nearly every modern scientific calculator in the world features buttons\n for trigonometric functions, so there seems to be no reason *not* to use them.\n\nWe can however be misled by their apparent omnipresence.\nStereographic projection has been around for *millennia*, and not every formula needs to be rewritten\n in its language.\nFor example (and as previously mentioned), defining the Chebyshev polynomials really only requires\n understanding the multiplication of two complex numbers whose norm cannot grow,\n not trigonometry and dividing angles.\nMany other instances of sine and cosine merely rely on a number (or ratio) of loops around a circle.\nWhen velocity does not factor, it will obviously do to \"stay rational\".\n\nOne of my favorite things to plot as a kid were polar roses, so I was somewhat intrigued\n to see that they are, in fact, rational curves.\nOn the other hand, their rationality follows immediately from the rationality of the circle\n (which itself follows from the existence of Pythagorean triples).\nIf I were more experienced with manipulating Chebyshev polynomials or willing to set up a\n linear system in (way too) many terms, I might have considered attempting to find\n an implicit form for them as well.\n\nDiagrams created with Sympy and Matplotlib.\n\n", + "markdown": "---\ntitle: 
\"Further Notes on Algebraic Stereography\"\ndescription: |\n How do you rotate in 2D and 3D without standard trigonometry?\nformat:\n html:\n html-math-method: katex\njupyter: python3\ndate: \"2021-10-10\"\ndate-modified: \"2025-06-30\"\ncategories:\n - algebra\n - complex analysis\n - polar roses\n - generating functions\n---\n\n\n\n\n\nIn my previous post, I discussed the stereographic projection of a circle as it pertains\n to complex numbers, as well as its applications in 2D and 3D rotation.\nIn an effort to document more interesting facts about this mathematical object\n (of which scarce information is immediately available online),\n I will now elaborate on more of its properties.\n\n\nChebyshev Polynomials\n---------------------\n\n[Previously](/posts/math/chebyshev/1), I derived the\n [Chebyshev polynomials](https://en.wikipedia.org/wiki/Chebyshev_polynomials)\n with the archetypal complex exponential.\nThese polynomials express the sines and cosines of a multiple of an angle from\n the sine and cosine of the base angle.\nWhere $T_n(t)$ are Chebyshev polynomials of the first kind and $U_n(t)$ are those of the second kind,\n\n$$\n\\begin{gather*}\n \\cos(n \\theta) = T_n(\\cos(\\theta))\n \\\\\n \\sin(n \\theta) = U_{n - 1}(\\cos(\\theta)) \\sin(\\theta)\n\\end{gather*}\n$$\n\nThe complex exponential derivation begins by squaring and developing a second-order recurrence.\n\n$$\n\\begin{align*}\n (e^{i\\theta})^2 &= (\\cos + i\\sin)^2\n \\\\\n &= \\cos^2 + 2i\\cos \\cdot \\sin - \\sin^2 + (0 = \\cos^2 + \\sin^2 - 1)\n \\\\\n &= 2\\cos^2 + 2i\\cos \\cdot \\sin - 1\n \\\\\n &= 2\\cos \\cdot (\\cos + i\\sin) - 1\n \\\\\n &= 2\\cos(\\theta)e^{i\\theta} - 1\n \\\\\n (e^{i\\theta})^{n+2} &= 2\\cos(\\theta)(e^{i\\theta})^{n+1} - (e^{i\\theta})^n\n\\end{align*}\n$$\n\nThis recurrence relation can then be used to obtain the Chebyshev polynomials, and hence,\n the expressions using sine and cosine above.\nPresented this way with such a simple derivation, it appears as though these relationships\n are inherently trigonometric.\n\nHowever, these polynomials actually have *nothing* to do with sine and cosine on their own.\nFor one, [they appear in graph theory](/posts/math/chebyshev/2), and for two,\n the importance of the complex exponential is overstated.\n$e^{i\\theta}$ really just specifies a point on the complex unit circle.\nThis property is used on the second line to coax the equation into a quadratic in $e^{i\\theta}$.\nThis is also the *only* property upon which the recurrence depends; all else is algebraic manipulation.\n\n\n### Back to the Stereograph\n\nKnowing this, let's start over with the stereographic projection of the circle:\n\n$$\no_1(t) = {1 + it \\over 1 - it}\n = {1 - t^2 \\over 1 + t^2} + i {2t \\over 1 + t^2}\n = \\text{c}_1 + i\\text{s}_1\n$$\n\nThe subscript \"1\" is because as *t* ranges over $(-\\infty, \\infty)$, the function loops once\n around the unit circle.\nTaking this to higher powers keeps points on the circle since all points on the circle\n have a norm of 1.\nIt also makes more loops around the circle, which we can denote by larger subscripts:\n\n$$\n\\begin{align*}\n o_n &= (o_1)^n\n = \\left( {1 + it \\over 1 - it} \\right)^n\n \\\\\n \\text{c}_n + i\\text{s}_n\n &= (\\text{c}_1 + i\\text{s}_1)^n\n\\end{align*}\n$$\n\nThis mirrors raising the complex exponential to a power\n (which loops over the range $(-\\pi, \\pi)$ instead).\nThe final line is analogous to de Moivre's formula, but in a form where everything is\n a ratio of polynomials in *t*.\nThis 
means that the Chebyshev polynomials can be obtained directly from these rational expressions:\n\n$$\n\\begin{align*}\n o_2 = (o_1)^2 &= (\\text{c}_1 + i\\text{s}_1)^2\n \\\\\n &= \\text{c}_1^2 + 2i\\text{c}_1\\text{s}_1 - \\text{s}_1^2\n + (0 = \\text{c}_1^2 + \\text{s}_1^2 - 1)\n \\\\\n &= 2\\text{c}_1^2 + 2i\\text{c}_1\\text{s}_1 - 1\n \\\\\n &= 2\\text{c}_1(\\text{c}_1 + i\\text{s}_1) - 1\n \\\\\n &= 2\\text{c}_1 o_1 - 1\n \\\\\n o_2 \\cdot (o_1)^n &= 2\\text{c}_1 o_1 \\cdot (o_1)^n - (o_1)^n\n \\\\\n o_{n+2} &= 2\\text{c}_1 o_{n+1} - o_n\n\\end{align*}\n$$\n\nThis matches the earlier recurrence relation with the complex exponential and therefore\n the recurrence relation of the Chebyshev polynomials.\nIt also means that the the rational functions obey the same relationship as sine and cosine:\n\n$$\n\\begin{matrix}\n \\begin{gather*}\n \\text{c}_n = T_n(\\text{c}_1)\n \\\\\n \\text{s}_n = U_{n-1}(\\text{c}_1) \\text{s}_1\n \\end{gather*}\n & \\text{where }\n \\text{c}_1 = {1 - t^2 \\over 1 + t^2}, &\n \\text{s}_1 = {2t \\over 1 + t^2}\n\\end{matrix}\n$$\n\nThus, the Chebyshev polynomials are tied to (coordinates on) circles,\n rather than explicitly to the trigonometric functions.\nIt is a bit strange that these polynomials are in terms of rational functions, but no stranger\n than them being in terms of *ir*rational functions like sine and cosine.\n\n\nCalculus\n--------\n\nSince these functions behave similarly to sine and cosine, one might wonder about\n the nature of these expressions in the context of calculus.\n\nFor comparison, the complex exponential (as it is a parallel construction) has a simple derivative[^1].\nSince the exponential function is its own derivative, the expression acquires\n an imaginary coefficient through the chain rule.\n\n[^1]: This is forgoing the fact that complex derivatives require more care than their real counterparts.\n It matters slightly less in this case since this function is complex-valued, but has a real parameter.\n\n$$\n\\begin{align*}\n e^{it} &= \\cos(t) + i\\sin(t)\n \\\\\n {d \\over dt} e^{it}\n &= {d \\over dt} \\cos(t) + {d \\over dt} i\\sin(t)\n \\\\\n i e^{it} &= -\\sin(t) + i\\cos(t)\n \\\\\n i[\\cos(t) + i\\sin(t)]\n &\\stackrel{\\checkmark}{=} -\\sin(t) + i\\cos(t)\n\\end{align*}\n$$\n\nMeanwhile, the complex stereograph has derivative\n\n$$\n\\begin{align*}\n {d \\over dt} o_1(t) &= {d \\over dt} {1 + it \\over 1 - it}\n = {i(1 - it) + i(1 + it) \\over (1 - it)^2}\n \\\\\n &= {2i \\over (1 - it)^2}\n = {2i(1 + it)^2 \\over (1 + t^2)^2}\n = {2i(1 - t^2 + 2it) \\over (1 + t^2)^2}\n \\\\\n &= {-4t \\over (1 + t^2)^2} + i {2(1 - t^2) \\over (1 + t^2)^2}\n \\\\\n &= {-2 \\over 1 + t^2}s_1 + i {2 \\over 1 + t^2}c_1\n \\\\\n &= -(1 + c_1)s_1 + i(1 + c_1)c_1\n \\\\\n &= i(1 + c_1)o_1\n\\end{align*}\n$$\n\nJust like the complex exponential, an imaginary coefficient falls out.\nHowever, the expression also accrues a $1 + c_1$ term, almost like an adjustment factor\n for its failure to be the complex exponential.\nSine and cosine obey a simpler relationship with respect to the derivative,\n and thus need no adjustment.\n\n\n### Complex Analysis\n\nSince $o_n$ is a curve which loops around the unit circle *n* times, that possibly suits it\n to showing a simple result from complex analysis.\nIntegrating along a contour which wraps around a sufficiently nice function's pole\n (i.e., where its magnitude grows without bound) yields a familiar value.\nThis is easiest to see with $f(z) = 1 / z$:\n\n$$\n\\oint_\\Gamma {1 \\over z} dz\n = \\int_a^b 
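Since everything above is a rational function of *t*, these identities are easy to machine-check. Here is a minimal Sympy sketch (the variable names and the loop bound are my own illustrative choices) confirming that the real and imaginary parts of $(c_1 + is_1)^n$ are $T_n(c_1)$ and $U_{n-1}(c_1) s_1$, and that $o_1' = i(1 + c_1)o_1$:

```python
import sympy as sp

t = sp.symbols("t", real=True)
c1 = (1 - t**2) / (1 + t**2)
s1 = 2*t / (1 + t**2)
o1 = (1 + sp.I*t) / (1 - sp.I*t)

for n in range(1, 6):
    # c_n and s_n, read off as the real and imaginary parts of (c_1 + i s_1)^n
    cn, sn = sp.expand((c1 + sp.I*s1)**n).as_real_imag()
    # both differences should simplify to 0
    print(n, sp.simplify(cn - sp.chebyshevt(n, c1)),
             sp.simplify(sn - sp.chebyshevu(n - 1, c1) * s1))

# derivative identity: o_1'(t) = i (1 + c_1(t)) o_1(t)
print(sp.simplify(sp.diff(o1, t) - sp.I*(1 + c1)*o1))  # 0
```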
{\\gamma'(t) \\over \\gamma(t)} dt\n = 2\\pi i\n$$\n\nIn this example, Γ is a counterclockwise curve parametrized by γ which loops once around\n the pole at *z* = 0.\nMore loops will scale this by a factor according to the number of loops.\n\nNormally this equality is demonstrated with the complex exponential, but will $o_1$ work just as well?\nIf Γ is the unit circle, the integral is:\n\n$$\n\\oint_\\Gamma {1 \\over z} dz\n = \\int_{-\\infty}^\\infty {o_1'(t) \\over o_1(t)} dt\n = \\int_{-\\infty}^\\infty i(1 + c_1(t)) dt\n = 2i\\int_{-\\infty}^\\infty {1 \\over 1 + t^2} dt\n$$\n\nIf one has studied their integral identities, the indefinite version of the final integral\n will be obvious as $\\arctan(t)$, which has horizontal asymptotes of $\\pi / 2$ and $-\\pi / 2$.\nTherefore, the value of the integral is indeed $2\\pi i$.\n\nIf there are *n* loops, then naturally there are *n* of these $2\\pi i$s.\nSince powers of *o* are more loops around the circle, the chain and power rules show:\n\n$$\n\\begin{gather*}\n {d \\over dt} (o_1)^n = n(o_1)^{n-1} {d \\over dt} o_1\n \\\\[14pt]\n \\oint_\\Gamma {1 \\over z} dz\n = \\int_{-\\infty}^\\infty {n o_1(t)^{n-1} o_1'(t) \\over o_1(t)^n} dt\n = n \\int_{-\\infty}^\\infty {o_1'(t) \\over o_1(t)} dt\n = 2 \\pi i n\n\\end{gather*}\n$$\n\nIt is certainly possible to perform these contour integrals along straight lines;\n in fact, integrating along lines from 1 to *i* to -1 to -*i* deals with a\n similar integral involving arctangent.\nHowever, the best one can do to construct more loops with lines is to count each line\n multiple times, which isn't extraordinarily convincing.\n\nPerhaps the use of $\\infty$ in the integral bounds is also unconvincing.\nThe integral can be shifted back into the realm of plausibility by considering simpler bounds on $o_2$:\n\n$$\n\\begin{align*}\n \\oint_\\Gamma {1 \\over z} dz\n &= \\int_{-1}^1 {2 o_1(t) o_1'(t) \\over o_1(t)^2} dt\n \\\\\n &= 2 \\int_{-1}^1 {o_1'(t) \\over o_1(t)} dt\n \\\\\n &= 2(2i\\arctan(1) - 2i\\arctan(-1))\n \\\\\n &= 2\\pi i\n\\end{align*}\n$$\n\nThis has an additional benefit: using the series form of $1 / (1 + t^2)$ and integrating,\n one obtains the series form of the arctangent.\nThis series converges for $-1 \\le t \\le 1$, which happens to match the bounds of integration.\nThe convergence of this series is fairly important, since it is tied to formulas for π,\n in particular [Leibniz's formula](https://en.wikipedia.org/wiki/Leibniz_formula_for_%CF%80).\n\nWere one to integrate with the complex exponential, we would instead use the bounds $(0, 2\\pi)$,\n since at this point a full loop has been made.\nBut think to yourself -- how do you know the period of the complex exponential?\nHow do you know that 2π radians is equivalent to 0 radians?\nThe result using stereography relies on neither of these prior results and is directly pinned\n to a formula for π instead an apparent detour through the number *e*.\n\n\nPolar Curves\n------------\n\nPolar coordinates are useful for expressing for which the distance from the origin is\n a function of the angle with respect to the positive *x*-axis.\nThey can also be readily converted to parametric forms:\n\n$$\n\\begin{gather*}\n r(\\theta) &\\Longleftrightarrow&\n \\begin{matrix}\n x(\\theta) = r \\cos(\\theta) \\\\\n y(\\theta) = r \\sin(\\theta)\n \\end{matrix}\n\\end{gather*}\n$$\n\nPolar curves frequently loop in on themselves, and so it is necessary to choose appropriate bounds\n for θ (usually as multiples of π) when plotting.\nEvidently, this 
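The contour-integral values above can also be reproduced directly in Sympy. This is only a sanity check of the integrals already computed by hand (the choice n = 3 is an arbitrary example):

```python
import sympy as sp

t = sp.symbols("t", real=True)
o1 = (1 + sp.I*t) / (1 - sp.I*t)

# one loop: the integrand reduces to 2i / (1 + t^2)
print(sp.integrate(sp.simplify(sp.diff(o1, t) / o1), (t, -sp.oo, sp.oo)))        # 2*I*pi

# n loops via o_n = (o_1)^n, here n = 3
print(sp.integrate(sp.simplify(sp.diff(o1**3, t) / o1**3), (t, -sp.oo, sp.oo)))  # 6*I*pi

# two loops with finite bounds, using o_2 on (-1, 1)
print(sp.integrate(sp.simplify(sp.diff(o1**2, t) / o1**2), (t, -1, 1)))          # 2*I*pi
```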
is due to the use of sine and cosine in the above parametrization.\nFortunately, $s_n$ and $c_n$ (as shown by the calculus above) have much simpler bounds.\nSo what happens when one substitutes the rational functions in place of the trig ones?\n\n\n### Polar Roses\n\n[Polar roses](https://en.wikipedia.org/wiki/Rose_(mathematics)) are beautiful shapes which have\n a simple form when expressed in polar coordinates.\n\n$$\nr(\\theta) = \\cos \\left( {p \\over q} \\cdot \\theta \\right)\n$$\n\nThe ratio $p/q$ in least terms uniquely determines the shape of the curve.\n\nIf you weren't reading this post, you might assume this curve is transcendental since it uses cosine,\n but you probably know better at this point.\nThe Chebyshev examples above demonstrate the resemblance between $c_n$ and $\\cos(n\\theta)$.\nThe subscript of $c$ is easiest to work with as an integer, so let $q = 1$.\n\n$$\nx(t) = c_p(t) c_1(t) \\qquad y(t) = c_p(t) s_1(t)\n$$\n\nwill plot a $p/1$ polar rose as t ranges over $(-\\infty, \\infty)$.\n\n\n\n::: {#fig-polar-roses-1}\n{{< video \"./polar_roses_1.mp4\" >}}\n\np/1 polar roses as rational curves.\nSince *t* never reaches infinity, a bite appears to be taken out of the graphs near (-1, 0).\"\n:::\n\n$q = 1$ happens to match the subscript *c* term of *x* and *s* term of *y*, so one might wonder\n whether the other polar curves can be obtained by allowing it to vary as well.\nAnd you'd be right.\n\n$$\nx(t) = c_p(t) c_q(t) \\qquad y(t) = c_p(t) s_q(t)\n$$\n\nwill plot a $p/q$ polar rose as t ranges over $(-\\infty, \\infty)$.\n\n\n\n::: {#fig-polar-roses-2}\n{{< video \"./polar_roses_2.mp4\" >}}\n\np/q polar roses as rational curves\n:::\n\nJust as with the prior calculus examples, doubling all subscripts of *c* and *s* will\n only require *t* to range over $(-1, 1)$, which removes the ugly bite mark.\nPerhaps it is also slightly less satisfying, since the fraction $p/q$ directly appears in the\n typical polar incarnation with cosine.\nOn the other hand, it exposes an important property of these curves: they are all rational.\n\nThis approach lends additional precision to a prospective pseudo-polar coordinate system.\nIn the next few examples, I will be using the following notation for compactness:\n\n$$\n \\begin{gather*}\n R_n(t) = f(t) &\\Longleftrightarrow&\n \\begin{matrix}\n x(t) = f(t) c_n(t) \\\\\n y(t) = f(t) s_n(t)\n \\end{matrix}\n\\end{gather*}\n$$\n\n\n### Conic Sections\n\nThe polar equation for a conic section (with a particular unit length occurring somewhere)\n in terms of its eccentricity $\\varepsilon$ is:\n\n$$\nr(\\theta) = {1 \\over 1 - \\varepsilon \\cos(\\theta)}\n$$\n\nCorrespondingly, the rational polar form can be expressed as\n\n$$\nR_1(t) = {1 \\over 1 - \\varepsilon c_1}\n$$\n\nSince polynomial arithmetic is easier to work with than trigonometric identities,\n it is a matter of pencil-and-paper algebra to recover the implicit form from a parametric one.\n\n\n#### Parabola ($|\\varepsilon| = 1$)\n\nThe conic section with the simplest implicit equation is the parabola.\nSince $c_n$ is a simple ratio of polynomials in *t*, it is much simpler to recover the implicit equation.\nFor $\\varepsilon = 1$,\n\n:::: {layout-ncol=\"2\"}\n\n::: {#3902cb4b .cell execution_count=5}\n\n::: {.cell-output .cell-output-display}\n![$x = {c_1 \\over 1 - c_1} \\quad y = {s_1 \\over 1 - c_1}$](index_files/figure-html/cell-5-output-1.png){}\n:::\n:::\n\n\n::: {}\n$$\n\\begin{align*}\n 1 - c_1 &= 1 - {1 - t^2 \\over 1 + t^2}\n = {2 t^2 \\over 1 + t^2}\n \\\\\n y &= 
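For completeness, here is one way to draw such a rose from the rational parametrization with Numpy and Matplotlib. The helper names, the 3/2 example, and the finite *t* range are all illustrative choices (the finite range is also why the small bite near (-1, 0) remains):

```python
import numpy as np
import matplotlib.pyplot as plt

def o(n, t):
    """n loops around the unit circle: o_n(t) = ((1 + it)/(1 - it))**n."""
    return ((1 + 1j*t) / (1 - 1j*t))**n

def rose(p, q, t):
    """p/q polar rose: x = c_p * c_q, y = c_p * s_q."""
    cp = o(p, t).real
    oq = o(q, t)
    return cp * oq.real, cp * oq.imag

t = np.linspace(-40, 40, 20001)  # finite stand-in for (-inf, inf)
x, y = rose(3, 2, t)
plt.plot(x, y)
plt.gca().set_aspect("equal")
plt.show()
```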
{s_1 \\over 1 - c_1}\n = {2t \\over 1 + t^2} {1 + t^2 \\over 2 t^2}\n = {1 \\over t}\n \\\\\n x &= {c_1 \\over 1 - c_1}\n = {1 - t^2 \\over 1 + t^2} \\cdot {1 + t^2 \\over 2 t^2}\n = {1 - t^2 \\over 2t^2}\n \\\\\n &= {1 \\over 2t^2} - {1 \\over 2}\n = {y^2 \\over 2} - {1 \\over 2}\n\\end{align*}\n$$\n:::\n::::\n\n*x* is a quadratic polynomial in *y*, so trivially the figure formed is a parabola.\nTechnically it is missing the point where $y = 0 ~ (t = \\infty)$, and this is not a circumstance\n where using a higher $c_n$ would help.\nIt is however, similar to the situation where we allow $o_1(\\infty) = -1$, and an argument\n can be made to waive away any concerns one might have.\n\n\n#### Ellipse ($|\\varepsilon| < 1$)\n\nEllipses are next.\nThe simplest fraction between zero and one is 1/2, so for $\\varepsilon = 1/2$,\n\n:::: {layout-ncol = \"2\"}\n\n::: {#ce1bccb3 .cell execution_count=6}\n\n::: {.cell-output .cell-output-display}\n![$x = {c_1 \\over 1 - c_1 / 2} \\quad y = {s_1 \\over 1 - c_1 / 2}$](index_files/figure-html/cell-6-output-1.png){}\n:::\n:::\n\n\n::: {}\n$$\n\\begin{align*}\n 1 - {1 \\over 2}c_1 &= 1 - {1 \\over 2} \\cdot {1 - t^2 \\over 1 + t^2}\n = {3 t^2 + 1 \\over 2 + 2t^2}\n \\\\\n y &= {s_1 \\over 1 - {1 \\over 2}c_1}\n = {4t \\over 3t^2 + 1}\n \\\\\n x &= {c_1 \\over 1 - {1 \\over 2}c_1}\n = {2 - 2t^2 \\over 3t^2 + 1}\n\\end{align*}\n$$\n:::\n::::\n\nThere isn't an obvious way to combine products of *x* and *y* into a single equation.\nThe general form of a conic section is $Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0$, so\n we know that the implicit equation for the curve almost certainly involves $x^2$ and $y^2$.\n\n$$\nx^2 = {4 - 8t^2 + 4t^4 \\over (3t^2 + 1)^2} \\qquad\n y^2 = {16t^2 \\over (3t^2 + 1)^2}\n$$\n\nSquaring produces some $t^4$ terms which cannot exist outside of these terms and *xy*.\nA linear combination of $x^2$ and $y^2$ never includes any cubic terms in the numerator\n which would appear in *xy*, so $B = 0$.\nSince all remaining terms are linear in *x* and *y*, their denominator must appear as a factor\n in the numerator of $Ax^2 + Cy^2$, whatever *A* and *C* are.\n\nSince the coefficient of $t^4$ in $x^2$ is 4, *A* must be multiple of 3.\nThrough trial and error, $A = 3, C = 4$ gives:\n\n$$\n\\begin{align*}\n 3x^2 + 4y^2\n &= {12 - 24t^2 + 12t^4 + 64t^2 \\over (3t^2 + 1)^2}\n \\\\\n &= {12 - 40t^2 + 12t^4 \\over (3t^2 + 1)^2}\n \\\\\n &= {(4t^2 + 12) (3t^2 + 1) \\over (3t^2 + 1)^2}\n \\\\\n &= {4t^2 + 12 \\over 3t^2 + 1}\n\\end{align*}\n$$\n\nSince the numerator of *y* has a *t*, this is clearly some combination of *x* and a constant.\nBy the previous line of thought, the constant term must be a multiple of 4, and picking the smallest\n option finally results in the implicit form:\n\n$$\n\\begin{align*}\n 4 &= {4(3t^2 + 1) \\over 3t^2 + 1}\n = {12t^2 + 4 \\over 3t^2 + 1}\n \\\\\n {4t^2 + 12 \\over 3t^2 + 1} - 4\n &= {8 - 8t^2 \\over 3t^2 + 1} = 4x\n \\\\[14pt]\n 3x^2 + 4y^2 &= 4x + 4\n \\\\\n 3x^2 + 4y^2 - 4x - 4 &= 0\n\\end{align*}\n$$\n\nNotably, the coefficients of *x* and *y* are 3 and 4.\nSimultaneously, $o_1(\\varepsilon) = o_1(1/2) = {3 \\over 5} + i{4 \\over 5}$.\nThis binds together three concepts: the simplest case of the Pythagorean theorem,\n the 3-4-5 right triangle; the coefficients of the implicit form; and the role of eccentricity\n with respect to stereography.\n\n\n#### Hyperbola ($|\\varepsilon| > 1$)\n\nAs evidenced by the bound on the eccentricity above, hyperbolae are in some way the inverses of ellipses.\nSince $o_1(2)$ is a 
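The pencil-and-paper algebra above is exactly the kind of thing Sympy's `simplify` handles well. A quick check of the parabola and ellipse implicit forms (nothing here beyond the equations already derived):

```python
import sympy as sp

t = sp.symbols("t", real=True)
c1 = (1 - t**2) / (1 + t**2)
s1 = 2*t / (1 + t**2)

# parabola, eps = 1: x = y^2/2 - 1/2
x, y = c1 / (1 - c1), s1 / (1 - c1)
print(sp.simplify(x - (y**2/2 - sp.Rational(1, 2))))  # 0

# ellipse, eps = 1/2: 3x^2 + 4y^2 - 4x - 4 = 0
x, y = c1 / (1 - c1/2), s1 / (1 - c1/2)
print(sp.simplify(3*x**2 + 4*y**2 - 4*x - 4))         # 0
```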
reflection of $o_1(1/2)$, you might think the implicit equation for\n $\\varepsilon = 2$ to be the same, but with a flipped sign or two.\nUnfortunately, you'd be wrong.\n\n:::: {layout-ncol=\"2\"}\n\n::: {#12a8b8f9 .cell execution_count=7}\n\n::: {.cell-output .cell-output-display}\n![$x = {c_1 \\over 1 - 2c_1} \\quad y = {s_1 \\over 1 - 2c_1}$](index_files/figure-html/cell-7-output-1.png){}\n:::\n:::\n\n\n::: {}\n$$\n\\begin{gather*}\n \\begin{align*}\n x &= {c_1 \\over 1 - 2c_1} = {1 - t^2 \\over 3t^2 - 1}\n \\\\\n y &= {s_1 \\over 1 - 2c_1} = {2t \\over 3t^2 - 1}\n \\\\[14pt]\n 3x^2 - y^2 &= {3 - 6t^2 + 3t^4 - 4t^2 \\over 3t^2 - 1}\n \\\\\n &= {(t^2 - 3)(3t^2 - 1) \\over (3t^2 - 1)^2 }\n \\\\\n &= {t^2 - 3 \\over 3t^2 - 1 }\n = ... = -4x - 1\n \\end{align*}\n \\\\[14pt]\n 3x^2 - y^2 + 4x + 1 = 0\n\\end{gather*}\n$$\n:::\n::::\n\nAt the very least, the occurrences of 1 in the place of 4 have a simple explanation: 1 = 4 - 3.\n\n\n### Archimedean Spiral\n\nArguably the simplest (non-circular) polar curve is $r(\\theta) = \\theta$, the unit\n [Archimedean spiral](https://en.wikipedia.org/wiki/Archimedean_spiral).\nSince the curve is defined by a constant turning, this is a natural application of the properties\n of sine and cosine.\nThe closest equivalent in rational polar coordinates is $R_1(t) = t$.\nBut this can be converted to an implicit form:\n\n$$\n\\begin{gather*}\n x = tc_1 \\qquad y = ts_1\n \\\\[14pt]\n x^2 + y^2 = t^2(c_1^2 + s_1^2) = t^2\n \\\\\n y = {2t^2 \\over 1 + t^2} = {2(x^2 + y^2) \\over 1 + (x^2 + y^2)}\n \\\\[14pt]\n (1 + x^2 + y^2)y = 2(x^2 + y^2)\n\\end{gather*}\n$$\n\nThe curve produced by this equation is a\n [right strophoid](https://mathworld.wolfram.com/RightStrophoid.html)\n with a node at (0, 1) and asymptote $y = 2$.\nThis form suggests something interesting about this curve: it approximates the Archimedean spiral\n (specifically the one with polar equation $r(\\theta) = \\theta/2$).\nIndeed, the sequence of curves with parametrization $R_n(t) = 2nt$ approximate the (unit) spiral\n for larger *n*, as can be seen in the following video.\n\n\n\n::: {#fig-approx-archimedes}\n{{< video ./approximate_archimedes.mp4 >}}\n\nApproximations to the Archimedean spiral\n:::\n\n\nSince R necessarily defines a rational curve, the curves will never be equal,\n just as any stretching of $c_n$ will never exactly become cosine.\n\n\nClosing\n-------\n\nSine, cosine, and the exponential function, are useful in a calculus setting precisely\n because of their constant \"velocity\" around the circle.\nAlso, nearly every modern scientific calculator in the world features buttons\n for trigonometric functions, so there seems to be no reason *not* to use them.\n\nWe can however be misled by their apparent omnipresence.\nStereographic projection has been around for *millennia*, and not every formula needs to be rewritten\n in its language.\nFor example (and as previously mentioned), defining the Chebyshev polynomials really only requires\n understanding the multiplication of two complex numbers whose norm cannot grow,\n not trigonometry and dividing angles.\nMany other instances of sine and cosine merely rely on a number (or ratio) of loops around a circle.\nWhen velocity does not factor, it will obviously do to \"stay rational\".\n\nOne of my favorite things to plot as a kid were polar roses, so I was somewhat intrigued\n to see that they are, in fact, rational curves.\nOn the other hand, their rationality follows immediately from the rationality of the circle\n (which itself follows 
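The hyperbola and strophoid equations check out the same way; this short Sympy verification is only a restatement of the algebra above:

```python
import sympy as sp

t = sp.symbols("t", real=True)
c1 = (1 - t**2) / (1 + t**2)
s1 = 2*t / (1 + t**2)

# hyperbola, eps = 2: 3x^2 - y^2 + 4x + 1 = 0
x, y = c1 / (1 - 2*c1), s1 / (1 - 2*c1)
print(sp.simplify(3*x**2 - y**2 + 4*x + 1))                # 0

# right strophoid from R_1(t) = t: (1 + x^2 + y^2) y = 2 (x^2 + y^2)
x, y = t*c1, t*s1
print(sp.simplify((1 + x**2 + y**2)*y - 2*(x**2 + y**2)))  # 0
```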
from the existence of Pythagorean triples).\nIf I were more experienced with manipulating Chebyshev polynomials or willing to set up a\n linear system in (way too) many terms, I might have considered attempting to find\n an implicit form for them as well.\n\nDiagrams created with Sympy and Matplotlib.\n\n", "supporting": [ - "index_files" + "index_files/figure-html" ], "filters": [], "includes": {}