The Wonderful World of Math (at Khan Academy)


My current Khan Academy project is to replace our math renderer, KaTeX, with MathJax v3 and have most learners be none the wiser. When this project is over, screen reader users will be the happiest because our math will be read out more clearly and will be keyboard accessible. The second-happiest group will be my fellow developers, who will have one less math library to maintain. It's a zoo of math libraries at Khan Academy.

As of early 2023, we use three different libraries to show math to learners: KaTeX, MathJax v2, and MathQuill. The most important distinction is between KaTeX/MathJax and MathQuill. The former are static math libraries: they are meant to show math at rest that the learner does not interact with. The latter is an interactive math library, where we expect the learner to modify equations with their keyboard, mouse, and fingers. My project is to upgrade static math, although there is a parallel project improving the mechanisms learners use to interact with dynamic math.

The KaTeX story

Despite my long tenure at Khan Academy, I wasn't here when KaTeX was started, so my understanding comes from lore. KaTeX got started as a Khan Academy project because there wasn't a good way to show math fast on the web. The goal was that a learner would jump into a practice session and never have to wait while the math filled in after the fact (have you spent time with a grade-schooler? They're not known for their patience). Over time, KaTeX development continued as a way to render math on the backend so that, again, math was quickly pushed to the screen.

In the beginning, KaTeX supported a parsing strategy that was perfectly aligned with the TeX being produced at Khan Academy. But over time it became clear that the parsing strategy was ambiguous and didn't work well for others' needs. The maintainers of KaTeX moved on, KaTeX got upgraded, and Khan Academy was stuck on an island with our special version. This would have been fine if our needs hadn't continued to evolve.

Over the past 10 years, Khan Academy has expanded from being a helpful add-on for learners to being essential, including in classrooms. As our learner base has expanded, and especially as we added schools, it's been clear that we need to live up to our slogan: "A free world-class education for anyone, anywhere," with an emphasis on anyone. This has guided us towards investing in accessibility for all of the folks learning with us.

MathJax [v2 and v3]

The story of MathJax is the story of us coming home to a library that has always been a part of our codebase. MathJax v2 was the original math rendering tool used at Khan Academy when the practice experience was called "Khan Exercises" (the modern practice experience is called "Perseus"). When we made the move from Khan Exercises to Perseus, we also made a library transition from MathJax to KaTeX. Sort of. KaTeX could never display the complete corpus of equations in our datastore, so MathJax always hung around in the wings in case rendering failed. Yes, it continued to be slow, but a slow experience is better than no experience.
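The "fast renderer first, slow renderer in the wings" arrangement can be sketched roughly like this. It is a minimal illustration, not our actual code: the `Renderer` type, `renderWithFallback`, and both stand-in renderers are hypothetical names invented for this example, and the only assumption is that a renderer throws when it can't handle an expression.

```typescript
// Hypothetical renderer shape: takes TeX, returns HTML, throws on failure.
type Renderer = (tex: string) => string;

interface RenderResult {
  html: string;
  renderedBy: "primary" | "fallback";
}

// Try the fast renderer first; if it throws, hand the same TeX to the
// slower renderer that covers the full corpus.
function renderWithFallback(
  tex: string,
  primary: Renderer,
  fallback: Renderer,
): RenderResult {
  try {
    return { html: primary(tex), renderedBy: "primary" };
  } catch {
    return { html: fallback(tex), renderedBy: "fallback" };
  }
}

// Stand-in renderers for illustration only (not real KaTeX or MathJax calls).
const fastButPicky: Renderer = (tex) => {
  if (tex.includes("\\begin{")) {
    throw new Error("unsupported environment");
  }
  return `<span class="fast">${tex}</span>`;
};

const slowButComplete: Renderer = (tex) => `<span class="slow">${tex}</span>`;

const simple = renderWithFallback("x^2", fastButPicky, slowButComplete);
const tricky = renderWithFallback(
  "\\begin{align}x\\end{align}",
  fastButPicky,
  slowButComplete,
);
```

Here `simple` is handled by the fast path, while `tricky` quietly falls through to the slow one. The cost of this design is exactly the one described above: two libraries have to be shipped and maintained.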

The current project is to wipe the slate clean on static math rendering so that in the end there is only one static math rendering library: MathJax v3. At the end of this project, Khan Academy will be aligned with what the industry is doing and we won't be directly responsible for maintaining a math rendering library.

But remember that thing about non-standard KaTeX parsing? It turns out there are hundreds and hundreds of equations that rely on the quirks of our special version. The project I'm working on is to build a set of transformers that take the math and transform it before feeding it into MathJax v3. Someday those transformers will be used in an ETL job to update the datastore, at which point we can remove the transformers and have perfect math all the way from the datastore to the learner.
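To give a feel for what a transformer is, here is a minimal sketch. The two rules are made up for illustration and are not the actual quirks we are fixing; `TexTransformer`, `expandBlueMacro`, `trimWhitespace`, and `applyTransformers` are all hypothetical names. The idea is only that each transformer is a plain string-to-string function and that they run in order before the result is handed to the renderer.

```typescript
// A transformer takes a TeX string and returns a (possibly rewritten) TeX string.
type TexTransformer = (tex: string) => string;

// HYPOTHETICAL quirk: a shorthand color macro rewritten to \textcolor.
const expandBlueMacro: TexTransformer = (tex) =>
  tex.replace(/\\blue\{/g, "\\textcolor{blue}{");

// HYPOTHETICAL quirk: stray whitespace around the expression.
const trimWhitespace: TexTransformer = (tex) => tex.trim();

// Apply each transformer in order, feeding each output into the next.
function applyTransformers(
  tex: string,
  transformers: TexTransformer[],
): string {
  return transformers.reduce((current, transform) => transform(current), tex);
}

const legacyTransformers: TexTransformer[] = [trimWhitespace, expandBlueMacro];

// Input TeX:  "  \blue{x} + 1 "
// Output TeX: "\textcolor{blue}{x} + 1"
const cleaned = applyTransformers("  \\blue{x} + 1 ", legacyTransformers);
```

Keeping each rule as a pure function means the same list could, in principle, run at render time today and inside a one-off migration job later.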

MathQuill

The other major class of tools that we use to show math is interactive math. MathQuill is responsible for the fancy equation editor that powers the learning experience when a learner is interactively moving between a numerator and denominator, filling in a quadratic, or using any of the multitude of other math features that we need to support.

Have you ever been out for a walk and seen a set of stones stacked perfectly despite their irregular shapes and sizes? Maybe you thought, "There's no way this will keep standing up," but on walk after walk, there they are. That's how MathQuill fits into the rest of the practice experience codebase. It shouldn't be stable and reliable, and yet time after time it's there, standing firmly.

MathQuill as a library also has lore. It was actively maintained for many years and then dropped off, presumably as folks were satisfied with the feature set. One group wasn't super happy, though: Desmos, who continued to maintain a bespoke version. Some years later, MathQuill maintenance and development started up again. So there are three important versions of MathQuill: the stable one we use at Khan, the actively developed one, and the Desmos branch. Over time I would like the versions to collapse so that there is one obvious solution.

One interesting thing about interactive and static math at Khan is that never the twain shall meet. Static math is rendered to the page and remains at rest. Dynamic math is entered by a learner and then checked by Perseus for correctness. That situation is tenable as long as the two branches don't touch.

(Spoiler: learners would like to be able to copy and paste math between those experiences.)

But what about MathML?

I wish MathML were the library that unified all. MathML is a markup language for showing math in browsers. It's alluring to have the browser just handle showing math, have it be part of the render cycle, and not have to load anything extra. It would certainly achieve the goal of fast math.

MathML is still very newly added to Chrome. The tea on MathML is that it used to be available in Chrome, got removed, and then Igalia did a heroic amount of work to make sure it was added back and working perfectly. MathML has been available in Safari since forever.

The reality is that MathML still feels like a maybe. Maybe Chrome and Igalia will continue to maintain it. Maybe accessibility will be a priority. Maybe it'll be fast and well-optimized. Maybe there will be lots of CSS options to style it. Time will have to tell how it plays out.

One concern is that MathML seems designed primarily for math as it appears in journal articles. It's not designed for colorful, illustrative math, and although that's not all of Khan Academy's content, there is enough of it that this kind of control is a priority.

Another concern is that math is written in MathML's markup language and not in TeX. Tooling would be needed to go from TeX to MathML.

And finally, MathML for interactivity doesn't seem to exist yet. Maybe that's a function that libraries will fill. Maybe that's something MathQuill will build on top of.

Request for feedback

This is the story as I know it. There's lots that I've inferred over time, and I might not have it right. I'd also love to be persuaded that MathML is the future we're waiting for. For now, though, I'll continue to live with the (shrinking) zoo of math at Khan Academy.