How AI Discovered a Faster Matrix Multiplication Algorithm

Published 2023-05-22
Researchers at Google research lab DeepMind trained an AI system called AlphaTensor to find new, faster algorithms to tackle an age-old math problem: matrix multiplication. Advances in matrix multiplication could lead to breakthroughs in physics, engineering and computer science.

AlphaTensor quickly rediscovered, and in some cases surpassed, the reigning algorithm discovered by German mathematician Volker Strassen in 1969. Mathematicians, however, soon took inspiration from the results of the game-playing neural network to make advances of their own.
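For readers who want to see concretely what Strassen found, here is a minimal Python sketch of his 2x2 construction, which uses 7 scalar multiplications instead of the schoolbook 8; the function name and the check at the end are illustrative additions, not taken from the video.

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 multiplications (Strassen, 1969)
    instead of the 8 used by the standard row-times-column algorithm."""
    a, b, c, d = A[0][0], A[0][1], A[1][0], A[1][1]
    e, f, g, h = B[0][0], B[0][1], B[1][0], B[1][1]

    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)

    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4,           m1 - m2 + m3 + m6]]

# Quick sanity check against the standard algorithm's result
print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

Applied recursively to matrix blocks rather than single numbers, this same trick is what pushes the cost of multiplying large matrices below the n^3 of the standard algorithm.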

Read the full article at Quanta Magazine: www.quantamagazine.org/ai-reveals-new-possibilitie…

Correction: At 2:53 in the video, the text previously read "67% less" but has been changed to "67%" for accuracy.

00:00 What is matrix multiplication?
01:06 The standard algorithm for multiplying matrices
02:06 Strassen's faster algorithm for matrix multiplication
03:55 DeepMind AlphaGo beats a human
04:28 DeepMind uses AI system AlphaTensor to search for new algorithms
05:18 A computer helps prove the four color theorem
06:17 What is a tensor?
07:16 Tensor decomposition explained
08:48 AlphaTensor discovers new, faster matrix multiplication algorithms
11:09 Mathematician Manuel Kauers improves on AlphaTensor's results

- VISIT our Website: www.quantamagazine.org/
- LIKE us on Facebook: www.facebook.com/QuantaNews
- FOLLOW us on Twitter: twitter.com/QuantaMagazine

Quanta Magazine is an editorially independent publication supported by the Simons Foundation: www.simonsfoundation.org/

#math #AlphaTensor #matrices

All Comments (21)
  • So to make matrix multiplication more efficient, they used machine learning, which relies heavily on matrix multiplication. Very cool!
  • @sheevys
One cool thing from the paper which didn't make it into the video was that instead of using the number of multiplications as the cost function, one can use the actual timings from running each algorithm. This allows finding a fast multiplication algorithm tailored to the device it runs on. Interestingly, they showed that different algorithms did indeed perform best on different hardware platforms. (A toy sketch of this idea appears after the comments.)
  • @Wooksley
I finished university 4 years ago and I still get nightmares about my diploma being revoked because they’ve discovered that I failed my linear algebra exam. Doing matrix multiplication by hand is a literal nightmare for me.
  • @ABaumstumpf
For matrix multiplications Strassen is the best overall - it works for any size and any shape of matrix. But there are of course faster methods for specific cases: algorithms that are faster for large matrices, others that are better for diagonalised matrices, etc. They found an algorithm for matrices containing only 1s and 0s that is slightly faster - single-digit percentage faster. It is an achievement and does help, but the bigger benefit might just be that with specialised hardware it can use fewer multiplication units - and thus less power.
  • @alphalunamare
    I devised an algorithm in 1977 that utilised 24 multiplications and 69 additions for the 3x3 case but didn't get much further as it was equivalent to Strassen's in the 2x2 case. I did it with pencil and paper. After seeing this video I don't feel so bad now about my failure :-)
  • @RepChris
I mean technically we've had "faster" algorithms than the Strassen one for quite a while. We're down from an exponent of roughly 2.81 to roughly 2.37 for very large matrices. Although an actually usable algorithm that's faster even in relatively special cases is always nice to see.
  • @QuantenMagier
    I read the corresponding papers and this is a pretty good recap video on the topic. Kudos for visualizing the algorithms in an understandable way and even giving some background information from the researchers! 🧑‍🎓
  • @KnakuanaRka
The way you described the tensor decomposition problem as a game of removing rank-1 tensors from the original to reduce it to all zeros as fast as possible, and how you showed it with the Strassen example, reminds me a lot of the Lights Out game, where you press a light to flip it and all adjacent lights until the whole board is dark. The moves in tensor decomposition are a lot more complex, of course, but seeing it as a game makes a lot of sense. (A small sketch of this view appears after the comments.)
  • @silentblackhole
I didn't follow along with some of the more detailed parts but overall found it really interesting. Nice work to everyone involved in this, directly or otherwise.
  • @Dismythed
The video never explained how AlphaTensor achieved the matrix calculations with fewer steps than before (47 to 45, or 96 to 95).
  • @Alexandre-ur1oi
For the sake of completeness: in practical applications one should keep in mind that reducing the number of basic operations is not the main bottleneck for matrix multiplication efficiency; moving data is (bus at ~500 MHz, CPU at 1-2 GHz). However, without new ideas on this last not-yet-understood basic operation (polynomial and integer multiplication are well mastered), brute computer force (cleverly used in the present case) is welcome.
  • @Techmagus76
To explain the AlphaTensor result it would be better to show and explain the Karatsuba algorithm, as AlphaTensor found something for 4x4 similar to what Karatsuba does for 2x2. Sure, Strassen beats Karatsuba on 2x2, so it should be mentioned, but for the didactics of the solution Karatsuba gives more insight into the result. (A small Karatsuba sketch appears after the comments.)
  • @niks660097
You should make a video on constraint-based solvers; it's just amazing that you can bypass all the physics and still do realistic physical simulation with just a few linear equations, which in turn just use matrix multiplication.
  • @naysay02
    thanks for the accessible yet detailed explanation. really cool
  • @user-francescob
Have any of these researchers looked at the impact of this new algorithm on specialized matrix libraries such as TensorFlow, and on GPUs? It's possible that the 50-year-old algorithm will execute faster than the new DeepMind algorithm if, for example, there's a significant advantage to specialized functions tuned for GPU instruction sets. It would be important to know: #1 - the answer to this; #2 - where the cross-over occurs; #3 - which existing libraries will adopt this new algorithm, and when.
  • @vardhmansingh3463
Quanta 👌🏻👌🏻 always makes these awesome and informative videos. Some enlightenment every time!
  • @Omena0MC
    Bruh ai literally made itself faster 💀💀💀
  • @asicdathens
The first multimedia extensions for CPUs that Intel made, the MMX instruction set, were basically matrix operations, mostly multiplications.
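The sketch referenced in @sheevys's comment above: a toy Python illustration of using measured runtimes, rather than multiplication counts, as the cost signal, by timing candidate multiplication routines on the current machine and picking the fastest. The two candidates here (a naive triple loop and NumPy's BLAS-backed call) are stand-ins for illustration only, not the algorithms AlphaTensor actually compared, and this is not DeepMind's search setup.

```python
import time
import numpy as np

def naive_matmul(A, B):
    """Textbook triple loop: n^3 scalar multiplications."""
    n = A.shape[0]
    C = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i, j] += A[i, k] * B[k, j]
    return C

def library_matmul(A, B):
    """Vendor-tuned BLAS call via NumPy."""
    return A @ B

def time_algorithm(fn, n=64, repeats=3):
    """Median wall-clock time of fn on random n x n inputs."""
    A, B = np.random.rand(n, n), np.random.rand(n, n)
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(A, B)
        times.append(time.perf_counter() - start)
    return sorted(times)[len(times) // 2]

candidates = {"naive": naive_matmul, "blas": library_matmul}
best = min(candidates, key=lambda name: time_algorithm(candidates[name]))
print("fastest on this machine:", best)
```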
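The sketch referenced in @KnakuanaRka's comment: one full "game" in Python, assuming the standard tensor view of 2x2 matrix multiplication. It builds the multiplication tensor, subtracts Strassen's seven rank-1 "moves", and checks that the board ends up all zeros. The array layout and variable names are our own choices for illustration.

```python
import numpy as np

# The 2x2 matrix multiplication tensor T: T[a, b, c] = 1 when
# (entry a of A) * (entry b of B) contributes to entry c of C = A @ B.
# Entries are flattened row-major: 0 -> (0,0), 1 -> (0,1), 2 -> (1,0), 3 -> (1,1).
T = np.zeros((4, 4, 4), dtype=int)
for i in range(2):
    for j in range(2):
        for k in range(2):
            T[2 * i + k, 2 * k + j, 2 * i + j] = 1

# Strassen's seven rank-1 "moves": row r of (U, V, W) encodes which entries of A
# and B feed product M_r and which entries of C that product contributes to.
U = np.array([[1,0,0,1],[0,0,1,1],[1,0,0,0],[0,0,0,1],[1,1,0,0],[-1,0,1,0],[0,1,0,-1]])
V = np.array([[1,0,0,1],[1,0,0,0],[0,1,0,-1],[-1,0,1,0],[0,0,0,1],[1,1,0,0],[0,0,1,1]])
W = np.array([[1,0,0,1],[0,0,1,-1],[0,1,0,1],[1,0,1,0],[-1,1,0,0],[0,0,0,1],[1,0,0,0]])

residual = T.copy()
for u, v, w in zip(U, V, W):
    residual -= np.einsum('a,b,c->abc', u, v, w)   # subtract one rank-1 tensor

print("all zeros after 7 moves:", not residual.any())   # True
```

Reaching all zeros in 7 moves is exactly the statement that 2x2 matrices can be multiplied with 7 products; AlphaTensor plays the same game on larger tensors.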
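And the sketch referenced in @Techmagus76's comment: one level of the Karatsuba trick for integers, which saves one of the four partial products, much as Strassen saves one of the eight block products. The helper name and base handling are ours, for illustration only.

```python
def karatsuba_step(a, b, base=10):
    """One level of Karatsuba: multiply two 2-digit numbers (in the given
    base) with 3 digit products instead of the schoolbook 4."""
    a1, a0 = divmod(a, base)   # a = a1*base + a0
    b1, b0 = divmod(b, base)   # b = b1*base + b0

    p_low  = a0 * b0                   # product of the low parts
    p_high = a1 * b1                   # product of the high parts
    p_mid  = (a0 + a1) * (b0 + b1)     # one extra product reused for the middle term

    # middle coefficient recovered without a fourth multiplication
    middle = p_mid - p_low - p_high
    return p_high * base**2 + middle * base + p_low

print(karatsuba_step(47, 81))  # 3807, same as 47 * 81
```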