The Truth about the Fast Inverse Square Root on the N64

232,352
Published 2023-11-18

All Comments (21)
  • @ENCHANTMEN_
    It's crazy to think just what a modern computer might be capable of if we had the time and expertise to optimize code to this degree...
  • @michi.78
    everyone's gangsta till kaze finds a one-cycle improvement in the most cracked function in existence
  • @TypingHazard
    Is "now try it with the fast inverse sqrt" the programmer version of how every musician content creator is asked/forced to attempt Rush E and other meme songs?
  • @ethanpayne4116
    Silas' idea with the error cancelling is very cool. There are probably many other examples where we can reduce the error of one problem by dividing it into two sub-problems with opposite errors.
  • @Gideon_Judges6
    I'm kind of surprised it was made popular by Quake III. I could've sworn Mike Abrash put it in the original Quake, but it's been years since I saw the source.
  • Nice video! Just one small point: If you want to use invsqrt(x) to calculate sqrt(x), you can use x*invsqrt(x) instead of 1/invsqrt(x). That might save a few cycles? But I still agree that the quake3 fast inverse square root algorithm is probably not that useful on N64.
  • @603840Jrg
    I like to think that whatever piece of his soul John Carmack lost when Oculus got acquired by FB/Meta ended up possessing Kaze
  • @simjans7633
    Both editions of the book Hacker's Delight mention the fast inverse square root (or as they call it, an Approximate Reciprocal Square Root Routine) and give various improvements of the algorithm. The books already cover FISR without Newton iterations: "deleting the Newton step results in a substantially faster function with a relative error within ±0.035, using a constant of 0x5F37642F."
  • @arciks11
    2:00 So they gave it a race car engine and a can of beer for a gas tank?
  • @mrmimeisfunny
    I didn't actually expect it to work at all, because I remember you said that square roots on the N64 are relatively fast. Of course Kaze will find a way to eke out that tiny extra bit of performance and then some.
  • @RatcheT2497
    have you thought about writing some small research papers for these findings and experiments? like, even if they're extremely specific for your use case, they're still cool as hell and might even help someone some day
  • @torvusbolt201
    All of your videos are so incredible. I love how you mix maths and humour in the way you do. Even if I can't comprehend everything, I love each and every second
  • It's amazing to me how you can make deeply complex topics so easy to understand by explaining them based on a use case. Programming is like black magic to me, yet I can follow your videos along without any issue. God bless.
  • @caliburnleaf9323
    For the graph at 9:19, it probably would have been more clear if you'd labeled it as "Error (%) vs Cycles," since that's what the numbers actually represent. In both cases, a lower number is better, which is the inverse of what is implied by "Accuracy vs Performance" (which suggests that a higher value is more accurate or has higher performance).
  • RAM Bus is my favorite recurring character on this show, glad to see Them back!
  • @DelayRGC
    The inverse square root sure went through a journey, didn't it? From being cumbersome to calculate, to an ingenious bit hack, to becoming its own CPU instruction.
  • @scoreunder
    I'm honestly impressed at the grind. In your shoes I would probably have ignored the FISR comments for being more pop comp sci than actual well-informed optimisation tech but good on you for actually investigating it and finding a place it excels even on a chip with hardware floats
  • @prgnify
    I was certain it was absolutely "useless" on the Nintendo 64 hardware, so I'm amazed you actually found a place to use it! Also, you know your audience very well: at 06:58 I chuckled and at 07:23 I almost laughed. Great content!
  • @Flamefreeze1
    The truth about why my dad never came back from the grocery store 😢😢😢 Edit: Yooo my mind was blown w/ Silas’s idea. Mathematically it seems obvious but getting that much accuracy improvement with the 2 fourth-root calcs multiplied together is insane! Thanks for the good content as always!
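
Two of the suggestions in the comments above can be sketched together in C: the Newton-free variant with the constant 0x5F37642F quoted from Hacker's Delight, and the x*invsqrt(x) trick for getting a square root without a division. This is only a minimal sketch, not the code from the video. The function names `rsqrt_no_newton` and `sqrt_via_rsqrt` are made up for illustration, and the code assumes 32-bit IEEE 754 floats and a positive, normal input.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical name: approximate 1/sqrt(x) with the bit trick only,
 * no Newton iteration. Assumes x is a positive, normal IEEE 754 float.
 * The constant is the one quoted in the comment citing Hacker's Delight,
 * which states a relative error within about +/-0.035 for this variant. */
static float rsqrt_no_newton(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);        /* reinterpret float bits safely */
    bits = 0x5F37642Fu - (bits >> 1);      /* halve and negate the exponent */
    float y;
    memcpy(&y, &bits, sizeof y);
    return y;
}

/* Hypothetical name: sqrt(x) = x * (1/sqrt(x)), which replaces the
 * division in 1/invsqrt(x) with a single multiplication. */
static float sqrt_via_rsqrt(float x) {
    return x * rsqrt_no_newton(x);
}
```

Calling `sqrt_via_rsqrt(4.0f)` should land within a few percent of 2. Whether any of this beats the hardware square root on a given chip is a separate question that only measurement can answer.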