# Quake's Fast Inverse Square Algorithm

Quake III has a super-optimized, good-enough inverse square algorithm. As a graphical game, they need millions and millions of these operations so it better be fast.

\[f(x) = \frac{1}{\sqrt{x}}\]The first pass in code is easy. Take the square root and divide one by it:

But, that `sqrt(x)`

is not that fast, or, more properly, not fast
enough for Quake’s purposes (because “fast” only matters when you have
an expectation). Division isn’t a speedy operation either. The
developers had to come up with something faster.

Here’s the code from Quake:

Here’s a discussion of it on YouTube, and mostly what I’m about to write. Wikipedia explains it, for I found it’s discussion very dense and convoluted. I think this YouTube video does a much better job, but there were times he asks the viewer to work it out for themselves. So, that’s what I’m here to do. Plenty of people have written about this before, and you won’t find anything new here.

The algorithm plays various clever, deep-magic tricks with bits to make everything work out.

First, notice that `i`

is a long and that `y`

is float. Each is 32
bits, but the computer uses those bits differently. Notice that the
code puts the value in `number`

in `y`

, a `float`

, but then translate
it to a `long`

.

The rest of the magic comes from knowing things about the bit patterns and messing with those. Reading What Every Computer Scientist Should Know About Floating-Point Arithmetic may be helpful here, but it’s far more than you need. Here’s what a IEEE 754 float looks like:

```
sign exponent (E) mantissa (M)
0 00000000 00000000000000000000000
```

The exponent is signed to give us values from -127 to 128, and the mantissa is a value between 1 and 2 (because, in binary exponential notation, the leading one is assumed: it’s the only possible non-zero value).

For some value, the bit pattern is $2^{23}E$ (to shift the exponent into the right place) added to whatever is in $M$. That’s $2^{23}E+M$. That’s just to make the right sequence of bits appear, manually, without something else constructing that for us.

The actual number works out to be:

\[(1+\frac{M}{2^{23}}) 2^{E-127}\]The $\frac{M}{2^{23}}$ gives the mantissa the right magnitude and the
$1 +$ shifts it to be between 1 and 2. The $E-127$ gives us the right
exponent from the signed bit pattern (*e.g.* an exponent of 4 would
actually be the bit pattern for 131):

Henceforth, all logarithms are $log_2$.

But, math people know a trick. For $x$ close to 1, the logarithm is just $x$ and a small fudge factor (0.0430 minimizes the average deviation between 0 and 1):

\[log( 1 + x ) \approx x + \mu\]With some replacement and rearrangement of terms, this yields the bit representation, $M + 2^{23} E $, scaled ($\frac{1}{2^{23}}$) and shifted ($\mu - 127$):

\[\begin{align*} log( 1 + \frac{M}{2^{23}} ) + E - 127 \\ \frac{M}{2^{23}} + \mu + E - 127 \\ \frac{M}{2^{23}} + E + \mu - 127 \\ \frac{1}{2^{23}} ( M + 2^{23} E ) + \mu - 127 \end{align*}\]That is, we have the bit representation we started with ($M + 2^{23} E$) even though we took the logarithm.

Now the bit magic happens. First, you can’t do bit operations on
floats, but you can on longs. So treat the bit pattern in `y`

as a
long (`i = * ( long * ) &y`

) by telling C to treat the 32 bits
starting at address `&y`

as a `long`

. So, `i`

had the value of
$log(y)$ (still without scaling and shifting).

Consider what we are trying to get, noticing that working in logarithms has some big wins. The value we want is minus one half of the logarithm of the input number. And, we know what the log of that input number is already:

\[log( \frac{1}{\sqrt{y}} ) = log( y^{-\frac{1}{2}} ) = -\frac{1}{2} log(y) = -\frac{1}{2} i\]The next part takes the bit pattern in `i`

and shifts it to the right
1 place. That’s the same thing as dividing by 2. In code,
$-\frac{1}{2} log(y)$ is `- ( i >> 1 )`

. That’s one of the terms we
see in the WTF line.

The other term is more mysterious, but represents that scale and shift we have so far ignored. Suppose that $\Gamma$ is the answer that we want, and send that through everything we know so far:

\[\begin{align*} \Gamma &= \frac{1}{\sqrt{y}} \\ log( \Gamma ) &= log( \frac{1}{\sqrt{y}} ) \\ log( \Gamma ) &= -\frac{1}{2} log(y) \\ \frac{1}{2^{23}} (M_{\Gamma} + 2^{23} E_{\Gamma}) + \mu - 127 &= - \frac{1}{2} ( \frac{1}{2^{23}} (M_{y} + 2^{23} E_{y}) + \mu - 127 ) \\ (M_{\Gamma} + 2^{23} E_{\Gamma}) &= \frac{3}{2} (2^{23}(127-\mu)) - \frac{1}{2} (M_{y} + 2^{23} E_{y}) \\ (M_{\Gamma} + 2^{23} E_{\Gamma}) &= 0x5f3759df - \frac{1}{2} (M_{y} + 2^{23} E_{y}) \\ (M_{\Gamma} + 2^{23} E_{\Gamma}) &= 0x5f3759df - ( i >> 1 ) \end{align*}\]That magic `0x5f3759df`

comes from trial and error and decades of
fooling around with this algorithm. It’s basically an approximation of
$\sqrt{2^127}$.

To get back to the floating point value, reverse the bit magic by
reading 23 bits from the address `i`

and treating it as a `float`

:
`y = * ( float * ) &i`

. That’s almost the answer. It’s not exact, but we
can send it through a single iteration of Newton’s Method:

Rearrange everything to get a function in terms of $y$:

\[\begin{align*} y &= \frac{1}{\sqrt{n}} \\ y^2 &= \frac{1}{n} \\ \frac{1}{y^2} &= n \\ \frac{1}{y^2} - n &= 0 \\ \end{align*}\]So, $f(y) = \frac{1}{y^2} - n = 0$. The derivative of that is $f’(y) = -\frac{2}{y^3}$.

\[\begin{align*} y_1 &= y_0 - \frac{f(y_0)}{f'(y_0)} \\ y_1 &= y_0 - \frac{(\frac{1}{y_0^2} - n)}{(-\frac{2}{y_0^3})} \\ y_1 &= y_0 + \frac{y_0^3 (\frac{1}{y_0^2} - n)}{2} \\ y_1 &= y_0 + \frac{( y_0 - y_0^2 n )}{2} \\ y_1 &= \frac{3}{2} y_0 + \frac{( y_0^2 n )}{2} \\ y_1 &= y_0 \frac{( 3 - y_0^2 n )}{2} \\ y_1 &= y_0 ( \frac{3}{2} - \frac{n}{2} y_0 y_0 ) \end{align*}\]That last line looks like the expression `y = y * ( threehalfs - ( x2 * y * y ) )`

since `x2 = number * 0.5F`

.

That line is repeated (but commented) because one iteration of Newton’s Method is close enough. And, notice that even though it looks like there are divisions, there are no divisions. Those are constants that we’ve previously set up.