79534883

Date: 2025-03-25 21:14:33
Score: 1.5
Natty:
Report link

The fastest minimalist log2(x) code that I know of uses a union to reinterpret an IEEE754 FP number representation as a scaled integer proxy for the log function. It is very specific to the IEEE754 binary representation. I think that it may be originally due to Kahan. It can be very fast if you skip the error checking. I only bother to check for x>0 here (various error cases are covered in another answer).

This is a quick implementation for the simplest linear interpolation and up to cubic forms.
The first return gives the linearised log2 version equivalent to @chux implementation.
Essentially it works by subtracting off the exponent offset from the biased exponent and then making the rather gross approximation that log2(1+x) ~= x. Exact for x=0 and x=1 but poor near x=0.5.

This is the sample log2 code with a minimal test framework:


#include <iostream>

// *CAUTION* Requires IEEE754 FP 32 bit representation to work!
// Pade rational float implementation on x = 0 to 1 is good to 3 sig fig

union myfloat { float f32; int i32; };

float mylog2(float x)
{
    myfloat y, z;
    z.f32 = y.f32 = x;
    if (x > 0.0)
    {
        y.i32 -= 0x3f800000;            // subtract off exponent bias
//        return (float) y.i32/(1<<23); // crude linear approx max err ~ 0.086

        y.f32 = (y.i32)>>23; // y now contains the integer exponent

        z.i32 &= 0x3fffffff;
        z.i32 |= 0x3f800000; // mask argument x to be of form 0 <= 1+x < 1
        z.f32 -= 1.0;
//       z.f32 = 1.2 * z.f32 * (1 - z.f32 * (0.5 - z.f32 / 3.0));            // naive cubic Taylor series around x=0 with  max err ~ 0.083
//       z.f32 = 1.5 * z.f32 - z.f32 * z.f32 * (0.81906 - z.f32 * 0.3236);   // optimised cubic polynomial (not monotonic) max err ~ 0.00093
//       z.f32 = z.f32 * 10 * (6 + z.f32) / (42 + 28 * z.f32);                      // naive Pade rational approximation    max err ~ 0.0047
        z.f32 = z.f32 * (36 + 4.037774 * z.f32) / (24.9620775 + 15.070677 * z.f32); // optimised Pade for this range 0 to 1 max err ~ 0.00021
                                                                                    // naive int coeffs were 36, 4, 25, 15 respectively
    }
    return y.f32+z.f32;
}

void check_log(float x)
{
    printf("%-9g : %-9g %-9g\n", x, log2(x), mylog2(x));
}

void test_log()
{
    float x = 0.0625;
    while (x < 33)
    {
        check_log(x * 0.9999); // to check monotonicity at boundary
        check_log(x);
        check_log(x * 1.0001);
        check_log(x * 1.1);
        check_log(x * 1.5);
        check_log(x * 1.75);
        x += x;
    }
}

int main()
{
    test_log();
}

This TI designers notes on [fast 'log(x)'](https://www.ti.com/lit/an/spra218/spra218.pdf?ts=1742833036256) tricks is also worth a look at for inspiration. You really need to decide how accurately do you want the result. Speed vs accuracy is always a trade off. How much is enough?

Reasons:
  • Long answer (-1):
  • Has code block (-0.5):
  • Ends in question mark (2):
  • User mentioned (1): @chux
  • Looks like a comment (1):
  • High reputation (-1):
Posted by: Martin Brown