I pity the OP, lumbered with such a silly and inflexible accuracy target for their implementation as shown above. They don't deserve the negative votes this question got, but their professor certainly does!

By way of introduction I should point out that we would not normally approximate sin(x) over a range as wide as 0-pi, because doing so violates the important heuristics generally used when approximating functions by series expansions in computing, namely:

  1. Each successive term in the series satisfies |a_n x^n| > |a_(n+1) x^(n+1)|
  2. Terms are added together smallest to largest to control rounding error.

IOW each successive term is a smaller correction to the sum than its predecessor, and the terms are typically summed starting from the highest order (smallest) term, usually by Horner's method, which lends itself to modern CPUs' FMA instructions: a combined multiply and add with only one rounding, processed each machine cycle.
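To make that concrete, here is a minimal sketch of a Horner loop written with explicit FMAs. It assumes the C99 fmaf() from <math.h> and takes its coefficients highest order first (the hardcoded Horner versions in the listing further down store theirs lowest order first):

#include <math.h>

// Horner's method with explicit fused multiply-adds: each fmaf computes
// s*x + c[i] with a single rounding at the end.
float horner_fma(const float *c, int n, float x)
{
    float s = c[0];                 // highest-order coefficient
    for (int i = 1; i < n; i++)
        s = fmaf(s, x, c[i]);
    return s;
}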

To illustrate how range reduction helps, the test code below runs the original convergence test with different numbers of terms N for argument limits of pi, pi/2 and pi/4. The first case, evaluation at pi, thrashes about wildly at first before eventually settling down. The last, pi/4, requires just 4 terms to converge.

In fact we can get away with the wide range in this instance because sin, cos and exp are all highly convergent power series with a factorial in the denominator: for cos the ratio of consecutive terms is x^2/((2k-1)(2k)), which at x = pi already falls below 1 from the x^4 term onwards. The large alternating terms added to the partial sums when x ~ pi do, however, cause some loss of precision at the high end of the range.

We would normally approximate over a reduced range of pi/2, pi/4 or pi/6. However, taking this challenge head on, there are several ways to do it. The different simple methods of summing the Taylor series can give a few possible answers, depending on how you add them up and whether or not you accumulate the partial sum in a double precision variable. There is no compelling reason to prefer any one of them over another; the fastest method is as good as any.

There is really nothing good about the professor's recommended method. It is by far the most computationally expensive way to do it, and for good measure it violates the original specification of computing the Taylor series when N >= 14, because the factorial 14! = 87,178,291,200 cannot be represented exactly in a float32's 24-bit significand - the value is truncated.

The OP's original method was perfectly valid and neatly sidesteps the risk of overflow of x^N and N! by refining the next term to be added at each iteration inside the summation loop (berechneCosinus in the code below). The only refinement would be to step the loop in increments of 2 and so avoid computing n = 2*i.

@user85421's comment reminded me of a very old school way to obtain a nearly correct polynomial approximation for cos/sin by nailing the result obtained at a specific point to be exact. Called "shooting", and usually done for boundary value problems, it is the simplest and easiest to understand of the more advanced methods for improving polynomial accuracy.

In this instance we adjust the very last term, in x^N, so that it hits the target of cos(pi) = -1 exactly (see shootTaylorCos below). It can be manually tweaked from there to give a crude, nearly equal ripple solution that is about 25x more accurate than the classical Taylor series.

The fundamental problem with the Taylor series is that it is ridiculously over-precise near zero and increasingly struggles as it approaches pi. This hints that we might be able to find a compromise set of coefficients that is just good enough everywhere in the chosen range.

The real magic comes from constructing a Chebyshev equal ripple approximation using the same number of terms. This is harder to do for 32 bit floating point, and since a lot of modern CPUs now have double precision arithmetic that is as fast as single precision, you often find double precision implementations lurking inside nominally float32 wrappers.

It is possible to rewrite a Taylor series into a Chebyshev expansion by hand. My results were obtained using ARMremez.jl, a Julia numerical code for rational approximations.
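For anyone wanting to reproduce coefficients in this ballpark without a remez code, plain Chebyshev interpolation of cos(sqrt(t)) in the variable t = x^2 on [0, pi^2] already lands within a small factor of equal ripple. The sketch below is my own illustration, not the ARMremez.jl method; it prints Chebyshev-series coefficients c_j, which you would then evaluate with Clenshaw's recurrence or convert back to powers of t:

#include <stdio.h>
#include <math.h>

#define NTERMS 7                 /* matches the 7 term C7 set below */

int main(void)
{
    const double PI = 3.141592653589793;
    double a = 0.0, b = PI * PI;  /* work in t = x^2 on [0, pi^2] */
    double c[NTERMS];
    /* Chebyshev interpolation coefficients of f(t) = cos(sqrt(t)):
       c_j = (2/N) * sum_k f(t_k) * cos(j*(k+1/2)*pi/N) at the N nodes */
    for (int j = 0; j < NTERMS; j++) {
        double s = 0.0;
        for (int k = 0; k < NTERMS; k++) {
            double u = cos(PI * (k + 0.5) / NTERMS);       /* node in [-1,1] */
            double t = 0.5 * (b - a) * u + 0.5 * (b + a);  /* mapped to [a,b] */
            s += cos(sqrt(t)) * cos(j * PI * (k + 0.5) / NTERMS);
        }
        c[j] = 2.0 * s / NTERMS;
    }
    c[0] *= 0.5;   /* the j = 0 term carries a conventional factor of 1/2 */
    for (int j = 0; j < NTERMS; j++)
        printf("c[%d] = % .17g\n", j, c[j]);
    return 0;
}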

To get the best possible coefficient set for fixed precision working in practice requires a huge optimisation effort and isn't always successful, but getting something that is good enough is relatively easy. The code below shows the various options I have discussed, with sample coefficients. The test framework samples enough of the range of x values to put tight bounds on the worst case absolute error |cos(x) - poly_cos(x)|.

In real applications of approximation we would usually go for minimum relative error |1 - poly_cos(x)/cos(x)| (so that ideally all the bits in the mantissa are right). However, the zero at pi/2 would make life a bit too interesting for a simple quick demo, so I have used absolute error here instead.

The 6 term Chebyshev approximation is 80x more accurate, but its error is in the direction that takes cos(x) outside the valid range |cos(x)| <= 1 (highly undesirable). That could easily be fixed by rescaling. The Chebyshev versions are written as hardcoded fixed-length Horner polynomials, avoiding any loops (and are 20-30% faster as a result).

The worst case error of the 7 term Chebyshev approximation computed in double precision is 1000x better at <9e-8 without any fine tuning. The theoretical limit with high precision arithmetic is 1.1e-8, which is below the ~3e-8 half-ULP threshold for correct rounding on 0.5-1.0 (one Unit in the Last Place (ULP) there is ~6e-8). There is a good chance that it could be made correctly rounded for float32 with sufficient effort. If not, then 8 terms will nail it.

One advantage of asking students to optimise their polynomial function on a range like 0-pi is that you can exhaustively test it fairly quickly for every possible valid input value x - something that is usually impossible for double precision functions. A proper framework for doing this much more thoroughly than my hack below was included in a post by @njuffa about approximating erfc.
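The trick that makes exhaustive float32 testing cheap is that non-negative IEEE-754 floats sort in the same order as their bit patterns, so incrementing the integer representation visits every representable x exactly once. A minimal sketch, reusing HCheby6Cos from the listing below:

#include <stdint.h>
#include <string.h>
#include <stdio.h>
#include <math.h>

float HCheby6Cos(float x);       /* from the listing below */

int main(void)
{
    const float xend = 3.1415926535f;
    uint32_t bits = 0;           /* bit pattern of +0.0f */
    float x = 0.0f;
    double maxerr = 0.0;
    float xworst = 0.0f;
    while (x <= xend) {
        double err = fabs((double)HCheby6Cos(x) - cos((double)x));
        if (err > maxerr) { maxerr = err; xworst = x; }
        bits++;                  /* next representable float up */
        memcpy(&x, &bits, sizeof x);
    }
    printf("worst |err| = %g at x = %.8f\n", maxerr, xworst);
    return 0;
}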

The test reveals that the OP's solution and the book solution are not that different, but the officially recommended method is 30x slower, or just 10x slower if you cache N!. This is all down to using pow(x,N), including the slight rounding differences in the sum, and to repeatedly recomputing factorial N (which gives inaccurate results for N >= 14).

Curiously, for a basic Taylor series expansion the worst case error is not always right at the end of the range - something particularly noticeable in the methods using pow().

Here is the results table:

Description             cos(pi)      error         min_error    x_min        max_error    x_max        time (s)
prof Taylor            -0.99989957   0.000100434   -1.436e-07   0.94130510   0.000100672  3.14159226   10.752
pow Taylor             -0.99989957   0.000100434   -1.436e-07   0.94130510   0.000100672  3.14159226    2.748
your Cosinus           -0.99989957   0.000100434   -1.570e-07   0.80652559   0.000100791  3.14157438    0.301
my Taylor              -0.99989951   0.000100493   -5.476e-08   1.00042307   0.000100493  3.14159274    0.237
shoot Taylor           -0.99999595   4.0531e-06    -4.155e-06   2.84360051   4.172e-06    3.14159012    0.26
Horner Chebyshev 6     -1.00000095   -9.537e-07    -1.330e-06   3.14106655   9.502e-07    2.21509051    0.177
double Horner Cheby 7  -1.00000000    0            -7.393e-08   2.34867692   8.574e-08    2.10044718    0.188

Here is the code that can be used to experiment with the various options. The code is C rather than Java but written in such a way that it should be easily ported to Java.

#include <stdio.h>
#include <math.h>  
#include <time.h>

#define SLOW   // to enable the official book answer 
//#define DIVIDE  // use explicit division vs multiply by precomputed reciprocal 

double TaylorCn[10], dFac[20], drFac[20];
float shootC6;
float Fac[20];
float C6[7] = { 0.99999922f, -0.499994268f, 0.0416598222f, -0.001385891596f, 2.42044015e-05f, -2.19788836e-07f }; // original 240 bit rounded down to float32 
// ref float C7[8] = { 0.99999999f, -0.499999892f, 0.0416664902f, -0.001388780783f, 2.47699662e-05f, -2.70797754e-07f, 1.724760709e-9f }; // original 240 bit rounded down to float32
float C7[8] = { 0.99999999f, -0.499999892f, 0.0416664902f, -0.001388780783f, 2.47699662e-05f, -2.707977e-07f, 1.72478e-9f };  // after simple fine tuning
double dC7[8] = { 0.9999999891722795, -0.4999998918375135482, 0.04166649019522770258731, -0.0013887807826936648, 2.47699662157542654e-05, -2.707977544202106e-07, 1.7247607089243954e-09 };
// Chebyshev equal ripple approximations obtained from modified ARMremez rational approximation code
// C7 +/- 1.08e-8 (computed using 240bit FP arithmetic - coefficients are not fully optimised for float arithmetic) actually obtain 9e-8 (could do better?)
// C6 +/- 7.78e-7 actually obtain 1.33e-6 (with fine tuning could do better)

const float pi = 3.1415926535f;

float TaylorCos(float x, int ordnung)    
{
    double sum, term,  mx2;
    sum = term = 1.0;
    mx2 = -x * x;
    for (int i = 2; i <= ordnung; i+=2) {
        term *= mx2 ;
#ifdef DIVIDE
        sum += term / Fac[i]; // slower when using divide
#else
        sum += term * drFac[i]; // faster to multiply by reciprocal
#endif
    }
    return (float) sum;
}

float fTaylorCos(float x)
{
    return TaylorCos(x, 12);
}

void InitTaylor()
{
    float x2, x4, x8, x12;
    TaylorCn[0] = 1.0;
    for (int i = 1; i < 10; i++) TaylorCn[i] = TaylorCn[i - 1] / (2 * i * (2 * i - 1)); // precompute the coefficients
    Fac[0] = 1;
    drFac[0] = dFac[0] = 1;
    for (int i = 1; i < 20; i++)
    {
      Fac[i] = i * Fac[i - 1];
      dFac[i] = i * dFac[i - 1];
      drFac[i] = 1.0 / dFac[i];
      if ((double)Fac[i] != dFac[i]) printf("float factorial fails for %i! %18.0f should be %18.0f error %10.0f ( %6.5f ppm)\n", i, Fac[i], dFac[i], dFac[i]-Fac[i], 1e6*(1.0-Fac[i]/dFac[i]));
    }
   x2 = pi * pi;
   x4 = x2 * x2;
   x8 = x4 * x4;
   x12 = x4 * x8;
   shootC6 = (float)(cos((double)pi) - TaylorCos(pi, 10)) / x12 * 1.00221f;   // fiddle factor for shootC6 with 7 terms  *1.00128;
}

float shootTaylorCos(float x)
{
    float x2, x4, x8, x12;
    x2 = x * x;
    x4 = x2 * x2;
    x8 = x4 * x4;
    x12 = x4 * x8;
    return TaylorCos(x, 10) + shootC6 * x12;
}

float berechneCosinus(float x, int ordnung) {
    float sum, term, mx2;
    sum = term = 1.0f;
    mx2 = -x * x;
    for (int i = 1; i <= (ordnung + 1) / 2; i++) {
        int n = 2 * i;
        term *= mx2 / ((n-1) * n);
        sum += term;
    }
    return sum;
}

float Cosinus(float x)
{
    return berechneCosinus(x, 12);
}

float factorial(int n)
{
    float result = 1.0f;    
    for (int i = 2; i <= n; i++)
        result *= i;
    return result;
}

float profTaylorCos_core(float x, int n)
{
    float sum, term;
    sum = term = 1.0f;
    for (int i = 2; i <= n; i += 2) {
        term *= -1;
        sum += term*pow(x,i)/factorial(i);
    }
    return (float)sum;
}

float profTaylorCos(float x)
{
    return profTaylorCos_core(x, 12);
}

float powTaylorCos_core(float x, int n)
{
    float sum, term;
    sum = term = 1.0f;
    for (int i = 2; i <= n; i += 2) {
        term *= -1;
        sum += term * pow(x, i) / Fac[i];
    }
    return (float)sum;
}

float powTaylorCos(float x)
{
    return powTaylorCos_core(x, 12);
}

float Cheby6Cos(float x)
{
    float sum, term, x2;
    sum = term = 1.0f;
    x2 = x * x;
    for (int i = 1; i < 6; i++) {
        term *= x2;
        sum += term * C6[i]; 
    }
    return sum;
}

float dHCheby7Cos(float x)
{
    double x2 = x*x;
    return (float)(dC7[0] + x2 * (dC7[1] + x2 * (dC7[2] + x2 * (dC7[3] + x2 * (dC7[4] + x2 * (dC7[5] + x2 * dC7[6])))))); // cos 7 terms
}

float HCheby6Cos(float x)
{
    float x2 = x * x;
    return C6[0] + x2 * (C6[1] + x2 * (C6[2] + x2 * (C6[3] + x2 * (C6[4] + x2 * C6[5])))); // cos 6 terms
}


void test(const char *name, float(*myfun)(float), double (*ref_fun)(double), double xstart, double xend)
{
    float cospi, cpi_err, x, ox, dx, xmax, xmin;
    double err, res, ref, maxerr, minerr;
    clock_t start, end;   // clock() returns clock_t, not time_t
    x = xstart;
    ox = -1.0;
//    dx = 1.2e-7f;
    dx = 2.9802322387695312e-8f; // chosen to test key ranges of the function exhaustively
    maxerr = minerr = 0;
    xmin = xmax = 0.0;
    start = clock();
    while (x <= xend) {
        res = (*myfun)(x);
        ref = (*ref_fun)(x);
        err = res - ref;
        if (err > maxerr) {
            maxerr = err;
            xmax = x;
        }
        if (err < minerr) {
            minerr = err;
            xmin = x;
        }
        x += dx;
        if (x == ox) dx += dx;
        ox = x;
    }
    end = clock();
    cospi = (*myfun)(pi);
    cpi_err = cospi - cos(pi);
    printf("%-22s  %10.8f  %12g %12g  @  %10.8f  %12g  @  %10.8f  %g\n", name, cospi, cpi_err, minerr, xmin, maxerr, xmax, (float)(end - start) / CLOCKS_PER_SEC);
}

void OriginalTest(const char* name, float(*myfun)(float, int), float target, float x)
{
    printf("%s cos(%10.7f) using terms upto x^N\n N \t result  error\n",name, x);
    for (int i = 0; i < 19; i += 2) {
        float cx, err;
        cx = (*myfun)(x, i);
        err = cx - target;
        printf("%2i  %-12.9f  %12.5g\n", i, cx, err);
        if (err == 0.0) break;
    }
}

int main() {
    InitTaylor();    // note that factorial 14 cannot be represented accurately as a 32 bit float and is truncated.
                     // easy sanity check on factorial numbers is to count the number of trailing zeroes.

    float x = pi; // approx. PI
    OriginalTest("Taylor Cosinus", berechneCosinus, cos(x), x);
    OriginalTest("Taylor Cosinus", berechneCosinus, cos(x/2), x/2);
    OriginalTest("Taylor Cosinus", berechneCosinus, cos(x/4), x/4);

    printf("\nHow it would actually be done using equal ripple polynomial on 0-pi\n\n");
    printf("Chebyshev equal ripple   cos(pi) 6 terms %12.8f (sum order x^0 to x^N)\n", Cheby6Cos(x));
    printf("Horner optimum Chebyshev cos(pi) 6 terms %12.8f (sum order x^N to x^0)\n", HCheby6Cos(x));
    printf("Horner optimum Chebyshev cos(pi) 7 terms %12.8f (sum order x^N to x^0)\n\n", dHCheby7Cos(x));

    printf("Performance and functionality tests of versions - professor's solution is 10x slower ~2s on an i5-12600 (please wait)...\n");
    printf(" Description \t\t cos(pi)      error \t    min_error \t    x_min\tmax_error \t x_max \t  time\n");
#ifdef SLOW
    test("prof Taylor", profTaylorCos, cos, 0.0, pi);
    test("pow  Taylor", powTaylorCos,  cos, 0.0, pi);
#endif
    test("your Cosinus",  Cosinus, cos, 0.0, pi);
    test("my Taylor",  fTaylorCos, cos, 0.0, pi);
    test("shoot Taylor", shootTaylorCos, cos, 0.0, pi);
    test("Horner Chebyshev 6",  HCheby6Cos, cos, 0.0, pi);
    test("double Horner Cheby 7", dHCheby7Cos, cos, 0.0, pi);
    return 0;
}

It is interesting to make the sum and x2 variables double precision and observe the effect that has on the answers. If someone fancies running simulated annealing or another global optimiser to find the best possible optimised Chebyshev 6 & 7 float32 approximations, please post the results.
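Short of a global optimiser, even a crude coordinate search over ULP neighbours of each coefficient can recover some of the gap. A sketch, assuming a hypothetical max_abs_err() helper that does an exhaustive error scan like test() in the code above:

#include <math.h>

extern float C7[8];           /* coefficient set from the listing above */
double max_abs_err(void);     /* hypothetical: exhaustive |cos(x)-poly(x)| scan */

// Step each coefficient a few ULPs either way and keep any change that
// reduces the worst-case error; a handful of sweeps usually settles.
void tune(void)
{
    double best = max_abs_err();
    for (int pass = 0; pass < 4; pass++)
        for (int i = 0; i < 7; i++)
            for (int dir = -1; dir <= 1; dir += 2) {
                float save = C7[i];
                for (int s = 0; s < 3; s++) {
                    C7[i] = nextafterf(C7[i], dir > 0 ? INFINITY : -INFINITY);
                    double e = max_abs_err();
                    if (e < best) { best = e; save = C7[i]; }
                }
                C7[i] = save;  /* keep the best value seen */
            }
}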

I agree wholeheartedly with Steve Summit's final comments. You should think very carefully about the risk of overflow of intermediate results and the order of summation when doing numerical calculations. Numerical analysis using floating point numbers follows different rules from pure mathematics, and some rearrangements of an equation are very much better than others when you want to compute an accurate numerical value.
