First of all, I compared the code provided by @Joachim with and without the if(0) clause on the second task. With gcc 11.4.0 I see a difference, but with clang 14.0.0 the two versions run in about the same time (adding if(0) gives a slight edge, of course, but nothing big). Then I tried a simple experiment like the one below:
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>     /* sleep() */
    #include <sys/time.h>   /* gettimeofday() */
    #include <omp.h>

    struct timeval t1, t2;

    void recursive_task(int level)
    {
        //printf("%d\n", level);
        if (level == 0) {
            sleep(1);       /* leaf task: simulate 1 second of work */
            return;
        }
        else
        {
            /* spawn two child tasks, then wait for both of them */
            #pragma omp task
            {
                recursive_task(level-1);
            }
            #pragma omp task
            {
                recursive_task(level-1);
            }
            #pragma omp taskwait
        }
    }

    int main()
    {
        double time;
        gettimeofday(&t1, 0);
        #pragma omp parallel num_threads(4)
        {
            /* one thread builds the task tree, the others pick up tasks */
            #pragma omp single
            {
                recursive_task(2);
            }
        }
        gettimeofday(&t2, 0);
        /* elapsed wall-clock time in seconds */
        time = (double)((t2.tv_sec - t1.tv_sec) * 1000000 + t2.tv_usec - t1.tv_usec) / 1000000;
        printf("%.4f\n", time);
        return 0;
    }
This should run in about 1 second: the task tree has four leaf tasks that each sleep for 1 second, and four threads are available. With gcc, however, it sometimes takes about 2 seconds and only very rarely (at least on my laptop) 1 second, so the runtime does not always schedule the tasks in a way that takes advantage of all the available threads. This is a very strange inconsistency. I suspect there is an issue in gcc's OpenMP task scheduler, possibly a bug. With if(0) on the second task it always seems to run in 1 second, as expected, but simply adding if(0) should not normally make such a huge difference.
With clang, on the other hand, both versions (with and without if(0)) run in 1 second. I think this is a bug in gcc. Can someone confirm that the behaviour described above is reproducible and that I am not crazy?
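For anyone who wants to reproduce the comparison, here is a minimal sketch of the if(0) variant I am referring to. It is only an illustration, not the exact code I timed: the names recursive_task_variant and run_variant and the undeferred_second flag are made up for this example. It relies on the fact that the if clause accepts a scalar expression, so a runtime flag can turn the second task into an undeferred one (equivalent to if(0)) without having to keep two copies of the source.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>     /* sleep() */
#include <sys/time.h>   /* gettimeofday() */

/* Hypothetical helper: same task tree as in the program above. When
   undeferred_second is nonzero, the second child task behaves as if it
   had been written with if(0). */
static void recursive_task_variant(int level, int undeferred_second)
{
    assert(level >= 0);     /* guard against endless recursion */
    if (level == 0) {
        sleep(1);           /* leaf task: simulate 1 second of work */
        return;
    }

    #pragma omp task
    recursive_task_variant(level - 1, undeferred_second);

    /* When the if expression evaluates to 0, the task is undeferred:
       the encountering thread executes it immediately. */
    #pragma omp task if(!undeferred_second)
    recursive_task_variant(level - 1, undeferred_second);

    #pragma omp taskwait
}

/* Hypothetical helper: runs the task tree on 4 threads and returns the
   elapsed wall-clock time in seconds. */
static double run_variant(int level, int undeferred_second)
{
    struct timeval start, end;

    gettimeofday(&start, 0);
    #pragma omp parallel num_threads(4)
    {
        #pragma omp single
        recursive_task_variant(level, undeferred_second);
    }
    gettimeofday(&end, 0);

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_usec - start.tv_usec) / 1e6;
}
```

Calling run_variant(2, 0) and run_variant(2, 1) from main and printing both results (compiled with -fopenmp) gives the two timings side by side in a single run, which should make it easier to check whether the difference shows up on other machines.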