GCC optimizes the struct initialization down to 3 instructions, while Clang takes 16.
And this causes a performance bottleneck in your application?