79481020

Date: 2025-03-03 12:56:28
Score: 2.5

Answering my own question (with help from the comments).

With the /O2 flag, MSVC was able to generate SSE instructions for the addition. Furthermore, the MSVC compiler unrolled the loop. With these two optimisations combined, the compiler-generated code outperformed my code by a bit (I was using AVX).

Here I want to give credit to the people who helped me in the comments section, @PeterCordes and @Homer512. Thank you both.

I will be reading this book for further study: "Modern X86 Assembly Language Programming: Covers x86 64-bit, AVX, AVX2, and AVX-512"

Reasons:
  • Blacklisted phrase (0.5): Thank you
  • RegEx Blacklisted phrase (1): I want
  • Long answer (-0.5):
  • Has code block (-0.5):
  • User mentioned (1): @PeterCordes
  • User mentioned (0): @Homer512
  • Self-answer (0.5):
  • Low reputation (0.5):
Posted by: Xiyang Liu