I tried optimizing the time zone conversion by saving the offset between UTC and local time at startup of my program (which is good enough for my use). This seems to be very fast (as expected).
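For reference, a minimal sketch of the fixed-offset idea, not my exact code. The helper names (compute_utc_offset, cached_utc_offset, format_with_offset) are made up for illustration, and it formats with strftime rather than std::format or put_time so that it builds without C++20. It assumes system_clock counts from the Unix epoch, which is guaranteed only since C++20 but holds on the common implementations.

```cpp
#include <chrono>
#include <ctime>
#include <string>

// Hypothetical helper: measure the current offset of local time from UTC.
std::chrono::seconds compute_utc_offset() {
    const std::time_t now = std::time(nullptr);
    std::tm local_tm = *std::localtime(&now);  // copy before the buffer is reused
    std::tm utc_tm = *std::gmtime(&now);
    // Give both structs the same DST flag so that mktime interprets them
    // consistently; their difference is then the current UTC offset.
    utc_tm.tm_isdst = local_tm.tm_isdst;
    const double diff = std::difftime(std::mktime(&local_tm), std::mktime(&utc_tm));
    return std::chrono::seconds(static_cast<long long>(diff));
}

// Hypothetical helper: compute the offset once (on first use) and reuse it.
// The cached value goes stale if a DST change happens while the program runs.
std::chrono::seconds cached_utc_offset() {
    static const std::chrono::seconds offset = compute_utc_offset();
    return offset;
}

// Hypothetical helper: shift a UTC time point by a fixed offset and format it.
// gmtime is used only to split the shifted value into calendar fields, so no
// time zone lookup happens per call. Note that std::gmtime is not thread-safe.
std::string format_with_offset(std::chrono::system_clock::time_point utc,
                               std::chrono::seconds offset) {
    using clock = std::chrono::system_clock;
    const std::time_t shifted = clock::to_time_t(
        utc + std::chrono::duration_cast<clock::duration>(offset));
    const std::tm* parts = std::gmtime(&shifted);
    if (parts == nullptr) {
        return std::string();
    }
    char buf[32];
    const std::size_t n = std::strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S", parts);
    return std::string(buf, n);
}
```

Typical use would be format_with_offset(std::chrono::system_clock::now(), cached_utc_offset()). The cost of the time zone lookup is paid once, at the price of ignoring any DST switch that occurs after the offset is cached.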
Unfortunately, the Microsoft compiler/runtime library does not seem to have a fast implementation of std::format: it is consistently slower than put_time (at least twice the cost).
I did a little experiment in QuickBench (here if anyone is interested). There, the fixed offset + std::format version is a bit faster. Unfortunately (for me), this cannot be replicated in Visual Studio, where std::format is too slow to compete.
I think I will have to stick with the current implementation using put_time :(
But thanks for all your input!