After reading the comments from @dewaffled, I realized I was missing something. Reading the implementation again, I found that the wait itself sits inside a loop.
libcxx:atomic_sync.h
libcxx:poll_with_backoff.h
template <class _AtomicWaitable, class _Poll>
_LIBCPP_HIDE_FROM_ABI void __atomic_wait_unless(const _AtomicWaitable& __a, memory_order __order, _Poll&& __poll) {
  std::__libcpp_thread_poll_with_backoff(
      /* poll */
      [&]() {
        auto __current_val = __atomic_waitable_traits<__decay_t<_AtomicWaitable> >::__atomic_load(__a, __order);
        return __poll(__current_val);
      },
      /* backoff */ __spinning_backoff_policy());
}
template <class _Poll, class _Backoff>
_LIBCPP_HIDE_FROM_ABI bool __libcpp_thread_poll_with_backoff(_Poll&& __poll, _Backoff&& __backoff, chrono::nanoseconds __max_elapsed) {
  auto const __start = chrono::high_resolution_clock::now();
  for (int __count = 0;;) {
    if (__poll())
      return true; // __poll completion means success
    // code that checks if the time has exceeded the max_elapsed ...
  }
}
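__atomic_wait ends up calling __libcpp_thread_poll_with_backoff (through __atomic_wait_unless, shown above) with a poll function and a backoff policy, and that loop is what does the spinning: it only returns true once the poll function reports success.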
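To make the shape of that loop easier to see, here is a minimal sketch of a poll-with-backoff helper. It is not the libc++ code: `poll_with_backoff`, its parameters, and the "zero means no timeout" convention are assumptions made for illustration, and the backoff callable here returns nothing, which is a simplification.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical, simplified stand-in for a poll-with-backoff loop.
// Returns true once poll() succeeds, false if max_elapsed passes first.
// Assumption: a zero max_elapsed means "no timeout".
template <class Poll, class Backoff>
bool poll_with_backoff(Poll&& poll, Backoff&& backoff,
                       std::chrono::nanoseconds max_elapsed = std::chrono::nanoseconds::zero()) {
    assert(max_elapsed >= std::chrono::nanoseconds::zero());
    auto const start = std::chrono::steady_clock::now();
    for (;;) {
        if (poll())
            return true;  // the only successful exit: the condition holds
        auto const elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(
            std::chrono::steady_clock::now() - start);
        if (max_elapsed != std::chrono::nanoseconds::zero() && elapsed > max_elapsed)
            return false;  // timed out without the condition becoming true
        backoff(elapsed);  // spin, yield, or block; it may return at any time
    }
}
```

However early the backoff step returns, the caller only gets control back after `poll()` has been re-evaluated and returned true (or the timeout has passed).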
As @dewaffled mentioned, the same goes for libstdc++.
template<typename _Tp, typename _Pred, typename _ValFn>
  void
  __atomic_wait_address(const _Tp* __addr, _Pred&& __pred, _ValFn&& __vfn,
                        bool __bare_wait = false) noexcept
  {
    __detail::__wait_args __args{ __addr, __bare_wait };
    _Tp __val = __args._M_setup_wait(__addr, __vfn);
    while (!__pred(__val))
      {
        auto __res = __detail::__wait_impl(__addr, __args);
        __val = __args._M_setup_wait(__addr, __vfn, __res);
      }
    // C++26 will return __val
  }
So, judging by the implementations alone, the underlying wait inside atomic<T>::wait can wake up spuriously (as @Jarod42 mentioned), but the function re-checks the value after every wake-up and does not return to the caller until it observes a value different from the old one.
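The effect can be modelled with a small sketch. `wait_while_equal` and its `block` parameter are hypothetical names, not part of any standard library; `block` stands in for the platform wait, which is allowed to return without the value having changed.

```cpp
#include <atomic>

// Hypothetical model of a wait that absorbs spurious wake-ups.
// `block` represents the underlying blocking primitive and may return
// even though the atomic still holds `old`.
template <class T, class Block>
T wait_while_equal(const std::atomic<T>& a, T old, Block&& block) {
    T current = a.load(std::memory_order_acquire);
    while (current == old) {
        block();                                    // may wake up spuriously
        current = a.load(std::memory_order_acquire); // re-check before returning
    }
    return current;  // always differs from `old` at this point
}
```

If `block` returns twice without any store and the value changes only on the third wake-up, the caller still sees a single return, carrying the new value.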
To answer my question,