
After reading the comments from @dewaffled, I realized I was missing something. Reading the implementation again, I found that the wait itself sits inside a loop.

libcxx:atomic_sync.h
libcxx:poll_with_backoff.h

template <class _AtomicWaitable, class _Poll>
_LIBCPP_HIDE_FROM_ABI void __atomic_wait_unless(const _AtomicWaitable& __a, memory_order __order, _Poll&& __poll) {
  std::__libcpp_thread_poll_with_backoff(
      /* poll */
      [&]() {
        auto __current_val = __atomic_waitable_traits<__decay_t<_AtomicWaitable> >::__atomic_load(__a, __order);
        return __poll(__current_val);
      },
      /* backoff */ __spinning_backoff_policy());
}

template <class _Poll, class _Backoff>
_LIBCPP_HIDE_FROM_ABI bool __libcpp_thread_poll_with_backoff(_Poll&& __poll, _Backoff&& __backoff, chrono::nanoseconds __max_elapsed) {
  auto const __start = chrono::high_resolution_clock::now();
  for (int __count = 0;;) {
    if (__poll())
      return true; // __poll completion means success

    // code that checks whether the elapsed time has exceeded __max_elapsed ...
  }
}

The wait path (__atomic_wait_unless above) calls __libcpp_thread_poll_with_backoff with a poll function and a backoff policy, and that function loops, re-running the poll until it succeeds.
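To make the shape of that loop concrete, here is a minimal sketch of the same idea. It is not libc++'s code: `poll_with_backoff` and `wait_until_changed` are hypothetical names, and the yield/sleep backoff (64 yields, then 50 microsecond sleeps) is an assumption chosen only for illustration.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical helper, not libc++'s actual code: keeps polling until
// `poll` returns true or `max_elapsed` has passed.
template <class Poll>
bool poll_with_backoff(Poll&& poll, std::chrono::nanoseconds max_elapsed) {
  auto const start = std::chrono::steady_clock::now();
  for (int count = 0;; ++count) {
    if (poll())
      return true;   // the condition holds, so the wait is over
    if (std::chrono::steady_clock::now() - start >= max_elapsed)
      return false;  // timed out without the condition holding
    if (count < 64)
      std::this_thread::yield();  // cheap backoff for the first attempts
    else
      std::this_thread::sleep_for(std::chrono::microseconds(50));
  }
}

// Waits until `a` no longer holds `old`; returns false on timeout.
inline bool wait_until_changed(const std::atomic<int>& a, int old,
                               std::chrono::nanoseconds max_elapsed) {
  return poll_with_backoff(
      [&] { return a.load(std::memory_order_acquire) != old; }, max_elapsed);
}
```

The point is only that the value comparison lives inside the poll, so every trip around the loop re-reads the atomic before deciding whether to return.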

As @dewaffled mentioned, the same goes for libstdc++.

libstdc++:atomic_wait.h

template<typename _Tp, typename _Pred, typename _ValFn>
    void
    __atomic_wait_address(const _Tp* __addr, _Pred&& __pred, _ValFn&& __vfn,
              bool __bare_wait = false) noexcept
    {
      __detail::__wait_args __args{ __addr, __bare_wait };
      _Tp __val = __args._M_setup_wait(__addr, __vfn);
      while (!__pred(__val))
        {
          auto __res = __detail::__wait_impl(__addr, __args);
          __val = __args._M_setup_wait(__addr, __vfn, __res);
        }
      // C++26 will return __val
    }

So, judging from the implementations alone, atomic<T>::wait can be woken spuriously inside the implementation (as @Jarod42 mentioned), but it does not return to the caller until the value has actually changed.

To answer my own question:

It's an educated guess (as @Homer512 mentioned) that keeps spurious wakeups rare while using a reasonable amount of space.
Since the value itself is re-checked on every wakeup, sharing wait state across multiple addresses is fine.
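Why sharing is harmless can be shown with a toy version of such a table. Everything below is a hypothetical sketch: `WaitSlot`, `slot_for`, `wait_while_equal`, `notify_all_at`, the table size of 16 and the address hash are assumptions for illustration, and it uses a mutex and condition variable rather than whatever either library really does.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical wait slot; many atomics may map to the same one.
struct WaitSlot {
  std::mutex m;
  std::condition_variable cv;
};

constexpr std::size_t kSlots = 16;  // assumed size, for illustration only

inline WaitSlot& slot_for(const void* addr) {
  static WaitSlot table[kSlots];
  auto key = reinterpret_cast<std::uintptr_t>(addr);
  return table[(key >> 2) % kSlots];  // toy hash of the address
}

inline void wait_while_equal(const std::atomic<int>& a, int old) {
  WaitSlot& s = slot_for(&a);
  std::unique_lock<std::mutex> lock(s.m);
  // The predicate re-reads this atomic on every wakeup, so a notification
  // meant for another address in the same slot is harmless.
  s.cv.wait(lock, [&] { return a.load() != old; });
}

inline void notify_all_at(const std::atomic<int>& a) {
  WaitSlot& s = slot_for(&a);
  { std::lock_guard<std::mutex> lock(s.m); }  // order against a waiter's check
  s.cv.notify_all();
}

// Returns true if a waiter on `x` ignored a notification for `y` (same
// slot) and returned only after `x` itself changed.
inline bool shared_slot_demo() {
  // kSlots + 1 distinct addresses: by pigeonhole, two of them share a slot.
  std::atomic<int> pool[kSlots + 1];
  for (auto& p : pool) p.store(0);
  std::atomic<int>* x = nullptr;
  std::atomic<int>* y = nullptr;
  for (std::size_t i = 0; i <= kSlots && !x; ++i)
    for (std::size_t j = i + 1; j <= kSlots; ++j)
      if (&slot_for(&pool[i]) == &slot_for(&pool[j])) {
        x = &pool[i];
        y = &pool[j];
        break;
      }

  std::atomic<bool> returned{false};
  std::thread waiter([&] {
    wait_while_equal(*x, 0);
    returned.store(true);
  });

  y->store(1);
  notify_all_at(*y);  // wakes the shared slot, but x is still 0
  std::this_thread::sleep_for(std::chrono::milliseconds(20));
  bool const stayed_blocked = !returned.load();

  x->store(1);
  notify_all_at(*x);
  waiter.join();
  return stayed_blocked && returned.load();
}
```

A collision in the table costs an extra wakeup and an extra comparison, never a wrong return, which is why a small fixed table is a reasonable trade-off.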
