Hardware prefetching might explain this behavior. The prefetcher watches the memory access pattern and tries to predict the next location so it can fetch it ahead of time. That may be why the second case performs better: its access pattern is the same on every step (each access lands on a different page), which makes it easy to predict.
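One way to test this guess is to touch the same set of pages twice, once in a fixed-stride order and once in a shuffled order, and compare the timings. The sketch below is only an illustration under assumptions: the 4096-byte page size, the `PAGE_BYTES` constant, and all function names are hypothetical and not taken from the original benchmark.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Assumed page size; on a real system query it, e.g. with sysconf(_SC_PAGESIZE). */
#define PAGE_BYTES 4096u

/* Fill idx with 0, 1, ..., n-1: pages are visited with a constant stride. */
static void fill_sequential(size_t *idx, size_t n) {
    for (size_t i = 0; i < n; i++) {
        idx[i] = i;
    }
}

/* Fisher-Yates shuffle driven by a small xorshift generator:
   the same pages are visited, but in an order that is hard to predict. */
static void shuffle(size_t *idx, size_t n, uint32_t seed) {
    uint32_t s = seed ? seed : 1u;   /* xorshift state must be nonzero */
    for (size_t i = n; i > 1; i--) {
        s ^= s << 13;
        s ^= s >> 17;
        s ^= s << 5;
        size_t j = s % i;
        size_t tmp = idx[i - 1];
        idx[i - 1] = idx[j];
        idx[j] = tmp;
    }
}

/* Read the first byte of each page in the given order.
   The volatile qualifier keeps the compiler from dropping the loads. */
static uint64_t touch_pages(const volatile unsigned char *buf,
                            const size_t *idx, size_t n) {
    assert(buf != NULL && idx != NULL);
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += buf[idx[i] * PAGE_BYTES];
    }
    return sum;
}

/* Time one walk in CPU seconds; the checksum is returned through sum_out. */
static double time_walk(const unsigned char *buf, const size_t *idx,
                        size_t n, uint64_t *sum_out) {
    clock_t start = clock();
    *sum_out = touch_pages(buf, idx, n);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

To use it, allocate a buffer of `n * PAGE_BYTES` bytes that is much larger than the last-level cache, build one index array with `fill_sequential` and a second one with `fill_sequential` followed by `shuffle`, and call `time_walk` on each. If the fixed-stride walk is clearly faster, that fits the prefetching explanation, although other effects such as TLB misses can also contribute, so a difference does not prove it.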