This is a simplistic example, but imagine you need 100 nodes of some application to serve traffic. To still have 100 running after one of two AZs goes down, you would need to run 200 nodes, 100 per AZ. If you spread the application over 3 AZs, you would only need 50 in each AZ, for a total of 150, to still have 100 when one AZ goes down.
As you add more AZs to this architecture, you still benefit, but with diminishing returns.
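The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `nodes_needed` is made up for this example, and it assumes nodes are spread evenly across AZs and that only one AZ fails at a time.

```python
def nodes_needed(required: int, az_count: int) -> tuple[int, int]:
    """Return (nodes per AZ, total nodes) so that `required` nodes
    are still running after any single AZ is lost.

    Assumes an even spread across AZs and a single-AZ failure.
    """
    if az_count < 2:
        raise ValueError("need at least 2 AZs to tolerate losing one")
    surviving_azs = az_count - 1
    # Integer ceiling division: the surviving AZs must carry the full load.
    per_az = (required + surviving_azs - 1) // surviving_azs
    return per_az, per_az * az_count


for azs in range(2, 7):
    per_az, total = nodes_needed(100, azs)
    print(f"{azs} AZs: {per_az} per AZ, {total} total")
```

For 100 required nodes this gives totals of 200, 150, 136, 125 and 120 for 2 through 6 AZs. Going from 2 to 3 AZs saves 50 nodes, but going from 5 to 6 saves only 5, which is the diminishing return.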
This article explains the idea in more detail, and references Chapter 2 of Architecting for Scale, 2nd Edition by Lee Atchison: