The issue wasn't with the query; it was with how I interpreted the row count in the output pane. The pane showed 6,092 records because notebook cell output is limited (see "Known limitations Databricks notebooks"). When I download the results from the output pane showing 6,092 rows, I get the complete result set of 971,198 records. Mystery solved. Hope this helps someone.
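
For anyone who wants to double-check this without downloading the results, here is a minimal sketch that compares the number of rows shown in the output pane with the actual row count. It assumes the query result is held in a Spark DataFrame, whose `count()` method counts every row rather than only the displayed preview. The helper name `describe_truncation` and the table name in the usage comment are made up for illustration.

```python
def describe_truncation(df, displayed_rows):
    """Compare the rows shown in the output pane with the DataFrame's real size.

    `df` is assumed to be a Spark DataFrame (any object with a .count() method works).
    `displayed_rows` is the number read off the notebook output pane.
    """
    # count() runs over the full result set, not the truncated preview
    total_rows = df.count()
    return {
        "displayed_rows": displayed_rows,
        "total_rows": total_rows,
        "truncated": displayed_rows < total_rows,
    }


# Usage in a notebook cell (hypothetical table name; `spark` is the session
# object that Databricks notebooks provide):
# df = spark.sql("SELECT * FROM my_table")
# describe_truncation(df, displayed_rows=6092)
```

If `truncated` comes back `True`, the pane is only showing part of the result, and the query itself is likely fine.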