79437858

Date: 2025-02-13 22:19:24
Score: 1
Natty:
Report link

My analogy is this: BigQuery is like a library, and clustering is like shelving books by genre. If a lot of books (rows) have no genre (NULL), they all end up together on one big shelf of unclassified books. To find anything among the books with no genre, BigQuery has to go through that whole unclassified shelf and read a lot of books it doesn't need, so it scans more data than necessary and the query gets slower. Perhaps you can pre-filter the NULL rows before clustering so that the NULL cluster goes away, or try putting 'E' later in the clustering order. Otherwise, if 'E' is not frequently needed, remove it from the clustering columns if possible.
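
To make the shelf analogy concrete, here is a toy sketch in plain Python. It is not how BigQuery stores data internally. It assumes rows are sorted by the clustering key with NULL first and then cut into fixed-size blocks, and it counts how many blocks contain a given key. The names `build_blocks` and `blocks_scanned`, the block size, and the sample data are all made up for illustration.

```python
from typing import List, Optional

Key = Optional[str]


def build_blocks(keys: List[Key], block_size: int) -> List[List[Key]]:
    """Sort keys (NULL/None first) and cut them into fixed-size blocks."""
    # Assumption: NULLs sort before every real value, so they sit together.
    ordered = sorted(keys, key=lambda k: (k is not None, k or ""))
    return [ordered[i:i + block_size] for i in range(0, len(ordered), block_size)]


def blocks_scanned(blocks: List[List[Key]], wanted: Key) -> int:
    """Count the blocks that hold at least one row with the wanted key."""
    return sum(1 for block in blocks if wanted in block)


# Hypothetical table: 70% of the rows have no genre.
keys: List[Key] = [None] * 700 + ["fiction"] * 100 + ["history"] * 100 + ["science"] * 100

blocks = build_blocks(keys, block_size=100)
null_blocks = blocks_scanned(blocks, None)          # the big unclassified shelf
fiction_blocks = blocks_scanned(blocks, "fiction")  # one small genre shelf

# Pre-filter the NULL rows, then cluster what is left.
filtered_blocks = build_blocks([k for k in keys if k is not None], block_size=100)

print(len(blocks), null_blocks, fiction_blocks, len(filtered_blocks))
```

In this toy setup the NULL lookup touches seven of the ten blocks while the 'fiction' lookup touches one, and dropping the NULL rows first shrinks the table to three blocks. Real tables will behave differently in the details, but the shape of the problem is the same.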

Reasons:
  • Long answer (-0.5):
  • No code block (0.5):
  • Single line (0.5):
  • Low reputation (0.5):
Posted by: marky