Although the plan shows A being hash-partitioned twice, once for each of the joined DataFrames AB and AC, that does not mean the tasks recompute the shuffle: under the hood they can reuse the already hashed partitions of A. Spark skips a stage when it finds that stage's output is already available, even if the stage appears in the plan. Can you check your DAG to see whether the stages are skipped, as shown below?