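To avoid this behaviour (as you said, data lineage works at the partition level), you can either checkpoint the data or persist it with a storage level that can spill to disk (e.g. MEMORY_AND_DISK). Note that persisting keeps the lineage, so a lost cached partition is still recomputed from it, whereas checkpointing writes the data to reliable storage and truncates the lineage.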