79293307

Date: 2024-12-19 06:41:31
Score: 1

For those new to Delta tables: a Delta table has been stored as Parquet files from the start. Those base files are not modified in place; instead, every change is recorded as a new entry in the Delta log written on top of the Parquet files. In other words, if the table is updated constantly, the log entries build up and the table eventually takes considerable time to process.
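To make the build-up concrete, here is a minimal toy model in plain Python, not Delta Lake's actual implementation. It assumes each commit is a list of `add`/`remove` file actions and that a reader replays every commit to find the files that are still live. The `live_files` helper and the file names are hypothetical, and real tables also use checkpoints to shorten the replay.

```python
# Toy model only: real Delta commits are JSON files in the table's log
# directory, and checkpoints shorten the replay. All file names are made up.

def live_files(log):
    """Replay add/remove actions in commit order and return the live files."""
    live = set()
    for commit in log:
        for action, path in commit:
            if action == "add":
                live.add(path)
            elif action == "remove":
                live.discard(path)
    return live


# Commit 0 writes the initial file; each later commit replaces one file.
log = [[("add", "part-0000.parquet")]]
for version in range(1, 6):
    log.append([
        ("remove", f"part-{version - 1:04d}.parquet"),
        ("add", f"part-{version:04d}.parquet"),
    ])

print(len(log))                 # commits a reader has to replay
print(sorted(live_files(log)))  # files that are still part of the table
```

In this sketch the work a reader does grows with the number of commits, even though only one file is live at the end, which is the effect described above.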

# Delete every row in the Delta table whose composite key (pk1, pk2) matches a row in source_df.
delta_table.alias('t1').merge(source_df.alias('t2'), "t1.pk1 = t2.pk1 AND t1.pk2 = t2.pk2").whenMatchedDelete().execute()
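For readers who want to see what the merge with `whenMatchedDelete()` does to the rows, the following is a rough plain-Python sketch of the same matching logic on in-memory data. It does not use Spark or Delta Lake. The sample rows and the `value` column are invented for illustration, and SQL NULL handling is ignored (in SQL, a NULL key never matches).

```python
# Hypothetical sample data standing in for the Delta table and source_df.
target = [
    {"pk1": 1, "pk2": "a", "value": 10},
    {"pk1": 1, "pk2": "b", "value": 20},
    {"pk1": 2, "pk2": "a", "value": 30},
]
source = [
    {"pk1": 1, "pk2": "a"},
    {"pk1": 3, "pk2": "z"},  # no matching target row, so nothing happens
]

# Composite keys present in the source, mirroring
# "t1.pk1 = t2.pk1 AND t1.pk2 = t2.pk2".
source_keys = {(row["pk1"], row["pk2"]) for row in source}

# Matched rows are deleted; unmatched target rows are left as they are.
remaining = [row for row in target if (row["pk1"], row["pk2"]) not in source_keys]

for row in remaining:
    print(row)
```

Both key columns have to match for a row to be deleted, so the row with `pk1 = 1, pk2 = "b"` survives here.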

Reasons:
  • Has code block (-0.5):
  • Self-answer (0.5):
  • Low reputation (1):
Posted by: Dan Wang