For those new to Delta tables: the data has been stored as Parquet files since the format's inception. Existing Parquet files are never modified in place; instead, every change is recorded in the Delta log written on top of them. In other words, if you update the table constantly, the log keeps building up, and eventually the table takes considerable time to process.
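To make the build-up concrete, here is a minimal sketch in plain Python of why a longer log means more work for a reader. It is a simplified toy model, not Delta's actual implementation: the function `replay_log`, the commit layout, and the file names are all hypothetical, and real Delta logs carry more action types and use checkpoints to shorten the replay.

```python
def replay_log(commits):
    """Return the set of live data files after applying every commit in order.

    Each commit is a list of (action, path) pairs, where action is
    "add" or "remove". This layout is an assumption made for illustration.
    """
    live = set()
    for commit in commits:
        for action, path in commit:
            if action == "add":
                live.add(path)
            elif action == "remove":
                live.discard(path)
    return live


# Hypothetical log: two appends, then an update that rewrites one file.
commits = [
    [("add", "part-000.parquet")],
    [("add", "part-001.parquet")],
    [("remove", "part-000.parquet"), ("add", "part-002.parquet")],
]

live_files = replay_log(commits)
print(sorted(live_files))
```

The point of the sketch is that the reader has to walk every commit before it knows which files make up the table, so the cost grows with the number of log entries rather than with the amount of data that actually changed.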
# Delete every row in the target table whose (pk1, pk2) matches a row in source_df
delta_table.alias('t1').merge(
    source_df.alias('t2'),
    "t1.pk1 = t2.pk1 AND t1.pk2 = t2.pk2"
).whenMatchedDelete().execute()
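If the merge condition is hard to picture, the same matched-delete logic can be sketched in plain Python on small lists of dictionaries. This is only an illustration of the semantics, not how Delta executes it; `delete_matched` and the sample rows are hypothetical, and the sketch ignores SQL NULL handling (in SQL, a NULL key never matches, whereas `None == None` is true in Python).

```python
def delete_matched(target_rows, source_rows, keys=("pk1", "pk2")):
    """Return the target rows whose composite key does not appear in the source."""
    source_keys = {tuple(row[k] for k in keys) for row in source_rows}
    return [
        row for row in target_rows
        if tuple(row[k] for k in keys) not in source_keys
    ]


# Hypothetical sample data standing in for the Delta table and source_df.
target = [
    {"pk1": 1, "pk2": "a", "value": 10},
    {"pk1": 1, "pk2": "b", "value": 20},
    {"pk1": 2, "pk2": "a", "value": 30},
]
source = [
    {"pk1": 1, "pk2": "b"},  # matches the second target row
    {"pk1": 3, "pk2": "z"},  # matches nothing, so it has no effect
]

remaining = delete_matched(target, source)
print(remaining)
```

Both keys have to match for a row to be deleted: the row with `pk1 = 1, pk2 = "a"` survives even though a source row shares its `pk1`.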