79554805

Date: 2025-04-04 08:17:40
Score: 1.5
Natty:
Report link

Are you using saveAsTable to write your data? If so, you can avoid the issue by setting

spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

and saveAsTable should then no longer delete existing data in the table.
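
For illustration, a minimal sketch of what this could look like in PySpark (the table name sales and the partition column date are hypothetical, and new_data stands in for whatever DataFrame you are writing):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Only overwrite the partitions present in the new data,
    # leaving all other partitions of the table untouched.
    spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

    (new_data
        .write
        .mode("overwrite")
        .partitionBy("date")       # hypothetical partition column
        .format("parquet")
        .saveAsTable("sales"))     # hypothetical table name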

Not sure if this applies to your use case, but I would encourage you to avoid saveAsTable where possible and instead write the data out in a file format, then handle the table operations with something else. For example, on AWS you can write the data as Parquet to S3 and then use a Lambda function or a Glue crawler to register the changes in your Glue tables. saveAsTable can also be noticeably slower and can have these kinds of side effects.
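
If you go the file-based route, a rough sketch could look like the following (the bucket, prefix, and partition column are placeholders); the Glue registration would then happen outside of Spark, e.g. via a crawler or a Lambda triggered by the S3 write:

    # Write the data as Parquet directly to S3 instead of using saveAsTable.
    (new_data
        .write
        .mode("overwrite")
        .partitionBy("date")                           # placeholder partition column
        .parquet("s3://my-bucket/warehouse/sales/"))   # placeholder S3 path

    # A Glue crawler or a Lambda function can then pick up the new files
    # and register/refresh the corresponding Glue table partitions.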

Reasons:
  • Long answer (-0.5):
  • No code block (0.5):
  • Contains question mark (0.5):
  • Low reputation (1):
Posted by: Vilhomaa