79554805

Date: 2025-04-04 08:17:40
Score: 1.5
Natty:
Report link

Are you using saveAsTable to write your data? If so, you can avoid the issue by setting

spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

and saveAsTable should then no longer delete existing data in the table.
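
For illustration, a minimal sketch of what this could look like in PySpark (the table name sales and the partition column date are hypothetical, and new_data stands in for whatever DataFrame you are writing):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Only overwrite the partitions present in the new data,
    # leaving all other partitions of the table untouched.
    spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

    (new_data
        .write
        .mode("overwrite")
        .partitionBy("date")       # hypothetical partition column
        .format("parquet")
        .saveAsTable("sales"))     # hypothetical table name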

Not sure if this applies to your use case, but I would encourage you to avoid saveAsTable where possible and instead write the data out in a file format, then handle the table operations with something else. For example, on AWS you can write the data as Parquet to S3 and then use a Lambda function or a Glue crawler to register the changes in your Glue tables. saveAsTable can also be noticeably slower and can have these kinds of side effects.
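
If you go the file-based route, a rough sketch could look like the following (the bucket, prefix, and partition column are placeholders); the Glue registration would then happen outside of Spark, e.g. via a crawler or a Lambda triggered by the S3 write:

    # Write the data as Parquet directly to S3 instead of using saveAsTable.
    (new_data
        .write
        .mode("overwrite")
        .partitionBy("date")                           # placeholder partition column
        .parquet("s3://my-bucket/warehouse/sales/"))   # placeholder S3 path

    # A Glue crawler or a Lambda function can then pick up the new files
    # and register/refresh the corresponding Glue table partitions.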

Reasons:
  • Long answer (-0.5):
  • No code block (0.5):
  • Contains question mark (0.5):
  • Low reputation (1):
Posted by: Vilhomaa