In my case, when I added a chain of drop(column) calls on a Dataset, I noticed heavy GC pressure, which eventually caused the Spark application to fail with a timeout.
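For illustration, here is a minimal sketch of the pattern I mean and one alternative that may be worth trying. The column names, the local SparkSession, and the object name DropColumnsSketch are made up for the example. The assumption, which I have not measured here, is that each drop() call returns a new Dataset with its own plan, so passing all the names to a single drop(colNames: _*) call creates fewer intermediate objects on the driver.

```scala
import org.apache.spark.sql.SparkSession

object DropColumnsSketch {
  def main(args: Array[String]): Unit = {
    // Local session for the sketch only; a real job would use its own configuration.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("drop-columns-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical data and column names.
    val df = Seq((1, "a", 2.0, true)).toDF("id", "name", "score", "flag")

    // Chained form: every call produces a new Dataset wrapping a new plan.
    val chained = df.drop("name").drop("score").drop("flag")

    // Single call: drop accepts several column names at once (varargs).
    val toDrop = Seq("name", "score", "flag")
    val single = df.drop(toDrop: _*)

    // Both forms should leave the same columns behind.
    assert(chained.columns.sameElements(single.columns))

    spark.stop()
  }
}
```

Whether this actually reduces GC pressure depends on the Spark version and on how many columns are dropped, so it is worth checking the GC time in the Spark UI before and after the change.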