The issue is resolved. When running in Spark, add the following options to the spark-submit command:
--conf spark.hadoop.io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec
--packages org.apache.hadoop:hadoop-aws:3.2.0
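For reference, here is a rough sketch of how the two options above might fit into a complete command. The script name `my_job.py` is a hypothetical placeholder, not something from my setup, and the command is only printed with `echo` so it can be checked before actually running it.

```shell
# Codec list from the --conf option above, built up piece by piece for readability
CODECS="org.apache.hadoop.io.compress.GzipCodec"
CODECS="${CODECS},org.apache.hadoop.io.compress.DefaultCodec"
CODECS="${CODECS},org.apache.hadoop.io.compress.BZip2Codec"
CODECS="${CODECS},org.apache.hadoop.io.compress.SnappyCodec"

# Package coordinate from the --packages option above
PACKAGES="org.apache.hadoop:hadoop-aws:3.2.0"

# my_job.py is a hypothetical placeholder for your own application
CMD="spark-submit --conf spark.hadoop.io.compression.codecs=${CODECS} --packages ${PACKAGES} my_job.py"

# Print the assembled command; run it directly (without echo) once it looks right
echo "$CMD"
```

The hadoop-aws version generally needs to match the Hadoop version your Spark build uses, so 3.2.0 may need adjusting in other environments.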