79403639

Date: 2025-01-31 18:04:19
Score: 1.5
Natty:

BigQuery supports batch-loading Avro files directly as long as the data blocks are compressed with a supported codec (Snappy, DEFLATE, or ZSTD). Since your files are gzip-compressed, a function that fetches the files and decompresses their contents is indeed the closest workaround, but the issue you encountered with the function might be due to network bandwidth and the maximum execution time, since the process involves decompressing a lot of files. As @somethingsomething mentioned, it would be helpful to post your code so we can take a closer look at what went wrong.
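As a rough illustration, here is a minimal sketch of a 2nd-gen Cloud Function that decompresses a gzipped Avro object and batch-loads the result into BigQuery. The staging bucket, table ID, and file suffix are placeholder assumptions, not values from your setup:

```python
# Sketch: GCS-triggered Cloud Function (2nd gen) that decompresses a
# .avro.gz object and batch-loads the plain Avro file into BigQuery.
import gzip

import functions_framework
from google.cloud import bigquery, storage

STAGING_BUCKET = "my-decompressed-staging"   # assumption: a bucket you own
BQ_TABLE = "my_project.my_dataset.my_table"  # assumption: target table


@functions_framework.cloud_event
def decompress_and_load(cloud_event):
    data = cloud_event.data
    src_bucket, src_name = data["bucket"], data["name"]
    if not src_name.endswith(".avro.gz"):
        return  # ignore objects this function doesn't handle

    gcs = storage.Client()
    # Download the gzipped Avro file and decompress it in memory.
    # For very large files, stream to a temp file instead to limit memory use.
    compressed = gcs.bucket(src_bucket).blob(src_name).download_as_bytes()
    avro_bytes = gzip.decompress(compressed)

    # Re-upload the plain Avro file to a staging bucket.
    dst_name = src_name[: -len(".gz")]
    gcs.bucket(STAGING_BUCKET).blob(dst_name).upload_from_string(avro_bytes)

    # Batch-load the decompressed Avro file into BigQuery.
    bq = bigquery.Client()
    job = bq.load_table_from_uri(
        f"gs://{STAGING_BUCKET}/{dst_name}",
        BQ_TABLE,
        job_config=bigquery.LoadJobConfig(
            source_format=bigquery.SourceFormat.AVRO,
        ),
    )
    job.result()  # wait for the load job to finish
```

Note that doing both the decompression and the load in one invocation is exactly where the bandwidth and timeout limits bite at your scale, which is why the Dataflow option below may be preferable.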

You can take a look at this thread about loading a jsonl.gz file from GCS into BigQuery using a Cloud Function.

However, given your scale (75 GB of files daily), Dataflow might be a better fit: there is a prebuilt template, Bulk Decompress Cloud Storage Files, that decompresses a batch of files on GCS.
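If you go that route, the template can be launched from Python via the Dataflow REST API; a hedged sketch follows, where the project, region, and GCS paths are placeholder assumptions you would replace:

```python
# Sketch: launch the Bulk Decompress Cloud Storage Files Dataflow template.
from googleapiclient.discovery import build

PROJECT = "my-project"   # assumption: your GCP project ID
REGION = "us-central1"   # assumption: your preferred region

dataflow = build("dataflow", "v1b3")
response = (
    dataflow.projects()
    .locations()
    .templates()
    .launch(
        projectId=PROJECT,
        location=REGION,
        gcsPath="gs://dataflow-templates/latest/Bulk_Decompress_GCS_Files",
        body={
            "jobName": "bulk-decompress-daily",
            "parameters": {
                "inputFilePattern": "gs://my-input-bucket/daily/*.avro.gz",
                "outputDirectory": "gs://my-staging-bucket/decompressed/",
                "outputFailureFile": "gs://my-staging-bucket/failed.csv",
            },
        },
    )
    .execute()
)
print(response["job"]["id"])  # ID of the launched Dataflow job
```

Once the template has written the decompressed Avro files to the output directory, a single BigQuery batch-load job over that directory finishes the pipeline.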

Reasons:
  • Long answer (-0.5):
  • No code block (0.5):
  • User mentioned (1): @somethingsomething
  • Low reputation (0.5):
Posted by: yannco