79362983

Date: 2025-01-16 20:29:15
Score: 0.5
Natty:

I think the issue isn’t about where the processing happens, but how the data is handled after processing. When you run client.query(query).to_dataframe(), BigQuery executes the query server-side, but to_dataframe() then transfers the entire result set over the network into your Colab instance’s memory. That transfer is most likely where the bottleneck occurs.
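
For illustration, here is a minimal sketch of that pattern (the project ID and table name are placeholders I've made up): the query itself runs entirely inside BigQuery, but the final call downloads every row locally.

    # Minimal sketch of the slow pattern; project and table names are placeholders.
    from google.cloud import bigquery

    client = bigquery.Client(project="my-project")  # hypothetical project ID

    query = "SELECT * FROM `my-project.my_dataset.big_table`"  # placeholder table
    # The query runs in BigQuery; to_dataframe() then downloads every
    # result row into Colab's memory, which is the slow part.
    df = client.query(query).to_dataframe()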

What you can try is to perform as much of the processing as possible within BigQuery itself. Instead of pulling the entire result set into a DataFrame, export the results to a destination like Cloud Storage. From there, you can process the data in smaller chunks or use tools designed for large datasets within your Colab environment, as sketched below.
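
As a rough sketch of that approach (the bucket, dataset, and table names below are hypothetical), you could materialize the query into a destination table so the results stay in BigQuery, export that table to Cloud Storage as sharded Parquet files, and then iterate over the results in chunks rather than loading one giant DataFrame:

    from google.cloud import bigquery

    client = bigquery.Client(project="my-project")  # hypothetical project ID

    # 1. Write the query results to a destination table; they stay in BigQuery.
    dest = bigquery.TableReference.from_string("my-project.my_dataset.query_results")
    job_config = bigquery.QueryJobConfig(
        destination=dest, write_disposition="WRITE_TRUNCATE"
    )
    client.query(
        "SELECT * FROM `my-project.my_dataset.big_table`", job_config=job_config
    ).result()  # blocks until the query finishes; nothing is downloaded yet

    # 2. Export the result table to Cloud Storage; the wildcard yields shards.
    extract_config = bigquery.ExtractJobConfig(
        destination_format=bigquery.DestinationFormat.PARQUET
    )
    client.extract_table(
        dest, "gs://my-bucket/results/shard-*.parquet", job_config=extract_config
    ).result()

    # 3. Alternatively, stream the result table into Colab one chunk at a time
    #    instead of materializing a single huge DataFrame.
    for chunk in client.list_rows(dest).to_dataframe_iterable():
        print(f"processing {len(chunk)} rows")  # stand-in for real per-chunk work

With the export route, each Parquet shard can then be read individually in Colab (for example with pandas plus gcsfs), so peak memory use is bounded by the shard size rather than the full result set.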

Reasons:
  • Long answer (-0.5):
  • No code block (0.5):
  • Low reputation (0.5):
Posted by: jggp1094