79491270

Date: 2025-03-07 04:38:02
Score: 1

I faced the same issue and was able to install third-party packages as follows:

BATCH_CONFIG = {
    "pyspark_batch": {
        "main_python_file_uri": f"{BUCKET}/python/latest/{JOB}",
        "python_file_uris": [f"{BUCKET}/python/latest/local_lib/requests-2.32.3-py3-none-any.whl"],
        "args": ["gs://pub/shakespeare/rose.txt", f"{BUCKET}/sample-output-data"]
    },
    "environment_config": {
        "execution_config": {
            "network_uri": f"projects/{PROJECT_ID}/global/networks/main-vpc-prd",
            "subnetwork_uri": f"https://www.googleapis.com/compute/v1/projects/{PROJECT_ID}/regions/{REGION}/subnetworks/data-prd",
            "service_account": IMPERSONATION_CHAIN,
        }
    }
}

I downloaded the Python wheel (.whl) file for the library I wanted to use, uploaded it to the bucket, and added its URI as an item in the python_file_uris array.

Note: With this approach you can include as many packages as you want, one wheel URI per package.
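
For several packages, the python_file_uris list can be built from the wheel file names instead of being typed out by hand. This is only a minimal sketch, assuming every wheel has already been uploaded under the same local_lib prefix used in the config above. The helper build_python_file_uris, the bucket name, and the second wheel name are hypothetical and used for illustration only.

```python
def build_python_file_uris(bucket, wheel_names, prefix="python/latest/local_lib"):
    """Return one URI per wheel file stored under bucket/prefix."""
    for name in wheel_names:
        # Catch obvious mistakes early: only wheel files are expected here.
        if not name.endswith(".whl"):
            raise ValueError(f"not a wheel file: {name}")
    base = f"{bucket.rstrip('/')}/{prefix.strip('/')}"
    return [f"{base}/{name}" for name in wheel_names]


# Hypothetical values; replace them with your own bucket and wheels.
BUCKET = "gs://example-bucket"
WHEELS = [
    "requests-2.32.3-py3-none-any.whl",
    "another_package-1.0.0-py3-none-any.whl",  # hypothetical second package
]

PYTHON_FILE_URIS = build_python_file_uris(BUCKET, WHEELS)
```

The resulting list can then be used as the value of "python_file_uris" in BATCH_CONFIG.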

Sources:
-> Requests whl file: https://pypi.org/project/requests/#files

Posted by: Juan Madrigal