79424335

Date: 2025-02-09 03:16:17
Score: 1
Natty:
Report link

Can you try explicitly setting an engine when reading the S3 path into a DataFrame?

The underlying engine could be the issue, though I'm not sure.

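import os
import pandas as pd

# bucket_and_prefix is assumed to be defined already, as in the question
df = pd.read_parquet(
    f"s3a://{bucket_and_prefix}",
    engine="fastparquet",  # set explicitly instead of relying on the default engine
    storage_options={
        "key": os.getenv("AWS_ACCESS_KEY_ID"),
        "secret": os.getenv("AWS_SECRET_ACCESS_KEY"),
        "client_kwargs": {
            "verify": os.getenv("AWS_CA_BUNDLE"),
            "endpoint_url": "https://prd-data.company.com/",
        },
    },
)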

Alternatively, switching between the fastparquet and pyarrow engines might help. Please let me know if you find a fix for this.
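
If it helps, the engine switch could be automated with something like the sketch below. The helper name read_parquet_with_fallback and its reader parameter are my own (hypothetical, not part of pandas); it just calls pd.read_parquet once per engine and keeps the first result that loads. pandas is imported inside the function only so the helper can be tried with a stand-in reader.

```python
def read_parquet_with_fallback(path, engines=("pyarrow", "fastparquet"),
                               reader=None, **kwargs):
    """Try each parquet engine in order.

    Returns (engine_name, result) for the first engine that succeeds.
    Hypothetical helper for illustration, not a pandas API.
    """
    if reader is None:
        # Default to pandas; imported lazily so a stand-in reader can be passed.
        import pandas as pd
        reader = pd.read_parquet

    errors = {}
    for engine in engines:
        try:
            return engine, reader(path, engine=engine, **kwargs)
        except Exception as exc:  # broad on purpose: engines raise different error types
            errors[engine] = exc

    # Every engine failed: report each error so the real cause is visible.
    details = "; ".join(f"{name}: {exc!r}" for name, exc in errors.items())
    raise RuntimeError(f"all engines failed for {path}: {details}")
```

You would call it with the same path and storage_options as above, e.g. read_parquet_with_fallback(f"s3a://{bucket_and_prefix}", storage_options=...). If both engines fail with the same error, the engine is probably not the problem and the storage options or endpoint are worth checking next.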

Reasons:
  • Whitelisted phrase (-2): can you try
  • RegEx Blacklisted phrase (2.5): Please let me know
  • Long answer (-0.5):
  • Has code block (-0.5):
  • Starts with a question (0.5): can you
  • Low reputation (1):
Posted by: Rahul Yadav