Extending Mark's answer: AWS has also added support for a bucket.s3.aws-region.amazonaws.com/key1/key2 URL pattern (a "." instead of a "-" after s3).
This is also the Object URL shown on the file's page in the S3 bucket.
So the 2nd regex pattern can be updated with [-.] to capture a single "-" or "." character, allowing it to match both bucket.s3-aws-region.amazonaws.com/key1/key2 and bucket.s3.aws-region.amazonaws.com/key1/key2.
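# Goes inside a function (note the return); requires `import re`.
# Literal dots are escaped so that "." does not match arbitrary characters.
match = re.search(r'^https?://(.+)\.s3[-.]([^.]+)\.amazonaws\.com/', url)
if match:
    return match.group(1), match.group(2)  # (bucket, region)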
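For anyone who wants to try the updated pattern on its own, here is a minimal, self-contained sketch. The function name parse_virtual_hosted_url and the us-west-2 region in the sample URLs are my own placeholders, not taken from Mark's answer, and it covers only this one pattern rather than his full function.

```python
import re

# Pattern from this answer with literal dots escaped.
# Group 1 is the bucket, group 2 is the region.
_S3_VHOST_RE = re.compile(r'^https?://(.+)\.s3[-.]([^.]+)\.amazonaws\.com/')


def parse_virtual_hosted_url(url):
    """Return (bucket, region) for a virtual-hosted-style S3 URL, else None.

    Hypothetical helper name, used for illustration only.
    """
    match = _S3_VHOST_RE.search(url)
    if match:
        return match.group(1), match.group(2)
    return None


if __name__ == '__main__':
    for url in (
        'https://bucket.s3-us-west-2.amazonaws.com/key1/key2',  # legacy "-" form
        'https://bucket.s3.us-west-2.amazonaws.com/key1/key2',  # newer "." form
    ):
        print(parse_virtual_hosted_url(url))
```

Both sample URLs give the same bucket and region. Note that a URL without a region, such as https://bucket.s3.amazonaws.com/key1, is not matched by this pattern and returns None here.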
Reference: https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html#VirtualHostingBackwardsCompatibility notes that bucket.s3-aws-region.amazonaws.com (a "-" instead of a "." separating s3 and aws-region) is the legacy endpoint, and recommends using the new "." pattern.
(P.S. I don't have enough reputation to comment on Mark's answer, so I have posted this as a new answer.)