79319950

Date: 2024-12-31 12:24:12
Score: 1
Natty:
Report link

First of all, you need to narrow down the root cause:

  1. Does the download speed (throughput) change over time, and is it lower than usual for the problematic sites/pages? Try downloading test data samples in parallel from another domain, and add a Mbit/s metric to monitor the target sites. That will help you find out whether the target sites are limiting you. If they are, try spreading the data flow across unique IPs. The basic principles can be found here: datascrape.tech/blog/pyramid-of-efficient-scraping

  2. Double-check the size of the pages/URLs queue. You might be hitting local peaks in the number of queued URLs, and that, rather than the network throughput, could be causing the problem.

  3. Double-check that you haven't reached your VPC's throughput maximum. I haven't checked this recently, but throughput (network speed) usually depends on the disk size and is limited by default for VPCs.
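
To make the first check concrete, here is a rough sketch of the Mbit/s comparison: it downloads from the target site and from an unrelated domain in parallel and reports the throughput of each. The URLs are placeholders and the function names (`mbit_per_s`, `fetch_bytes`, `measure`, `compare`) are made up for this example, so treat it as a starting point rather than a finished monitor.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def mbit_per_s(num_bytes, seconds):
    """Convert a byte count and a duration into megabits per second."""
    if seconds <= 0:
        return 0.0
    return num_bytes * 8 / 1_000_000 / seconds


def fetch_bytes(url, timeout=30):
    """Download a URL and return the number of body bytes received."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return len(resp.read())


def measure(url, fetch=fetch_bytes):
    """Time one download and return (url, throughput in Mbit/s)."""
    start = time.perf_counter()
    size = fetch(url)
    elapsed = time.perf_counter() - start
    return url, mbit_per_s(size, elapsed)


def compare(urls, fetch=fetch_bytes):
    """Measure all URLs in parallel and return {url: Mbit/s}."""
    with ThreadPoolExecutor(max_workers=len(urls)) as pool:
        return dict(pool.map(lambda u: measure(u, fetch), urls))


if __name__ == "__main__":
    # Placeholder URLs (assumptions): replace with a page from the target
    # site and a test file hosted on an unrelated domain.
    results = compare([
        "https://target.example/page",
        "https://reference.example/sample.bin",
    ])
    for url, speed in results.items():
        print(f"{url}: {speed:.2f} Mbit/s")
```

If the reference domain stays fast while the target drops over time, the target is probably limiting you. If both drop together, look at your own side first (queue size, instance limits).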
Reasons:
  • Long answer (-0.5):
  • No code block (0.5):
  • Contains question mark (0.5):
  • Low reputation (0.5):
Posted by: greggyNapalm