What is the 429 status code?

  • Author
    by Josselin Liebe
    2 years ago
  • HTTP status code 429 (Too Many Requests) indicates that the client has sent too many requests in a given amount of time. In web scraping, this usually happens when scraping too quickly.

    One way to avoid status code 429 is to slow down our connections using rate limiting. This is especially common with high-throughput asynchronous scrapers, such as those built on Python's 🐍 asyncio or Scrapy.
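As a rough illustration of rate limiting in an asyncio scraper, the sketch below spaces requests out so that no more than a chosen number start per second. The names `RateLimiter`, `fetch_all` and `requests_per_second` are hypothetical, and `fetch` stands for whatever async download function the scraper already uses.

```python
import asyncio
import time


class RateLimiter:
    """Allow at most one acquisition every `min_interval` seconds."""

    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        self._lock = asyncio.Lock()
        self._next_allowed = 0.0  # monotonic timestamp of the next free slot

    async def wait(self) -> None:
        # The lock serializes callers so each one is given its own time slot.
        async with self._lock:
            now = time.monotonic()
            delay = self._next_allowed - now
            if delay > 0:
                await asyncio.sleep(delay)
            self._next_allowed = max(now, self._next_allowed) + self.min_interval


async def fetch_all(urls, fetch, requests_per_second=2.0):
    """Call the async `fetch(url)` for every URL without exceeding the rate."""
    limiter = RateLimiter(1.0 / requests_per_second)

    async def throttled(url):
        await limiter.wait()
        return await fetch(url)

    # gather() returns results in the same order as `urls`.
    return await asyncio.gather(*(throttled(u) for u in urls))
```

The limiter only controls when requests start, so a suitable `requests_per_second` value still has to be found by experiment for each target site.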

    Another strategy to avoid the 429 status code is to distribute connections across multiple agents. In this scenario, proxies and proxy rotation are invaluable.
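A minimal sketch of proxy rotation, assuming a small pool of proxies is available. The proxy addresses are placeholders, and `proxy_configs` and `fetch_rotating` are hypothetical helper names. The `send` argument is any callable with the same signature as `requests.get`, which keeps the rotation logic independent of the HTTP client.

```python
from itertools import cycle

# Hypothetical proxy endpoints; replace them with proxies you actually control.
PROXIES = [
    "http://proxy1.example.com:8080",
    "http://proxy2.example.com:8080",
    "http://proxy3.example.com:8080",
]


def proxy_configs(proxy_urls):
    """Yield requests-style `proxies` dicts, cycling through the pool forever."""
    for proxy in cycle(proxy_urls):
        yield {"http": proxy, "https": proxy}


def fetch_rotating(urls, send, proxy_urls=PROXIES):
    """Send each URL through the next proxy in the pool.

    `send` is any callable with the signature of `requests.get`.
    """
    configs = proxy_configs(proxy_urls)
    return [send(url, proxies=next(configs), timeout=10) for url in urls]
```

With the requests library installed, this could be called as `fetch_rotating(urls, requests.get)`. Round-robin rotation is the simplest policy; picking proxies at random or dropping ones that keep returning 429 are common variations.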

    Alternatively, the Piloterr web scraping API can be employed to automatically distribute connections, acting as a shield against the stringent rate limits enforced by certain websites.

    Here's a basic Python script that utilizes the requests library to retry requests when a 429 status code is encountered. The script will:

    1. Slow down the request pace using the sleep function from the time module.

    2. Wait for a specified period and then retry the request whenever a 429 status code is encountered, waiting longer after each failed attempt.

    import requests
    from time import sleep

    MAX_RETRIES = 5
    WAIT_PERIOD = 5  # seconds


    def fetch_url(url):
        retries = 0
        while retries < MAX_RETRIES:
            response = requests.get(url)
            if response.status_code == 429:
                print("Rate limit encountered. Retrying after waiting...")
                sleep(WAIT_PERIOD * (retries + 1))  # increasing wait time with more retries
                retries += 1
            else:
                return response
        raise Exception("Max retries reached.")


    url = "YOUR_URL_HERE"
    response = fetch_url(url)
    print(response.status_code)
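Servers may include a Retry-After header in a 429 response to say how long the client should wait. As a possible variation on the script above, the sketch below uses that value when it is given in seconds and falls back to exponential backoff otherwise. The names `retry_delay` and `fetch_with_retry_after` are hypothetical, and `get` is any callable that behaves like `requests.get`.

```python
import time


def retry_delay(headers, attempt, base_delay=5.0):
    """Return how many seconds to wait before retry number `attempt` (0-based)."""
    value = headers.get("Retry-After", "")
    if value.strip().isdigit():
        return float(value)
    # Header missing or given as an HTTP date: fall back to exponential backoff.
    return base_delay * (2 ** attempt)


def fetch_with_retry_after(url, get, max_retries=5, sleep=time.sleep):
    """Retry `get(url)` on 429, waiting as long as the server asks."""
    for attempt in range(max_retries):
        response = get(url)
        if response.status_code != 429:
            return response
        sleep(retry_delay(response.headers, attempt))
    raise RuntimeError("Max retries reached.")
```

With the requests library installed, this could be called as `fetch_with_retry_after(url, requests.get)`. Passing `sleep` as a parameter is only a convenience so the function can be exercised without real delays.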
