
Timeouts

Scraping Fish API allows you to set two types of timeouts:

  1. For the entire request, including JS scenario execution time, in the total_timeout_ms parameter.
  2. For one trial of loading the website in the trial_timeout_ms parameter.

Timeouts are specified in milliseconds, and both total_timeout_ms and trial_timeout_ms can be set for the same request.
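As a minimal sketch of setting both timeouts on one request, the helper below builds the query parameters and leaves out any timeout that is not given, so the API default applies. The function name build_params and its validation rule (positive integers only) are assumptions made for illustration, not part of the Scraping Fish API.

```python
from typing import Optional


def build_params(
    api_key: str,
    url: str,
    total_timeout_ms: Optional[int] = None,
    trial_timeout_ms: Optional[int] = None,
) -> dict:
    """Build query parameters, including only the timeouts that were set.

    Hypothetical helper: the validation below is a client-side sanity
    check, not a rule documented by the API.
    """
    params = {"api_key": api_key, "url": url}
    timeouts = {
        "total_timeout_ms": total_timeout_ms,
        "trial_timeout_ms": trial_timeout_ms,
    }
    for name, value in timeouts.items():
        if value is None:
            continue  # omitted, so the API default applies
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer (milliseconds)")
        params[name] = value
    return params


params = build_params(
    "[your API key]",
    "https://www.example.com",
    total_timeout_ms=60_000,
    trial_timeout_ms=10_000,
)
# Send with: requests.get("https://scraping.narf.ai/api/v1/", params=params)
```

The resulting dictionary can be passed as the params argument of requests.get, in the same way as in the examples on this page.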

Total timeout

The total request timeout defaults to 90,000 ms (90 seconds). This value is approximate: the actual timeout can occur within a 1,000 ms margin. If a complex JS scenario needs more time, increase the total_timeout_ms parameter as in the example below.

To simulate a long JS scenario, we go to example.com and simply wait for 100 seconds. To make sure that we have enough time for the website to load and then to execute our dummy JS scenario, we set total_timeout_ms to 110,000 ms.

import json

import requests

payload = {
    "api_key": "[your API key]",
    "url": "https://www.example.com",
    # Dummy JS scenario: wait for 100 seconds after the page has loaded.
    "js_scenario": json.dumps(
        {"steps": [{"wait": 100_000}]}
    ),
    # Leave room for page loading plus the 100-second wait.
    "total_timeout_ms": 110_000,
}

response = requests.get("https://scraping.narf.ai/api/v1/", params=payload)
print(response.content)

After about 100 seconds, you should get a response with HTML for example.com.
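The requests library waits indefinitely unless a timeout argument is passed, so for long requests like the one above it can help to set a client-side timeout slightly longer than total_timeout_ms. The sketch below derives such a value in seconds. The helper name client_timeout_s and the 5-second network slack are assumptions chosen for illustration; only the 1,000 ms margin comes from this page.

```python
def client_timeout_s(
    total_timeout_ms: int = 90_000,
    margin_ms: int = 1_000,
    network_slack_s: float = 5.0,
) -> float:
    """Return a client-side timeout in seconds for requests.get(timeout=...).

    total_timeout_ms: value sent to the API (default mirrors the API default).
    margin_ms: margin within which the server-side timeout can occur.
    network_slack_s: assumed extra allowance for connection and transfer time.
    """
    if total_timeout_ms <= 0:
        raise ValueError("total_timeout_ms must be positive")
    return (total_timeout_ms + margin_ms) / 1000 + network_slack_s


timeout_s = client_timeout_s(110_000)
# Use as: requests.get("https://scraping.narf.ai/api/v1/", params=payload, timeout=timeout_s)
```

With this choice the client gives up only after the API has had its full time budget, so a server-side timeout response is received instead of being cut off locally.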

Trial timeout

In addition to the total request timeout, you can set a timeout for a single trial of loading the website with the trial_timeout_ms query parameter. It defaults to 30,000 ms (30 seconds) and does not cover the JS scenario, which is executed after the website has loaded. If the website fails to load within one trial timeout for any reason, Scraping Fish tries to load it again until it succeeds or until the total request timeout interrupts it.
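To get a feel for how the two timeouts interact, the sketch below estimates how many loading trials could each run to their full trial timeout before the total timeout is reached. This is a rough upper bound under simplifying assumptions (no JS scenario time, no overhead between trials, every trial using its whole timeout); the function name max_full_trials is hypothetical and the real retry schedule may differ.

```python
def max_full_trials(
    total_timeout_ms: int = 90_000,
    trial_timeout_ms: int = 30_000,
) -> int:
    """Upper bound on trials that can each run to their full trial timeout.

    Defaults mirror the documented API defaults. Assumes no time is spent
    on a JS scenario or between trials.
    """
    if total_timeout_ms <= 0 or trial_timeout_ms <= 0:
        raise ValueError("timeouts must be positive")
    return total_timeout_ms // trial_timeout_ms


print(max_full_trials())                         # documented defaults
print(max_full_trials(trial_timeout_ms=10_000))  # shorter trials leave room for more retries
```

With the default values this gives 3 full-length trials, and lowering the trial timeout to 10,000 ms raises the bound to 9.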

Tip

It can be useful to adjust the trial timeout for a website that is expected to take a very long time to load.

In the example below, we expect example.com to load very quickly, so we force Scraping Fish API to start a new trial if the website has not loaded within 10 seconds.

import requests

payload = {
    "api_key": "[your API key]",
    "url": "https://www.example.com",
    # Give each loading trial at most 10 seconds.
    "trial_timeout_ms": 10_000,
}

response = requests.get("https://scraping.narf.ai/api/v1/", params=payload)
print(response.content)