Web Scraper

The web scraper tool uses crawl4ai to extract content from web pages with advanced anti-bot evasion and JavaScript rendering. It's optimised for high-volume scraping of sites that block standard scrapers.

Tool: `scrape_url`

Scrapes a URL and returns the cleaned text content, including content from dynamically rendered JavaScript.

Arguments

Argument	Type	Description
`url`	string	The URL to scrape
`wait_for`	string	Optional CSS selector to wait for before extracting
`scroll`	boolean	If true, scroll to the bottom of the page to trigger infinite-scroll loading
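
As a rough illustration of how these arguments fit together, the sketch below builds an argument payload for `scrape_url` in Python. The `build_scrape_args` helper is hypothetical and not part of the tool. It assumes, beyond what is stated above, that the arguments are passed as a JSON object, that `scroll` may be omitted, and that `url` must be an absolute http(s) URL. The `.pricing-table` selector is a placeholder.

```python
import json
from typing import Optional
from urllib.parse import urlparse


def build_scrape_args(url: str,
                      wait_for: Optional[str] = None,
                      scroll: Optional[bool] = None) -> dict:
    """Build an argument dict for scrape_url (hypothetical helper)."""
    parsed = urlparse(url)
    # Assumption: only absolute http(s) URLs are accepted.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an absolute http(s) URL: {url!r}")

    args = {"url": url}
    # Optional arguments are left out entirely when not set.
    if wait_for is not None:
        args["wait_for"] = wait_for
    if scroll is not None:
        args["scroll"] = scroll
    return args


# Wait for a (placeholder) selector and scroll to load lazy content.
payload = build_scrape_args(
    "https://competitor.com/pricing",
    wait_for=".pricing-table",
    scroll=True,
)
print(json.dumps(payload, indent=2))
```

Leaving unset optional arguments out of the payload, rather than sending nulls, keeps the call valid whichever defaults the tool applies.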

Use cases

Scraping news articles, blog posts, documentation
Extracting content from sites with bot protection (LinkedIn, financial news sites)
Crawling dynamic React/Vue/Angular apps
Monitoring competitor pricing or product listings

Anti-bot evasion

The scraper uses crawl4ai's stealth features:

Headless browser with realistic browser fingerprints
Randomised request timing
Cookie and session handling
User-agent rotation

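This makes it significantly more effective than scrapers that issue plain HTTP requests without a browser, both against sites that block such clients and against sites that require JavaScript to render their content.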
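
To make two of the ideas listed above concrete, here is a minimal, standard-library sketch of randomised request timing and user-agent rotation. It is not crawl4ai's implementation and does not call crawl4ai. The function names, the delay values, and the user-agent strings are placeholders chosen for illustration.

```python
import itertools
import random

# Placeholder user-agent strings, for illustration only.
USER_AGENTS = [
    "ExampleBrowser/1.0 (placeholder A)",
    "ExampleBrowser/2.0 (placeholder B)",
    "ExampleBrowser/3.0 (placeholder C)",
]


def jittered_delay(base: float, jitter: float, rng: random.Random) -> float:
    """Draw a delay in seconds uniformly from [base - jitter, base + jitter],
    clamped so it is never negative."""
    return max(0.0, rng.uniform(base - jitter, base + jitter))


def plan_requests(urls, base=2.0, jitter=1.0, seed=None):
    """Pair each URL with the next user agent in rotation and a random delay."""
    rng = random.Random(seed)
    agents = itertools.cycle(USER_AGENTS)  # round-robin rotation
    return [
        {
            "url": url,
            "user_agent": next(agents),
            "delay_s": jittered_delay(base, jitter, rng),
        }
        for url in urls
    ]


for step in plan_requests(["https://example.com/a", "https://example.com/b"]):
    print(f'{step["delay_s"]:.2f}s  {step["user_agent"]}  {step["url"]}')
```

The plan is computed up front so the timing logic can be inspected without sending any requests. A real client would sleep for `delay_s` before each fetch.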

Example

Scrape the latest articles from https://news.ycombinator.com and summarise the top 5 stories.

Get the pricing information from https://competitor.com/pricing.

Comparison with Browser tool

See Browser → Comparison for a side-by-side comparison.

Note

Web scraping should comply with a site's Terms of Service and robots.txt. Use this tool responsibly.
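
One way to check robots.txt before calling the scraper is Python's standard `urllib.robotparser` module. The sketch below is an assumption about how a caller might do this, not something the tool is documented to do itself. The `is_allowed` helper, the `my-scraper` user-agent name, and the example rules are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse rules from text that has already been downloaded.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example rules: everything is allowed except paths under /private/.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(EXAMPLE_ROBOTS, "my-scraper", "https://example.com/blog/post"))
print(is_allowed(EXAMPLE_ROBOTS, "my-scraper", "https://example.com/private/data"))
```

To check a live site, `RobotFileParser.set_url` followed by `read` fetches and parses the site's robots.txt directly. Terms of Service still need to be reviewed by hand.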
