What is the Tavily Crawl API?

Last updated: May 5, 2025

Tavily Crawl automatically explores a website and extracts its content, starting from a single URL that you provide. It traverses the site like a graph and collects raw content from multiple pages, which makes it well suited to data extraction, documentation indexing, and building up-to-date knowledge bases.

🔧 What it does

Tavily Crawl starts from a base URL and follows internal links, gathering content from each page it visits. You can control how deep it goes, how many links it follows, and what types of content it extracts. It's especially useful when you want to ingest large portions of a site (like a documentation portal or blog) into your AI applications.
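To make the interaction between depth, breadth, and the page limit concrete, here is a minimal sketch of a breadth-first traversal over a small in-memory link graph. It is an illustrative model only, not Tavily's actual implementation: the `crawl_order` function, the `site` map, and the choice to count the starting page as depth 0 and toward the limit are all assumptions made for the example.

```python
from collections import deque


def crawl_order(links, start, max_depth=1, max_breadth=20, limit=50):
    """Return pages in the order a breadth-first crawl would visit them.

    `links` is a hypothetical site map: page -> list of pages it links to.
    """
    visited = [start]          # the start page counts toward `limit`
    seen = {start}
    queue = deque([(start, 0)])
    while queue and len(visited) < limit:
        page, depth = queue.popleft()
        if depth >= max_depth:
            continue           # do not follow links beyond max_depth
        # Follow at most `max_breadth` links from this page.
        for target in links.get(page, [])[:max_breadth]:
            if target in seen:
                continue
            if len(visited) >= limit:
                break          # total page budget exhausted
            seen.add(target)
            visited.append(target)
            queue.append((target, depth + 1))
    return visited


# Hypothetical site structure used only for illustration.
site = {
    "/": ["/docs", "/blog", "/about"],
    "/docs": ["/docs/api", "/docs/sdk"],
    "/blog": ["/blog/post-1"],
}

print(crawl_order(site, "/", max_depth=1))
print(crawl_order(site, "/", max_depth=2, max_breadth=2))
print(crawl_order(site, "/", max_depth=2, limit=3))
```

In this model, raising the depth reaches sub-subpages, lowering the breadth drops the later links on each page, and the limit caps the total regardless of the other two settings.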

⚙ Key Parameters

| Parameter | Description |
| --- | --- |
| url | The starting point of the crawl (e.g., https://docs.tavily.com) |
| max_depth | How many levels deep the crawler should go (e.g., from homepage → subpage → sub-subpage) |
| max_breadth | How many links to follow per page |
| limit | The maximum number of total pages to crawl |
| query | (Optional) A natural language instruction to guide the crawler on what content to prioritize |
| select_paths | (Optional) Regex to focus the crawl on specific URL paths (e.g., /docs/.*) |
| select_domains | (Optional) Regex to limit crawl to specific domains or subdomains |
| allow_external | Whether to follow links to external domains (default: false) |
| include_images | Whether to include image URLs in the result |
| categories | Filter crawl by types of pages (e.g., Documentation, Blog) |
| extract_depth | Set to basic (default) or advanced for deeper content extraction including tables and embedded content |
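The parameters above can be assembled into a single options dictionary before a crawl is started. The sketch below is one possible way to do that, with light validation. The parameter names come from the table; the `build_crawl_payload` helper and its validation rules are hypothetical and not part of any Tavily library.

```python
# Parameter names come from the table above; the checks are illustrative.
CRAWL_PARAMETERS = {
    "max_depth", "max_breadth", "limit", "query", "select_paths",
    "select_domains", "allow_external", "include_images",
    "categories", "extract_depth",
}


def build_crawl_payload(url, **options):
    """Return a dict of crawl options, rejecting names not in the table."""
    unknown = sorted(set(options) - CRAWL_PARAMETERS)
    if unknown:
        raise ValueError(f"unknown crawl parameter(s): {', '.join(unknown)}")
    # The table lists two values for extract_depth, with basic as the default.
    if options.get("extract_depth", "basic") not in ("basic", "advanced"):
        raise ValueError("extract_depth must be 'basic' or 'advanced'")
    # Assumption: the numeric controls are positive integers.
    for name in ("max_depth", "max_breadth", "limit"):
        if name in options and options[name] < 1:
            raise ValueError(f"{name} must be a positive integer")
    return {"url": url, **options}


payload = build_crawl_payload(
    "https://docs.tavily.com",
    max_depth=2,
    max_breadth=10,
    limit=50,
    extract_depth="advanced",
)
print(payload)
```

Catching a misspelled parameter name locally is cheaper than discovering it after a crawl has run with default settings.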

📦 Example Use Case

You want to extract all documentation pages from docs.example.com and load them into a vector database for a RAG (retrieval-augmented generation) application. With Tavily Crawl, you provide the root URL, use select_paths to restrict the crawl to /docs/ paths, and choose extract_depth: advanced. The crawl then returns clean raw content for each page it discovers.
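The use case above might look roughly like the following sketch, which uses only the Python standard library. Several details are assumptions and should be checked against the official API reference: the endpoint URL, the bearer-token header, passing select_paths as a list of patterns, and the response shape (a `results` list whose items have `url` and `raw_content` keys). The `crawl` and `to_documents` helpers and the sample response are hypothetical.

```python
import json
import urllib.request

# Assumption: endpoint URL and auth scheme; confirm in the API reference.
CRAWL_ENDPOINT = "https://api.tavily.com/crawl"


def crawl(payload, api_key):
    """POST the crawl options as JSON and return the decoded response."""
    request = urllib.request.Request(
        CRAWL_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def to_documents(response):
    """Turn crawl results into {id, text} records ready for embedding.

    Assumes each result carries `url` and `raw_content` keys.
    """
    return [
        {"id": page["url"], "text": page["raw_content"]}
        for page in response.get("results", [])
        if page.get("raw_content")  # skip pages with no extracted content
    ]


payload = {
    "url": "https://docs.example.com",
    "select_paths": ["/docs/.*"],  # assumed to accept a list of regex patterns
    "extract_depth": "advanced",
}

# Hypothetical response, so the example runs without a network call.
# A real run would use: response = crawl(payload, api_key="YOUR_API_KEY")
sample_response = {
    "results": [
        {"url": "https://docs.example.com/docs/intro", "raw_content": "Welcome to the docs."},
        {"url": "https://docs.example.com/docs/empty", "raw_content": ""},
    ]
}

documents = to_documents(sample_response)
print(documents)
```

Each record pairs the page URL with its text, so it can be chunked, embedded, and written to whichever vector database the RAG application uses.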