Sitemap Connector
The Sitemap connector extracts and processes the pages listed in a website's sitemap when you create a source.
Features
- Processes pages listed in XML sitemaps
- Supports regex patterns for including and excluding URLs
- Respects robots.txt directives
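The three features above can be illustrated with a minimal Python sketch. It is not the connector's actual implementation. The helper names `sitemap_urls` and `select_urls`, the use of `re.search` for matching, and the sample sitemap and robots.txt content are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace used by the standard sitemap protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return every <loc> URL found in a <urlset> sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def select_urls(urls, include_pattern=None, exclude_pattern=None,
                robots_lines=None, user_agent="*"):
    """Keep URLs that match include, do not match exclude, and are
    allowed by robots.txt (hypothetical helper; assumes re.search)."""
    include = re.compile(include_pattern) if include_pattern else None
    exclude = re.compile(exclude_pattern) if exclude_pattern else None

    robots = None
    if robots_lines is not None:
        robots = RobotFileParser()
        robots.parse(robots_lines)  # parse rules without any network access

    selected = []
    for url in urls:
        if include and not include.search(url):
            continue
        if exclude and exclude.search(url):
            continue
        if robots and not robots.can_fetch(user_agent, url):
            continue
        selected.append(url)
    return selected


# Made-up sample data, for illustration only.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/docs/intro</loc></url>
  <url><loc>https://example.com/docs/blog/post</loc></url>
  <url><loc>https://example.com/docs/private/secret</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

SAMPLE_ROBOTS = ["User-agent: *", "Disallow: /docs/private/"]

urls = sitemap_urls(SAMPLE_SITEMAP)
selected = select_urls(
    urls,
    include_pattern=r"https://example\.com/docs/",
    exclude_pattern=r"https://example\.com/docs/blog/",
    robots_lines=SAMPLE_ROBOTS,
)
```

In this sketch the exclude pattern takes precedence over the include pattern, and robots.txt is checked last. The connector may apply these checks in a different order.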
Configuration
Example Payload
{
  "type": "sitemap",
  "sitemap_url": "https://example.com/sitemap.xml",
  "include_pattern": "https://example.com/docs/*",
  "exclude_pattern": "https://example.com/docs/blog/*"
}
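A payload like the one above could be assembled and checked before it is sent. The following is a minimal sketch, assuming the patterns are interpreted as Python-style regular expressions. The helper name `build_sitemap_payload` is hypothetical and not part of the connector's API.

```python
import json
import re


def build_sitemap_payload(sitemap_url, include_pattern=None, exclude_pattern=None):
    """Build a sitemap source payload, rejecting patterns that do not
    compile as regular expressions (hypothetical helper)."""
    payload = {"type": "sitemap", "sitemap_url": sitemap_url}
    optional = (
        ("include_pattern", include_pattern),
        ("exclude_pattern", exclude_pattern),
    )
    for key, pattern in optional:
        if pattern is None:
            continue  # omit optional fields that were not supplied
        try:
            re.compile(pattern)
        except re.error as exc:
            raise ValueError(f"{key} is not a valid regex: {exc}") from exc
        payload[key] = pattern
    return payload


payload = build_sitemap_payload(
    "https://example.com/sitemap.xml",
    include_pattern="https://example.com/docs/*",
    exclude_pattern="https://example.com/docs/blog/*",
)
body = json.dumps(payload, indent=2)  # JSON text ready to send
```

If the patterns are treated as regular expressions, the trailing `*` in the example repeats the preceding `/` rather than acting as a glob wildcard. Whether the connector anchors matches at the start or end of the URL is not stated here, so test your patterns against a few sample URLs.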