archon/python/src
leex279 c1677a9220 fix: Skip discovery when user provides direct discovery file URLs
When a user directly provides a URL to a discovery file (sitemap.xml, llms.txt, robots.txt, etc.),
the system now skips the discovery phase and uses the provided file directly.

This prevents unnecessary discovery attempts and respects the user's explicit choice.

Changes:
- Check if the URL is already a discovery target before running discovery
- Skip discovery for sitemap files, llms variants, robots.txt, well-known files, and any other .txt file
- Add logging to indicate when discovery is skipped
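
The checks listed above might look roughly like the following. This is a minimal sketch, not the code from this commit: the function names `is_discovery_target` and `choose_crawl_source`, the `discover` callback, and the exact filename patterns are all assumptions made for illustration.

```python
import logging
from typing import Callable, Optional
from urllib.parse import urlparse

logger = logging.getLogger(__name__)


def is_discovery_target(url: str) -> bool:
    """Return True if the URL already points at a discovery file.

    Hypothetical helper: the patterns below are guesses based on the
    categories named in the commit message, not the real implementation.
    """
    # Accept bare inputs such as "xyz.com/sitemap.xml" by adding a scheme.
    parsed = urlparse(url if "://" in url else f"https://{url}")
    path = parsed.path.lower()
    filename = path.rsplit("/", 1)[-1]

    # Sitemap files, e.g. sitemap.xml or sitemap_index.xml (assumed pattern).
    if filename.startswith("sitemap") and filename.endswith(".xml"):
        return True
    # llms variants, e.g. llms.txt or llms-full.txt (assumed pattern).
    if filename.startswith("llms") and filename.endswith(".txt"):
        return True
    if filename == "robots.txt":
        return True
    # Anything served from a .well-known directory (assumed location).
    if "/.well-known/" in path:
        return True
    # Any remaining .txt file.
    return filename.endswith(".txt")


def choose_crawl_source(url: str, discover: Callable[[str], Optional[str]]) -> str:
    """Use the given URL directly if it is a discovery file, else run discovery."""
    if is_discovery_target(url):
        logger.info("Skipping discovery: %s is already a discovery file", url)
        return url
    # Fall back to the original URL when discovery finds nothing.
    return discover(url) or url
```

With this sketch, `choose_crawl_source("xyz.com/sitemap.xml", discover)` returns the sitemap URL unchanged and never calls `discover`.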

Example: When crawling 'xyz.com/sitemap.xml' directly, the system now uses that sitemap
instead of trying to discover a different file such as llms.txt.
2025-09-20 13:34:07 +02:00