Automatically generate LLMS.txt files for any website to help Large Language Models better understand and reference your content
Clone the repository
Install dependencies
Set up API keys (optional)
Run your first generation
config.yaml:
Website Settings
Generation Settings
Content Analysis
Output Options
Website Discovery
Dynamic Section Detection
Content Extraction
Intelligent Categorization
LLMS.txt Generation
Multi-Format Export
Dashboard Integration
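The LLMS.txt Generation step can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: `Page`, `Section`, and `render_llms_txt` are hypothetical names, and the output layout (a title heading, a quoted summary, then one heading per section with linked pages) is assumed from the commonly used llms.txt convention.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    title: str
    url: str
    description: str = ""  # optional, e.g. an AI-generated summary


@dataclass
class Section:
    name: str
    pages: list[Page] = field(default_factory=list)


def render_llms_txt(site_name: str, summary: str, sections: list[Section]) -> str:
    """Render categorized pages as llms.txt-style text (hypothetical helper)."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for section in sections:
        if not section.pages:
            continue  # skip sections that ended up with no pages
        lines.append(f"## {section.name}")
        lines.append("")
        for page in section.pages:
            entry = f"- [{page.title}]({page.url})"
            if page.description:
                entry += f": {page.description}"
            lines.append(entry)
        lines.append("")
    # Exactly one trailing newline at the end of the file.
    return "\n".join(lines).rstrip() + "\n"
```

Calling `render_llms_txt("Example", "Demo site.", sections)` returns a string that can be written straight to an `llms.txt` file; sections without pages are left out.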
Sections shop, sale, help, and company are automatically detected from URL patterns like /shop/, /sale/, /help/, and /company/.
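Detection from URL patterns can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the tool's actual implementation: `detect_sections` is a hypothetical helper, a section is assumed to be the first path segment of a URL, and `min_pages_per_section` is assumed to be the minimum number of pages a section needs in order to be kept.

```python
from collections import defaultdict
from urllib.parse import urlparse


def detect_sections(urls, min_pages_per_section=2):
    """Group URLs by their first path segment (hypothetical helper).

    Only groups with at least `min_pages_per_section` URLs are returned.
    """
    groups = defaultdict(list)
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        if segments:  # the bare homepage has no segment and is skipped
            groups[segments[0].lower()].append(url)
    return {
        name: pages
        for name, pages in groups.items()
        if len(pages) >= min_pages_per_section
    }
```

With this sketch, raising `min_pages_per_section` drops small sections and lowering it keeps more of them, which is the trade-off to keep in mind when tuning that setting.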
Success Rate: 47% with 7 sections discovered from 100 pages
Small Sites (<50 pages)
Medium Sites (50-200 pages)
Set min_pages_per_section to 3-5
Large Sites (200+ pages)
Use max_pages: 100-200 to limit scope
Low Success Rate
Adjust min_pages_per_section in config
Increase max_depth for deeper crawling
No Pages Found
Check user_agent in crawling configuration
AI Descriptions Not Working
Verify the API key: echo $OPENAI_API_KEY
Use the --no-ai flag to disable and test basic functionality
Dashboard Not Loading
Check that dashboard-data.json exists in results directory
Run: python llmstxtgenerator.py --dashboard-only
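Two of the checks above (the API key and the dashboard data file) can be scripted. The following is a minimal sketch, not part of the tool: `run_diagnostics` is a hypothetical helper, and the location of the results directory is an assumption that the caller must supply.

```python
import os
from pathlib import Path


def run_diagnostics(results_dir, env=None):
    """Report whether the API key and dashboard data are present.

    `results_dir` is whatever directory your run wrote its results to
    (an assumption; pass your own path). `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return {
        # Same check as `echo $OPENAI_API_KEY`: set and non-empty.
        "openai_api_key_set": bool(env.get("OPENAI_API_KEY")),
        # The dashboard needs this file in the results directory.
        "dashboard_data_exists": (Path(results_dir) / "dashboard-data.json").is_file(),
    }
```

If `openai_api_key_set` is false, running with `--no-ai` is a way to confirm that basic functionality works without AI descriptions.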
requests - Web crawling and HTTP requests
beautifulsoup4 - HTML parsing and content extraction
pyyaml - Configuration file handling
openai - AI-powered descriptions (optional)
anthropic - Alternative AI provider (optional)
plotly - Dashboard visualization
pandas - Data processing and analysis
../dashboard/
Enhanced AI Integration
Real-time Updates
API Integration
Multi-language Support
Advanced Analytics