Professional AI tool that crawls websites and analyzes user intents, with a modern dashboard
A professional Python tool that crawls websites and analyzes the user intents conveyed to LLMs and AI agents. Part of the Airbais AI Tools Suite, IntentCrawler extracts content, discovers intents dynamically using multiple ML techniques, and provides modern interactive dashboards with light/dark mode support.
Respectful crawling with robots.txt compliance, automatic sitemap discovery, and configurable rate limiting
User-focused analysis (default), plus LDA topic modeling, sentence embeddings, and clustering
Professional web interface with Airbais design system, light/dark mode, and responsive layout
Outputs in llmstxt format and JSON for seamless LLM tool integration
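The respectful-crawling behavior described above can be illustrated with a minimal sketch built only on Python's standard library. It is not IntentCrawler's actual crawler code: the class name `PoliteFetchPolicy`, the `ExampleBot` user agent, and the `min_delay` parameter are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser


class PoliteFetchPolicy:
    """Hypothetical helper: decides whether and when a URL may be fetched."""

    def __init__(self, robots_txt: str, user_agent: str = "ExampleBot",
                 min_delay: float = 1.0):
        self.user_agent = user_agent
        self.min_delay = min_delay  # assumed rate limit, in seconds between requests
        self._parser = RobotFileParser()
        # Parse robots.txt content that has already been downloaded as text.
        self._parser.parse(robots_txt.splitlines())
        self._last_fetch = None

    def allowed(self, url: str) -> bool:
        """True if robots.txt permits this user agent to fetch the URL."""
        return self._parser.can_fetch(self.user_agent, url)

    def sitemaps(self) -> list:
        """Sitemap URLs declared in robots.txt (empty list if none)."""
        return self._parser.site_maps() or []

    def wait_time(self, now: float) -> float:
        """Seconds to sleep before the next request so min_delay is respected."""
        if self._last_fetch is None:
            return 0.0
        return max(0.0, self.min_delay - (now - self._last_fetch))

    def record_fetch(self, now: float) -> None:
        """Remember when the most recent request was made."""
        self._last_fetch = now
```

A crawler loop would call `allowed(url)` before each request, sleep for `wait_time(time.monotonic())`, and then call `record_fetch(time.monotonic())` once the request has been sent.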
Clone the repository
Install dependencies
Run your first analysis
Customize the tool’s behavior through config.yaml:
Crawler Settings
Control how the tool crawls websites:
Intent Extraction
Configure the ML-powered intent discovery:
Output Organization
Configure how results are stored:
Content Extraction
Crawls website pages and extracts clean, structured content
Text Preprocessing
Removes noise, normalizes text, and prepares for analysis
Feature Extraction
Intent Clustering
Intent Merging
Combines similar intents based on configurable similarity threshold
Naming & Scoring
Automatically generates descriptive intent names and confidence scores
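To make the intent merging step more concrete, here is a small sketch of threshold-based merging. It assumes each intent is represented by a set of keywords and uses Jaccard similarity as a stand-in; the similarity measure IntentCrawler actually uses is not stated here, and the function names and the default threshold of 0.5 are hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def merge_intents(intents: dict, threshold: float = 0.5) -> list:
    """Greedily merge intents whose keyword sets meet the similarity threshold.

    `intents` maps an intent name to its set of keywords (an assumed
    representation). Each intent joins the first existing group that is
    similar enough; otherwise it starts a new group.
    """
    merged = []
    for name, keywords in intents.items():
        for group in merged:
            if jaccard(group["keywords"], keywords) >= threshold:
                group["names"].append(name)
                group["keywords"] |= keywords
                break
        else:
            # No existing group was similar enough.
            merged.append({"names": [name], "keywords": set(keywords)})
    return merged
```

Raising the threshold keeps more intents separate; lowering it merges more aggressively. The greedy pass is order-dependent, which is acceptable for a sketch but worth keeping in mind.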
Discovers latent topics across all content with configurable topic counts
Uses sentence transformers and DBSCAN for semantic understanding
Configurable keywords ensure baseline intent detection
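Of the techniques above, keyword-based detection is the simplest to illustrate. The sketch below counts keyword hits per intent on tokenized page text. The keyword map and the function name are hypothetical examples, not the tool's shipped configuration.

```python
import re
from collections import Counter

# Hypothetical keyword map; the real one would come from configuration.
INTENT_KEYWORDS = {
    "purchase": ["buy", "pricing", "checkout"],
    "support": ["help", "contact", "troubleshoot"],
}


def keyword_intents(text: str, keyword_map: dict = INTENT_KEYWORDS) -> dict:
    """Return a hit count for every intent with at least one keyword in the text."""
    tokens = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = {}
    for intent, keywords in keyword_map.items():
        hits = sum(tokens[keyword] for keyword in keywords)
        if hits:
            scores[intent] = hits
    return scores
```

Because it needs no model, a pass like this can serve as a baseline even when topic modeling or clustering finds nothing on a small site.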
Results are organized by date for easy historical tracking:
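As a rough illustration of date-based organization, the sketch below assumes one subdirectory per run date named YYYY-MM-DD under a base results directory. The exact layout and both function names are assumptions, not the tool's documented structure.

```python
from datetime import date, datetime
from pathlib import Path


def dated_output_dir(base: Path, day: date) -> Path:
    """Path of the (assumed) per-date results directory, e.g. base/2024-05-01."""
    return base / day.isoformat()


def list_result_dates(base: Path) -> list:
    """Return names of YYYY-MM-DD subdirectories under base, newest first."""
    if not base.is_dir():
        return []
    dates = []
    for child in base.iterdir():
        if not child.is_dir():
            continue
        try:
            # Skip directories whose names are not valid dates.
            datetime.strptime(child.name, "%Y-%m-%d")
        except ValueError:
            continue
        dates.append(child.name)
    return sorted(dates, reverse=True)
```

Zero-padded ISO dates sort chronologically as plain strings, which is why a simple reverse sort yields newest-first ordering.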
Two dashboard options: a local tool-specific dashboard and a master multi-tool dashboard
Professional orange/gray color scheme with Inter font family
Toggle themes with persistent user preferences
Works on both desktop and mobile devices
Optimized loading and smooth interactions
The website URL to analyze
Path to custom configuration file
Override default output directory
Set logging level: DEBUG, INFO, WARNING, ERROR
Launch dashboard after analysis completes
View existing results without running analysis
View results from specific date (YYYY-MM-DD)
List all available result dates
Processing time increases with site size and enabled ML features
Small Sites (<100 pages)
All features work well with default settings
Medium Sites (100-500 pages)
Consider reducing LDA topics for faster processing
Large Sites (500-1000 pages)
May need to disable embeddings or increase rate limiting
No intents discovered
Slow processing
Dashboard not loading
See requirements.txt for the full dependency list.
IntentCrawler is part of the larger Airbais AI Tools Suite with a centralized dashboard
Centralized view of all AI tool results at ../dashboard/
New tools are automatically detected and integrated
JSON output compatible with other suite tools
Shared Airbais design system across all tools
Multi-language Support
Expand beyond English content analysis
Real-time Tracking
Monitor intent changes over time with the master dashboard
A/B Testing
Compare intents across different site versions
Suite Expansion
Add sentiment analysis, performance monitoring, and SEO tools
API Access
Programmatic access to all suite tools through unified API
We welcome contributions in these key areas:
Additional clustering algorithms and ML techniques
Enhanced dashboard features and data visualization
Optimization for large-scale websites
CMS plugins and third-party tool connections