
Overview

GRASP Evaluator is a comprehensive assessment tool that evaluates website content quality across five weighted dimensions (Grounded, Readable, Accurate, Structured, and Polished) to optimize content for LLM understanding and response generation.
[Screenshot: GRASP Evaluator Dashboard]

Installation

cd tools/graspevaluator
pip install -r requirements.txt

Configuration

Set up your OpenAI API key in the .env file:
.env
OPENAI_API_KEY=your_openai_api_key_here
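If you are wiring your own scripts against the same .env file, a minimal sketch using python-dotenv (an assumption for illustration; the tool may load the key differently) looks like:

import os
from dotenv import load_dotenv

load_dotenv()  # reads the .env file from the current working directory
api_key = os.environ["OPENAI_API_KEY"]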

Quick Start

python graspevaluator.py --url https://example.com

Configuration File

Customize evaluation parameters in config/grasp_config.yaml:
grasp_config.yaml
targets:
  - url: "https://yoursite.com"

grounded:
  intents:
    - "How do I contact support?"
    - "What are your pricing options?"
    - "How do I get started?"

readable:
  target_audience: "general_public"

accurate:
  freshness_thresholds:
    high: 180    # 6 months
    medium: 365  # 1 year

polished:
  use_llm: true
  llm_model: "gpt-3.5-turbo"

Understanding Your Results

GRASP Score Calculation

The overall GRASP score is calculated from five weighted metrics:

Grounded (Range: 0-10 points)
  • 9-10: Content provides comprehensive answers to customer intents
  • 7-8: Good content support with minor gaps
  • 5-6: Partial information with noticeable gaps
  • 3-4: Limited relevant information
  • 1-2: Little to no relevant information

Readable (Result: Pass/Fail)
  • Pass: Content reading level matches target audience (±1 grade level)
  • Fail: Content is too complex or too simple for target audience
Target levels:
  • Elementary: Grades 3-6
  • High School: Grades 7-12
  • College: Grades 13-16
  • General Public: Grades 6-8

Accurate (Ratings: High/Medium/Low)
  • High: Content updated within 6 months
  • Medium: Content updated within 1 year
  • Low: Content older than 1 year or no date found
Date sources checked:
  • Meta tags
  • Schema.org markup
  • Time elements
  • Content patterns
  • HTTP headers

Structured (Ratings: Excellent/Good/Fair/Poor/Very Poor) evaluates:
  • Heading hierarchy (h1-h6)
  • Semantic HTML elements
  • Lists and tables
  • Schema.org markup
  • Open Graph tags

Polished (Ratings: Excellent/Good/Fair/Poor/Very Poor) checks:
  • Grammar and spelling errors
  • Punctuation issues
  • Style and readability
  • Error rate calculation
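To make the weighting concrete, here is a hypothetical sketch in Python. The tool's actual weights are not documented, so the equal weighting and the 0-100 per-metric normalization below are assumptions for illustration only:

# Hypothetical weighting: the real weights are internal to the tool,
# so equal weights and 0-100 normalized metric scores are assumed here.
METRIC_WEIGHTS = {
    "grounded": 0.2,
    "readable": 0.2,
    "accurate": 0.2,
    "structured": 0.2,
    "polished": 0.2,
}

def grasp_score(normalized):
    """Combine per-metric scores (each 0-100) into an overall 0-100 score."""
    return sum(METRIC_WEIGHTS[name] * normalized[name] for name in METRIC_WEIGHTS)

print(grasp_score({"grounded": 85, "readable": 100, "accurate": 60,
                   "structured": 75, "polished": 90}))  # -> 82.0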

Grade Scale

  • A Grade: 90-100 points (Excellent)
  • B Grade: 80-89 points (Good)
  • C Grade: 70-79 points (Fair)
  • D Grade: 60-69 points (Poor)
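A minimal sketch of mapping a 0-100 score onto these grades; the handling of scores below 60 is an assumption, since the scale above stops at D:

def letter_grade(score):
    """Map a 0-100 GRASP score to the letter grades listed above."""
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"  # assumption: the docs do not define a grade below D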

Detailed Evaluations

Grounded Evaluation

The grounded metric uses AI to evaluate content quality by:
  1. Intent Processing: Takes customer intents from configuration
  2. Answer Generation: Uses content to answer each intent
  3. Quality Assessment: Evaluates answer completeness and accuracy
  4. Scoring: Provides 0-10 score based on content support
Configure relevant customer intents in your grasp_config.yaml for the most accurate grounded evaluation.
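As a rough illustration of this loop, the sketch below scores each intent with the OpenAI chat API and averages the results. The prompt wording, model choice, and helper names are illustrative assumptions, not the tool's actual implementation:

from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

def score_intent(content, intent):
    """Ask the model how well the content answers one intent (0-10)."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": (
                f"Using only this content:\n{content}\n\n"
                f"How completely does it answer: {intent!r}? "
                "Reply with a single integer from 0 to 10."
            ),
        }],
    )
    return int(response.choices[0].message.content.strip())

def grounded_score(content, intents):
    """Average the per-intent scores into the 0-10 grounded metric."""
    return sum(score_intent(content, i) for i in intents) / len(intents)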

Readable Evaluation

Reading level assessment using multiple formulas:
  • Flesch-Kincaid Grade Level
  • Gunning Fog Index
  • Coleman-Liau Index
The evaluator averages these scores and compares against your target audience.
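A minimal sketch of that averaging logic using the textstat package, assuming the grade bands listed in the results section above (the tool's exact pass criteria may differ):

import textstat

TARGETS = {
    "elementary": (3, 6),
    "high_school": (7, 12),
    "college": (13, 16),
    "general_public": (6, 8),
}

def readable_check(text, audience):
    """Average three readability formulas and compare to the target band."""
    grades = [
        textstat.flesch_kincaid_grade(text),
        textstat.gunning_fog(text),
        textstat.coleman_liau_index(text),
    ]
    average = sum(grades) / len(grades)
    low, high = TARGETS[audience]
    # Pass if the averaged grade lands in the band, with the +/- 1 tolerance.
    return (low - 1) <= average <= (high + 1)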

Accurate Evaluation

Since only the content creator can really know if the content is accurate, we use freshness as a proxy for accuracy. The idea is that if the content is updated regularly, it’s far more likely to be accurate than stale content. We look at content freshness from multiple sources:
  1. Meta Tags: <meta name="last-modified">, <meta property="article:modified_time">
  2. Schema.org: dateModified, datePublished properties
  3. Time Elements: <time datetime=""> elements
  4. Content Patterns: Date patterns in text content
  5. HTTP Headers: Last-Modified header
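The sketch below illustrates this source-order lookup with BeautifulSoup. It is a simplified assumption of the tool's logic and omits the Schema.org and content-pattern steps for brevity:

from bs4 import BeautifulSoup

def find_last_modified(html, http_headers):
    """Return the first date string found, walking the sources in order."""
    soup = BeautifulSoup(html, "html.parser")
    # 1. Meta tags
    for attrs in ({"name": "last-modified"},
                  {"property": "article:modified_time"}):
        tag = soup.find("meta", attrs=attrs)
        if tag and tag.get("content"):
            return tag["content"]
    # 3. Time elements (steps 2 and 4 omitted for brevity)
    time_element = soup.find("time", attrs={"datetime": True})
    if time_element:
        return time_element["datetime"]
    # 5. HTTP headers
    return http_headers.get("Last-Modified")  # None if no date was found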

Structured Evaluation

HTML structure analysis covering:
  • Heading Hierarchy: Proper h1-h6 usage and nesting
  • Semantic Elements: main, article, section, header, footer, nav
  • Data Structures: Lists, tables with proper markup
  • Schema Markup: JSON-LD, microdata, RDFa
  • Meta Properties: Open Graph, Twitter Cards
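For illustration, the following sketch collects these signals with BeautifulSoup. The tool's scoring rubric is not documented here, so this only shows the kinds of structural features counted, not how they are weighted:

from bs4 import BeautifulSoup

SEMANTIC_TAGS = ["main", "article", "section", "header", "footer", "nav"]

def structure_signals(html):
    """Collect the structural features the evaluator looks at."""
    soup = BeautifulSoup(html, "html.parser")
    return {
        "heading_sequence": [h.name for h in soup.find_all(
            ["h1", "h2", "h3", "h4", "h5", "h6"])],
        "semantic_elements": {t: len(soup.find_all(t)) for t in SEMANTIC_TAGS},
        "lists_and_tables": len(soup.find_all(["ul", "ol", "table"])),
        "json_ld_blocks": len(soup.find_all(
            "script", attrs={"type": "application/ld+json"})),
        "open_graph_tags": len(soup.find_all(
            "meta", attrs={"property": lambda p: p and p.startswith("og:")})),
    }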

Polished Evaluation

Two modes are available: AI-powered (default) and rule-based (fallback). The AI-powered mode uses the OpenAI API for comprehensive analysis:
  • Grammar checking
  • Spelling verification
  • Style assessment
  • Error rate calculation
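As a sketch of how an error rate might map onto the rating scale above (the thresholds here are assumptions; the documentation only names the scale):

def polish_rating(error_count, word_count):
    """Map an error rate (errors per 100 words) onto the rating scale."""
    rate = 100 * error_count / max(word_count, 1)
    if rate < 0.5:
        return "Excellent"
    if rate < 1.0:
        return "Good"
    if rate < 2.0:
        return "Fair"
    if rate < 4.0:
        return "Poor"
    return "Very Poor"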

Dashboard Integration

GRASP evaluation results automatically integrate with the master dashboard:

  • Score Visualization: Interactive charts showing metric breakdown and trends
  • Detailed Analysis: Comprehensive metric explanations and recommendations
  • Historical Tracking: Track improvements over time across evaluations
  • Recommendation Engine: Actionable suggestions for content improvement

Output Files

Results are saved in timestamped directories:
results/
└── 2024-01-15/
    ├── grasp_evaluation_results.json    # Detailed results
    ├── dashboard-data.json               # Dashboard format
    └── grasp_evaluator.log              # Execution log

API Requirements

An OpenAI API key is required for the Grounded and Polished metrics. The tool includes fallback methods, but AI analysis provides the most accurate results.
Rate Limits: The evaluator respects API limits with:
  • Built-in batching for intents
  • Configurable request delays
  • Automatic retry logic
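A generic sketch of the retry behavior described above; the attempt count and delay values are illustrative, not the tool's defaults:

import time

def with_retries(call, max_attempts=3, retry_delay=2.0):
    """Run a zero-argument callable, retrying with a growing delay."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == max_attempts:
                raise
            time.sleep(retry_delay * attempt)  # back off a little more each time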

Troubleshooting

Error: OPENAI_API_KEY not found in environment variables
Solution: Add your API key to the .env file in the tools directory.
Error: Rate limit exceeded
Solutions:
  • Reduce batch_size in configuration
  • Increase retry_delay in API settings
  • Use fewer customer intents
Warning: Content truncated for analysis
Solution: Increase max_content_length in grounded configuration.
Accurate metric: Low (No date found)
Solutions:
  • Add <meta name="last-modified"> tags
  • Include <time> elements with datetime attributes
  • Add schema.org dateModified properties

Advanced Usage

Custom Intents

Define specific customer intents for your business:
grounded:
  intents:
    - "How do I integrate your API?"
    - "What are the pricing tiers?"
    - "How do I troubleshoot connection issues?"
    - "What security measures do you have?"

Multiple Target Audiences

Configure different reading levels:
readable:
  target_audience: "college"  # Options: elementary, high_school, college, graduate, general_public

Custom Freshness Thresholds

Adjust based on your content update frequency:
accurate:
  freshness_thresholds:
    high: 90     # 3 months for fast-moving content
    medium: 180  # 6 months
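For clarity, the thresholds map a content age in days onto a rating roughly like this sketch (assuming ages are measured in days, per the comments above):

def freshness_rating(age_days, high=90, medium=180):
    """Translate a content age in days into a High/Medium/Low rating."""
    if age_days is None:
        return "Low"  # no date found
    if age_days <= high:
        return "High"
    if age_days <= medium:
        return "Medium"
    return "Low"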

Debug Mode

Run with verbose logging:
python graspevaluator.py --url https://example.com --verbose

Best Practices

Content Strategy

  • Align content with specific customer intents
  • Maintain appropriate reading level
  • Update content regularly
  • Use semantic HTML structure

Technical Implementation

  • Add proper meta tags
  • Include schema.org markup
  • Use semantic HTML5 elements
  • Implement proper heading hierarchy

Contributing

This tool is part of the Airbais suite. For contributions:
  1. Follow existing code patterns
  2. Add tests for new features
  3. Update documentation
  4. Ensure dashboard compatibility

Support

For issues and questions, check the troubleshooting section or consult the technical documentation.