
Overview

GRASP Evaluator is a comprehensive assessment tool that evaluates website content quality across five weighted dimensions (Grounded, Readable, Accurate, Structured, and Polished) to optimize content for LLM understanding and response generation.
[Screenshot: GRASP Evaluator Dashboard]

Installation

cd tools/graspevaluator
pip install -r requirements.txt

Configuration

Set up your OpenAI API key in the .env file:
.env
OPENAI_API_KEY=your_openai_api_key_here
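If you are wiring your own scripts against the same .env file, a minimal sketch using python-dotenv (an assumption for illustration; the tool may load the key differently) looks like:

import os
from dotenv import load_dotenv

load_dotenv()  # reads the .env file from the current working directory
api_key = os.environ["OPENAI_API_KEY"]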

Quick Start

python graspevaluator.py --url https://example.com

Configuration File

Customize evaluation parameters in config/grasp_config.yaml:
grasp_config.yaml
targets:
  - url: "https://yoursite.com"

grounded:
  intents:
    - "How do I contact support?"
    - "What are your pricing options?"
    - "How do I get started?"

readable:
  target_audience: "general_public"

accurate:
  freshness_thresholds:
    high: 180    # 6 months
    medium: 365  # 1 year

polished:
  use_llm: true
  llm_model: "gpt-3.5-turbo"

Understanding Your Results

GRASP Score Calculation

The overall GRASP score is calculated from five weighted metrics:

Grounded (Range: 0-10 points)
  • 9-10: Content provides comprehensive answers to customer intents
  • 7-8: Good content support with minor gaps
  • 5-6: Partial information with noticeable gaps
  • 3-4: Limited relevant information
  • 1-2: Little to no relevant information

Readable (Result: Pass/Fail)
  • Pass: Content reading level matches target audience (±1 grade level)
  • Fail: Content is too complex or too simple for target audience
Target levels:
  • Elementary: Grades 3-6
  • High School: Grades 7-12
  • College: Grades 13-16
  • General Public: Grades 6-8

Accurate (Ratings: High/Medium/Low)
  • High: Content updated within 6 months
  • Medium: Content updated within 1 year
  • Low: Content older than 1 year or no date found
Date sources checked:
  • Meta tags
  • Schema.org markup
  • Time elements
  • Content patterns
  • HTTP headers

Structured (Ratings: Excellent/Good/Fair/Poor/Very Poor) evaluates:
  • Heading hierarchy (h1-h6)
  • Semantic HTML elements
  • Lists and tables
  • Schema.org markup
  • Open Graph tags

Polished (Ratings: Excellent/Good/Fair/Poor/Very Poor) checks:
  • Grammar and spelling errors
  • Punctuation issues
  • Style and readability
  • Error rate calculation
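To make the weighting concrete, here is a hypothetical sketch in Python. The tool's actual weights are not documented, so the equal weighting and the 0-100 per-metric normalization below are assumptions for illustration only:

# Hypothetical weighting: the real weights are internal to the tool,
# so equal weights and 0-100 normalized metric scores are assumed here.
METRIC_WEIGHTS = {
    "grounded": 0.2,
    "readable": 0.2,
    "accurate": 0.2,
    "structured": 0.2,
    "polished": 0.2,
}

def grasp_score(normalized):
    """Combine per-metric scores (each 0-100) into an overall 0-100 score."""
    return sum(METRIC_WEIGHTS[name] * normalized[name] for name in METRIC_WEIGHTS)

print(grasp_score({"grounded": 85, "readable": 100, "accurate": 60,
                   "structured": 75, "polished": 90}))  # -> 82.0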

Grade Scale

  • A Grade: 90-100 points (Excellent)
  • B Grade: 80-89 points (Good)
  • C Grade: 70-79 points (Fair)
  • D Grade: 60-69 points (Poor)
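A minimal sketch of mapping a 0-100 score onto these grades; the handling of scores below 60 is an assumption, since the scale above stops at D:

def letter_grade(score):
    """Map a 0-100 GRASP score to the letter grades listed above."""
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"  # assumption: the docs do not define a grade below D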

Detailed Evaluations

Grounded Evaluation

The grounded metric uses AI to evaluate content quality by:
  1. Intent Processing: Takes customer intents from configuration
  2. Answer Generation: Uses content to answer each intent
  3. Quality Assessment: Evaluates answer completeness and accuracy
  4. Scoring: Provides 0-10 score based on content support
Configure relevant customer intents in your grasp_config.yaml for the most accurate grounded evaluation.
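As a rough illustration of this loop, the sketch below scores each intent with the OpenAI chat API and averages the results. The prompt wording, model choice, and helper names are illustrative assumptions, not the tool's actual implementation:

from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

def score_intent(content, intent):
    """Ask the model how well the content answers one intent (0-10)."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": (
                f"Using only this content:\n{content}\n\n"
                f"How completely does it answer: {intent!r}? "
                "Reply with a single integer from 0 to 10."
            ),
        }],
    )
    return int(response.choices[0].message.content.strip())

def grounded_score(content, intents):
    """Average the per-intent scores into the 0-10 grounded metric."""
    return sum(score_intent(content, i) for i in intents) / len(intents)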

Readable Evaluation

Reading level assessment using multiple formulas:
  • Flesch-Kincaid Grade Level
  • Gunning Fog Index
  • Coleman-Liau Index
The evaluator averages these scores and compares against your target audience.
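A minimal sketch of that averaging logic using the textstat package, assuming the grade bands listed in the results section above (the tool's exact pass criteria may differ):

import textstat

TARGETS = {
    "elementary": (3, 6),
    "high_school": (7, 12),
    "college": (13, 16),
    "general_public": (6, 8),
}

def readable_check(text, audience):
    """Average three readability formulas and compare to the target band."""
    grades = [
        textstat.flesch_kincaid_grade(text),
        textstat.gunning_fog(text),
        textstat.coleman_liau_index(text),
    ]
    average = sum(grades) / len(grades)
    low, high = TARGETS[audience]
    # Pass if the averaged grade lands in the band, with the +/- 1 tolerance.
    return (low - 1) <= average <= (high + 1)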

Accurate Evaluation

Since only the content creator can really know if the content is accurate, we use freshness as a proxy for accuracy. The idea is that if the content is updated regularly, it’s far more likely to be accurate than stale content. We look at content freshness from multiple sources:
  1. Meta Tags: <meta name="last-modified">, <meta property="article:modified_time">
  2. Schema.org: dateModified, datePublished properties
  3. Time Elements: <time datetime=""> elements
  4. Content Patterns: Date patterns in text content
  5. HTTP Headers: Last-Modified header
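The sketch below illustrates this source-order lookup with BeautifulSoup. It is a simplified assumption of the tool's logic and omits the Schema.org and content-pattern steps for brevity:

from bs4 import BeautifulSoup

def find_last_modified(html, http_headers):
    """Return the first date string found, walking the sources in order."""
    soup = BeautifulSoup(html, "html.parser")
    # 1. Meta tags
    for attrs in ({"name": "last-modified"},
                  {"property": "article:modified_time"}):
        tag = soup.find("meta", attrs=attrs)
        if tag and tag.get("content"):
            return tag["content"]
    # 3. Time elements (steps 2 and 4 omitted for brevity)
    time_element = soup.find("time", attrs={"datetime": True})
    if time_element:
        return time_element["datetime"]
    # 5. HTTP headers
    return http_headers.get("Last-Modified")  # None if no date was found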

Structured Evaluation

HTML structure analysis covering:
  • Heading Hierarchy: Proper h1-h6 usage and nesting
  • Semantic Elements: main, article, section, header, footer, nav
  • Data Structures: Lists, tables with proper markup
  • Schema Markup: JSON-LD, microdata, RDFa
  • Meta Properties: Open Graph, Twitter Cards
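For illustration, the following sketch collects these signals with BeautifulSoup. The tool's scoring rubric is not documented here, so this only shows the kinds of structural features counted, not how they are weighted:

from bs4 import BeautifulSoup

SEMANTIC_TAGS = ["main", "article", "section", "header", "footer", "nav"]

def structure_signals(html):
    """Collect the structural features the evaluator looks at."""
    soup = BeautifulSoup(html, "html.parser")
    return {
        "heading_sequence": [h.name for h in soup.find_all(
            ["h1", "h2", "h3", "h4", "h5", "h6"])],
        "semantic_elements": {t: len(soup.find_all(t)) for t in SEMANTIC_TAGS},
        "lists_and_tables": len(soup.find_all(["ul", "ol", "table"])),
        "json_ld_blocks": len(soup.find_all(
            "script", attrs={"type": "application/ld+json"})),
        "open_graph_tags": len(soup.find_all(
            "meta", attrs={"property": lambda p: p and p.startswith("og:")})),
    }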

Polished Evaluation

Two modes are available: AI-powered (default) and rule-based (fallback). The AI-powered mode uses the OpenAI API for comprehensive analysis:
  • Grammar checking
  • Spelling verification
  • Style assessment
  • Error rate calculation
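As a sketch of how an error rate might map onto the rating scale above (the thresholds here are assumptions; the documentation only names the scale):

def polish_rating(error_count, word_count):
    """Map an error rate (errors per 100 words) onto the rating scale."""
    rate = 100 * error_count / max(word_count, 1)
    if rate < 0.5:
        return "Excellent"
    if rate < 1.0:
        return "Good"
    if rate < 2.0:
        return "Fair"
    if rate < 4.0:
        return "Poor"
    return "Very Poor"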

Dashboard Integration

GRASP evaluation results automatically integrate with the master dashboard:

  • Score Visualization: Interactive charts showing metric breakdown and trends
  • Detailed Analysis: Comprehensive metric explanations and recommendations
  • Historical Tracking: Track improvements over time across evaluations
  • Recommendation Engine: Actionable suggestions for content improvement

Output Files

Results are saved in timestamped directories:
results/
└── 2024-01-15/
    ├── grasp_evaluation_results.json    # Detailed results
    ├── dashboard-data.json               # Dashboard format
    └── grasp_evaluator.log              # Execution log

API Requirements

An OpenAI API key is required for the Grounded and Polished metrics. The tool includes fallback methods, but AI analysis provides the most accurate results.
Rate Limits: The evaluator respects API limits with:
  • Built-in batching for intents
  • Configurable request delays
  • Automatic retry logic
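A generic sketch of the retry behavior described above; the attempt count and delay values are illustrative, not the tool's defaults:

import time

def with_retries(call, max_attempts=3, retry_delay=2.0):
    """Run a zero-argument callable, retrying with a growing delay."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == max_attempts:
                raise
            time.sleep(retry_delay * attempt)  # back off a little more each time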

Troubleshooting

Error: OPENAI_API_KEY not found in environment variables
Solution: Add your API key to the .env file in the tools directory.
Error: Rate limit exceeded
Solutions:
  • Reduce batch_size in configuration
  • Increase retry_delay in API settings
  • Use fewer customer intents
Warning: Content truncated for analysis
Solution: Increase max_content_length in grounded configuration.
Accurate metric: Low (No date found)
Solutions:
  • Add <meta name="last-modified"> tags
  • Include <time> elements with datetime attributes
  • Add schema.org dateModified properties

Advanced Usage

Custom Intents

Define specific customer intents for your business:
grounded:
  intents:
    - "How do I integrate your API?"
    - "What are the pricing tiers?"
    - "How do I troubleshoot connection issues?"
    - "What security measures do you have?"

Multiple Target Audiences

Configure different reading levels:
readable:
  target_audience: "college"  # Options: elementary, high_school, college, graduate, general_public

Custom Freshness Thresholds

Adjust based on your content update frequency:
accurate:
  freshness_thresholds:
    high: 90     # 3 months for fast-moving content
    medium: 180  # 6 months
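For clarity, the thresholds map a content age in days onto a rating roughly like this sketch (assuming ages are measured in days, per the comments above):

def freshness_rating(age_days, high=90, medium=180):
    """Translate a content age in days into a High/Medium/Low rating."""
    if age_days is None:
        return "Low"  # no date found
    if age_days <= high:
        return "High"
    if age_days <= medium:
        return "Medium"
    return "Low"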

Debug Mode

Run with verbose logging:
python graspevaluator.py --url https://example.com --verbose

Best Practices

Content Strategy

  • Align content with specific customer intents
  • Maintain appropriate reading level
  • Update content regularly
  • Use semantic HTML structure

Technical Implementation

  • Add proper meta tags
  • Include schema.org markup
  • Use semantic HTML5 elements
  • Implement proper heading hierarchy

Contributing

This tool is part of the Airbais suite. For contributions:
  1. Follow existing code patterns
  2. Add tests for new features
  3. Update documentation
  4. Ensure dashboard compatibility

Support

For issues and questions, check the troubleshooting section or consult the technical documentation.