Judge LLM

A lightweight, extensible framework for evaluating and comparing LLM providers

See It In Action

Judge LLM Demo

CLI Usage

# Run evaluation
judge-llm run --config config.yaml

# List providers
judge-llm list providers

# Generate dashboard
judge-llm dashboard --db results.db
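
The same commands can be driven from a script. The following is a minimal sketch, not part of Judge LLM itself: the helper names `build_command` and `run_cli` are hypothetical, and the only subcommands and flags it relies on are the ones listed above. It assumes the `judge-llm` executable is on your PATH.

```python
import shutil
import subprocess


def build_command(*args: str) -> list[str]:
    """Return the argv list for a judge-llm invocation (hypothetical helper)."""
    return ["judge-llm", *args]


def run_cli(*args: str) -> subprocess.CompletedProcess:
    """Run judge-llm with the given arguments and raise if it exits non-zero."""
    # Fail early with a clear message when the CLI is not installed.
    if shutil.which("judge-llm") is None:
        raise FileNotFoundError("judge-llm executable not found on PATH")
    return subprocess.run(
        build_command(*args),
        check=True,           # raise CalledProcessError on a non-zero exit code
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode output as str rather than bytes
    )
```

For example, `run_cli("run", "--config", "config.yaml")` would mirror the first command above, and `run_cli("dashboard", "--db", "results.db")` the last.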

Python API

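from judge_llm import evaluate

# Run the evaluation described in config.yaml
report = evaluate(
    config="config.yaml"
)

# Print the success rate as a percentage with one decimal place
print(f"Success: {report.success_rate:.1%}")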
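
One possible use of the report is to gate a CI job on the success rate. The sketch below makes two assumptions: that `success_rate` is a fraction between 0 and 1 (which the `.1%` format above suggests), and that a plain attribute lookup is all that is needed. The `meets_threshold` helper and the 0.9 default are hypothetical, and a stand-in object replaces the real report so the example runs without Judge LLM installed.

```python
from types import SimpleNamespace


def meets_threshold(report, minimum: float = 0.9) -> bool:
    """Return True when report.success_rate is at least `minimum`."""
    return report.success_rate >= minimum


# Stand-in for the object returned by evaluate(); a real run would supply it.
fake_report = SimpleNamespace(success_rate=0.875)

print(meets_threshold(fake_report))       # False with the default 0.9 minimum
print(meets_threshold(fake_report, 0.8))  # True
```

In a real script you could pass the result of `evaluate(config="config.yaml")` instead of the stand-in and exit with a non-zero status when the check fails.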