Default Configurations

Learn how to use default configuration files to define reusable settings and custom component registrations.

Overview

Default configuration files allow you to:

  • Define common settings once, use everywhere
  • Register custom components globally
  • Reduce duplication across test configs
  • Maintain consistent configuration across projects
  • Share team-wide defaults

Configuration Hierarchy

Judge LLM merges configuration from three sources, listed from lowest to highest precedence:

  1. Global defaults (~/.judge_llm/defaults.yaml) - User-wide settings
  2. Project defaults (.judge_llm.defaults.yaml) - Project-specific settings
  3. Test config (test.yaml) - Test-specific settings

Values in later files override earlier ones.
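The layering can be pictured as a recursive dictionary merge. The sketch below is illustrative only (it is not Judge LLM's actual merge code, and the `deep_merge` helper and sample keys are invented for the example): scalars in later layers override earlier ones, while nested mappings merge key by key.

```python
# Illustrative sketch of layered config merging -- not Judge LLM's
# actual implementation. Scalars in later layers override earlier
# ones; nested mappings merge key by key.
def deep_merge(base: dict, override: dict) -> dict:
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

global_defaults  = {"provider": {"type": "gemini", "temperature": 0.5}}
project_defaults = {"provider": {"temperature": 0.0, "max_tokens": 1024}}
test_config      = {"provider": {"agent_id": "my_test"}}

merged = deep_merge(deep_merge(global_defaults, project_defaults), test_config)
print(merged["provider"])
# {'type': 'gemini', 'temperature': 0.0, 'max_tokens': 1024, 'agent_id': 'my_test'}
```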

Quick Start

1. Create Default Config

Create .judge_llm.defaults.yaml in your project root:

# .judge_llm.defaults.yaml
providers:
  - type: gemini
    model: gemini-2.0-flash-exp
    temperature: 0.0

evaluators:
  - type: response_evaluator
    llm_provider: gemini
  - type: cost_evaluator
    max_cost: 0.05

reporters:
  - type: console

2. Create Simple Test Config

Your test configs become much simpler:

# test.yaml
dataset:
  loader: local_file
  paths:
    - ./tests.json

providers:
  - agent_id: my_test  # Other settings come from defaults

3. Run Evaluation

judge-llm run --config test.yaml

The configuration is merged automatically!

Project Defaults

Place .judge_llm.defaults.yaml in your project root.

Basic Defaults

# .judge_llm.defaults.yaml

# Default provider settings
providers:
  - type: gemini
    model: gemini-2.0-flash-exp
    temperature: 0.0
    max_tokens: 1024

# Default evaluators
evaluators:
  - type: response_evaluator
    llm_provider: gemini
  - type: cost_evaluator
    max_cost: 0.05
  - type: latency_evaluator
    max_latency: 5.0

# Default reporters
reporters:
  - type: console
  - type: json
    output_path: ./results/latest.json

Using Defaults in Tests

# test.yaml
dataset:
  loader: local_file
  paths: [./tests.json]

providers:
  - agent_id: test_agent  # Inherits type, model, temperature from defaults

Merged result:

dataset:
  loader: local_file
  paths: [./tests.json]

providers:
  - type: gemini                 # From defaults
    model: gemini-2.0-flash-exp  # From defaults
    temperature: 0.0             # From defaults
    max_tokens: 1024             # From defaults
    agent_id: test_agent         # From test config

evaluators:                      # From defaults
  - type: response_evaluator
    llm_provider: gemini
  - type: cost_evaluator
    max_cost: 0.05
  - type: latency_evaluator
    max_latency: 5.0

reporters:                       # From defaults
  - type: console
  - type: json
    output_path: ./results/latest.json
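The provider entry above merges field-wise: the test config's keys win, and every other key falls back to the defaults. Under that assumed semantics (matching the merged result shown in this guide), the merge of a single entry behaves like a dictionary union:

```python
# Sketch of the provider-entry merge shown above (assumed semantics:
# keys from the test entry win; everything else comes from defaults).
default_provider = {
    "type": "gemini",
    "model": "gemini-2.0-flash-exp",
    "temperature": 0.0,
    "max_tokens": 1024,
}
test_provider = {"agent_id": "test_agent"}

merged_provider = {**default_provider, **test_provider}
print(merged_provider)
# {'type': 'gemini', 'model': 'gemini-2.0-flash-exp', 'temperature': 0.0,
#  'max_tokens': 1024, 'agent_id': 'test_agent'}
```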

Global Defaults

Place defaults.yaml in ~/.judge_llm/ for user-wide settings.

Setup

mkdir -p ~/.judge_llm
vim ~/.judge_llm/defaults.yaml

Example Global Defaults

# ~/.judge_llm/defaults.yaml

# API keys (if not using .env)
providers:
  - type: gemini
    api_key: ${GEMINI_API_KEY}
    temperature: 0.0
  - type: openai
    api_key: ${OPENAI_API_KEY}
    temperature: 0.0
  - type: anthropic
    api_key: ${ANTHROPIC_API_KEY}
    temperature: 0.0

# Always use console output
reporters:
  - type: console

Overriding Defaults

Test configs can override any default value.

Override Provider Model

# .judge_llm.defaults.yaml
providers:
  - type: gemini
    model: gemini-2.0-flash-exp
    temperature: 0.0

# test.yaml
providers:
  - agent_id: test
    model: gemini-pro  # Override the default model

Override Evaluators

# .judge_llm.defaults.yaml
evaluators:
  - type: cost_evaluator
    max_cost: 0.05

# test.yaml
evaluators:
  - type: cost_evaluator
    max_cost: 0.01  # Stricter cost limit for this test

Add to Defaults

# .judge_llm.defaults.yaml
reporters:
  - type: console

# test.yaml
reporters:
  - type: console
  - type: html  # Add an HTML reporter on top of the defaults
    output_path: ./report.html

Custom Component Registration

Register custom components in defaults to use them by name across all tests.

Registering Custom Providers

# .judge_llm.defaults.yaml
providers:
  - type: custom
    module_path: ./providers/my_provider.py
    class_name: MyCustomProvider
    register_as: my_provider  # ← Register globally

Use by name:

# test.yaml
providers:
  - type: my_provider  # ← Use by name
    agent_id: test

Registering Custom Evaluators

# .judge_llm.defaults.yaml
evaluators:
  - type: custom
    module_path: ./evaluators/safety.py
    class_name: SafetyEvaluator
    register_as: safety

Use by name:

# test.yaml
evaluators:
  - type: safety
  - type: response_evaluator

Registering Custom Reporters

# .judge_llm.defaults.yaml
reporters:
  - type: custom
    module_path: ./reporters/slack.py
    class_name: SlackReporter
    register_as: slack

Use by name:

# test.yaml
reporters:
  - type: slack
    webhook_url: ${SLACK_WEBHOOK_URL}

Complete Registration Example

# .judge_llm.defaults.yaml

# Register a custom provider
providers:
  - type: custom
    module_path: ./providers/custom_provider.py
    class_name: CustomProvider
    register_as: custom_provider

# Register custom evaluators
evaluators:
  - type: custom
    module_path: ./evaluators/safety.py
    class_name: SafetyEvaluator
    register_as: safety

  - type: custom
    module_path: ./evaluators/tone.py
    class_name: ToneEvaluator
    register_as: tone

# Register custom reporters
reporters:
  - type: custom
    module_path: ./reporters/csv_reporter.py
    class_name: CSVReporter
    register_as: csv

  - type: custom
    module_path: ./reporters/slack_reporter.py
    class_name: SlackReporter
    register_as: slack

Use everywhere:

# test.yaml
dataset:
  loader: local_file
  paths: [./tests.json]

providers:
  - type: custom_provider  # Registered name
    agent_id: test

evaluators:
  - type: safety  # Registered name
  - type: tone    # Registered name

reporters:
  - type: csv    # Registered name
    output_path: ./results.csv
  - type: slack  # Registered name
    webhook_url: ${SLACK_WEBHOOK_URL}
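Under the hood, register_as amounts to loading a class from a file and storing it under a short name. The following is a hypothetical sketch of such a registry (the `register_component` and `resolve` helpers are invented for illustration; Judge LLM's real mechanism may differ):

```python
# Hypothetical sketch of a register_as-style registry -- not Judge LLM's
# actual implementation. A class is loaded from a module path and stored
# under a short name that test configs can later reference as `type:`.
import importlib.util

REGISTRY = {}

def register_component(name, module_path, class_name):
    # Load the module directly from its file path and keep the class.
    spec = importlib.util.spec_from_file_location(name, module_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    REGISTRY[name] = getattr(module, class_name)

def resolve(type_name):
    # A test config entry like `type: safety` would resolve here.
    return REGISTRY[type_name]
```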

Environment-Specific Defaults

Development Defaults

# .judge_llm.defaults.yaml (development)
providers:
  - type: gemini
    model: gemini-2.0-flash-exp
    temperature: 0.0

evaluators:
  - type: response_evaluator
  - type: cost_evaluator
    max_cost: 0.1  # More lenient for dev

reporters:
  - type: console

Production Defaults

# .judge_llm.defaults.yaml (production)
providers:
  - type: gemini
    model: gemini-2.0-flash-exp
    temperature: 0.0

evaluators:
  - type: response_evaluator
  - type: trajectory_evaluator
  - type: cost_evaluator
    max_cost: 0.01  # Stricter for prod
  - type: latency_evaluator
    max_latency: 3.0

reporters:
  - type: console
  - type: database
    db_path: ./prod_results.db
  - type: json
    output_path: ./results/prod-${date}.json

Managing Multiple Environments

# Use different default files
cp .judge_llm.defaults.dev.yaml .judge_llm.defaults.yaml
judge-llm run --config test.yaml

cp .judge_llm.defaults.prod.yaml .judge_llm.defaults.yaml
judge-llm run --config test.yaml

Or use an environment variable:

# Set environment
export JUDGE_LLM_ENV=production

# Load environment-specific defaults in Python
import os

env = os.getenv('JUDGE_LLM_ENV', 'development')
defaults_file = f'.judge_llm.defaults.{env}.yaml'

# Your evaluation code...

Best Practices

1. Keep Defaults Generic

# Good - generic defaults
providers:
  - type: gemini
    model: gemini-2.0-flash-exp
    temperature: 0.0

# Bad - test-specific values in defaults
providers:
  - type: gemini
    agent_id: specific_test_agent  # Too specific

2. Use Test Configs for Specifics

# .judge_llm.defaults.yaml - generic
providers:
  - type: gemini
    model: gemini-2.0-flash-exp

# test.yaml - specific
providers:
  - agent_id: math_test_agent
  - agent_id: language_test_agent

3. Document Your Defaults

# .judge_llm.defaults.yaml

# Default provider configuration
# Uses Gemini Flash for cost efficiency
providers:
  - type: gemini
    model: gemini-2.0-flash-exp
    temperature: 0.0  # Deterministic for testing

# Standard evaluation criteria
evaluators:
  - type: response_evaluator
  - type: cost_evaluator
    max_cost: 0.05  # Maximum $0.05 per test case

# Always output to console for immediate feedback
reporters:
  - type: console

4. Version Control Defaults

git add .judge_llm.defaults.yaml
git commit -m "Add project defaults"

5. Separate Custom Components

project/
├── .judge_llm.defaults.yaml   # References custom components
├── providers/
│   └── custom_provider.py
├── evaluators/
│   ├── safety.py
│   └── tone.py
└── reporters/
    ├── csv_reporter.py
    └── slack_reporter.py

Common Patterns

Shared Team Defaults

# .judge_llm.defaults.yaml (checked into git)

# Team-wide provider settings
providers:
  - type: gemini
    model: gemini-2.0-flash-exp
    temperature: 0.0
    api_key: ${GEMINI_API_KEY}  # Each dev sets their own

# Consistent evaluation criteria
evaluators:
  - type: response_evaluator
  - type: cost_evaluator
    max_cost: 0.05

# Standard reporting
reporters:
  - type: console
  - type: database
    db_path: ${DB_PATH:-./results.db}
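`${DB_PATH:-./results.db}` reads like shell parameter expansion: use the environment variable if set, otherwise fall back to the default after `:-`. A sketch of that assumed behaviour (verify against your Judge LLM version; the `expand` helper is invented for illustration):

```python
# Sketch of shell-style ${VAR} / ${VAR:-default} expansion (assumed
# semantics, not necessarily Judge LLM's exact placeholder rules).
import os
import re

def expand(value):
    def repl(match):
        name, _, default = match.group(1).partition(":-")
        return os.environ.get(name, default)
    return re.sub(r"\$\{([^}]+)\}", repl, value)

# Falls back to ./results.db when DB_PATH is unset.
print(expand("${DB_PATH:-./results.db}"))
```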

Personal Global Defaults

# ~/.judge_llm/defaults.yaml (personal machine)

# Personal API keys
providers:
  - type: gemini
    api_key: ${GEMINI_API_KEY}
  - type: openai
    api_key: ${OPENAI_API_KEY}

# Always include console output
reporters:
  - type: console

CI/CD Defaults

# .judge_llm.defaults.yaml (for CI/CD)

providers:
  - type: gemini
    model: gemini-2.0-flash-exp
    temperature: 0.0
    api_key: ${GEMINI_API_KEY}  # From CI secrets

evaluators:
  - type: response_evaluator
  - type: cost_evaluator
    max_cost: 0.01
  - type: latency_evaluator
    max_latency: 5.0

reporters:
  - type: console
  - type: json
    output_path: ./ci_results.json
  - type: html
    output_path: ./ci_report.html

Troubleshooting

Defaults Not Loading

Issue: The defaults file exists but is not being applied

Solutions:

  1. Check filename: .judge_llm.defaults.yaml (note the leading dot)
  2. Verify file location (project root or ~/.judge_llm/)
  3. Check YAML syntax: yamllint .judge_llm.defaults.yaml
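A quick stdlib-only check that a defaults file actually exists in the two locations this guide describes (project root and ~/.judge_llm/):

```python
# Check the two defaults-file locations described in this guide.
from pathlib import Path

candidates = [
    Path(".judge_llm.defaults.yaml"),          # project root
    Path.home() / ".judge_llm" / "defaults.yaml",  # user-wide
]
for path in candidates:
    status = "found" if path.is_file() else "missing"
    print(f"{status}: {path}")
```

Run it from your project root so the relative path resolves correctly.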

Unexpected Values

Issue: Getting unexpected configuration values

Solution: Remember precedence order:

  1. Global defaults (lowest precedence)
  2. Project defaults
  3. Test config (highest precedence)

Check each file to see where the value is defined.

Custom Component Not Found

Issue: Module not found: ./providers/custom.py

Solutions:

  1. Verify module_path is correct relative to project root
  2. Ensure file exists: ls ./providers/custom.py
  3. Check Python path if using absolute imports

Registration Not Working

Issue: Custom component not available by registered name

Solutions:

  1. Ensure the register_as field is present on the custom component entry
  2. Check that the registration lives in a defaults file that is actually loaded
  3. Verify the name used in the test config matches register_as exactly