
Bench AF Documentation

The Alignment Faking Benchmark (Bench AF) is an evaluation framework for testing model organisms.

Key Features

Model Evaluation

Test alignment-faking and aligned models across various environments.

Configurable Testing

Flexible parameters for sample size, concurrency, and model types.

Comprehensive Analysis

Detailed logging, metrics, and visualization tools for results.

Documentation Structure

Getting Started

  • System overview and architecture
  • Installation and setup
  • Basic usage examples

Advanced Usage

  • Model validation details
  • Environment configuration
  • Results analysis and interpretation