EvalResult

Results for a single query evaluated against a single bot variant.

Usage

Source

EvalResult()

Parameters

variant: str

Name of the bot variant.

query: str

The query that was sent.

response: str

The bot’s response.

scores: list[EvalScore] = list()

List of dimension scores from the judge.

duration: float = 0.0
Time in seconds to generate the response.

Attributes

Name Description
avg_score Average score across all dimensions.

avg_score

Average score across all dimensions.

avg_score: float

Methods

Name Description
score_for() Get score for a specific dimension, or None if not scored.

score_for()

Get score for a specific dimension, or None if not scored.

Usage

Source

score_for(dimension)