EvalResult
Results for a single query evaluated against a single bot variant.
Usage
EvalResult()Parameters
variant: str-
Name of the bot variant.
query: str-
The query that was sent.
response: str-
The bot’s response.
scores: list[EvalScore] = list()-
List of dimension scores from the judge.
duration: float = 0.0- Time in seconds to generate the response.
Attributes
| Name | Description |
|---|---|
| avg_score | Average score across all dimensions. |
avg_score
Average score across all dimensions.
avg_score: float
Methods
| Name | Description |
|---|---|
| score_for() | Get score for a specific dimension, or None if not scored. |
score_for()
Get score for a specific dimension, or None if not scored.
Usage
score_for(dimension)