consensus

Consensus mode: query multiple models, compare outputs, flag disagreements.

Classes

Name               Description
ConsensusResult    Result of running consensus across multiple model responses.
ConsensusStrategy  Strategy for resolving consensus across multiple model responses.
Disagreement       A detected disagreement between model responses.
ModelResponse      A single model’s response to a prompt.

ConsensusResult

Result of running consensus across multiple model responses.

Usage

ConsensusResult(
    winner,
    winner_model,
    agreement_score,
    strategy,
    responses=list(),
    disagreements=list(),
    consensus_reached=True
)
Parameters
winner: str

The winning/selected response text.

winner_model: str

The model that produced the winning response.

agreement_score: float

Overall agreement score from 0.0 (complete disagreement) to 1.0 (unanimous).

strategy: ConsensusStrategy

The consensus strategy that was used.

responses: list[ModelResponse] = list()

All individual model responses that were compared.

disagreements: list[Disagreement] = list()

Detected disagreements between responses.

consensus_reached: bool = True

Whether consensus was reached (relevant for the UNANIMOUS strategy).
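
The field list above can be mirrored with a small stand-in to show how a result might be read. This is a minimal sketch: `ResultSketch` and `summarize` are hypothetical names, and treating the `list()` defaults as a fresh empty list per instance (a dataclass `default_factory`) is an assumption, not something this page confirms about `ConsensusResult`.

```python
from dataclasses import dataclass, field


@dataclass
class ResultSketch:
    """Hypothetical stand-in that mirrors the documented ConsensusResult fields."""

    winner: str
    winner_model: str
    agreement_score: float
    strategy: str  # the documented field holds a ConsensusStrategy member
    responses: list = field(default_factory=list)      # assumed: fresh list per instance
    disagreements: list = field(default_factory=list)  # assumed: fresh list per instance
    consensus_reached: bool = True


def summarize(result) -> str:
    """Build a one-line summary from the documented attributes."""
    status = "consensus" if result.consensus_reached else "no consensus"
    return (
        f"{status}: {result.winner_model} "
        f"(agreement {result.agreement_score:.2f}, "
        f"{len(result.disagreements)} disagreement(s))"
    )


a = ResultSketch("Paris", "model_a", 0.9, "majority")
b = ResultSketch("Rome", "model_b", 0.4, "unanimous", consensus_reached=False)
a.disagreements.append("placeholder")  # must not leak into b.disagreements
```

Because `summarize` only reads attributes, the same function should work on any object that exposes the documented fields.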

ConsensusStrategy

Strategy for resolving consensus across multiple model responses.

Usage

ConsensusStrategy()
Attributes
MAJORITY

The response most similar to the majority wins.

UNANIMOUS

All responses must substantially agree; otherwise consensus fails.

MOST_COMMON

Select the most frequently occurring response (exact or near-match).

WEIGHTED

Responses are weighted by model quality/cost tier (higher tier = more weight).
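
To make the difference between MOST_COMMON and WEIGHTED concrete, here is a rough sketch of both selection rules on plain strings. The helpers `normalize`, `most_common`, and `weighted` are hypothetical, and lowercasing plus whitespace collapsing is an assumed stand-in for the "exact or near-match" comparison; the library may match responses differently.

```python
from collections import Counter, defaultdict


def normalize(text: str) -> str:
    """Assumed near-match rule: ignore case and repeated whitespace."""
    return " ".join(text.lower().split())


def most_common(texts):
    """MOST_COMMON-style pick: every response counts once."""
    counts = Counter(normalize(t) for t in texts)
    return counts.most_common(1)[0][0]


def weighted(pairs):
    """WEIGHTED-style pick: each response adds its weight to its answer."""
    totals = defaultdict(float)
    for text, weight in pairs:
        totals[normalize(text)] += weight
    return max(totals, key=totals.get)


votes = ["Yes", "yes ", "No"]
by_count = most_common(votes)                                       # "yes"
by_weight = weighted([("Yes", 1.0), ("yes ", 1.0), ("No", 3.0)])  # "no"
```

The same three answers produce different winners: two light votes win on count, while one heavily weighted vote wins under weighting.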

Disagreement

A detected disagreement between model responses.

Usage

Disagreement(description, models, severity="moderate")
Parameters
description: str

Human-readable description of the disagreement.

models: tuple[str, ...]

The models involved in the disagreement.

severity: str = "moderate"

Severity level: "minor" (stylistic), "moderate" (factual nuance), or "major" (contradictory claims).
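
Since severity is a plain string, callers that want to act on it need their own ordering. The sketch below shows one way to rank the three documented levels and keep only disagreements at or above a chosen level. `DisagreementSketch`, `SEVERITY_RANK`, `worst_severity`, and `needs_review` are hypothetical names used for illustration, not part of the library.

```python
from dataclasses import dataclass

# Rank the three documented severity strings from least to most serious.
SEVERITY_RANK = {"minor": 0, "moderate": 1, "major": 2}


@dataclass(frozen=True)
class DisagreementSketch:
    """Hypothetical stand-in that mirrors the documented Disagreement fields."""

    description: str
    models: tuple
    severity: str = "moderate"


def worst_severity(items):
    """Return the most serious severity present, or None for an empty list."""
    if not items:
        return None
    return max(items, key=lambda d: SEVERITY_RANK[d.severity]).severity


def needs_review(items, at_least="major"):
    """Keep only disagreements at or above the given severity."""
    floor = SEVERITY_RANK[at_least]
    return [d for d in items if SEVERITY_RANK[d.severity] >= floor]


items = [
    DisagreementSketch("different phrasing", ("model_a", "model_b"), "minor"),
    DisagreementSketch("different emphasis", ("model_a", "model_c")),
    DisagreementSketch("conflicting dates", ("model_b", "model_c"), "major"),
]
```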

ModelResponse

A single model’s response to a prompt.

Usage

ModelResponse(model, text, latency_ms=None, token_count=None, weight=1.0)
Parameters
model: str

Model identifier (e.g., "anthropic:claude-sonnet-4-6").

text: str

The response text from the model.

latency_ms: float | None = None

Response latency in milliseconds, if measured.

token_count: int | None = None

Number of tokens in the response, if known.

weight: float = 1.0

Weight applied to this response under the WEIGHTED strategy (default 1.0).
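
The `latency_ms` field is only populated "if measured", so the caller has to time the model call. A minimal sketch of one way to do that follows. `timed_call` and `fake_model` are hypothetical, and the resulting dictionary simply uses the documented parameter names so it could be unpacked into `ModelResponse(**fields)`.

```python
import time


def timed_call(fn, prompt):
    """Call fn(prompt) and return (text, elapsed wall-clock time in ms)."""
    start = time.perf_counter()
    text = fn(prompt)
    latency_ms = (time.perf_counter() - start) * 1000.0
    return text, latency_ms


def fake_model(prompt):
    """Hypothetical stand-in for a real model call."""
    return prompt.upper()


text, latency_ms = timed_call(fake_model, "hello")

# Keyword arguments named after the documented ModelResponse parameters.
fields = {
    "model": "model_a",
    "text": text,
    "latency_ms": latency_ms,
    "token_count": None,  # left unset because no tokenizer is used here
    "weight": 1.0,
}
```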

Functions

Name                  Description
consensus()           Determine consensus across multiple model responses.
find_disagreements()  Detect disagreements between model responses.

consensus()

Determine consensus across multiple model responses.

Usage

consensus(
    responses, *, strategy=ConsensusStrategy.MAJORITY, unanimous_threshold=0.7
)

Compares the provided responses, selects a winner according to the chosen strategy, and reports any disagreements it detects.

Parameters
responses: list[ModelResponse]

List of model responses to compare. Must contain at least one response.

strategy: ConsensusStrategy = ConsensusStrategy.MAJORITY

The consensus strategy to use.

unanimous_threshold: float = 0.7

For the UNANIMOUS strategy, the minimum pairwise similarity required for consensus to be reached (default 0.7).
Returns
ConsensusResult
The consensus outcome including winner, agreement score, and disagreements.
Raises
ValueError
If responses is empty.
Examples
import talk_box as tb

responses = [
    tb.ModelResponse(model="anthropic:claude-sonnet-4-6", text="Python is a programming language."),
    tb.ModelResponse(model="openai:gpt-4o", text="Python is a high-level programming language."),
    tb.ModelResponse(model="google:gemini-2.5-flash", text="Python is an interpreted programming language."),
]

result = tb.consensus(responses, strategy=tb.ConsensusStrategy.MAJORITY)
result.winner           # "Python is a high-level programming language."
result.agreement_score  # ~0.75
result.consensus_reached  # True
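
For intuition about how a similarity-based winner and a pairwise threshold could interact, the following sketch re-implements the two ideas on plain strings. It is illustrative only: Jaccard overlap of word sets is an assumed similarity measure, and `jaccard`, `pick_majority`, and `is_unanimous` are hypothetical helpers, so results for a given input may differ from what `tb.consensus()` returns.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Assumed similarity: shared words divided by all distinct words."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa and not wb:
        return 1.0
    return len(wa & wb) / len(wa | wb)


def pick_majority(texts):
    """Return the index of the text with the highest mean similarity to the rest."""
    if not texts:
        raise ValueError("responses must not be empty")
    if len(texts) == 1:
        return 0

    def mean_similarity(i):
        others = [jaccard(texts[i], t) for j, t in enumerate(texts) if j != i]
        return sum(others) / len(others)

    return max(range(len(texts)), key=mean_similarity)


def is_unanimous(texts, threshold=0.7):
    """True when every pair of texts meets the similarity threshold."""
    return all(jaccard(a, b) >= threshold for a, b in combinations(texts, 2))


texts = ["the sky is blue", "the sky is blue today", "grass is green"]
winner_index = pick_majority(texts)   # 0: closest on average to the others
all_agree = is_unanimous(texts)       # False: the third text is an outlier
```

Under this sketch a single outlier is enough to break unanimity, while the majority-style pick still returns a winner.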

find_disagreements()

Detect disagreements between model responses.

Usage

find_disagreements(responses)

Compares each pair of responses and flags significant differences. Uses word-level similarity to classify severity.

Parameters
responses: list[ModelResponse]
List of model responses to compare.
Returns
list[Disagreement]
Detected disagreements, sorted by severity (major first).
Examples
import talk_box as tb

responses = [
    tb.ModelResponse(model="model_a", text="Python was created in 1991."),
    tb.ModelResponse(model="model_b", text="Python was created in 1989."),
]

disagreements = tb.find_disagreements(responses)
disagreements[0].severity  # "major"
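
The pairwise scan described above can be sketched with `itertools.combinations`. Everything in this block is an assumption made for illustration: `word_overlap`, `classify`, and `find_disagreements_sketch` are hypothetical, and the similarity cut-offs are invented, so this will not reproduce the library's classifications.

```python
from itertools import combinations

# Sort key so that "major" comes first, matching the documented return order.
RANK = {"major": 0, "moderate": 1, "minor": 2}


def word_overlap(a: str, b: str) -> float:
    """Assumed word-level similarity: shared words over all distinct words."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if (wa | wb) else 1.0


def classify(similarity: float):
    """Map a similarity to a severity using invented cut-offs."""
    if similarity < 0.3:
        return "major"
    if similarity < 0.6:
        return "moderate"
    if similarity < 0.9:
        return "minor"
    return None  # similar enough that nothing is flagged


def find_disagreements_sketch(named_texts):
    """Compare every pair of (model, text) entries and sort major first."""
    found = []
    for (model_a, text_a), (model_b, text_b) in combinations(named_texts, 2):
        severity = classify(word_overlap(text_a, text_b))
        if severity is not None:
            found.append({"models": (model_a, model_b), "severity": severity})
    return sorted(found, key=lambda d: RANK[d["severity"]])


found = find_disagreements_sketch([
    ("m1", "the sky is blue"),
    ("m2", "the sky is blue today"),
    ("m3", "grass is green"),
])
```

With n responses the scan covers n(n-1)/2 pairs, so three responses yield at most three disagreements.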