ContextWindow
Manages fitting content into a model’s token budget.
Usage
ContextWindow()

Combines token estimation with truncation strategies to ensure prompts and conversation messages stay within a model’s context window.
Parameters
max_tokens: int | None = None -
Explicit token budget. If provided, overrides any model profile lookup.

model: str | ModelProfile | None = None -
A model key (e.g., "ollama:llama3.2:latest") or a ~talk_box.models.ModelProfile instance. The profile’s context_window is used as the budget.

reserve_output: int = 1024 -
Tokens to reserve for the model’s response. Subtracted from the budget before fitting input content. Defaults to 1024.

strategy: str | FitStrategy = FitStrategy.TRUNCATE_OLDEST -
The default strategy for fitting content. Can be overridden per call.

token_counter: Callable[[str], int] | None = None -
Optional custom token counting function. Defaults to estimate_tokens.
Examples
```python
import talk_box as tb

# From a model profile
ctx = tb.ContextWindow(model="ollama:llama3.2:latest")

# With explicit budget
ctx = tb.ContextWindow(max_tokens=8192)

# With custom settings
ctx = tb.ContextWindow(
    max_tokens=32_768,
    reserve_output=4096,
    strategy="truncate_middle",
)
```

Attributes
| Name | Description |
|---|---|
| input_budget | Tokens available for input (max_tokens - reserve_output). |
| max_tokens | Total context window budget in tokens. |
| reserve_output | Tokens reserved for the model’s response. |
input_budget
Tokens available for input (max_tokens - reserve_output).
input_budget: int
max_tokens
Total context window budget in tokens.
max_tokens: int
reserve_output
Tokens reserved for the model’s response.
reserve_output: int
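As a quick check of the attribute arithmetic above, input_budget is simply max_tokens minus reserve_output. A minimal standalone sketch (plain Python, not talk_box internals):

```python
# Budget arithmetic described in the Attributes table:
# input_budget = max_tokens - reserve_output
max_tokens = 32_768      # total context window
reserve_output = 4_096   # tokens held back for the model's response

input_budget = max_tokens - reserve_output
print(input_budget)  # 28672
```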
Methods
| Name | Description |
|---|---|
| count_tokens() | Count tokens in text using the configured counter. |
| fit_messages() | Fit a list of conversation messages into the token budget. |
| fit_prompt() | Fit a PromptBuilder’s output into the token budget. |
| fits() | Check whether text fits within the input budget. |
| overflow() | Calculate how many tokens over budget the text is. |
count_tokens()
Count tokens in text using the configured counter.
Usage
count_tokens(text)

Parameters
text: str - The text to count tokens for.

Returns
int - Token count.
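The default counter, estimate_tokens, is not specified on this page. A common rough heuristic (an assumption for illustration, not talk_box’s actual implementation) approximates one token per four characters:

```python
def rough_token_estimate(text: str) -> int:
    """Hypothetical stand-in for a default token counter:
    the common heuristic of roughly 4 characters per token."""
    if not text:
        return 0
    return max(1, len(text) // 4)

print(rough_token_estimate("Tell me about Python."))  # 21 chars -> 5
```

Any callable with this str -> int signature can be passed as token_counter to override the default.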
fit_messages()
Fit a list of conversation messages into the token budget.
Usage
fit_messages(messages, *, system_prompt="", strategy=None)

The system prompt is always preserved. Messages are dropped according to the selected strategy until they fit within the remaining budget.
Parameters
messages: list[dict[str, str] | Message] -
List of message dicts (with "role" and "content" keys) or ~talk_box.conversation.Message objects.

system_prompt: str = "" -
The system prompt text (always preserved in full).

strategy: str | FitStrategy | None = None -
Override the default strategy for this call.
Returns
FitResult - The fitted messages plus metadata about what was dropped.
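The truncate-oldest behavior described above can be sketched in plain Python (a simplified illustration, not talk_box’s implementation; token counts here use a hypothetical one-token-per-word counter):

```python
def count(text: str) -> int:
    # Hypothetical counter: one token per whitespace-separated word.
    return len(text.split())

def fit_oldest(messages: list[dict[str, str]], budget: int) -> tuple[list[dict[str, str]], int]:
    """Drop the oldest messages until the rest fit; returns (kept, dropped)."""
    kept = list(messages)
    dropped = 0
    while kept and sum(count(m["content"]) for m in kept) > budget:
        kept.pop(0)   # TRUNCATE_OLDEST: discard from the front
        dropped += 1
    return kept, dropped

msgs = [
    {"role": "user", "content": "one two three four"},   # 4 tokens
    {"role": "assistant", "content": "five six"},        # 2 tokens
    {"role": "user", "content": "seven eight nine"},     # 3 tokens
]
kept, dropped = fit_oldest(msgs, budget=6)
print(dropped)  # 1 -> the oldest 4-token message goes; 2 + 3 = 5 fits
```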
Examples
```python
import talk_box as tb

ctx = tb.ContextWindow(max_tokens=4096, reserve_output=512)
messages = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi there! How can I help?"},
    {"role": "user", "content": "Tell me about Python."},
]
result = ctx.fit_messages(messages, system_prompt="You are helpful.")
print(f"Using {result.tokens_used}/{result.token_budget} tokens")
print(f"Dropped {result.messages_dropped} messages")
```

fit_prompt()
Fit a PromptBuilder’s output into the token budget.
Usage
fit_prompt(builder)

If the full prompt fits, returns it unchanged. Otherwise, drops lowest-priority sections until it fits.
Parameters
builder: PromptBuilder -
A ~talk_box.prompt_builder.PromptBuilder instance.
Returns
PromptFitResult - The fitted prompt text and metadata.
Examples
```python
import talk_box as tb

builder = (
    tb.PromptBuilder()
    .persona("analyst", "data science")
    .task_context("Analyze sales data")
    .constraint("Be concise")
    .example("Q: Revenue?", "A: $1.2M")
)
ctx = tb.ContextWindow(max_tokens=2048)
result = ctx.fit_prompt(builder)
print(f"Prompt uses {result.tokens_used} tokens")
```

fits()
Check whether text fits within the input budget.
Usage
fits(text)

Parameters
text: str - Text to check.

Returns
bool - True if token count is within budget.
overflow()
Calculate how many tokens over budget the text is.
Usage
overflow(text)

Parameters
text: str - Text to check.

Returns
int - Number of tokens over budget (0 if within budget).
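The relationship between fits() and overflow() can be illustrated with a standalone sketch (hypothetical one-token-per-word counter and an assumed budget; not talk_box’s internals):

```python
def count(text: str) -> int:
    # Hypothetical counter: one token per whitespace-separated word.
    return len(text.split())

INPUT_BUDGET = 5  # assumed input budget for illustration

def overflow(text: str) -> int:
    # Tokens over budget, clamped at zero when within budget.
    return max(0, count(text) - INPUT_BUDGET)

def fits(text: str) -> bool:
    # Text fits exactly when there is no overflow.
    return overflow(text) == 0

print(overflow("a b c d e f g"))  # 7 tokens -> 2 over budget
print(fits("a b c"))              # True
```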