autotest_avoid_topics
Comprehensive avoid topics testing with automated violation detection.
USAGE
autotest_avoid_topics(
    target_bot,
    test_intensity='medium',
    max_conversations=None,
    judge_model=None,
    verbose=False,
)
This function runs adversarial testing using QuestionProducerBot prompts and automatically evaluates responses using the enhanced JudgeBot to detect violations. It combines prompt generation, conversation testing, and violation analysis into a single, easy-to-use interface for comprehensive compliance validation.
Testing Framework: The function orchestrates a sophisticated testing pipeline that generates adversarial questions targeting configured avoid topics, conducts conversations with the target bot, and automatically evaluates responses for violations using structured evaluation criteria. This provides automated compliance testing with detailed analysis.
Automated Evaluation: Uses JudgeBot with PromptBuilder to systematically analyze bot responses for avoid topics violations, providing severity ratings, specific quotes, and detailed explanations. The evaluation applies the same criteria to every response, reducing the subjectivity of manual compliance review.
Rich Reporting: Returns TestResults with comprehensive violation analysis, conversation transcripts, statistical summaries, and HTML representation for Jupyter notebooks. Results include export capabilities for further analysis and integration with quality assurance workflows.
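In outline, the entire pipeline reduces to a single call. Here is a minimal sketch using only calls documented on this page; the Examples section below covers the full configurations:

import talk_box as tb

# Bot under test, with avoid topics configured
bot = tb.ChatBot().provider_model("openai:gpt-4-turbo").avoid(["medical_advice"])

# Generate adversarial prompts, run the conversations, and judge the responses
results = tb.autotest_avoid_topics(bot, test_intensity="medium")
print(f"Violations found: {results.summary['total_violations']}")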
Parameters
target_bot :
-
The ChatBot instance to test for avoid topics compliance. Must have avoid topics configured via the .avoid() method or PromptBuilder.avoid_topics() in the system prompt.
test_intensity : str = 'medium'
-
Testing intensity level controlling the number of conversations and strategies. Available levels: "light" (3 conversations), "medium" (6 conversations), "thorough" (10 conversations), "exhaustive" (15 conversations). The default is "medium".
max_conversations : int = None
-
Override for the maximum number of conversations to run, superseding the intensity level setting. Use when you need precise control over test scope (see the sketch after this parameter list).
judge_model : str = None
-
Model to use for automated judgment. If provided, it will be set via .model() on the JudgeBot. Defaults to inheriting the model configuration from target_bot for consistency.
verbose : bool = False
-
Whether to show detailed output during testing, including conversation progress and intermediate results. Default is False for clean output.
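As a minimal sketch of how these parameters interact, the call below caps a "thorough" run at four conversations and judges with a different model than the target bot (the bot configuration itself is illustrative; any ChatBot with avoid topics configured works):

import talk_box as tb

bot = tb.ChatBot().provider_model("openai:gpt-4-turbo").avoid(["medical_advice"])

# max_conversations supersedes the 10 conversations implied by "thorough";
# judge_model overrides the default of inheriting the target bot's model
results = tb.autotest_avoid_topics(
    bot,
    test_intensity="thorough",
    max_conversations=4,
    judge_model="gpt-4",
)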
Returns
TestResults
-
Enhanced results object with rich reporting capabilities including individual conversation results with violation analysis, automated violation detection with severity ratings, statistical summaries and compliance metrics, HTML representation for Jupyter notebooks, and export capabilities for further analysis.
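The exported pieces can feed standard analysis tools. As a sketch, assuming get_violation_summary() returns a list of dicts with 'topic' and 'severity' keys (as used in the examples below) and that pandas is available:

import pandas as pd
import talk_box as tb

# Any bot with avoid topics configured; see the examples below
bot = tb.ChatBot().provider_model("openai:gpt-4-turbo").avoid(["medical_advice"])
results = tb.autotest_avoid_topics(bot, test_intensity="light")

# Tabulate detected violations by topic and severity for downstream analysis
df = pd.DataFrame(results.get_violation_summary())
if not df.empty:
    print(df.groupby(["topic", "severity"]).size())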
Examples
Basic avoid topics testing
Test a bot with a simple avoid topics configuration:
import talk_box as tb
# Configure bot with avoid topics
bot = (
    tb.ChatBot()
    .provider_model("openai:gpt-4-turbo")
    .avoid(["medical_advice", "financial_planning"])
)

# Run basic compliance testing
results = tb.autotest_avoid_topics(bot, test_intensity="light")
# Check compliance results
print(f"Compliance rate: {results.summary['compliance_rate']:.1%}")
print(f"Violations found: {results.summary['total_violations']}")
View HTML-based summary of results:
results
Testing with PromptBuilder configuration
Test a bot configured with PromptBuilder avoid topics:
import talk_box as tb
# Configure bot with `PromptBuilder`
prompt = (
    tb.PromptBuilder()
    .persona("helpful assistant", "general support")
    .avoid_topics(["politics", "religion"])
    .constraint("Always be respectful and professional")
    .preview()
)

bot = tb.ChatBot().provider_model("openai:gpt-4-turbo").system_prompt(prompt)

# Run thorough testing
results = tb.autotest_avoid_topics(bot, test_intensity="thorough")
# Display detailed violation analysis
results.show_violations()
Advanced testing with custom configuration
Comprehensive testing with custom judge model and verbose output:
import talk_box as tb
# Configure specialized bot
bot = (
    tb.ChatBot()
    .model("gpt-4")
    .avoid(["legal_advice", "investment_recommendations"])
    .temperature(0.3)
    .persona("customer service representative")
)

# Run comprehensive testing with custom judge
results = tb.autotest_avoid_topics(
    bot,
    test_intensity="exhaustive",
    judge_model="gpt-4",
    verbose=True,
)
# Analyze results comprehensively
print(f"Total conversations: {len(results.conversation_results)}")
print(f"Compliance rate: {results.summary['compliance_rate']:.1%}")
# Export results for further analysis
if results.summary['total_violations'] > 0:
    violation_details = results.get_violation_summary()
    print("Violations requiring attention:")
    for violation in violation_details:
        print(f"- {violation['topic']}: {violation['severity']}")
# HTML display in notebooks
results
Integration Notes
- Avoid Topics Detection: automatically extracts avoid topics from bot configuration or system prompt
- Intensity Scaling: different intensity levels provide appropriate testing coverage for various use cases
- Automated Evaluation: JudgeBot provides consistent, objective violation detection with detailed analysis
- Rich Reporting: TestResults includes comprehensive analysis, visualizations, and export capabilities
- Quality Assurance: enables systematic compliance testing as part of development and deployment workflows (see the sketch after this list)
- Professional Integration: results format supports integration with quality assurance and compliance systems
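For the quality assurance workflow noted above, one possible pattern is a test that fails the build when any violation is detected. The test name and zero-violation threshold below are illustrative assumptions, not part of the library:

import talk_box as tb

def test_avoid_topics_compliance():
    # Build the bot configuration under test and require zero detected violations
    bot = (
        tb.ChatBot()
        .provider_model("openai:gpt-4-turbo")
        .avoid(["medical_advice", "financial_planning"])
    )
    results = tb.autotest_avoid_topics(bot, test_intensity="light")
    assert results.summary['total_violations'] == 0, results.get_violation_summary()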
The autotest_avoid_topics()
function provides comprehensive automated testing for avoid topics compliance, enabling systematic validation of chatbot behavior with detailed analysis and reporting capabilities suitable for professional development and deployment workflows.