EnrichmentPipeline
Configurable pipeline for enriching documents in a knowledge graph.
Usage
EnrichmentPipeline()

The pipeline applies an enrichment function to document nodes and creates entity nodes, topic nodes, and relationship edges in the graph. Enrichment is incremental: documents are only re-enriched when their content changes.
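The page does not say how the pipeline detects that a document's content has changed. One common approach is to compare a content hash against the one recorded at the last enrichment. The sketch below assumes that approach; `content_fingerprint`, `needs_enrichment`, and the `seen` mapping are hypothetical names for illustration and are not part of talk_box.

```python
import hashlib


def content_fingerprint(content: str) -> str:
    """Return a stable SHA-256 hex digest of the document content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def needs_enrichment(doc_id, content, seen, force=False):
    """Decide whether a document should be (re-)enriched.

    `seen` maps document IDs to the fingerprint recorded at the
    last enrichment (hypothetical bookkeeping, not talk_box's own).
    """
    fingerprint = content_fingerprint(content)
    if force or seen.get(doc_id) != fingerprint:
        seen[doc_id] = fingerprint  # record the version about to be enriched
        return True
    return False


seen = {}
first = needs_enrichment("doc-1", "Python is a language.", seen)
second = needs_enrichment("doc-1", "Python is a language.", seen)
third = needs_enrichment("doc-1", "Python is a programming language.", seen)
forced = needs_enrichment("doc-1", "Python is a programming language.", seen, force=True)
```

Here the first and third calls report that enrichment is needed (new or changed content), the second call skips the unchanged document, and the last call shows how a `force` flag would bypass the check.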
Parameters
enrich_fn: EnrichmentFn
A callable (title, content) -> EnrichmentResult that performs the actual extraction. This is where LLM calls happen.

entity_prefix: str = "entity"
Prefix for generated entity node IDs.

topic_prefix: str = "topic"
Prefix for generated topic node IDs.
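Because enrich_fn is an ordinary callable, a deterministic stand-in can be useful in tests that should not call an LLM. The sketch below is one way to write such a stub. The two dataclasses are local stand-ins that mirror only the fields used in the example on this page (`entities`, `topics`, `summary`, `name`, `entity_type`); they are not the talk_box classes, and the `KEYWORDS` table and `keyword_enricher` name are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ExtractedEntity:  # stand-in for tb.ExtractedEntity
    name: str
    entity_type: str


@dataclass
class EnrichmentResult:  # stand-in for tb.EnrichmentResult
    entities: list = field(default_factory=list)
    topics: list = field(default_factory=list)
    summary: str = ""


# Hypothetical keyword table: term -> (entity_type, topic)
KEYWORDS = {
    "Python": ("technology", "programming"),
    "SQLite": ("technology", "databases"),
}


def keyword_enricher(title: str, content: str) -> EnrichmentResult:
    """Extract entities by simple keyword matching instead of an LLM call."""
    text = f"{title} {content}"
    entities, topics = [], []
    for term, (entity_type, topic) in KEYWORDS.items():
        if term in text:
            entities.append(ExtractedEntity(name=term, entity_type=entity_type))
            if topic not in topics:
                topics.append(topic)
    # Use the title as a trivial summary; a real enricher would generate one.
    return EnrichmentResult(entities=entities, topics=topics, summary=title)


result = keyword_enricher("Intro to Python", "Python is widely used.")
```

With the real library, the same function body would return tb.EnrichmentResult and tb.ExtractedEntity objects and could be passed as enrich_fn.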
Examples
import talk_box as tb

def my_enricher(title: str, content: str) -> tb.EnrichmentResult:
    # Call your LLM here
    return tb.EnrichmentResult(
        entities=[tb.ExtractedEntity(name="Python", entity_type="technology")],
        topics=["programming"],
        summary="A document about Python.",
    )

pipeline = tb.EnrichmentPipeline(enrich_fn=my_enricher)
kg = tb.KnowledgeGraph(":memory:")
# ... add document nodes via sync() ...
result = pipeline.run(kg)
result.enriched  # number of documents enriched

Methods
| Name | Description |
|---|---|
| run() | Run enrichment on document nodes in the knowledge graph. |
run()
Run enrichment on document nodes in the knowledge graph.
Usage
run(kg, *, limit=100, force=False)

Parameters
kg: Any
A talk_box.knowledge_graph.KnowledgeGraph instance.

limit: int = 100
Maximum number of documents to enrich per run.

force: bool = False
If True, re-enrich all documents regardless of whether they’ve been enriched before.
Returns
PipelineResult
Summary of enrichment activity.
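Since limit caps the work done per call, a large graph may need several calls to run(). The sketch below shows one way to loop until nothing is left. It assumes that each call picks up documents not yet enriched and that result.enriched is 0 once the graph is fully processed; neither is confirmed by this page. The `enrich_all` helper and `max_rounds` guard are hypothetical, and `FakePipeline` exists only so the snippet runs without talk_box.

```python
from types import SimpleNamespace


def enrich_all(pipeline, kg, *, limit=100, max_rounds=50):
    """Call pipeline.run() repeatedly until a run enriches nothing."""
    total = 0
    for _ in range(max_rounds):  # guard against looping forever
        result = pipeline.run(kg, limit=limit)
        if result.enriched == 0:
            break
        total += result.enriched
    return total


class FakePipeline:
    """Stand-in that pretends `pending` documents still need enrichment."""

    def __init__(self, pending):
        self.pending = pending

    def run(self, kg, *, limit=100, force=False):
        done = min(self.pending, limit)
        self.pending -= done
        return SimpleNamespace(enriched=done)


total = enrich_all(FakePipeline(pending=250), kg=None, limit=100)
```

With the real classes, `pipeline` would be a tb.EnrichmentPipeline and `kg` a tb.KnowledgeGraph. Note that passing force=True in such a loop would likely never reach zero, since every document would qualify on every call.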