EntityAggregatorFilter

class EntityAggregatorFilter(config)

Bases: Filter

The entity aggregator filter creates a structured prediction field from detected entities.

This filter examines entities that have been detected in previous pipeline steps (typically from NER or entity extraction steps) and aggregates them into a structured prediction field with entity types as keys and unique entity names as values.

Input - List[InputEntity]: Entities conforming to the InputEntity Pydantic model

Output - PredictionOutput: Structured prediction field conforming to PredictionOutput model

Data Models:
  • InputEntity – Defines the expected structure for input entities with type, name, and extracts

  • PredictionOutput – Defines the output structure as {entity_type: [unique_entity_names]}
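
The shapes of these two models can be sketched with standard-library dataclasses. This is an illustration only, not the library's Pydantic definitions: the field types, and in particular the element type of `extracts`, are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InputEntity:
    """Hypothetical stand-in for the InputEntity Pydantic model."""
    type: str  # entity type, e.g. "PERSON"
    name: str  # entity name as detected upstream
    # Assumed to be a list of text snippets; the real type is not documented here.
    extracts: List[str] = field(default_factory=list)


# Hypothetical stand-in for PredictionOutput: {entity_type: [unique_entity_names]}
PredictionOutput = Dict[str, List[str]]

entities = [
    InputEntity(type="PERSON", name="Ada Lovelace", extracts=["Ada Lovelace wrote"]),
    InputEntity(type="ORG", name="Royal Society"),
]
prediction: PredictionOutput = {
    "PERSON": ["Ada Lovelace"],
    "ORG": ["Royal Society"],
}
```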

Parameters:
  • entities_field (str, 'entities') – Field containing detected entities

  • output_field (str, 'prediction') – Field to write resulting prediction structure

  • entity_types (list, None) – Optional list to filter which entity types to include. If None, all entity types will be used.

Example

{
    "step": "filter",
    "type": "entity_aggregator",
    "entities_field": "entities",
    "output_field": "prediction",
    "entity_types": ["PERSON", "ORG", "GPE"]
}
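
To illustrate what a step configured this way might produce, the sketch below applies the same kind of aggregation to a plain dictionary. The function `aggregate_entities`, the dict-based document, and the choices to keep first-seen order and compare names case-sensitively are assumptions made for illustration, not the filter's actual implementation.

```python
from typing import Dict, List, Optional


def aggregate_entities(
    doc: dict,
    entities_field: str = "entities",
    output_field: str = "prediction",
    entity_types: Optional[List[str]] = None,
) -> dict:
    """Group entity names by type and write the result to doc[output_field]."""
    prediction: Dict[str, List[str]] = {}
    for entity in doc.get(entities_field, []):
        etype, name = entity["type"], entity["name"]
        # Skip types outside the optional allow-list.
        if entity_types is not None and etype not in entity_types:
            continue
        names = prediction.setdefault(etype, [])
        # Keep each name once per type (first-seen order is an assumption).
        if name not in names:
            names.append(name)
    doc[output_field] = prediction
    return doc


doc = {
    "entities": [
        {"type": "PERSON", "name": "Ada Lovelace", "extracts": []},
        {"type": "PERSON", "name": "Ada Lovelace", "extracts": []},
        {"type": "ORG", "name": "Royal Society", "extracts": []},
        {"type": "DATE", "name": "1843", "extracts": []},
    ]
}
result = aggregate_entities(doc, entity_types=["PERSON", "ORG", "GPE"])
print(result["prediction"])
# {'PERSON': ['Ada Lovelace'], 'ORG': ['Royal Society']}
```

In this sketch the duplicate PERSON entry collapses to a single name, and the DATE entity is dropped because its type is not listed in `entity_types`.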

Methods Summary

process_doc(doc)

Process a document and generate the prediction structure from its entities.

Methods Documentation

process_doc(doc)

Process a document and generate the prediction structure from its entities.

Parameters:

doc – Document with entities field conforming to List[InputEntity] structure

Returns:

Document with prediction field conforming to PredictionOutput structure