AggregateFilter

class AggregateFilter(config)

Bases: Filter

The aggregate filter sums (numeric values) or concatenates (lists) the values of the aggregated_fields, grouped by the value of the aggregating_field.

Input - each of the aggregated_fields must be of type int, float, or list.

Output - the output field holds a value of type int, float, or list, matching the type of the aggregated_fields.

Parameters:
  • type (str) – aggregate

  • aggregated_fields (list) – Fields to aggregate

  • aggregating_field (str) – Field whose value the documents are grouped by

  • output_field (str, None) – Field to store the aggregation (defaults to aggregating_field)

  • use_separate_docs (bool, True) – Whether or not to use a separate doc for each aggregating value

Example

{
    "step": "filter",
    "type": "aggregate",
    "output_field": "out",
    "aggregated_fields": ["b", "c"],
    "aggregating_field": "a"
}
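The grouping behaviour described above can be illustrated with a minimal standalone sketch. This is not the library's implementation: the function name `aggregate`, the use of plain dicts in place of Document objects, and the choice to fold all aggregated_fields into a single output value are assumptions made for illustration. The use_separate_docs option and the output_field default are not modelled here.

```python
from collections import defaultdict


def aggregate(docs, aggregated_fields, aggregating_field, output_field):
    """Hypothetical sketch: group dict-based docs and sum / concatenate values."""
    # Collect every aggregated value under the key it is grouped by.
    groups = defaultdict(list)
    for doc in docs:
        key = doc[aggregating_field]
        for field in aggregated_fields:
            if field in doc:
                groups[key].append(doc[field])

    results = []
    for key, values in groups.items():
        # "+" sums ints and floats and concatenates lists; it builds a new
        # object each time, so the input documents are not mutated.
        total = values[0]
        for value in values[1:]:
            total = total + value
        results.append({aggregating_field: key, output_field: total})
    return results


docs = [
    {"a": "x", "b": 1, "c": 2},
    {"a": "x", "b": 3, "c": 4},
    {"a": "y", "b": 5, "c": 6},
]
summed = aggregate(docs, ["b", "c"], "a", "out")
# summed == [{"a": "x", "out": 10}, {"a": "y", "out": 11}]
```

With list-valued fields the same call concatenates instead of summing, for example `{"a": "x", "b": [1], "c": [2]}` would yield `{"a": "x", "out": [1, 2]}` under these assumptions.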

Methods Summary

process(docs) – Process a set of documents

train(docs) – Train on a step of a set of documents

Methods Documentation

process(docs)

Process a set of documents

Parameters:

docs (generator(Document)) – Generator of documents

Returns:

Generator of processed documents

Return type:

generator(Document)
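Because the method takes and returns generators, documents are pulled through lazily. The sketch below shows that general pattern only; the functions `process` and `read_docs` and the dict-based documents are hypothetical stand-ins, not the library's Document class or the actual AggregateFilter logic.

```python
def process(docs):
    """Hypothetical pass-through step: consume a generator, yield documents."""
    for doc in docs:
        # A real filter would transform the document here; this sketch
        # only tags it so the effect is visible.
        yield {**doc, "processed": True}


def read_docs():
    """Hypothetical source yielding three dict-based documents."""
    for i in range(3):
        yield {"id": i}


pipeline = process(read_docs())  # nothing has been processed yet
first = next(pipeline)           # pulls exactly one document through
rest = list(pipeline)            # drains the remaining documents
```

An aggregating step differs from this pass-through in one respect: it presumably has to read all of its input before it can emit complete group totals.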

train(docs)

Train on a step of a set of documents

Parameters:

docs (generator(Document)) – Generator of documents

Returns:

Generator of processed documents

Return type:

generator(Document)