NlpServiceSpacy
- class squirro.lib.nlp.steps.external.NlpServiceSpacy(config)
Bases: squirro.lib.nlp.steps.batched_step.BatchedStep
Step that uses an external API endpoint.
It sends a batch of Document objects in the shape {"docs": LIST_OF_DOCS, "fields": LIST_OF_FIELDS} to spaCy and returns the annotated batch of Document objects.
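To make the request shape concrete, here is a minimal sketch that assembles such a body from plain dictionaries. The helper name `build_payload` and the layout of each document are assumptions for illustration only; the actual serialization is handled inside the step and may differ.

```python
import json


def build_payload(docs, fields):
    """Assemble a body of the form {"docs": [...], "fields": [...]}.

    Hypothetical helper, not part of the Squirro API.
    """
    return {"docs": list(docs), "fields": list(fields)}


# Assumed document layout: one flat dict per document.
docs = [{"id": "doc-1", "title": "About", "body": "Squirro is based in Zurich."}]
payload = build_payload(docs, ["body", "title"])
print(json.dumps(payload, indent=2))
```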
- Parameters
type (str, "external") – remote_spacy
step (str, "nlp_service_spacy") – remote_spacy
field_mapping (dict) – mapping of input field to output field
endpoint (str, None) – Custom NLP Service endpoint to invoke. If not defined, the default endpoint loaded from the ini file is used.
max_concurrent (int, 10) – maximum concurrent requests
language_processor_mapping (dict, {}) – Defines which spaCy processor is used for the detected language (fields.language). Note: This option bypasses the pipeline__field setting.
pipeline__field (str, None) – Programmatic selection of the invoked spaCy processor (the processor name is read from this field). Uses pipeline__default if empty.
pipeline__default (str, None) – Default pipeline, used if no pipeline__field is specified or its value is null.
disable_pipes__field (str, None) – Programmatic selection of disabled pipes (the disabled pipes are read from this field). Uses disable_pipes__default if empty.
disable_pipes__default (list, []) – Pipes disabled by default, used if no disable_pipes__field is specified or its value is null.
Example
{
    "name": "nlp_service_spacy",
    "step": "nlp_service_spacy",
    "type": "nlp_service_spacy",
    "field_mapping": {"body": "nlp__body", "title": "nlp__title"},
    "language_processor_mapping": {
        "de": "de:fast",
        "en": "en:fast"
    },
    "pipeline__default": "en:fast",
    "disable_pipes__default": ["ner"]
}
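The parameter descriptions imply an order of precedence for choosing the processor: the language mapping first, then the pipeline__field value, then pipeline__default. The sketch below shows one way to read that order. It is not the step's actual implementation. The helper names, the `nlp_pipeline` field, the `fr:accurate` processor name, and the assumption that a language missing from the mapping falls through to the next rule are all hypothetical.

```python
def resolve_processor(doc_fields, config):
    """Pick a processor name for one document (hypothetical helper)."""
    mapping = config.get("language_processor_mapping") or {}
    language = doc_fields.get("language")  # assumed to be a plain string
    # Assumption: a language without a mapping entry falls through.
    if language in mapping:
        return mapping[language]
    field_name = config.get("pipeline__field")
    if field_name and doc_fields.get(field_name):
        return doc_fields[field_name]
    return config.get("pipeline__default")


def resolve_disabled_pipes(doc_fields, config):
    """Pick the disabled pipes for one document (hypothetical helper)."""
    field_name = config.get("disable_pipes__field")
    if field_name and doc_fields.get(field_name):
        return list(doc_fields[field_name])
    return list(config.get("disable_pipes__default") or [])


config = {
    "language_processor_mapping": {"de": "de:fast", "en": "en:fast"},
    "pipeline__field": "nlp_pipeline",  # hypothetical field name
    "pipeline__default": "en:fast",
    "disable_pipes__default": ["ner"],
}

print(resolve_processor({"language": "de"}, config))
print(resolve_processor({"language": "fr", "nlp_pipeline": "fr:accurate"}, config))
print(resolve_processor({"language": "fr"}, config))
print(resolve_disabled_pipes({}, config))
```

With this reading, a German document uses the mapped processor, a document in an unmapped language uses the name stored in its own field, and everything else falls back to the defaults.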
Methods Summary
process_batch(batch) – Process a batch of documents.
Methods Documentation