ParagraphHighlight#

class ParagraphHighlight#

paragraph_highlight | Highlight a specific paragraph that has been processed by the Paragraph Embedding step, given its ID. Additionally, it is possible to highlight only the sentence that contains the given span.

This profile can be used to match a specific paragraph within a document and highlight it accordingly.

Example usage in a SqueryQuerySyntax:#

Match & Highlight full paragraph:#

profile:{paragraph_highlight id:"qJUHOQ8a0goWHjWy7TEswg_45"}

Match a paragraph - but highlight only selected text fragments:#

Define the argument highlight_fragments to restrict highlighting to the listed fragments.

# provide a list of fragments, joined with a `|` character
profile:{paragraph_highlight id:"qJUHOQ8a0goWHjWy7TEswg_45" highlight_fragments:"all about types|types are  literally"}

# or encoded as a json string
profile:{paragraph_highlight id:"qJUHOQ8a0goWHjWy7TEswg_45" highlight_fragments:"["all about types","types are  literally"]"}
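
The query strings above can also be assembled programmatically. The following is a minimal sketch, not part of the plugin: `build_paragraph_highlight_profile` and its `as_json` flag are hypothetical names introduced here for illustration, and the helper simply reproduces the string shapes shown in the examples (including the unescaped inner quotes of the JSON form).

```python
import json


def build_paragraph_highlight_profile(paragraph_id, fragments=None, as_json=False):
    """Build a profile string for paragraph_highlight (hypothetical helper)."""
    parts = [f'paragraph_highlight id:"{paragraph_id}"']
    if fragments:
        if as_json:
            # Compact separators reproduce the JSON form shown in the docs.
            value = json.dumps(fragments, separators=(",", ":"))
        else:
            # The pipe form cannot represent fragments that contain `|`.
            if any("|" in fragment for fragment in fragments):
                raise ValueError("fragments containing '|' need as_json=True")
            value = "|".join(fragments)
        parts.append(f'highlight_fragments:"{value}"')
    return "profile:{" + " ".join(parts) + "}"


print(build_paragraph_highlight_profile(
    "qJUHOQ8a0goWHjWy7TEswg_45",
    ["all about types", "types are  literally"],
))
```

Omitting `fragments` yields the full-paragraph form; passing `as_json=True` yields the JSON-encoded variant.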

pydantic model PluginConfig#
Fields:
  • end (int)

  • highlight_fragments (list[str] | None)

  • id (str)

  • relevant_tokens (list[str] | None)

  • sentence_splitting_max_merge_length (int)

  • sentence_splitting_min_length (int)

  • start (int)

PluginConfig.plugin_name: ClassVar[str] = 'paragraph_highlight'#

Used to register and reference the plugin within a query.

field PluginConfig.id: str [Required]#

ID of the paragraph to highlight

field PluginConfig.start: int = 0#

Best Sentence Extraction: Answer Span start within paragraph

field PluginConfig.end: int = 0#

Best Sentence Extraction: Answer Span end within paragraph
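
To illustrate how `start` and `end` could select the sentence to highlight, here is a rough sketch. It is not the plugin's implementation: it assumes that the span is given as character offsets into the paragraph text, that the paragraph has already been split into sentences, and that the default `start = end = 0` means "no span given". The function name `sentences_covering_span` is hypothetical.

```python
def sentences_covering_span(paragraph, sentences, start, end):
    """Return the sentences that overlap the span [start, end).

    Assumptions (not confirmed by the plugin docs): offsets are character
    positions, and an empty span (end <= start) selects nothing.
    """
    if end <= start:
        return []
    hits = []
    cursor = 0
    for sentence in sentences:
        s_start = paragraph.find(sentence, cursor)
        if s_start == -1:
            continue  # sentence not found in the paragraph; skip it
        s_end = s_start + len(sentence)
        cursor = s_end
        # Two half-open ranges overlap when each starts before the other ends.
        if s_start < end and start < s_end:
            hits.append(sentence)
    return hits


paragraph = "Types matter. This text is all about types. The end."
sentences = ["Types matter.", "This text is all about types.", "The end."]
start = paragraph.index("all about types")
end = start + len("all about types")
print(sentences_covering_span(paragraph, sentences, start, end))
```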

field PluginConfig.relevant_tokens: Optional[list[str]] = []#

Best Sentence Extraction: Comma-separated list of relevant tokens that help to find the most relevant sentences within the paragraph

field PluginConfig.highlight_fragments: Optional[list[str]] = []#

Highlight only these text fragments explicitly, instead of the full paragraph. In Squery Syntax, the value can be given either as a JSON-encoded list or as fragments separated by `|`.
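
The two accepted input forms could be normalized into a list along the following lines. This is a sketch under assumptions, not the plugin's actual parsing code: `parse_highlight_fragments` is a hypothetical name, and the fallback to pipe splitting when the value is not valid JSON is a design choice made for this example.

```python
import json


def parse_highlight_fragments(raw):
    """Turn a raw highlight_fragments value into a list of strings (sketch)."""
    if raw is None:
        return []
    text = raw.strip()
    if not text:
        return []
    if text.startswith("["):
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            parsed = None  # not valid JSON; fall through to pipe splitting
        if isinstance(parsed, list):
            return [str(item) for item in parsed]
    # Inner whitespace is preserved, since fragments must match the text.
    return [part for part in text.split("|") if part]


print(parse_highlight_fragments("all about types|types are  literally"))
print(parse_highlight_fragments('["all about types","types are  literally"]'))
```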

field PluginConfig.sentence_splitting_max_merge_length: int = 150#

Stop merging shorter text chunks after this threshold is exceeded.

field PluginConfig.sentence_splitting_min_length: int = 50#

Minimum length for a sentence (consecutive sentences shorter than this threshold are merged together).
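
One possible reading of how the two sentence-splitting thresholds interact is sketched below. This is an interpretation of the field descriptions only, not the plugin's implementation: the function name `merge_short_sentences`, the use of character counts as the length measure, and joining chunks with a single space are all assumptions.

```python
def merge_short_sentences(sentences, min_length=50, max_merge_length=150):
    """Merge consecutive short sentences (illustrative interpretation only)."""
    merged = []
    buffer = ""
    for sentence in sentences:
        if len(sentence) >= min_length:
            # A sufficiently long sentence stands alone; flush pending chunks.
            if buffer:
                merged.append(buffer)
                buffer = ""
            merged.append(sentence)
            continue
        # Short sentence: append it to the chunk being built.
        buffer = f"{buffer} {sentence}" if buffer else sentence
        # Stop merging once the chunk exceeds the merge threshold.
        if len(buffer) > max_merge_length:
            merged.append(buffer)
            buffer = ""
    if buffer:
        merged.append(buffer)
    return merged


print(merge_short_sentences(["Short one.", "Another short.", "x" * 60, "Tail."]))
```

With the defaults, the two short leading sentences end up in one chunk, the 60-character sentence stays on its own, and the trailing short sentence forms a final chunk.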