3.9.4 Release Notes

Squirro 3.9.4 was released on January 19, 2024.

Reference: Learn more about the Squirro Release Process.

What’s New

  • SquirroGPT now supports connecting to your own deployments of OpenAI or Azure-supported models, including Mixtral, via the project or server-level configuration settings under genai.sqgpt.settings.

  • The default value of useReactWidgets is now true, meaning that React widgets are used by default on new projects.

  • The Squirro platform now features a redesigned header and navigation bar with improved looks and functionality.

  • Squirro now includes beta support for Oracle Database for new installations.

  • Values in the configuration service can now be fetched from INI files. For example, specifying ${nlp_service.api_key} in a configuration service option fetches the value of the api_key key in the [nlp_service] section of the .ini file. This is especially useful for sensitive data that should not be exposed to the configuration service.

  • Item titles can now be modified using the modify_item and modify_items Squirro Client methods.

  • Added a new server setting to hide internal information such as version and thumblerAPIRoot.

  • Exposed the ability to retry failed batches from the UI.

  • Introduced a new item field clean_body which can be used to pre-process the body for classification tasks. KEE, NLP keyphrase tagging, and machine learning workflows now default to using this field.
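
The INI-file lookup described in the list above can be pictured with a small, self-contained sketch. It is not Squirro's implementation: the resolve_placeholders helper, the regular expression, and the sample INI content are assumptions made for illustration, and only the ${nlp_service.api_key} syntax is taken from the release note.

```python
import configparser
import re

# Hypothetical INI content; the section and key names mirror the example above.
INI_TEXT = """
[nlp_service]
api_key = secret-token
"""

# Matches placeholders of the form ${section.key}.
PLACEHOLDER = re.compile(r"\$\{([^.{}]+)\.([^{}]+)\}")


def resolve_placeholders(value: str, ini: configparser.ConfigParser) -> str:
    """Replace each ${section.key} placeholder with the value from the INI data.

    Placeholders that point to a missing section or key are left untouched.
    (Assumption: the real service may handle missing values differently.)
    """

    def lookup(match: re.Match) -> str:
        section, key = match.group(1), match.group(2)
        if ini.has_option(section, key):
            return ini.get(section, key)
        return match.group(0)

    return PLACEHOLDER.sub(lookup, value)


# interpolation=None keeps '%' characters in secrets from being interpreted.
ini = configparser.ConfigParser(interpolation=None)
ini.read_string(INI_TEXT)

print(resolve_placeholders("${nlp_service.api_key}", ini))
```

The point of the pattern is that the configuration service only stores the placeholder, while the secret itself stays in the .ini file on the server.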

Search Additions and Improvements

  • Tuned the precision/recall of keyword search by applying scoring plugins to individual term sequences found in the search query. Depending on the plugin that is set up, it’s possible to enable fuzzy term matching on the body, typeahead-like (prefix) term matching on the title, or any other custom matching logic that a plugin provides. This can be enabled in the new project configuration topic.search.query-strategy-term-scoring-profiles. The default configuration is: "scoring_profiles": ["prefix_match fields:title","fuzzy_match fuzziness:auto fields:title"]

  • Created a Text Chunking step that can be configured for different chunking strategies.

  • Added Scoring Plugins to perform fuzzy_match and prefix_match matching. (Phrase)-Prefix term matching can be used to achieve typeahead functionality on searchable text content fields, considering the order of tokens and proximity. This can be used to increase recall for lexical search (with higher query-computation cost). To learn more, see the Retrieve Scoring Plugin Documentation.

  • Added a new layer visibility condition option, Concept Search, with is empty and is not empty conditions.

  • Added the ability to highlight specific paragraphs in searches via a new paragraph_highlight profile. Provided a paragraph ID, it can be used in the following manner: profile:{ paragraph_highlight id:FUelODYIFNsmsUoDhEa2cA_0 }. This profile is primarily intended for SquirroClient users and is only available for projects with paragraph embeddings.

  • Improved semantic search highlighting in the Item Detail view. The best paragraph is highlighted, together with query terms (hybrid search highlighting). This introduces basic semantic highlighting functionality without requiring extractive-question-answering to be active.

  • Removed the semantic similarity step. Instead, the snippet is now sentence-tokenized to produce most_relevant_sentences.

  • An error is no longer raised if an embedding chunk is too big. Instead, the text is truncated and a warning is logged.

  • Improved highlighting of extractive-question-answer span (sentence boundary aware).

  • Squirro now returns the approximate number of matching documents when doing semantic search (instead of returning the count of matching paragraphs). This applies when calling client.query(options={"search_scope":"paragraph", "response_format":"document"}) or client.query(query=" ..user question.. profile:{semantic}").

  • There is now a more robust baseline keyword search (without query processing). Squirro now better interprets ? in term-sequences as a question indicator and not as wildcard term matching.

  • For the aggregation API, exposed the single-value aggregation value_count, which counts the matching documents of a sub-aggregation.
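
As a reading aid for the default scoring_profiles value quoted in the list above, the following sketch splits each profile string into a plugin name and its options. The parse_scoring_profile helper is hypothetical and is not how Squirro parses the setting; it only assumes the "name key:value key:value" shape visible in the default configuration.

```python
from typing import Dict, List, Tuple

# Default value as quoted in the release notes above.
DEFAULT_SETTING = {
    "scoring_profiles": [
        "prefix_match fields:title",
        "fuzzy_match fuzziness:auto fields:title",
    ]
}


def parse_scoring_profile(profile: str) -> Tuple[str, Dict[str, str]]:
    """Split 'name key:value key:value' into the plugin name and its options.

    Raises ValueError if an option does not contain a ':' separator.
    """
    name, *options = profile.split()
    params = dict(option.split(":", 1) for option in options)
    return name, params


parsed: List[Tuple[str, Dict[str, str]]] = [
    parse_scoring_profile(p) for p in DEFAULT_SETTING["scoring_profiles"]
]
for name, params in parsed:
    print(name, params)
```

Read this way, the default applies prefix matching and automatic-fuzziness matching, both restricted to the title field.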

Other Improvements

  • Squirro now supports highlighting over multiple lines in the PDF viewer.

  • Added the project-level configuration option frontend.userapp.excel-export-filename which sets the filename of the generated Excel file when selecting to export project items to Excel.

  • In the webshot service, changed the log level from WARNING to DEBUG.

  • Activated the stripping of title prefix by default.

  • Migrated the HorizontalTabs design.

  • Made the highlighting of matching keywords more meaningful within the Item Detail view by rewriting highlight-query based on the output of query-processing, using only relevant terms for item-detail view highlighting.

  • Internal service users can now be created more easily when the tenant is unknown.

  • The store value in useStoreKeyChange now defaults to the current store value.

  • Introduced the new libNLP filter coalesce. This returns the first field from a provided list for which a value is present. This has been introduced to support the clean_body handling in machine learning workflows.

  • The email parsing pipelet now writes content into the newly standardized clean_body field.

  • Improved the quality of the email parsing pipelet and machine learning step. Only the latest reply of an email chain is now returned, leading to improved precision in email classification tasks.

  • Added options to specify certificates and custom headers (that may include the API token) to connect to the NLP services.

  • Improved Search widget/Global Search spacing.

  • Squirro now counts the number of items for retried failed batches.

  • Added the option to rerun failed items on a source.

  • In the Labels widget, label value percentages are no longer shown when they are not known, for example for read and starred values.

  • Added the option to customize the Labels widget and add a start icon or text to the dropdown or accordion.

  • Added helper text to the Labels widget configuration options.

  • Added a loading indicator and empty message to the Labels widget.

  • Upgraded redis-server to version 7.2.3.

  • Updated hiredis to the latest stable version (2.3.2).

  • Updated Fasttext to 0.9.2.
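
The behavior described for the new coalesce filter in the list above (return the first field from a list for which a value is present) can be sketched in plain Python. This is not the libNLP implementation or its configuration syntax: the function signature and the rule for what counts as "present" (here: not None and not an empty string) are assumptions.

```python
from typing import Any, Mapping, Optional, Sequence


def coalesce(item: Mapping[str, Any], fields: Sequence[str]) -> Optional[Any]:
    """Return the value of the first listed field that is present on the item.

    Assumption: None and empty strings count as "not present".
    """
    for field in fields:
        value = item.get(field)
        if value is not None and value != "":
            return value
    return None


# Hypothetical items: one processed by the email parsing pipelet, one not.
with_clean = {"body": "Full email thread", "clean_body": "Latest reply only"}
without_clean = {"body": "Full email thread"}

print(coalesce(with_clean, ["clean_body", "body"]))
print(coalesce(without_clean, ["clean_body", "body"]))
```

Listing clean_body before body is what lets a workflow prefer the pre-processed text while still working on items that only have a body.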

Bug Fixes

  • Fixed an issue where selections applied via Item card keywords could not be removed.

  • Fixed an issue where a default pipeline workflow (Standard) of SquirroGPT projects (Web, Data) could not be saved in the Pipeline Editor.

  • Fixed snackbars and tooltips not being displayed within modals.

  • Fixed an issue with the Dashboard Store clearing after a global search. It now restores the default store values.

  • Fixed an issue with dashboard selections not being added immediately to the Search widget/Global Search.

  • Fixed a bug with retrieving boolean Redis SSL configuration values. Previously, the value was set to True even when it was declared as False in the configuration file.

  • Fixed an issue with automatic mapping suggestions of columns for Excel and CSV file uploads.

  • Reworked React widget loading in Carousel layers, fixing loading issues.

  • Fixed an issue with custom React widgets not working in accordion layer mode.

  • Included studio db option in encrypted config options.

  • Internal errors are no longer exposed to the API.

  • Passage and query prefixes are now properly appended to the embeddings, which improves embedding quality.

  • Fixed an issue with community subscriptions not updating after switching back from the community type tab.
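
The Redis SSL fix in the list above concerns a boolean that ended up True although the file said False. The release note does not state the cause. The sketch below shows one common way this class of bug arises in Python, namely treating the raw string as a boolean, and is offered only as an assumed illustration. The [redis] section and ssl option names are hypothetical.

```python
import configparser

# Hypothetical section and option names, used only for this illustration.
INI_TEXT = """
[redis]
ssl = False
"""

config = configparser.ConfigParser()
config.read_string(INI_TEXT)

raw_value = config.get("redis", "ssl")      # the string "False"
naive = bool(raw_value)                     # True: any non-empty string is truthy
parsed = config.getboolean("redis", "ssl")  # False: the text is interpreted

print(raw_value, naive, parsed)
```

When reading boolean options from INI files in custom plugins, getboolean (or an equivalent explicit parse) avoids this pitfall.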

Breaking Changes

  • If you have enabled the email parsing step in a workflow, the result of the steps that now use the clean_body field can change. Carefully review the output of the classification steps or remove the email parsing step if it is not required.

  • Chunking documents inside the Paragraph Embedding pipeline step is now deprecated. All semantic search projects should have the Text Chunking step before Paragraph Embedding.

  • Updated Pydantic, FastAPI, and Spacy dependencies. If you are using any of these in a pipelet, data loader plugin, or studio plugin, they may need to be updated.
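
To see which versions of the updated dependencies a pipelet or plugin environment actually has, a minimal check along the following lines may help. It uses only the Python standard library; the installed_versions helper is hypothetical. The release note does not list target versions, so none are compared here.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional

# Distribution names of the dependencies mentioned in the list above.
PACKAGES = ["pydantic", "fastapi", "spacy"]


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if it is absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```

Running this before and after the upgrade in the environment where custom code executes shows which of the three packages changed.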

Installation and Upgrade

For new installations, find step-by-step instructions in Install and Manage Squirro with Ansible (recommended) or Installing Squirro on Linux.

To upgrade an existing installation, see Upgrading Squirro.

January 30, 2024 Correction: Updated language on connecting to OpenAI or Azure-supported models via the project or server-level configuration settings under genai.sqgpt.settings.