High Performance Configuration#

Within a production deployment, a Squirro installation can be further optimized for improved performance.

This page describes various ways system administrators can improve performance.


There are no hard and fast rules for performance tuning. The best way to tune a system is to measure it and then make changes based on the results.
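The measure-then-change loop can be sketched as follows. This is a minimal illustration rather than a Squirro tool: `measure_latency` and `placeholder_query` are hypothetical names, and the placeholder should be swapped for a real request against your installation.

```python
import math
import statistics
import time
from typing import Callable, Dict, List


def measure_latency(operation: Callable[[], object], runs: int = 20) -> Dict[str, float]:
    """Call `operation` repeatedly and summarize wall-clock latency in milliseconds."""
    samples: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # Nearest-rank 95th percentile of the sorted samples.
    p95_index = max(0, math.ceil(0.95 * len(samples)) - 1)
    return {
        "runs": float(len(samples)),
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
        "max_ms": samples[-1],
    }


def placeholder_query() -> None:
    # Hypothetical stand-in for a real query; replace before drawing conclusions.
    sum(range(10_000))


if __name__ == "__main__":
    print(measure_latency(placeholder_query))
```

Run the same measurement before and after each configuration change, and change one setting at a time so that any difference can be attributed to it.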

Elasticsearch Tuning#

One of the most important aspects of performance tuning is to ensure that Elasticsearch is optimized correctly.

Elasticsearch Best Practices#

Elasticsearch itself publishes documentation on best practices for performance tuning.


Some of these recommendations may not apply to your setup or may need to be executed on the backend by Squirro Solutions Engineers. Contact Squirro Support if there’s a specific item you’d like help with or to discuss.

Tune for Search Speed#

Elasticsearch’s How to Tune for Search Speed guide includes the following recommendations:

  • Give memory to the filesystem cache.

  • Avoid page cache thrashing by using modest readahead values on Linux.

  • Use faster hardware.

  • Search as few fields as possible.

  • Consider mapping identifiers as keyword (Non-analyzed labels in Squirro).

  • Avoid scripts.

  • Force-merge read-only indices.

  • Warm up global ordinals and the filesystem cache.

Reference: To learn more about these recommendations, see Elasticsearch’s official How to Tune for Search Speed Guide.

Tune for Indexing Speed#

Elasticsearch’s How to Tune for Indexing Speed guide includes the following recommendations:

  • Unset or increase the refresh interval.

  • Disable replicas for initial loads.

  • Disable swapping.

  • Give memory to the filesystem cache.

  • Use auto-generated ids.

  • Use faster hardware.

  • Ensure correct indexing buffer size.

  • Use cross-cluster replication to prevent searching from stealing resources from indexing.

  • Avoid hot spotting.

Reference: To learn more about these recommendations, see Elasticsearch’s official How to Tune for Indexing Speed Guide.
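As an illustration of the first two recommendations, the sketch below builds request bodies for Elasticsearch's update index settings API (`PUT /<index>/_settings`). The function names are hypothetical, and nothing is sent to a cluster. Whether these settings can be changed on a given Squirro index should be confirmed with Squirro Support first.

```python
import json
from typing import Any, Dict


def bulk_load_settings() -> Dict[str, Any]:
    """Body to apply before a large initial load: no refreshes, no replicas."""
    return {"index": {"refresh_interval": "-1", "number_of_replicas": 0}}


def restore_settings(refresh_interval: str, replicas: int) -> Dict[str, Any]:
    """Body to apply after the load, using the values the index had before."""
    return {
        "index": {
            "refresh_interval": refresh_interval,
            "number_of_replicas": replicas,
        }
    }


if __name__ == "__main__":
    # The restore values here are examples only; use your index's previous settings.
    print(json.dumps(bulk_load_settings()))
    print(json.dumps(restore_settings("1s", 1)))
```

Record the original values before the load so they can be restored afterwards.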

Squirro Search Settings#

Squirro offers a set of configurable options via Setup → Setting → Project Configuration that allow you to fine-tune various aspects of search and aggregation behavior to optimize the search performance.

Search Settings#

The search settings control the behavior of Elasticsearch queries executed by Squirro.

You can, for example, define the number of concurrent search requests.

Or if you don’t need information about the exact number of matched documents, you can increase the search speed by specifying the maximum number of documents to collect or the number of hits matching the query to count accurately.

Reference: topic.search.search-settings
pydantic model SearchConfig#
field max_concurrent_shard_requests: int = 6#

Defines the number of shard requests per node that this search executes concurrently.

field profile: bool = False#

Provides detailed timing information about the execution of individual components in a search request.

field request_cache: bool = True#

If true, the caching of search results is enabled for requests where size is 0.

field request_timeout: int = 35#

Specifies the period of time to wait for a response from each shard.

field terminate_after: int = 0#

Maximum number of documents to collect for each shard. To disable early termination of query execution, set the value to 0.

field track_total_hits: str = '500000'#

Number of hits matching the query to count accurately. If true, the exact number of hits is returned at the cost of some performance. If false, the response does not include the total number of hits matching the query.
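The fields above share their names with Elasticsearch search request parameters. The sketch below shows one way such settings could be turned into request options. It is an assumption about the mapping, not Squirro's implementation: `SearchSettings`, `parse_track_total_hits`, and `to_search_options` are hypothetical names. `request_timeout` is left out because its unit is not stated here.

```python
from dataclasses import dataclass
from typing import Any, Dict, Union


@dataclass
class SearchSettings:
    """Illustrative mirror of the documented fields and defaults."""
    max_concurrent_shard_requests: int = 6
    profile: bool = False
    request_cache: bool = True
    request_timeout: int = 35
    terminate_after: int = 0
    track_total_hits: str = "500000"


def parse_track_total_hits(value: str) -> Union[bool, int]:
    """Turn the string setting into a boolean or an integer threshold."""
    lowered = value.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    return int(lowered)


def to_search_options(settings: SearchSettings) -> Dict[str, Any]:
    """Build search options, omitting terminate_after when it is disabled (0)."""
    options: Dict[str, Any] = {
        "max_concurrent_shard_requests": settings.max_concurrent_shard_requests,
        "profile": settings.profile,
        "request_cache": settings.request_cache,
        "track_total_hits": parse_track_total_hits(settings.track_total_hits),
    }
    if settings.terminate_after > 0:
        options["terminate_after"] = settings.terminate_after
    return options


if __name__ == "__main__":
    # Trade exact hit counts for speed: count up to 10000 hits, stop at 50000 docs per shard.
    fast = SearchSettings(track_total_hits="10000", terminate_after=50000)
    print(to_search_options(fast))
```

Lowering `track_total_hits` or setting `terminate_after` only makes sense when the interface does not need an exact hit count.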

Aggregation Settings#

The aggregation settings govern how Squirro performs Elasticsearch aggregations.

For displaying statistics about your data, you typically don’t need to aggregate all documents, only a sufficient representation of the whole index.

In this case, you can apply the sampler to the aggregations so that Elasticsearch limits the number of aggregated documents.

Reference: topic.search.agg-settings
pydantic model AggregationsConfig#
field random_sampler_probability: float = 1#

The probability that a document will be included in the aggregated data. Must be less than or equal to 0.5 or exactly 1.0. To disable random sampling set the value to 0.0.

  • Constraint: le = 1.0 (the value must be less than or equal to 1.0)

field sample_shard_size: int = 0#

Limits how many top-scoring documents are collected in the sample processed on each shard. This option is ignored if random sampling is used. To disable aggregation sampling, set the value to 0.
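To make the interaction between the two fields concrete, the sketch below wraps an aggregation body in Elasticsearch's `random_sampler` or `sampler` aggregation. The function `apply_sampling` and the wrapper name `sampled` are hypothetical, and the handling of 0.0 and 1.0 as "no random sampling" is an assumption based on the field descriptions above. It does not show how Squirro builds its queries.

```python
from typing import Any, Dict


def apply_sampling(
    aggs: Dict[str, Any],
    random_sampler_probability: float = 1.0,
    sample_shard_size: int = 0,
) -> Dict[str, Any]:
    """Wrap `aggs` in a sampling aggregation according to the two settings."""
    p = random_sampler_probability
    # Validation follows the rule stated on this page.
    if not (p in (0.0, 1.0) or 0.0 < p <= 0.5):
        raise ValueError("probability must be 0.0, 1.0, or in (0, 0.5]")
    if 0.0 < p <= 0.5:
        # Random sampling takes precedence; sample_shard_size is ignored.
        return {"sampled": {"random_sampler": {"probability": p}, "aggs": aggs}}
    if sample_shard_size > 0:
        # Top-scoring-document sampling, limited per shard.
        return {"sampled": {"sampler": {"shard_size": sample_shard_size}, "aggs": aggs}}
    return aggs  # No sampling: aggregate over all matching documents.


if __name__ == "__main__":
    terms = {"by_source": {"terms": {"field": "source"}}}
    print(apply_sampling(terms, random_sampler_probability=0.1))
    print(apply_sampling(terms, sample_shard_size=200))
```

Sampled aggregations return approximate counts, so they suit dashboards and statistics better than exact reporting.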


Various caches can be tuned to improve performance, including the following:

  • Authentication Cache

  • Sources Cache

  • Facets Mapping Cache

  • Query Cache

Authentication Cache#

The topic and provider APIs will contact the user API on every request to validate the provided authentication or refresh token and its access to the requested project.

To avoid this, the auth_cache can be enabled with the following line in /etc/squirro/common.ini:

auth_cache = {"type": "MemoryLRU", "max_items": 1000, "timeout": "5m"}


Caveats:

  • Only successful authentications are cached.

  • Expired tokens remain valid for up to the configured timeout beyond their actual expiration.

  • User permission changes in projects take up to the configured timeout to take effect.
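The trade-offs listed above follow from how a size-bounded cache with a timeout behaves. The sketch below is a simplified illustration, not the MemoryLRU implementation used by Squirro: `TimedLRUCache`, `authenticate`, and the `validate` callback are hypothetical names, and the clock is injectable so the expiry can be demonstrated without waiting.

```python
import time
from collections import OrderedDict
from typing import Callable, Hashable, Optional


class TimedLRUCache:
    """Keeps at most `max_items` entries, each valid for `timeout_seconds`."""

    def __init__(self, max_items: int, timeout_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_items = max_items
        self._timeout = timeout_seconds
        self._clock = clock
        self._entries: "OrderedDict[Hashable, tuple]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[object]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._timeout:
            del self._entries[key]  # Entry outlived the timeout.
            return None
        self._entries.move_to_end(key)  # Mark as most recently used.
        return value

    def put(self, key: Hashable, value: object) -> None:
        self._entries[key] = (self._clock(), value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_items:
            self._entries.popitem(last=False)  # Evict least recently used.


def authenticate(token: str, cache: TimedLRUCache,
                 validate: Callable[[str], Optional[str]]) -> Optional[str]:
    """Return the user for `token`, asking `validate` only on a cache miss."""
    cached = cache.get(token)
    if cached is not None:
        return cached  # May be stale for up to the timeout.
    user = validate(token)
    if user is not None:
        cache.put(token, user)  # Only successful authentications are cached.
    return user
```

Until an entry times out, a revoked token or a changed permission is still answered from the cache, which is why a shorter timeout reduces staleness at the cost of more lookups.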

Sources Cache#

The topic API needs to check the existing sources, facets, and other information about the project when responding to queries.

While not a big overhead, that can still add up.

To enable a cache for these lookups, the project metadata cache can be set up in /etc/squirro/topic.ini as follows:

metadata_cache = {"type": "MemoryLRU", "timeout": "5m"}


Caveats:

  • New sources are delayed: items for these sources are not visible for up to the configured timeout.

  • New facets do not work correctly for querying or for displaying items for up to the configured timeout.

  • Newly added communities are delayed: items are not tagged with them for up to the configured timeout.

Facets Mapping Cache#

Facets that are present in the Elasticsearch index need to be looked up as part of querying.

This overhead can be removed with the Redis-cache-based facet mapping cache in /etc/squirro/common.ini as follows:

facet_mapping_cache_timeout_secs = 300

Caveat: New facets will not work correctly for querying or for displaying items for up to facet_mapping_cache_timeout_secs seconds.

Query Cache#

Query responses are by default cached in Redis.

When access control is applied, this cache provides little benefit yet still consumes considerable resources.

As a result, it may be faster to disable this cache in /etc/squirro/common.ini as follows:

query_results_cache_enabled = false

Leveraging Multiple CPU Cores#

Out of the box, each Squirro service can use only one CPU core.

This makes sense on smaller machines and in development environments, as Squirro consists of many small services.

However, under high load on a production server with lots of CPU cores (e.g. 8+) this can become a limiting factor.

If response times become slow and you observe Squirro Python processes at 100% CPU, it might be time to allow the service to fork multiple processes.

How to Fork Multiple Processes#

To fork multiple processes, add the following [server] section to the service's config file, e.g. /etc/squirro/topic.ini:

[server]
fork = true
min_spare = 2
max_spare = 2

Then, restart the service.

After this change, you will see three topic service processes: one main process and two workers.

Reference: Learn more about the [server] options by reading the Apache MPM Prefork documentation.


Be judicious with increases. Try two to start, measure, then increase further if needed.
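For administrators who script configuration changes, the sketch below adds the forking options to the text of a service config file using Python's standard configparser module. The function `enable_forking` is hypothetical, and the section name [server] is taken from the reference above. Note that configparser drops comments and normalizes key case when writing, so review the result before replacing a production file.

```python
import configparser
import io


def enable_forking(ini_text: str, workers: int = 2) -> str:
    """Return `ini_text` with fork, min_spare, and max_spare set under [server]."""
    # Interpolation is disabled so values containing "%" are left untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section("server"):
        parser.add_section("server")
    parser["server"]["fork"] = "true"
    parser["server"]["min_spare"] = str(workers)
    parser["server"]["max_spare"] = str(workers)
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # The [topic] section and its key are placeholders for existing content.
    print(enable_forking("[topic]\nexample_key = example_value\n"))
```

Starting with two workers matches the advice above: measure, then raise the number only if the service is still CPU-bound.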

Supported Services#

Not all services support forking.

The following services have been tested and support forking:

  • provider

The machinelearning service has not supported forking via the Apache MPM Prefork framework since version 3.9.5, when it was migrated to the FastAPI web framework.

FastAPI-based Services#


The settings described below have only been tested for the machinelearning service. While other FastAPI-based services might support these settings, they remain untested. These services include, but are not limited to, indexmanager, notes, pdfconversion, plumber, and search.

Since version 3.9.5, the machinelearning service has used the FastAPI web framework. Squirro services that employ FastAPI are served by default by Uvicorn, an ASGI web server. Consequently, they do not support any of the options listed in the [server] section above (such as fork, min_spare, and max_spare). Although these options can be configured, they have no effect.

To leverage multi-core CPUs, Uvicorn can be configured to spawn multiple workers (for further details, refer to the Uvicorn Documentation). This setting is exposed in Squirro services through the following configuration:

uvicorn_workers = 2

Another option is to utilize the Gunicorn web server instead of Uvicorn. Despite Gunicorn being a WSGI server, Uvicorn offers a Gunicorn worker class, enabling Gunicorn to serve FastAPI-based services. To use Gunicorn, the web_server setting must be explicitly set to the gunicorn value:

web_server = gunicorn

Setting web_server to gunicorn allows for the configuration of the number of workers and other Gunicorn-related options. These options can be found in the Gunicorn Documentation.

Currently, the following options are supported:

web_server = gunicorn
uvicorn_workers = 2
max_requests = 1000
max_requests_jitter = 50
timeout = 60
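In Gunicorn, max_requests restarts a worker after it has handled that many requests, and max_requests_jitter adds a random amount between 0 and the jitter value to each worker's limit so that workers do not all restart at once. The sketch below reproduces that calculation for illustration. The function `restart_thresholds` is hypothetical, and it assumes Squirro passes these options to Gunicorn unchanged.

```python
import random
from typing import List, Optional


def restart_thresholds(workers: int, max_requests: int,
                       max_requests_jitter: int,
                       seed: Optional[int] = None) -> List[int]:
    """Return the request count after which each worker would restart."""
    rng = random.Random(seed)
    # Each worker draws its own jitter, spreading restarts over time.
    return [max_requests + rng.randint(0, max_requests_jitter)
            for _ in range(workers)]


if __name__ == "__main__":
    # Values taken from the example configuration above.
    print(restart_thresholds(workers=2, max_requests=1000, max_requests_jitter=50))
```

With the example values, each worker restarts somewhere between 1000 and 1050 requests, which limits the effect of slow memory growth without taking all workers down together.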