convert.ini

The convert.ini config file, located at /etc/squirro/convert.ini, configures content conversion.

Memory Limit

New in version 3.6.9: Squirro 3.6.9 and later use an external, dedicated Tika service by default, so the auto-spawner settings do not apply.

If no Tika server is already running at http://localhost:9998, the Squirro Ingester service auto-spawns one. A memory limit can be configured for the spawned Tika server.
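The check described above can be approximated from the outside with a small probe. This is only a sketch and not Squirro's actual detection logic: the function name `tika_is_reachable` and the `opener` parameter are hypothetical, and it assumes the Tika server answers on its `/version` endpoint.

```python
import urllib.error
import urllib.request


def tika_is_reachable(base_url, timeout=2.0, opener=urllib.request.urlopen):
    """Return True if a server at base_url answers GET /version with HTTP 200.

    The opener parameter is a hypothetical hook so the probe can be
    exercised without a running server.
    """
    url = base_url.rstrip("/") + "/version"
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

Calling `tika_is_reachable("http://localhost:9998")` on the Squirro host gives a quick indication of whether a Tika server is already listening at the default address.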

The memory is limited to avoid running into out-of-memory exceptions when converting. For large documents, the default setting may be too low. When this happens, there will be aborted items in the index with SQ-05105 processing errors.

In the apache-tika section, use the vmargs option.

Key: vmargs
Usage: Java VM parameters. See Oracle’s Java HotSpot VM Options reference for the options.
Default: -Xmx512M,-Xms64M
Example:

    [apache-tika]
    # comma-separated list of additional Java Virtual Machine command-line options
    # to use
    vmargs = -Xmx512M,-Xms64M
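To illustrate how the comma-separated vmargs value breaks down into individual JVM options, the following sketch parses the example above with Python's standard configparser module. The helper names (`read_vmargs`, `with_max_heap`) and the `2G` heap size are hypothetical and chosen only for illustration; this is not how Squirro itself reads the file.

```python
import configparser

SAMPLE = """
[apache-tika]
# comma-separated list of additional Java Virtual Machine command-line options
# to use
vmargs = -Xmx512M,-Xms64M
"""


def read_vmargs(ini_text):
    """Return the vmargs option as a list of individual JVM options."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    raw = parser.get("apache-tika", "vmargs", fallback="")
    return [arg.strip() for arg in raw.split(",") if arg.strip()]


def with_max_heap(vmargs, size):
    """Return a copy of vmargs with the -Xmx (maximum heap) option replaced."""
    others = [arg for arg in vmargs if not arg.startswith("-Xmx")]
    return ["-Xmx" + size] + others


vm_args = read_vmargs(SAMPLE)
print(vm_args)
# Build a value with a larger heap (2G is an arbitrary illustrative size).
print(",".join(with_max_heap(vm_args, "2G")))
```

The second printed line is a value that could be written back to the vmargs option if large documents are being aborted with the default limit.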

Using External Tika Service

The Tika web service can be configured to run separately for more fine-grained control.

New in version 3.6.9: Squirro releases 3.6.9 and later use the dedicated Tika service by default.

Key: tika-url
Usage: Point to a Tika server.
Default: http://localhost:9998
Example:

    [apache-tika]
    tika-url = http://localhost:9998
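As a rough illustration of how the tika-url option and its default interact, the sketch below resolves the effective URL from a config text with Python's standard configparser module. The function name `read_tika_url` and the host `tika.example` are hypothetical; the fallback value is the default stated above.

```python
import configparser

DEFAULT_TIKA_URL = "http://localhost:9998"


def read_tika_url(ini_text):
    """Return the configured tika-url, or the documented default if unset."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.get("apache-tika", "tika-url", fallback=DEFAULT_TIKA_URL)


# No [apache-tika] section at all: the default applies.
print(read_tika_url(""))

# An explicit setting pointing at a separately managed Tika server
# (hypothetical host name).
print(read_tika_url("[apache-tika]\ntika-url = http://tika.example:9998\n"))
```

On a real installation, the same function could be given the contents of /etc/squirro/convert.ini instead of the inline strings.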

The tika systemd service is provided by the squirro-tika-server package, and tika-server uses the /etc/tika-config.xml configuration file. For all available options, see Tika Server Configuration.

Please note that changing the default configuration is not recommended, as it must be compatible with the Squirro Ingester and required configurations may change in the future.