Configure PostgreSQL as the Database Backend#
Squirro supports PostgreSQL as an alternative SQL database backend to MariaDB. This page describes how to configure PostgreSQL for Ansible-based and on-premises Linux deployments.
Important
PostgreSQL is supported for new Squirro installations only. Migrating data from an existing MariaDB deployment to PostgreSQL is not supported through this procedure. To migrate an existing instance, visit the Squirro Support website and submit a technical support request.
PostgreSQL is supported on currently supported Squirro releases. Each Squirro service creates its database schema automatically the first time it starts against an empty database.
Note
For containerized deployments, currently available as a technical preview only, PostgreSQL is the default database. No additional configuration is required for a fresh installation. For guidance specific to containerized setups, visit the Squirro Support website and submit a technical support request.
Ansible-Based Deployment#
By default, the Squirro Ansible role uses MariaDB as the database. To use PostgreSQL instead, set the database_type variable to postgres in your playbook.
Important
The Ansible role supports PostgreSQL only as an external (remote) database. The role does not install or manage a local PostgreSQL server, and it does not create the required databases and roles. Provision the PostgreSQL server and create the databases and roles in advance, then set remote_postgres_server to true so that the role configures the Squirro services to connect to it. To create the databases and roles, follow the SQL steps in the On-Premises Linux Deployment section below.
When database_type is set to postgres, the role installs the psycopg2 PostgreSQL driver and writes the PostgreSQL connection strings into the Squirro service configuration files.
Prerequisites#
Before running the playbook, provision an external PostgreSQL server and create one role and one database for each of the following Squirro services:
configurationdatasourceemailsenderfilteringmachinelearningnotesplumberschedulerstudiotopicuser
The GenAI service also uses a database, which is configured separately. See the GenAI configuration step in the on-premises section for details.
Each database uses UTF8 encoding with en_US.UTF-8 collation and character type. The Create Database Users and Databases steps in the on-premises section provide the exact SQL commands.
Configuration#
To connect to an external PostgreSQL server such as AWS RDS, set the following variables in your playbook:
database_type: postgres
remote_postgres_server: true
postgres_host: postgres.example.com
postgres_port: 5432
postgres_shared_service_password: <your_password>
The role uses a single shared password for all services by default. To assign a dedicated password to a service, set the corresponding per-service variable.
To use an SSL/TLS connection to the PostgreSQL server, set the following variables:
postgres_ssl: true
postgres_ssl_ca_certs: /etc/ssl/certs/ca-bundle.trust.crt
Per-Service Passwords#
Each Squirro service can use a dedicated database password. Each variable defaults to the value of postgres_shared_service_password. Set any of the following variables to override the shared password for a specific service:
postgres_configuration_passwordpostgres_datasource_passwordpostgres_emailsender_passwordpostgres_filtering_passwordpostgres_machinelearning_passwordpostgres_notes_passwordpostgres_plumber_passwordpostgres_scheduler_passwordpostgres_studio_passwordpostgres_topic_passwordpostgres_user_password
For more granular control, each service also accepts a full connection string through a postgres_<service>_uri variable, which overrides the host, port, and password variables for that service.
For a description of each variable, see the Ansible Role Variables Reference page.
On-Premises Linux Deployment#
This section describes how to install and configure PostgreSQL as the metadata storage on a single-node, on-premises Linux deployment. The procedure targets PostgreSQL 16 on Rocky Linux 8 and Red Hat Enterprise Linux 8.
A standard Squirro installation provisions MariaDB by default. This procedure configures PostgreSQL as the backend instead, before the instance holds any data, and turns off the MariaDB service at the end.
Important
This procedure configures PostgreSQL manually and is intended for nodes that are not managed by the Squirro Ansible role. On an Ansible-managed node, the role reasserts the Squirro service configuration files on every run and can overwrite these manual changes. For Ansible-managed deployments, use the Ansible-Based Deployment section above instead.
Note
The default password squirro/4u is used throughout this procedure for consistency with the Squirro service defaults. Replace it with a secure password in production deployments.
For supported PostgreSQL version compatibility, visit the Squirro Support website and submit a technical support request.
Install PostgreSQL#
Install the PostgreSQL server on the Squirro node.
On Rocky Linux 8 and Red Hat Enterprise Linux 8, install PostgreSQL with the dnf package manager:
# Install the repository RPM
sudo dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-8-x86_64/pgdg-redhat-repo-latest.noarch.rpm
# Turn off the built-in PostgreSQL module
sudo dnf -qy module disable postgresql
# Install PostgreSQL
sudo dnf install -y postgresql16-server
After installation, initialize the database, then start and turn on the service:
# Initialize the database
sudo /usr/pgsql-16/bin/postgresql-16-setup initdb
# Turn on and start the systemd service
sudo systemctl enable postgresql-16
sudo systemctl start postgresql-16
To verify that the PostgreSQL server is running:
# Check the service status
systemctl status postgresql-16
# Become the OS user that runs PostgreSQL
sudo su - postgres
# Open psql as the postgres user
psql
# List the existing databases
\l
For more information, see the PostgreSQL installation documentation website.
Create Database Users and Databases#
Each Squirro service uses a dedicated database and role. Before creating the roles, confirm that PostgreSQL uses scram-sha-256 as the password encryption method, so that the passwords created in the next step are hashed correctly. PostgreSQL 16 uses scram-sha-256 by default. To set it explicitly, edit /var/lib/pgsql/16/data/postgresql.conf:
password_encryption = scram-sha-256
Reload the configuration, then create the roles and databases:
# Become the OS postgres user
sudo su - postgres
# Reload the settings
/usr/pgsql-16/bin/pg_ctl reload
# Create the Squirro roles and the per-service databases
sq_users_dbs=('configuration' 'datasource' 'emailsender' 'filtering' 'genai' 'machinelearning' 'notes' 'plumber' 'scheduler' 'studio' 'topic' 'user')
for sq_user_db in "${sq_users_dbs[@]}"; do
# Escaped quotes are required because 'user' is a reserved word.
# The encoding and locale are set explicitly so the result does not depend on the template database defaults.
psql -c "create role \"${sq_user_db}\" with login password 'squirro/4u';" -c "create database \"${sq_user_db}\" with owner \"${sq_user_db}\" encoding 'UTF8' lc_collate 'en_US.UTF-8' lc_ctype 'en_US.UTF-8' template template0;"
done
Note
The create role and create database commands fail if the role or database already exists. Run this step only once against a fresh PostgreSQL server. To rerun it, drop the existing roles and databases first.
Verify that every Squirro database uses UTF8 encoding with en_US.UTF-8 collation and character type:
# List the databases and check their owners and encoding
psql -c "\l"
# Confirm that passwords were hashed using SCRAM-SHA-256
psql -c "select * from pg_authid;"
Configure Authentication#
Client authentication is controlled by the pg_hba.conf file. In a single-node deployment, the Squirro services and the PostgreSQL server run on the same node, so the relevant connection type is host on the loopback address. Squirro recommends the scram-sha-256 authentication method.
Edit /var/lib/pgsql/16/data/pg_hba.conf to use scram-sha-256 for IPv4 and IPv6 loopback connections. PostgreSQL 16 uses this method by default:
# TYPE DATABASE USER ADDRESS METHOD
# "local" is for Unix domain socket connections only
local all all peer
# IPv4 local connections:
host all all 127.0.0.1/32 scram-sha-256
# IPv6 local connections:
host all all ::1/128 scram-sha-256
After changing the file, reload the settings:
sudo su - postgres
/usr/pgsql-16/bin/pg_ctl reload
Configure Squirro Services#
Point each Squirro service at PostgreSQL by editing the db option in the configuration files in /etc/squirro/. The PostgreSQL connection string uses the format postgresql://<role>:<password>@localhost/<database>, for example postgresql://configuration:squirro/4u@localhost/configuration.
The following snippet updates the db option in every affected configuration file:
for config_file in /etc/squirro/*.ini; do
sed -i "s/^db = mysql+mariadbconnector\(.*\)?charset=utf8$/db = postgresql\1/" "$config_file"
done
Note
This command only rewrites db lines that use the mysql+mariadbconnector driver and end in ?charset=utf8. Lines in any other format are left unchanged, so confirm the result with the verification step below. The command also preserves the existing user name, password, and host from the MariaDB connection string. Adjust them manually if your PostgreSQL roles use different credentials.
To confirm that every service now points at PostgreSQL:
grep -E '^db =' /etc/squirro/*.ini
The GenAI service stores the database configuration in the /etc/squirro/genai.d/run.sh script rather than in an .ini file. Change the line that sets the SQ_DATABASE_URL environment variable from:
echo "SQ_DATABASE_URL=mysql+pymysql://genai:squirro/4u@sql/genai?charset=utf8mb4"
to:
echo "SQ_DATABASE_URL=postgresql+psycopg2://genai:squirro/4u@localhost/genai"
Important
The GenAI database configuration reverts to the MariaDB default on every instance upgrade. To prevent the upgrade from overwriting the configuration, turn off the instance upgrade. Go to Server → Upgrade Instance and set the enable_upgrade_instance option to false in /etc/squirro/topic.ini:
[studio]
# Set to true to turn on the upgrade_instance plugin
enable_upgrade_instance = false
Restart the topic and topicproxy services after the change:
systemctl restart sqtopicd
systemctl restart sqtopicproxyd
Finally, switch the deployment over to PostgreSQL:
# Stop the Squirro services
squirro_stop
# Stop and turn off MariaDB
systemctl stop mariadb
systemctl disable mariadb
# Restart the Squirro services
squirro_restart
If the mariadb service keeps restarting after it is turned off, mask it:
systemctl mask mariadb
For setup guidance specific to your environment, visit the Squirro Support website and submit a technical support request.