How to Set Up Oracle Database as Squirro’s Metadata Storage#
This page provides instructions on how to configure Squirro to work with the Oracle Database (instead of MariaDB or PostgreSQL) on a fresh single-node deployment.
Note
This integration has been developed and tested on Oracle Database 21c Express Edition on a Rocky Linux 8 machine with Squirro.
Overview of the Steps#
There are four primary steps to this process, laid out in detail below:
Set up prerequisites, including a running Oracle Database instance.
Configure Oracle for Squirro.
Configure Squirro to use Oracle by installing the
cx_Oracle
package and editing the configuration files.Restart Squirro.
Note: Utility snippets are also provided at the end of this page.
I: Set Up Prerequisites#
Firstly, you must have an Oracle Database instance up and running.
Reference: If you require assistance setting one up, see Oracle’s Official Installation Guide for help.
After you have an instance up and running as per the guide above, perform the following three steps:
Change to the
oracle
system user:
sudo su - oracle
Add the following to the end of
~/.bashrc
:
# Setting the Oracle Database XE Environment Variables
export ORACLE_SID=XE
export ORAENV_ASK=NO
export ORACLE_HOME=/opt/oracle/product/21c/dbhomeXE
. /opt/oracle/product/21c/dbhomeXE/bin/oraenv
Install
rlwrap
to make it a bit easier to work withsqlplus
:
dnf install rlwrap
II: Configure Oracle for Squirro#
Firstly, you must configure Oracle by creating a separate pluggable database (PDB) for every Squirro service.
In every PDB, we create two users:
an admin user
a user that will be used by the service itself to connect to the PDB.
The admin user is optional (but considered a good practice) and not needed by Squirro to operate.
You can skip the creation of the admin user if you think you won’t need it.
sq_user_pwd='<sq_user_pwd>'
pdb_admin_pwd='<pdb_admin_pwd>'
sq_users_dbs=('configuration' 'datasource' 'emailsender' 'filtering' 'fingerprint' 'machinelearning' 'notes' 'plumber' 'scheduler' 'studio' 'topic' 'trends' 'squser')
for sq_user_db in "${sq_users_dbs[@]}"; do
echo "Setting up the PDB for the $sq_user_db service..."
sqlplus / as sysdba << EOF
CREATE PLUGGABLE DATABASE $sq_user_db ADMIN USER admin IDENTIFIED BY $pdb_admin_pwd FILE_NAME_CONVERT = ('/opt/oracle/oradata/XE/pdbseed/', '/opt/oracle/oradata/XE/$sq_user_db/');
ALTER PLUGGABLE DATABASE $sq_user_db OPEN;
ALTER SESSION SET CONTAINER=$sq_user_db;
CREATE TABLESPACE $sq_user_db DATAFILE '/opt/oracle/oradata/XE/$sq_user_db/${sq_user_db}01.dbf' SIZE 100M AUTOEXTEND ON NEXT 10M MAXSIZE UNLIMITED;
CREATE USER $sq_user_db IDENTIFIED BY $sq_user_pwd DEFAULT TABLESPACE $sq_user_db TEMPORARY TABLESPACE temp;
GRANT CONNECT, RESOURCE TO $sq_user_db;
ALTER USER $sq_user_db QUOTA UNLIMITED ON $sq_user_db;
EOF
echo "Setup of the PDB for the $sq_user_db service has been completed."
done
Note
It’s best not to include any special characters, such as /
, in the password. Special characters don’t tend to work well with Oracle. Including special characters may result in an inability to connect to the database.
For the user service, we create the database (PDB) and the database user as squser
instead of user
because it is a reserved word and Oracle does not allow that, as shown below:
SQL> CREATE PLUGGABLE DATABASE user ADMIN USER admin IDENTIFIED BY <pdb_admin_pwd> FILE_NAME_CONVERT = ('/opt/oracle/oradata/XE/pdbseed/', '/opt/oracle/oradata/XE/user/')
*
ERROR at line 1:
ORA-65000: missing or invalid pluggable database name
In PostgreSQL, this can be circumvented by quoting the identifier, but Oracle does not support that. Therefore, a prefix is added to differentiate it.
III: Configure Squirro to Use Oracle#
Now, we need to configure Squirro to use Oracle as its database. This involves two steps:
Install the
cx_Oracle
package.Edit the Squirro configuration files.
Install cx_Oracle#
Firstly, we need to install the package to use Oracle from Python:
squirro_activate
pip install cx_Oracle
mkdir -p /opt/oracle
cd /opt/oracle
wget https://download.oracle.com/otn_software/linux/instantclient/2112000/instantclient-basic-linux.x64-21.12.0.0.0dbru.zip
unzip instantclient-basic-linux.x64-21.12.0.0.0dbru.zip
rm -f instantclient-basic-linux.x64-21.12.0.0.0dbru.zip
yum install libaio
sh -c "echo /opt/oracle/instantclient_21_12 > /etc/ld.so.conf.d/oracle-instantclient.conf"
ldconfig
You can execute the following in a Python interpreter to evaluate if connection can be established successfully to one of the created PDBs:
dialect = "oracle+cx_oracle"
username = "datasource"
password = "<sq_user_pwd>"
hostname = "test-oracle-database.squirro.cloud"
port = 1539
service = "datasource.squirro.cloud"
import cx_Oracle
connection = cx_Oracle.connect(user=username, password=password, dsn=f"{hostname}:{port}/{service}")
cursor = connection.cursor()
cursor.execute("create table test (id number primary key)")
connection.commit()
for table_name in cursor.execute("select table_name from user_tables"):
print(f"Values: {table_name}")
cursor.close()
connection.close()
Tip
As the oracle system user, you can use the lsnrctl status
command to discover the correct hostname
and service
values.
Having verified that we can talk to Oracle from Python, we can also evaluate whether we can talk to it from SQLAlchemy by creating the tables of one of the Squirro services:
dialect = "oracle+cx_oracle"
username = "datasource"
password = "<sq_user_pwd>"
hostname = "test-oracle-database.squirro.cloud"
port = 1539
service = "datasource.squirro.cloud"
import cx_Oracle
from sqlalchemy import create_engine
from squirro.service.datasource.model import Base
dsn = cx_Oracle.makedsn(hostname, port, service_name=service)
database_url = f"{dialect}://{username}:{password}@{dsn}"
engine = create_engine(database_url)
Base.metadata.create_all(engine)
And we can verify from sqlplus that the tables have been created in the appropriate PDB:
# first ensure that you are the oracle system user
sudo su - oracle
rlwrap sqlplus [email protected]:1539/datasource.squirro.cloud
# input password
# verify that we are in datasource PDB
show con_name;
# display the tables in the PDB
select table_name from user_tables;
# you can drop the TEST table created previously (optional)
drop table test;
Edit Squirro Configuration Files#
You will then need to edit all the configuration files that define communication with the database and update them appropriately to connect to Oracle instead of MariaDB:
You can then verify that the database connection strings have been updated correctly for all the services by examining the output of the following command:
grep -E '^db =' /etc/squirro/*.ini
Secondly, you’ll need to disable the execution of the database migration scripts included in every service. This is because many migration scripts are not database agnostic.
However, in a fresh installation you don’t need to execute them, because the schema is created directly from the defined SQLAlchemy models.
To disable this, open the /etc/squirro/common.ini
file and add the following:
[migration]
execute_at_startup = false
Thirdly, you’ll need to remove the ssl
and ssl_ca
options of the [db]
section because they are not supported by cx_Oracle
.
To do so, open the /etc/squirro/common.ini
file and remove or comment out them:
[db]
#ssl = False
#ssl_ca = /etc/ssl/certs/ca-bundle.trust.crt
IV: Restart Squirro#
Finally, you must stop all the Squirro services, stop and disable the MariaDB service, and restart all the Squirro services as follows:
squirro_stop
systemctl stop mariadb; systemctl disable mariadb
squirro_restart
Utility Snippets#
Use the following snippet to open all the PDBs of Squirro after a database restart, as they are reported as closed after restart:
sq_users_dbs=('configuration' 'datasource' 'emailsender' 'filtering' 'fingerprint' 'machinelearning' 'notes' 'plumber' 'scheduler' 'studio' 'topic' 'trends' 'squser')
for sq_user_db in "${sq_users_dbs[@]}"; do
echo "Opening the PDB of the $sq_user_db service..."
sqlplus / as sysdba << EOF
ALTER PLUGGABLE DATABASE $sq_user_db OPEN;
EOF
echo "Done."
done
Use the following snippet to drop all the PDBs of Squirro:
sq_users_dbs=('configuration' 'datasource' 'emailsender' 'filtering' 'fingerprint' 'machinelearning' 'notes' 'plumber' 'scheduler' 'studio' 'topic' 'trends' 'squser')
for sq_user_db in "${sq_users_dbs[@]}"; do
echo "Dropping the PDB of the $sq_user_db service..."
sqlplus / as sysdba << EOF
ALTER PLUGGABLE DATABASE $sq_user_db CLOSE IMMEDIATE;
DROP PLUGGABLE DATABASE $sq_user_db INCLUDING DATAFILES;
EOF
echo "Dropped successfully the PDB of the $sq_user_db service."
done