Installing Squirro Using Ansible#
This page provides instructions on how to install Squirro using Red Hat’s Ansible Automation Platform.
Note
If you prefer to deploy Squirro as a service, see start.squirro.com.
Overview#
The Ansible plugin is the recommended way of managing Squirro. It supports standalone installations as well as highly customized, multi-node deployments using Ansible AWX/Tower.
If you still want to install Squirro manually, see Installing Squirro on Linux. With that said, Squirro strongly recommends the Ansible plugin.
This is the way ;-)
Download#
The current beta version is available on the Squirro mirror:
Download the Squirro Ansible Module
Note: You must have a valid Squirro mirror user account to access this download. If you require assistance, contact Squirro Support.
Quickstart Installation#
If you’re looking for a quick single-server setup of the latest Squirro release, follow the steps below:
Provision a Rocky Linux 8, RHEL 8, CentOS 7, or RHEL 7 server. Squirro recommends AWS, GCP, Azure, Hetzner Cloud, or any physical server.
For development, Vagrant is a great choice, but any VM solution such as VirtualBox or VMware Workstation/Fusion/Player will also work.
Ensure your setup meets the requirements detailed in System Requirements.
Log in to the server and become root.
Install Ansible using the following commands:
yum install -y epel-release
yum install -y ansible
Unzip the downloaded Squirro Ansible Module using the following command:
unzip squirro-ansible-x.x.x-beta-release.zip
Edit the playbook-quickstart.yml file to make the yum username and password match your Squirro mirror username and password.
Example code shown below:
- name: Quickstart Install Squirro
  hosts: all
  become: true
  vars:
    squirro_clusternode: True
    squirro_storagenode: True
    yum_user: ...
    yum_password: ...
    squirro_channel: stable
    squirro_version: latest
    elasticsearch_discovery_type: single-node
  roles:
    - role: squirro-ansible
Install Squirro using the following command:
ansible-playbook --connection=local --inventory 127.0.0.1, playbook-quickstart.yml
Note
Depending on your system, the installation can take 15-20 minutes.
Open the HTTP and HTTPS ports on the firewall and reload the rules so the permanent changes take effect, as shown below:
firewall-cmd --zone=public --permanent --add-service=http
firewall-cmd --zone=public --permanent --add-service=https
firewall-cmd --reload
Verify that all services are running using the following command:
squirro_status
Access the service at https://your-systems-ip or http://your-systems-ip.
Note
To learn how to use Squirro, visit the Squirro Academy at learn.squirro.com.
Offline Installation#
If your machine cannot reach mirror.squirro.net, you can install Squirro offline as shown below:
- name: Quickstart Install Squirro
  hosts: all
  become: true
  vars:
    squirro_clusternode: True
    squirro_storagenode: True
    squirro_install_mode: offline
    squirro_packages_tar: /path/to/os-n.n-stable-x86_64-n.n.n.tar.gz
    squirro_channel: stable
    squirro_version: latest
    elasticsearch_discovery_type: single-node
  roles:
    - role: squirro-ansible
The corresponding tar.gz file can be downloaded from the mirror.
Example: For Rocky Linux 8, you can get the correct package from: https://mirror.squirro.net/rocky/8/stable/x86_64/.
If you expect to run multiple installations, you can speed things up by extracting the tar.gz file and placing it on a shared filesystem, as shown below:
- name: Quickstart Install Squirro
  hosts: all
  become: true
  vars:
    squirro_clusternode: True
    squirro_storagenode: True
    squirro_install_mode: filesystem
    yum_repo_folder: /path/to/offline/yum/repo
    squirro_channel: stable
    squirro_version: latest
    elasticsearch_discovery_type: single-node
  roles:
    - role: squirro-ansible
This is faster because the archive does not have to be decompressed on every run.
Note: The yum_repo_folder must point to the directory that contains the repodata folder.
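The role variables listed later on this page also describe a custom_url install mode, in which the yum repository is hosted on your own HTTP or HTTPS server. The following is a minimal sketch of such a playbook, not a tested configuration. It assumes that custom_url is paired with the yum_repo_url variable and that no further variables are needed for this mode; the URL is a hypothetical placeholder for an internal web server that hosts a full Squirro release.

```yaml
# Sketch only: the repository URL below is a hypothetical placeholder,
# not a real Squirro endpoint.
- name: Quickstart Install Squirro
  hosts: all
  become: true
  vars:
    squirro_clusternode: True
    squirro_storagenode: True
    # Assumption: custom_url mode reads the repository location from yum_repo_url.
    squirro_install_mode: custom_url
    yum_repo_url: https://repo.example.internal/squirro/
    squirro_version: latest
    elasticsearch_discovery_type: single-node
  roles:
    - role: squirro-ansible
```

squirro_channel is left out here because its description states that it only applies to the online install mode.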
AWS Standalone Playbook#
Expanding on the quickstart example: when deploying Squirro on AWS, it is best practice to leverage managed services to increase availability and enable horizontal scaling.
In the example below, the following are leveraged:
RDS to externalize MariaDB/MySQL
ElastiCache to externalize the Key/Value Store
ElastiCache again to externalize the In Memory Cache
EFS as a shared filesystem
These resources can be spun up manually in the AWS web console.
Important: Squirro highly recommends you leverage infrastructure automation solutions such as CloudFormation or Terraform.
Once complete, edit the playbook-quickstart-aws.yml file to fill in all the blanks and adjust the RDS and ElastiCache endpoints, credentials, and IPs.
- name: Quickstart Install Squirro
  hosts: all
  become: true
  vars:
    squirro_clusternode: True
    squirro_storagenode: True
    yum_user: ...
    yum_password: ...
    squirro_channel: stable
    squirro_version: latest
    elasticsearch_squirro_cluster_name: squirro-vagrant-testing
    elasticsearch_cluster_nodes: ['10.1.0.2', '10.1.0.3', '...']
    elasticsearch_network_interface: eth0
    remote_mysql_server: True
    mysql_host: yourproject-db.abcdefghij.eu-central-1.rds.amazonaws.com
    mysql_root_user: root
    mysql_root_password: ...
    mysql_shared_service_password: ...
    remote_redis_server: True
    redis_tls: True
    redis_storage_host: master.storage-abcdefgh.euc1.cache.amazonaws.com
    redis_storage_port: 6379
    redis_storage_password: ...
    redis_cache_host: master.storage-abcdefgh.euc1.cache.amazonaws.com
    redis_cache_port: 6380
    redis_cache_password: ...
    remote_filesystem: True
    remote_filesystem_path: /mnt/efs/squirro
  roles:
    - role: squirro-ansible
Then follow the same steps as in the Quickstart section to install Ansible, but run the following command to perform the installation:
ansible-playbook --connection=local --inventory 127.0.0.1, playbook-quickstart-aws.yml
You can now repeat this on each EC2 instance.
Tip: For production deployments, Squirro typically recommends at least three instances.
This procedure can be fully automated by using cloud-init/user data to bootstrap new instances, and/or by building AMI images with frameworks such as Packer.
Caution
This serves only as an example. For full production readiness, you need to delegate the secrets to a secrets manager (e.g. HashiCorp Vault) and also leverage the AWS EC2 discovery plugin for Elasticsearch.
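A small first step in that direction, short of a full secrets manager, is to keep credentials out of the playbook file and read them from environment variables with Ansible's built-in env lookup. The fragment below is a sketch under that assumption and does not replace a proper secrets manager. The environment variable names are hypothetical and must be exported in the shell that runs ansible-playbook.

```yaml
# Sketch only: the SQUIRRO_* environment variable names are hypothetical.
# The env lookup is evaluated on the machine that runs ansible-playbook.
- name: Quickstart Install Squirro
  hosts: all
  become: true
  vars:
    squirro_clusternode: True
    squirro_storagenode: True
    yum_user: "{{ lookup('env', 'SQUIRRO_YUM_USER') }}"
    yum_password: "{{ lookup('env', 'SQUIRRO_YUM_PASSWORD') }}"
    remote_mysql_server: True
    mysql_host: yourproject-db.abcdefghij.eu-central-1.rds.amazonaws.com
    mysql_root_user: root
    mysql_root_password: "{{ lookup('env', 'SQUIRRO_MYSQL_ROOT_PASSWORD') }}"
    mysql_shared_service_password: "{{ lookup('env', 'SQUIRRO_MYSQL_SERVICE_PASSWORD') }}"
  roles:
    - role: squirro-ansible
```

Note that the env lookup yields an empty string when a variable is not set, so it is worth checking that all of them are exported before starting the run.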
Role Variables#
The following variables can be set to control the squirro-ansible role:
ENTRY POINT: main - The main entry point for the squirro-ansible role.
OPTIONS (= is mandatory):
- config_decrypt_command
Can be set to a command line utility that will be called when
the Squirro .ini files encounter prefix.
[Default: None]
type: str
- custom_pip_index_url
Custom pip index-url for the Squirro virtualenv. This is
only relevant if additional pip packages/wheels need to be
installed and the installation cannot reach the internet /
PyPI.
[Default: ]
type: str
- database_type
Set the sql database squirro is using. mysql is default and
recommended. postgres support is currently experimental, see
https://nektoon.atlassian.net/wiki/x/AQAng
(Choices: mysql, postgres)[Default: mysql]
type: str
- elasticsearch_cluster_nodes
List of hostnames or IP addresses of all Elasticsearch nodes.
This is only relevant if `elasticsearch_discovery_type` is set
to `zen`.
[Default: ['127.0.0.1']]
type: list
- elasticsearch_data_dir
Location of the Elasticsearch data folder. Use this to place
the Elasticsearch indices onto dedicated storage volumes. For
example, on AWS you can use this to place the data on the
ephemeral but very fast NVMe drives if you use an m5d, r5d, or
c5d instance type. Note that you cannot change this after the
initial deployment. If you have multiple volumes, you can gain
extra performance by letting Elasticsearch span all of the
volumes; in this case use `elasticsearch_data_dirs`.
[Default: /var/lib/elasticsearch]
type: str
- elasticsearch_data_dirs
Location of the elasticsearch data folders. See
`elasticsearch_data_dir` for details. If both
`elasticsearch_data_dir` and `elasticsearch_data_dirs` are
set, the latter is used.
[Default: ['{{ elasticsearch_data_dir }}']]
type: list
- elasticsearch_discovery_type
Set the elasticsearch discovery mode.
(Choices: single-node, zen)[Default: zen]
type: str
- elasticsearch_heap_size
Set the RAM that Elasticsearch allocates, in gigabytes. If not
set, Ansible detects the total RAM available and tries to
find a good setting. On a dedicated machine, use no more than
50% of the available RAM and never more than 32 gigabytes. On
a machine with both clusternode and storagenode, giving 25% of
RAM to Elasticsearch is a good starting point. In general,
less is more: a machine starved of RAM will not perform
well.
[Default: None]
type: int
- elasticsearch_install_aws_plugins
Install the AWS elasticsearch plugins. Note that this is an
experimental feature and will need additional work. For example
this can currently fail upgrades and will not work in offline
mode.
[Default: False]
type: bool
- elasticsearch_install_azure_plugins
Install the Azure elasticsearch plugins. Note that this is an
experimental feature and will need additional work. For example
this can currently fail upgrades and will not work in offline
mode.
[Default: False]
type: bool
- elasticsearch_install_google_plugins
Install the Google elasticsearch plugins. Note that this is an
experimental feature and will need additional work. For example
this can currently fail upgrades and will not work in offline
mode.
[Default: False]
type: bool
- elasticsearch_memory_lock
Whether Elasticsearch locks its memory to prevent it from
being swapped out. This should always be set to True except in
test / dev environments with extremely low RAM. Setting this to
False will have a severe performance impact.
[Default: True]
type: bool
- elasticsearch_network_interface
Only use a given network interface, e.g. eth0 or eth1.
[Default: None]
type: str
- elasticsearch_network_ip_protocol
Whether Elasticsearch listens on IPv4 or IPv6.
(Choices: ipv4, ipv6)[Default: ipv4]
type: str
- elasticsearch_replica_count
Number of Elasticsearch index shard replicas. In multi-node
deployments, set this to at least 1 to ensure fault tolerance.
[Default: 9]
type: int
- elasticsearch_shards_number
Number of Elasticsearch index shards for new indices.
[Default: 6]
type: int
- elasticsearch_squirro_cluster_name
Name of the elasticsearch cluster. Set a unique string if you
intend to run a multi-node setup.
[Default: squirro-cluster-{{ ansible_hostname | to_uuid }}]
type: str
- frontent_flask_secret_key
If set, the Flask secret key for the Squirro session is set to
this value. Once set, if this value is later unset, the previous
value remains in the frontend.ini file.
[Default: None]
type: str
- mode
Whether an installation or an upgrade is performed. This
option is marked for deprecation.
(Choices: install, upgrade)[Default: install]
type: str
- mysql_configuration_password
Password used for the configuration service sql connection.
[Default: {{ mysql_shared_service_password }}]
type: str
- mysql_data_dir
Path where the local mysql data folder needs to be placed.
This only works if set before the MariaDB server is installed.
[Default: /var/lib/mysql]
type: str
- mysql_datasource_password
Password used for the datasource service sql connection.
[Default: {{ mysql_shared_service_password }}]
type: str
- mysql_emailsender_password
Password used for the emailsender service sql connection.
[Default: {{ mysql_shared_service_password }}]
type: str
- mysql_filtering_password
Password used for the filtering service sql connection.
[Default: {{ mysql_shared_service_password }}]
type: str
- mysql_fingerprint_password
Password used for the fingerprint service sql connection.
[Default: {{ mysql_shared_service_password }}]
type: str
- mysql_host
Hostname of the remote MySQL/MariaDB server. Example:
`yourproject-db.abcdefghij.eu-central-1.rds.amazonaws.com`.
Only relevant if `remote_mysql_server` is set to True
[Default: None]
type: str
- mysql_machinelearning_password
Password used for the machinelearning service sql connection.
[Default: {{ mysql_shared_service_password }}]
type: str
- mysql_plumber_password
Password used for the plumber service sql connection.
[Default: {{ mysql_shared_service_password }}]
type: str
- mysql_port
TCP port of the remote MySQL/MariaDB server. Only relevant if
`remote_mysql_server` is set to True
[Default: 3306]
type: int
- mysql_root_password
Password of the remote MySQL/MariaDB server root user. Only
relevant if `remote_mysql_server` is set to True.
[Default: None]
type: str
- mysql_root_user
Username of the remote MySQL/MariaDB server root user. Only
relevant if `remote_mysql_server` is set to True.
[Default: None]
type: str
- mysql_scheduler_password
Password used for the scheduler service sql connection.
[Default: {{ mysql_shared_service_password }}]
type: str
- mysql_shared_service_password
Default password to use for the various Squirro services. Note
that options exist to set a unique password for each
service.
[Default: squirro/4u]
type: str
- mysql_shared_service_user_hostname
Username hostname string. This is required for some
MySQL/MariaDB deployments, e.g. on Microsoft Azure. See also
the Azure example playbook.
[Default: None]
type: str
- mysql_ssl
Use SSL for the MySQL/MariaDB connection.
[Default: False]
type: bool
- mysql_ssl_ca_certs
Path to the CA certificate file used for the MySQL/MariaDB
connections.
[Default: <python site-packages path>/certifi/cacert.pem]
type: str
- mysql_topic_password
Password used for the topic service sql connection.
[Default: {{ mysql_shared_service_password }}]
type: str
- mysql_trends_password
Password used for the trends service sql connection.
[Default: {{ mysql_shared_service_password }}]
type: str
- mysql_user_password
Password used for the user service sql connection.
[Default: {{ mysql_shared_service_password }}]
type: str
- nginx_primary_http_port
If set, the primary nginx http port tcp 80 will be changed to
this port
[Default: None]
type: int
- postgres_configuration_password
Password used for the configuration service sql connection.
[Default: {{ postgres_shared_service_password }}]
type: str
- postgres_datasource_password
Password used for the datasource service sql connection.
[Default: {{ postgres_shared_service_password }}]
type: str
- postgres_emailsender_password
Password used for the emailsender service sql connection.
[Default: {{ postgres_shared_service_password }}]
type: str
- postgres_filtering_password
Password used for the filtering service sql connection.
[Default: {{ postgres_shared_service_password }}]
type: str
- postgres_fingerprint_password
Password used for the fingerprint service sql connection.
[Default: {{ postgres_shared_service_password }}]
type: str
- postgres_host
Hostname of the remote PostgreSQL server. Example:
`yourproject-db.abcdefghij.eu-central-1.rds.amazonaws.com`.
Only relevant if `remote_postgres_server` is set to True
[Default: None]
type: str
- postgres_machinelearning_password
Password used for the machinelearning service sql connection.
[Default: {{ postgres_shared_service_password }}]
type: str
- postgres_plumber_password
Password used for the plumber service sql connection.
[Default: {{ postgres_shared_service_password }}]
type: str
- postgres_port
TCP port of the remote PostgreSQL server. Only relevant if
`remote_postgres_server` is set to True
[Default: 5432]
type: int
- postgres_scheduler_password
Password used for the scheduler service sql connection.
[Default: {{ postgres_shared_service_password }}]
type: str
- postgres_shared_service_password
Default password to use for the various Squirro services. Note
that options exist to set a unique password for each
service.
[Default: squirro/4u]
type: str
- postgres_topic_password
Password used for the topic service sql connection.
[Default: {{ postgres_shared_service_password }}]
type: str
- postgres_trends_password
Password used for the trends service sql connection.
[Default: {{ postgres_shared_service_password }}]
type: str
- postgres_user_password
Password used for the user service sql connection.
[Default: {{ postgres_shared_service_password }}]
type: str
- redis_cache_host
Hostname of the remote Redis server for caching. Example:
`master.storage-abcdefgh.euc1.cache.amazonaws.com`. Only
relevant if `remote_redis_server` is set to True.
[Default: None]
type: str
- redis_cache_password
Password of the remote Redis server for caching. Only relevant
if `remote_redis_server` is set to True.
[Default: None]
type: str
- redis_cache_port
TCP port of the remote Redis server for caching. Only relevant
if `remote_redis_server` is set to True.
[Default: None]
type: int
- redis_data_dir
Path where the local redis data folder needs to be placed.
This only works if set before the Redis servers are installed.
[Default: /var/lib/redis]
type: str
- redis_ssl
Use SSL/TLS for the Redis connections
[Default: False]
type: bool
- redis_ssl_ca_certs
Path to the CA certificate file used for the Redis
connections.
[Default: <python site-packages path>/certifi/cacert.pem]
type: str
- redis_ssl_verify
Set to False if the Redis server SSL/TLS certificate is self
signed and/or otherwise untrusted or invalid.
[Default: True]
type: bool
- redis_storage_host
Hostname of the remote Redis server for persistent storage.
Example: `master.storage-abcdefgh.euc1.cache.amazonaws.com`.
Only relevant if `remote_redis_server` is set to True.
[Default: None]
type: str
- redis_storage_password
Password of the remote Redis server for persistent storage.
Only relevant if `remote_redis_server` is set to True.
[Default: None]
type: str
- redis_storage_port
TCP port of the remote Redis server for persistent storage.
Only relevant if `remote_redis_server` is set to True.
[Default: None]
type: int
- remote_filesystem
Set to True if the various Squirro assets must be placed on a
remote/shared filesystem. Examples are NFS, EFS, SMB. This
must be set before the first run, and cannot be changed later.
[Default: False]
type: bool
- remote_filesystem_path
Path to the already mounted remote filesystem. Only relevant if
`remote_filesystem` is set to True. This must be set before the
first run, and cannot be changed later. Example:
`/mnt/efs/squirro`
[Default: None]
type: str
- remote_mysql_server
Set to True if you want to leverage a remote MySQL/MariaDB
server. e.g. via AWS RDS.
[Default: False]
type: bool
- remote_postgres_server
Set to True if you want to leverage a remote PostgreSQL
server. e.g. via AWS RDS.
[Default: False]
type: bool
- remote_redis_server
Set to True if you want to leverage remote Redis servers, e.g.
via AWS ElastiCache.
[Default: False]
type: bool
- service_endpoint_baseurl
Can be set to a different service endpoint protocol, hostname
and port. This is only of relevance if a custom nginx server
needs to be used.
[Default: http://localhost:81]
type: str
- squirro_channel
Which Squirro release channel from the mirror to use for the
installation. Only applies when squirro_install_mode is set to
online.
(Choices: stable, testing, unstable)[Default: stable]
type: str
- squirro_clusternode
If set to `True` the squirro-clusternode package is installed
[Default: False]
type: bool
- squirro_install_mode
Where Ansible takes the packages from. online: Packages
are retrieved from mirror.squirro.net. offline: Packages are
installed from a tar.gz file that can be downloaded from
https://mirror.squirro.net. filesystem: Same as offline, but
the tar.gz file is already extracted. custom_url: The yum
repo is hosted on a custom http or https URL.
(Choices: online, offline, filesystem, custom_url)[Default:
online]
type: str
- squirro_packages_tar
Location of an installation tar.gz file downloaded from
https://mirror.squirro.net. Only relevant if
`squirro_install_mode` is set to `offline`
[Default: None]
type: str
- squirro_service_group
This parameter is related to 'squirro_service_user'. See there
for further details, all squirro related files will get this
gid, when squirro_service_user is set to a custom value.
[Default: None]
type: str
- squirro_service_user
Set the Linux uid that is used to run all the Squirro Python
services. If not set (default), each service runs with its
own dedicated user (sqfront, sqtopic, etc.) as provided by
the Squirro packages. If set to any other value, all services
run as this specific user. The user and its group (see
squirro_service_group) must already exist. This action cannot
be reverted, as information about detailed file ownership is
lost in the process. This is only recommended in scenarios
where updates are not run in place, but the instances or
containers are discarded frequently (cloud instances, Docker,
K8s, etc.).
[Default: None]
type: str
- squirro_storagenode
If set to `True` the squirro-storagenode package is installed
[Default: False]
type: bool
- squirro_version
Which Squirro version to install. Next to specific version
numbers you can also use the strings `latest` (for the latest
bi-weekly release) or `x.y-lts` (e.g., `3.4-lts` for the
latest LTS release in the 3.4 series).
[Default: latest]
type: str
- yum_password
Password for https://mirror.squirro.net. Only relevant if
`squirro_install_mode` is set to `online`. Reach out to
Squirro Support if you don't have this information.
[Default: None]
type: str
- yum_repo_folder
Path to the extracted contents of the offline installer tar.gz
file downloaded from mirror.squirro.net. Only relevant if
`squirro_install_mode` is set to `filesystem`.
[Default: None]
type: str
- yum_repo_url
Full URL to a webserver hosting a full Squirro release. Only
relevant if `squirro_install_mode` is set to `custom_url`.
[Default: None]
type: str
- yum_user
Username for https://mirror.squirro.net. Only relevant if
`squirro_install_mode` is set to `online`. Reach out to
Squirro Support if you don't have this information.
[Default: None]
type: str
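To illustrate how several of these options combine, the fragment below sketches the Elasticsearch-related part of a vars section for a hypothetical node that runs both clusternode and storagenode, has 64 GB of RAM, and has two dedicated data volumes. The IP addresses, mount points, and sizing are assumptions chosen for the example. The heap size applies the 25% starting point from the elasticsearch_heap_size description (25% of 64 GB = 16 GB).

```yaml
# Sketch only: IP addresses, mount points, and sizing are hypothetical.
vars:
  elasticsearch_discovery_type: zen
  elasticsearch_cluster_nodes: ['10.1.0.2', '10.1.0.3', '10.1.0.4']
  elasticsearch_network_interface: eth0
  # Span both volumes; when set, this is used instead of elasticsearch_data_dir.
  elasticsearch_data_dirs:
    - /mnt/nvme0/elasticsearch
    - /mnt/nvme1/elasticsearch
  # 25% of 64 GB on a node that runs clusternode and storagenode together.
  elasticsearch_heap_size: 16
  elasticsearch_memory_lock: True
  # At least one replica for fault tolerance in a multi-node deployment.
  elasticsearch_replica_count: 1
```

Because the data directory cannot be changed after the initial deployment, the volume layout should be decided before the first run.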