Elasticsearch Migration#
Starting from Release 3.13.5, Squirro storage nodes use Elasticsearch 9.x.
Before upgrading Elasticsearch to the new major version, it’s crucial to check if existing indices will work in the new Elasticsearch version.
Elasticsearch 9.x can only read indices created in version 8.0 or later. This means that indices created in Elasticsearch 7.x or earlier are not supported.
You must reindex them with Elasticsearch 8.x before proceeding with the upgrade.
This page explains how to migrate existing indices to work with Elasticsearch 9.x.
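The compatibility rule above can be expressed as a small check. This is a minimal sketch, not the logic used by the provided helper scripts. It assumes that the major version can be derived from the numeric `index.version.created` index setting by integer division by 1,000,000 (for example, `7170099` for an index created in 7.17.0). The function names are hypothetical.

```python
def created_major(version_created):
    """Return the major version from an index.version.created value.

    Assumption: the ID carries the major version in its leading digits,
    for example 7170099 -> 7 and 8000099 -> 8.
    """
    return int(version_created) // 1_000_000


def needs_reindex(version_created, target_major):
    """Return True if the index cannot be read by the target major version.

    Rule from the text: Elasticsearch N reads indices created in N-1 or later.
    """
    return created_major(version_created) < target_major - 1


# An index created in 7.17 must be reindexed before moving to 9.x,
# while an index created in 8.0 can be read as-is.
print(needs_reindex("7170099", 9))
print(needs_reindex("8000099", 9))
```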
Overview#
To make your indices compatible with Elasticsearch 9.x, first perform these common initial steps:
1. Download the provided helper scripts.
2. Detect any incompatible indices.
After detecting incompatible indices, you can choose between two migration approaches:
Manual Migration (recommended for a small number of indices):
3. Manually migrate each incompatible index found.
4. Verify that the total number of documents in the reindexed indices matches the original indices.
5. Begin using the new indices.
6. Validate changes.
7. Delete old indices.
8. Perform a final compatibility verification.
Automated Migration Manager (recommended for a large number of indices):
3. Use the migration_manager.py script to automatically handle the entire migration process end-to-end.
Download Helper Scripts#
Squirro provides helper scripts to make the migration as smooth as possible.
Follow the steps below to download the scripts:
Download the package containing the scripts:
$ yum install squirro-elasticsearch-maintenance
Go to the scripts directory:
$ cd /opt/squirro/elasticsearch/maintenance
Note: The scripts must be executed inside a Squirro virtual environment.
$ squirro_activate
Tip
All scripts allow you to configure the Elasticsearch server URL, user, password, and certificates. To see the available options, add --help when calling a script.
Detect Incompatible Indices#
To detect whether you have any indices that are incompatible with Elasticsearch 9.x, use the detect_incompatible_indices.py script as shown below:
$ python detect_incompatible_indices.py --elastic-version 9
The script reports whether all indices are compatible with Elasticsearch 9.x and whether you can safely proceed with the Elasticsearch upgrade.
If all indices are compatible, the output will look like this:
All indices are compatible, you can proceed with the upgrade.
If there are indices that must be reindexed first, the output will look like this:
Found incompatible indices: ['squirro_v9_pudxlusdtiyxytp3xuoi9a', 'squirro_v9_fp']. You must reindex them with Elasticsearch 7.x or higher before proceeding with the upgrade.
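If you want to feed the detection result into your own tooling, the message can be parsed. The sketch below assumes that the script prints exactly the single-line formats shown above. The function name is hypothetical and not part of the helper scripts.

```python
import ast

INCOMPATIBLE_PREFIX = "Found incompatible indices: "


def parse_incompatible(output):
    """Return the incompatible index names found in the script output.

    Returns an empty list when no "Found incompatible indices" line exists.
    """
    for line in output.splitlines():
        if not line.startswith(INCOMPATIBLE_PREFIX):
            continue
        rest = line[len(INCOMPATIBLE_PREFIX):]
        # The names are printed as a Python list literal that ends at the first "]".
        literal, bracket, _ = rest.partition("]")
        if bracket:
            return list(ast.literal_eval(literal + "]"))
    return []


sample = (
    "Found incompatible indices: "
    "['squirro_v9_pudxlusdtiyxytp3xuoi9a', 'squirro_v9_fp']. "
    "You must reindex them before proceeding with the upgrade."
)
print(parse_incompatible(sample))
```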
Choose Your Migration Method#
After detecting incompatible indices, you need to choose the appropriate migration method:
Use Manual Migration if:
- You have a small number of indices to migrate (1-5 indices)
- You want full control over each step of the migration process
- You prefer to migrate indices one at a time
Use Automated Migration Manager if:
- You have many indices to migrate (5+ indices)
- You want an automated end-to-end migration process
- You want to minimize manual intervention and potential errors
The sections below provide detailed instructions for both approaches.
Before Starting Migration#
Regardless of which migration method you choose, you must stop Squirro services before beginning the migration process:
$ systemctl stop sqingesterd && systemctl stop sqmachinelearningd && systemctl stop sqfilteringd && systemctl stop sqfingerprintd
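The same stop sequence can be scripted, for example as part of a larger maintenance script. This is a minimal sketch that assumes it runs with sufficient privileges to call `systemctl`. The function names are hypothetical.

```python
import subprocess

# Services named in the command above, in the same order.
SERVICES = ["sqingesterd", "sqmachinelearningd", "sqfilteringd", "sqfingerprintd"]


def stop_commands(services):
    """Build one `systemctl stop` command per service."""
    return [["systemctl", "stop", name] for name in services]


def stop_all(services):
    """Stop the services in order and halt at the first failure, like `&&`."""
    for command in stop_commands(services):
        # check=True raises CalledProcessError if systemctl exits non-zero.
        subprocess.run(command, check=True)
```

Calling `stop_all(SERVICES)` runs the four commands one after another. Nothing is executed when the module is only imported.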
Prerequisites#
Sufficient Disk Space#
At a minimum, you must have spare disk space equal to the primary store size on the storage node.
Note: Documents marked for deletion are skipped during reindexing. As a result, if a source index contains many such documents, the new index requires less space than the source index.
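The disk space requirement can be checked before starting. The sketch below assumes that you have already fetched the Elasticsearch index stats response (`GET /_stats/store`), in which the primary store size is reported under `_all.primaries.store.size_in_bytes`. The function names are hypothetical, and the path to check depends on where your Elasticsearch data directory lives.

```python
import shutil


def primary_store_bytes(stats):
    """Read the total primary store size from an index stats response."""
    return stats["_all"]["primaries"]["store"]["size_in_bytes"]


def free_bytes_on(path):
    """Return the free bytes on the filesystem that contains path."""
    return shutil.disk_usage(path).free


def has_enough_space(required_bytes, free_bytes):
    """Return True if the free space is at least the primary store size."""
    return free_bytes >= required_bytes


# Example with a hand-written stats response; a real one comes from the cluster.
stats = {"_all": {"primaries": {"store": {"size_in_bytes": 5_000_000_000}}}}
print(has_enough_space(primary_store_bytes(stats), free_bytes_on(".")))
```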
Permission to Modify the Index Locator#
Verify that the topic.custom-locator option is set to true in the Configuration Service.
Automated Migration Manager (Recommended for Many Indices)#
If you have many indices to migrate, use the migration_manager.py script to automate the entire migration process.
Basic Usage#
To automatically detect and migrate all incompatible indices:
$ python migration_manager.py --detect-incompatible --elastic-version 9 --token <SQUIRRO_TOKEN>
To migrate specific indices:
$ python migration_manager.py --index-list squirro_v9_pudxlusdtiyxytp3xuoi9a squirro_v9_fp --elastic-version 9 --token <SQUIRRO_TOKEN>
Advanced Options#
The migration manager supports several additional options:
$ python migration_manager.py --detect-incompatible --elastic-version 9 --token <SQUIRRO_TOKEN> --dry-run --min-doc-percentage 95.0
Key options:
- --dry-run: Preview what would be migrated without making changes
- --min-doc-percentage: Set minimum document percentage for sanity checks (default: 90%)
- --cluster: Specify Squirro API cluster URL (default: http://localhost:80)
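To illustrate what a minimum document percentage means, the following sketch compares the document count of a reindexed index against the original. It is an assumption about how such a threshold is typically evaluated, not the actual implementation in migration_manager.py. The function names are hypothetical.

```python
def doc_percentage(original_count, target_count):
    """Return the share of original documents present in the target, in percent."""
    if original_count == 0:
        # Nothing could be lost from an empty index.
        return 100.0
    return 100.0 * target_count / original_count


def passes_sanity_check(original_count, target_count, min_doc_percentage=90.0):
    """Return True if the target holds at least the required share of documents."""
    return doc_percentage(original_count, target_count) >= min_doc_percentage


# With a threshold of 95.0, losing 5% of documents still passes, 6% does not.
print(passes_sanity_check(1000, 950, min_doc_percentage=95.0))
print(passes_sanity_check(1000, 940, min_doc_percentage=95.0))
```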
Elasticsearch Connection Options#
The migration manager also supports all Elasticsearch connection options:
$ python migration_manager.py --detect-incompatible --elastic-version 9 --token <SQUIRRO_TOKEN> --elastic-server <ELASTIC_URL> --elastic-username <USERNAME> --elastic-password <PASSWORD>
The migration manager will:
1. Detect incompatible indices (if using --detect-incompatible)
2. Reindex each incompatible index with a timestamped name
3. Perform sanity checks to verify document counts
4. Update project locators for project-specific indices
5. Clean up original indices automatically
6. Provide a detailed summary of all migration results
Note
The migration manager requires a Squirro API token for updating project locators. Project-specific indices will be automatically detected and their locators updated appropriately.
Manual Migration (For Few Indices)#
If you have only a few indices to migrate or prefer manual control, follow the manual migration process below. Repeat the process for each index.
Reindex#
Reference: Learn more about reindexing at Reindexing Elasticsearch.
To perform reindexing, follow the steps below:
Run the reindex.py script and provide the name of the index you want to reindex together with the target index name, as shown below:
$ python reindex.py --original-index <ORIGINAL_INDEX> --target-index <TARGET_INDEX>
For example, if you want to reindex the index called squirro_v9_pudxlusdtiyxytp3xuoi9a, run the following command:
$ python reindex.py --original-index squirro_v9_pudxlusdtiyxytp3xuoi9a --target-index squirro_v9_pudxlusdtiyxytp3xuoi9a-reindexed
Tip
You can reindex multiple indices in parallel by running the script in the background.
To accomplish this, add the & symbol at the end of the line.
The script saves logs and index statistics to files whose names start with reindex-, so you can review them to detect any issues.
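As an alternative to backgrounding each command with &, several reindex runs can be launched from one Python script. This minimal sketch uses only the reindex.py options shown above and assumes it runs from the scripts directory inside the Squirro virtual environment. The function names and the `-reindexed` naming choice are illustrative.

```python
import subprocess


def reindex_command(original_index, suffix="-reindexed"):
    """Build the reindex.py command line for one index."""
    return [
        "python", "reindex.py",
        "--original-index", original_index,
        "--target-index", original_index + suffix,
    ]


def reindex_in_parallel(indices):
    """Start one reindex.py process per index and wait for all of them.

    Returns a mapping of index name to the process exit code.
    """
    processes = [subprocess.Popen(reindex_command(name)) for name in indices]
    return {name: proc.wait() for name, proc in zip(indices, processes)}
```

For example, `reindex_in_parallel(["squirro_v9_pudxlusdtiyxytp3xuoi9a", "squirro_v9_fp"])` starts both runs at once. A non-zero exit code in the result indicates a run worth checking in the logs.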
Verify the Total Number of Documents#
To be sure that reindexing went well, you can verify the total number of documents in the reindexed index and compare it to the number of documents in the original one.
To do so, use the compare_docs.py script as shown below:
$ python compare_docs.py --original-index <ORIGINAL_INDEX> --target-index <TARGET_INDEX>
Note
During reindexing, documents with an invalid mapping are filtered out, so the number of documents may vary slightly without indicating a problem. Treat the comparison as a sanity check that catches obvious issues, for example an empty index or half of the documents missing.
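For a quick independent look at the numbers, document counts can also be read with the Elasticsearch `_count` API. The sketch below is not how compare_docs.py is implemented. It assumes that Elasticsearch is reachable without authentication at the `http://localhost:81/ext/elastic` endpoint used by the curl commands on this page, and the function names are hypothetical.

```python
import json
import urllib.request


def count_url(base_url, index):
    """Build the _count URL for an index."""
    return f"{base_url.rstrip('/')}/{index}/_count"


def parse_count(body):
    """Extract the document count from a _count response body."""
    return json.loads(body)["count"]


def fetch_count(base_url, index):
    """Fetch the number of documents in an index."""
    with urllib.request.urlopen(count_url(base_url, index)) as response:
        return parse_count(response.read())


def missing_documents(base_url, original_index, target_index):
    """Return how many documents the target has fewer than the original."""
    return fetch_count(base_url, original_index) - fetch_count(base_url, target_index)
```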
Utilize the New Index#
After reindexing is done, the output is a new index with a different name than the original one.
To tell Squirro to use this new index instead of the original one, you must perform different actions depending on the index type.
Project-Specific Index#
Tip
Project-specific indices can be recognized by their naming convention, which is squirro_v9_<RANDOM 22 CHARACTERS>.
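The naming convention can be checked programmatically to separate project-specific indices from fixed-name ones. This is a minimal sketch. The allowed character set (letters, digits, `_`, and `-`) is an assumption, since the text only says that the suffix consists of 22 random characters.

```python
import re

# Assumption: the 22 random characters are letters, digits, "_" or "-".
PROJECT_INDEX_PATTERN = re.compile(r"squirro_v9_[A-Za-z0-9_-]{22}")


def is_project_index(name):
    """Return True if the name follows the project-specific convention."""
    return PROJECT_INDEX_PATTERN.fullmatch(name) is not None


for name in ["squirro_v9_pudxlusdtiyxytp3xuoi9a", "squirro_v9_fp"]:
    print(name, is_project_index(name))
```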
Every Squirro project creates its own index and stores information about that index in the database.
This means that you must update the project locator in the database so that it points to the new index.
To do so, use the update_index_locator.py script as shown below:
$ python update_index_locator.py --original-index <ORIGINAL_INDEX> --target-index <TARGET_INDEX> --token <SQUIRRO_TOKEN>
In addition to changing the pointer in the database, the script also closes the original index.
The script saves logs to files whose names start with rewire-, so you can review them to detect any issues.
Other Indices#
Apart from project-specific indices, where you can replace the original index name with a reindexed one, Squirro also creates other indices that have strictly defined names and cannot be renamed.
Some such indices include:
squirro_v9_fp
squirro_v9_filter
In these situations, you cannot use a different index name. Instead, you must reindex back into an index with the same name as the original one.
For example, if you reindexed squirro_v9_fp and called the new index squirro_v9_fp-reindexed, you can reindex it back by executing the following command:
$ python reindex.py --original-index squirro_v9_fp-reindexed --target-index squirro_v9_fp
After reindexing back, you can then delete the redundant index using the following command:
$ curl -XDELETE http://localhost:81/ext/elastic/squirro_v9_fp-reindexed
After Migration Completion#
Once you’ve completed your migration (using either method), restart Squirro services and validate that everything is working correctly:
Restart Squirro services using the following command:
$ squirro_restart
Check the Squirro status using the following command:
$ squirro_status
If all services are healthy, you can now do manual validation by performing searches, clicking on item details, displaying labels, and manually clicking through your dashboards.
Delete Original Index (Manual Migration Only)#
Note
If you used the Automated Migration Manager, this step is not needed as the script automatically cleans up original indices for you.
If you performed Manual Migration and have validated that the changes are working properly, you can delete the original index.
To do so, use the following command:
$ curl -XDELETE http://localhost:81/ext/elastic/<ORIGINAL_INDEX>
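If you script the cleanup instead of using curl, it is worth guarding against accidentally matching more than one index. The sketch below sends the same DELETE request and refuses names containing wildcards or commas. It assumes the unauthenticated `http://localhost:81/ext/elastic` endpoint from the command above, and the function names are hypothetical.

```python
import urllib.request
from urllib.parse import quote


def delete_request(base_url, index):
    """Build a DELETE request for exactly one index."""
    if not index or "*" in index or "," in index or index == "_all":
        # Refuse patterns that could match several indices.
        raise ValueError(f"refusing to delete ambiguous index name: {index!r}")
    url = f"{base_url.rstrip('/')}/{quote(index, safe='')}"
    return urllib.request.Request(url, method="DELETE")


def delete_index(base_url, index):
    """Send the DELETE request and return the HTTP status code."""
    with urllib.request.urlopen(delete_request(base_url, index)) as response:
        return response.status
```

For example, `delete_index("http://localhost:81/ext/elastic", "squirro_v9_fp-reindexed")` corresponds to the curl command shown earlier for the redundant index.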
Note
If you reindexed a non-project-specific index (as described above), you’ve likely already deleted the original index in the previous steps.
Check Indices#
After you’ve migrated all incompatible indices, check once again that all indices work with Elasticsearch 9.x.
To do so, use the detect_incompatible_indices.py script as shown below:
$ python detect_incompatible_indices.py --elastic-version 9