Thumbnail Extraction

The Thumbnail Extraction pipeline step finds a thumbnail image to represent the Squirro item.

Enrichment name: Thumbnail Extraction, internally referred to as “webshot”




There are two ways that Squirro can find the right thumbnail for the item:

  • If the webshot_picture_hint field points to a valid image URL, that image is used as the thumbnail.

  • Otherwise, the website is downloaded and analyzed to find the most prominent image.
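The two-step lookup described above can be sketched in Python. This is only an illustration, not Squirro's implementation: the function names are hypothetical, the item is assumed to be a plain dict with webshot_picture_hint as a top-level key, the page HTML is passed in rather than downloaded, and "most prominent" is approximated by the largest declared width × height, which is an assumption made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".webp")


def looks_like_image_url(url):
    """Cheap syntactic check only; a real check would also fetch the resource."""
    if not url:
        return False
    parsed = urlparse(url)
    return (parsed.scheme in ("http", "https")
            and parsed.path.lower().endswith(IMAGE_EXTENSIONS))


class _ImageCollector(HTMLParser):
    """Collects (declared area, src) for every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        src = attr.get("src")
        if not src:
            return
        try:
            area = int(attr.get("width", 0)) * int(attr.get("height", 0))
        except (TypeError, ValueError):
            area = 0  # missing or non-numeric dimensions
        self.images.append((area, src))


def choose_thumbnail(item, page_url, page_html):
    """Return a thumbnail URL for the item, or None if nothing suitable is found."""
    # Step 1: prefer the explicit hint when it looks like an image URL.
    hint = item.get("webshot_picture_hint")
    if looks_like_image_url(hint):
        return hint
    # Step 2: fall back to analyzing the page (stand-in heuristic: largest area).
    collector = _ImageCollector()
    collector.feed(page_html)
    if not collector.images:
        return None
    _, src = max(collector.images, key=lambda pair: pair[0])
    return urljoin(page_url, src)
```

For example, `choose_thumbnail({}, "https://example.com/post/", html)` resolves a relative `src` against the page URL, while an item carrying a usable hint skips the page analysis altogether.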



Thumbnail extraction relies on an Amazon Web Services S3 configuration to store images for thumbnails and to retrieve thumbnails for display. Configure the following files:

Configuration File




thumbler = //

thumb = <salt_1>



access_key = <key_1>
secret_key = <key_2>
s3_bucket =
s3_base_url =

use_thumbler = True
thumbler_config = thumb
thumbler_bucket = webshot
thumbler_salt = <salt_1>

Then restart the sqwebshotd service.



is_s3 = True
access_key = <key_1>
secret_key = <key_2>
s3_bucket =

operation = scale
salt = <salt_1>

Then restart the sqthumblerd service.
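The two listings above reuse the same placeholders (<key_1>, <key_2>, <salt_1>), which suggests these values are meant to match between the webshot and thumbler configuration. The following Python sketch checks for mismatches before the services are restarted. It rests on assumptions: the section names `webshot` and `thumbler` are hypothetical stand-ins, since the real section and file names are not shown here, and the list of keys that must agree is inferred from the shared placeholders rather than confirmed.

```python
import configparser

# Hypothetical section names; adjust to the real configuration files.
WEBSHOT_SECTION = "webshot"
THUMBLER_SECTION = "thumbler"

# (webshot key, thumbler key) pairs that share a placeholder in the listings.
SHARED_KEYS = [
    ("access_key", "access_key"),
    ("secret_key", "secret_key"),
    ("s3_bucket", "s3_bucket"),
    ("thumbler_salt", "salt"),
]


def find_mismatches(webshot_ini, thumbler_ini):
    """Return a list of 'webshot_key != thumbler_key' strings for differing values."""
    # interpolation=None so that '%' characters in secrets are read literally.
    webshot = configparser.ConfigParser(interpolation=None)
    webshot.read_string(webshot_ini)
    thumbler = configparser.ConfigParser(interpolation=None)
    thumbler.read_string(thumbler_ini)

    mismatches = []
    for w_key, t_key in SHARED_KEYS:
        w_val = webshot.get(WEBSHOT_SECTION, w_key, fallback=None)
        t_val = thumbler.get(THUMBLER_SECTION, t_key, fallback=None)
        if w_val != t_val:
            mismatches.append(f"{w_key} != {t_key}")
    return mismatches
```

An empty list means the compared values agree. To check real files, read their contents and pass the text to `find_mismatches`.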

URL and web server forwarding configuration

Example based on nginx: /etc/nginx/conf.d/thumbler.conf

upstream thumbler-testing {
    server ip-squirro-cluster-node:443;
}

server {
    listen 443 ssl;

    ssl_certificate <ssl_certificate_1>;
    ssl_certificate_key <ssl_key_1>;

    location / {
        proxy_pass https://thumbler-testing/service/thumbler/;
        proxy_set_header Host $host;
        proxy_set_header Connection Close;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_redirect    off;
        proxy_read_timeout 60;
    }

    # redirect server error pages to the static page /50x.html
    error_page   500 502 503 504  /50x.html;
    location = /50x.html {
        root   /usr/share/nginx/html;
    }
}

Then reload nginx, or whichever web server you are using.