Unshorten Link

The unshorten link pipeline step resolves each item's link and expands it to its long version, so that shortened URLs are indexed in their full form. This helps Duplicate Detection, which by default relies on a combination of title and link.

Enrichment name: unshorten-link

Stage: deduplication

Overview

During the unshorten-link step, the link field of each item is expanded by resolving any HTTP redirects. This ensures that shortened URLs, for example those in Twitter posts, are expanded to their long version.
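To make the idea of resolving redirects concrete, here is a minimal Python sketch. It is not Squirro's implementation: the function names `http_location` and `unshorten`, the hop limit, and the use of HEAD requests are all assumptions chosen for illustration. It follows `Location` headers one hop at a time until no further redirect is reported.

```python
from typing import Callable, Optional
import urllib.error
import urllib.request
from urllib.parse import urljoin


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so each hop can be inspected."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # the redirect then surfaces as an HTTPError


def http_location(url: str, timeout: float = 5.0) -> Optional[str]:
    """Return the redirect target of `url`, or None if there is none.

    Hypothetical helper: issues a single HEAD request over the network.
    """
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout):
            return None  # successful response, so no redirect
    except urllib.error.HTTPError as err:
        if err.code in (301, 302, 303, 307, 308):
            return err.headers.get("Location")
        return None
    except OSError:
        return None  # network failure: keep the link as it is


def unshorten(
    url: str,
    get_location: Callable[[str], Optional[str]] = http_location,
    max_hops: int = 10,
) -> str:
    """Follow redirects from `url` and return the final URL.

    Stops at `max_hops` (an assumed safety limit) or on a redirect loop.
    """
    seen = {url}
    current = url
    for _ in range(max_hops):
        target = get_location(current)
        if not target:
            break
        next_url = urljoin(current, target)  # Location may be relative
        if next_url in seen:
            break
        seen.add(next_url)
        current = next_url
    return current
```

Passing the lookup function as a parameter keeps the redirect-following logic testable without network access, for example by supplying the `get` method of a dictionary that maps short URLs to their targets.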

Because this step has to request each link over the network, it adds delays to pipeline processing. If your data source does not contain shortened URLs, you can disable this step using the processing config.

Configuration

The only configuration option for this enrichment is the enabled property, which turns the step on or off.
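As a rough illustration, a processing config that disables this step might look like the Python sketch below. The enrichment name `unshorten-link` and the `enabled` property come from this page; the surrounding dictionary shape and the helper name `processing_config` are assumptions, so check the processing config documentation of your Squirro version before relying on them.

```python
import json


def processing_config(unshorten_enabled: bool) -> dict:
    """Build a processing config fragment for the unshorten-link step.

    Hypothetical helper: the nesting (enrichment name -> "enabled")
    is an assumed shape, not a confirmed Squirro schema.
    """
    return {"unshorten-link": {"enabled": unshorten_enabled}}


# Disable the step for a data source that has no shortened URLs.
config = processing_config(False)
print(json.dumps(config, indent=2))
```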

By Squirro AG
© Copyright 2023, Squirro AG.
Last updated on Jan 14, 2023.