File Upload#

Upload documents directly to Squirro Chat for ad hoc analysis without requiring permanent indexing. This feature lets you analyze documents on demand, extract relevant information, and receive real-time answers from files during conversations.

Overview#

The file upload feature allows you to:

  • Upload documents directly to chat conversations for immediate analysis.

  • Ask questions about uploaded content without indexing the files permanently.

  • Analyze multiple files simultaneously in a single conversation.

Files are stored temporarily and associated with the specific conversation, making this ideal for quick document reviews, contract analysis, research tasks, and other use cases where permanent indexing is not needed.

Squirro Chat supports the file formats listed below. While the underlying processing engine may handle additional formats, only the listed formats are officially supported and guaranteed to deliver consistent results.

Supported File Formats#

Starting with version 3.15.2, Squirro Chat supports the following file formats for upload and analysis. Earlier versions support PDF files only.

Extensions            Format
--------------------  -------------------------------------------------
DOC, DOCX, PPT, PPTX  Microsoft Office formats (Word, PowerPoint)
PDF                   Portable Document Format
TXT                   Plain text files
CSV                   Comma-separated values for tabular data
JSON                  JavaScript Object Notation for structured data
XML                   Extensible Markup Language for hierarchical data
MD                    Markdown formatted text files
HTML                  Web pages
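If you want to check a file name against the supported formats before uploading, a client-side check could look like the following minimal sketch. The `SUPPORTED_EXTENSIONS` set and the `is_supported` helper are hypothetical names used for illustration; they are not part of Squirro, and the check only looks at the file extension, not at the file content.

```python
from pathlib import Path

# Extensions taken from the supported formats table (version 3.15.2 and later).
# Hypothetical constant for illustration; not a Squirro API.
SUPPORTED_EXTENSIONS = {
    "doc", "docx", "ppt", "pptx", "pdf", "txt", "csv", "json", "xml", "md", "html",
}


def is_supported(filename: str) -> bool:
    """Return True if the file name ends in an officially supported extension."""
    extension = Path(filename).suffix.lower().lstrip(".")
    return extension in SUPPORTED_EXTENSIONS
```

For example, `is_supported("contract.PDF")` is true because the comparison ignores case, while a file without an extension is rejected.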

How to Use#

  1. Open Squirro Chat in your project.

  2. Click the file upload button or drag and drop files into the chat interface.

  3. Wait for upload confirmation.

  4. Start asking questions.

Processing Strategies#

Squirro Chat uses three intelligent strategies to process uploaded files, automatically selecting the most appropriate approach based on your query.

Sequential Reading Strategy#

Best for reading through documents systematically, browsing page by page, or when you need to understand the full context.

How it works:

  • Returns content in document order (page by page).

  • Supports pagination with automatic continuation.

  • Shows progress (for example, “Showing chunks 1-10 of 50”).

  • Lets the agent continue reading with subsequent queries.
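The pagination behavior described above can be pictured with a minimal sketch. This is not Squirro's implementation: the `read_sequential` function, its `page_size` parameter, and the shape of the returned dictionary are assumptions made for illustration only.

```python
def read_sequential(chunks: list[str], start: int = 0, page_size: int = 10) -> dict:
    """Return one page of chunks in document order, plus progress information."""
    total = len(chunks)
    start = max(0, min(start, total))  # clamp the offset into the valid range
    end = min(start + page_size, total)
    page = chunks[start:end]
    if page:
        progress = f"Showing chunks {start + 1}-{end} of {total}"
    else:
        progress = f"No more chunks (total: {total})"
    return {
        "chunks": page,
        "progress": progress,
        # Offset for the next call, or None when the document is exhausted.
        "next_start": end if end < total else None,
    }
```

A caller would keep passing `next_start` back in as `start` until it becomes `None`, which mirrors how the agent continues reading across subsequent queries.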

Search Strategy#

Best for finding specific information, locating keywords, or answering targeted questions.

How it works:

  • Searches document content using keyword and semantic matching.

  • Returns the 15 most relevant chunks with relevance scores.

  • Includes surrounding context chunks for better understanding.

  • Highlights page numbers for easy reference.
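To illustrate how top results and surrounding context chunks might combine, here is a minimal sketch. It assumes that each chunk already has a relevance score and that the context setting applies to each side of a hit; both are assumptions, and `select_with_context` is a hypothetical helper, not a Squirro function.

```python
def select_with_context(scores: list[float], top_k: int = 15, context: int = 4) -> list[int]:
    """Pick the top_k highest-scoring chunk indices and add their neighbors.

    Returns the selected chunk indices in document order, without duplicates.
    """
    # Rank chunk indices by score, highest first, and keep the best top_k.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    selected: set[int] = set()
    for index in ranked:
        first = max(0, index - context)
        last = min(len(scores) - 1, index + context)
        selected.update(range(first, last + 1))
    return sorted(selected)
```

Overlapping context windows are merged by the set, so a chunk that neighbors two hits is returned only once.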

Full Document Strategy#

Best for small documents, complete summaries, or when you need the entire content.

How it works:

  • Extracts and returns complete document text.

  • No chunking or pagination.

  • Processes entire file at once.

  • Includes page markers for PDF documents.

Configuration#

For Administrators#

The file upload tool can be configured at the agent level with the following parameters.

Parameter                Default     Description
-----------------------  ----------  -------------------------------------------------------
default_search_strategy  sequential  Default strategy: search, sequential, or full.
max_chunks               100         Maximum chunks returned for sequential strategy.
include_metadata         False       Show chunk metadata (page numbers, scores, indices).
chunk_context            4           Number of surrounding chunks for search results.
max_tokens_per_file      50000       Maximum tokens per file (approximate limit).
use_fallback             True        Enable fallback to PyMuPDF when chunks are unavailable.

To configure these settings:

  1. Navigate to Project Settings > Agents.

  2. Select the agent that uses file upload.

  3. Expand the File Upload tool configuration.

  4. Adjust the parameters.

  5. Save changes.
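For reference when reviewing or scripting configuration changes, the documented parameters and their defaults can be mirrored in a small data structure. The `FileUploadToolConfig` class below is a hypothetical sketch and does not reflect how Squirro stores the settings internally; only the parameter names, defaults, and allowed strategy values come from the table above.

```python
from dataclasses import dataclass

# Allowed values for default_search_strategy, as listed in the parameter table.
VALID_STRATEGIES = {"search", "sequential", "full"}


@dataclass
class FileUploadToolConfig:
    """Hypothetical container for the documented parameters and their defaults."""

    default_search_strategy: str = "sequential"
    max_chunks: int = 100
    include_metadata: bool = False
    chunk_context: int = 4
    max_tokens_per_file: int = 50000
    use_fallback: bool = True

    def __post_init__(self) -> None:
        # Reject strategy names that the documentation does not list.
        if self.default_search_strategy not in VALID_STRATEGIES:
            raise ValueError(
                f"Unknown strategy: {self.default_search_strategy!r}"
            )
```

Creating `FileUploadToolConfig(default_search_strategy="search", chunk_context=2)` overrides two values and keeps the remaining defaults.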

File Size Limit#

The file upload size limit controls the maximum size of files that can be uploaded through the Squirro web interface, including Squirro Chat and data source configurations.

For information about how administrators can configure this limit, see the File Upload Size Limits page.

For Developers#

Custom tools can integrate with the file upload feature using the attachments placeholder.

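# Field is assumed to be pydantic's Field. deploy_as_agent_tool and ToolBase
# come from the Squirro tooling library; their import path is not shown here.
from pydantic import Field


@deploy_as_agent_tool(
    "custom_file_tool",
    # Declare the attachments and conversation_id placeholders used by the tool.
    define_placeholders={
        "attachments": "attachments",
        "conversation_id": "conversation_id",
    },
)
class CustomFileToolFactory(ToolBase):
    # Files uploaded to the conversation, one dictionary per file.
    attachments: list[dict] | None = Field(default_factory=list)
    # Identifier of the conversation that the files belong to.
    conversation_id: str = "default"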

Files are passed as a list of dictionaries containing file_id, filename, and content_type.
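As an illustration of working with that structure, the following sketch formats each attachment for logging or display. The `describe_attachments` helper and the sample values are hypothetical; only the three dictionary keys come from the description above.

```python
from typing import Optional


def describe_attachments(attachments: Optional[list[dict]]) -> list[str]:
    """Build one human-readable line per uploaded file."""
    lines = []
    # Treat a missing attachments value the same as an empty list.
    for item in attachments or []:
        lines.append(
            f"{item['filename']} ({item['content_type']}, id={item['file_id']})"
        )
    return lines


# Made-up sample data using the documented keys.
sample = [
    {"file_id": "abc123", "filename": "contract.pdf", "content_type": "application/pdf"},
]
print(describe_attachments(sample))
```

Handling `None` explicitly matters because the `attachments` field in the tool definition is optional.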

Considerations#

File Storage#

  • Temporary storage only

    The uploaded files are not permanently indexed in Squirro.

  • Conversation-specific

    The uploaded files are associated with the conversation and not shared across chats.

  • Session-based

    The uploaded files remain available during the conversation session.

File Size and Processing#

  • Token limits apply

    Uploaded files are truncated if they exceed the configured token limit. The default limit is 50,000 tokens, which corresponds to approximately 200,000 characters.

  • Context window dependency

    The maximum document size that can be effectively processed depends on the context window supported by the attached LLM.

  • Processing time

    Large files may take longer to chunk and process.
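The token limit above implies a ratio of roughly 4 characters per token (200,000 characters for 50,000 tokens). The sketch below uses that ratio to estimate whether a text would be truncated. It is an approximation for illustration only: the helper names are hypothetical, and the actual tokenizer and truncation logic may count differently.

```python
# Ratio implied by the documented default: 200,000 characters / 50,000 tokens.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Roughly estimate the token count from the character count (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def truncate_to_token_limit(text: str, max_tokens: int = 50000) -> str:
    """Cut the text at the approximate character budget for max_tokens."""
    return text[: max_tokens * CHARS_PER_TOKEN]
```

With the default limit, a 300,000-character file would be cut to its first 200,000 characters, while shorter files pass through unchanged.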

Access and Permissions#

  • Project-level permissions

    Every user with read access or above can use the file upload feature.

  • User-specific access

    The uploaded files are accessible only via the user’s authentication token.

  • No sharing

    The uploaded files cannot be shared with other users or conversations.

Example Uses#

Contract Analysis#

Quickly review contract terms without permanent storage.

User: [Uploads contract.pdf]
User: What are the payment terms in this contract?

Chat: [Searches for "payment terms"]
Chat: According to Section 3.2 on page 5, the payment terms are...

Data Analysis#

Analyze structured data files for insights.

User: [Uploads sales-data.csv]
User: What is the total revenue for Q4?

Chat: [Processes CSV content]
Chat: Based on the data, Q4 revenue totals $2.4M across all regions...

Multi-File Comparison#

Compare information across multiple documents.

User: [Uploads proposal-v1.pdf and proposal-v2.pdf]
User: What are the differences in pricing between these two proposals?

Chat: [Searches both files for "pricing"]
Chat: In proposal-v1.pdf, the total is $50,000 (page 3), while
       proposal-v2.pdf shows $47,500 (page 4)...