File Upload#
Upload documents directly to Squirro Chat for ad hoc analysis without requiring permanent indexing. This feature lets you analyze documents on demand, extract relevant information, and receive real-time answers from files during conversations.
Overview#
The file upload feature allows you to:
Upload documents directly to chat conversations for immediate analysis.
Ask questions about uploaded content without indexing the files permanently.
Analyze multiple files simultaneously in a single conversation.
Files are stored temporarily and associated with the specific conversation, making this ideal for quick document reviews, contract analysis, research tasks, and other use cases where permanent indexing is not needed.
Squirro Chat supports the file formats listed below. While the underlying processing engine may handle additional formats, only the listed formats are officially supported and guaranteed to deliver consistent results.
Supported File Formats#
Starting from version 3.15.2, Squirro Chat supports the following file formats for upload and analysis. Earlier versions support PDF files only.
| Extensions | Format |
|---|---|
| DOC, DOCX, PPT, PPTX | Microsoft Office formats (Word, PowerPoint) |
| PDF | Portable Document Format |
| TXT | Plain text files |
| CSV | Comma-separated values for tabular data |
| JSON | JavaScript Object Notation for structured data |
| XML | Extensible Markup Language for hierarchical data |
| MD | Markdown formatted text files |
| HTML | Web pages |
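A client-side check against the listed extensions can catch unsupported files before upload. The following is a minimal sketch: `SUPPORTED_EXTENSIONS` and `is_supported` are hypothetical names for illustration, not part of the Squirro API, and the set simply mirrors the supported formats listed above.

```python
from pathlib import Path

# Hypothetical helper: mirrors the officially supported extensions listed above.
SUPPORTED_EXTENSIONS = {
    "doc", "docx", "ppt", "pptx", "pdf",
    "txt", "csv", "json", "xml", "md", "html",
}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is one of the supported formats."""
    # Path.suffix includes the leading dot (".pdf"); normalize case and strip it.
    suffix = Path(filename).suffix.lower().lstrip(".")
    return suffix in SUPPORTED_EXTENSIONS
```

The check is case-insensitive, so `contract.PDF` passes while a file without an extension is rejected.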
How to Use#
Open Squirro Chat in your project.
Click the file upload button or drag and drop files into the chat interface.
Wait for upload confirmation.
Start asking questions.
Processing Strategies#
Squirro Chat processes uploaded files with one of three strategies and automatically selects the most appropriate one based on your query.
Sequential Reading Strategy#
Best for reading through documents systematically, browsing page by page, or when you need to understand the full context.
How it works:
Returns content in document order (page by page).
Supports pagination with automatic continuation.
Shows progress (for example, “Showing chunks 1-10 of 50”).
The agent can continue reading with subsequent queries.
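The pagination behavior described above can be illustrated with a short sketch. It is not Squirro's implementation: `read_sequential`, the `offset` argument, and the default page size of 10 are assumptions chosen to match the example progress message.

```python
def read_sequential(chunks: list[str], offset: int = 0, max_chunks: int = 10) -> dict:
    """Return one page of chunks in document order plus a progress message.

    Hypothetical sketch: the function name, arguments, and return shape are
    illustrative assumptions, not the actual tool interface.
    """
    page = chunks[offset:offset + max_chunks]
    if not page:
        return {"chunks": [], "progress": "No more chunks", "next_offset": None}
    end = offset + len(page)
    return {
        "chunks": page,
        "progress": f"Showing chunks {offset + 1}-{end} of {len(chunks)}",
        # Offset to pass on the next call, or None once the document is exhausted.
        "next_offset": end if end < len(chunks) else None,
    }
```

A caller continues reading by passing the returned `next_offset` back in until it is `None`.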
Search Strategy#
Best for finding specific information, locating keywords, or answering targeted questions.
How it works:
Searches document content using keyword and semantic matching.
Returns the 15 most relevant chunks with relevance scores.
Includes surrounding context chunks for better understanding.
Highlights page numbers for easy reference.
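To make the ranking and context expansion concrete, here is a simplified sketch. It scores chunks by plain keyword overlap as a stand-in for the keyword and semantic matching mentioned above, since the actual scoring is not documented here. The function name, the `context` default of 1, and the result shape are all assumptions.

```python
def search_chunks(chunks: list[str], query: str, top_k: int = 15, context: int = 1) -> list[dict]:
    """Rank chunks by keyword overlap and attach neighboring chunks as context.

    Hypothetical sketch: keyword overlap replaces the real keyword + semantic
    scoring, and all names and defaults other than top_k=15 are assumptions.
    """
    terms = set(query.lower().split())
    scored = []
    for index, text in enumerate(chunks):
        words = text.lower().split()
        hits = sum(1 for word in words if word in terms)
        if hits:
            # Fraction of words in the chunk that match a query term.
            scored.append((hits / len(words), index))
    # Highest score first; ties are broken by document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))

    results = []
    for score, index in scored[:top_k]:
        start = max(0, index - context)
        stop = min(len(chunks), index + context + 1)
        results.append({"index": index, "score": score, "context": chunks[start:stop]})
    return results
```

Returning the neighbors together with each hit gives the model the sentences around a match, which is the purpose of the surrounding context chunks.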
Full Document Strategy#
Best for small documents, complete summaries, or when you need the entire content.
How it works:
Extracts and returns complete document text.
No chunking or pagination.
Processes the entire file at once.
Includes page markers for PDF documents.
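As a rough illustration of full-document output with page markers, the sketch below joins already-extracted page texts into one string. The `[Page N]` marker format and the `full_document_text` name are assumptions; the actual marker format is not specified here.

```python
def full_document_text(pages: list[str]) -> str:
    """Join all pages into a single string, each preceded by a page marker.

    Hypothetical sketch: the "[Page N]" marker format is an assumption.
    """
    parts = []
    for number, text in enumerate(pages, start=1):
        parts.append(f"[Page {number}]\n{text.strip()}")
    return "\n\n".join(parts)
```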
Configuration#
For Administrators#
The file upload tool can be configured at the agent level with the following parameters.
| Parameter | Default | Description |
|---|---|---|
|  |  | Default strategy: |
|  |  | Maximum chunks returned for sequential strategy. |
|  |  | Show chunk metadata (page numbers, scores, indices). |
|  |  | Number of surrounding chunks for search results. |
|  |  | Maximum tokens per file (approximate limit). |
|  |  | Enable fallback to PyMuPDF when chunks are unavailable. |
To configure these settings:
Navigate to Project Settings > Agents.
Select the agent that uses file upload.
Expand the File Upload tool configuration.
Adjust the parameters.
Save changes.
File Size Limit#
The file upload size limit controls the maximum size of files that can be uploaded through the Squirro web interface, including Squirro Chat and data source configurations.
For information about how administrators can configure this limit, see the File Upload Size Limits page.
For Developers#
Custom tools can integrate with the file upload feature using the attachments placeholder.
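# deploy_as_agent_tool and ToolBase are provided by the Squirro agent tooling;
# their import path is not shown here.
from pydantic import Field  # assumption: Field is pydantic's Field

@deploy_as_agent_tool(
    "custom_file_tool",
    define_placeholders={
        "attachments": "attachments",
        "conversation_id": "conversation_id",
    },
)
class CustomFileToolFactory(ToolBase):
    # Populated from the "attachments" placeholder; empty when nothing is uploaded.
    attachments: list[dict] | None = Field(default_factory=list)
    conversation_id: str = "default"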
Files are passed as a list of dictionaries containing file_id, filename, and content_type.
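A custom tool might read the attachment list as sketched below. Only the three keys named above (`file_id`, `filename`, `content_type`) are taken from this page; the helper name and the sample values in the usage note are hypothetical.

```python
from typing import Optional


def summarize_attachments(attachments: Optional[list[dict]]) -> list[str]:
    """Describe each attachment using the three documented keys.

    Hypothetical helper for illustration; it treats a missing list as empty.
    """
    lines = []
    for item in attachments or []:
        lines.append(
            f'{item["filename"]} ({item["content_type"]}, id={item["file_id"]})'
        )
    return lines
```

For example, an attachment with the made-up values `file_id="f1"`, `filename="contract.pdf"`, and `content_type="application/pdf"` would be summarized as `contract.pdf (application/pdf, id=f1)`.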
Considerations#
File Storage#
Temporary storage only
The uploaded files are not permanently indexed in Squirro.
Conversation-specific
The uploaded files are associated with the conversation and not shared across chats.
Session-based
The uploaded files remain available during the conversation session.
File Size and Processing#
Token limits apply
Uploaded files are truncated if they exceed the configured token limit. The default limit is 50,000 tokens, which corresponds to approximately 200,000 characters.
Context window dependency
The maximum document size that can be effectively processed depends on the context window of the attached LLM.
Processing time
Large files may take longer to chunk and process.
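The default token limit and its character equivalent imply a ratio of roughly four characters per token. The sketch below uses that ratio to estimate whether a file will be truncated. It is a character-based approximation, not the tokenizer Squirro uses, and the function names are hypothetical.

```python
import math

# Derived from the documented defaults: ~200,000 characters / 50,000 tokens.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate the token count from the character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def truncate_to_token_limit(text: str, max_tokens: int = 50_000) -> str:
    """Cut the text at the approximate character budget for max_tokens."""
    max_chars = max_tokens * CHARS_PER_TOKEN
    return text if len(text) <= max_chars else text[:max_chars]
```

Real token counts vary by language and content, so treat the estimate as a rough guide for deciding whether to split a document before uploading.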
Access and Permissions#
Project-level permissions
Any user with read access or higher to the project can use the file upload feature.
User-specific access
The uploaded files are accessible only via the user’s authentication token.
No sharing
The uploaded files cannot be shared with other users or conversations.
Example Uses#
Contract Analysis#
Quickly review contract terms without permanent storage.
User: [Uploads contract.pdf]
User: What are the payment terms in this contract?
Chat: [Searches for "payment terms"]
Chat: According to Section 3.2 on page 5, the payment terms are...
Data Analysis#
Analyze structured data files for insights.
User: [Uploads sales-data.csv]
User: What is the total revenue for Q4?
Chat: [Processes CSV content]
Chat: Based on the data, Q4 revenue totals $2.4M across all regions...
Multi-File Comparison#
Compare information across multiple documents.
User: [Uploads proposal-v1.pdf and proposal-v2.pdf]
User: What are the differences in pricing between these two proposals?
Chat: [Searches both files for "pricing"]
Chat: In proposal-v1.pdf, the total is $50,000 (page 3), while
proposal-v2.pdf shows $47,500 (page 4)...