AI Features Setup Guide#
Piler's AI features provide intelligent email summarization, conversational search, and thread analysis using a local LLM (Large Language Model). This guide walks you through setting up the AI infrastructure.
Overview#
AI Features Available:
- Email Summarization: instant AI-generated summaries
- Conversational Search: natural language queries
- Thread Summarization: structured thread intelligence
Key Advantage: 100% on-premise. Your data NEVER leaves your infrastructure.
Prerequisites#
Hardware Requirements#
| Deployment Size | Minimum (CPU-only) | Recommended GPU | Notes |
|---|---|---|---|
| Small (100-500 users) | 4 CPU cores, 8GB RAM | RTX 3060 (12GB VRAM) | CPU-only works but slow (10-30s per query) |
| Medium (500-5K users) | 8 CPU cores, 16GB RAM | RTX 4090 (24GB VRAM) | GPU highly recommended (2-5s per query) |
| Large (5K+ users) | 16 CPU cores, 32GB RAM | A100 (40GB VRAM) | GPU required for acceptable performance |
GPU strongly recommended for production (20-50x faster than CPU).
Software Requirements#
- Linux: Ubuntu 22.04+, Debian 12+, RHEL 9+, or similar
- macOS: macOS 12+ (Apple Silicon or Intel)
- Docker: (optional) For containerized deployment
Installation Methods#
Method 1: Ollama (Recommended)#
Why Ollama:
- Easiest installation (one command)
- Automatic GPU detection
- Model management built-in
- Works on Linux, macOS, Windows
- Production-ready
- Free and open source
Install Ollama#
Linux/macOS:
# One-line install
curl -fsSL https://ollama.com/install.sh | sh
# Verify installation
ollama --version
# Should show: ollama version X.X.X
Or with Docker:
docker pull ollama/ollama:latest
# GPU support (NVIDIA)
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
# CPU only
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
Pull the LLM Model#
Recommended model: Llama 3.1 (8B)
# Pull model (will download ~4.7GB)
ollama pull llama3.1:8b
# Verify model is available
ollama list
# Should show: llama3.1:8b ... 4.7 GB
Alternative models:
| Model | Size | VRAM | Speed | Quality | Use Case |
|---|---|---|---|---|---|
| llama3.1:8b | 4.7GB | 6GB | Fast | Excellent | Recommended |
| llama3.1:70b | 40GB | 48GB | Slow | Best | Large deployments only |
| mistral:7b | 4.1GB | 5GB | Very fast | Good | Budget/CPU setups |
| qwen2.5:7b | 4.7GB | 6GB | Fast | Excellent | Multilingual |
| phi3:mini | 2.3GB | 3GB | Very fast | Good | Small/demo setups |
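Switching to an alternative is just a pull plus a config change. A minimal sketch, assuming the Piler `.env` lives at /var/piler/ui/.env (adjust to your installation) and the UI runs as the piler-ui systemd service:
# Pull the alternative model (phi3:mini shown, for CPU-only setups)
ollama pull phi3:mini
# Point Piler at it, then restart the UI
sudo sed -i 's/^LLM_MODEL=.*/LLM_MODEL=phi3:mini/' /var/piler/ui/.env
sudo systemctl restart piler-ui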
Test Ollama#
# Test generation
curl http://localhost:11434/api/generate -d '{
"model": "llama3.1:8b",
"prompt": "Summarize: This is a test email about quarterly budget approval.",
"stream": false
}'
# Should return JSON with summary
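The response is a single JSON object. If jq is available (an assumption, it is not required by Piler), you can pull out just the generated text and a rough latency figure; note that Ollama reports total_duration in nanoseconds:
# Extract the generated text and total time from the test call
curl -s http://localhost:11434/api/generate -d '{
  "model": "llama3.1:8b",
  "prompt": "Summarize: This is a test email about quarterly budget approval.",
  "stream": false
}' | jq '{summary: .response, total_seconds: (.total_duration / 1e9)}'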
Start Ollama as Service#
Linux (systemd):
# Ollama installer creates service automatically
sudo systemctl enable ollama
sudo systemctl start ollama
sudo systemctl status ollama
macOS (launchd):
# Ollama runs as service automatically after install
# Check status:
ps aux | grep ollama
Docker:
# Already running if you used docker run -d
docker ps | grep ollama
Method 2: LM Studio (Alternative)#
Why LM Studio:
- GUI for model management
- Good for macOS/Windows users
- Easy testing and experimentation
Install:
- Download from https://lmstudio.ai/
- Install and open LM Studio
- Download a model (llama3.1:8b recommended)
- Start the local server (default port 1234)
Configure Piler:
LLM_BASE_URL=http://localhost:1234 # LM Studio default port
LLM_MODEL=llama3.1:8b
Method 3: Custom LLM Server (Advanced)#
For organizations with specific requirements:
vLLM (High performance):
pip install vllm
vllm serve meta-llama/Llama-3.1-8B-Instruct --port 11434
Ollama on remote server:
# On GPU server
OLLAMA_HOST=0.0.0.0:11434 ollama serve
# On Piler server
LLM_BASE_URL=http://gpu-server.internal:11434
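Before wiring this into Piler, it is worth checking from the Piler server that the GPU box is reachable and the model is pulled; a quick sketch using the hostname from the example above:
# From the Piler server: is Ollama reachable and is the model present?
curl -sf http://gpu-server.internal:11434/api/tags | grep -q '"llama3.1:8b"' \
  && echo "OK: llama3.1:8b available" \
  || echo "FAIL: Ollama unreachable or model missing"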
Piler Configuration#
1. Enable AI Features#
Edit your Piler .env file:
# Enable AI features
LLM_ENABLED=true
# Ollama connection
LLM_BASE_URL=http://localhost:11434
LLM_MODEL=llama3.1:8b
# Cache settings (optional tuning)
LLM_CACHE_EXPIRY_MINUTES=15
# Advanced (optional)
NL2QUERY_PROMPT_FILE= # Empty = use embedded default
2. Restart Piler#
# Systemd
sudo systemctl restart piler
# Docker
docker-compose restart piler-ui
# Manual
pkill piler-ui
./piler-ui
3. Verify AI Features#
Check LLM connectivity:
curl http://localhost:3000/api/v1/llm/ping \
-H "Authorization: Bearer YOUR_TOKEN"
# Should return:
{"status":"ok","model":"llama3.1:8b"}
In the UI:
- Log in to Piler
- View any email
- Click "AI Tools" dropdown
- Click "AI Summary"
- Should see a summary within 2-5 seconds
Network Setup#
Same Server (Simplest)#
Piler and Ollama on same machine:
+----------------------------------+
|              Server              |
|  +-----------+    +-----------+  |
|  | Piler UI  |<-->|  Ollama   |  |
|  |   :3000   |    |  :11434   |  |
|  +-----------+    +-----------+  |
+----------------------------------+
Config:
LLM_BASE_URL=http://localhost:11434
Firewall: No changes needed (local connection)
Separate LLM Server (Recommended for Production)#
Piler on one server, Ollama on GPU server:
+--------------+               +--------------------+
| Piler Server |               |     GPU Server     |
|              |               |                    |
|   Piler UI   |<------------->|   Ollama :11434    |
|    :3000     |    Network    |  (GPU-accelerated) |
+--------------+               +--------------------+
Config:
LLM_BASE_URL=http://gpu-server.internal:11434
Firewall rules:
# On GPU server
sudo ufw allow from PILER_SERVER_IP to any port 11434
# Or open to network (if trusted)
sudo ufw allow 11434/tcp
Ollama config (GPU server):
# Allow network access
export OLLAMA_HOST=0.0.0.0:11434
# Start Ollama
ollama serve
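The export above only applies to the current shell. When Ollama runs under systemd (the default with the Linux installer), a drop-in makes the binding permanent; a sketch using systemctl edit:
# Persist OLLAMA_HOST for the systemd-managed service
sudo systemctl edit ollama
# In the editor, add:
#   [Service]
#   Environment="OLLAMA_HOST=0.0.0.0:11434"
sudo systemctl restart ollama
# Confirm it now listens on all interfaces
ss -tlnp | grep 11434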
Docker Compose Setup#
Full stack with Ollama:
version: '3.8'
services:
piler-ui:
image: sutoj/piler-ui:latest
environment:
LLM_ENABLED: "true"
LLM_BASE_URL: "http://ollama:11434"
LLM_MODEL: "llama3.1:8b"
depends_on:
- ollama
networks:
- piler-network
ollama:
image: ollama/ollama:latest
volumes:
- ollama-data:/root/.ollama
ports:
- "11434:11434"
# GPU support (uncomment if you have NVIDIA GPU)
# deploy:
# resources:
# reservations:
# devices:
# - driver: nvidia
# count: 1
# capabilities: [gpu]
networks:
- piler-network
networks:
piler-network:
driver: bridge
volumes:
ollama-data:
Pull model:
docker-compose up -d
docker exec -it <ollama-container> ollama pull llama3.1:8b
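If you script the rollout, give the container a moment to come up before pulling. A small sketch that waits for the API (published on the host port above) and then pulls through the compose service name:
# Wait for the Ollama API, then pull the model via the compose service
until curl -sf http://localhost:11434/api/tags > /dev/null; do
  echo "Waiting for Ollama..."
  sleep 2
done
docker-compose exec ollama ollama pull llama3.1:8b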
Performance Tuning#
GPU Acceleration (NVIDIA)#
Verify GPU is detected:
# Check NVIDIA driver
nvidia-smi
# Ollama should detect GPU automatically
# Check logs:
journalctl -u ollama -f | grep -i gpu
# Should show: "Using GPU: NVIDIA GeForce RTX 4090"
If GPU not detected:
# Install CUDA toolkit
# Ubuntu/Debian:
sudo apt install nvidia-cuda-toolkit
# Restart Ollama
sudo systemctl restart ollama
CPU-Only Optimization#
If running without GPU:
# Use smaller model for faster responses
ollama pull phi3:mini # 2.3GB, faster on CPU
# Update Piler config
LLM_MODEL=phi3:mini
Tune thread count:
# Set CPU threads (default: auto)
export OLLAMA_NUM_THREADS=8 # Match your CPU cores
ollama serve
Memory Management#
Limit model memory:
# Unload models after 5 minutes of inactivity (default)
export OLLAMA_KEEP_ALIVE=5m
# Or keep loaded always (faster but uses RAM)
export OLLAMA_KEEP_ALIVE=-1
ollama serve
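To see what is currently loaded, how much memory it uses, and when it will be unloaded, use ollama ps (output below is illustrative):
ollama ps
# NAME          ID            SIZE    PROCESSOR   UNTIL
# llama3.1:8b   a1b2c3d4e5f6  6.7 GB  100% GPU    4 minutes from now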
Troubleshooting#
"LLM service not configured" Error#
Cause: Piler can't reach Ollama
Solutions:
- Check Ollama is running:
curl http://localhost:11434/api/tags
# Should return list of models
- Check Piler config:
grep LLM_BASE_URL .env
# Should match Ollama address
- Check firewall (if separate servers):
# From Piler server
curl http://gpu-server:11434/api/tags
# Should connect
Slow AI Responses (>10 seconds)#
Cause: Running on CPU without GPU
Solutions:
- Add GPU: install an NVIDIA GPU and verify it with nvidia-smi
- Use a smaller model:
  ollama pull phi3:mini
  # Update LLM_MODEL=phi3:mini in .env
- Upgrade hardware:
  - Minimum: 8 CPU cores, 16GB RAM
  - Recommended: RTX 3060 or better
"Model not found" Error#
Cause: Model not pulled
Solution:
# Pull the model
ollama pull llama3.1:8b
# Verify
ollama list
Out of Memory (OOM) Errors#
Cause: Model too large for available VRAM/RAM
Solutions:
- Use smaller model:
ollama pull llama3.1:8b # Instead of 70b
- Increase swap (Linux):
sudo fallocate -l 16G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
- Reduce concurrent requests:
# Piler handles this automatically with rate limiting
# No config needed
Connection Timeout#
Cause: LLM taking too long to respond
Solution: Piler has built-in timeouts:
- Email summary: 60 seconds
- Thread summary: 120 seconds
- Search translation: 10 seconds
These limits are generous; if you still hit them, your LLM is too slow and likely needs a GPU.
Security Considerations#
Network Security#
Ollama has NO authentication!
If running on separate server:
# Option 1: Firewall (recommended)
sudo ufw allow from PILER_IP to any port 11434
sudo ufw deny 11434 # Block others
# Option 2: VPN/Private network
# Run Ollama on private network only
# Option 3: Reverse proxy with auth (advanced)
# Use nginx with basic auth in front of Ollama
Never expose Ollama directly to the internet!
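A minimal sketch of Option 3, assuming nginx and htpasswd (from apache2-utils) are installed; the listen port, paths, and username are placeholders. Keep Ollama itself bound to 127.0.0.1 so only the proxy is reachable from the network:
# Create the basic-auth credentials file (you will be prompted for a password)
sudo htpasswd -c /etc/nginx/.ollama_htpasswd piler
# Front Ollama with basic auth on port 11435
sudo tee /etc/nginx/conf.d/ollama-proxy.conf > /dev/null << 'EOF'
server {
    listen 11435;
    location / {
        auth_basic           "Ollama";
        auth_basic_user_file /etc/nginx/.ollama_htpasswd;
        proxy_pass           http://127.0.0.1:11434;
        proxy_read_timeout   300s;   # LLM responses can take a while
    }
}
EOF
sudo nginx -t && sudo systemctl reload nginx
Whether Piler can present credentials to such a proxy depends on your deployment, so for most setups the firewall option remains the simplest.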
Data Privacy#
What gets sent to Ollama:
- Email subject
- Email body (truncated to 50KB for summaries)
- Thread messages (up to 50 messages)
What leaves your server:
- Nothing! Ollama runs entirely locally
- No external API calls
- No telemetry (unless you enable it)
Compliance:
- GDPR compliant (data stays on-premise)
- HIPAA compatible (no third-party processing)
- SOC 2 friendly (complete audit trail)
Production Deployment#
Recommended Architecture#
For small deployments (<500 users):
+------------------------------+
|        Single Server         |
|  - Piler UI                  |
|  - Ollama (CPU/small GPU)    |
|  - MySQL                     |
|  - Manticore                 |
+------------------------------+
Hardware: 8 cores, 16GB RAM, optional RTX 3060
For medium deployments (500-5K users):
+---------------+        +---------------+
| Piler Server  |        |  GPU Server   |
|  - UI         |<------>|  - Ollama     |
|  - MySQL      |        |  - RTX 4090   |
|  - Manticore  |        +---------------+
+---------------+
For large deployments (5K+ users):
+---------------+        +----------------------+
| Piler Master  |        |     GPU Cluster      |
|  - UI         |<------>|  - Ollama (node 1)   |
|               |        |  - Ollama (node 2)   |
+---------------+        +----------------------+
        |
        +---- Worker 1 (emails)
        +---- Worker 2 (emails)
Use load balancer for Ollama nodes
High Availability#
Option 1: Multiple Ollama instances
# Run Ollama on 2+ servers
# Use nginx/HAProxy load balancer
# Piler config:
LLM_BASE_URL=http://llm-loadbalancer:11434
Option 2: Failover
# Primary Ollama on GPU server
# Backup Ollama on CPU (slower but works)
# Application handles failover automatically via retries
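A minimal nginx sketch that covers both options: two Ollama nodes in one upstream, with the CPU box marked as backup so it only receives traffic when the GPU node is down (hostnames are placeholders):
sudo tee /etc/nginx/conf.d/llm-loadbalancer.conf > /dev/null << 'EOF'
upstream ollama_pool {
    server gpu-node-1.internal:11434 max_fails=3 fail_timeout=30s;
    server cpu-backup.internal:11434 backup;   # slower, used only on failover
}
server {
    listen 11434;
    location / {
        proxy_pass         http://ollama_pool;
        proxy_read_timeout 300s;
    }
}
EOF
sudo nginx -t && sudo systemctl reload nginx
# Piler then points at the balancer host:
# LLM_BASE_URL=http://llm-loadbalancer:11434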
Cost Analysis#
On-Premise (Ollama)#
One-time costs:
- RTX 4090 GPU: $1,600-2,000
- Server: $2,000-4,000 (if needed)
- Setup: $500-1,000 (engineering time)
Ongoing:
- Power: ~$30-50/month (300W GPU)
- Maintenance: Minimal
Total first year: ~$4,000-7,000
Unlimited queries after initial investment.
Cloud API (Alternative - Not Recommended)#
OpenAI/Anthropic pricing:
- ~$0.01-0.05 per summary
- 1,000 summaries/day = $10-50/day = $3,650-18,250/year
- 10,000 summaries/day = $36,500-182,500/year
Drawbacks:
- Data leaves your infrastructure (compliance risk)
- Ongoing per-query costs
- Vendor lock-in
- Potential downtime (external dependency)
Recommendation: On-premise Ollama for enterprise. Cloud APIs only for trials/demos.
Configuration Reference#
All Available Options#
# Core AI Settings
LLM_ENABLED=true|false
# Enable/disable all AI features
# Default: false
LLM_BASE_URL=http://localhost:11434
# Ollama/LLM server URL
# Default: http://localhost:11434
LLM_MODEL=llama3.1:8b
# Model name (must be pulled in Ollama)
# Default: llama3.1:8b
# Alternatives: mistral:7b, qwen2.5:7b, phi3:mini
LLM_CACHE_EXPIRY_MINUTES=15
# Redis cache TTL for summaries
# Default: 15 minutes
# Threads cached for 60 minutes (hardcoded)
NL2QUERY_PROMPT_FILE=
# Custom prompt template for conversational search
# Empty = use embedded default
# Example: /etc/piler/prompts/custom_nl2query.txt
Multi-Tenant Configuration#
Same LLM for all tenants:
# Global config in .env
LLM_ENABLED=true
LLM_BASE_URL=http://localhost:11434
Per-tenant enable/disable:
-- In tenant_settings table
UPDATE piler.tenant_settings
SET settings_json = JSON_SET(settings_json, '$.llm_enabled', true)
WHERE tenant_id = 'tenant1';
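To see which tenants currently have the flag set, a quick check from the shell (same table and column as above, assuming the standard mysql client; add -u/-p credentials as required by your setup):
# List per-tenant llm_enabled flags
mysql -e "SELECT tenant_id, JSON_EXTRACT(settings_json, '$.llm_enabled') AS llm_enabled FROM piler.tenant_settings;"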
Customizing AI Behavior#
Default Conversational Search Prompt#
The conversational search uses this prompt template (embedded in binary):
Location: internal/llm/prompts/nl2query.txt
The full default prompt template:
You are a search query translator for an email archiving system. Convert natural language questions into Piler search syntax.
SEARCH SYNTAX:
- from:email@domain.com - Filter by sender
- to:email@domain.com - Filter by recipient
- subject:keyword - Search in subject (can include multiple words: subject:word1 word2)
- subject:"exact phrase" - Search for exact phrase in subject
- body:keyword - Search in body text (can include multiple words: body:word1 word2)
- body:"exact phrase" - Search for exact phrase in body
- date1:YYYY-MM-DD - Start date (use with date2 for range)
- date2:YYYY-MM-DD - End date (use with date1 for range)
- a:any - Has any attachment
- a:pdf - Has PDF attachments ONLY
- a:word - Has Word document attachments ONLY
- a:excel - Has Excel spreadsheet attachments ONLY
- a:image - Has image attachments ONLY (jpg, png, gif, etc. - use a:image, NOT a:jpg)
- a:zip - Has zip/archive attachments ONLY
- attachment:filename.pdf - Has specific attachment filename
- size:>5M - Size greater than 5MB (use M for megabytes, K for kilobytes)
- direction:inbound - Received emails (also: direction:outbound, direction:internal)
- category:name - Filter by category (short form: cat:name)
- tag:tagname - Search by tag
- note:text - Search in notes
- id:123 - Specific message ID
OPERATORS:
- Multiple terms separated by spaces are implicitly AND (no parentheses needed)
- OR - Either term (use uppercase: urgent OR important)
- NOT - Exclude term (use uppercase: NOT spam)
- "exact phrase" - Phrase search in quotes
- * - Wildcard (john* matches john, johnny, johns)
- ( ) - ONLY use parentheses for grouping OR/NOT operations, NOT for simple AND queries
IMPORTANT NOTES:
- For date ranges, use date1: and date2: together (NOT date:X..Y)
- For "has any attachment" use a:any (NOT has:attachment)
- Valid attachment types: a:pdf, a:word, a:excel, a:image, a:zip, a:any
- For images, ALWAYS use a:image (NOT a:jpg, a:png, etc.)
- For size, use M for megabytes, K for kilobytes (e.g., size:>5M NOT size:>5MB)
- Keep queries SIMPLE - only include meaningful search terms, skip filler words
- Skip generic words like "with", "about", "problems", "issues", "emails", "messages"
- Focus on specific entities: names, subjects, dates, attachment types
- Example: "virtualfax problems with pdf attachments" → "subject:virtualfax a:pdf" (skip "problems with")
- There is NO is:unread filter (this is an archive, not a mailbox)
- Direction must be: inbound, outbound, or internal
RELATIVE DATES (calculate from today: {{CURRENT_DATE}}):
- "today" → date1:{{TODAY}} date2:{{TODAY}}
- "yesterday" → date1:{{YESTERDAY}} date2:{{YESTERDAY}}
- "last week" → date1:{{LAST_WEEK}} (7 days ago)
- "last month" → date1:{{LAST_MONTH}} (30 days ago)
- "last quarter" → date1:{{LAST_QUARTER}} (90 days ago)
- "this month" → date1:{{MONTH_START}} date2:{{TODAY}}
- "this year" → date1:{{YEAR_START}}
EXAMPLES:
Input: "emails from sarah last week"
Output: {"query":"from:sarah date1:{{LAST_WEEK}}","explanation":"Emails from Sarah sent since {{LAST_WEEK}}","confidence":0.95}
Input: "urgent messages about project"
Output: {"query":"subject:urgent subject:project","explanation":"Messages about urgent project","confidence":0.90}
Input: "employment invitation"
Output: {"query":"subject:employment invitation","explanation":"Emails about employment invitation","confidence":0.92}
Input: "virtualfax problems with pdf attachments"
Output: {"query":"subject:virtualfax a:pdf","explanation":"Virtualfax emails with PDF attachments","confidence":0.93}
Input: "large PDF attachments from vendors"
Output: {"query":"from:*vendor* size:>5M a:pdf","explanation":"PDF attachments larger than 5MB from vendor domains","confidence":0.88}
Input: "images from operator"
Output: {"query":"a:image from:operator","explanation":"Image attachments from operator","confidence":0.95}
Input: "invoices in December"
Output: {"query":"subject:invoice date1:2024-12-01 date2:2024-12-31","explanation":"Emails with 'invoice' in subject from December 2024","confidence":0.92}
Input: "emails with any attachments from john"
Output: {"query":"from:john a:any","explanation":"Emails from John that have attachments","confidence":0.95}
Input: "emails about budget OR finance from sarah"
Output: {"query":"from:sarah (subject:budget OR subject:finance)","explanation":"Emails from Sarah about budget or finance","confidence":0.93}
Input: "attachments sent in the last 5 years"
Output: {"query":"date1:2020-01-20 a:any","explanation":"Emails with attachments since January 2020","confidence":0.94}
CONTEXT-AWARE EXAMPLES (showing how to handle follow-ups):
Context: Previous query was "virtualfax problems with pdf attachments" → "subject:virtualfax a:pdf"
Input: "after 2015-10-21"
Output: {"query":"subject:virtualfax a:pdf date1:2015-10-21","explanation":"Virtualfax emails with PDF attachments after October 21, 2015","confidence":0.96}
Context: Previous query was "emails from john" → "from:john"
Input: "just PDFs"
Output: {"query":"from:john a:pdf","explanation":"PDF emails from John","confidence":0.95}
Context: Previous query was "invoices last month" → "subject:invoice date1:2024-12-20"
Input: "over $10,000"
Output: {"query":"subject:invoice date1:2024-12-20 body:$10,000 OR body:10000","explanation":"Invoices from last month over $10,000","confidence":0.88}
{{CONVERSATION_CONTEXT}}
IMPORTANT RULES:
1. MUST return valid JSON with exactly these fields: query, explanation, confidence
2. If there is CONTEXT from previous queries, you MUST include those search terms in the new query
3. For follow-up/refinement queries (like "just PDFs" or "after 2020"), ALWAYS combine with previous context
4. Use confidence < 0.7 if the question is ambiguous
5. For ambiguous queries, explain what clarification is needed in the explanation field
6. Calculate today's date from context (assume current date: {{CURRENT_DATE}})
7. NEVER change dates provided by the user - if user says "2015" use 2015, NOT 2025 (this is an archive with old emails)
8. Use EXACT dates from user input - do not assume typos or correct dates
9. Always escape special characters in email addresses
10. Use wildcards (*) for partial matches when appropriate
11. Prefer (subject:X OR body:X) over just subject:X unless explicitly about subject only
RESPONSE FORMAT (copy this structure exactly):
{
"query": "your translated search query here",
"explanation": "brief explanation of what you're searching for",
"confidence": 0.95
}
Respond with ONLY the JSON object above. No markdown, no code blocks, no extra text.
Customizing the Prompt#
Why customize:
- Add industry-specific examples (legal, healthcare, finance)
- Improve accuracy for your organization's terminology
- Adjust for different LLM models
How to customize:
- Create custom prompt file:
# Copy embedded template (extract from binary or docs)
cat > /etc/piler/custom_nl2query.txt << 'EOF'
[Paste default template above]
# Add your custom examples:
Input: "discovery documents from opposing counsel"
Output: {"query":"from:*@opposing-firm.com category:discovery","explanation":"...","confidence":0.94}
Input: "patient records with consent forms"
Output: {"query":"subject:patient subject:consent a:pdf","explanation":"...","confidence":0.93}
EOF
- Point Piler to custom prompt:
# In .env
NL2QUERY_PROMPT_FILE=/etc/piler/custom_nl2query.txt
- Restart Piler:
sudo systemctl restart piler-ui
Template variables available:
- {{CURRENT_DATE}}: today's date (auto-calculated)
- {{LAST_WEEK}}, {{LAST_MONTH}}, {{LAST_QUARTER}}: relative dates
- {{CONVERSATION_CONTEXT}}: previous queries (auto-injected)
Testing your prompt:
Try queries in the UI and check if translations match expectations. Iterate based on real usage patterns.
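One way to iterate quickly is to exercise the template against Ollama directly, before going through Piler. A rough sketch (jq assumed for building the JSON payload; only the {{CURRENT_DATE}} placeholder is substituted here, other variables are left as-is):
PROMPT_FILE=/etc/piler/custom_nl2query.txt
QUESTION="invoices from acme last month"
TODAY=$(date +%F)
# Fill in the date placeholder and append the test question
PROMPT="$(sed "s/{{CURRENT_DATE}}/$TODAY/g" "$PROMPT_FILE")
Input: \"$QUESTION\"
Output:"
# Send the assembled prompt straight to Ollama and print the model's JSON answer
jq -n --arg model "llama3.1:8b" --arg prompt "$PROMPT" \
  '{model: $model, prompt: $prompt, stream: false}' \
  | curl -s http://localhost:11434/api/generate -d @- \
  | jq -r '.response'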
Monitoring#
Check AI Feature Health#
1. Test LLM connectivity:
curl http://localhost:3000/api/v1/llm/ping \
-H "Cookie: session=YOUR_SESSION"
# Should return:
{"status":"ok","model":"llama3.1:8b"}
2. Monitor Ollama:
# Check running models
curl http://localhost:11434/api/tags
# Monitor logs
journalctl -u ollama -f
# Check resource usage
htop # Watch CPU/RAM
nvidia-smi -l 1 # Watch GPU (if NVIDIA)
3. Check Piler logs:
tail -f /var/log/piler/app.log | grep -i llm
# Look for:
# "LLM summarization failed" - problems
# "Cache hit for email summary" - working well
Performance Metrics#
Target metrics:
- Email summary: <5 seconds (P95)
- Thread summary: <30 seconds for 20-message thread (P95)
- Search translation: <3 seconds (P95)
- LLM uptime: >99%
If slower:
- Add GPU
- Use smaller model
- Check network latency (if separate servers)
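To see where you stand against these targets independently of Piler, you can time the raw Ollama round trip; a simple sketch that runs five sample generations and prints the wall-clock time for each:
# Rough latency check against Ollama itself (5 runs)
for i in 1 2 3 4 5; do
  curl -s -o /dev/null -w "run $i: %{time_total}s\n" \
    http://localhost:11434/api/generate \
    -d '{"model":"llama3.1:8b","prompt":"Summarize: quarterly budget approved.","stream":false}'
done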
Upgrading#
Update Ollama#
# Linux/macOS
curl -fsSL https://ollama.com/install.sh | sh
# Docker
docker pull ollama/ollama:latest
docker-compose up -d
Update Model#
# Pull newer version
ollama pull llama3.1:8b
# Old version removed automatically if space needed
Piler Updates#
AI features are part of Piler UI - update Piler as normal:
# Binary update
sudo systemctl stop piler-ui
sudo cp new-piler-ui /var/piler/ui/app
sudo systemctl start piler-ui
# Docker update: set the desired image tag in docker-compose.yml (e.g. sutoj/piler-ui:2.1.0), then:
docker-compose pull piler-ui
docker-compose up -d
FAQ#
Q: Do I need a GPU?#
A: Highly recommended but not required.
- Without GPU: 10-30 seconds per summary (acceptable for light use)
- With GPU: 2-5 seconds per summary (production quality)
Q: Can I use GPT-4/Claude instead of Ollama?#
A: Technically yes, but not recommended:
- Data leaves your infrastructure
- Ongoing costs
- Requires code changes (different API format)
- Compliance risks
Ollama is designed for on-premise use.
Q: How much disk space for models?#
A: ~5-10GB per model
- llama3.1:8b: 4.7GB
- Keep 2-3 models for testing: ~15GB
- Models stored in ~/.ollama/models/
Q: Can multiple Piler instances share one Ollama?#
A: Yes! Ollama handles concurrent requests.
- Single RTX 4090: ~10-20 concurrent requests
- Add more GPUs for higher concurrency
Q: What if Ollama crashes?#
A: AI features gracefully degrade:
- User gets "LLM service unavailable" error
- Can still use traditional search
- Cached summaries still served (if available)
- No impact on email viewing/searching
Ollama auto-restarts via systemd.
Q: Can I customize AI prompts?#
A: Yes! See AI Prompt Customization Guide
Q: Does this work offline/air-gapped?#
A: Yes!
- Download Ollama installer on internet-connected machine
- Pull the model: ollama pull llama3.1:8b
- Copy the model files to the air-gapped server (see the sketch below)
- Works completely offline
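For the copy step, the model blobs live under ~/.ollama/models/ (see the disk-space FAQ above), so a plain archive transfer works; a sketch assuming the same user runs Ollama on both machines:
# On the internet-connected machine: archive the pulled models
tar czf ollama-models.tar.gz -C ~/.ollama models
# Move the archive by removable media, then on the air-gapped server:
tar xzf ollama-models.tar.gz -C ~/.ollama
sudo systemctl restart ollama
ollama list   # should now show llama3.1:8b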
Support#
Getting Help#
- Check logs: journalctl -u piler-ui and journalctl -u ollama
- Test Ollama directly: curl http://localhost:11434/api/tags
- Verify config: grep LLM .env
- Contact support: support@mailpiler.com
Reporting Issues#
Include:
- Piler version: ./piler-ui --version
- Ollama version: ollama --version
- Model: ollama list
- Error logs (last 50 lines)
- Hardware: CPU, RAM, GPU (if any)
Next Steps#
- Install Ollama
- Pull the llama3.1:8b model
- Configure Piler (LLM_ENABLED=true)
- Restart Piler
- Test the AI Summary feature
- Train users on AI features
See also:
- AI Features User Guide - How to use AI features
- Configuration Options - All Piler settings
Last Update: November 22, 2025
Piler Version: 2.1.0+
Status: Production Ready