eDiscovery Configuration Reference#
This document covers all configuration options for the eDiscovery Export feature.
Environment Variables#
Export Directory#
eDiscovery exports use the standard Piler export directory:
# Base directory for all exports (default: /var/piler/export)
DIR_EXPORT=/var/piler/export
eDiscovery exports are stored in a subdirectory structure:
{DIR_EXPORT}/
└── {tenant_id}/
    └── ediscovery/
        └── {job_id}/
            ├── loadfile.csv
            ├── loadfile.dat
            ├── NATIVES/
            ├── TEXT/
            └── ATTACHMENTS/
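To illustrate how the layout above maps to a filesystem path, here is a minimal sketch. The helper name `export_job_dir` is hypothetical and not part of Piler; it only joins the documented path components.

```python
import os
from pathlib import PurePosixPath
from typing import Optional


def export_job_dir(tenant_id: str, job_id: str, base: Optional[str] = None) -> PurePosixPath:
    """Hypothetical helper: build {DIR_EXPORT}/{tenant_id}/ediscovery/{job_id}."""
    # Use the explicit base if given, then DIR_EXPORT, then the documented default.
    root = base or os.environ.get("DIR_EXPORT", "/var/piler/export")
    return PurePosixPath(root) / tenant_id / "ediscovery" / job_id
```

For example, `export_job_dir("acme", "job-1", base="/var/piler/export")` yields `/var/piler/export/acme/ediscovery/job-1`.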
Concurrent Export Limits#
# Maximum concurrent exports across all tenants (default: 3)
EDISCOVERY_MAX_GLOBAL_EXPORTS=3
# Maximum concurrent exports per tenant (default: 1)
EDISCOVERY_MAX_TENANT_EXPORTS=1
These limits prevent resource exhaustion during large exports. The global limit ensures the server isn't overwhelmed when multiple tenants run exports simultaneously.
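The interaction of the two limits can be sketched as follows. This is an in-memory illustration only: the function name `can_start_export` is hypothetical, and the real check presumably runs against the `ediscovery_active_exports` table.

```python
from collections import Counter


def can_start_export(active_tenants, tenant_id, max_global=3, max_tenant=1):
    """Return True if a new export for tenant_id fits within both limits.

    active_tenants: one tenant_id per currently running export.
    """
    # The global limit is checked first: it applies regardless of tenant.
    if len(active_tenants) >= max_global:
        return False
    # Then the per-tenant limit.
    return Counter(active_tenants)[tenant_id] < max_tenant
```

With the defaults, a tenant that already has one running export is refused a second one, while a different tenant can still start an export as long as fewer than three are running in total.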
Worker Threads#
# Number of parallel workers for processing emails (default: 4, max: 32)
EDISCOVERY_WORKERS=4
Increase this value for faster exports on systems with many CPU cores. Decrease it if exports degrade performance for other users.
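A minimal sketch of reading this setting with the documented default and bounds. The function name `worker_count` is hypothetical, and clamping out-of-range or invalid values (rather than rejecting them) is an assumption, not documented behavior.

```python
import os


def worker_count(env=None):
    """Read EDISCOVERY_WORKERS, defaulting to 4 and clamping to 1..32."""
    env = os.environ if env is None else env
    try:
        n = int(env.get("EDISCOVERY_WORKERS", "4"))
    except ValueError:
        n = 4  # assumption: fall back to the default on a non-numeric value
    return max(1, min(n, 32))
```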
Database Tables#
Per-Tenant Tables#
These tables are created in each tenant's database ({tenant_id}.tablename):
ediscovery_exports#
Stores export job metadata.
| Column | Type | Description |
|---|---|---|
| id | BIGINT | Auto-increment primary key |
| job_id | VARCHAR(64) | Unique job identifier (UUID) |
| status | VARCHAR(32) | pending, running, completed, failed, cancelled |
| bates_prefix | VARCHAR(32) | Bates number prefix |
| bates_start | INT | Starting Bates number |
| bates_end | INT | Ending Bates number (after completion) |
| total_emails | INT | Total emails to process |
| processed | INT | Emails processed so far |
| failed | INT | Emails that failed processing |
| config_json | TEXT | JSON export configuration |
| output_path | VARCHAR(512) | Path to output directory |
| output_size | BIGINT | Total size of export files |
| error_message | TEXT | Error details if failed |
| created_by | VARCHAR(255) | Email of user who created export |
| created_at | TIMESTAMP | Creation timestamp |
| started_at | TIMESTAMP | Processing start time |
| completed_at | TIMESTAMP | Completion timestamp |
ediscovery_bates_sequences#
Tracks Bates number sequences for continuity.
| Column | Type | Description |
|---|---|---|
| id | BIGINT | Auto-increment primary key |
| prefix | VARCHAR(32) | Bates prefix (unique) |
| last_number | INT | Last assigned number |
| updated_at | TIMESTAMP | Last update timestamp |
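How a stored sequence might feed into the `bates_start_number` setting (where 0 means "continue from last") can be sketched as below. The function name `resolve_bates_start` is hypothetical, and two details are assumptions: that continuing means `last_number + 1`, and that a prefix with no stored sequence starts at 1.

```python
def resolve_bates_start(requested_start, last_number):
    """Pick the first Bates number for a new export.

    requested_start: bates_start_number from the export configuration (0 = auto).
    last_number: last_number stored for the prefix, or None if the prefix is new.
    """
    if requested_start > 0:
        return requested_start  # an explicit start number is used as given
    # Assumption: continue one past the last assigned number; new prefixes start at 1.
    return (last_number or 0) + 1
```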
ediscovery_export_items#
Maps individual emails to their Bates numbers.
| Column | Type | Description |
|---|---|---|
| id | BIGINT | Auto-increment primary key |
| job_id | VARCHAR(64) | Reference to export job |
| email_id | BIGINT | Reference to archived email |
| bates_number | VARCHAR(64) | Assigned Bates number |
| status | VARCHAR(32) | success, failed, skipped |
| error_message | TEXT | Error if failed |
| processed_at | TIMESTAMP | Processing timestamp |
ediscovery_download_log#
Audit trail for export downloads.
| Column | Type | Description |
|---|---|---|
| id | BIGINT | Auto-increment primary key |
| job_id | VARCHAR(64) | Reference to export job |
| downloaded_by | VARCHAR(255) | User who downloaded |
| downloaded_at | TIMESTAMP | Download timestamp |
| ip_address | VARCHAR(45) | Client IP address |
| user_agent | TEXT | Browser user agent |
Global Table#
Created in the main piler database:
ediscovery_active_exports#
Tracks currently running exports across all tenants for enforcing global limits.
| Column | Type | Description |
|---|---|---|
| id | BIGINT | Auto-increment primary key |
| tenant_id | VARCHAR(32) | Tenant identifier |
| job_id | VARCHAR(64) | Export job ID |
| started_at | TIMESTAMP | When export started |
Export Configuration Object#
When creating an export, the following configuration is stored as JSON:
{
  "bates_prefix": "ACME",
  "bates_start_number": 1,
  "bates_digits": 6,
  "include_natives": true,
  "include_text": true,
  "include_attachments": true,
  "organize_by_custodian": false,
  "load_file_format": "csv",
  "date_format": "iso8601",
  "email_ids": [1001, 1002, 1003],
  "workers": 4
}
Configuration Fields#
| Field | Type | Default | Description |
|---|---|---|---|
| bates_prefix | string | required | Prefix for Bates numbers |
| bates_start_number | int | 0 (auto) | Starting number, 0 = continue from last |
| bates_digits | int | 6 | Number of digits in Bates number |
| include_natives | bool | true | Include original EML files |
| include_text | bool | false | Extract plain text files |
| include_attachments | bool | false | Extract attachments separately |
| organize_by_custodian | bool | false | Create custodian folders |
| load_file_format | string | "csv" | "csv", "dat", or "both" |
| date_format | string | "iso8601" | "iso8601", "us", or "eu" |
| email_ids | []int64 | required | List of email IDs to export |
| workers | int | 4 | Parallel processing workers (1-32) |
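The defaults and allowed values in the table above can be expressed as a small validation step. This is a sketch only: `load_export_config` and the error messages are hypothetical, and the actual server-side validation may differ.

```python
import json

# Defaults taken from the Configuration Fields table.
DEFAULTS = {
    "bates_start_number": 0,
    "bates_digits": 6,
    "include_natives": True,
    "include_text": False,
    "include_attachments": False,
    "organize_by_custodian": False,
    "load_file_format": "csv",
    "date_format": "iso8601",
    "workers": 4,
}
REQUIRED = ("bates_prefix", "email_ids")


def load_export_config(raw: str) -> dict:
    """Parse a JSON configuration, apply defaults, and check documented ranges."""
    cfg = {**DEFAULTS, **json.loads(raw)}
    missing = [key for key in REQUIRED if not cfg.get(key)]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    if cfg["load_file_format"] not in ("csv", "dat", "both"):
        raise ValueError("load_file_format must be csv, dat, or both")
    if cfg["date_format"] not in ("iso8601", "us", "eu"):
        raise ValueError("date_format must be iso8601, us, or eu")
    if not 1 <= cfg["workers"] <= 32:
        raise ValueError("workers must be between 1 and 32")
    return cfg
```

A configuration that supplies only `bates_prefix` and `email_ids` comes back with every other field set to its documented default.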
Load File Formats#
CSV Format#
- Delimiter: Comma (,)
- Quote character: Double quote (")
- Encoding: UTF-8 with BOM
- Line endings: CRLF (\r\n)
- Header row: Yes
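A file with the CSV properties listed above can be produced with Python's standard `csv` module, as in this sketch. The function name and the column names used in the usage note (`BEGBATES`, `SUBJECT`) are placeholders, since the actual load file columns are not listed here.

```python
import csv
import io


def csv_loadfile_bytes(header, rows):
    """Serialize rows as comma-delimited, double-quoted CSV with CRLF line endings."""
    buf = io.StringIO(newline="")  # newline="" keeps the writer's CRLF untouched
    writer = csv.writer(buf, delimiter=",", quotechar='"', lineterminator="\r\n")
    writer.writerow(header)
    writer.writerows(rows)
    # The "utf-8-sig" codec prepends the UTF-8 byte order mark (EF BB BF).
    return buf.getvalue().encode("utf-8-sig")
```

Calling `csv_loadfile_bytes(["BEGBATES", "SUBJECT"], [["ACME_000001", "Q3 report"]])` returns bytes that begin with the BOM, followed by the header row and one data row.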
DAT Format (Concordance)#
- Field delimiter: þ (U+00FE, Concordance default)
- Text qualifier: ¶ (U+00B6, Concordance default)
- Newline replacement: ® (U+00AE)
- Encoding: UTF-8
- Line endings: CRLF (\r\n)
- Header row: Yes
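The following sketch shows how one DAT record could be assembled from the characters listed above. The function name `dat_line` is hypothetical. The delimiter constants simply mirror this list and are exposed as parameters, so they can be adjusted to whatever the receiving review platform expects.

```python
FIELD_DELIM = "\u00fe"     # þ, field delimiter as listed above
TEXT_QUALIFIER = "\u00b6"  # ¶, text qualifier as listed above
NEWLINE_REPL = "\u00ae"    # ®, stands in for line breaks inside a field


def dat_line(fields, delim=FIELD_DELIM, qualifier=TEXT_QUALIFIER, newline=NEWLINE_REPL):
    """Build one CRLF-terminated DAT record from a list of field values."""
    wrapped = []
    for value in fields:
        text = str(value)
        # Replace embedded line breaks so each record stays on one physical line.
        for line_break in ("\r\n", "\n", "\r"):
            text = text.replace(line_break, newline)
        wrapped.append(qualifier + text + qualifier)
    return delim.join(wrapped) + "\r\n"
```

Encode the result with `.encode("utf-8")` before writing it to disk to match the encoding listed above.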
File Naming Conventions#
Native Files#
{BATES_PREFIX}_{NUMBER}.eml
Example: ACME_000001.eml
Text Files#
{BATES_PREFIX}_{NUMBER}.txt
Example: ACME_000001.txt
Attachments#
{BATES_PREFIX}_{NUMBER}_{ATTACHMENT_SEQ}.{ext}
Example: ACME_000001_001.pdf
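The three naming patterns above can be sketched as simple format helpers. The function names are hypothetical. Zero-padding the number to `bates_digits` and using a three-digit attachment sequence are inferred from the examples, not stated rules.

```python
def export_basename(prefix, number, digits=6):
    """{BATES_PREFIX}_{NUMBER}, with the number zero-padded to `digits`."""
    return f"{prefix}_{number:0{digits}d}"


def native_name(prefix, number, digits=6):
    return export_basename(prefix, number, digits) + ".eml"


def text_name(prefix, number, digits=6):
    return export_basename(prefix, number, digits) + ".txt"


def attachment_name(prefix, number, seq, ext, digits=6):
    # Assumption: three-digit attachment sequence, as in ACME_000001_001.pdf.
    return f"{export_basename(prefix, number, digits)}_{seq:03d}.{ext}"
```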
Custodian Organization#
When enabled, files are organized:
{CUSTODIAN_EMAIL}/
├── NATIVES/
│   └── ACME_000001.eml
├── TEXT/
│   └── ACME_000001.txt
└── ATTACHMENTS/
    └── ACME_000001_001.pdf
API Endpoints#
Auditor Endpoints (require auditor role)#
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/ediscovery/export | Create new export |
| GET | /api/v1/ediscovery/exports | List all exports |
| GET | /api/v1/ediscovery/exports/:job_id | Get export details |
| POST | /api/v1/ediscovery/exports/:job_id/cancel | Cancel running export |
| DELETE | /api/v1/ediscovery/exports/:job_id | Delete export |
| GET | /api/v1/ediscovery/exports/:job_id/download | Download export ZIP |
| GET | /api/v1/ediscovery/bates-prefixes | Get Bates prefix sequences |
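As a rough illustration of calling the create endpoint, the sketch below builds (but does not send) a request using Python's standard library. Several things here are assumptions: the host name, the bearer-token `Authorization` header, and the idea that the request body is the export configuration object itself. Check your deployment for the actual authentication scheme and payload.

```python
import json
import urllib.request


def build_create_export_request(base_url, token, config):
    """Build a POST request for /api/v1/ediscovery/export (not sent here)."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/ediscovery/export",
        data=json.dumps(config).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; the real scheme may differ.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

The request would be sent with `urllib.request.urlopen(req)`; the response format is not documented here, so the sketch stops before that step.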
Admin Endpoints (require admin role)#
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/admin/ediscovery/exports | List all tenant exports |
| GET | /api/v1/admin/ediscovery/bates-prefixes | Get all Bates prefixes |
Security Considerations#
Access Control#
- Only users with auditor role (role_id = 2) can create exports
- Users can only see/download their own tenant's exports
- Download actions are logged with IP and user agent
File System Security#
- Export directories should be on encrypted volumes
- Restrict filesystem permissions on the export directory so that only the account running Piler can read it
Data Retention#
- Exports remain until manually deleted
- Consider implementing automatic cleanup policies
- Download logs are retained indefinitely for audit purposes
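One possible starting point for a cleanup policy is a script that lists export job directories older than a chosen age. This is a sketch under assumptions: the function name `find_expired_exports` is hypothetical, age is judged by directory modification time, and nothing is deleted. Removing an export through the DELETE endpoint is likely the safer follow-up, since it can keep the database records consistent.

```python
import time
from pathlib import Path


def find_expired_exports(export_root, max_age_days, now=None):
    """List {export_root}/{tenant_id}/ediscovery/{job_id} directories older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    expired = []
    for job_dir in Path(export_root).glob("*/ediscovery/*"):
        # Only job directories count; stray files are ignored.
        if job_dir.is_dir() and job_dir.stat().st_mtime < cutoff:
            expired.append(job_dir)
    return sorted(expired)
```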
Migration#
Apply the migration to create required tables:
# Run migration 014_ediscovery_export.sql on each tenant database
mysql -u piler -p tenant_db < util/migrations/sql/014_ediscovery_export.sql
# Run global migration on piler database
mysql -u piler -p piler < util/migrations/sql/014_ediscovery_global.sql