eDiscovery Configuration Reference

This document covers all configuration options for the eDiscovery Export feature.

Environment Variables

Export Directory

eDiscovery exports use the standard Piler export directory:

# Base directory for all exports (default: /var/piler/export)
DIR_EXPORT=/var/piler/export

Each export job is written to its own subdirectory, grouped by tenant:

{DIR_EXPORT}/
└── {tenant_id}/
    └── ediscovery/
        └── {job_id}/
            ├── loadfile.csv
            ├── loadfile.dat
            ├── NATIVES/
            ├── TEXT/
            └── ATTACHMENTS/
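
As a rough illustration of the layout above, the following sketch resolves the expected paths for a single job. The helper name `export_paths` and the dictionary keys are hypothetical; only `DIR_EXPORT`, its default, and the directory and file names come from this document.

```python
import os
from pathlib import PurePosixPath


def export_paths(tenant_id: str, job_id: str, env=os.environ) -> dict:
    """Return the expected locations for one export job (illustrative only)."""
    # Fall back to the documented default when DIR_EXPORT is not set.
    base = PurePosixPath(env.get("DIR_EXPORT", "/var/piler/export"))
    job_dir = base / tenant_id / "ediscovery" / job_id
    return {
        "job_dir": job_dir,
        "loadfile_csv": job_dir / "loadfile.csv",
        "loadfile_dat": job_dir / "loadfile.dat",
        "natives": job_dir / "NATIVES",
        "text": job_dir / "TEXT",
        "attachments": job_dir / "ATTACHMENTS",
    }
```

Passing `env` explicitly makes it easy to check the result without touching the real environment.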

Concurrent Export Limits

# Maximum concurrent exports across all tenants (default: 3)
EDISCOVERY_MAX_GLOBAL_EXPORTS=3

# Maximum concurrent exports per tenant (default: 1)
EDISCOVERY_MAX_TENANT_EXPORTS=1

These limits prevent resource exhaustion during large exports. The global limit ensures the server isn't overwhelmed when multiple tenants run exports simultaneously.
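
The interaction of the two limits can be sketched as a simple admission check. This is not the server's actual code: the function name `can_start_export` and the shape of the `active` list are assumptions, and this document does not say whether a request over the limit is queued or rejected.

```python
import os


def can_start_export(active, tenant_id, env=os.environ) -> bool:
    """Decide whether a new export may start.

    `active` is a list of (tenant_id, job_id) pairs for running exports,
    a hypothetical stand-in for the rows that track active jobs.
    """
    max_global = int(env.get("EDISCOVERY_MAX_GLOBAL_EXPORTS", "3"))
    max_tenant = int(env.get("EDISCOVERY_MAX_TENANT_EXPORTS", "1"))

    # The global limit applies first: it caps exports across all tenants.
    if len(active) >= max_global:
        return False

    # The per-tenant limit only counts exports owned by this tenant.
    running_for_tenant = sum(1 for owner, _ in active if owner == tenant_id)
    return running_for_tenant < max_tenant
```

With the defaults, a tenant that already has one running export is refused a second one, while a different tenant can still start a job until three exports are running in total.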

Worker Threads

# Number of parallel workers for processing emails (default: 4, max: 32)
EDISCOVERY_WORKERS=4

Increase this value for faster exports on systems with many CPU cores. Decrease it if exports degrade performance for other users.
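
A minimal sketch of how a worker count might be resolved is shown below. It assumes that a per-export `workers` value takes precedence over `EDISCOVERY_WORKERS` and that out-of-range values are clamped to 1-32 rather than rejected; neither behavior is confirmed by this document, and `resolve_workers` is a hypothetical name.

```python
import os

MIN_WORKERS = 1
MAX_WORKERS = 32  # documented maximum


def resolve_workers(requested=None, env=os.environ) -> int:
    """Pick a worker count, clamped to the documented range (assumed behavior)."""
    default = int(env.get("EDISCOVERY_WORKERS", "4"))
    value = default if requested is None else requested
    return max(MIN_WORKERS, min(MAX_WORKERS, value))
```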

Database Tables

Per-Tenant Tables

These tables are created in each tenant's database ({tenant_id}.tablename):

ediscovery_exports

Stores export job metadata.

| Column | Type | Description |
|---|---|---|
| id | BIGINT | Auto-increment primary key |
| job_id | VARCHAR(64) | Unique job identifier (UUID) |
| status | VARCHAR(32) | pending, running, completed, failed, cancelled |
| bates_prefix | VARCHAR(32) | Bates number prefix |
| bates_start | INT | Starting Bates number |
| bates_end | INT | Ending Bates number (after completion) |
| total_emails | INT | Total emails to process |
| processed | INT | Emails processed so far |
| failed | INT | Emails that failed processing |
| config_json | TEXT | JSON export configuration |
| output_path | VARCHAR(512) | Path to output directory |
| output_size | BIGINT | Total size of export files |
| error_message | TEXT | Error details if failed |
| created_by | VARCHAR(255) | Email of user who created export |
| created_at | TIMESTAMP | Creation timestamp |
| started_at | TIMESTAMP | Processing start time |
| completed_at | TIMESTAMP | Completion timestamp |

ediscovery_bates_sequences

Tracks Bates number sequences for continuity.

| Column | Type | Description |
|---|---|---|
| id | BIGINT | Auto-increment primary key |
| prefix | VARCHAR(32) | Bates prefix (unique) |
| last_number | INT | Last assigned number |
| updated_at | TIMESTAMP | Last update timestamp |
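
To illustrate what "continuity" means here, the sketch below reserves a block of consecutive numbers for a prefix. It uses an in-memory SQLite database as a stand-in for the tenant database, and it assumes that a new prefix starts at 1 and that an explicit start number overrides the stored sequence. The function names are hypothetical and the real allocation logic may differ.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the ediscovery_bates_sequences table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ediscovery_bates_sequences ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " prefix VARCHAR(32) UNIQUE,"
        " last_number INT,"
        " updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def reserve_bates_range(conn, prefix, count, start_number=0):
    """Reserve `count` consecutive numbers and return (first, last)."""
    with conn:  # commit on success, roll back on error
        row = conn.execute(
            "SELECT last_number FROM ediscovery_bates_sequences WHERE prefix = ?",
            (prefix,),
        ).fetchone()
        last = row[0] if row else 0
        # 0 means "continue from the last assigned number".
        first = start_number if start_number > 0 else last + 1
        end = first + count - 1
        if row:
            conn.execute(
                "UPDATE ediscovery_bates_sequences"
                " SET last_number = ?, updated_at = CURRENT_TIMESTAMP"
                " WHERE prefix = ?",
                (max(last, end), prefix),
            )
        else:
            conn.execute(
                "INSERT INTO ediscovery_bates_sequences (prefix, last_number)"
                " VALUES (?, ?)",
                (prefix, end),
            )
    return first, end
```

Two successive calls for the same prefix return adjacent, non-overlapping ranges, which is the property the table exists to guarantee.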

ediscovery_export_items

Maps individual emails to their Bates numbers.

| Column | Type | Description |
|---|---|---|
| id | BIGINT | Auto-increment primary key |
| job_id | VARCHAR(64) | Reference to export job |
| email_id | BIGINT | Reference to archived email |
| bates_number | VARCHAR(64) | Assigned Bates number |
| status | VARCHAR(32) | success, failed, skipped |
| error_message | TEXT | Error if failed |
| processed_at | TIMESTAMP | Processing timestamp |

ediscovery_download_log

Audit trail for export downloads.

| Column | Type | Description |
|---|---|---|
| id | BIGINT | Auto-increment primary key |
| job_id | VARCHAR(64) | Reference to export job |
| downloaded_by | VARCHAR(255) | User who downloaded |
| downloaded_at | TIMESTAMP | Download timestamp |
| ip_address | VARCHAR(45) | Client IP address |
| user_agent | TEXT | Browser user agent |

Global Table

Created in the main piler database:

ediscovery_active_exports

Tracks currently running exports across all tenants for enforcing global limits.

| Column | Type | Description |
|---|---|---|
| id | BIGINT | Auto-increment primary key |
| tenant_id | VARCHAR(32) | Tenant identifier |
| job_id | VARCHAR(64) | Export job ID |
| started_at | TIMESTAMP | When export started |

Export Configuration Object

When creating an export, the following configuration is stored as JSON:

{
  "bates_prefix": "ACME",
  "bates_start_number": 1,
  "bates_digits": 6,
  "include_natives": true,
  "include_text": true,
  "include_attachments": true,
  "organize_by_custodian": false,
  "load_file_format": "csv",
  "date_format": "iso8601",
  "email_ids": [1001, 1002, 1003],
  "workers": 4
}

Configuration Fields

| Field | Type | Default | Description |
|---|---|---|---|
| bates_prefix | string | required | Prefix for Bates numbers |
| bates_start_number | int | 0 (auto) | Starting number; 0 continues from the last assigned number |
| bates_digits | int | 6 | Number of digits in Bates number |
| include_natives | bool | true | Include original EML files |
| include_text | bool | false | Extract plain text files |
| include_attachments | bool | false | Extract attachments separately |
| organize_by_custodian | bool | false | Create custodian folders |
| load_file_format | string | "csv" | "csv", "dat", or "both" |
| date_format | string | "iso8601" | "iso8601", "us", or "eu" |
| email_ids | []int64 | required | List of email IDs to export |
| workers | int | 4 | Parallel processing workers (1-32) |
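
The defaults and allowed values listed above can be sketched as a client-side check performed before a request is submitted. This is only an illustration: `normalize_config` is a hypothetical helper, and the server's own validation rules and error messages are not described in this document.

```python
DEFAULTS = {
    "bates_start_number": 0,
    "bates_digits": 6,
    "include_natives": True,
    "include_text": False,
    "include_attachments": False,
    "organize_by_custodian": False,
    "load_file_format": "csv",
    "date_format": "iso8601",
    "workers": 4,
}


def normalize_config(raw: dict) -> dict:
    """Fill in documented defaults and reject values outside the documented sets."""
    # bates_prefix and email_ids have no default and must be supplied.
    for key in ("bates_prefix", "email_ids"):
        if not raw.get(key):
            raise ValueError(f"missing required field: {key}")

    cfg = {**DEFAULTS, **raw}

    if cfg["load_file_format"] not in ("csv", "dat", "both"):
        raise ValueError("load_file_format must be csv, dat, or both")
    if cfg["date_format"] not in ("iso8601", "us", "eu"):
        raise ValueError("date_format must be iso8601, us, or eu")
    if not 1 <= cfg["workers"] <= 32:
        raise ValueError("workers must be between 1 and 32")
    return cfg
```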

Load File Formats

CSV Format

  • Delimiter: Comma (,)
  • Quote character: Double quote (")
  • Encoding: UTF-8 with BOM
  • Line endings: CRLF (\r\n)
  • Header row: Yes
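
For anyone producing or checking a compatible file, the CSV properties above map onto Python's standard `csv` module roughly as follows. The column names in the usage example (`BegBates`, `Subject`) are placeholders, since this document does not list the actual load file columns.

```python
import csv
import io


def write_csv_loadfile(stream, header, rows) -> None:
    """Write rows to a binary stream as UTF-8 with BOM, CRLF line endings."""
    # "utf-8-sig" emits the byte order mark once, at the start of the output.
    # newline="" stops the text layer from translating the CRLF terminators.
    text = io.TextIOWrapper(stream, encoding="utf-8-sig", newline="")
    writer = csv.writer(text, delimiter=",", quotechar='"', lineterminator="\r\n")
    writer.writerow(header)
    writer.writerows(rows)
    text.flush()
    text.detach()  # leave the underlying stream open for the caller
```

For example, writing the header `["BegBates", "Subject"]` and the row `["ACME_000001", "Hello, world"]` to an `io.BytesIO` buffer yields a BOM followed by two CRLF-terminated lines, with the subject quoted because it contains a comma.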

DAT Format (Concordance)

  • Field delimiter: þ (U+00FE, Concordance default)
  • Text qualifier: ¶ (U+00B6, Concordance default)
  • Newline replacement: ® (U+00AE)
  • Encoding: UTF-8
  • Line endings: CRLF (\r\n)
  • Header row: Yes
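
The sketch below formats a single DAT record using the characters exactly as listed above. It assumes that every field is wrapped in the text qualifier and that embedded line breaks of any kind are replaced with the newline substitute. Verify these roles against an actual export before relying on them in a review platform.

```python
FIELD_DELIMITER = "\u00fe"   # þ, as listed in this document
TEXT_QUALIFIER = "\u00b6"    # ¶, as listed in this document
NEWLINE_SUBSTITUTE = "\u00ae"  # ®


def dat_line(values) -> str:
    """Format one record as a DAT line terminated by CRLF (illustrative)."""
    cleaned = []
    for value in values:
        text = str(value)
        # Replace CRLF first so it becomes a single substitute character.
        for newline in ("\r\n", "\n", "\r"):
            text = text.replace(newline, NEWLINE_SUBSTITUTE)
        cleaned.append(f"{TEXT_QUALIFIER}{text}{TEXT_QUALIFIER}")
    return FIELD_DELIMITER.join(cleaned) + "\r\n"
```

Encode the resulting string as UTF-8 when writing it to disk, matching the encoding listed above.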

File Naming Conventions

Native Files

{BATES_PREFIX}_{NUMBER}.eml
Example: ACME_000001.eml

Text Files

{BATES_PREFIX}_{NUMBER}.txt
Example: ACME_000001.txt

Attachments

{BATES_PREFIX}_{NUMBER}_{ATTACHMENT_SEQ}.{ext}
Example: ACME_000001_001.pdf
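
The naming patterns above can be reproduced with a few small helpers, which is handy when reconciling exported files against a load file. The function names are hypothetical, the zero padding follows `bates_digits`, and the three-digit attachment sequence is inferred from the single example shown here.

```python
def bates_stem(prefix: str, number: int, digits: int = 6) -> str:
    """Build the shared file name stem, e.g. ACME_000001."""
    return f"{prefix}_{number:0{digits}d}"


def native_name(prefix: str, number: int, digits: int = 6) -> str:
    return bates_stem(prefix, number, digits) + ".eml"


def text_name(prefix: str, number: int, digits: int = 6) -> str:
    return bates_stem(prefix, number, digits) + ".txt"


def attachment_name(prefix: str, number: int, seq: int, ext: str, digits: int = 6) -> str:
    # Attachment sequence is padded to three digits (inferred from the example).
    return f"{bates_stem(prefix, number, digits)}_{seq:03d}.{ext}"
```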

Custodian Organization

When organize_by_custodian is enabled, files are grouped into one folder per custodian:

{CUSTODIAN_EMAIL}/
├── NATIVES/
│   └── ACME_000001.eml
├── TEXT/
│   └── ACME_000001.txt
└── ATTACHMENTS/
    └── ACME_000001_001.pdf
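
The difference between the flat and per-custodian layouts can be sketched as a single path helper. It assumes the custodian folder sits directly under the job directory, which this document does not state explicitly; `output_path` and the `kind` keys are hypothetical.

```python
from pathlib import PurePosixPath

FOLDERS = {"native": "NATIVES", "text": "TEXT", "attachment": "ATTACHMENTS"}


def output_path(job_dir: str, kind: str, filename: str, custodian: str = None) -> PurePosixPath:
    """Resolve where a file lands, with or without custodian folders."""
    root = PurePosixPath(job_dir)
    if custodian:
        # One folder per custodian email address, as in the tree above.
        root = root / custodian
    return root / FOLDERS[kind] / filename
```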

API Endpoints

Auditor Endpoints (require auditor role)

| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/ediscovery/export | Create new export |
| GET | /api/v1/ediscovery/exports | List all exports |
| GET | /api/v1/ediscovery/exports/:job_id | Get export details |
| POST | /api/v1/ediscovery/exports/:job_id/cancel | Cancel running export |
| DELETE | /api/v1/ediscovery/exports/:job_id | Delete export |
| GET | /api/v1/ediscovery/exports/:job_id/download | Download export ZIP |
| GET | /api/v1/ediscovery/bates-prefixes | Get Bates prefix sequences |
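
As a hedged example of calling the create endpoint, the sketch below builds (but does not send) a request with Python's standard library. The bearer-token header and the assumption that the request body is the export configuration object are not confirmed by this document, and the host name in the usage note is a placeholder.

```python
import json
import urllib.request


def build_create_export_request(base_url: str, config: dict, token: str) -> urllib.request.Request:
    """Prepare a POST to the export endpoint (auth scheme is an assumption)."""
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/ediscovery/export",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed authentication scheme; adjust to your deployment.
            "Authorization": f"Bearer {token}",
        },
    )
```

The returned object can be passed to `urllib.request.urlopen` once the base URL (for example `https://archive.example.com`) and credentials match your installation.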

Admin Endpoints (require admin role)

| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/admin/ediscovery/exports | List all tenant exports |
| GET | /api/v1/admin/ediscovery/bates-prefixes | Get all Bates prefixes |

Security Considerations

Access Control

  • Only users with auditor role (role_id = 2) can create exports
  • Users can only see/download their own tenant's exports
  • Download actions are logged with IP and user agent

File System Security

  • Export directories should be on encrypted volumes
  • Restrict filesystem permissions on the export directory so that only the account running Piler can read it

Data Retention

  • Exports remain until manually deleted
  • Consider implementing automatic cleanup policies
  • Download logs are retained indefinitely for audit purposes
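
Because no automatic cleanup is built in, a retention policy has to be scripted separately. The sketch below only lists job directories older than a given age and deletes nothing. It assumes the `{tenant_id}/ediscovery/{job_id}` layout described earlier and uses directory modification time as a proxy for export age. Removing the export through the DELETE endpoint is probably safer than deleting directories by hand, since the job metadata lives in the database.

```python
import time
from pathlib import Path


def stale_export_dirs(export_root, max_age_days, now=None):
    """Return job directories whose mtime is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    # Matches {export_root}/{tenant_id}/ediscovery/{job_id}
    for job_dir in Path(export_root).glob("*/ediscovery/*"):
        if job_dir.is_dir() and job_dir.stat().st_mtime < cutoff:
            stale.append(job_dir)
    return sorted(stale)
```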

Migration

Apply the migrations to create the required tables. In the first command, replace tenant_db with the name of each tenant database in turn:

# Run migration 014_ediscovery_export.sql on each tenant database
mysql -u piler -p tenant_db < util/migrations/sql/014_ediscovery_export.sql

# Run global migration on piler database
mysql -u piler -p piler < util/migrations/sql/014_ediscovery_global.sql