Paperless-ngx Deployment — CT 111
Overview
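Self-hosted document management system with multi-language OCR. Deployed on CT 111 via Docker Compose, accessible at paperless.spendlik.sk. Documents, media, and Paperless application data are stored on the NAS; the PostgreSQL and Redis volumes remain Docker named volumes on the container disk.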
| Property | Value |
|---|---|
| Container | CT 111 |
| Hostname | paperless |
| IP | 192.168.1.111 |
| OS | Debian 13 (privileged LXC, nesting=1) |
| URL | https://paperless.spendlik.sk |
| Internal port | 8000 |
| Compose file | /opt/paperless/docker-compose.yml |
| NAS mount (host) | /mnt/pve/spendlik-nas/data/paperless |
| NAS mount (CT) | /mnt/paperless |
LXC Configuration
# /etc/pve/lxc/111.conf
arch: amd64
cores: 2
features: nesting=1
hostname: paperless
memory: 8192
mp1: /mnt/pve/spendlik-nas/data/paperless,mp=/mnt/paperless,shared=1
net0: name=eth0,bridge=vmbr0,gw=192.168.1.1,hwaddr=BC:24:11:A8:11:71,ip=192.168.1.111/24,type=veth
ostype: debian
rootfs: local-lvm:vm-111-disk-0,size=50G
startup: order=5,up=30
swap: 1024
⚠️ RAM is set to 8192MB — this was raised from 4GB to handle bulk OCR. Should be reduced to 2048MB once bulk imports are complete.
NAS Directory Structure
The entire /mnt/pve/spendlik-nas/data/paperless is bind-mounted into CT 111 at /mnt/paperless. Subdirectories:
| Path (inside CT) | Purpose |
|---|---|
| /mnt/paperless/consume | Drop files here for automatic ingestion |
| /mnt/paperless/export | Export destination |
| /mnt/paperless/media | Processed documents (originals, archive, thumbnails) |
| /mnt/paperless/data | Paperless application data (search index, classifier, etc.) |
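If the NAS bind mount is not attached, /mnt/paperless is an empty directory inside the CT. A minimal pre-flight sketch, assuming a POSIX shell inside CT 111, can confirm all four subdirectories exist before the stack is started. The function name missing_paperless_dirs is hypothetical and not part of Paperless-ngx.

```shell
# Print every expected subdirectory that is missing under the base path.
# Returns 0 when all four exist, 1 when at least one is missing.
# The base path defaults to the CT-side mount point documented above.
missing_paperless_dirs() {
    base="${1:-/mnt/paperless}"
    status=0
    for sub in consume export media data; do
        if [ ! -d "$base/$sub" ]; then
            echo "$base/$sub"
            status=1
        fi
    done
    return "$status"
}
```

For example, `missing_paperless_dirs && docker compose up -d` would start the stack only when nothing is missing.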
Docker Compose
Located at /opt/paperless/docker-compose.yml:
services:
  broker:
    image: redis:7
    restart: unless-stopped
    volumes:
      - redis_data:/data
  db:
    image: postgres:16
    restart: unless-stopped
    volumes:
      - pg_data:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: paperless
      POSTGRES_USER: paperless
      POSTGRES_PASSWORD: <see Vaultwarden>
  webserver:
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    restart: unless-stopped
    user: root
    depends_on:
      - db
      - broker
    ports:
      - "8000:8000"
    volumes:
      - /mnt/paperless/data:/usr/src/paperless/data
      - /mnt/paperless/media:/usr/src/paperless/media
      - /mnt/paperless/export:/usr/src/paperless/export
      - /mnt/paperless/consume:/usr/src/paperless/consume
    environment:
      PAPERLESS_REDIS: redis://broker:6379
      PAPERLESS_DBHOST: db
      PAPERLESS_DBNAME: paperless
      PAPERLESS_DBUSER: paperless
      PAPERLESS_DBPASS: <see Vaultwarden>
      PAPERLESS_URL: https://paperless.spendlik.sk
      PAPERLESS_SECRET_KEY: <see Vaultwarden>
      PAPERLESS_TIME_ZONE: Europe/Bratislava
      PAPERLESS_OCR_LANGUAGE: slk+ces+rus+hun+deu+eng
      PAPERLESS_OCR_LANGUAGES: slk ces rus hun deu eng
volumes:
  redis_data:
  pg_data:
ℹ️ media and data were originally Docker named volumes. They were migrated to NAS bind mounts after the container disk filled up during bulk OCR. See migration notes below.
Docker Container Names
| Name | Image | Purpose |
|---|---|---|
| paperless-webserver-1 | ghcr.io/paperless-ngx/paperless-ngx:latest | Main app + Celery worker + consumer |
| paperless-db-1 | postgres:16 | Database |
| paperless-broker-1 | redis:7 | Task queue |
nginx Reverse Proxy (CT 101)
Config at /etc/nginx/sites-available/paperless.spendlik.sk:
server {
    server_name paperless.spendlik.sk;
    location / {
        proxy_pass http://192.168.1.111:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
    listen 443 ssl; # managed by Certbot
    ssl_certificate /etc/letsencrypt/live/paperless.spendlik.sk/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/paperless.spendlik.sk/privkey.pem;
    include /etc/letsencrypt/options-ssl-nginx.conf;
    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;
}
server {
    if ($host = paperless.spendlik.sk) {
        return 301 https://$host$request_uri;
    }
    listen 80;
    server_name paperless.spendlik.sk;
    return 404;
}
Consuming Documents
Automatic (inotify watcher)
Drop files into /mnt/paperless/consume — the consumer detects new files automatically via inotify and queues them. The consumer runs inside the paperless-webserver-1 container.
Manual trigger (for pre-existing files)
The inotify watcher only detects new file additions, not files already present when the container starts. To process existing files:
cd /opt/paperless
docker compose exec webserver python3 manage.py document_consumer --oneshot
⚠️ The flag is --oneshot (one word), not --one-shot.
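The manual trigger can be wrapped so it only runs when files are actually waiting. This is a sketch under a few assumptions: the helper names pending_consume_files and consume_pending are hypothetical, the Compose project lives in /opt/paperless as documented above, and only top-level files are counted because the Compose file does not set PAPERLESS_CONSUMER_RECURSIVE.

```shell
# Count regular files sitting directly in the consume directory.
# Subdirectories are ignored (recursive consumption is assumed to be off).
pending_consume_files() {
    dir="${1:-/mnt/paperless/consume}"
    find "$dir" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# Run the one-shot consumer only when at least one file is pending.
consume_pending() {
    count=$(pending_consume_files "$1")
    if [ "$count" -gt 0 ]; then
        echo "Processing $count pending file(s)"
        (cd /opt/paperless &&
            docker compose exec webserver python3 manage.py document_consumer --oneshot)
    else
        echo "Nothing to consume"
    fi
}
```

Running `consume_pending` with no argument checks the default consume path inside the CT.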
If consumer process is not running
Check with:
docker compose exec webserver ps aux | grep consumer
If missing, restart the webserver container:
docker compose restart webserver
Then watch logs to confirm consumer starts:
docker compose logs webserver --tail=50 -f
Look for: Using inotify to watch directory for changes: /usr/src/paperless/consume
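Instead of reading the log by eye, the startup line can be matched with grep. A minimal sketch, assuming the log message text is exactly as quoted above; the function name consumer_started is hypothetical.

```shell
# Read log text on stdin and succeed (exit 0) if the consumer's
# inotify startup message appears anywhere in it.
consumer_started() {
    grep -q -F 'Using inotify to watch directory for changes'
}

# Usage, run from /opt/paperless:
#   docker compose logs webserver --tail=200 | consumer_started \
#       && echo "consumer is watching" \
#       || echo "consumer startup line not found"
```

A fixed-string match (-F) is used so the slashes and colons in the message are not interpreted as a pattern.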
Supported File Types
Paperless-ngx supports PDF and common image formats (JPG, PNG, etc.). .djvu files are not supported and will be skipped with a warning.
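Since .djvu files are skipped, it can help to list them before a bulk import so they can be converted or moved aside. A small sketch, with the hypothetical helper name list_djvu_files; it relies on find's -iname option, which GNU find on Debian provides.

```shell
# List .djvu files (any letter case) under the given directory.
# Defaults to the consume directory documented above.
list_djvu_files() {
    dir="${1:-/mnt/paperless/consume}"
    find "$dir" -type f -iname '*.djvu'
}
```

Piping the output to `wc -l` gives a quick count of files that will not be ingested.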
OCR Notes
- 6 languages configured: Slovak, Czech, Russian, Hungarian, German, English
- Tesseract warnings about "lots of diacritics" and "too few characters" are normal for old scanned magazines — not errors
- OCR is CPU-intensive; bulk imports require adequate RAM (8GB during bulk, can reduce to 2GB after)
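Before starting a bulk import, the RAM figure above can be checked from inside the CT. This sketch assumes /proc/meminfo in the container reflects the LXC memory limit, which is the usual behaviour but is not confirmed in these notes; the function names mem_total_mib and has_ram_mib are hypothetical.

```shell
# Print total memory in MiB, read from a meminfo-style file
# (MemTotal is reported in kB, so divide by 1024).
mem_total_mib() {
    awk '/^MemTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Succeed when total memory is at least the required number of MiB.
# Second argument optionally overrides the meminfo file path.
has_ram_mib() {
    required="$1"
    total=$(mem_total_mib "$2")
    [ "$total" -ge "$required" ]
}
```

For example, `has_ram_mib 8192 || echo "raise memory before bulk OCR"` warns when the container is still at the reduced setting.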
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| Files in consume folder not processing | Consumer process died (OOM kill) | docker compose restart webserver |
| HTTP 500 on web UI | Container disk full | Check disk: df -h; migrate volumes to NAS or resize disk |
| chown: Invalid argument in logs | NAS mount doesn't allow ownership changes | Harmless; files still process correctly |
| OOM kill of worker | Insufficient RAM during bulk OCR | pct set 111 --memory 8192 --swap 1024 on Proxmox host |
| Tasks show as failed in UI | OOM kill mid-processing | Re-trigger with --oneshot; failed tasks can be cleared from UI |
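The disk-full case in the table can be caught early with a usage check. A minimal sketch using the portable `df -P` output format; the helper names and the 90% default threshold are assumptions chosen for illustration.

```shell
# Print the used-space percentage (digits only) for the filesystem
# that contains the given path. Defaults to the root filesystem.
disk_usage_pct() {
    df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed when usage is at or above the limit (default 90 percent).
disk_nearly_full() {
    limit="${2:-90}"
    [ "$(disk_usage_pct "$1")" -ge "$limit" ]
}
```

For example, `disk_nearly_full / && echo "root disk above 90%"` could run from cron inside CT 111 during bulk imports.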
Deployment History & Key Events
- Initial deploy: CT 111 created, Docker + Paperless stack deployed, NAS consume/export mounted via Proxmox bind mount
- Disk fill: Container root disk (20GB) filled during bulk OCR of 500+ magazines; resized to 50GB (pct resize 111 rootfs +30G)
- OOM kills: 2GB RAM insufficient for 6-language bulk OCR; raised to 4GB then 8GB
- NAS migration: media and data Docker named volumes migrated to NAS bind mounts (/mnt/paperless/media and /mnt/paperless/data) to avoid future disk issues. Migration done via docker cp without reprocessing.
- Bulk import: 500+ scanned Czech/Slovak modelling magazines (Letecký Modelář, 1950s–1960s) imported
Planned: Gemini Post-Processing
Future project to run nightly Gemini API post-processing on documents to improve OCR text, suggest tags, and improve titles. See obsidian-vault/02 Projects/Gemini Post-Processing for Paperless.md.