From d8b2d88c73615d6697c9fea1a4f0a341dd719423 Mon Sep 17 00:00:00 2001
From: Spendlik
Date: Mon, 22 Jun 2026 05:31:21 +0000
Subject: [PATCH] Add Stirling PDF deployment guide (CT 113)

---
 113_stirling_pdf_deployment.md | 339 +++++++++++++++++++++++++++++++++
 1 file changed, 339 insertions(+)
 create mode 100644 113_stirling_pdf_deployment.md

diff --git a/113_stirling_pdf_deployment.md b/113_stirling_pdf_deployment.md
new file mode 100644
index 0000000..002e03b
--- /dev/null
+++ b/113_stirling_pdf_deployment.md
@@ -0,0 +1,339 @@
+# 113 — Stirling PDF Deployment Guide
+
+> Status: **PLANNED** — not yet deployed
+> CT ID: 113 · IP: 192.168.1.113
+> Domain: `pdf.spendlik.sk`
+> Last updated: 2026-06-22
+
+---
+
+## Overview
+
+Stirling PDF is a self-hosted, open-source PDF toolkit with 50+ operations (merge, split, OCR, compress, convert, redact, sign, rotate, watermark, etc.). All processing happens locally — no documents leave the network.
+
+Primary use case in this homelab: PDF preprocessing for the Paperless-ngx pipeline (CT 111). Potential n8n integration for automated document processing.
+
+Stack: single Docker container (Java/Spring Boot backend serving a bundled web frontend), no database required.
+
+---
+
+## Resource Allocation
+
+| Resource | Allocation |
+|---|---|
+| **CT ID** | 113 |
+| **IP** | 192.168.1.113 |
+| **CPUs** | 2 |
+| **RAM** | 1 GB (idle ~512 MB; OCR/conversion peaks higher) |
+| **Disk** | 8 GB |
+| **Template** | Debian 13 (trixie) |
+| **Privileged** | Yes (`--unprivileged 0`; simplifies running Docker inside LXC) |
+| **Nesting** | Enabled (`features: nesting=1`) |
+
+---
+
+## Phase 1 — Create LXC Container
+
+In the Proxmox web UI terminal on the host:
+
+```bash
+pct create 113 local:vztmpl/debian-13-standard_13.0-1_amd64.tar.zst \
+  --hostname stirling-pdf \
+  --cores 2 \
+  --memory 1024 \
+  --swap 512 \
+  --rootfs local-lvm:8 \
+  --net0 name=eth0,bridge=vmbr0,ip=192.168.1.113/24,gw=192.168.1.1 \
+  --unprivileged 0 \
+  --features nesting=1 \
+  --ostype debian \
+  --start 1
+```
+
+Enter the container:
+
+```bash
+pct enter 113
+```
+
+---
+
+## Phase 2 — Base Setup
+
+```bash
+apt update && apt upgrade -y
+apt install -y nano curl ca-certificates gnupg lsb-release
+```
+
+---
+
+## Phase 3 — Install Docker
+
+```bash
+install -m 0755 -d /etc/apt/keyrings
+curl -fsSL https://download.docker.com/linux/debian/gpg | gpg --dearmor -o /etc/apt/keyrings/docker.gpg
+chmod a+r /etc/apt/keyrings/docker.gpg
+
+echo \
+  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
+  https://download.docker.com/linux/debian \
+  $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null
+
+apt update
+apt install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin
+```
+
+Verify:
+
+```bash
+docker run --rm hello-world
+```
+
+---
+
+## Phase 4 — Deploy Stirling PDF
+
+```bash
+mkdir -p /opt/stirling-pdf
+cd /opt/stirling-pdf
+nano docker-compose.yml
+```
+
+Paste:
+
+```yaml
+services:
+  stirling-pdf:
+    image: stirlingtools/stirling-pdf:latest
+    container_name: stirling-pdf
+    restart: unless-stopped
+    ports:
+      - "8080:8080"
+    volumes:
+      - ./trainingData:/usr/share/tessdata
+      - ./extraConfigs:/configs
+      - ./logs:/logs
+    environment:
+      - DOCKER_ENABLE_SECURITY=false
+      - INSTALL_BOOK_AND_ADVANCED_HTML_OPS=false
+      - LANGS=en_GB
+```
+
+> ⚠️ `DOCKER_ENABLE_SECURITY=false` is correct for setups where authentication is handled externally by Authelia. Do not enable internal login as well — it conflicts.
+
+Start:
+
+```bash
+docker compose up -d
+docker compose logs -f
+```
+
+Wait for the Spring Boot startup message (a line beginning with `Started` and ending with the startup time in seconds), press Ctrl+C to stop following the logs, then verify locally:
+
+```bash
+curl -s http://localhost:8080 | grep -i stirling
+```
+
+---
+
+## Phase 5 — nginx Reverse Proxy (CT 101)
+
+From the Proxmox host, enter CT 101:
+
+```bash
+pct enter 101
+nano /etc/nginx/sites-available/stirling-pdf
+```
+
+Paste:
+
+```nginx
+server {
+    listen 80;
+    server_name pdf.spendlik.sk;
+
+    location / {
+        proxy_pass http://192.168.1.113:8080;
+        proxy_set_header Host $host;
+        proxy_set_header X-Real-IP $remote_addr;
+        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+        proxy_set_header X-Forwarded-Proto $scheme;
+        client_max_body_size 100M;
+    }
+}
+```
+
+> ⚠️ `client_max_body_size 100M` is important: nginx defaults to 1 MB, so larger PDFs are rejected with HTTP 413 without it.
+
+Enable and reload:
+
+```bash
+ln -s /etc/nginx/sites-available/stirling-pdf /etc/nginx/sites-enabled/
+nginx -t && systemctl reload nginx
+```
+
+---
+
+## Phase 6 — SSL Certificate
+
+Still in CT 101 (the DNS record from Phase 7 must already resolve publicly, or certbot's HTTP-01 challenge fails):
+
+```bash
+certbot --nginx -d pdf.spendlik.sk
+```
+
+> ⚠️ After certbot runs, always inspect the config:
+
+```bash
+cat /etc/nginx/sites-available/stirling-pdf
+```
+
+Check for:
+- Duplicate `server_name` directives
+- Missing closing `}` brace
+- `listen 443 ssl` block correctly added
+
+If anything looks wrong, fix manually — do not re-run certbot without correcting first.
+
+---
+
+## Phase 7 — DNS Record
+
+In WebSupport admin panel:
+
+1. Add A record: `pdf` → current public IP
+2. **Check both DNS management pages** — missing the second page has caused outages before
+3. Note the numeric record ID assigned by WebSupport
+4. Add the record ID to `00_index.md` DNS table
+
+---
+
+## Phase 8 — DDNS Updater (CT 108)
+
+From the Proxmox host, enter CT 108:
+
+```bash
+pct enter 108
+nano /usr/local/bin/ddns-update.sh
+```
+
+Add an entry for `pdf.spendlik.sk` using the record ID obtained in Phase 7, following the existing pattern in the script.
+
+---
+
+## Phase 9 — Authelia Protection (CT 102)
+
+From the Proxmox host, enter CT 102:
+
+```bash
+pct enter 102
+nano /etc/authelia/configuration.yml
+```
+
+In the `access_control.rules` section, add a bypass rule for the Stirling PDF API (needed if wiring to n8n) **before** the catch-all 2FA rule:
+
+```yaml
+- domain: pdf.spendlik.sk
+  resources:
+    - "^/api/.*"
+  policy: bypass
+
+- domain: pdf.spendlik.sk
+  policy: two_factor
+```
+
+> ⚠️ Rule order is first-match-wins. The API bypass must precede the catch-all `two_factor` rule or n8n calls will be blocked by 2FA.
+
+Restart Authelia (from its compose directory):
+
+```bash
+docker compose restart
+```
+
+Add the Authelia `auth_request` configuration to the nginx vhost in CT 101 (refer to how other Authelia-protected services are configured, e.g. `automation.spendlik.sk`).
+
+---
+
+## Phase 10 — Verify
+
+Test from mobile data, not from the LAN (hairpin NAT makes LAN results unreliable):
+
+- `https://pdf.spendlik.sk` loads and Authelia prompts for 2FA
+- After login, Stirling PDF home page is accessible with all tool categories visible
+- Upload a test PDF and run merge or compress — file should process and download
+
+---
+
+## API Integration with n8n
+
+Stirling PDF exposes a full REST API. Swagger UI is available at:
+
+```
+https://pdf.spendlik.sk/swagger-ui/index.html
+```
+
+Example n8n HTTP Request node — compress a PDF:
+
+```
+POST https://pdf.spendlik.sk/api/v1/general/compress-pdf
+Content-Type: multipart/form-data
+
+fileInput:
+optimizeLevel: 3
+```
+
+> For API calls from n8n, the Authelia bypass rule on `/api/*` (Phase 9) allows requests without credentials. A `bypass` rule matches every client that can reach `pdf.spendlik.sk`, not only LAN hosts, unless it is restricted with a `networks` criterion; with `DOCKER_ENABLE_SECURITY=false` the API is then unauthenticated. n8n on the LAN can also call `http://192.168.1.113:8080` directly.
+
+Common operations available via API:
+
+| Operation | Endpoint |
+|---|---|
+| Merge PDFs | `POST /api/v1/general/merge-pdfs` |
+| Split PDF | `POST /api/v1/general/split-pdf` |
+| Compress PDF | `POST /api/v1/general/compress-pdf` |
+| PDF to image | `POST /api/v1/convert/pdf/img` |
+| Add OCR layer | `POST /api/v1/misc/add-ocr-pdf` |
+| Rotate pages | `POST /api/v1/general/rotate-pdf` |
+| Remove blank pages | `POST /api/v1/misc/remove-blanks` |
+
+Treat these paths as unverified until deployment; the Swagger UI on your instance is the authoritative API reference.
+
+---
+
+## Paperless-ngx Integration Ideas
+
+Stirling PDF sits naturally upstream of Paperless-ngx for preprocessing:
+
+- **Compress oversized scans** before consume — reduces NAS storage for Modelář magazines
+- **Rotate misaligned pages** from batch scans
+- **Strip metadata** from sensitive documents before ingestion
+- **OCR layer addition** for scanned PDFs that Paperless struggles with (though Paperless has its own OCR — use Stirling for pre-processing only when needed)
+
+A simple n8n workflow pattern:
+
+```
+New file in NAS consume folder (webhook or poll)
+  → Stirling PDF: compress + rotate
+  → Save back to consume folder
+  → Paperless picks it up automatically
+```
+
+---
+
+## Resource Notes
+
+- Idle RAM: ~512 MB
+- OCR operations: can spike to ~800 MB temporarily
+- LibreOffice conversions (PDF → DOCX etc.): heaviest operation, may need RAM bump to 2 GB if used frequently
+- 2 CPU cores sufficient for single-user use
+
+---
+
+## Gotchas
+
+| Issue | Fix |
+|---|---|
+| Large PDF uploads rejected | `client_max_body_size 100M` in nginx config |
+| certbot corrupts nginx config | Always inspect after issuance |
+| n8n API calls blocked by Authelia | Add `/api/*` bypass rule before catch-all |
+| Container won't start after Proxmox reboot | Check `pct config 113` for `onboot: 1`; if missing, run `pct set 113 --onboot 1` on the host |
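
The last gotcha in the table above (container not starting after a Proxmox reboot) can be checked with a short script. This is a minimal sketch, assuming `pct config` prints one `key: value` pair per line and reports start-at-boot as `onboot: 1`; `has_onboot` is a hypothetical helper name, not a Proxmox command.

```shell
#!/usr/bin/env bash
# Sketch only: has_onboot is a hypothetical helper, not part of Proxmox.
# It reads `pct config <ctid>` output on stdin and succeeds (exit 0)
# only if a line of the form "onboot: 1" is present.
has_onboot() {
  grep -Eq '^onboot:[[:space:]]*1[[:space:]]*$'
}

# Intended use on the Proxmox host (CT ID 113 as in this guide):
#   pct config 113 | has_onboot || pct set 113 --onboot 1
```

Keeping the check as a filter on stdin means it can be tried against saved `pct config` output before it is run on the host.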