Deploying LibreOffice Online (Collabora) on Kubernetes: Self‑Hosted Collaboration for Teams

Unknown
2026-02-21
10 min read

Hands‑on guide to run Collabora (LibreOffice Online) with Nextcloud on Kubernetes—TLS, storage, scaling and backups for production (2026).

Stop losing work to cloud lock‑in: run Collabora (LibreOffice Online) behind Nextcloud on Kubernetes

Teams that care about privacy, cost control and offline‑first governance increasingly choose self‑hosted collaboration. But running Collabora (the engine behind LibreOffice Online) in production, in a way that is secure, scalable and resilient, raises practical questions: how to terminate TLS, where to put persistent state, how to scale editor pods under load, and how to back everything up so you can recover from failure. This guide (2026‑aware) gives a hands‑on path to running Collabora + Nextcloud on Kubernetes with production‑grade TLS, storage, scaling and backup strategies.

At a glance: what you will get

  • Reference architecture for Collabora + Nextcloud on Kubernetes (Ingress + cert‑manager TLS).
  • Production manifests for Deployment, Service, Ingress, HPA and PVC examples.
  • Storage choices explained (ephemeral vs persistent, Longhorn, Rook/Ceph, hostPath, cloud volumes).
  • Scaling & performance strategies, including session affinity and autoscaling tips.
  • Backups & recovery using Velero/CSI snapshots and application‑level backups for Nextcloud.
  • Security & operations: network policies, RBAC, TLS best practices and upgrade notes for 2026.

Why self‑host Collabora + Nextcloud in 2026?

Through 2025 and into 2026, two trends accelerated adoption of self‑hosted stacks: stronger data sovereignty requirements and mature cloud native tooling (stable CSI snapshot APIs, cert‑manager maturity, and richer autoscaling). Collabora provides a LibreOffice‑based editor that integrates with Nextcloud via WOPI. Running these together gives teams a full, privacy‑first docs platform without vendor lock‑in.

Note: Collabora provides Collabora Online Development Edition (CODE) images suitable for self‑hosting; Collabora Ltd. also offers enterprise support for production SLAs and commercial releases.

High‑level architecture

Keep the architecture simple and secure:

  • Nextcloud cluster (StatefulSet/Deployment) + backing object storage (S3/MinIO) and DB (MariaDB/Postgres).
  • Collabora CODE Deployment(s) behind an Ingress with TLS terminated by cert‑manager.
  • Internal NetworkPolicy so Collabora only accepts connections from Nextcloud and trusted users.
  • Persistent volumes where necessary for caching; treat Collabora as largely stateless—Nextcloud stores user files.
  • Backups: Velero for cluster resources + database dumps + object storage snapshots.

Prerequisites

  • Kubernetes cluster (1.26+ recommended; ensure the CSI snapshot CRDs and metrics-server are installed).
  • kubectl configured and admin rights.
  • ingress controller (NGINX ingress or Traefik) installed.
  • cert‑manager installed (ACME issuer for public DNS or internal CA for private clusters).
  • Nextcloud running (or plan to deploy it alongside Collabora).
  • Persistent storage (Longhorn, Rook/Ceph, cloud PVs or hostPath/testing).
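Each prerequisite can be sanity-checked from the command line before you start; a quick sketch (namespace names assume default installs of ingress-nginx and cert-manager):

```shell
# Confirm cluster version and required add-ons.
kubectl version                     # server should report 1.26+
kubectl get ingressclass            # expect an "nginx" or "traefik" class
kubectl get pods -n cert-manager    # controller and webhook should be Running
kubectl top nodes                   # errors out if metrics-server is absent
kubectl get volumesnapshotclasses   # empty/missing means no CSI snapshot support
```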

Step 1 — Deploy Collabora CODE (reference manifest)

Collabora provides the collabora/code image. Note that the container enables TLS internally by default; when TLS is terminated at the ingress instead (as in this guide), disable internal SSL with the extra_params environment variable (--o:ssl.enable=false --o:ssl.termination=true). Below is a lean Deployment that sets sane resource limits and runs the CODE server.

# collabora-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: collabora
  labels:
    app: collabora
spec:
  replicas: 2
  selector:
    matchLabels:
      app: collabora
  template:
    metadata:
      labels:
        app: collabora
    spec:
      securityContext:
        runAsNonRoot: true
      containers:
      - name: collabora
        image: collabora/code:latest   # pin a specific release tag in production
        imagePullPolicy: IfNotPresent
        env:
        - name: domain
          # regex of WOPI hosts allowed to connect; escape literal dots
          value: "nextcloud\\.example\\.com"
        - name: extra_params
          # TLS terminates at the ingress, so disable the container's internal SSL
          value: "--o:ssl.enable=false --o:ssl.termination=true"
        - name: username
          value: "admin"
        - name: password
          value: "change-me"   # set a real admin-console password, ideally via a Secret
        ports:
        - containerPort: 9980
          name: http
        resources:
          requests:
            cpu: "250m"
            memory: "512Mi"
          limits:
            cpu: "1"
            memory: "1Gi"
        readinessProbe:
          httpGet:
            path: /hosting/discovery
            port: http
          initialDelaySeconds: 10
          periodSeconds: 20
        livenessProbe:
          httpGet:
            # the admin page requires authentication on recent images, and the
            # /loleaflet path was renamed; discovery is a stable unauthenticated check
            path: /hosting/discovery
            port: http
          initialDelaySeconds: 30
          periodSeconds: 60

Notes:

  • Set domain to Nextcloud's FQDN (it is a regex limiting which WOPI hosts can attach; escape literal dots).
  • Use a Deployment with multiple replicas for redundancy; scale to match concurrent edit sessions (see scaling section).
  • Probes help the load balancer avoid routing to cold or unhealthy pods.
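Rather than hard-coding admin-console credentials in the Deployment, they can come from a Secret; a sketch, where the Secret name collabora-admin is an assumption:

```yaml
# collabora-admin-secret.yaml (the Secret name is an assumption)
apiVersion: v1
kind: Secret
metadata:
  name: collabora-admin
type: Opaque
stringData:
  username: admin
  password: change-me        # replace before applying
```

In the container spec, the username and password env vars can then use valueFrom.secretKeyRef pointing at this Secret instead of inline values.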

Step 2 — Service + Ingress + TLS (cert‑manager)

Create a Service and an Ingress that fronts Collabora. For public access, terminate TLS at the Ingress using cert‑manager and an ACME issuer (Let's Encrypt or a private CA).

# collabora-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: collabora
spec:
  type: ClusterIP
  selector:
    app: collabora
  ports:
  - name: http
    port: 9980
    targetPort: 9980
# collabora-ingress.yaml (nginx example)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: collabora-ingress
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/proxy-body-size: "0"
    nginx.ingress.kubernetes.io/backend-protocol: "HTTP"
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "COLLAFSID"
    # long-lived websocket sessions need generous proxy timeouts
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
spec:
  ingressClassName: nginx   # replaces the deprecated kubernetes.io/ingress.class annotation
  tls:
  - hosts:
    - collabora.example.com
    secretName: collabora-tls
  rules:
  - host: collabora.example.com
    http:
      paths:
      - pathType: Prefix
        path: /
        backend:
          service:
            name: collabora
            port:
              number: 9980

Why session affinity? Collabora maintains long‑running websocket/SSE sessions for document edits. Using cookie‑based affinity at the ingress reduces reconnect churn and improves perceived performance. If using a service mesh (e.g., Istio), use consistent hashing or proxy‑side session affinity.
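The Ingress above references a letsencrypt-prod ClusterIssuer; if you have not created one yet, a minimal HTTP-01 sketch (the email address is a placeholder):

```yaml
# letsencrypt-issuer.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@example.com                  # placeholder; use a monitored mailbox
    privateKeySecretRef:
      name: letsencrypt-prod-account-key    # cert-manager stores the ACME key here
    solvers:
    - http01:
        ingress:
          ingressClassName: nginx
```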

Step 3 — Integrate with Nextcloud

  1. Install the Nextcloud Collabora Online app (available in the Nextcloud App Store).
  2. In Nextcloud Admin > Collabora Online, set the Collabora server URL to https://collabora.example.com.
  3. Ensure Nextcloud's trusted_domains includes both Nextcloud's host and your Collabora host if proxying.
  4. On private clusters, you can point Nextcloud at the internal Service DNS name: https://collabora.<namespace>.svc.cluster.local:9980, but you must still serve a valid certificate or configure Nextcloud to trust that internal URL (not recommended for public access).
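The Nextcloud side of the integration can also be scripted with occ (run inside the Nextcloud container; the richdocuments app is what the App Store lists as Collabora Online, and the activate-config command assumes a current app release):

```shell
# Point the richdocuments (Collabora Online) app at the WOPI endpoint,
# then have Nextcloud re-fetch the discovery document.
php occ config:app:set richdocuments wopi_url --value "https://collabora.example.com"
php occ richdocuments:activate-config
```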

Storage choices for Collabora

Collabora itself is mostly stateless; document storage lives in Nextcloud. However, Collabora uses caches, temp files and may benefit from persistent mounts for better warm‑starts. Choose a storage model that matches your SLA:

  • Ephemeral (emptyDir) — fastest for transient sessions; simpler but no restart durability. Good for test and small teams.
  • Persistent Volume (PVC) — mount a PVC for the cache directory (/var/cache/loolwsd on older releases; /var/cache/coolwsd after the loolwsd‑to‑coolwsd rename) to persist caches across pod restarts. Use fast SSD-backed PVs for low latency.
  • Longhorn — excellent for on‑prem clusters; snapshot and replication built in and easy to manage for small clusters.
  • Rook/Ceph — suitable for larger clusters that need highly available, distributed storage.
  • Cloud provider PVs (EBS, PD) — integrate with CSI snapshot for point‑in‑time backups; good for cloud deployments.

Recommendation (2026): use a small PVC for caches on a replicated CSI volume class (Longhorn or cloud SSD) and rely on Nextcloud's object storage for user data. That gives fast restarts and avoids re‑converting files on every attach.
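The cache-on-PVC recommendation looks like this in manifest form (the storage class and mount path are assumptions; verify the cache directory used by your image, and note that a ReadWriteOnce volume binds to one pod at a time):

```yaml
# collabora-cache-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: collabora-cache
spec:
  accessModes: ["ReadWriteOnce"]   # RWO suits one replica; use RWX for several
  storageClassName: longhorn       # assumption; any replicated SSD-backed class
  resources:
    requests:
      storage: 5Gi
```

Mount it in the Collabora container with a volumeMounts entry at the cache path and a matching persistentVolumeClaim volume; with replicas > 1, either give each pod its own claim (StatefulSet) or use a ReadWriteMany class.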

Scaling Collabora: how many pods do you need?

Collabora is CPU and memory intensive for concurrent editors. Each active document consumes CPU for rendering and conversions. Rule of thumb:

  • Start with small footprints: allow 1 CPU and 1 GB per pod (tune up based on load).
  • Estimate concurrent editors per pod: typical 4–10 concurrent lightweight editors per CPU in practice—but test with your workload.
  • Use a Horizontal Pod Autoscaler (HPA) keyed to CPU utilization and consider request‑based autoscaling with KEDA (scale on queue length or custom metrics) if you aggregate metrics externally.
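The editors-per-pod rule of thumb turns into a quick replica estimate; both inputs below are assumptions to calibrate with your own load test:

```shell
# Back-of-envelope replica count for a given editing peak.
EDITORS=60    # expected peak concurrent editors (assumption)
PER_POD=8     # editors a 1-CPU pod handles comfortably (mid-range assumption)
REPLICAS=$(( (EDITORS + PER_POD - 1) / PER_POD ))   # ceiling division
echo "replicas needed: $REPLICAS"
```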
# Simple HPA example
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: collabora-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: collabora
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60

Advanced scaling strategies (2026): use a mix of HPA + custom metrics (via Prometheus Adapter) or KEDA for event‑based bursts. Consider keeping a minimum warm pool to avoid cold starts for editors during predictable working hours.
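A KEDA variant of that idea might look like the sketch below, scaling on a Prometheus query; the Prometheus address and the collabora_open_documents metric are assumptions (any gauge that tracks open documents works):

```yaml
# collabora-scaledobject.yaml (requires KEDA; metric name is hypothetical)
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: collabora-scaler
spec:
  scaleTargetRef:
    name: collabora                 # the Deployment from Step 1
  minReplicaCount: 2                # warm pool floor
  maxReplicaCount: 10
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring.svc:9090   # assumption
      query: sum(collabora_open_documents)                   # hypothetical metric
      threshold: "40"
```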

Observability & performance tuning

  • Expose resource metrics to Prometheus (collect container CPU/memory, request latency via ingress).
  • Monitor active websocket counts and connection durations—these reflect editing load.
  • Tune ingress timeouts (websocket timeouts) and proxy buffer sizes for large files.
  • Use PodDisruptionBudgets to keep at least one replica during upgrades.
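The PodDisruptionBudget mentioned above is only a few lines; this sketch matches the app: collabora labels used in Step 1:

```yaml
# collabora-pdb.yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: collabora-pdb
spec:
  minAvailable: 1        # keep at least one editor pod through drains and upgrades
  selector:
    matchLabels:
      app: collabora
```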

Security best practices

  • TLS everywhere: terminate TLS at the ingress with cert‑manager and keep internal traffic encrypted if crossing untrusted networks.
  • NetworkPolicy: restrict Collabora to accept traffic only from the Nextcloud pod/service and trusted networks.
  • Pod Security: run non‑root, set read‑only root filesystem where possible, and use seccomp and runtimeClass to harden.
  • Secrets: store any keys/secrets in Kubernetes Secrets or an external secret store (HashiCorp Vault, Sealed Secrets).
  • RBAC: use least privilege for service accounts; avoid cluster admin for app pods.
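The NetworkPolicy restriction can be sketched as follows; the app: nextcloud pod label and the ingress-nginx namespace name are assumptions to adapt to your cluster:

```yaml
# collabora-netpol.yaml (requires a CNI that enforces NetworkPolicy)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: collabora-ingress-only
spec:
  podSelector:
    matchLabels:
      app: collabora
  policyTypes: ["Ingress"]
  ingress:
  - from:
    - podSelector:                  # Nextcloud pods in the same namespace
        matchLabels:
          app: nextcloud
    - namespaceSelector:            # the ingress controller's namespace
        matchLabels:
          kubernetes.io/metadata.name: ingress-nginx
    ports:
    - protocol: TCP
      port: 9980
```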

Backup and Disaster Recovery

Because Nextcloud owns user files, your backup strategy must prioritize Nextcloud's DB and object store. Collabora's state is usually reconstructible (reconfigure and redeploy). Still, backup cluster resources and certificates so you can restore service quickly.

What to back up

  • Nextcloud database (dump daily or more frequently depending on RPO).
  • Object storage (S3/MinIO) snapshots—ensure versioning or lifecycle controls.
  • PersistentVolumes (if using PVCs for caches) via CSI snapshots.
  • Kubernetes resources: Deployments, Services, Ingress, ConfigMaps, Secrets (use encrypted backups or Sealed Secrets).
  • cert‑manager Issuers/Certificates (or ensure external CA backups).

Backup tooling

  • Velero for cluster resource backups and CSI snapshots—works with many cloud providers and on‑prem CSI drivers.
  • Database dumps (CronJob or operator) stored in object storage (S3/MinIO) with immutable lifecycle policies.
  • Object storage replication or cross‑region replication for disaster recovery.
  • Versioned secrets via Sealed Secrets or a secure external secrets manager.
# Example: velero backup command for nextcloud and collabora
velero backup create nextcloud-backup \
  --include-namespaces nextcloud,collabora \
  --snapshot-volumes \
  --ttl 168h

Test restores regularly. Recovery drills are essential: practice restoring the DB and mounting file snapshots into a recovery Nextcloud instance.
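A restore drill with Velero can start as simply as the commands below (the backup name matches the earlier example; run drills against a scratch namespace or cluster):

```shell
# Inspect available backups, restore one, and watch the result.
velero backup get
velero restore create --from-backup nextcloud-backup
velero restore get
```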

Upgrades & maintenance

  • Follow Collabora CODE release notes and test new images in staging before production (document rendering regressions happen).
  • Use rolling updates with readiness probes and PodDisruptionBudgets to avoid service gaps.
  • Keep cert‑manager and ingress controllers up to date to benefit from ACME improvements and security patches (ACME v2 and rate limit behavior matured after 2024).

Troubleshooting common issues

1. Nextcloud shows connection error to Collabora

  • Check Collabora ingress address in Nextcloud settings (must match TLS host).
  • Confirm cert validity and trusted CA; browsers will block mixed content.
  • Ensure ingress allows websocket upgrades; check proxy_set_header Upgrade and Connection settings for nginx/Traefik.

2. Edits disconnect frequently

  • Enable session affinity (cookie) at the ingress and Service to stabilize websocket routing.
  • Check pod logs for memory OOM kills—increase memory limits or add replicas.

3. Slow document load or conversions

  • Profile CPU usage; conversions are CPU bound—consider vertical scaling or more replicas.
  • Enable cache persistence via PVCs to avoid repeated conversions.

Advanced strategies & 2026 predictions

As cloud native ecosystems evolved, three practical patterns emerged by 2026 that you should consider:

  • Warm pools & predictive autoscaling: Use cron or scheduler hooks to maintain warm replicas during office hours. Combine with predictive metrics (historical load) to avoid cold starts.
  • KEDA + custom events: Scale on application metrics (e.g., number of open documents tracked in Redis) for more accurate autoscaling than raw CPU.
  • Multi‑cluster deployment for global teams: place Collabora instances closer to users, with a central Nextcloud or federated Nextcloud instances, reducing latency during editing.

Checklist before going live

  • Ingress TLS validated and trusted by clients.
  • Nextcloud Collabora URL configured and tested with sample documents.
  • Autoscaling tested with load‑testing (simulate concurrent editors).
  • Backups configured (Velero + DB dumps + object store snapshots) and restore tested.
  • NetworkPolicy and RBAC enforced; secrets secured.
  • Monitoring in place (Prometheus/Grafana) and alerting for high CPU, memory, and websocket errors.

Quick production reference (commands)

# Apply manifests
kubectl apply -f collabora-deployment.yaml
kubectl apply -f collabora-service.yaml
kubectl apply -f collabora-ingress.yaml
kubectl apply -f collabora-hpa.yaml

# Check pods and logs
kubectl get pods -l app=collabora
kubectl logs deploy/collabora -c collabora

# Test TLS
curl -vk https://collabora.example.com/hosting/discovery

Final notes & operational tips

Collabora + Nextcloud together give you a powerful, privacy‑respecting alternative to hosted office suites. Keep the Collabora image updated, monitor resource usage, and don’t skimp on backups—Nextcloud file and DB backups are your true RTO governor. Embrace the 2026 cloud‑native toolset: CSI snapshots, cert‑manager, KEDA and Prometheus make production self‑hosting reliable and maintainable.

Call to action

If you’re ready to try this in your environment, fork the manifests in the examples above, deploy to a staging cluster, and run a 48‑hour resiliency exercise (traffic spikes, node drains, restore). Join our community at opensources.live for curated manifests, upgrade alerts and real‑world case studies from teams running Collabora + Nextcloud at scale.

Actionable next step: Deploy a single Collabora replica with an ingress + Let's Encrypt certificate in a staging namespace, connect a test Nextcloud instance, and run a 10‑user load test to calibrate your resource settings.


Related Topics

#how-to#self-hosting#productivity