How to enable Traefik auto-discovery for docker-compose containers on Swarm worker nodes

The Problem

I run Docker (duh) at home for various containers (stirlingpdf, vscodium, jellyfin, gitea, authentik, just to name a few). I wanted to spread the workload across multiple worker nodes, but without the hassle of manually tracking which host serves which container.

So I started looking at Docker Swarm, which sadly was a no-go for me, as Komodo doesn’t support Swarm yet. I’m currently also running Portainer, which I could have used for that, but I’m moving away from it and wanted to get rid of it.

However, I wanted to use some of the concepts of Docker Swarm or even Kubernetes (the Ingress: https://kubernetes.io/docs/concepts/services-networking/ingress/) while maintaining the simplicity of docker compose.

When running Docker Swarm with Traefik as the reverse proxy, you typically want:

  1. Swarm Services - Deployed via docker stack deploy, automatically discovered by Traefik
  2. Standalone Containers - Deployed via docker compose on worker nodes for simpler management

Traefik v3 with providers.swarm only sees Swarm services, not standalone containers. Traefik v3 also does not support multiple Docker endpoints: you cannot configure it to watch the Docker sockets of several hosts at once.

Failed Approaches

  • Socket Proxy: Exposing Docker sockets from workers via TCP/socket-proxy doesn’t work - Traefik v3 rejects multiple docker providers
  • Docker API over TCP: Same limitation applies
  • Service Mesh (Consul Connect, Linkerd): Overkill for this use case

After a lot of trial and error (you know that rabbit hole you go down once you start looking), I came to the conclusion that no ready-made solution existed for this setup.

The Solution: Docker Label Sync

A lightweight Python service that:

  1. Watches Docker events on worker nodes
  2. Extracts Traefik labels from containers
  3. Generates Traefik file provider rules (YAML)
  4. Syncs rules to the Swarm manager via SSH/rsync

Traefik’s file provider with watch=true picks up changes automatically.

Architecture

                            SWARM CLUSTER
┌───────────────────────────────────────────────────────────────────┐
│                                                                   │
│  ┌─────────────────────────────────────┐                          │
│  │        MANAGER NODE                 │                          │
│  │        (swarm-mgr-01)               │                          │
│  │                                     │                          │
│  │  ┌─────────────────────────────┐    │                          │
│  │  │  Traefik (Swarm Service)    │    │                          │
│  │  │                             │    │                          │
│  │  │  - providers.swarm ─────────┼────┼──► Swarm Services        │
│  │  │  - providers.file ──────┐   │    │                          │
│  │  │    (watch=true)         │   │    │                          │
│  │  └─────────────────────────┼───┘    │                          │
│  │                            │        │                          │
│  │                            ▼        │                          │
│  │  ┌─────────────────────────────┐    │                          │
│  │  │  /opt/stacks/traefik/       │    │                          │
│  │  │    config/rules/            │    │                          │
│  │  │      worker-node-01-app.yml │◄───┼─────┐                    │
│  │  │      worker-node-02-api.yml │◄───┼─────┼────┐               │
│  │  └─────────────────────────────┘    │     │    │               │
│  │                                     │     │    │               │
│  └─────────────────────────────────────┘     │    │               │
│                                              │    │  SSH/rsync    │
│         ┌────────────────────────────────────┘    │               │
│         │                                         │               │
│  ┌──────┴──────────────────────┐  ┌──────────────┴─────────────┐  │
│  │      WORKER NODE 01         │  │      WORKER NODE 02        │  │
│  │      (swarm-wrk-01)         │  │      (swarm-wrk-02)        │  │
│  │                             │  │                            │  │
│  │  ┌───────────────────────┐  │  │  ┌───────────────────────┐ │  │
│  │  │  docker-label-sync    │  │  │  │  docker-label-sync    │ │  │
│  │  │  (systemd service)    │  │  │  │  (systemd service)    │ │  │
│  │  │                       │  │  │  │                       │ │  │
│  │  │  - Watch Docker API   │  │  │  │  - Watch Docker API   │ │  │
│  │  │  - Parse labels       │  │  │  │  - Parse labels       │ │  │
│  │  │  - Generate YAML      │  │  │  │  - Generate YAML      │ │  │
│  │  │  - rsync to manager   │  │  │  │  - rsync to manager   │ │  │
│  │  └───────────┬───────────┘  │  │  └───────────┬───────────┘ │  │
│  │              │              │  │              │             │  │
│  │              ▼              │  │              ▼             │  │
│  │  ┌───────────────────────┐  │  │  ┌───────────────────────┐ │  │
│  │  │  Docker Compose       │  │  │  │  Docker Compose       │ │  │
│  │  │  Containers           │  │  │  │  Containers           │ │  │
│  │  │                       │  │  │  │                       │ │  │
│  │  │  my-app ──────────────┼──┼──┼──┼───────────────────────┼─┼──┤
│  │  │   traefik.enable=true │  │  │  │  my-api               │ │  │
│  │  │   traefik.http...     │  │  │  │   traefik.enable=true │ │  │
│  │  └───────────────────────┘  │  │  └───────────────────────┘ │  │
│  │                             │  │                            │  │
│  └─────────────────────────────┘  └────────────────────────────┘  │
│                                                                   │
│                    ┌─────────────────────┐                        │
│                    │  Overlay Network    │                        │
│                    │  traefik-overlay    │                        │
│                    │  (attachable)       │                        │
│                    └─────────────────────┘                        │
│                              │                                    │
│              Containers connect to overlay                        │
│              for direct routing from Traefik                      │
│                                                                   │
└───────────────────────────────────────────────────────────────────┘

Data Flow

Container Start on Worker
┌─────────────────┐
│ Docker Event    │
│ type: container │
│ action: start   │
└────────┬────────┘
┌─────────────────────────────────────────────┐
│ docker-label-sync                           │
│                                             │
│ 1. Receive event via Docker API             │
│ 2. Inspect container                        │
│ 3. Check: connected to overlay network?     │
│    └─ No: skip                              │
│    └─ Yes: continue                         │
│ 4. Check: traefik.enable=true?              │
│    └─ No: skip                              │
│    └─ Yes: continue                         │
│ 5. Parse traefik.* labels                   │
│ 6. Get container IP in overlay network      │
│ 7. Generate Traefik YAML rule               │
│ 8. rsync to manager:/opt/.../rules/         │
└─────────────────────────────────────────────┘
┌─────────────────┐
│ Traefik         │
│ file provider   │
│ watch=true      │
│                 │
│ Detects new     │
│ rule file       │
│ Routes traffic  │
└─────────────────┘
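
Steps 3 and 6 of the flow above boil down to a single lookup in the container's inspect data. The following is a minimal sketch, not the script's actual code: the function name get_overlay_ip is hypothetical, and it assumes the NetworkSettings.Networks layout that docker inspect returns (the Docker SDK for Python exposes the same data as container.attrs).

```python
from typing import Optional


def get_overlay_ip(attrs: dict, network: str) -> Optional[str]:
    """Return the container's IP on `network`, or None if it is not attached."""
    networks = attrs.get("NetworkSettings", {}).get("Networks") or {}
    endpoint = networks.get(network)
    if not endpoint:
        return None
    # IPAddress can be an empty string while the endpoint is not connected
    return endpoint.get("IPAddress") or None


# Trimmed-down inspect data, for illustration only
sample_attrs = {
    "NetworkSettings": {
        "Networks": {
            "traefik-overlay": {"IPAddress": "10.0.1.15"},
            "bridge": {"IPAddress": "172.17.0.2"},
        }
    }
}
```

Returning None for both "not attached" and "no address yet" lets the caller treat the two cases the same way: skip the container.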

Implementation

The Python Script

The core script (docker-label-sync) runs as a systemd service and performs these tasks:

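# Simplified flow. The helpers (get_container_ip, parse_traefik_labels,
# generate_traefik_rule, sync_to_manager, schedule_cleanup) are defined
# elsewhere in the script and are not shown here.
import os
import socket

import docker
from docker.errors import NotFound

# Values come from /etc/docker-label-sync.conf via systemd's EnvironmentFile
OVERLAY_NETWORK = os.environ.get("TRAEFIK_SYNC_NETWORK", "traefik-overlay")
PREFIX = os.environ.get("TRAEFIK_SYNC_PREFIX", "worker-")
HOSTNAME = socket.gethostname()

def main():
    client = docker.from_env()

    # Sync existing containers on startup
    for container in client.containers.list():
        process_container(container)

    # Watch for new events (blocks forever)
    for event in client.events(decode=True):
        if event.get('Type') != 'container':
            continue
        action = event.get('Action')
        if action == 'start':
            try:
                container = client.containers.get(event['Actor']['ID'])
            except NotFound:
                continue  # container was removed before we could inspect it
            process_container(container)
        elif action in ('stop', 'die'):
            # Both events fire on "docker stop"; cleanup must tolerate repeats
            schedule_cleanup(event['Actor']['Attributes']['name'])

def process_container(container):
    # Check if container is in our overlay network
    ip = get_container_ip(container, OVERLAY_NETWORK)
    if not ip:
        return

    # Parse traefik labels
    labels = parse_traefik_labels(container.labels)
    if not labels:
        return

    # Generate and sync rule
    yaml_content = generate_traefik_rule(container, labels, ip)
    sync_to_manager(yaml_content, f"{PREFIX}{HOSTNAME}-{container.name}.yml")

if __name__ == "__main__":
    main()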
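
The simplified flow calls parse_traefik_labels without showing it. One plausible shape for that helper is sketched below. This is an assumption about its behavior, not the script's actual implementation: it returns an empty dict unless traefik.enable is "true", which matches the "if not labels: return" check in process_container.

```python
def parse_traefik_labels(labels: dict) -> dict:
    """Return the traefik.* labels, or {} unless traefik.enable is "true"."""
    if labels.get("traefik.enable", "").strip().lower() != "true":
        return {}
    return {
        key: value
        for key, value in labels.items()
        if key.startswith("traefik.") and key != "traefik.enable"
    }


# Example input: what container.labels might contain for a compose service
example_labels = {
    "com.docker.compose.project": "demo",
    "traefik.enable": "true",
    "traefik.http.routers.webapp.rule": "Host(`app.example.com`)",
}
```

Note that with this sketch a container carrying only traefik.enable=true and no router or service labels is skipped as well, since there is nothing to build a rule from.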

Label to Rule Transformation

Input: Docker Compose Labels

services:
  webapp:
    image: myapp:latest
    networks:
      - traefik-overlay
    labels:
      traefik.enable: "true"
      traefik.http.routers.webapp.rule: "Host(`app.example.com`)"
      traefik.http.routers.webapp.entrypoints: "websecure"
      traefik.http.services.webapp.loadbalancer.server.port: "8080"

networks:
  traefik-overlay:
    external: true

Output: Traefik File Provider Rule

# Auto-generated by docker-label-sync
# Container: webapp (abc123def)
# Host: swarm-wrk-01.example.com
http:
  routers:
    webapp:
      rule: "Host(`app.example.com`)"
      entryPoints:
        - websecure
      service: webapp
      tls:
        certResolver: letsencrypt
  services:
    webapp:
      loadBalancer:
        servers:
          - url: "http://10.0.1.15:8080"  # Container IP in overlay
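
The transformation from the flat labels to the nested rule above can be sketched as follows. This is a minimal illustration under stated assumptions: the name build_rule is hypothetical, only the four label types from the example are handled (rule, entrypoints, service, loadbalancer.server.port), and a router without an explicit service label is assumed to use the service of the same name. It returns a plain dictionary; writing it out as YAML is left to whatever serializer is available (for example PyYAML's yaml.safe_dump).

```python
def build_rule(labels: dict, ip: str, cert_resolver: str = "") -> dict:
    """Translate flat traefik.http.* labels into file-provider structure."""
    routers: dict = {}
    services: dict = {}
    for key, value in labels.items():
        parts = key.split(".")
        if len(parts) < 5 or parts[:2] != ["traefik", "http"]:
            continue  # not a traefik.http.<kind>.<name>.<option> label
        kind, name = parts[2], parts[3]
        option = ".".join(parts[4:]).lower()
        if kind == "routers":
            router = routers.setdefault(name, {})
            if option == "rule":
                router["rule"] = value
            elif option == "entrypoints":
                router["entryPoints"] = [e.strip() for e in value.split(",")]
            elif option == "service":
                router["service"] = value
        elif kind == "services" and option == "loadbalancer.server.port":
            # Point Traefik at the container's overlay IP instead of a name
            services[name] = {
                "loadBalancer": {"servers": [{"url": f"http://{ip}:{value}"}]}
            }
    for name, router in routers.items():
        # Assumption: default to the service that shares the router's name
        router.setdefault("service", name)
        if cert_resolver:
            router["tls"] = {"certResolver": cert_resolver}
    return {"http": {"routers": routers, "services": services}}


example = build_rule(
    {
        "traefik.http.routers.webapp.rule": "Host(`app.example.com`)",
        "traefik.http.routers.webapp.entrypoints": "websecure",
        "traefik.http.services.webapp.loadbalancer.server.port": "8080",
    },
    ip="10.0.1.15",
    cert_resolver="letsencrypt",
)
```

Labels that the sketch does not recognize (middlewares, priorities and so on) are silently ignored, so a real implementation would need to cover whichever options the containers actually use.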

Systemd Service

[Unit]
Description=Docker Label Sync Service
After=docker.service
Requires=docker.service

[Service]
Type=simple
User=root
EnvironmentFile=/etc/docker-label-sync.conf
ExecStart=/usr/local/bin/docker-label-sync
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target

Configuration

# /etc/docker-label-sync.conf
TRAEFIK_SYNC_MANAGER=<ssh-user>@swarm-mgr-01.example.com
TRAEFIK_SYNC_PATH=/opt/stacks/traefik/config/rules
TRAEFIK_SYNC_NETWORK=traefik-overlay
TRAEFIK_SYNC_PREFIX=worker-
TRAEFIK_SYNC_CLEANUP_DELAY=15
TRAEFIK_SYNC_SSH_KEY=/home/<ssh-user>/.ssh/id_ed25519
TRAEFIK_SYNC_CERT_RESOLVER=letsencrypt
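
Because systemd passes this file to the service through EnvironmentFile, the script can read the settings as ordinary environment variables. The sketch below shows one way to do that. The SyncConfig and load_config names are hypothetical, and the fallback values are assumptions taken from the sample file above (only the 15-second cleanup delay is a documented default).

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class SyncConfig:
    manager: str
    path: str
    network: str
    prefix: str
    cleanup_delay: float
    ssh_key: str
    cert_resolver: str


def load_config(env: Mapping[str, str] = os.environ) -> SyncConfig:
    """Build the configuration from TRAEFIK_SYNC_* variables."""
    return SyncConfig(
        # Required settings: a missing key raises KeyError at startup
        manager=env["TRAEFIK_SYNC_MANAGER"],
        path=env["TRAEFIK_SYNC_PATH"],
        # Optional settings with assumed fallbacks
        network=env.get("TRAEFIK_SYNC_NETWORK", "traefik-overlay"),
        prefix=env.get("TRAEFIK_SYNC_PREFIX", "worker-"),
        cleanup_delay=float(env.get("TRAEFIK_SYNC_CLEANUP_DELAY", "15")),
        ssh_key=env.get("TRAEFIK_SYNC_SSH_KEY", ""),
        cert_resolver=env.get("TRAEFIK_SYNC_CERT_RESOLVER", ""),
    )
```

Taking the environment as a parameter keeps the function easy to exercise with a plain dictionary, without touching the real process environment.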

Traefik Configuration

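The Swarm manager runs Traefik with both providers. Note that /etc/traefik/rules is the path inside the Traefik container; the rules directory on the manager host (/opt/stacks/traefik/config/rules) presumably needs to be mounted there: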

# traefik command arguments
command:
  # Swarm provider for stack-deployed services
  - "--providers.swarm"
  - "--providers.swarm.exposedbydefault=false"
  - "--providers.swarm.network=traefik-overlay"

  # File provider for worker container rules
  - "--providers.file.directory=/etc/traefik/rules"
  - "--providers.file.watch=true"

Cleanup Strategy

When a container stops, the sync service:

  1. Waits for a configurable delay (default: 15 seconds)
  2. Removes the rule file from the manager
  3. Traefik automatically removes the route

The delay prevents flapping during container restarts or quick redeploys.

Container Stop Event
┌───────────────────┐
│ Start 15s timer   │
└─────────┬─────────┘
    Container restarts
    within 15 seconds?
    ┌─────┴─────┐
    │           │
   Yes          No
    │           │
    ▼           ▼
┌─────────┐ ┌─────────────┐
│ Cancel  │ │ Remove rule │
│ timer   │ │ from manager│
└─────────┘ └─────────────┘
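
The timer logic in the diagram can be sketched with threading.Timer from the standard library. This is an illustration under assumptions, not the script's actual code: the CleanupScheduler class and the remove_rule callback are hypothetical names, and the callback stands in for whatever deletes the rule file on the manager.

```python
import threading
from typing import Callable, Dict


class CleanupScheduler:
    """Delay rule removal so a quick restart can cancel it."""

    def __init__(self, delay: float, remove_rule: Callable[[str], None]):
        self._delay = delay
        self._remove_rule = remove_rule
        self._timers: Dict[str, threading.Timer] = {}
        self._lock = threading.Lock()

    def schedule(self, name: str) -> None:
        """Call on 'stop'/'die': (re)start the countdown for this container."""
        with self._lock:
            self._cancel_locked(name)
            # The lambda reads `timer` only when it fires, after assignment
            timer = threading.Timer(self._delay, lambda: self._fire(name, timer))
            timer.daemon = True
            self._timers[name] = timer
            timer.start()

    def cancel(self, name: str) -> None:
        """Call on 'start': the container is back, so keep its rule."""
        with self._lock:
            self._cancel_locked(name)

    def _cancel_locked(self, name: str) -> None:
        timer = self._timers.pop(name, None)
        if timer is not None:
            timer.cancel()

    def _fire(self, name: str, timer: threading.Timer) -> None:
        with self._lock:
            if self._timers.get(name) is not timer:
                return  # cancelled or superseded while waiting for the lock
            del self._timers[name]
        self._remove_rule(name)
```

Calling schedule twice for the same container (for example once for 'die' and once for 'stop') simply restarts the countdown, so duplicate events do not lead to duplicate removals.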

Network Requirements

Overlay Network

The overlay network must be:

  • Created on the Swarm manager
  • Marked as attachable: true
  • Used by both Traefik and worker containers

# On manager
docker network create --driver overlay --attachable traefik-overlay

SSH Access

Workers need SSH access to the manager for rsync:

  • Uses a dedicated service user (ssh-user)
  • Only needs write access to the rules directory
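
As a rough sketch of the transfer itself, the rsync invocation can be assembled as an argument list and handed to subprocess.run. The function name build_rsync_command is hypothetical and the exact flags the script uses are not known; the ones below (-a, -e, and ssh's -i and BatchMode option) are standard rsync and OpenSSH options chosen for illustration.

```python
import shlex
from typing import List


def build_rsync_command(local_file: str, manager: str, remote_dir: str,
                        ssh_key: str) -> List[str]:
    """Return the argv for copying one rule file to the manager."""
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # which is what a background service needs
    ssh = f"ssh -i {shlex.quote(ssh_key)} -o BatchMode=yes"
    return [
        "rsync",
        "-a",        # archive mode: preserve permissions and timestamps
        "-e", ssh,   # run the transfer over SSH with the dedicated key
        local_file,
        f"{manager}:{remote_dir.rstrip('/')}/",
    ]


cmd = build_rsync_command(
    "/tmp/worker-swarm-wrk-01-webapp.yml",
    "sync@swarm-mgr-01.example.com",
    "/opt/stacks/traefik/config/rules",
    "/home/sync/.ssh/id_ed25519",
)
# To execute: subprocess.run(cmd, check=True, timeout=30)
```

Passing a list rather than a shell string avoids quoting problems with container names, and check=True turns a failed transfer into an exception that the service can log.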

Advantages

Aspect          Benefit
Simplicity      No complex service mesh required
Compatibility   Works with standard docker-compose
Real-time       Docker events API provides instant updates
Reliability     rsync over SSH is robust and well-tested
Security        SSH keys, no exposed Docker sockets
Flexibility     Works with any deployment tool (Komodo, Portainer, CLI)

Limitations

  • Requires SSH connectivity between workers and manager
  • Additional service running on each worker
  • Slight delay compared to native Traefik discovery (~1-2 seconds)
  • Rule files accumulate if cleanup fails (manual cleanup may be needed)
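
For the last point, leftover rule files can at least be identified by comparing the file names on the manager with the containers currently running on a worker. The helper below is a hypothetical sketch, not part of the script; it assumes the worker-<hostname>-<container>.yml naming used earlier.

```python
from typing import Iterable, List


def find_stale_rules(rule_files: Iterable[str], running: Iterable[str],
                     prefix: str, hostname: str) -> List[str]:
    """Return this worker's rule files that match no running container."""
    own_prefix = f"{prefix}{hostname}-"
    expected = {f"{own_prefix}{name}.yml" for name in running}
    return sorted(
        name for name in rule_files
        if name.startswith(own_prefix)
        and name.endswith(".yml")
        and name not in expected
    )


stale = find_stale_rules(
    rule_files=[
        "worker-swarm-wrk-01-app.yml",
        "worker-swarm-wrk-01-old.yml",
        "worker-swarm-wrk-02-api.yml",
        "static.yml",
    ],
    running=["app"],
    prefix="worker-",
    hostname="swarm-wrk-01",
)
```

One caveat: because hostnames and container names can both contain hyphens, a hostname that is a prefix of another hostname (wrk and wrk-a, say) would make the ownership check ambiguous, so the result is best reviewed before deleting anything.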

Monitoring

Check service status:

systemctl status docker-label-sync
journalctl -u docker-label-sync -f

Verify rules on manager:

ls -la /opt/stacks/traefik/config/rules/
cat /opt/stacks/traefik/config/rules/worker-*.yml

Conclusion

Docker Label Sync bridges the gap between Traefik v3’s Swarm-only discovery and the need for docker-compose deployments on worker nodes. It’s a pragmatic solution that:

  • Maintains the familiar label-based configuration
  • Works transparently with existing workflows
  • Requires minimal infrastructure changes
  • Provides real-time route updates

The approach trades a small amount of complexity (one additional service per worker) for significant operational flexibility.