How to enable Traefik auto-discovery for docker-compose containers on Swarm worker nodes
The Problem#
I run Docker (duh) at home for various containers (stirlingpdf, vscodium, jellyfin, gitea, authentik, just to name a few). I wanted to spread the workload across multiple worker nodes without the hassle of managing which host serves which container.
So I started looking at Docker Swarm, which sadly was a no-go for me, as Komodo doesn’t support Swarm yet.
I’m also still running Portainer, which I could have used for that, but I’m moving away from it and want to get rid of it.
However, I wanted to use some of the concepts of Docker Swarm or even Kubernetes (the ingress: https://kubernetes.io/docs/concepts/services-networking/ingress/) while maintaining the simplicity of docker compose.
When running Docker Swarm with Traefik as the reverse proxy, you typically want:
- Swarm Services - Deployed via docker stack deploy, automatically discovered by Traefik
- Standalone Containers - Deployed via docker compose on worker nodes for simpler management
Traefik v3 with providers.swarm only sees Swarm services, not standalone containers. Traefik v3 also does not support multiple Docker endpoints: each provider can be configured only once, so you cannot point it at several Docker sockets.
Failed Approaches#
- Socket Proxy: Exposing Docker sockets from workers via TCP/socket-proxy doesn’t work - Traefik v3 rejects multiple docker providers
- Docker API over TCP: Same limitation applies
- Service Mesh (Consul Connect, Linkerd): Overkill for this use case
After a lot of trial and error (you know that rabbit hole you go down once you start looking), I came to the conclusion that no existing tool did what I wanted.
The Solution: Docker Label Sync#
A lightweight Python service that:
- Watches Docker events on worker nodes
- Extracts Traefik labels from containers
- Generates Traefik file provider rules (YAML)
- Syncs rules to the Swarm manager via SSH/rsync
Traefik’s file provider with watch=true picks up changes automatically.
Architecture#
                                SWARM CLUSTER
    ┌───────────────────────────────────────────────────────────────────┐
    │                                                                   │
    │  ┌─────────────────────────────────────┐                          │
    │  │            MANAGER NODE             │                          │
    │  │           (swarm-mgr-01)            │                          │
    │  │                                     │                          │
    │  │  ┌─────────────────────────────┐    │                          │
    │  │  │ Traefik (Swarm Service)     │    │                          │
    │  │  │                             │    │                          │
    │  │  │ - providers.swarm ──────────┼────┼──► Swarm Services        │
    │  │  │ - providers.file ──────┐    │    │                          │
    │  │  │   (watch=true)         │    │    │                          │
    │  │  └────────────────────────┼────┘    │                          │
    │  │                           │         │                          │
    │  │                           ▼         │                          │
    │  │  ┌─────────────────────────────┐    │                          │
    │  │  │ /opt/stacks/traefik/        │    │                          │
    │  │  │   config/rules/             │    │                          │
    │  │  │     worker-node-01-app.yml  │◄───┼─────┐                    │
    │  │  │     worker-node-02-api.yml  │◄───┼─────┼────┐               │
    │  │  └─────────────────────────────┘    │     │    │               │
    │  │                                     │     │    │               │
    │  └─────────────────────────────────────┘     │    │               │
    │                                              │    │ SSH/rsync     │
    │         ┌────────────────────────────────────┘    │               │
    │         │                                         │               │
    │  ┌──────┴──────────────────────┐  ┌───────────────┴────────────┐  │
    │  │       WORKER NODE 01        │  │       WORKER NODE 02       │  │
    │  │       (swarm-wrk-01)        │  │       (swarm-wrk-02)       │  │
    │  │                             │  │                            │  │
    │  │  ┌───────────────────────┐  │  │  ┌───────────────────────┐ │  │
    │  │  │ docker-label-sync     │  │  │  │ docker-label-sync     │ │  │
    │  │  │ (systemd service)     │  │  │  │ (systemd service)     │ │  │
    │  │  │                       │  │  │  │                       │ │  │
    │  │  │ - Watch Docker API    │  │  │  │ - Watch Docker API    │ │  │
    │  │  │ - Parse labels        │  │  │  │ - Parse labels        │ │  │
    │  │  │ - Generate YAML       │  │  │  │ - Generate YAML       │ │  │
    │  │  │ - rsync to manager    │  │  │  │ - rsync to manager    │ │  │
    │  │  └───────────┬───────────┘  │  │  └───────────┬───────────┘ │  │
    │  │              │              │  │              │             │  │
    │  │              ▼              │  │              ▼             │  │
    │  │  ┌───────────────────────┐  │  │  ┌───────────────────────┐ │  │
    │  │  │ Docker Compose        │  │  │  │ Docker Compose        │ │  │
    │  │  │ Containers            │  │  │  │ Containers            │ │  │
    │  │  │                       │  │  │  │                       │ │  │
    │  │  │ my-app ───────────────┼──┼──┼──┼───────────────────────┼─┼──┤
    │  │  │ traefik.enable=true   │  │  │  │ my-api                │ │  │
    │  │  │ traefik.http...       │  │  │  │ traefik.enable=true   │ │  │
    │  │  └───────────────────────┘  │  │  └───────────────────────┘ │  │
    │  │                             │  │                            │  │
    │  └─────────────────────────────┘  └────────────────────────────┘  │
    │                                                                   │
    │                      ┌─────────────────────┐                      │
    │                      │   Overlay Network   │                      │
    │                      │   traefik-overlay   │                      │
    │                      │    (attachable)     │                      │
    │                      └─────────────────────┘                      │
    │                                 │                                 │
    │                   Containers connect to overlay                   │
    │                  for direct routing from Traefik                  │
    │                                                                   │
    └───────────────────────────────────────────────────────────────────┘
Data Flow#
    Container Start on Worker
             │
             ▼
    ┌─────────────────┐
    │ Docker Event    │
    │ type: container │
    │ action: start   │
    └────────┬────────┘
             │
             ▼
    ┌─────────────────────────────────────────────┐
    │ docker-label-sync                           │
    │                                             │
    │ 1. Receive event via Docker API             │
    │ 2. Inspect container                        │
    │ 3. Check: connected to overlay network?     │
    │    └─ No: skip                              │
    │    └─ Yes: continue                         │
    │ 4. Check: traefik.enable=true?              │
    │    └─ No: skip                              │
    │    └─ Yes: continue                         │
    │ 5. Parse traefik.* labels                   │
    │ 6. Get container IP in overlay network      │
    │ 7. Generate Traefik YAML rule               │
    │ 8. rsync to manager:/opt/.../rules/         │
    └─────────────────────────────────────────────┘
             │
             ▼
    ┌─────────────────┐
    │ Traefik         │
    │ file provider   │
    │ watch=true      │
    │                 │
    │ Detects new     │
    │ rule file       │
    │ Routes traffic  │
    └─────────────────┘
Implementation#
The Python Script#
The core script (docker-label-sync) runs as a systemd service and performs these tasks:
    # Simplified flow. The helpers (get_container_ip, parse_traefik_labels,
    # generate_traefik_rule, sync_to_manager, schedule_cleanup) are part of
    # the full script and are not shown here.
    import os
    import socket

    import docker
    from docker.errors import NotFound

    OVERLAY_NETWORK = os.environ.get("TRAEFIK_SYNC_NETWORK", "traefik-overlay")
    hostname = socket.gethostname()


    def main():
        client = docker.from_env()

        # Sync existing containers on startup
        for container in client.containers.list():
            process_container(container)

        # Watch for new events
        for event in client.events(decode=True):
            if event.get('Type') != 'container':
                continue
            if event['Action'] == 'start':
                try:
                    container = client.containers.get(event['Actor']['ID'])
                except NotFound:
                    continue  # container vanished before it could be inspected
                process_container(container)
            elif event['Action'] in ('stop', 'die'):
                schedule_cleanup(event['Actor']['Attributes']['name'])


    def process_container(container):
        # Check if container is in our overlay network
        ip = get_container_ip(container, OVERLAY_NETWORK)
        if not ip:
            return

        # Parse traefik labels
        labels = parse_traefik_labels(container.labels)
        if not labels:
            return

        # Generate and sync rule
        yaml_content = generate_traefik_rule(container, labels, ip)
        sync_to_manager(yaml_content, f"worker-{hostname}-{container.name}.yml")


    if __name__ == "__main__":
        main()
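The two lookups that process_container relies on are not shown in the post. Here is a minimal sketch of what they might look like, assuming the helper names from the simplified flow and the NetworkSettings.Networks structure that docker inspect returns (exposed as container.attrs by the Docker SDK). The real script may well differ.

```python
def get_container_ip(container, network_name):
    """Return the container's IP on the given network, or None if not attached."""
    # container.attrs mirrors `docker inspect`; Networks is keyed by network name.
    networks = container.attrs.get("NetworkSettings", {}).get("Networks") or {}
    ip = (networks.get(network_name) or {}).get("IPAddress")
    return ip or None


def parse_traefik_labels(labels):
    """Return the traefik.* labels, or {} unless traefik.enable is "true"."""
    if str(labels.get("traefik.enable", "")).lower() != "true":
        return {}
    return {
        key: value
        for key, value in labels.items()
        if key.startswith("traefik.") and key != "traefik.enable"
    }
```

Returning an empty value in both "skip" cases keeps process_container's early returns as simple as they are in the flow above.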
Input: Docker Compose Labels
    services:
      webapp:
        image: myapp:latest
        networks:
          - traefik-overlay
        labels:
          traefik.enable: "true"
          traefik.http.routers.webapp.rule: "Host(`app.example.com`)"
          traefik.http.routers.webapp.entrypoints: "websecure"
          traefik.http.services.webapp.loadbalancer.server.port: "8080"

    networks:
      traefik-overlay:
        external: true
Output: Traefik File Provider Rule
    # Auto-generated by docker-label-sync
    # Container: webapp (abc123def)
    # Host: swarm-wrk-01.example.com
    http:
      routers:
        webapp:
          rule: "Host(`app.example.com`)"
          entryPoints:
            - websecure
          service: webapp
          tls:
            certResolver: letsencrypt
      services:
        webapp:
          loadBalancer:
            servers:
              - url: "http://10.0.1.15:8080"  # Container IP in overlay
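To make the label-to-rule mapping concrete, here is a rough sketch of how the input labels could be turned into the output above. The names build_rule and to_yaml are hypothetical, and only the label keys from the example are handled. It assumes the router's service defaults to the router's own name, and it uses a tiny hand-rolled YAML emitter to stay dependency-free, where the real script may use a YAML library instead.

```python
import json


def build_rule(labels, ip, cert_resolver=None):
    """Map a few traefik.http.* labels to a file-provider style dict."""
    routers, services = {}, {}
    for key, value in labels.items():
        parts = key.split(".")
        kind = [p.lower() for p in parts]  # label keys are matched case-insensitively
        if kind[:3] == ["traefik", "http", "routers"] and len(parts) == 5:
            router = routers.setdefault(parts[3], {})
            if kind[4] == "rule":
                router["rule"] = value
            elif kind[4] == "entrypoints":
                router["entryPoints"] = [e.strip() for e in value.split(",")]
            elif kind[4] == "service":
                router["service"] = value
        elif kind[:3] == ["traefik", "http", "services"] and kind[4:] == [
            "loadbalancer", "server", "port"
        ]:
            # The file provider needs a full URL, so combine overlay IP and port.
            url = f"http://{ip}:{value}"
            services[parts[3]] = {"loadBalancer": {"servers": [{"url": url}]}}
    for name, router in routers.items():
        router.setdefault("service", name)  # assumption: service shares the router name
        if cert_resolver:
            router["tls"] = {"certResolver": cert_resolver}
    return {"http": {"routers": routers, "services": services}}


def to_yaml(node, indent=0):
    """Render nested dicts/lists of strings as YAML lines (tiny subset only)."""
    pad = "  " * indent
    lines = []
    if isinstance(node, dict):
        for key, value in node.items():
            if isinstance(value, (dict, list)):
                lines.append(f"{pad}{key}:")
                lines.extend(to_yaml(value, indent + 1))
            else:
                # A JSON string is also a valid double-quoted YAML scalar.
                lines.append(f"{pad}{key}: {json.dumps(value)}")
    else:
        for item in node:
            if isinstance(item, dict):
                first, *rest = to_yaml(item, indent + 1)
                lines.append(f"{pad}- {first.lstrip()}")
                lines.extend(rest)
            else:
                lines.append(f"{pad}- {json.dumps(item)}")
    return lines
```

Calling "\n".join(to_yaml(build_rule(labels, ip, "letsencrypt"))) gives the text that would be written to the rule file.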
Systemd Service#
    [Unit]
    Description=Docker Label Sync Service
    After=docker.service
    Requires=docker.service

    [Service]
    Type=simple
    User=root
    EnvironmentFile=/etc/docker-label-sync.conf
    ExecStart=/usr/local/bin/docker-label-sync
    Restart=always
    RestartSec=5

    [Install]
    WantedBy=multi-user.target
Configuration#
    # /etc/docker-label-sync.conf
    TRAEFIK_SYNC_MANAGER=<ssh-user>@swarm-mgr-01.example.com
    TRAEFIK_SYNC_PATH=/opt/stacks/traefik/config/rules
    TRAEFIK_SYNC_NETWORK=traefik-overlay
    TRAEFIK_SYNC_PREFIX=worker-
    TRAEFIK_SYNC_CLEANUP_DELAY=15
    TRAEFIK_SYNC_SSH_KEY=/home/<ssh-user>/.ssh/id_ed25519
    TRAEFIK_SYNC_CERT_RESOLVER=letsencrypt
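Since systemd passes the EnvironmentFile entries to the process as environment variables, the script can read them with os.environ. The following is only a sketch: SyncConfig and load_config are hypothetical names, and the fallback values (other than the 15-second cleanup delay) are simply copied from the example file above as assumptions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncConfig:
    manager: str        # user@host of the Swarm manager
    ssh_key: str        # private key used for rsync over SSH
    path: str           # rules directory on the manager
    network: str        # overlay network the containers must join
    prefix: str         # filename prefix for generated rule files
    cleanup_delay: float
    cert_resolver: str  # empty string means "do not add a tls section"


def load_config(env=None):
    """Build a SyncConfig from TRAEFIK_SYNC_* variables."""
    env = os.environ if env is None else env

    def required(name):
        value = env.get(f"TRAEFIK_SYNC_{name}")
        if not value:
            raise KeyError(f"TRAEFIK_SYNC_{name} is not set")
        return value

    return SyncConfig(
        manager=required("MANAGER"),
        ssh_key=required("SSH_KEY"),
        path=env.get("TRAEFIK_SYNC_PATH", "/opt/stacks/traefik/config/rules"),
        network=env.get("TRAEFIK_SYNC_NETWORK", "traefik-overlay"),
        prefix=env.get("TRAEFIK_SYNC_PREFIX", "worker-"),
        cleanup_delay=float(env.get("TRAEFIK_SYNC_CLEANUP_DELAY", "15")),
        cert_resolver=env.get("TRAEFIK_SYNC_CERT_RESOLVER", ""),
    )
```

Failing fast on a missing manager or key means a misconfigured worker shows up immediately in journalctl instead of silently never syncing.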
Traefik Configuration#
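The Swarm manager runs Traefik with both providers. The file provider reads /etc/traefik/rules inside the container, so the host directory the workers sync to (/opt/stacks/traefik/config/rules) has to be mounted there: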
    # traefik command arguments
    command:
      # Swarm provider for stack-deployed services
      - "--providers.swarm"
      - "--providers.swarm.exposedbydefault=false"
      - "--providers.swarm.network=traefik-overlay"
      # File provider for worker container rules
      - "--providers.file.directory=/etc/traefik/rules"
      - "--providers.file.watch=true"
Cleanup Strategy#
When a container stops, the sync service:
- Waits for a configurable delay (default: 15 seconds)
- Removes the rule file from the manager
- Traefik automatically removes the route
The delay prevents flapping during container restarts or quick redeploys.
    Container Stop Event
              │
              ▼
    ┌───────────────────┐
    │  Start 15s timer  │
    └─────────┬─────────┘
              │
              ▼
      Container restarts
      within 15 seconds?
              │
        ┌─────┴─────┐
        │           │
       Yes          No
        │           │
        ▼           ▼
    ┌─────────┐ ┌─────────────┐
    │ Cancel  │ │ Remove rule │
    │ timer   │ │ from manager│
    └─────────┘ └─────────────┘
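The delayed cleanup can be sketched with one threading.Timer per container name. CleanupScheduler is a hypothetical name and this is not necessarily how the script does it. The idea is that a stop or die event calls schedule(), a start event calls cancel(), and remove_rule is whatever function deletes the rule file on the manager.

```python
import threading


class CleanupScheduler:
    """Delay rule removal so a quick restart can cancel it."""

    def __init__(self, delay, remove_rule):
        self._delay = delay
        self._remove_rule = remove_rule  # callable taking a container name
        self._timers = {}
        self._lock = threading.Lock()

    def schedule(self, name):
        """Call on stop/die: (re)start the timer for this container."""
        with self._lock:
            self._cancel_locked(name)
            timer = threading.Timer(self._delay, self._fire, args=(name,))
            timer.daemon = True
            self._timers[name] = timer
            timer.start()

    def cancel(self, name):
        """Call on start: the container came back, so keep its rule."""
        with self._lock:
            self._cancel_locked(name)

    def _cancel_locked(self, name):
        timer = self._timers.pop(name, None)
        if timer is not None:
            timer.cancel()

    def _fire(self, name):
        with self._lock:
            # This runs inside the Timer thread. If the registered timer is no
            # longer this one, it was cancelled or replaced at the last moment.
            if self._timers.get(name) is not threading.current_thread():
                return
            del self._timers[name]
        self._remove_rule(name)
```

Because stop and die both fire for the same container, schedule() replaces any pending timer instead of stacking a second one.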
Network Requirements#
Overlay Network#
The overlay network must be:
- Created on the Swarm manager
- Marked as attachable: true
- Used by both Traefik and worker containers
    # On manager
    docker network create --driver overlay --attachable traefik-overlay
SSH Access#
Workers need SSH access to the manager for rsync:
- Uses a dedicated service user (ssh-user)
- Only needs write access to the rules directory
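For illustration, the sync and removal steps could boil down to two commands built like this. The function names are hypothetical, and BatchMode=yes is my own addition so that a missing key fails instead of prompting for a password. The resulting lists would be passed to subprocess.run(..., check=True).

```python
import shlex


def rsync_command(local_file, manager, remote_dir, ssh_key):
    """Build the argv that copies one rule file into the manager's rules directory."""
    ssh = f"ssh -i {shlex.quote(ssh_key)} -o BatchMode=yes"
    target = f"{manager}:{remote_dir.rstrip('/')}/"
    return ["rsync", "-a", "-e", ssh, local_file, target]


def remove_command(filename, manager, remote_dir, ssh_key):
    """Build the argv that deletes one rule file on the manager."""
    remote_path = f"{remote_dir.rstrip('/')}/{filename}"
    # The remote side runs this through a shell, so quote the path.
    remote_cmd = f"rm -f -- {shlex.quote(remote_path)}"
    return ["ssh", "-i", ssh_key, "-o", "BatchMode=yes", manager, remote_cmd]
```

Building the argument lists separately from running them keeps the SSH details in one place and makes them easy to log when a sync fails.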
Advantages#
| Aspect | Benefit |
|---|---|
| Simplicity | No complex service mesh required |
| Compatibility | Works with standard docker-compose |
| Real-time | Docker events API provides instant updates |
| Reliability | rsync over SSH is robust and well-tested |
| Security | SSH keys, no exposed Docker sockets |
| Flexibility | Works with any deployment tool (Komodo, Portainer, CLI) |
Limitations#
- Requires SSH connectivity between workers and manager
- Additional service running on each worker
- Slight delay compared to native Traefik discovery (~1-2 seconds)
- Rule files accumulate if cleanup fails (manual cleanup may be needed)
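For the last point, a small check can at least tell you which files are left over. This is a sketch with a hypothetical function name. It assumes the worker-{hostname}-{container}.yml naming from the script above, and that no worker's hostname is a prefix of another's (for example wrk and wrk-01), since the match is done on the filename prefix.

```python
def stale_rule_files(rule_files, running_containers, hostname, prefix="worker-"):
    """Return this worker's rule files that match no running container."""
    own_prefix = f"{prefix}{hostname}-"
    expected = {f"{own_prefix}{name}.yml" for name in running_containers}
    return sorted(
        name
        for name in rule_files
        if name.startswith(own_prefix) and name not in expected
    )
```

Feeding it the directory listing from the manager and the names of the containers currently running on the worker gives the candidates for manual removal.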
Monitoring#
Check service status:
    systemctl status docker-label-sync
    journalctl -u docker-label-sync -f
Verify rules on manager:
    ls -la /opt/stacks/traefik/config/rules/
    cat /opt/stacks/traefik/config/rules/worker-*.yml
Conclusion#
Docker Label Sync bridges the gap between Traefik v3’s Swarm-only discovery and the need for docker-compose deployments on worker nodes. It’s a pragmatic solution that:
- Maintains the familiar label-based configuration
- Works transparently with existing workflows
- Requires minimal infrastructure changes
- Provides real-time route updates
The approach trades a small amount of complexity (one additional service per worker) for significant operational flexibility.