Expands the Containers feature with two new ways to see and manage Docker
containers without exposing the Docker Engine TCP socket, plus the docs and
roadmap entries that frame them.

Docker over SSH (management):
- Runs the `docker` CLI on a remote SSH host instead of talking to the Engine
  TCP API, reusing the existing SSH transport (jump-host chaining, host-key
  verification, key/password auth) via connectTarget + execCommand. No dockerd
  socket has to be exposed; the mesh + SSH auth are the gate.
- backend/src/ssh/docker.ts: list/logs/start/stop/restart/pause/unpause/remove
  and an interactive `docker exec` shell builder. Container refs are validated
  against a strict allowlist and single-quoted to prevent command injection;
  action verbs are whitelisted.
- backend/src/routes/dockerSsh.ts: REST routes mirroring the TCP Docker API
  shape (mutating actions gated by adminOnly) + a /api/docker-ssh/exec
  WebSocket modeled on the terminal PTY plumbing.
- Note: the SSH path uses the ssh2 key/password auth; it does not implement the
  OpenSSH-certificate (OPKSSH) fallback that the terminal route has.

Docker push-agent monitoring (self-hosted, read-only):
- A small bash agent (agent/archnest-docker-agent.sh) runs on each Docker VM,
  collects a rich snapshot (docker ps + inspect + a stats snapshot), masks
  secret-looking env values locally, and POSTs it to ArchNest. VMs need
  outbound-only mesh access: no exposed port, no SSH for monitoring.
- backend/src/routes/agents.ts: token-gated ingest
  (POST /api/agents/docker/report, ARCHNEST_AGENT_TOKEN, constant-time compare;
  503 when unset, so it is disabled by default) plus user-auth read endpoints
  (hosts list with staleness flag, per-host containers, single-container
  detail). New docker_agent_reports table (latest report per host).
- Ingest stores data only; it never executes anything from the agent.

Containers page:
- Host selector now spans Docker API, SSH, and Agent sources.
- Intra-page tabs: a Containers list plus dynamic, closeable per-container
  detail tabs opened by clicking a container name. Agent detail shows
  overview/state/stats/ports/networks/mounts/env(masked)/labels; docker/ssh
  degrade gracefully. Agent rows are read-only; docker/ssh keep management.

Docs/roadmap:
- docs/docker-agent-monitoring.md (design doc, written before implementation).
- ROADMAP.md: LXC management (paid), Docker monitoring agent tiering (push
  self-hosted now / pull-agent paid), terminal grid tiering.

Deferred (documented, not built here): the mesh-prerequisite setup gate, the
paid pull-agent (Option 2), per-host tokens, time-series metrics.

Requires ARCHNEST_AGENT_TOKEN in the backend env to enable agent ingest.

Verified: backend `tsc --noEmit` and frontend `tsc -b && vite build` both pass;
agent jq filters, byte conversion, and `bash -n` checked locally.

Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>

125 lines
3.4 KiB
Markdown
# ArchNest Docker monitoring agent

A small push agent that reports this host's Docker containers to ArchNest. See
the design in [`docs/docker-agent-monitoring.md`](../docs/docker-agent-monitoring.md).

It is **monitoring only**: it pushes data outbound to ArchNest and never
receives or runs commands. Container management stays on ArchNest's
Docker-over-SSH / Docker API paths.

## Requirements

`bash`, `docker`, `curl`, `jq`. Install `jq` if missing:

```bash
# Debian/Ubuntu
sudo apt-get install -y jq
# RHEL/Alma/Rocky
sudo dnf install -y jq
# Alpine
sudo apk add jq
```

The user running the agent must be able to run `docker` (as a member of the
`docker` group, or as root).

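Before installing, it can help to confirm that all four required tools are on the `PATH` of the user that will run the agent. The sketch below is illustrative only: `missing_commands` is a hypothetical helper and is not part of the agent script.

```bash
# missing_commands (hypothetical helper): print the name of every listed
# command that cannot be found on PATH, one per line.
missing_commands() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Check the four tools this README lists as requirements.
missing="$(missing_commands bash docker curl jq)"
if [ -n "$missing" ]; then
  echo "missing required tools:" $missing
else
  echo "all required tools found"
fi
```

Finding `docker` on the `PATH` does not prove the user may talk to the daemon; running `docker ps` as that user is a simple way to confirm access.
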
## Install

1. Copy the script onto the VM and make it executable:

```bash
sudo install -m 0755 archnest-docker-agent.sh /usr/local/bin/archnest-docker-agent
```

2. Create the config file (keep it root-only, since it holds the token):

```bash
sudo mkdir -p /etc/archnest
sudo tee /etc/archnest/agent.env >/dev/null <<'EOF'
ARCHNEST_URL=http://<archnest-mesh-ip>:4000
ARCHNEST_AGENT_TOKEN=<the shared token, same as the backend>
ARCHNEST_HOST_ID=proxmox-vm-1
# ARCHNEST_HOSTNAME=docker01 # optional; defaults to `hostname`
EOF
sudo chmod 600 /etc/archnest/agent.env
```

`ARCHNEST_URL` must point at the ArchNest backend over your **mesh / private
network**, never a public address: the ingest endpoint is protected only by
the shared token at the application layer.

3. Run it once to verify:

```bash
sudo archnest-docker-agent
# -> "reported N container(s) as 'proxmox-vm-1' (HTTP 200)"
```

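The rule in step 2 that `ARCHNEST_URL` should never be a public address can be sanity-checked before the first run. This is a minimal sketch under stated assumptions: `url_host` and `is_private_ipv4` are hypothetical helpers, the URL contains a literal IPv4 address (no hostname, IPv6, or userinfo), and "private" means the RFC 1918 ranges plus the 100.64.0.0/10 shared range that some mesh VPNs use. Your mesh may use other ranges.

```bash
# url_host (hypothetical helper): strip the scheme, then anything from the
# first ':' or '/' onward, leaving only the host part of a simple URL.
url_host() {
  local rest="${1#*://}"
  printf '%s\n' "${rest%%[:/]*}"
}

# is_private_ipv4 (hypothetical helper): succeed only for dotted-quad
# addresses in 10/8, 172.16/12, 192.168/16 or 100.64/10. Only the first two
# octets are inspected; octet ranges are not otherwise validated.
is_private_ipv4() {
  local ip="$1" a b rest
  case "$ip" in
    ''|*[!0-9.]*|*..*|.*|*.) return 1 ;;  # not digits-and-dots, or empty field
    *.*.*.*.*) return 1 ;;                # more than four fields
    *.*.*.*) ;;                           # exactly four fields
    *) return 1 ;;                        # fewer than four fields
  esac
  a="${ip%%.*}"; rest="${ip#*.}"; b="${rest%%.*}"
  case "$a" in
    10)  return 0 ;;
    172) [ "$b" -ge 16 ] && [ "$b" -le 31 ] ;;
    192) [ "$b" -eq 168 ] ;;
    100) [ "$b" -ge 64 ] && [ "$b" -le 127 ] ;;
    *)   return 1 ;;
  esac
}

url="http://10.0.0.5:4000"   # example value only, not a real deployment
if is_private_ipv4 "$(url_host "$url")"; then
  echo "ok: $url looks like a private address"
else
  echo "warning: $url does not look like a private IPv4 address"
fi
```

A check like this only catches obvious mistakes; firewall rules on the ArchNest host remain the real control over who can reach the ingest endpoint.
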
## Schedule it (pick one)

The report interval should be **shorter than the backend's stale window**
(`ARCHNEST_AGENT_STALE_MS`, default 90000 ms, i.e. 90s). 30s is a good default.

### Option A: cron (every minute; simplest)

```cron
* * * * * root /usr/local/bin/archnest-docker-agent >/dev/null 2>&1
```

(This entry includes a user field, so it belongs in a system crontab such as a
file under `/etc/cron.d/`, not in a per-user `crontab -e`. cron's finest
granularity is 1 minute; raise `ARCHNEST_AGENT_STALE_MS` to e.g. 150000 on the
backend if you use a 1-minute cron.)

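The interval guidance above comes down to simple arithmetic: how many report intervals fit inside the stale window. The sketch below uses a hypothetical helper, `check_interval`, and the "tight" threshold is an assumption read from this README's own recommendations (30s with 90000 ms, 60s with 150000 ms), not a rule taken from the backend.

```bash
# check_interval (hypothetical helper): classify a report interval (seconds)
# against a stale window (milliseconds) by counting whole intervals that fit.
check_interval() {
  local interval_s="$1" stale_ms="$2" fits
  fits=$(( stale_ms / (interval_s * 1000) ))
  if [ "$fits" -lt 1 ]; then
    echo "too slow"   # interval is longer than the stale window
  elif [ "$fits" -lt 2 ]; then
    echo "tight"      # a single late or missed report can mark the host stale
  else
    echo "ok"         # at least two intervals fit in the window
  fi
}

check_interval 30 90000    # systemd timer at 30s, default window
check_interval 60 90000    # 1-minute cron, default window
check_interval 60 150000   # 1-minute cron, raised window
```

Under these assumptions the three calls print `ok`, `tight`, and `ok`, which matches the advice to raise the window when using a 1-minute cron.
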
### Option B: systemd service + timer (recommended; supports 30s)

`/etc/systemd/system/archnest-docker-agent.service`:

```ini
[Unit]
Description=ArchNest Docker monitoring agent
After=docker.service
Wants=docker.service

[Service]
Type=oneshot
EnvironmentFile=/etc/archnest/agent.env
ExecStart=/usr/local/bin/archnest-docker-agent
```

`/etc/systemd/system/archnest-docker-agent.timer`:

```ini
[Unit]
Description=Run ArchNest Docker monitoring agent every 30s

[Timer]
OnBootSec=30
OnUnitActiveSec=30
AccuracySec=5s

[Install]
WantedBy=timers.target
```

Enable:

```bash
sudo systemctl daemon-reload
sudo systemctl enable --now archnest-docker-agent.timer
sudo systemctl list-timers archnest-docker-agent.timer # confirm scheduling
journalctl -u archnest-docker-agent.service -n 20 # see last run output
```

## Backend configuration

The backend must have `ARCHNEST_AGENT_TOKEN` set (the same value as the agent).
If it is unset, the ingest endpoint is disabled and returns HTTP 503. Optional:
`ARCHNEST_AGENT_STALE_MS` (default 90000, i.e. 90s) controls when a host is
shown as stale.

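This README does not say how the shared token should be created. One possible approach, assuming the backend accepts any opaque string, is to generate 32 random bytes and hex-encode them; `generate_token` below is a hypothetical helper, not something shipped with ArchNest.

```bash
# generate_token (hypothetical helper): read 32 bytes from the kernel's
# random source and print them as 64 lowercase hex characters.
generate_token() {
  head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n'
}

token="$(generate_token)"
# Use the same line in the backend env and in /etc/archnest/agent.env.
echo "ARCHNEST_AGENT_TOKEN=$token"
```

Generate the value once and copy it to both sides; running the helper separately on the backend and on the VM would produce two different tokens.
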
## Security notes

- The token is a credential: treat `/etc/archnest/agent.env` as sensitive
  (`chmod 600`, root-owned).
- The agent masks env var values whose key matches
  `PASS|SECRET|TOKEN|KEY|PRIVATE|CREDENTIAL` before sending; the full values
  never leave the VM.
- Expose the ArchNest ingest endpoint on the mesh only, not the public internet.
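
To illustrate the masking rule in the notes above, here is a pure-shell sketch of the idea. It is not the agent's actual code (the agent is described as using `jq` filters); `mask_env_line` is a hypothetical helper, the `***` replacement is an assumed placeholder, and the match here is case-sensitive, which may differ from the real agent.

```bash
# mask_env_line (hypothetical helper): given one KEY=VALUE line, replace the
# value with *** when the key contains a secret-looking word.
mask_env_line() {
  local line="$1" key
  case "$line" in
    *=*) ;;                              # has a key and a value
    *) printf '%s\n' "$line"; return ;;  # no '=', pass through unchanged
  esac
  key="${line%%=*}"                      # everything before the first '='
  case "$key" in
    *PASS*|*SECRET*|*TOKEN*|*KEY*|*PRIVATE*|*CREDENTIAL*)
      printf '%s=%s\n' "$key" '***' ;;
    *)
      printf '%s\n' "$line" ;;
  esac
}

# Example: only the secret-looking entry is masked.
printf '%s\n' 'PATH=/usr/bin' 'DB_PASSWORD=hunter2' |
  while IFS= read -r line; do mask_env_line "$line"; done
```

Key-name matching is a heuristic: a secret stored under a neutral name such as `DATABASE_URL` would pass through unmasked, so sensitive values are best kept out of container env vars where possible.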