Keep SSH terminal sessions connected across page navigation (#30)

The Terminal page held all session state (xterm instances and their
WebSockets) in component-local React state. Because it renders as a
`<Route element={<Terminal />}>`, navigating away unmounted it and ran
the xterm cleanup (`term.dispose()` + `ws.close()`), tearing down every
SSH session. Returning to the page reconnected from scratch, losing
scrollback and any running work.

Lift terminal sessions into a `TerminalSessionProvider` mounted above the
router (in `main.tsx`, inside `AuthProvider`). The provider owns each
pane's xterm instance, fit addon, WebSocket, and a persistent wrapper DOM
node. Wrappers live in a hidden container at the app root; the Terminal
page re-parents them into its grid on mount and moves them back to the
hidden root on unmount instead of disposing — so the xterm + WebSocket
keep running in the background across route changes.

Disconnect semantics: closing a tab/pane (or shrinking the 1/2/4 grid)
destroys those sessions; logout tears down all sessions. A full browser
reload still drops connections (the WebSocket dies with the page) — this
persists across in-app navigation only.

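
The lifecycle described above can be sketched without React or the DOM. This is a minimal illustration, not the actual `TerminalSessionProvider`: `TerminalSessionRegistry`, `TermLike`, and `SocketLike` are hypothetical names standing in for the provider, the xterm instance, and the WebSocket. The point it shows is that detaching (page unmount) never disposes anything, while closing a pane or logging out does.

```typescript
// Hypothetical stand-ins for the xterm instance and the SSH WebSocket.
interface TermLike { dispose(): void; }
interface SocketLike { close(): void; }

interface PaneSession {
  term: TermLike;
  socket: SocketLike;
  attached: boolean; // true while the Terminal page is showing this pane
}

class TerminalSessionRegistry {
  private sessions = new Map<string, PaneSession>();

  // Register a new pane; it starts parked (not shown on any page).
  open(id: string, term: TermLike, socket: SocketLike): void {
    this.sessions.set(id, { term, socket, attached: false });
  }

  // Terminal page mounted: the real app would re-parent the wrapper node here.
  attach(id: string): void { this.setAttached(id, true); }

  // Terminal page unmounted: park the pane, but keep term and socket alive.
  detach(id: string): void { this.setAttached(id, false); }

  // Tab/pane closed or grid shrunk: the only per-pane teardown path.
  destroy(id: string): void {
    const session = this.sessions.get(id);
    if (!session) return;
    session.term.dispose();
    session.socket.close();
    this.sessions.delete(id);
  }

  // Logout: tear down every session.
  destroyAll(): void {
    for (const id of Array.from(this.sessions.keys())) this.destroy(id);
  }

  isAttached(id: string): boolean {
    return this.sessions.get(id)?.attached ?? false;
  }

  get size(): number { return this.sessions.size; }

  private setAttached(id: string, attached: boolean): void {
    const session = this.sessions.get(id);
    if (session) session.attached = attached;
  }
}
```
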

Shared terminal constants/types/prefs are split into a non-component
module (`src/lib/terminalPrefs.ts`) so the context file stays a clean
component module.

Also document the terminal window grid-view tiering in ROADMAP.md
(self-hosted = 4-window cap, current; paid = as many as fit on screen,
planned for the AWS deployment), and realign HANDOFF/README/design-docs
to reflect that auth Phase 3 (multi-user) shipped and Phase 4 (SSO) is
deferred to a paid AWS add-on.

Verified with a clean `tsc -b && vite build` (frontend) and
`tsc --noEmit -p .` (backend).

Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
2026-06-20 15:02:50 -04:00

# ArchNest — Roadmap

Forward-looking work that is **planned but not currently being built**. For
the state of shipped work and the active task, see `HANDOFF.md`. For
historical feature build-out, see `TERMIX_MIGRATION.md`.

---

## Shipped (for context)

The auth roadmap so far — full detail in `HANDOFF.md`:

- **Phase 1** — User menu Profile/Appearance/Security wired up; `?tab=`
  deep-linking in Settings.
- **Phase 2** — Password change, server-tracked sessions, login audit log.
- **Phase 3** — Multi-user accounts: admin/member roles, `active` flag,
  10-seat cap, admin-only user management, `requireAdmin`/`adminOnly` gating.

---

## Phase 4 — Authentik SSO (OIDC) — PAID ADD-ON (AWS deployment)

**Status:** deferred. This is intentionally **not** part of the
self-hosted core build. It is planned as a **paid add-on, shipped when
ArchNest is deployed on AWS** — not on the current `racknerd1` deployment.

Local username/password auth (Phases 1-3) remains the free, always-available
path and the admin recovery path; SSO layers on top of it rather than
replacing it.

### Intended scope (when built)
- Instance-level SSO config (issuer URL, client ID/secret, redirect URI) —
  likely an integration-like settings entry, or a dedicated config
  table / env vars.
- `GET /api/auth/sso/login` → redirect to Authentik.
- `GET /api/auth/sso/callback` → exchange code, look up/create local user by
  SSO subject claim (respecting the 10-user cap from Phase 3), issue the same
  JWT format as today.
- "Sign in with SSO" button on `Login.tsx` alongside username/password
  (local accounts remain — do **not** remove password auth entirely).

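
The look-up-or-create step of the callback can be sketched as a pure function. This is only an illustration under stated assumptions: `resolveSsoUser`, `LocalUser`, and `ssoSubject` are hypothetical names, the sketch assumes auto-provisioning (which is still an open question), and it assumes the seat cap counts every local account, active or not. Code exchange and JWT issuance are left out.

```typescript
type Role = "admin" | "member";

// Hypothetical shape of a local account; only the fields the sketch needs.
interface LocalUser {
  id: number;
  username: string;
  role: Role;
  active: boolean;
  ssoSubject?: string; // OIDC `sub` claim once linked
}

const SEAT_CAP = 10; // the Phase 3 seat cap

type SsoResult =
  | { ok: true; user: LocalUser; created: boolean }
  | { ok: false; reason: "seat_cap_reached" | "account_inactive" };

function resolveSsoUser(
  users: LocalUser[],
  subject: string,
  preferredUsername: string,
): SsoResult {
  // 1. Known subject: reuse the linked local account.
  const existing = users.find((u) => u.ssoSubject === subject);
  if (existing) {
    return existing.active
      ? { ok: true, user: existing, created: false }
      : { ok: false, reason: "account_inactive" };
  }
  // 2. Unknown subject: refuse to provision past the seat cap.
  if (users.length >= SEAT_CAP) {
    return { ok: false, reason: "seat_cap_reached" };
  }
  // 3. Provision a new local member (assumed default role).
  const nextId = users.reduce((max, u) => Math.max(max, u.id), 0) + 1;
  const user: LocalUser = {
    id: nextId,
    username: preferredUsername,
    role: "member",
    active: true,
    ssoSubject: subject,
  };
  users.push(user);
  return { ok: true, user, created: true };
}
```

If the pre-create-and-link option is chosen instead, step 3 would become a rejection and an admin would set `ssoSubject` ahead of time.
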

### Open scope questions (decide before any code)
1. **Where does SSO config live?** env vars (simplest, redeploy to change) vs.
   a dedicated config table vs. an integration-like settings entry (editable
   in-UI, more work).
2. **First-login provisioning** — auto-create a local `member` for an
   unknown-but-valid SSO user (subject to the 10-seat cap), or require an
   admin to pre-create the account and only *link* it on SSO login?
3. **Role mapping** — do Authentik groups/claims map to admin/member, or do
   all SSO users default to `member` with roles managed locally?

---

## Terminal — window grid view (tiered: self-hosted vs. paid)

**Status:** self-hosted behavior is current; the paid tier is planned.

The Terminal page (`src/pages/Terminal.tsx`) supports a split-pane grid view
within a tab.

- **Self-hosted (current):** capped at a **4-window grid** (1 / 2 / 4 pane
  layouts via the toolbar buttons). This is the free, always-available tier.
- **Paid (planned, AWS deployment):** **as many windows as fit on the
  screen** — dynamic grid sizing beyond the 4-pane cap, laid out responsively
  to the viewport rather than a fixed 1/2/4 choice.

When the paid tier is built, the 4-pane cap becomes a licensing/feature gate
rather than a hard UI limit; the grid layout logic generalizes to an
arbitrary pane count.

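
One way the generalized layout could look is sketched here; this is not the logic in `Terminal.tsx`. All names (`clampPaneCount`, `gridShape`, `maxPanesForViewport`) and the minimum pane size are assumptions, and the sketch assumes the 2-pane layout is side by side. The cap is applied as a gate in front of a layout function that accepts any pane count.

```typescript
interface GridShape { cols: number; rows: number; }

const FREE_TIER_MAX_PANES = 4; // current self-hosted cap

// Feature gate: the cap is a tier check, not a limit baked into the layout.
function clampPaneCount(requested: number, paidTier: boolean): number {
  const n = Math.max(1, Math.floor(requested));
  return paidTier ? n : Math.min(n, FREE_TIER_MAX_PANES);
}

// Near-square grid for any pane count: 1 -> 1x1, 2 -> 2x1, 4 -> 2x2, 6 -> 3x2.
function gridShape(paneCount: number): GridShape {
  const cols = Math.ceil(Math.sqrt(paneCount));
  const rows = Math.ceil(paneCount / cols);
  return { cols, rows };
}

// "As many as fit on screen": how many panes of a minimum size the viewport holds.
// minPaneWidth / minPaneHeight are hypothetical tuning values.
function maxPanesForViewport(
  width: number,
  height: number,
  minPaneWidth = 480,
  minPaneHeight = 270,
): number {
  const cols = Math.max(1, Math.floor(width / minPaneWidth));
  const rows = Math.max(1, Math.floor(height / minPaneHeight));
  return cols * rows;
}
```
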

---

Add Docker-over-SSH management and push-agent monitoring (#31)

Expands the Containers feature with two new ways to see and manage Docker
containers without exposing the Docker Engine TCP socket, plus the docs and
roadmap entries that frame them.

Docker over SSH (management):
- Runs the `docker` CLI on a remote SSH host instead of talking to the Engine
  TCP API, reusing the existing SSH transport (jump-host chaining, host-key
  verification, key/password auth) via connectTarget + execCommand. No dockerd
  socket has to be exposed — the mesh + SSH auth are the gate.
- backend/src/ssh/docker.ts: list/logs/start/stop/restart/pause/unpause/remove
  and an interactive `docker exec` shell builder. Container refs are validated
  against a strict allowlist and single-quoted to prevent command injection;
  action verbs are whitelisted.
- backend/src/routes/dockerSsh.ts: REST routes mirroring the TCP Docker API
  shape (mutating actions gated by adminOnly) + a /api/docker-ssh/exec
  WebSocket modeled on the terminal PTY plumbing.
- Note: the SSH path uses the ssh2 key/password auth; it does not implement the
  OpenSSH-certificate (OPKSSH) fallback that the terminal route has.

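
The injection guard described in the list above (allowlisted refs, single-quoting, whitelisted verbs) can be sketched as follows. This is an illustration, not the contents of `backend/src/ssh/docker.ts`: the function names are hypothetical and the pattern is assumed from Docker's own container-name rule, so the real allowlist may be stricter.

```typescript
// Whitelisted action verbs mapped to the docker subcommand they run.
const DOCKER_SUBCOMMANDS: Record<string, string> = {
  start: "start",
  stop: "stop",
  restart: "restart",
  pause: "pause",
  unpause: "unpause",
  remove: "rm",
};

// Assumed allowlist: Docker container names and hex IDs both match this.
const CONTAINER_REF = /^[a-zA-Z0-9][a-zA-Z0-9_.-]{0,127}$/;

// POSIX single-quoting: wrap in quotes and escape embedded quotes as '\''.
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

function buildDockerCommand(action: string, ref: string): string {
  // hasOwnProperty avoids matching inherited keys such as "constructor".
  if (!Object.prototype.hasOwnProperty.call(DOCKER_SUBCOMMANDS, action)) {
    throw new Error("unsupported action: " + action);
  }
  if (!CONTAINER_REF.test(ref)) {
    throw new Error("invalid container reference");
  }
  return "docker " + DOCKER_SUBCOMMANDS[action] + " " + shellQuote(ref);
}
```

Quoting is defence in depth here: a ref that passes the allowlist contains no shell metacharacters, so the quotes only matter if the pattern is ever loosened.
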

Docker push-agent monitoring (self-hosted, read-only):
- A small bash agent (agent/archnest-docker-agent.sh) runs on each Docker VM,
  collects a rich snapshot (docker ps + inspect + a stats snapshot), masks
  secret-looking env values locally, and POSTs it to ArchNest. VMs need
  outbound-only mesh access — no exposed port, no SSH for monitoring.
- backend/src/routes/agents.ts: token-gated ingest
  (POST /api/agents/docker/report, ARCHNEST_AGENT_TOKEN, constant-time compare;
  503 when unset, so it is disabled by default) plus user-auth read endpoints
  (hosts list with staleness flag, per-host containers, single-container
  detail). New docker_agent_reports table (latest report per host).
- Ingest stores data only; it never executes anything from the agent.

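
The token gate on the ingest route can be sketched as a small decision function. This is not the code in `backend/src/routes/agents.ts`: `authorizeIngest` and `safeEqual` are hypothetical names, the 401 for a bad token is an assumption (the text above only specifies the 503), and how the token is presented is left to the caller. In a Node backend, `crypto.timingSafeEqual` over equal-length buffers would be the usual choice for the comparison; the loop below is a dependency-free stand-in.

```typescript
// Comparison that always walks the full length instead of stopping at the
// first mismatch, so timing does not reveal how many characters matched.
function safeEqual(a: string, b: string): boolean {
  const len = Math.max(a.length, b.length);
  let diff = a.length ^ b.length;
  for (let i = 0; i < len; i++) {
    // charCodeAt past the end yields NaN, which `|| 0` turns into 0.
    diff |= (a.charCodeAt(i) || 0) ^ (b.charCodeAt(i) || 0);
  }
  return diff === 0;
}

// Returns the HTTP status the ingest route would answer with.
function authorizeIngest(
  configuredToken: string | undefined, // e.g. process.env.ARCHNEST_AGENT_TOKEN
  presentedToken: string | undefined,
): number {
  if (!configuredToken) return 503; // unset token: ingest disabled by default
  if (!presentedToken || !safeEqual(configuredToken, presentedToken)) {
    return 401; // assumed status for a missing or wrong token
  }
  return 200;
}
```
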

Containers page:
- Host selector now spans Docker API, SSH, and Agent sources.
- Intra-page tabs: a Containers list plus dynamic, closeable per-container
  detail tabs opened by clicking a container name. Agent detail shows
  overview/state/stats/ports/networks/mounts/env(masked)/labels; docker/ssh
  degrade gracefully. Agent rows are read-only; docker/ssh keep management.

Docs/roadmap:
- docs/docker-agent-monitoring.md (design doc, written before implementation).
- ROADMAP.md: LXC management (paid), Docker monitoring agent tiering
  (push self-hosted now / pull-agent paid), terminal grid tiering.

Deferred (documented, not built here): the mesh-prerequisite setup gate, the
paid pull-agent (Option 2), per-host tokens, time-series metrics.

Requires ARCHNEST_AGENT_TOKEN in the backend env to enable agent ingest.

Verified: backend `tsc --noEmit` and frontend `tsc -b && vite build` both pass;
agent jq filters, byte conversion, and `bash -n` checked locally.

Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
2026-06-20 16:24:57 -04:00

## LXC container management (Proxmox) — PAID ADD-ON

**Status:** not built; planned as a paid-tier feature.

ArchNest currently has full **Docker** container management (the Containers
page: list/start/stop/restart/pause/remove, logs, interactive exec — backed
by `backend/src/routes/docker.ts` + `backend/src/docker/`). There is **no LXC
equivalent**.

The only place LXC could surface today is the Proxmox integration's
`listResources()` (`backend/src/integrations/proxmox.ts`), and it currently
queries `/api2/json/cluster/resources?type=vm` — i.e. **QEMU VMs only**, so
Proxmox LXC containers (`type=lxc`) are not even listed.

Planned scope (paid tier):
- **List** LXC guests alongside VMs (drop/relax the `type=vm` filter, or also
  fetch `type=lxc`, and label them in the resource grid).
- **Lifecycle** management via Proxmox's per-node LXC API
  (`POST /api2/json/nodes/{node}/lxc/{vmid}/status/{start|stop|shutdown}`) —
  a new route group + `api.ts` entries + UI, mirroring the Docker Containers
  page.
- **Console/shell** into an LXC guest via the Proxmox console/ticket API
  (more involved than Docker exec — separate auth/ticket flow).

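
The lifecycle call from the list above can be sketched as a request builder that validates its inputs before any path is assembled. `lxcStatusRequest` and the node-name pattern are hypothetical; only the path shape comes from the scope item, and authentication and the HTTP call itself are omitted.

```typescript
const LXC_ACTIONS = ["start", "stop", "shutdown"] as const;
type LxcAction = (typeof LXC_ACTIONS)[number];

interface LxcRequest {
  method: "POST";
  path: string;
}

function lxcStatusRequest(node: string, vmid: number, action: string): LxcRequest {
  // Assumed node-name pattern: hostname-like, so it cannot alter the path.
  if (!/^[a-zA-Z0-9][a-zA-Z0-9-]*$/.test(node)) {
    throw new Error("invalid node name");
  }
  if (!Number.isInteger(vmid) || vmid <= 0) {
    throw new Error("invalid vmid");
  }
  // Runtime check as well as the type: route input arrives as plain strings.
  if ((LXC_ACTIONS as readonly string[]).indexOf(action) === -1) {
    throw new Error("unsupported action: " + action);
  }
  const verb = action as LxcAction;
  return {
    method: "POST",
    path: `/api2/json/nodes/${node}/lxc/${vmid}/status/${verb}`,
  };
}
```
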

Note: the read-only "list LXC in the resource grid" piece is small and
arguably a bug fix (the Proxmox integration silently hides half a cluster's
guests today); if the user later wants just that part in the free tier, it
can be split out from this paid add-on.

---

## Docker monitoring agent — tiered (push self-hosted / pull paid)

ArchNest can manage Docker containers two ways today: the Docker Engine TCP
integration (`backend/src/docker/`) and "Docker over SSH" (runs the `docker`
CLI on a remote SSH host — `backend/src/ssh/docker.ts`,
`backend/src/routes/dockerSsh.ts`). Both are **pull** models where ArchNest
reaches into the host.

A complementary **agent** model is planned, split across tiers:

### Self-hosted — Option 1: push agent (monitoring) — IN PROGRESS
- A lightweight script dropped on each Docker VM (bash + `docker` CLI + curl)
  collects `docker ps` (+ optional per-container stats) and **POSTs** a JSON
  report to an ArchNest ingest endpoint on a timer (cron/systemd).
- VMs need **outbound-only** access to ArchNest over the mesh — no exposed
  port, no SSH, no dockerd socket. Cleanest security story for the free tier.
- ArchNest stores the latest report per host and surfaces it as a read-only
  monitoring view / Infrastructure resource source.
- **Monitoring only** — a one-way push cannot perform actions. Management on
  self-hosted continues to use the existing **Docker-over-SSH** path on
  demand, so nothing is removed: push = constant monitoring (zero exposure),
  SSH = occasional management action.

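
The "latest report per host" storage model can be sketched in memory. This is an illustration only: `LatestReportStore`, the `AgentReport` shape, and the five-minute staleness threshold are assumptions, and a real backend would keep the rows in a database table rather than a `Map`.

```typescript
// Hypothetical, trimmed-down report shape.
interface AgentReport {
  host: string;
  receivedAt: number; // epoch milliseconds, stamped by the server on ingest
  containers: string[];
}

interface HostSummary {
  host: string;
  containerCount: number;
  stale: boolean;
}

const STALE_AFTER_MS = 5 * 60 * 1000; // assumed threshold

class LatestReportStore {
  private byHost = new Map<string, AgentReport>();

  // Ingest only stores data: a newer report overwrites the previous one.
  ingest(report: AgentReport): void {
    this.byHost.set(report.host, report);
  }

  // Read-only view with a staleness flag for hosts that stopped reporting.
  hosts(now: number): HostSummary[] {
    const out: HostSummary[] = [];
    this.byHost.forEach((report) => {
      out.push({
        host: report.host,
        containerCount: report.containers.length,
        stale: now - report.receivedAt > STALE_AFTER_MS,
      });
    });
    return out;
  }
}
```
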

### Paid — Option 2: pull agent with local API (monitor + manage)
- A small **authenticated HTTP service** runs on each VM, bound to its mesh
  IP, exposing a thin, locked-down wrapper over the Docker socket
  (`/containers`, `/logs`, lifecycle actions, exec).
- ArchNest **pulls** on demand — supports both monitoring and management
  through one uniform mechanism, with real per-agent auth (which the raw
  dockerd TCP socket lacks).
- Tradeoff: exposes a (locked-down, authenticated) port on each VM, and is a
  service to run/secure — hence gated to the paid tier.

---
## Remote Desktop — GNOME & KDE support — ADD-ON

**Status:** not built; **XFCE is confirmed working today**, the others are not.
Full investigation + research lives in `docs/rdp-debug-handoff.md`.

Remote Desktop (Guacamole/`guacd` → RDP) works end-to-end **only with XFCE** on
the target VM, via xrdp's X11 backend. GNOME and KDE do **not** work yet:

- **GNOME (Wayland-only on modern distros)** is blocked two ways: it ships no
  Xorg session for xrdp to launch, *and* its native `gnome-remote-desktop`
  mandates NLA that guacd's bundled **FreeRDP 2.x cannot complete** ("wrong
  security type"). Verified on Fedora 44 / GNOME 50.
- **KDE Plasma 6** is expected to work like XFCE via xrdp + `startplasma-x11`
  (X11 session shipped through ~early 2027), but is **not yet installed/tested**.

Planned scope (add-on):
1. **Custom `guacd` image built against FreeRDP 3** (Apache's official 1.5.5 and
   1.6.0 images both still ship FreeRDP 2.11.x). This is the real unlock for
   GNOME's native Wayland RDP and benefits any modern GNOME / Ubuntu-24.04+
   target — not just this VM. ~30-min from-source build to maintain in
   `docker-compose.yml`.
2. **GNOME headless "system" RDP** (GDM handover, GNOME 46+) — the intended
   modern path; only viable once (1) lands because it still uses NLA.
3. **KDE Plasma** via xrdp + `startplasma-x11` (quick win, no guacd change;
   likely needs a KWin compositing/software-GL tweak on virtual GPUs).
4. **Per-host desktop/session selection** instead of a single global
   `/etc/xrdp/startwm.sh`, so one VM can offer XFCE / KDE / GNOME.

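
Item 4 could be modelled as a lookup from a per-host setting to the X11 session launcher that xrdp should start. This is a sketch under assumptions: `sessionCommand` and the `hostDesktops` map are hypothetical, `startxfce4` is assumed to be the XFCE launcher in use, and GNOME is rejected on this path because, as noted above, it has no Xorg session for xrdp to launch.

```typescript
type Desktop = "xfce" | "kde" | "gnome";

// X11 session launchers usable through xrdp; GNOME is deliberately absent.
const X11_SESSION_COMMANDS: Partial<Record<Desktop, string>> = {
  xfce: "startxfce4",
  kde: "startplasma-x11",
};

// hostDesktops is a hypothetical per-host setting, e.g. stored with the host record.
function sessionCommand(
  hostDesktops: Record<string, Desktop>,
  host: string,
  fallback: Desktop = "xfce",
): string {
  const known = Object.prototype.hasOwnProperty.call(hostDesktops, host);
  const desktop = known ? hostDesktops[host] : fallback;
  const command = X11_SESSION_COMMANDS[desktop];
  if (!command) {
    throw new Error(desktop + " has no X11 session for xrdp; it needs the native RDP path");
  }
  return command;
}
```
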

See `docs/rdp-debug-handoff.md` for the suggested order of work and primary-source
references (SUSE headless-GNOME series, jamesnorth GRD setup, RHEL 10 docs, KDE
discuss threads).

---
## Per-integration node tabs — PAID ADD-ON

**Status:** not built; planned as a paid-tier feature.

Node Status on the Infrastructure page collapses every integration (except
Proxmox) into a **single tile per integration** — e.g. 30 EC2 instances under
one "AWS" tile, all of Uptime Kuma's monitors under one "Uptime" tile — with
the individual members only visible in the Node Detail card after selecting
that tile (`ungroupedIntegrationTypes` in `src/pages/Infrastructure.tsx`).
This keeps the grid usable when an integration has dozens/hundreds of
resources, but it means there's currently no way to see *all* nodes of a
given integration laid out at once.

Planned scope (paid tier): a dedicated **tab per integration** (alongside
today's Overview/Network etc. sub-tabs) that lists every node belonging to
that integration — full grid, not just the grouped summary tile — for users
who want to browse/filter dozens of EC2 instances, Docker containers, or
Uptime Kuma monitors directly rather than drilling through Node Detail.

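
The difference between today's summary tiles and the planned per-integration tab comes down to two views over the same resource list. The sketch below uses hypothetical names (`Resource`, `summaryTiles`, `nodesForTab`) and a simple case-insensitive name filter; it is not taken from `Infrastructure.tsx`.

```typescript
// Hypothetical, trimmed-down resource shape.
interface Resource {
  id: string;
  name: string;
  integration: string; // e.g. "aws", "uptime"
}

interface Tile {
  integration: string;
  count: number;
}

// Current behavior: one tile per integration, members hidden behind it.
function summaryTiles(resources: Resource[]): Tile[] {
  const counts = new Map<string, number>();
  for (const r of resources) {
    counts.set(r.integration, (counts.get(r.integration) ?? 0) + 1);
  }
  const tiles: Tile[] = [];
  counts.forEach((count, integration) => tiles.push({ integration, count }));
  return tiles;
}

// Planned behavior: every node of one integration, with an optional name filter.
function nodesForTab(resources: Resource[], integration: string, filter = ""): Resource[] {
  const needle = filter.toLowerCase();
  return resources.filter(
    (r) => r.integration === integration && r.name.toLowerCase().indexOf(needle) !== -1,
  );
}
```
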

---

## Known non-blocking stubs (cosmetic, not scheduled)

Not flagged as work to do unless explicitly asked:

- `Infrastructure.tsx`'s "Network" sub-tab is **intentionally** disabled
  (`title="Coming soon"`) — leave alone unless explicitly asked.
- `Settings.tsx`'s Appearance section (theme/accent/fontSize/radius/
  sidebarExpanded/animations) is local-state-only — doesn't persist or apply
  anywhere. Recommended fix if picked up: mirror the Terminal page's
  `localStorage`-backed prefs pattern and apply via CSS variables on `:root`.
- `Settings.tsx`'s Notifications section (email/push/sound toggles) has no
  backing delivery mechanism — recommend removing or clearly labeling as
  not-yet-functional rather than persisting settings that do nothing.

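
The recommended fix for the Appearance section (persist prefs, then apply them as CSS variables) can be sketched as follows. Everything named here is an assumption: the storage key, the trimmed `AppearancePrefs` shape, the default values, and the variable names. The store is passed in so the logic runs outside a browser as well.

```typescript
// Minimal subset of the Web Storage interface that the sketch relies on.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical, trimmed-down prefs shape.
interface AppearancePrefs {
  theme: "dark" | "light";
  accent: string;
  fontSize: number; // px
  radius: number; // px
}

const STORAGE_KEY = "archnest.appearance"; // assumed key
const DEFAULT_PREFS: AppearancePrefs = {
  theme: "dark",
  accent: "#4f8cff",
  fontSize: 14,
  radius: 8,
};

// Load saved prefs, falling back to defaults for missing or corrupt data.
function loadPrefs(store: KeyValueStore): AppearancePrefs {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) return { ...DEFAULT_PREFS };
  try {
    const parsed = JSON.parse(raw) as Partial<AppearancePrefs>;
    return { ...DEFAULT_PREFS, ...parsed };
  } catch {
    return { ...DEFAULT_PREFS };
  }
}

function savePrefs(store: KeyValueStore, prefs: AppearancePrefs): void {
  store.setItem(STORAGE_KEY, JSON.stringify(prefs));
}

// Map prefs to CSS custom properties (variable names are assumptions).
function toCssVariables(prefs: AppearancePrefs): Record<string, string> {
  return {
    "--accent": prefs.accent,
    "--font-size": prefs.fontSize + "px",
    "--radius": prefs.radius + "px",
  };
}
```

In the browser, `window.localStorage` satisfies `KeyValueStore`, and each entry from `toCssVariables` can be applied with `document.documentElement.style.setProperty(name, value)`, which sets it on `:root`.
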