Termix → ArchNest Migration Plan
Status doc for porting Termix's full feature set into ArchNest as a single app, single backend, single auth, single database — reskinned to match ArchNest's design. Written so any session (human or AI) can see exactly what's done, what's next, and why decisions were made.
Migration status: COMPLETE. All 8 phases below are DONE and verified. No further feature work is queued from this migration. CI/CD has since moved to Forgejo Actions (build → registry.snsnetlabs.com → auto-deploy to racknerd2) — see HANDOFF.md and deploy/README.md. Do not start new feature work here without explicit instruction.
Source: https://github.com/SamuelSJames/Termix (user's fork), cloned for reference at the time of writing. Upstream is Termix-SSH/Termix, an Electron + Express + Drizzle ORM self-hosted SSH/RDP/VNC management app — not a small terminal widget. It ships as its own Docker image with a guacd sidecar for RDP/VNC.
Decision: why merge into ArchNest's backend, not Termix's
ArchNest's backend (Fastify + better-sqlite3 + JWT) is small and already has things worth keeping: the bookmarks system, the integration adapter framework (Proxmox/AWS/NetBird/Cloudflare/Weather/SSH health checks — see backend/src/integrations/), the audit log, and a working auth/profile system built this session. Termix's backend is much bigger but its value is the SSH/tunnel/file-manager/Docker/RDP feature logic, not its auth system (OIDC/LDAP/2FA) or its Drizzle schema. So: port Termix's feature modules onto ArchNest's existing Fastify app and auth, don't adopt Termix's backend wholesale.
What is explicitly NOT being ported (user-approved tradeoff)
- Electron desktop app + native installers (Chocolatey/Flatpak/AppImage/MSI/Cask) — ArchNest is a web app.
- OIDC/LDAP/2FA/SSO and Termix's own multi-user auth system — replaced by ArchNest's existing JWT auth. User confirmed they don't currently use 2FA/OIDC/LDAP, so this is an accepted downgrade, not an oversight.
- ~30 language translations (i18n) — not a stated goal, not being ported.
- All Termix branding: logos, icons, About/product copy, links to Termix's Discord/docs/GitHub. Every ported UI component gets reskinned to ArchNest's Tailwind theme (gold #C8A434, the existing dark palette) as part of porting it, not as a separate pass.
Everything else — SSH terminal, tunnels, file manager, Docker management, RDP/VNC/Telnet, host metrics — is in scope to fully port, feature-equivalent, just rebuilt on ArchNest's stack.
Phases
Each phase is independently committable and testable. Do not start a later phase before the previous one is working end-to-end and committed — this is a large port and needs to land in reviewable chunks.
Phase 1 — SSH Terminal (DONE)
The actual /terminal page: a real interactive SSH terminal in the browser (xterm.js + WebSocket), reusing the SSH credentials already stored in ArchNest's integrations (no second "add a host" flow — Termix's separate host-manager concept is being merged into ArchNest's existing integrations table/SSH adapter, not duplicated).
Termix source files this phase is based on (sizes as of the fork snapshot, for scoping):
- src/backend/ssh/terminal.ts (2,570 lines): WebSocket route handling, message protocol (connect/data/resize/disconnect), output buffering.
- src/backend/ssh/terminal-session-manager.ts (570 lines): session lifecycle, reattach-on-reconnect, per-user session caps, idle timeout, optional session logging to disk.
- src/backend/ssh/ssh-connection-pool.ts (225 lines): connection reuse.
- src/backend/ssh/host-resolver.ts, jump-host-chain.ts, terminal-jump-hosts.ts (~900 lines combined): jump-host / bastion chaining.
- src/backend/ssh/auth-manager.ts, credential-username.ts, host-key-verifier.ts, terminal-auth-helpers.ts (~950 lines combined): credential resolution, host key verification/trust-on-first-use.
- src/backend/ssh/opkssh-auth.ts, opkssh-cert-auth.ts (~1,350 lines): OPKSSH (OpenPubkey SSH) certificate auth.
- src/backend/ssh/tmux-monitor.ts, tmux-helper.ts, tmux-monitor-helpers.ts (~1,350 lines): tmux session detection/monitoring inside the terminal.
- Frontend: src/ui/features/terminal/* (xterm.js wrapper, tab system, up-to-4-panel split screen, theme/font customization).
Scope split for this phase, given the size above:
- Phase 1a (doing this now): core single-session SSH terminal. WebSocket connect/data/resize/disconnect, using ArchNest's existing SSH integration config/secrets (host/port/username/password/privateKey/passphrase, already in backend/src/integrations/ssh.ts) instead of Termix's separate host table. One terminal per tab, no split panes yet, no jump hosts, no OPKSSH, no tmux monitor, no session recording/logging. Ported onto Fastify's WebSocket support, reusing ArchNest's JWT auth for the WS handshake.
- Phase 1b (follow-up, not blocking 1a): jump-host/bastion chaining, host-key verification/trust-on-first-use UI, tab system + up to 4 split panes, terminal theme/font customization settings.
- Phase 1c (follow-up, lower priority): OPKSSH cert auth, tmux session monitor/reattach, session recording/logging to disk.
Rationale for splitting: 1a alone is a real, useful terminal (matches what /terminal needs to stop being a placeholder) and is testable end-to-end on its own. Bundling jump-hosts/OPKSSH/tmux into the first pass risks a large unreviewable change with no working checkpoint in between.
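The Phase 1a message protocol can be pictured as a small discriminated union plus a strict parser. This is only a sketch: the four message names (connect/input/resize/disconnect) match what the Status section below reports for backend/src/routes/terminal.ts, but the field names (integrationId, cols, rows, data) and the parseClientMessage helper are assumptions made for illustration, not the actual wire format.

```typescript
// Hypothetical message shapes. Only the four message names come from this doc.
type ClientMessage =
  | { type: "connect"; integrationId: number; cols: number; rows: number }
  | { type: "input"; data: string }
  | { type: "resize"; cols: number; rows: number }
  | { type: "disconnect" };

function isPositiveInt(v: unknown): v is number {
  return typeof v === "number" && Number.isInteger(v) && v > 0;
}

// Returns a typed message, or null for anything malformed or unknown.
function parseClientMessage(raw: string): ClientMessage | null {
  let msg: unknown;
  try {
    msg = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (typeof msg !== "object" || msg === null) return null;
  const { type, integrationId, cols, rows, data } = msg as Record<string, unknown>;
  switch (type) {
    case "connect":
      return isPositiveInt(integrationId) && isPositiveInt(cols) && isPositiveInt(rows)
        ? { type: "connect", integrationId, cols, rows }
        : null;
    case "input":
      return typeof data === "string" ? { type: "input", data } : null;
    case "resize":
      return isPositiveInt(cols) && isPositiveInt(rows)
        ? { type: "resize", cols, rows }
        : null;
    case "disconnect":
      return { type: "disconnect" };
    default:
      return null;
  }
}
```

Rejecting unknown or malformed messages at the boundary keeps the WebSocket handler itself a plain switch over already-validated shapes.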
Status:
- ✅ Phase 1a: done. /terminal is a real interactive SSH terminal: backend/src/routes/terminal.ts (WebSocket, connect/input/resize/disconnect over ssh2), backend/src/db/secrets.ts (shared secret loader), src/pages/Terminal.tsx (xterm.js + host picker, reuses ArchNest's existing SSH integrations, no duplicate host table). Verified end-to-end against a real test SSH server. No jump hosts, no tabs/split panes, no OPKSSH, no tmux monitor yet; see 1b/1c below.
- ✅ Phase 1b: done.
  - Jump-host chaining: an SSH integration's config can carry jumpHostIntegrationId referencing another SSH integration. backend/src/routes/terminal.ts connects to the jump host first, opens a forwardOut() channel to the real target, and connects the target Client over that channel (single-hop; mirrors Termix's core mechanism without its multi-hop/credential-sharing complexity). Verified end-to-end with two real test SSH servers (one as jump, one as target).
  - Host-key verification (TOFU): new ssh_host_keys table (backend/src/db/index.ts) stores a SHA-256 fingerprint per SSH integration on first successful connect; subsequent connects are rejected if the fingerprint changes, via ssh2's hostVerifier connect option. No interactive accept/reject-changed-key UI yet: first-use accept-and-store, hard-reject on mismatch. Verified both the accept-on-first-use and reject-on-mismatch paths against a real test server.
  - Settings UI for multiple SSH hosts: src/pages/Settings.tsx previously could only show/edit one integration per type, which silently broke multi-host SSH. Added a dedicated SshHostsSection with its own per-host cards (Save/Test/Delete) and an "Add SSH Host" flow, including a Jump Host dropdown populated from the other configured SSH hosts.
  - Tabs + up to 4 split panes: src/pages/Terminal.tsx rewritten around a TerminalPane component (one xterm + WebSocket connection each, reusable). Each tab holds 1/2/4 panes (single / split-2 / 2x2 grid); each pane connects independently to whichever SSH host is clicked while it's focused.
  - Terminal theme/font customization: a preferences bar (theme preset, font size, font family) persisted to localStorage (archnest-terminal-prefs), applied per-pane on connect.
  - Verified via a clean production build (tsc -b && vite build), and subsequently browser-verified (Playwright/Chromium, once available): logged in, opened /terminal, connected a pane to a real SSH host (confirmed by the live remote prompt uitester@vm:~$ and a "Connected" status naming the host), split into 2 and 4 panes (confirmed 1→2→4 live xterm instances rendering as a 2×2 grid), opened a new tab, and changed the theme preference, which was confirmed to persist to localStorage (archnest-terminal-prefs → {"themeName":"Matrix",...}). The original build-only caveat is now closed.
- ✅ Phase 1c: done, with one documented verification gap.
  - OPKSSH / certificate auth: ssh2 (the npm library) has no support for OpenSSH certificates; this was confirmed by inspecting its type definitions and README, where no certificate-related auth flow exists. Implemented connectWithCertificate() in backend/src/routes/terminal.ts: writes the stored private key + certificate to a temp dir (mode 0600) and shells out to the system ssh binary (which natively understands -o CertificateFile=) under a real node-pty pty. Used automatically when an SSH integration has a certificate secret configured (new field added to Settings' SSH host form). Does not support jump-host chaining (documented limitation, not silently dropped; Termix's own OPKSSH path doesn't generally chain through jump hosts either). Verified end-to-end (gap from the original pass now closed): with openssh-client/openssh-server available, built a real SSH CA, signed a user key into an OpenSSH certificate (principal certuser), configured a real sshd with TrustedUserCAKeys + PasswordAuthentication no (so only cert auth could succeed), created a real ssh-type integration carrying the private key + certificate as secrets, and drove ArchNest's actual /api/terminal WebSocket route: it reached connected, spawned the cert-auth pty, and a real shell echoed back a marker as certuser, i.e. authentication genuinely happened via the certificate, not a password or plain key.
  - tmux session monitor/reattach: new WebSocket message list_tmux execs tmux list-sessions on the target host and returns session names; connect accepts an optional tmuxSession (validated against ^[A-Za-z0-9_-]{1,64}$ before being interpolated into a shell command, to prevent injection) which attaches to that tmux session or creates it if missing, via exec('tmux attach -t <name> || tmux new-session -s <name>', { pty: ... }) instead of a plain client.shell(). src/pages/Terminal.tsx's pane header gained a tmux session picker (plain shell / new session / attach to an existing one). Verified end-to-end against a real test SSH server running real bash/tmux processes (via node-pty): listed zero sessions, created a testsess tmux session through the WS protocol, confirmed a follow-up list_tmux call returned ['testsess'].
  - Session recording/logging to disk: new SSH integration config field sessionLogging (checkbox in Settings' SSH host form). When set, all outbound terminal output (both the ssh2 path and the cert-auth pty path) is appended to <ARCHNEST_SESSION_LOG_DIR ?? './data/session-logs'>/<integrationId>_<timestamp>.log. No log browsing/download UI yet (not built; out of scope for this pass, not silently dropped). Verified end-to-end: a real shell session's output was confirmed present in its log file on disk.
  - Everything in this phase was tested against live processes (real sshd, real tmux, real cert-auth via a real SSH CA), not mocked. The Phase 1b UI (tabs/split panes/theme) remains build/type-verified only (no interactive browser click-through was done), but every backend path, including cert auth, is now exercised end-to-end. All cert-auth test artifacts (CA, signed cert, test sshd, test OS user, test backend/DB) were cleaned up afterward.
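The tmux session-name check described under Phase 1c can be sketched as below. The regular expression and the command string are the ones quoted above; the buildTmuxCommand helper name is hypothetical and is not taken from backend/src/routes/terminal.ts.

```typescript
// Pattern quoted in this doc: letters, digits, underscore and hyphen, 1 to 64 chars.
const TMUX_SESSION_RE = /^[A-Za-z0-9_-]{1,64}$/;

// Hypothetical helper: validates first, then interpolates into the shell command.
function buildTmuxCommand(session: string): string {
  if (!TMUX_SESSION_RE.test(session)) {
    throw new Error("invalid tmux session name");
  }
  // Safe to interpolate: the allowed alphabet contains no shell metacharacters.
  return `tmux attach -t ${session} || tmux new-session -s ${session}`;
}
```

Because the allow-list excludes spaces, quotes, semicolons and similar characters, a value such as `x; rm -rf /` is rejected before it ever reaches the remote shell.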
Phase 2 — SSH Tunnels (DONE)
Source: src/backend/ssh/tunnel.ts (2,414 lines) + tunnel-c2s-relay.ts, tunnel-socks5-relay.ts, tunnel-ssh-primitives.ts, tunnel-utils.ts, tunnel-c2s-relay-utils.ts (~830 lines combined) + frontend src/ui/features/tunnel/*.
Scope decision: Termix distinguishes "S2S" (server-to-server, backend-managed) and "C2S" (client-to-server, routed through Termix's desktop/Electron app) tunnels. ArchNest has no desktop client (explicitly out of scope per the top of this doc), so only the S2S model was ported — a single persistent backend process manages all tunnels, same as Termix's S2S path. C2S's WebSocket data-multiplexing-to-a-desktop-client layer was not ported; it has no equivalent need in a pure web app.
What was built:
- backend/src/ssh/connect.ts: extracted loadSshHost/baseConnectConfig/connectTarget (jump-host chaining + TOFU host-key verification) out of terminal.ts into a shared module, since tunnels need the exact same SSH-connection logic terminal sessions do.
- backend/src/tunnels/manager.ts: in-memory tunnel runtime manager (Map<tunnelId, RuntimeState>), mirroring Termix's activeTunnels/connectionStatus maps but scoped down to this app's needs. Three modes:
  - Local forward: a net.Server listens on sourcePort; each inbound connection calls client.forwardOut() to endpointHost:endpointPort and pipes the two sockets together.
  - Remote forward: client.forwardIn('0.0.0.0', sourcePort) asks the SSH server to bind that port; incoming 'tcp connection' events are piped to a local net.connect() against endpointHost:endpointPort.
  - Dynamic (SOCKS5): a net.Server listens on sourcePort running a minimal SOCKS5 handshake (backend/src/tunnels/socks5.ts, CONNECT-only, no-auth; sufficient for this use case, not a general SOCKS5 server), then forwardOut()s to whatever target the client requested per-connection.
  - Automatic reconnection: on SSH error/close or listener bind failure, schedules a retry after retryIntervalMs, up to maxRetries, then settles into an error status (mirrors Termix's retry/backoff but simplified to a fixed interval rather than exponential, which is sufficient for this scale).
  - startAutoStartTunnels() is called once at server boot to bring up any tunnel with autoStart set.
- backend/src/routes/tunnels.ts: REST CRUD (GET/POST /api/tunnels, DELETE /api/tunnels/:id) plus POST /api/tunnels/:id/connect and /disconnect. Status (stopped/connecting/connected/retrying/error + retry count + last error) is read directly off the in-memory runtime state on every GET /api/tunnels (simple polling from the frontend every 3s; no SSE/EventSource, unlike Termix, since it is not needed at this scale and keeps the implementation smaller).
- backend/src/db/index.ts: new tunnels table: id, name, integration_id, mode, source_port, endpoint_host, endpoint_port, auto_start, max_retries, retry_interval_ms, created_at. Each tunnel references an existing SSH integrations row (no separate host table, consistent with the rest of this migration). No separate "preset" concept is needed since a tunnel row already is the saved preset.
- src/pages/Tunnels.tsx: new page (/tunnels, added to the sidebar with a Waypoints icon) with a creation form (name, SSH host picker, mode, source port, endpoint host/port, auto-start) and a card grid showing each tunnel's status, mode, route, and Start/Stop/Delete actions, polling every 3 seconds.
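The fixed-interval reconnection rule described for the tunnel manager can be expressed as a pure state transition. This is a sketch under assumptions: the status names, maxRetries and retryIntervalMs come from the description above, while the RetryState shape and the onTunnelFailure function are invented here and may not match backend/src/tunnels/manager.ts.

```typescript
type TunnelStatus = "stopped" | "connecting" | "connected" | "retrying" | "error";

// Hypothetical runtime-state shape for one tunnel.
interface RetryState {
  status: TunnelStatus;
  retryCount: number;
  lastError: string | null;
}

interface RetryPolicy {
  maxRetries: number;
  retryIntervalMs: number;
}

// Decide what happens after an SSH error/close or a listener bind failure.
// retryInMs is the delay before the next attempt, or null when giving up.
function onTunnelFailure(
  state: RetryState,
  policy: RetryPolicy,
  error: string
): { next: RetryState; retryInMs: number | null } {
  if (state.retryCount >= policy.maxRetries) {
    // Retries exhausted: settle into the terminal error status.
    return {
      next: { status: "error", retryCount: state.retryCount, lastError: error },
      retryInMs: null,
    };
  }
  // Fixed interval, no exponential backoff.
  return {
    next: { status: "retrying", retryCount: state.retryCount + 1, lastError: error },
    retryInMs: policy.retryIntervalMs,
  };
}
```

A caller would pass retryInMs to a timer when it is non-null. Keeping the decision pure makes the retry count and final error status straightforward to unit test without a live SSH connection.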
Verified end-to-end against a real test SSH server (extending the same real-ssh2-Server + node-pty pattern used in Phase 1c) that genuinely handles tcpip (forwardOut) and tcpip-forward/cancel-tcpip-forward (forwardIn) requests, plus a real upstream TCP echo server: created one tunnel of each mode (local/remote/dynamic), connected all three, and confirmed real data flowed through each — local forward and remote forward both delivered the upstream server's banner through the tunnel, and the dynamic tunnel completed a real SOCKS5 CONNECT handshake and relayed data. Also verified disconnect correctly tears down the local listener (ECONNREFUSED after stopping). All test artifacts (test SSH server, test backend instance, test DB, tokens) were cleaned up afterward.
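For readers unfamiliar with the SOCKS5 CONNECT request exercised by the dynamic-tunnel test, the request layout from RFC 1928 (VER, CMD, RSV, ATYP, DST.ADDR, DST.PORT) can be parsed roughly as follows. This is an illustrative sketch, not the code in backend/src/tunnels/socks5.ts: the function name is hypothetical, and IPv6 targets are deliberately left out here.

```typescript
interface Socks5Target {
  host: string;
  port: number;
}

// Parses a SOCKS5 request and accepts only CMD = CONNECT (0x01).
// Returns null for anything malformed, truncated, or unsupported.
function parseSocks5Connect(buf: Uint8Array): Socks5Target | null {
  if (buf.length < 5) return null;
  if (buf[0] !== 0x05 || buf[1] !== 0x01 || buf[2] !== 0x00) return null;
  const atyp = buf[3];
  let host = "";
  let offset: number;
  if (atyp === 0x01) {
    // IPv4: four address bytes follow.
    if (buf.length < 10) return null;
    host = `${buf[4]}.${buf[5]}.${buf[6]}.${buf[7]}`;
    offset = 8;
  } else if (atyp === 0x03) {
    // Domain name: one length byte, then that many bytes of name.
    const len = buf[4];
    if (len === 0 || buf.length < 5 + len + 2) return null;
    for (let i = 0; i < len; i++) host += String.fromCharCode(buf[5 + i]);
    offset = 5 + len;
  } else {
    return null; // IPv6 (0x04) is omitted in this sketch
  }
  // Port is two bytes, network (big-endian) order.
  const port = (buf[offset] << 8) | buf[offset + 1];
  return { host, port };
}
```

The parsed host and port are what a dynamic tunnel would hand to forwardOut() for that one connection.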
Phase 3 — Remote File Manager (DONE, with documented gaps)
Source: src/backend/ssh/file-manager*.ts (six files, ~3,900 lines combined: list/content/action/operation/download routes + session + utils) + frontend src/ui/features/file-manager/*.
Scope decisions:
- Ephemeral SFTP connections instead of Termix's pooled/long-lived sessions: each request opens a fresh SSH+SFTP connection (backend/src/ssh/sftp.ts's withSftp()), does one operation, and tears the connection down. Simpler than managing a third long-lived connection lifecycle alongside terminal and tunnel sessions, and acceptable at this app's scale.
- No sudo/permission-elevation support. Termix falls back to shell commands piped a stored sudo password when SFTP returns a permission error; not ported in this pass (no privileged remote test target available in this sandbox to verify against safely, the same category of gap as the OPKSSH cert-auth gap in Phase 1c). Documented here rather than silently dropped.
- No server-to-server transfer: this matches Termix's actual behavior (its own cross-host "transfer" is just sequential download then upload through the browser; same-host moves use shell mv/cp, which isn't ported since sudo isn't). Not a regression.
- Whole-file-in-memory model for view/edit, same as Termix: GET/PUT /api/files/:id/content reads/writes the entire file via sftp.readFile/writeFile. Files over 50MB (MAX_EDITABLE_SIZE) are rejected with a message pointing at download/upload instead. Binary detection (so binary files are shown as a "can't edit" message rather than mangled text) uses the same heuristic as Termix: scan the first 8KB for a null byte or a >1% ratio of other control bytes.
- Streaming download (GET /api/files/:id/download) for files of any size, via sftp.createReadStream() piped straight into the HTTP response rather than buffered in memory.
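The binary-detection heuristic mentioned in the scope decisions (first 8KB, a null byte, or more than 1% other control bytes) might look like the sketch below. The exact set of bytes treated as "control" is an assumption here (everything below 0x20 except tab, LF and CR); the real implementation may classify bytes differently, and the looksBinary name is hypothetical.

```typescript
const SAMPLE_BYTES = 8 * 1024; // only the first 8KB is inspected

function looksBinary(content: Uint8Array): boolean {
  const n = Math.min(content.length, SAMPLE_BYTES);
  if (n === 0) return false; // an empty file is treated as editable text
  let control = 0;
  for (let i = 0; i < n; i++) {
    const b = content[i];
    if (b === 0x00) return true; // a null byte is decisive on its own
    // Assumed classification: tab, LF and CR count as text.
    if (b < 0x20 && b !== 0x09 && b !== 0x0a && b !== 0x0d) control++;
  }
  return control / n > 0.01; // strictly more than 1% control bytes
}
```

Sampling a fixed prefix keeps the check cheap even for files close to the 50MB edit limit.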
What was built:
- backend/src/ssh/sftp.ts: withSftp(integrationId, fn) opens an ephemeral SSH+SFTP connection (reusing connect.ts's jump-host-chaining + TOFU logic from Phase 1/2), runs fn, then tears the connection down.
- backend/src/routes/files.ts: GET /api/files/:id/list, GET/PUT /api/files/:id/content, POST /api/files/:id/mkdir, POST /api/files/:id/rename, POST /api/files/:id/delete, POST /api/files/:id/chmod, GET /api/files/:id/download, POST /api/files/:id/upload (multipart, via newly-added @fastify/multipart, 1GB limit).
- src/pages/Files.tsx: new page (/files, sidebar entry with a FolderOpen icon): SSH host picker, breadcrumb-navigable directory browser, inline text editor for non-binary files, new-folder/rename/delete/chmod-via-octal-display/upload/download actions.
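As a small illustration of the chmod-via-octal display, a numeric mode (which on POSIX systems also carries file-type bits) can be reduced to its permission bits and shown as three octal digits. Both helper names are hypothetical, and restricting input to exactly three digits (ignoring setuid/setgid/sticky) is a simplifying assumption of this sketch.

```typescript
// Keep only the rwx bits for user/group/other and render them as 3-digit octal.
function modeToOctal(mode: number): string {
  return (mode & 0o777).toString(8).padStart(3, "0");
}

// Parse user input such as "644" back into a numeric mode; null if invalid.
function octalToMode(text: string): number | null {
  return /^[0-7]{3}$/.test(text) ? parseInt(text, 8) : null;
}
```

For example, a regular file reported with mode 0o100644 displays as 644, which matches what `stat -c '%a'` prints on the remote host.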
Verified end-to-end against a real filesystem-backed SFTP server built specifically for this (using ssh2's server-side low-level SFTP protocol API — genuine OPEN/READ/WRITE/READDIR/RENAME/REMOVE/MKDIR/STAT/SETSTAT handlers backed by real fs calls against a real directory on disk, not a mock). Confirmed by inspecting the actual files/permissions on disk after each operation (cat, ls, stat -c '%a'), not just the HTTP response: list, read, write, mkdir, rename, delete, chmod, upload, and download (byte-for-byte diff match against the uploaded source file) all round-tripped correctly. One real bug was caught and fixed during this verification: the download route's wrapping Promise was resolving immediately after reply.send(stream) instead of waiting for the response to actually finish, which raced Fastify into ending the HTTP response (and the route's cleanup() into closing the underlying SSH connection) before the SFTP stream had sent any data — produced a 0-byte download with a "stream closed prematurely" log line. Fixed by letting reply.send(stream)'s return value resolve the promise instead of resolving synchronously, and moving connection cleanup to the response's own finish/close events. All test artifacts (test SFTP server, test backend instance, test DB, tokens, temp files) were cleaned up afterward.
Phase 4 — Docker Container Management (DONE, with documented gaps)
Architecture decision: Termix's source (src/backend/ssh/docker.ts, docker-container-routes.ts, docker-console.ts) drives Docker over SSH+CLI. ArchNest's existing backend/src/integrations/docker.ts adapter already talks to the Docker Engine HTTP API directly via a stored baseUrl (the only config field exposed in Settings for a docker integration — no SSH credentials, no TLS client certs). Rather than bolt on a second SSH-based Docker code path, Phase 4 extends the existing Engine-API approach: all new code talks straight to dockerd's HTTP API.
What was built:
- backend/src/docker/client.ts: loadDockerHost(integrationId), dockerFetch/dockerJson thin wrappers over the Engine API, demuxDockerStream() (best-effort parser for the 8-byte-frame multiplexed stdout/stderr format used by non-TTY containers' logs/stats endpoints, falling back to raw text for TTY containers).
- backend/src/docker/exec.ts: openExecStream() opens a docker exec session and performs the raw HTTP "hijack": after POST /exec/{id}/start, the daemon switches the TCP socket to a raw bidirectional byte stream (no further HTTP framing), so the implementation connects via net/tls directly, writes the HTTP request by hand, and strips the response headers before treating the rest as raw I/O.
- backend/src/routes/docker.ts: dockerRoutes (REST: list/stats/logs/start/stop/restart/pause/unpause/remove, behind the standard app.authenticate hook) and dockerExecRoutes (websocket /api/docker/exec, auth via a token query param verified on the connect message, mirroring terminal.ts's pattern since websocket upgrades can't carry an Authorization header).
- src/pages/Containers.tsx: new page (/containers, sidebar entry with a Box icon): Docker host picker, container table (state, image, live CPU/memory from stats, ports) with start/stop/restart/pause/unpause/remove actions, a logs modal, and an exec-terminal modal reusing Terminal.tsx's xterm.js + FitAddon pattern (base64-encoded I/O over the websocket).
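The 8-byte-frame format that demuxDockerStream() handles can be illustrated as follows. In Docker's multiplexed stream, byte 0 of each header is the stream type (1 for stdout, 2 for stderr) and bytes 4 to 7 hold the payload length as a big-endian uint32. The function below is a simplified sketch with a hypothetical name: it does not implement the raw-text fallback for TTY containers, and it decodes each frame separately, so a multi-byte UTF-8 character split across frames would not decode cleanly.

```typescript
interface DemuxedOutput {
  stdout: string;
  stderr: string;
}

function demuxDockerFrames(buf: Uint8Array): DemuxedOutput {
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
  const decoder = new TextDecoder();
  const out: DemuxedOutput = { stdout: "", stderr: "" };
  let offset = 0;
  // Each frame: 8-byte header followed by `size` payload bytes.
  while (offset + 8 <= buf.length) {
    const streamType = buf[offset];
    const size = view.getUint32(offset + 4, false); // big-endian length
    const start = offset + 8;
    const end = Math.min(start + size, buf.length); // tolerate a truncated tail
    const text = decoder.decode(buf.subarray(start, end));
    if (streamType === 2) out.stderr += text;
    else out.stdout += text;
    offset = start + size;
  }
  return out;
}
```

Separating the two streams lets a logs view style stderr differently while preserving the order within each stream.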
Verified end-to-end against a real Docker daemon (dockerd) started inside the sandbox on a TCP port, with a real container built from a docker import of the host's own rootfs (no network access to a registry was available, so a minimal real image was constructed locally rather than pulled). Confirmed via real container state transitions (docker inspect) cross-checked against the API responses: list, stats, logs (including the frame-demuxed multi-line case), start/stop/restart/pause/unpause, and remove all worked correctly through the new REST routes. The exec-terminal websocket path was exercised with a real ws client driving an interactive shell inside the real container (sent echo HELLO_FROM_EXEC, got the echoed output back through the hijacked socket) and a live resize.
One real bug was caught and fixed during this verification: openExecStream() originally called POST /exec/{id}/resize immediately after creating the exec instance but before starting it — confirmed via a raw curl repro that the Docker daemon blocks that request indefinitely until the exec's process actually exists, which hung every exec session before it ever reached ready. Fixed by passing the initial terminal size via ConsoleSize in the exec-create payload instead, and only using the explicit resize endpoint for later live resizes (sent after the exec is already running, so it's safe there, and was verified working in that position).
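The shape of that fix can be sketched with two small builders: one for the exec-create body, carrying the initial size in ConsoleSize, and one for the later resize call. The field names follow the Docker Engine API (ConsoleSize is a [height, width] pair; the resize endpoint takes h and w query parameters), but the helper names and the fixed attach/TTY flags are assumptions, not what backend/src/docker/exec.ts necessarily does.

```typescript
interface ExecCreatePayload {
  AttachStdin: boolean;
  AttachStdout: boolean;
  AttachStderr: boolean;
  Tty: boolean;
  Cmd: string[];
  ConsoleSize: [number, number]; // [height, width]
}

// Body for POST /containers/{id}/exec: the initial size travels with the create call,
// so no resize request is needed before the exec has started.
function buildExecCreate(cmd: string[], rows: number, cols: number): ExecCreatePayload {
  return {
    AttachStdin: true,
    AttachStdout: true,
    AttachStderr: true,
    Tty: true,
    Cmd: cmd,
    ConsoleSize: [rows, cols],
  };
}

// Path for POST /exec/{id}/resize: only meant for live resizes once the exec is running.
function execResizePath(execId: string, rows: number, cols: number): string {
  return `/exec/${encodeURIComponent(execId)}/resize?h=${rows}&w=${cols}`;
}
```

Note the ordering difference that is easy to get wrong: ConsoleSize is height first, then width, whereas terminal libraries usually report columns first.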
Documented gap: no browser is available in this sandbox, so Containers.tsx was verified by type-checking and a production vite build, and by manually exercising every backend endpoint it calls against the real daemon above — but it has not been clicked through in an actual browser. All test artifacts (test dockerd instance, test image/container, test backend instance, test DB, tokens, temp files) were cleaned up afterward.
Phase 5 — RDP/VNC/Telnet (DONE)
Architecture decision: Termix's own approach (new GuacamoleLite({ server }, ...)) attaches an unfiltered 'upgrade' listener to the whole HTTP server, which would have collided with @fastify/websocket's existing routes (/api/terminal, /api/docker/exec). Instead, guacamole-lite's lower-level ClientConnection/Crypt classes (imported directly from their CJS lib files, typed via a small ambient .d.ts) are driven from inside our own Fastify {websocket: true} route, on a socket Fastify has already upgraded — no interaction with the HTTP server's 'upgrade' event at all. guacd itself remains a required sidecar process (a real guacd binary, available via apt), but is not wired into a docker-compose.yml yet — see gap below.
What was built:
- backend/src/integrations/types.ts, registry.ts, and routes/integrations.ts: new remote_desktop integration type (config: protocol/hostname/port/username/domain, secret: password).
- backend/src/integrations/remoteDesktop.ts: testConnection() does a raw TCP probe of the configured port (distinct from the real Guacamole-protocol tunnel below).
- backend/src/routes/guacamole.ts: /api/guacamole websocket route: authenticates the token query param via app.jwt.verify (same pattern as terminal.ts/docker.ts, since websocket upgrades can't carry an Authorization header), loads the remote_desktop integration's config + decrypted secrets, server-side constructs and encrypts a Guacamole connection token via Crypt, then instantiates ClientConnection directly on the open socket and calls .connect({ host, port }) against guacd (configurable via ARCHNEST_GUACD_HOST/ARCHNEST_GUACD_PORT, default 127.0.0.1:4822). New env var ARCHNEST_GUAC_CRYPT_KEY (32-byte AES-256-CBC key) added to .env.example.
- src/pages/RemoteDesktop.tsx: new page (/remote-desktop, sidebar entry with a MonitorSmartphone icon): host picker + a guacamole-common-js Guacamole.Client/Guacamole.WebSocketTunnel canvas viewer. Note: Guacamole.WebSocketTunnel appends its own "?" + data query string inside connect(), so the tunnel URL passed to its constructor must be bare, with token/integrationId passed as the string argument to client.connect(...) instead. This was caught and fixed during browser verification (see below).
- src/pages/Settings.tsx: generic integration card extended with a remote_desktop entry (protocol/hostname/port/username/domain/password fields).
Verified end-to-end against real, locally-installed infrastructure (no mocking): a real guacd (v1.3.0, installed via apt) and a real Xtightvnc/vncserver desktop. A raw ws client test first confirmed the tunnel itself — JWT auth, integration lookup, token encryption, and the guacd handshake — by observing real Guacamole-protocol size/img instructions come back over the websocket. Then the actual RemoteDesktop.tsx page was exercised in a real headless Chromium (Playwright) against a real running Vite dev server + backend: logged in, navigated to /remote-desktop, selected the configured VNC host, and confirmed the UI reaches Connected state with a live VNC framebuffer (cursor visible) rendered on canvas — not just a build/typecheck pass.
One real bug was caught and fixed during this browser verification: the page initially called client.connect() with no arguments while the tunnel URL already had token=...&integrationId=... appended, producing a malformed ...&integrationId=1?undefined URL and an ECONNREFUSED-style failure. Root cause (confirmed by reading Guacamole.WebSocketTunnel's source): it always appends its own "?" + data itself. Fixed by passing a bare tunnel URL and moving the query data into the client.connect(data) call.
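A minimal sketch of the corrected call shape, assuming a hypothetical buildTunnelArgs helper and a made-up host in the usage note: the URL given to Guacamole.WebSocketTunnel stays bare, and the query data travels through client.connect(data), which the tunnel appends after a "?" on its own.

```typescript
interface TunnelArgs {
  url: string;  // bare tunnel URL, no query string
  data: string; // passed to client.connect(data)
}

function buildTunnelArgs(wsBase: string, token: string, integrationId: number): TunnelArgs {
  if (wsBase.includes("?")) {
    // Guard against reintroducing the double-query-string bug.
    throw new Error("tunnel URL must not contain a query string");
  }
  const data =
    `token=${encodeURIComponent(token)}` +
    `&integrationId=${encodeURIComponent(String(integrationId))}`;
  return { url: wsBase, data };
}

// Mirrors what the tunnel does internally: tunnelURL + "?" + data.
function effectiveUrl(args: TunnelArgs): string {
  return args.url + "?" + args.data;
}
```

In the page this would be used roughly as `new Guacamole.WebSocketTunnel(args.url)` followed by `client.connect(args.data)`, with something like `ws://localhost:3000/api/guacamole` as the base URL (the host and port here are illustrative only).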
Documented gaps:
- Telnet and RDP were not verified (now done): with the apt mirror cooperating on a later attempt, both paths were verified end-to-end through the exact same /api/guacamole route. Telnet: ran a real inetutils-telnetd (bridged to a listening port via socat), created a remote_desktop/telnet integration, and drove the websocket; guacd logged "Telnet connection successful" and returned real Guacamole instructions (4.size,...). RDP: ran a real xrdp server (after installing the libguac-client-rdp0 plugin guacd needs), created a remote_desktop/rdp integration, and confirmed guacd negotiated the connection and returned a 4.size,1.0,4.1024,3.768 display surface. All three protocols (VNC from the original pass, plus telnet and RDP now) are confirmed against the identical code path. All test artifacts (guacd, telnetd/socat, xrdp, test user, test backend/DB) were cleaned up afterward.
- guacd is not yet added to a docker-compose.yml (now done): docker-compose.yml gained a guacd service (guacamole/guacd:1.5.5, no published port; only the backend reaches it on the compose network), the backend service now sets ARCHNEST_GUACD_HOST=guacd / ARCHNEST_GUACD_PORT=4822 + ARCHNEST_GUAC_CRYPT_KEY and depends_on: [guacd], and backend/.env.example documents the ARCHNEST_GUACD_* vars for local dev. Verified the compose file parses cleanly via docker compose config (the Docker daemon isn't running in this sandbox, so an actual up was not performed).
- All test artifacts (test guacd/vncserver processes, test backend instance, test DB, tokens, temp files, Playwright scripts) were cleaned up afterward.
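The instruction strings quoted in the verification notes (for example 4.size,1.0,4.1024,3.768) use the Guacamole protocol's length-prefixed element format: each element is LENGTH.VALUE, elements are separated by commas, and an instruction ends with a semicolon. A small parser, written here purely for illustration (the function name is hypothetical, and lengths are counted in JavaScript string units, which is only safe for ASCII content), makes the structure explicit.

```typescript
// Parses one instruction into its elements, e.g. ["size", "0", "1024", "768"].
// Accepts the trailing ";" or end of input as the terminator; returns null if malformed.
function parseGuacInstruction(text: string): string[] | null {
  const elements: string[] = [];
  let pos = 0;
  while (pos < text.length) {
    const dot = text.indexOf(".", pos);
    if (dot === -1) return null;
    const lenText = text.slice(pos, dot);
    if (!/^\d+$/.test(lenText)) return null;
    const start = dot + 1;
    const end = start + parseInt(lenText, 10);
    if (end > text.length) return null; // declared length runs past the input
    elements.push(text.slice(start, end));
    if (end === text.length) return elements; // unterminated, as quoted in this doc
    const sep = text.charAt(end);
    if (sep === ";") return elements;
    if (sep !== ",") return null;
    pos = end + 1;
  }
  return null;
}
```

Read this way, the RDP result above is a size instruction for layer 0 with a 1024 by 768 surface.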
Phase 6 — Host Metrics Widgets (DONE, with documented gaps)
Architecture decision: Termix's host-metrics.ts route (2,584 lines) is tightly coupled to its own Drizzle schema, multi-user auth, SOCKS5/jump-host chaining, TOTP-gated metrics sessions, and a metrics cache/backoff/request-queue layer — none of that scaffolding was ported. The actual reusable value is the 10 widgets/*-collector.ts files: small, near-backend-agnostic functions that take a raw ssh2.Client, run a few shell commands, and return null-tolerant typed metrics. Those collectors were reimplemented against ArchNest's own ssh2 connection objects (reusing loadSshHost/connectTarget from Phase 1/2, not Termix's pool/cache/session substrate). Delivery is simple on-demand REST + 5s client-side polling — the same low-tech approach Phase 2 used for tunnel status — rather than Termix's own caching/backoff system. This was built as a new standalone page (/host-metrics) rather than folded into Infrastructure.tsx: the existing Infrastructure page is a fleet-wide overview (one row per resource), while these widgets are a deep per-host live view, closer in spirit to Terminal.tsx/RemoteDesktop.tsx's "pick a host, see one rich view" pattern. The existing backend/src/integrations/ssh.ts listResources probe (disk/mem/load percentages for the Infrastructure overview) is left as-is and unrelated — it answers "is this host healthy at a glance," not "show me everything about this host."
What was built:
- backend/src/ssh/metrics/common.ts: shared execCommand() (exec + timeout + cleanup) and small numeric helpers, ported from Termix's widgets/common-utils.ts.
- backend/src/ssh/metrics/{cpu,memory,disk,uptime,network,system,processes,ports,firewall,login-stats}.ts: 10 collectors ported from Termix's widgets/*-collector.ts, each independently null-safe. ports.ts only implements the ss-based path (Termix also had a netstat fallback parser, dropped as redundant on any modern target).
- backend/src/ssh/metrics/index.ts: collectHostMetrics() aggregator.
- backend/src/routes/metrics.ts: GET /api/integrations/:id/metrics, authenticated, connects via connectTarget (transparent jump-host support inherited for free) and runs the aggregator.
- src/pages/HostMetrics.tsx: new page (/host-metrics, sidebar entry with a Gauge icon): SSH host picker + CPU/memory/disk gauges, uptime/system card, network interfaces, listening ports, top processes table, firewall summary, login activity summary. Polls every 5s while a host is selected.
- src/lib/api.ts: getHostMetrics() + HostMetrics type.
Verified end-to-end against a real, locally-installed sshd (not mocked): installed openssh-server, created a real test user, ran a real ArchNest backend + SQLite DB, created a real ssh-type integration, and hit GET /api/integrations/:id/metrics over a real SSH connection. CPU, memory, disk, uptime, system, and processes all returned real, correct data from the live container (verified CPU% against /proc/stat math, memory/disk against free/df, process list against a parallel manual ps aux).
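For reference, the /proc/stat cross-check mentioned above can be sketched as follows. This is a minimal illustration of one common way to derive CPU% from two samples of the aggregate cpu line, not necessarily the formula the ported collector uses; parseCpuLine and cpuPercent are hypothetical names introduced here.

```typescript
// Hypothetical helpers for illustration; not the names used in the ArchNest collectors.

// Parse the aggregate "cpu" line of /proc/stat into idle and total jiffies.
function parseCpuLine(stat: string): { idle: number; total: number } | null {
  const line = stat.split("\n").find((l) => l.startsWith("cpu "));
  if (!line) return null;
  // Fields after the label: user nice system idle iowait irq softirq steal.
  // The guest fields are skipped because guest time is already counted in user/nice.
  const fields = line.trim().split(/\s+/).slice(1, 9).map(Number);
  if (fields.length < 4 || fields.some((n) => Number.isNaN(n))) return null;
  const idle = fields[3] + (fields[4] ?? 0); // idle + iowait
  const total = fields.reduce((sum, n) => sum + n, 0);
  return { idle, total };
}

// CPU busy percentage between two /proc/stat samples, rounded to one decimal.
// Returns null (rather than throwing) when either sample is unusable.
function cpuPercent(before: string, after: string): number | null {
  const a = parseCpuLine(before);
  const b = parseCpuLine(after);
  if (!a || !b) return null;
  const totalDelta = b.total - a.total;
  if (totalDelta <= 0) return null;
  const busy = totalDelta - (b.idle - a.idle);
  return Math.round((busy / totalDelta) * 1000) / 10;
}
```

Returning null on bad input mirrors the null-tolerant style described for the collectors, so a missing or malformed sample degrades one widget instead of failing the whole response.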
One real bug was caught and fixed: the first version ran all 10 collectors via Promise.all, which opens 15-20 concurrent SSH exec channels — this silently exceeded OpenSSH's default MaxSessions 10 and starved whichever collectors lost the race (network/processes/ports/firewall/loginStats came back empty while cpu/memory/disk/uptime/system succeeded). Fixed by running collectors sequentially in collectHostMetrics() — acceptable since this is on-demand polling, not a latency-critical path.
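The sequential fix can be sketched roughly as below. This is an assumption-laden illustration, not the actual collectHostMetrics() source: Collector and runSequentially are hypothetical names, and real collectors would take an ssh2 client rather than being zero-argument functions.

```typescript
// Hypothetical shape: a collector is any async function producing a metrics value.
type Collector<T> = () => Promise<T>;

// Run collectors one at a time so only one collector is in flight at any moment,
// instead of fanning them all out at once with Promise.all.
async function runSequentially<T>(
  collectors: Record<string, Collector<T>>,
): Promise<Record<string, T | null>> {
  const results: Record<string, T | null> = {};
  for (const [name, collect] of Object.entries(collectors)) {
    try {
      results[name] = await collect();
    } catch {
      // A failing collector yields null for its own key and does not abort the rest.
      results[name] = null;
    }
  }
  return results;
}
```

The trade-off is latency: total time becomes the sum of the collectors rather than the slowest one, which the text above accepts because the endpoint is polled on demand.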
Follow-up verification (gaps from the first pass now closed): with iproute2 installed and a test sshd configured for root login, the three previously-unverified collectors were re-run against a real host over the real API and all returned correct data:
- network → eth0 with its real IP (192.0.2.2/24) and state UP.
- ports → source: "ss", 6 listening ports, with real process names and PIDs (sshd, etc.).
- firewall → after adding two iptables rules (--dport 22/--dport 80 -j ACCEPT) and connecting as root, type: "iptables", status: "active", and the INPUT chain parsed back the two rules correctly.
The frontend was also browser-verified (Playwright/Chromium, now available): logged in, opened /host-metrics, selected the host, and confirmed all widgets render with real live data (CPU/memory/disk gauges, uptime, the eth0 interface, listening ports with process names, the top-processes table, the iptables firewall summary with 2 rules, and login activity) — see screenshot evidence captured during the run.
Remaining documented gap:
- loginStats returned empty because the test host's wtmp had no real login history and /var/log/auth.log/secure weren't populated; last/grep both ran successfully, just had nothing to report. This is data-availability, not a code defect; unverified against a host with real login history.

All test artifacts (test sshd process, test OS users, test iptables rules, test backend instance, test DB, tokens, temp files) were cleaned up afterward.
Phase 7 — Host-to-Host File Transfer (DONE)
Architecture decision: Termix's host-transfer.ts (3,428 lines, plus transfer-paths.ts/transfer-routing.ts) is a heavily over-engineered system — parallel-segment workers, a tar-vs-per-file-SFTP method selector driven by incompressibility heuristics, hung-stream watchdogs, retry orchestration, worker caches, archive-method previews. Per the same stance taken in every prior phase, only the core value was ported: streaming a file/directory from one SSH host to another through the backend (read from the source's SFTP, write to the destination's SFTP, item by item). This is exactly the item_sftp path Termix itself falls back to in most cases; the parallel/tar/watchdog machinery is left behind as unjustified at this app's scale. Reuses ArchNest's existing connectTarget SSH helper (jump-host support inherited for free on both ends), not Termix's connection pool/session manager. Delivery mirrors Phase 2/6: an in-memory transfer registry + REST polling, no websockets.
What was built:
- backend/src/ssh/transfer.ts: the transfer engine. startTransfer() returns a transferId and runs asynchronously: opens an SFTP connection to both hosts, scans the source tree up front (depth-first walk) to compute totalFiles/totalBytes for a real progress bar, recreates the directory structure on the destination, then streams each file (source createReadStream → dest createWriteStream). Tracks live progress in an in-memory activeTransfers map; supports move (deletes the source tree, files-then-dirs-deepest-first, after a successful copy) and cooperative cancellation (a flag checked between files and on every read chunk). cleanupOldTransfers() drops finished entries after an hour.
- backend/src/routes/transfer.ts: POST /api/transfers (start), GET /api/transfers (list), GET /api/transfers/:id (status), POST /api/transfers/:id/cancel. All authenticated; start is zod-validated.
- src/pages/Files.tsx: added a per-entry "Send to another host" action (disabled unless ≥2 SSH hosts exist) opening a modal (destination host dropdown, destination directory, move checkbox), plus a live "Host-to-Host Transfers" panel that polls (1s while any transfer is running, 5s otherwise) and shows per-transfer progress bars, current file, status, and a cancel button.
- src/lib/api.ts: startTransfer/listTransfers/getTransfer/cancelTransfer + TransferProgress type.
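The "files-then-dirs-deepest-first" ordering used by move can be illustrated with a small pure function. This is a sketch under assumptions: Entry and deletionOrder are hypothetical names, and the real engine issues SFTP calls against the scanned tree rather than returning a list of paths.

```typescript
// Hypothetical entry shape produced by a source-tree scan.
interface Entry {
  path: string;
  isDirectory: boolean;
}

// Number of path segments, used to sort directories deepest-first.
function depth(p: string): number {
  return p.split("/").filter(Boolean).length;
}

// All files first, then directories from deepest to shallowest, so every
// directory is already empty by the time it is removed.
function deletionOrder(entries: Entry[]): string[] {
  const files = entries.filter((e) => !e.isDirectory).map((e) => e.path);
  const dirs = entries
    .filter((e) => e.isDirectory)
    .map((e) => e.path)
    .sort((a, b) => depth(b) - depth(a));
  return [...files, ...dirs];
}
```

Ordering the deletes this way avoids needing a recursive remove on the source: a plain directory removal generally fails on a non-empty directory, so children have to go first.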
Verified end-to-end against two real SSH endpoints (a real sshd with two real OS users as source/dest, not mocked): created two real ssh-type integrations and exercised all four behaviours over the real API:
- Recursive directory copy of a tree (text file + a 100 KB random binary + a nested subdir): completed 3/3 files / 100,019 bytes; verified on disk that the directory structure was recreated, text content was intact, and the binary's md5sum matched the source exactly.
- Move: a single file transferred with move:true was confirmed present on the destination and deleted from the source afterward.
- Error handling: a transfer of a nonexistent source path ended with status: "failed" and a clear "No such file" error rather than hanging.
- Cancellation: an 80 MB transfer cancelled ~0.3 s in stopped at 162 KB with status: "cancelled", confirming the mid-stream cancel flag actually interrupts the copy.
The frontend transfer UI was also browser-verified (Playwright/Chromium): logged in, opened the Files page, switched to a source SSH host, navigated into a directory, clicked the per-row "Send to another host" action, picked the destination host + directory in the modal, and confirmed the live "Host-to-Host Transfers" panel rendered the transfer and reached a full completed progress bar — then verified on the destination host's disk that the file actually landed with correct content.
All test artifacts (test sshd, both test OS users + their home dirs, test backend instance, test DB, temp files) were cleaned up afterward.
Phase 8 — Data Export / Import (DONE)
Architecture decision: a single-file JSON backup/restore of the user's configuration — all integrations (with their credentials), bookmark categories + bookmarks, and tunnels. Secrets are exported decrypted on purpose: that makes a backup portable to a different ArchNest instance whose ARCHNEST_SECRET_KEY differs (an encrypted export would be useless after a key change / on a fresh install). The export is only ever served to an authenticated user — the same person who can already read those secrets via the integrations they own — and the UI labels it as containing plaintext credentials. Import is additive (insert-as-new, never destructive), with old→new id remapping so tunnels and bookmarks keep pointing at their correct newly-created parents, all wrapped in a single SQLite transaction.
What was built:
- backend/src/routes/data.ts: GET /api/data/export (serializes integrations + decrypted secrets, bookmark categories, bookmarks, tunnels with a version field) and POST /api/data/import (zod-validated, transactional, additive, with integrationIdMap/categoryIdMap remapping; tunnels referencing an integration absent from the import are skipped rather than orphaned).
- src/lib/api.ts: exportData()/importData() + DataExport type.
- src/pages/Settings.tsx: wired the previously-placeholder "Data & Backup" section to the real endpoints: Export downloads archnest-backup-<date>.json; Import reads a chosen file and POSTs it, with success/error feedback. (Replaced the old mock "Export Bookmarks"/"Clear Cache"/"Reset" buttons.)
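The old→new id remapping on import can be sketched in memory as follows. This is only an illustration of the idea: the record shapes, the remapImport name, and the nextId counter are assumptions made here; the real route gets its new ids from SQLite inside a transaction and also remaps bookmark categories.

```typescript
// Hypothetical, trimmed-down record shapes for illustration.
interface ExportedIntegration {
  id: number;
  name: string;
}
interface ExportedTunnel {
  id: number;
  integrationId: number;
  name: string;
}

// Insert-as-new: every imported row gets a fresh id, and tunnels are rewritten
// to point at the new id of their parent integration.
function remapImport(
  integrations: ExportedIntegration[],
  tunnels: ExportedTunnel[],
  nextId: number, // stand-in for database-assigned ids
) {
  const integrationIdMap = new Map<number, number>();
  const newIntegrations = integrations.map((i) => {
    const id = nextId++;
    integrationIdMap.set(i.id, id);
    return { ...i, id };
  });

  const newTunnels: ExportedTunnel[] = [];
  let skipped = 0;
  for (const t of tunnels) {
    const mapped = integrationIdMap.get(t.integrationId);
    if (mapped === undefined) {
      skipped++; // parent not in this backup: skip rather than create an orphan
      continue;
    }
    newTunnels.push({ ...t, id: nextId++, integrationId: mapped });
  }
  return { newIntegrations, newTunnels, skipped };
}
```

Because nothing existing is updated or deleted, importing the same backup twice simply creates a second set of rows, which matches the additive behaviour described above.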
Verified end-to-end against a real backend (not mocked): seeded an instance with an SSH integration (password + passphrase secrets), a bookmark category + bookmark, and a tunnel; then:
- Export returned version: 1 with the secrets correctly decrypted to plaintext and all four entity types present.
- Additive import into the same instance doubled every count, and the new tunnel's integrationId pointed at the newly-created integration (id remapping confirmed, not the stale original id).
- Cross-instance portability: imported the backup into a second backend started with a completely different ARCHNEST_SECRET_KEY; re-exporting from that instance showed the credentials decrypt correctly under the new key, proving they were re-encrypted on import, which is the whole point of the decrypted-export design.
- Browser-verified (Playwright/Chromium): the Settings → Data & Backup page exports a real downloaded JSON file (correct contents + success message) and imports an uploaded backup file (correct "Imported N integrations…" confirmation).
All test artifacts (two test backend instances, test DBs, downloaded backup files, temp files) were cleaned up afterward.
Also worth checking during/after the phases above
All previously-listed follow-ups are now complete: host-metrics widgets (Phase 6), host-to-host transfer (Phase 7), and data export/import (Phase 8) are done, and the verification gaps noted in Phases 1, 5, and 6 have been closed (cert auth, Telnet, RDP, guacd compose wiring, host-metrics network/ports/firewall + browser UI, and the Phase 1b/7 UI click-throughs).
Tracking
Update the phase status lines above as work lands. Each phase should get its own commit(s) on claude/wonderful-faraday-qxym5t, following the existing commit message style (descriptive title + why, Co-Authored-By/Claude-Session trailer).