Close verification gaps for Phases 1b, 6, 7 via real infra + browser tests

With iproute2 and Playwright/Chromium now available in the sandbox:
- Re-verified host-metrics network/ports/firewall collectors against a real
  root SSH host (real eth0, ss ports with process names, parsed iptables rules).
- Browser-verified the host-metrics page, the terminal tabs/split-panes/theme
  UI (live remote prompt, 1->2->4 xterm panes, prefs persisted), and the
  host-to-host transfer UI (live progress panel to completion + on-disk check).

Updates documentation only; no code changes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BbJV5nm8KPVH1oNJYKpnoF
Claude 2026-06-19 16:02:40 +00:00
parent 29c69224b2
commit e10acfd4a1

@@ -51,7 +51,7 @@ Rationale for splitting: 1a alone is a real, useful terminal (matches what `/ter
 - **Settings UI for multiple SSH hosts**: `src/pages/Settings.tsx` previously could only show/edit one integration per type, which silently broke multi-host SSH. Added a dedicated `SshHostsSection` with its own per-host cards (Save/Test/Delete) and an "Add SSH Host" flow, including a `Jump Host` dropdown populated from the other configured SSH hosts.
 - **Tabs + up to 4 split panes**: `src/pages/Terminal.tsx` rewritten around a `TerminalPane` component (one xterm + WebSocket connection each, reusable). Each tab holds 1/2/4 panes (single / split-2 / 2x2 grid); each pane connects independently to whichever SSH host is clicked while it's focused.
 - **Terminal theme/font customization**: a preferences bar (theme preset, font size, font family) persisted to `localStorage` (`archnest-terminal-prefs`), applied per-pane on connect.
-- Verified via a clean production build (`tsc -b && vite build`) — no real browser available in this environment to click through tabs/panes, so this is build/type verification only, not an interactive UI test.
+- Verified via a clean production build (`tsc -b && vite build`), and subsequently **browser-verified** (Playwright/Chromium, once available): logged in, opened `/terminal`, connected a pane to a real SSH host (confirmed by the live remote prompt `uitester@vm:~$` and a `Connected — <host>` status), split into 2 and 4 panes (confirmed 1→2→4 live `xterm` instances rendering as a 2×2 grid), opened a new tab, and changed the theme preference — confirmed it persisted to `localStorage` (`archnest-terminal-prefs` → `{"themeName":"Matrix",...}`). The original build-only caveat is now closed.
 - ✅ **Phase 1c — done, with one documented verification gap.**
 - **OPKSSH / certificate auth**: `ssh2` (the npm library) has no support for OpenSSH certificates — confirmed by inspecting its type definitions and README, no certificate-related auth flow exists. Implemented `connectWithCertificate()` in `backend/src/routes/terminal.ts`: writes the stored private key + certificate to a temp dir (mode `0600`) and shells out to the system `ssh` binary (which natively understands `-o CertificateFile=`) under a real `node-pty` pty. Used automatically when an SSH integration has a `certificate` secret configured (new field added to Settings' SSH host form). Does **not** support jump-host chaining (documented limitation, not silently dropped — Termix's own OPKSSH path doesn't generally chain through jump hosts either). **Verified end-to-end** (gap from the original pass now closed): with `openssh-client`/`openssh-server` available, built a real SSH CA, signed a user key into an OpenSSH certificate (principal `certuser`), configured a real `sshd` with `TrustedUserCAKeys` + `PasswordAuthentication no` (so only cert auth could succeed), created a real `ssh`-type integration carrying the private key + certificate as secrets, and drove ArchNest's actual `/api/terminal` WebSocket route: it reached `connected`, spawned the cert-auth pty, and a real shell echoed back a marker as `certuser` — i.e. authentication genuinely happened via the certificate, not a password or plain key.
 - **tmux session monitor/reattach**: new WebSocket message `list_tmux` execs `tmux list-sessions` on the target host and returns session names; `connect` accepts an optional `tmuxSession` (validated against `^[A-Za-z0-9_-]{1,64}$` before being interpolated into a shell command, to prevent injection) which attaches to that tmux session or creates it if missing, via `exec('tmux attach -t <name> || tmux new-session -s <name>', { pty: ... })` instead of a plain `client.shell()`. `src/pages/Terminal.tsx`'s pane header gained a tmux session picker (plain shell / new session / attach to an existing one). **Verified end-to-end** against a real test SSH server running real `bash`/`tmux` processes (via `node-pty`): listed zero sessions, created a `testsess` tmux session through the WS protocol, confirmed a follow-up `list_tmux` call returned `['testsess']`.
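Reviewer note on the tmux bullet above: a minimal sketch of the validate-then-interpolate step it describes. Only the regex and the shape of the command string come from the documented text; the names `TMUX_SESSION_RE` and `buildTmuxCommand` are hypothetical and may not match what `backend/src/routes/terminal.ts` actually uses.

```typescript
// Hypothetical helper: the regex and command shape are taken from the doc text,
// the identifiers are illustrative assumptions.
const TMUX_SESSION_RE = /^[A-Za-z0-9_-]{1,64}$/;

function buildTmuxCommand(session: string): string {
  // Reject anything outside the allow-list before it reaches a shell string.
  if (!TMUX_SESSION_RE.test(session)) {
    throw new Error(`invalid tmux session name: ${JSON.stringify(session)}`);
  }
  // Attach to the session if it exists, otherwise create it.
  return `tmux attach -t ${session} || tmux new-session -s ${session}`;
}
```

Because the allow-list excludes spaces, quotes, `;`, `|`, and `$`, a validated name cannot end the intended command or start a new one, which is presumably why the documented approach validates instead of escaping.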
@@ -148,12 +148,16 @@ One real bug was caught and fixed during this browser verification: the page ini
 One real bug was caught and fixed: the first version ran all 10 collectors via `Promise.all`, which opens 15-20 concurrent SSH exec channels — this silently exceeded OpenSSH's default `MaxSessions 10` and starved whichever collectors lost the race (`network`/`processes`/`ports`/`firewall`/`loginStats` came back empty while `cpu`/`memory`/`disk`/`uptime`/`system` succeeded). Fixed by running collectors sequentially in `collectHostMetrics()` — acceptable since this is on-demand polling, not a latency-critical path.
-**Documented gaps**:
-- `network` and `ports` collectors returned empty against the sandbox's test container because it has no `iproute2` package (`ip`/`ss` missing) — confirmed via manual SSH (`which ip`/`which ss` both failed) rather than a code defect. Logic is unverified against a host that actually has these tools.
-- `firewall` returned `type: "none"` because `iptables-save` requires root and the test SSH user wasn't root — expected per the null-tolerant design, but the root-required path itself wasn't exercised.
-- `loginStats` returned empty because the test container's `wtmp` had no real login history and `/var/log/auth.log`/`secure` weren't populated — `last`/`grep` both ran successfully, just had nothing to report.
-- The frontend page was typechecked and manually reviewed, and the route was confirmed to be served by Vite, but **not visually verified in a browser** — Playwright wasn't available in this sandbox for this phase (no cached install found). This is a real verification gap, not a claim of UI testing that didn't happen.
-- All test artifacts (test `sshd` process, test OS user, test backend instance, test DB, tokens, temp env/log files) were cleaned up afterward.
+**Follow-up verification (gaps from the first pass now closed):** with `iproute2` installed and a test `sshd` configured for root login, the three previously-unverified collectors were re-run against a real host over the real API and all returned correct data:
+- `network` → `eth0` with its real IP (`192.0.2.2/24`) and state `UP`.
+- `ports` → `source: "ss"`, 6 listening ports, with real process names and PIDs (`sshd`, etc.).
+- `firewall` → after adding two `iptables` rules (`--dport 22`/`--dport 80 -j ACCEPT`) and connecting as root, `type: "iptables"`, `status: "active"`, and the INPUT chain parsed back the two rules correctly.
+The **frontend was also browser-verified** (Playwright/Chromium, now available): logged in, opened `/host-metrics`, selected the host, and confirmed all widgets render with real live data (CPU/memory/disk gauges, uptime, the `eth0` interface, listening ports with process names, the top-processes table, the `iptables` firewall summary with 2 rules, and login activity) — see screenshot evidence captured during the run.
+**Remaining documented gap**:
+- `loginStats` returned empty because the test host's `wtmp` had no real login history and `/var/log/auth.log`/`secure` weren't populated — `last`/`grep` both ran successfully, just had nothing to report. This is data-availability, not a code defect; unverified against a host with real login history.
+- All test artifacts (test `sshd` process, test OS users, test iptables rules, test backend instance, test DB, tokens, temp files) were cleaned up afterward.
 ### Phase 7 — Host-to-Host File Transfer (DONE)
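Reviewer note on the `MaxSessions` fix described in the hunk above: a rough sketch of what running collectors one at a time (rather than via `Promise.all`) can look like. The `Collector` type, the `runSequentially` name, and the null-on-failure behaviour are assumptions for illustration; the real `collectHostMetrics()` may be structured differently.

```typescript
// Sketch only: `Collector`, `runSequentially`, and the null-on-failure
// behaviour are illustrative assumptions, not the project's actual code.
type Collector = () => Promise<unknown>;

async function runSequentially(
  collectors: Record<string, Collector>,
): Promise<Record<string, unknown>> {
  const results: Record<string, unknown> = {};
  for (const [name, collect] of Object.entries(collectors)) {
    try {
      // Awaiting inside the loop means the next collector does not start
      // (and open its own exec channels) until this one has settled.
      results[name] = await collect();
    } catch {
      // Assumed policy: a failed collector yields null instead of aborting the run.
      results[name] = null;
    }
  }
  return results;
}
```

The cost is that total latency becomes the sum of the collectors' latencies instead of the maximum, which the documented text accepts because the endpoint is polled on demand.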
@@ -171,6 +175,8 @@ One real bug was caught and fixed: the first version ran all 10 collectors via `
 - **Error handling**: a transfer of a nonexistent source path ended `status: "failed"` with a clear `"No such file"` error rather than hanging.
 - **Cancellation**: an 80 MB transfer cancelled ~0.3 s in stopped at 162 KB with `status: "cancelled"` — confirming the mid-stream cancel flag actually interrupts the copy.
+The **frontend transfer UI was also browser-verified** (Playwright/Chromium): logged in, opened the Files page, switched to a source SSH host, navigated into a directory, clicked the per-row "Send to another host" action, picked the destination host + directory in the modal, and confirmed the live "Host-to-Host Transfers" panel rendered the transfer and reached a full `completed` progress bar — then verified on the destination host's disk that the file actually landed with correct content.
 All test artifacts (test `sshd`, both test OS users + their home dirs, test backend instance, test DB, temp files) were cleaned up afterward.
 ### Also worth checking during/after the phases above