Make the automated pipeline the documented "setup moving forward" and
finish scrubbing the last stale GitHub-Actions/racknerd1 references that
never reached main.
- HANDOFF.md: refresh the stale 2026-06-21 snapshot. New "CI/CD & deploy"
section (push to main -> build + push to registry.snsnetlabs.com ->
auto-deploy to racknerd2 over SSH, SHA-pinned, /api/health gate),
racknerd2 validation-host + SSH-tunnel access notes, Forgejo workflow
rule, and a current Deployment + orientation section.
- .kiro/steering/project-guide.md: Forgejo-only Git workflow (no gh),
CI/CD row, registry host, racknerd2 + forgejo-runner SSH entries, and a
CI/CD pipeline section.
- .kiro/hooks/tunnel-racknerd2-8080.kiro.hook: the "View ArchNest on
racknerd2" hook (ssh -L 8080:localhost:8080 -N) to view the deployed
site at http://localhost:8080 (racknerd2's edge only allows port 22).
- src/pages/Settings.tsx: About panel repo URL -> Forgejo.
- .dockerignore: .github -> .forgejo.
- TERMIX_MIGRATION.md / docs/OPEN-SOURCE-RELEASE.md: drop stale
.github/workflows + "GitHub Actions deploy" references.
Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
Add a `deploy` job to build.yml that needs `build`, so every push to main
builds + pushes the images and then deploys them to racknerd2 over the
mesh, pinned to the built commit's SHA, with an /api/health gate. Fully
hands-off.
The standalone deploy.yml stays as a manual workflow_dispatch for
deploying/rolling back to an arbitrary tag without rebuilding.
deploy/README.md updated to document the auto-deploy flow.
Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
The first CI build failed: the job container (node:22-bookworm) installed
Debian's `docker.io` (Docker 20.10.24, API 1.41), which the host daemon
(29.x, minimum API 1.44) rejects with "client version 1.41 is too old".
Install docker-ce-cli from Docker's official apt repo instead, which is
current and talks to the daemon fine. Verified on the runner: a
node:22-bookworm container with the mounted socket + docker-ce-cli
connects to the 29.1.3 daemon (API 1.52) successfully. This also confirms
the runner's docker_host=automount is working (the client reached the
daemon; only the version was the problem).
Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
The Forgejo container registry now lives on a dedicated unproxied
(DNS-only) host, registry.snsnetlabs.com, so large image layers bypass
Cloudflare's ~100 MB request-body cap (the backend image's 262 MB and
317 MB layers previously hit 413 Payload Too Large through the proxied
forgejo.snsnetlabs.com host). The web UI / packages list stays on
forgejo.snsnetlabs.com behind Cloudflare Access SSO.
- build.yml: REGISTRY -> registry.snsnetlabs.com
- deploy/docker-compose.yml: image refs -> registry.snsnetlabs.com
- deploy/README.md: push/pull/login host -> registry.snsnetlabs.com
(packages web UI URL kept on forgejo.snsnetlabs.com)
Also record the versioning convention in HANDOFF + steering: development
happens on even major versions, releases on odd; currently developing v2
(prior released line is v1, see the v1.0 git tag). package.json and the
About panel are not yet bumped to v2.
Validated end to end: built both images on the runner host, pushed to
registry.snsnetlabs.com (backend included, no 413), pulled on racknerd2,
brought the stack up, /api/health returns {"ok":true} over the mesh IP.
Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
Build the frontend and backend images in CI, push them to the Forgejo
container registry, and deploy to racknerd2 (validation host) over the
NetBird mesh. racknerd2 only pulls + runs (1.9 GiB RAM, never builds).
- .forgejo/workflows/build.yml: on push to main / manual, build both
images and push :latest + :<sha> to forgejo.snsnetlabs.com/sam/...
(installs the docker CLI in the job; relies on the runner's
docker_host=automount to reach the host engine).
- .forgejo/workflows/deploy.yml: manual dispatch; SSH to racknerd2,
docker compose pull + up -d, then /api/health check.
- deploy/docker-compose.yml: registry-image compose. Ports bound to the
mesh IP only (Docker bypasses ufw), so the app is reachable over the
mesh, not the public interface.
- deploy/.env.example + deploy/README.md: deploy host config + full
pipeline/prereq docs.
- .gitignore: ignore real .env / deploy/.env.
Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
Appearance tab was local-state-only — light mode did nothing. Wire it up:
- index.css: theme color tokens are now CSS variables on :root (dark default)
with a [data-theme="light"] override using a soft GRAY page background
(#E4E6EB), not white. The @theme tokens + html/body + select reference the
vars, so the app shell and any component using bg-page/text-text-primary/etc.
themes automatically.
- New src/lib/theme.ts: localStorage-backed appearance prefs (theme/fontSize/
radius/animations) + applyAppearance() toggling data-theme on <html>, mirroring
the terminalPrefs pattern.
- main.tsx applies saved theme before first render (no flash).
- Settings AppearanceSection persists + applies on change; theme/fontSize/radius/
animations are live. Dropped the non-functional "Sidebar Expanded by default"
toggle. (accent color is still cosmetic-only — full token migration of
hardcoded-hex pages is a separate task, noted in ROADMAP.)
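
A minimal sketch of the load-then-apply flow described above, not the actual contents of src/lib/theme.ts. The storage key, default values, and the `StorageLike`/`RootLike` helper types are assumptions made so the example runs outside a browser; in the app the arguments would be `localStorage` and `document.documentElement`.

```typescript
// Field names follow the commit message; value ranges and defaults are assumed.
type Theme = "dark" | "light";

interface AppearancePrefs {
  theme: Theme;
  fontSize: number;
  radius: number;
  animations: boolean;
}

// Assumed defaults and key name (hypothetical, not taken from the repo).
const DEFAULT_PREFS: AppearancePrefs = { theme: "dark", fontSize: 14, radius: 8, animations: true };
const STORAGE_KEY = "appearancePrefs";

// Minimal structural stand-ins for localStorage and the <html> element.
interface StorageLike {
  getItem(key: string): string | null;
}
interface RootLike {
  setAttribute(name: string, value: string): void;
}

// Read saved prefs, falling back to the dark default on missing or corrupt data.
function loadAppearance(storage: StorageLike): AppearancePrefs {
  try {
    const raw = storage.getItem(STORAGE_KEY);
    if (!raw) return { ...DEFAULT_PREFS };
    const parsed = JSON.parse(raw) as Partial<AppearancePrefs>;
    const theme: Theme = parsed.theme === "light" ? "light" : "dark";
    return { ...DEFAULT_PREFS, ...parsed, theme };
  } catch {
    return { ...DEFAULT_PREFS };
  }
}

// Set data-theme on the root so a [data-theme="light"] CSS override can match;
// any other value leaves the :root (dark) variables in effect.
function applyAppearance(prefs: AppearancePrefs, root: RootLike): void {
  root.setAttribute("data-theme", prefs.theme);
}
```

Calling `applyAppearance(loadAppearance(localStorage), document.documentElement)` before the first render is one way to get the no-flash behavior mentioned for main.tsx.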
Also adds the Remote Desktop GNOME & KDE support work to ROADMAP as an add-on
(XFCE confirmed working; GNOME needs a FreeRDP-3 guacd image, KDE via xrdp +
startplasma-x11). Full detail in docs/rdp-debug-handoff.md.
Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
Records the full chain that got RDP working end-to-end on XFCE (auth/xrdp,
session, compositing, scaling, ping echo, input, 1080p — PRs #41-48), and adds
a desktop-environment support matrix plus researched paths to make GNOME and KDE
work too.
Key findings (VM-verified, not theory):
- XFCE over xrdp works today with guacd's FreeRDP 2.
- GNOME 50 is Wayland-only (no Xorg session for xrdp) AND gnome-remote-desktop
mandates NLA that FreeRDP 2 can't do — blocked both ways. The real unlock is a
custom guacd image built against FreeRDP 3; GNOME headless "system" RDP (GDM
handover, GNOME 46+) then becomes viable.
- KDE Plasma 6 should work like XFCE via xrdp + startplasma-x11 (X11 session
supported through ~early 2027); KRdp is the Wayland-native future path.
Includes a suggested order of work for the next agent.
Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
Bump the guacd connectionDefaultSettings for rdp/vnc/telnet from 1024x768 to
1920x1080 so new remote desktop sessions open at 1080p by default. dpi stays 96.
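
For illustration only, the defaults might look like the fragment below. Only the 1920x1080 size, the 96 dpi value, and the three protocol names come from this commit; the object shape and the `GuacSize` type are assumptions about how the settings are laid out.

```typescript
// Hypothetical shape; only the numbers and protocol names are from the commit.
interface GuacSize {
  width: number;
  height: number;
  dpi: number;
}

const DEFAULT_SIZE: GuacSize = { width: 1920, height: 1080, dpi: 96 };

// One copy per protocol so a later per-protocol tweak does not leak to the others.
const connectionDefaultSettings: Record<"rdp" | "vnc" | "telnet", GuacSize> = {
  rdp: { ...DEFAULT_SIZE },
  vnc: { ...DEFAULT_SIZE },
  telnet: { ...DEFAULT_SIZE },
};
```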
Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
The multi-session RemoteDesktop view rendered the remote display but never
captured input — no Guacamole.Mouse / Guacamole.Keyboard were created — so the
desktop showed but the mouse and keyboard did nothing.
Add per-session input:
- Guacamole.Mouse on the display element, forwarding mouse state via
client.sendMouseState with coordinates divided by the current display scale so
clicks land correctly on the scaled-down canvas.
- Guacamole.Keyboard on document, forwarding key events via client.sendKeyEvent,
but only while that session is the active/visible tab so a background session
can't steal keystrokes.
- Detach keyboard handlers on session close and on unmount.
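
The two rules above (scale-corrected mouse coordinates, keyboard only for the active tab) can be sketched as pure helpers. This is a simplified illustration, not the RemoteDesktop component itself: `MouseStateLike`, `toRemoteCoords`, and `shouldForwardKey` are hypothetical names, and the real handlers would pass the results to client.sendMouseState and client.sendKeyEvent.

```typescript
// Hypothetical stand-in for the position part of a mouse state.
interface MouseStateLike {
  x: number;
  y: number;
}

// Convert coordinates measured on the scaled canvas back to remote pixels.
// A scale of 0.5 means the canvas is drawn at half size, so a click at
// (480, 270) on screen corresponds to (960, 540) on the remote desktop.
function toRemoteCoords(state: MouseStateLike, scale: number): MouseStateLike {
  // Guard against a zero, negative, or NaN scale: pass coordinates through.
  if (!(scale > 0)) return { x: state.x, y: state.y };
  return { x: Math.round(state.x / scale), y: Math.round(state.y / scale) };
}

// Only the visible session may receive keystrokes captured on document.
function shouldForwardKey(sessionId: string, activeSessionId: string | null): boolean {
  return sessionId === activeSessionId;
}
```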
Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
This is the actual root cause of the flicker-then-blank / "connected but drops" RDP
behavior. guacamole-common-js's WebSocketTunnel sends an internal stability
"ping" (empty INTERNAL_DATA_OPCODE: `0.,4.ping,<ts>;`) and resets its
receiveTimeout only on inbound data. guacamole-lite 1.2.0 forwards that ping
straight to guacd, which neither understands nor echoes it. On an idle desktop
(no frames flowing), nothing resets the client timer, so the tunnel hits
UPSTREAM_TIMEOUT, closes, and reconnects — the flicker/drop loop seen in guacd
as "User is not responding".
Fix: intercept the internal ping on the WS in guacamole.ts and echo it straight
back to the client before ClientConnection forwards it to guacd, so the client's
stability timer is satisfied even when the remote desktop is idle.
Verified separately: a guacd connection that echoes sync held 30s/116 syncs with
no drop; server/xrdp/XFCE are healthy. This was purely the missing ping echo.
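
A sketch of how the internal ping could be recognized and echoed, assuming the standard Guacamole instruction framing (LENGTH.VALUE elements separated by commas and terminated by a semicolon). The function names are hypothetical and this is not the code in guacamole.ts; whether the ping is also forwarded to guacd afterwards is left to the caller here.

```typescript
// Parse one Guacamole instruction into its elements, or return null if the
// text is not a single well-formed instruction. Lengths are counted in UTF-16
// code units, which is sufficient for the ASCII-only ping.
function parseInstruction(msg: string): string[] | null {
  const elements: string[] = [];
  let i = 0;
  while (i < msg.length) {
    const dot = msg.indexOf(".", i);
    if (dot <= i) return null; // no length prefix found
    const lenText = msg.slice(i, dot);
    if (!/^\d+$/.test(lenText)) return null;
    const start = dot + 1;
    const end = start + Number(lenText);
    if (end >= msg.length) return null; // value runs past the terminator
    elements.push(msg.slice(start, end));
    const terminator = msg[end];
    if (terminator === ";") return end === msg.length - 1 ? elements : null;
    if (terminator !== ",") return null;
    i = end + 1;
  }
  return null;
}

// The internal opcode is the empty string, followed by "ping" and a timestamp.
function isInternalPing(msg: string): boolean {
  const parts = parseInstruction(msg);
  return parts !== null && parts.length >= 2 && parts[0] === "" && parts[1] === "ping";
}

// Echo a ping straight back to the browser; returns true if it was handled.
function interceptPing(msg: string, echoToClient: (m: string) => void): boolean {
  if (!isInternalPing(msg)) return false;
  echoToClient(msg);
  return true;
}
```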
Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
Symptom: desktop flickers in for a moment then goes blank while the tab still
says "connected"; guacd logs "User is not responding" and the client reconnects
in a loop.
Root cause: the multi-session view moved each session's display element between
DOM nodes on tab switch (host.innerHTML='' + appendChild). Detaching a
guacamole-common-js display from the DOM stalls its sync loop, so the client
stops echoing guacd's sync instructions and guacd drops it as unresponsive.
Proven out-of-band: a raw guacd client that echoes sync held a connection for
30s/116 syncs with no drop, while the browser dropped within seconds.
Fix: mount each session's container into the display host ONCE and never move
it; toggle visibility (display:none) to switch tabs so every session's display
stays in the DOM and its sync loop keeps running. Containers are absolutely
positioned in a relative host; close still removes the container.
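
The tab switch described above reduces to a visibility toggle over containers that are already mounted. A simplified sketch, where `ContainerLike` stands in for the per-session DOM element and `showSession` is a hypothetical helper name:

```typescript
// Structural stand-in for an HTMLElement; only style.display is needed here.
interface ContainerLike {
  style: { display: string };
}

// Show the active session and hide the rest without detaching any of them,
// so every display stays in the DOM while hidden.
function showSession(containers: Map<string, ContainerLike>, activeId: string): void {
  containers.forEach((el, id) => {
    el.style.display = id === activeId ? "block" : "none";
  });
}
```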
Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
The multi-session RemoteDesktop tabs (05e78f0) appended the Guacamole display
into an off-DOM container before connecting, and never called display.scale().
Frames arrived while the canvas was detached/zero-sized, so the desktop rendered
but painted to an invisible area — connected-but-blank.
Fix: track the display per session, fit/scale it to the visible panel when a
session becomes active, on the display's own onresize, and on window resize.
Verified the VM/xrdp/guacd side streams frames fine (13-36 img frames in direct
guacd tests); this was purely the client-side mount/scale regression.
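
The fit step can be sketched as a pure calculation whose result would be handed to display.scale(). `fitScale` is a hypothetical helper; returning null for a zero-sized display or panel is an assumption about how to avoid scaling while the canvas is not yet laid out.

```typescript
// Largest uniform scale at which the remote display fits inside the panel.
// Returns null while either side has no size yet (e.g. a hidden tab), so the
// caller can skip scaling instead of applying 0 or Infinity.
function fitScale(
  displayWidth: number,
  displayHeight: number,
  panelWidth: number,
  panelHeight: number,
): number | null {
  if (displayWidth <= 0 || displayHeight <= 0 || panelWidth <= 0 || panelHeight <= 0) {
    return null;
  }
  return Math.min(panelWidth / displayWidth, panelHeight / displayHeight);
}
```

For a 1920x1080 remote display in a 960x600 panel this yields 0.5, since width is the limiting dimension.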
Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
Even with XFCE running, the screen stayed blank because xfwm4's GPU compositor
fails on the Virtio GPU (no GL driver): "Another compositing manager is running",
"failed to load driver: virtio_gpu". Fixed by disabling xfwm4 compositing via
xfconf and forcing LIBGL_ALWAYS_SOFTWARE for the RDP user. Verified a fresh
session renders cleanly through guacd.
Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
After replacing gnome-remote-desktop with xrdp, the connection succeeded but
showed a blank screen: GNOME 50 on Fedora is Wayland-only and can't run on
xrdp's Xorg backend, so the session started and died in ~2s. Fixed by installing
XFCE (an X11 desktop that works without GL) and creating the missing
/etc/xrdp/startwm.sh to launch it. Verified xfce4-session/xfwm4 persist and
guacd streams sustained desktop frames.
Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
The "Server refused connection (wrong security type?)" failure was root-caused
end-to-end: guacd 1.5.5 ships FreeRDP 2.11.5, whose NLA/CredSSP client cannot
authenticate against gnome-remote-desktop, which mandates NLA
(HYBRID_REQUIRED_BY_SERVER) with no option to disable it. The earlier
EGL/Mesa/Zink GPU theory was a red herring.
Proven at every layer: direct xfreerdp v3 to the VM, the real guacd protocol
path (all security modes fail identically), and guacd's own logs. Also verified
guacd:1.6.0 still ships FreeRDP 2.11.7, so an image bump would NOT fix it.
Fix applied to the test VM: replaced gnome-remote-desktop with xrdp (masked the
GNOME user service so it can't re-grab port 3389), which interoperates with
guacd's FreeRDP 2. Verified a real session streams through guacd with
security=any. No ArchNest code change was needed — the existing security/
ignore-cert handling in guacamole.ts is correct.
Documents this as a general finding since other users will hit GNOME's built-in
RDP the same way.
Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
Settings → Integrations now has a universal "Icon" field (any integration
type) accepting a pasted URL or an uploaded image, stored as config.iconUrl.
This overrides the built-in icon for that integration's Node Status tile.
Node Status tiles now resolve their icon through a priority chain: custom
iconUrl, then each built-in CDN candidate in order (assets-public first
where available, falling back to the existing dashboard-icons CDN), and
finally the generic per-kind Lucide icon if every candidate 404s. AWS now
tries samuelsjames.github.io/assets-public's aws-logo.svg before the
jsDelivr fallback; SSH gets a Linux logo from the same repo. Proxmox,
Weather, and Remote Desktop have no built-in candidates yet (no matching
assets in that repo) and fall back to the generic icon until added.
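
The priority chain above can be sketched as an ordered candidate list plus a step function that an image error handler would call. The helper names are hypothetical and the URLs in the usage note are placeholders, not the real asset paths.

```typescript
// Build the ordered list of icon URLs to try: custom iconUrl first, then the
// built-in CDN candidates in their given order.
function iconCandidates(customIconUrl: string | undefined, builtIn: readonly string[]): string[] {
  const custom = customIconUrl?.trim();
  return custom ? [custom, ...builtIn] : [...builtIn];
}

// Given the index of the candidate that just failed to load (or -1 before the
// first attempt), return the next URL to try, or null when the list is
// exhausted and the generic per-kind icon should be rendered instead.
function nextIcon(candidates: readonly string[], failedIndex: number): string | null {
  const next = candidates[failedIndex + 1];
  return next === undefined ? null : next;
}
```

With a custom icon set, `iconCandidates("https://example.invalid/my.svg", ["https://example.invalid/a.svg"])` tries the custom URL first; an integration with no built-in candidates and no custom icon goes straight to the generic fallback.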
Co-authored-by: Claude <noreply@anthropic.com>
Generalizes the Uptime Kuma monitor-grouping pattern to every integration:
Node Status now collapses each integration's resources into one tile (e.g.
30 EC2 instances under one "AWS" tile) instead of flooding the grid, with
members listed in Node Detail on selection. Proxmox stays ungrouped since
its VMs/LXCs are managed individually elsewhere in the app.
Adds integrationType to the /api/integrations/resources response so the
frontend can group/exclude by adapter type rather than resource kind (kind
alone can't distinguish Proxmox VMs from AWS VMs, for example).
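
A sketch of grouping by adapter type, assuming a simplified resource shape. Only `integrationType` is named in this commit; the other fields, the `"proxmox"` type identifier, and `buildTiles` itself are hypothetical.

```typescript
// Assumed, simplified resource shape for illustration.
interface ResourceLike {
  id: string;
  name: string;
  integrationId: string;
  integrationName: string;
  integrationType: string;
}

interface Tile {
  key: string;
  label: string;
  members: ResourceLike[];
}

// Adapter types whose resources keep one tile each (assumed identifier).
const UNGROUPED_TYPES = new Set<string>(["proxmox"]);

// Collapse each integration's resources into one tile, except ungrouped types.
function buildTiles(resources: readonly ResourceLike[]): Tile[] {
  const tiles: Tile[] = [];
  const byIntegration = new Map<string, Tile>();
  for (const r of resources) {
    if (UNGROUPED_TYPES.has(r.integrationType)) {
      tiles.push({ key: `res:${r.id}`, label: r.name, members: [r] });
      continue;
    }
    let tile = byIntegration.get(r.integrationId);
    if (!tile) {
      tile = { key: `int:${r.integrationId}`, label: r.integrationName, members: [] };
      byIntegration.set(r.integrationId, tile);
      tiles.push(tile);
    }
    tile.members.push(r);
  }
  return tiles;
}
```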
Documents the grouping rule in HANDOFF.md and adds a paid-tier roadmap
entry for per-integration node tabs that will show every individual node.
Co-authored-by: Claude <noreply@anthropic.com>
heartbeatList/importantHeartbeatList emit monitor IDs as strings (server
iterates object keys), while monitorList and the live heartbeat event use
numbers. The lastHeartbeat map was keyed by the numeric monitor.id, so
string-keyed lookups from heartbeatList/importantHeartbeatList never hit.
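
One way to avoid this kind of mismatch is to normalize every monitor ID to a single key type before storing or looking it up. The sketch below is illustrative rather than the actual adapter code: `idKey` and `indexLastHeartbeats` are hypothetical helpers, and it assumes the newest heartbeat is the last array element.

```typescript
type MonitorId = string | number;

// Normalize an ID from any event to one canonical map key.
function idKey(id: MonitorId): string {
  return String(id);
}

// Index the latest heartbeat per monitor from a heartbeatList-style payload,
// whose object keys are always strings.
function indexLastHeartbeats<T>(heartbeatList: Record<string, T[]>): Map<string, T> {
  const last = new Map<string, T>();
  for (const key of Object.keys(heartbeatList)) {
    const beats = heartbeatList[key];
    if (beats && beats.length > 0) {
      // Assumption: entries are ordered oldest to newest.
      last.set(idKey(key), beats[beats.length - 1] as T);
    }
  }
  return last;
}
```

A lookup with the numeric ID from monitorList, `last.get(idKey(monitor.id))`, then hits the same entry that was stored under the string key.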
Co-authored-by: Claude <noreply@anthropic.com>
importantHeartbeatList only contains entries for status transitions, so a
monitor that's been continuously up since creation never populates it,
showing as "unknown" in ArchNest despite being healthy in Uptime Kuma.
Co-authored-by: Claude <noreply@anthropic.com>
Integrations whose resources represent many sub-items (Uptime Kuma's
monitors) now collapse into a single tile using the Uptime Kuma CDN
icon, instead of flooding Node Status with one tile per monitor.
Selecting that tile lists every underlying monitor's status in a
scrollable Node Detail panel, so hundreds of monitors stay manageable.
Also drops the temporary debug logging added while diagnosing the
listener-timing bug, now that real monitor/heartbeat data is confirmed
to be coming through.
The grid's flex column parent had no min-h-0, so it grew to fit all
content instead of being capped to the card height — overflow-y:auto
on the grid itself never had anything to scroll within.
Listeners for monitorList/importantHeartbeatList/heartbeat were being
attached after the login ack resolved, but the server pushes that
data right after login — sometimes in the same tick as the ack — so
Socket.IO dropped it before a listener existed. Listeners now attach
before login is sent.
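
The ordering fix can be sketched against a minimal socket interface. `SocketLike` and `loginAndCollect` are hypothetical; the real adapter uses a Socket.IO client and also subscribes to importantHeartbeatList and heartbeat, which are omitted here for brevity.

```typescript
// Minimal stand-in for the parts of a Socket.IO client used in this sketch.
interface SocketLike {
  on(event: string, handler: (payload: unknown) => void): void;
  emit(event: string, payload: unknown, ack: (response: unknown) => void): void;
}

interface Credentials {
  username: string;
  password: string;
}

// Attach the data listener first, then send login, so a monitorList pushed in
// the same tick as the login ack still has a handler waiting for it.
function loginAndCollect(
  socket: SocketLike,
  creds: Credentials,
  onMonitors: (list: unknown) => void,
  onLoggedIn: (response: unknown) => void,
): void {
  socket.on("monitorList", onMonitors);
  socket.emit("login", creds, onLoggedIn);
}
```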
Adds temporary debug logging to diagnose why a connected Uptime Kuma instance
with real monitors produces zero Resources: it logs the monitor count, active
flags, and last heartbeat per monitor so we can see exactly what the Socket.IO
session returns.
Each adapter now tags its Resources with a kind (vm, container, app,
host, network) so Node Status tiles show the right icon instead of a
generic server glyph — Proxmox LXCs/Docker containers get a container
icon, VMs get a VM icon, Uptime Kuma monitors get an app icon, etc.
Also stops silently swallowing listResources() failures — they're now
logged as warnings, since a connected-but-empty integration (e.g.
Uptime Kuma reporting zero monitors) was previously indistinguishable
from a real adapter error.
Clicking a tile in Node Status highlights it and surfaces its name,
source integration, status, and detail in a new Node Detail card on
the bottom row, between Integration Health and a now-narrower Recent
Activity card.
Status text now sits close to the name instead of being pushed to the
far edge, and the list caps at a fixed height with scroll so it stays
usable as more integrations are added.
Uptime Kuma has no REST API for monitor data; connect over the same
Socket.IO session the web UI uses (login, then read monitorList and
heartbeat events) so connected monitors now surface as Resources.
Switches the integration's credentials from an API key to
username/password, matching what Uptime Kuma's session login expects.
Use an explicit inline gap instead of relying solely on a dynamically
templated Tailwind class, and clip each pane's content (overflow-hidden)
so a pane can't visually bleed into the row gap below it.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019hu9pZvJY4BgmcQeAw2ugk
Wider gap between panes in split view (borders were crowding each other
in 4-up grid), more inset between the pane border and the connect/
disconnect header, and a small horizontal gutter around the xterm mount
so the prompt text doesn't sit flush against the border.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019hu9pZvJY4BgmcQeAw2ugk
- Help page now scrolls (it sat under a clipped section with no overflow handling).
- Connected Integrations on Glance shows 5 per row in a scrollable area with the
transparent ghost scrollbar, instead of growing the card unbounded.
- System Status KPI ring is now a thicker, vertically centered, multi-segment
donut broken down by integration type, each type colored consistently from a
shared first-come-first-served palette (src/lib/integrationColors.ts) so e.g.
whichever type connects first always gets the same color everywhere it's used.
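
A sketch of a first-come-first-served palette, not the actual src/lib/integrationColors.ts. The hex values, the wrap-around when the palette runs out, and the gray fallback are all assumptions.

```typescript
// Hypothetical palette; the real colors are not stated in the commit.
const PALETTE: readonly string[] = ["#3B82F6", "#22C55E", "#F59E0B", "#EF4444"];

// Returns a lookup that hands out palette colors in order of first request and
// then keeps returning the same color for that integration type.
function createColorAssigner(palette: readonly string[]): (type: string) => string {
  const assigned = new Map<string, string>();
  return (type: string): string => {
    const existing = assigned.get(type);
    if (existing !== undefined) return existing;
    // Wrap around when there are more types than colors (assumed behavior).
    const color = palette[assigned.size % palette.length] ?? "#888888";
    assigned.set(type, color);
    return color;
  };
}
```

Sharing one assigner across the app is what keeps a type's color consistent everywhere it is used.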
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019hu9pZvJY4BgmcQeAw2ugk
Lets a user pick one of three Starship presets (Nerd Font Symbols,
Pastel Powerline, Tokyo Night) from the Terminal page and install
Starship + a Nerd Font on the active pane's SSH host with one click,
instead of running a script by hand. Idempotent on the host side, and
available to all authenticated users like the rest of the SSH/Docker
tooling.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019hu9pZvJY4BgmcQeAw2ugk
Pairs with the Terminal page's Nerd Font glyph fallback: run this on any VM you SSH into from ArchNest to get a full icon-rich Starship prompt, not just working icon rendering.