Bring the docs in line with what shipped since the auth phases, and hand off the next planned feature cleanly for another agent to pick up.

- HANDOFF.md: new TL;DR (auth complete; persistent terminals + Docker three-ways shipped); prominent "next task = Mesh Prerequisite Gate" callout warning not to code before the open decisions are answered; corrected standing rules (kiro/<feature> branches, gh-based workflow, npm run build over plain tsc, Co-authored-by trailers); architecture sections updated for TerminalSessionContext, dockerSsh/agents routes, docker_agent_reports table, ssh/docker.ts, and the new agent env vars; new "Docker: three ways" section.
- README.md: Containers/Terminal page rows, route-group list, SSH layer, agent/ dir, ARCHNEST_AGENT_TOKEN/ARCHNEST_AGENT_STALE_MS, current-state paragraph, and doc reading order.
- design-decisions.md: Terminal (persistence) and Containers (three sources + detail tab) page notes; backend Docker-transport note; mesh gate flagged under Future Integration Notes.
- docs/mesh-prerequisite-gate.md (new): full design with lockout-safety invariants and the open decisions (A-D) needed before implementation.

Docs only; no code changed.

Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>

# ArchNest

A self-hosted ops dashboard for a homelab/cloud setup: live infrastructure
monitoring across 9 real integration types, a categorized bookmark hub, a
full SSH suite (terminal, tunnels, file manager, host-to-host transfer, live
host metrics), Docker container management, and RDP/VNC/Telnet remote desktop
— all in one app, with zero mock data anywhere.

**This repo is private and will never be public.** This README is written for
the owner and for any AI session picking up the project cold — it should be
detailed enough that neither needs to re-derive context from scratch.

## What this is, in one paragraph

ArchNest replaced a Homarr-style bookmark dashboard plus a handful of
disconnected admin tools (Proxmox UI, Portainer, separate SSH terminals,
WinSCP-equivalents) with one app that talks directly to the underlying
systems. It started as a 6-page mockup/portfolio piece and has since grown
into an 11-page real tool with a real Fastify backend, real SSH/Docker/cloud
integrations, and no synthetic data — every number on every page comes from
a live API call, a SQLite-backed table, or an SSH command run against a
managed host.

## Current state & direction

**Live and deployed** at `archnest.snsnetlabs.com`, auto-deploying on every
merge to `main` via `.github/workflows/deploy.yml`. All 11 pages and their
backend routes are built and working — there is no pending/on-hold page.

Auth is feature-complete for self-hosted (Phases 1-3: user menu wiring,
password/sessions/login-log, multi-user roles with a 10-seat cap); Phase 4
(Authentik SSO) is **deferred to a paid AWS add-on** — see `ROADMAP.md`.
Recently shipped: persistent terminal sessions across navigation, and Docker
container visibility/management three ways (Engine TCP API, `docker` CLI over
SSH, and a read-only push agent — see `docs/docker-agent-monitoring.md`).

The **next planned feature is the Mesh Prerequisite Gate** — requiring a
verified NetBird mesh before the app can be configured. It is **designed but
not built** (`docs/mesh-prerequisite-gate.md`) and has open decisions that need
the user's sign-off before coding (notably defaulting it OFF so it can't lock
the live instance). See `HANDOFF.md` for where to resume.

If you're a fresh AI session: read this file, then `HANDOFF.md` (current
task state + standing workflow rules), then `design-decisions.md` (visual
conventions + accurate per-page implementation notes), then `ROADMAP.md`
(deferred/tiered work) and the `docs/` design docs (`docker-agent-monitoring.md`,
`mesh-prerequisite-gate.md`), then `TERMIX_MIGRATION.md`
(history of how the SSH/Docker/Guacamole feature set was built) if you need
that context.

## Pages

| Page | Route | What it does |
|------|-------|---------------|
| Glance | `/` | Home dashboard — system/integration health, resource overview, recent activity, shortcuts |
| Infrastructure | `/infrastructure` | Resource inventory across all integrations — distribution donut, per-resource status grid, integration health, activity |
| BookNest | `/booknest` | Categorized bookmark hub — quick access, favorites, link health, full CRUD |
| Terminal | `/terminal` | Web SSH terminal — multi-tab, split panes, tmux attach, cert auth (OPKSSH); **sessions stay connected across page navigation** |
| Tunnels | `/tunnels` | SSH tunnel manager — local/remote/dynamic (SOCKS5) forwarding, auto-start, live status |
| Files | `/files` | SFTP file browser/editor over managed SSH hosts, with host-to-host transfer |
| Containers | `/containers` | Docker containers across **three sources** (Engine TCP API, `docker` CLI over SSH, or a read-only push agent) — list/start/stop/restart/pause/remove, logs, interactive exec; tabbed with a clickable per-container detail view |
| Remote Desktop | `/remote-desktop` | RDP/VNC/Telnet sessions via a Guacamole sidecar |
| Host Metrics | `/host-metrics` | Live CPU/memory/disk/network/processes/ports/firewall/login-activity per SSH host, polled every 5s |
| Settings | `/settings` | Profile, Appearance, Security, Integrations, Notifications, Data & Backup, About — deep-linkable via `?tab=` |
| Help | `/help` | Static guided tour of every page above |
| Login / Enrollment | `/login`, `/enrollment` | Auth entry points — not in the sidebar nav |

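
The `?tab=` deep link on the Settings row can be pictured as a small lookup from the query string to a known tab. This is only a sketch: the tab ids below are hypothetical slugs derived from the tab names in the table, and `resolveSettingsTab` is not a function from the codebase.

```typescript
// Hypothetical tab ids; the real slugs used by the Settings page may differ.
const SETTINGS_TABS = [
  "profile",
  "appearance",
  "security",
  "integrations",
  "notifications",
  "data",
  "about",
] as const;

type SettingsTab = (typeof SETTINGS_TABS)[number];

// Resolve the active tab from a query string such as "?tab=security".
// Unknown or missing values fall back to the first tab (an assumed default).
function resolveSettingsTab(search: string): SettingsTab {
  const requested = new URLSearchParams(search).get("tab");
  const match = SETTINGS_TABS.find((tab) => tab === requested);
  return match ?? "profile";
}
```

Validating against a fixed list means a stale or mistyped link still lands on a real tab instead of an empty panel.
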
See `design-decisions.md`'s "Page Notes" section for a detailed, per-page
breakdown of layout, real data sources, and known quirks — it's kept in sync
with the actual code, not a spec written before the page existed.

## Architecture

### Frontend (`/src`)
- React 19 + Vite + TypeScript, Tailwind CSS v4, Recharts (donuts/area
  charts), Lucide React icons, React Router.
- `src/lib/api.ts` — typed fetch wrapper (`apiFetch`) + one function per
  backend endpoint + matching TS interfaces. This is the contract between
  frontend and backend; any new backend route needs a matching entry here.
- `src/lib/AuthContext.tsx` — auth state backed by `localStorage` (JWT
  carrying a server-tracked session id; signing out revokes the session
  server-side).
- `src/lib/TerminalSessionContext.tsx` — keeps SSH terminal sessions
  (xterm + WebSocket + DOM node) alive above the router so they survive
  in-app navigation; shared constants in `src/lib/terminalPrefs.ts`.
- `src/pages/` — one file per route (see table above), plus `Login.tsx` /
  `Enrollment.tsx` for the unauthenticated/first-run flows.
- `src/components/` — `TopBar.tsx` (title, global search across pages/
  integrations/bookmarks, user dropdown), `Sidebar.tsx` (nav + system-health
  rollup widget).
- `App.tsx` — route table, plus per-route hero-banner config (`showHero`,
  `heroPaddingTop`, `heroObjectPosition` lookup maps) and `topBarHeight`
  lookup for pages with a subtitle (currently only BookNest).

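
To make the "one typed function per endpoint" contract in `src/lib/api.ts` concrete, here is a minimal sketch of the pattern. It is not the real `apiFetch`: the injectable `fetchImpl` parameter, the `FetchLike`/`ResponseLike` types, and the `HealthResponse` shape are assumptions made so the example stands alone; only the `/api/health` path comes from this README.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface ResponseLike {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body?: string },
) => Promise<ResponseLike>;

// Hypothetical response shape for GET /api/health.
interface HealthResponse {
  ok: boolean;
}

// Generic wrapper: attaches the JWT, sends JSON, and throws on non-2xx.
async function apiFetch<T>(
  fetchImpl: FetchLike,
  path: string,
  token: string | null,
  options: { method?: string; body?: unknown } = {},
): Promise<T> {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (token) headers["Authorization"] = `Bearer ${token}`;
  const res = await fetchImpl(path, {
    method: options.method ?? "GET",
    headers,
    body: options.body === undefined ? undefined : JSON.stringify(options.body),
  });
  if (!res.ok) throw new Error(`Request to ${path} failed with status ${res.status}`);
  return (await res.json()) as T;
}

// One small typed function per endpoint keeps call sites free of raw URLs.
function getHealth(fetchImpl: FetchLike, token: string | null): Promise<HealthResponse> {
  return apiFetch<HealthResponse>(fetchImpl, "/api/health", token);
}
```

Passing the fetch implementation in makes the wrapper easy to exercise with a fake; a real wrapper would more likely call the global `fetch` directly.
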
### Backend (`/backend`)
- Fastify 5, TypeScript, ESM (`tsx` for dev, `tsc -b` for build), entrypoint
  `src/server.ts`.
- `backend/src/db/index.ts` — SQLite schema + `logEvent()` audit log,
  plus `sessions`/`login_events` tables and a multi-user `users` schema
  (`role` admin/member + `active` columns).
- `backend/src/db/crypto.ts` — AES-256-GCM `encryptSecret`/`decryptSecret`,
  keyed by `ARCHNEST_SECRET_KEY`.
- `backend/src/routes/` — one file per feature area:
  - `auth.ts` — setup, login, profile, password change, sessions,
    login audit log, and admin-only user management (`/api/setup`,
    `/api/auth/*`, `/api/users`)
  - `integrations.ts` — integration CRUD + connection testing
  - `bookmarks.ts` — bookmarks + categories CRUD
  - `events.ts` — activity log retrieval
  - `terminal.ts` — SSH terminal WebSocket (`connect`/`input`/`resize`/
    `list_tmux`/`disconnect`)
  - `tunnels.ts` — SSH tunnel CRUD + connect/disconnect
  - `files.ts` — SFTP list/read/write/mkdir/rename/delete/chmod/download/upload
  - `docker.ts` — Docker Engine TCP API: container list/stats/logs/actions + exec WebSocket
  - `dockerSsh.ts` — Docker over SSH: runs the `docker` CLI on a remote SSH host (list/logs/actions + exec WebSocket); no dockerd socket exposed
  - `agents.ts` — Docker monitoring agents: token-gated push ingest (`POST /api/agents/docker/report`) + read-only host/container views
  - `guacamole.ts` — Guacamole WebSocket proxy for remote desktop
  - `metrics.ts` — live host metrics endpoint
  - `transfer.ts` — host-to-host file transfer orchestration (start/poll/cancel)
  - `data.ts` — full backup export/import (integrations + secrets + bookmarks + tunnels)
- `backend/src/integrations/` — one adapter per type, all real (none are
  stubs): `proxmox.ts`, `docker.ts`, `netbird.ts`, `cloudflare.ts`, `aws.ts`,
  `uptimeKuma.ts`, `weather.ts`, `ssh.ts`, `remoteDesktop.ts`. Each implements
  `testConnection()` (required) and `listResources()` (optional);
  `registry.ts` maps `IntegrationType` → adapter.
- `backend/src/ssh/` — the shared SSH transport layer used by Terminal,
  Files, Tunnels, Transfers, and Host Metrics:
  - `connect.ts` — jump-host chaining, host-key verification, certificate auth
  - `sftp.ts` — ephemeral SFTP connections for file ops
  - `transfer.ts` — streamed host-to-host copy/move with progress + cancel
  - `docker.ts` — runs the `docker` CLI over SSH for the Containers page's
    "Docker over SSH" source (list/logs/actions + interactive exec)
  - `metrics/` — 10 sequential collectors (cpu, memory, disk, uptime,
    network, system, processes, ports, firewall, login-stats) — sequential
    on purpose, to stay under OpenSSH's `MaxSessions` limit per host.
- Docker images run on Alpine; **OpenSSL legacy provider is enabled** in
  `backend/Dockerfile` (`OPENSSL_CONF=/etc/ssl/openssl-legacy.cnf`) so
  old-format encrypted PEM keys (`BEGIN RSA PRIVATE KEY` + `DEK-Info`) still
  decrypt under OpenSSL 3 — don't remove this without understanding why.
- **Required env vars, no defaults**: `ARCHNEST_SECRET_KEY`,
  `ARCHNEST_JWT_SECRET`. The server refuses to start without both. Optional:
  `ARCHNEST_DB_PATH`, `PORT`, `ARCHNEST_GUAC_CRYPT_KEY` /
  `ARCHNEST_GUACD_HOST` / `ARCHNEST_GUACD_PORT`, `ARCHNEST_CORS_ORIGIN`,
  `ARCHNEST_SESSION_LOG_DIR` (optional terminal session logging),
  `ARCHNEST_AGENT_TOKEN` (shared token enabling the Docker monitoring-agent
  ingest endpoint; ingest is disabled and returns 503 when unset),
  `ARCHNEST_AGENT_STALE_MS` (default 90000 ms; the age after which an agent
  report is shown as stale).
- `backend/src/docker/` — Docker Engine TCP API client used by `docker.ts`.
- `agent/` — the standalone Docker monitoring agent (`archnest-docker-agent.sh`
  + install/README). Runs on each Docker VM and pushes reports to ArchNest.

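
The "refuses to start without both" rule for the required env vars can be sketched as a fail-fast check run before the server listens. This is an illustration, not the code in `src/server.ts`: `requireEnv` and its error message are hypothetical, and the environment is passed in as a plain object so the check can be exercised without touching the real process environment.

```typescript
type Env = Record<string, string | undefined>;

// The two variables this README lists as required with no defaults.
const REQUIRED_VARS = ["ARCHNEST_SECRET_KEY", "ARCHNEST_JWT_SECRET"] as const;

// Returns the required values, or throws one error naming everything missing.
function requireEnv(env: Env): Record<(typeof REQUIRED_VARS)[number], string> {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(", ")}`);
  }
  return {
    ARCHNEST_SECRET_KEY: env["ARCHNEST_SECRET_KEY"] as string,
    ARCHNEST_JWT_SECRET: env["ARCHNEST_JWT_SECRET"] as string,
  };
}
```

Reporting every missing name at once saves a restart cycle compared with failing on the first one.
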
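
The `ARCHNEST_AGENT_STALE_MS` threshold amounts to an age comparison on the last report received from an agent. A minimal sketch follows, assuming timestamps in epoch milliseconds and a strict "older than" comparison; the function names are hypothetical and the real handling of invalid values may differ.

```typescript
const DEFAULT_STALE_MS = 90000; // documented default for ARCHNEST_AGENT_STALE_MS

// Parse the optional override; fall back to the default on missing or invalid input.
function parseStaleMs(raw: string | undefined): number {
  const parsed = Number(raw);
  return raw !== undefined && Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_STALE_MS;
}

// A report counts as stale once it is older than the threshold.
function isReportStale(receivedAtMs: number, nowMs: number, staleMs: number): boolean {
  return nowMs - receivedAtMs > staleMs;
}
```
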
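
The reason the metrics collectors run sequentially is easier to see in code: awaiting each collector before starting the next means only one SSH session is in flight per host at a time. The sketch below is illustrative only; the `Collector` type, `collectSequentially`, and the per-collector error handling are assumptions, not the implementation in `backend/src/ssh/metrics/`.

```typescript
type Collector = { name: string; run: () => Promise<unknown> };

// Run collectors one at a time so only one SSH session is open per host at once.
async function collectSequentially(collectors: Collector[]): Promise<Record<string, unknown>> {
  const results: Record<string, unknown> = {};
  for (const collector of collectors) {
    try {
      results[collector.name] = await collector.run();
    } catch (err) {
      // One failed collector should not discard the others (assumed behaviour).
      results[collector.name] = { error: String(err) };
    }
  }
  return results;
}
```

Starting all ten with `Promise.all` would be faster but would open ten sessions at once, which is exactly what the `MaxSessions` note warns against.
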
## Development

Frontend:
```bash
npm install
npm run dev
```

Backend:
```bash
cd backend
npm install
ARCHNEST_SECRET_KEY=$(openssl rand -hex 32) ARCHNEST_JWT_SECRET=$(openssl rand -hex 32) npm run dev
```

`ARCHNEST_DB_PATH` optionally overrides the SQLite file location (defaults to
a local path under `backend/`). `PORT` overrides the listen port (check
`server.ts` for the default).

Type-check both before committing — this is the minimum bar, not a substitute
for testing in a browser:
```bash
npx tsc --noEmit                    # from repo root, frontend
cd backend && npx tsc --noEmit      # backend
```
Vite/the browser surface some runtime errors (e.g. missing icon exports —
see the lucide-react gotcha in `design-decisions.md`) that the type-checker
won't catch.

## Tech Stack

**Frontend**
- React 19 + Vite + TypeScript, React Router, Tailwind CSS v4
- Recharts (donuts, line/area charts), Lucide React (icons)
- xterm.js (Terminal page terminal rendering)

**Backend**
- Fastify 5 + TypeScript, `tsx` for dev, `tsc -b` for build
- `better-sqlite3` for storage
- `@fastify/jwt` for auth tokens, `bcryptjs` for password hashing
- `zod` for request validation
- AES-256-GCM (Node `crypto`) for encrypting integration secrets at rest
- SSH client library powering the SSH transport layer (`backend/src/ssh/`)
- Guacamole Lite protocol for RDP/VNC/Telnet, proxied to a `guacd` sidecar

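
For readers unfamiliar with AES-256-GCM in Node `crypto`, the following sketch shows the general shape of an encrypt/decrypt pair. It borrows the `encryptSecret`/`decryptSecret` names from this README, but the payload layout (base64 IV, auth tag, and ciphertext joined by dots) and passing the key as a 32-byte `Buffer` are assumptions; the real `backend/src/db/crypto.ts` may store and derive things differently.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Assumed payload format: base64(iv) . base64(authTag) . base64(ciphertext).
function encryptSecret(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // 96-bit nonce, the recommended size for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return [iv, tag, ciphertext].map((part) => part.toString("base64")).join(".");
}

function decryptSecret(payload: string, key: Buffer): string {
  const [iv, tag, ciphertext] = payload.split(".").map((part) => Buffer.from(part, "base64"));
  if (!iv || !tag || !ciphertext) throw new Error("Malformed payload");
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag); // final() throws if the key is wrong or the data was altered
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

A fresh random IV per call matters here: reusing an IV with the same key breaks GCM's guarantees, which is why the IV is generated inside `encryptSecret` and stored alongside the ciphertext.
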
**Integrations**: Proxmox, Docker, NetBird, Cloudflare, AWS, Uptime Kuma,
Weather (wttr.in), SSH, Remote Desktop (RDP/VNC/Telnet via Guacamole) — see
`backend/src/integrations/` for adapter implementations.

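
The adapter pattern behind these integrations (a required `testConnection()`, an optional `listResources()`, and a registry keyed by integration type) might look roughly like the sketch below. Everything beyond those two method names is hypothetical: the config and result shapes, the two-member `IntegrationType`, and the choice of which adapter omits `listResources()` are for illustration only.

```typescript
// Hypothetical shapes; the real interfaces in backend/src/integrations/ may differ.
type IntegrationType = "proxmox" | "weather";

interface Resource { id: string; name: string }
interface TestResult { ok: boolean; message?: string }

interface IntegrationAdapter {
  testConnection(config: Record<string, string>): Promise<TestResult>; // required
  listResources?(config: Record<string, string>): Promise<Resource[]>; // optional
}

const proxmoxAdapter: IntegrationAdapter = {
  async testConnection(config) {
    return { ok: Boolean(config["url"]) };
  },
  async listResources() {
    return []; // a real adapter would query the Proxmox API here
  },
};

// Illustrative adapter that only supports connection testing.
const weatherAdapter: IntegrationAdapter = {
  async testConnection() {
    return { ok: true };
  },
};

// Record<IntegrationType, ...> makes the compiler reject a type without an adapter.
const registry: Record<IntegrationType, IntegrationAdapter> = {
  proxmox: proxmoxAdapter,
  weather: weatherAdapter,
};

function canListResources(type: IntegrationType): boolean {
  return typeof registry[type].listResources === "function";
}
```
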
**Deploy target:** Docker on `racknerd1` → Nginx Proxy Manager at
`archnest.snsnetlabs.com`.

## Deployment

**Live and deployed.** `.github/workflows/deploy.yml` triggers on every push
to `main`: builds, SCPs the repo to `racknerd1`, and runs
`docker compose up -d --build` there, gated on an `/api/health` health check.
No further setup is needed — merging a PR to `main` redeploys automatically.

`docker-compose.yml` runs 3 services: `archnest` (frontend), `archnest-backend`,
and `guacd` (remote desktop sidecar).

If a deploy fails, check the workflow run's `deploy` job steps in order:
`Pre-flight` (confirms host `.env` exists) → `Copy repo to racknerd1` →
`Build, restart, and clean up` → `Health check (backend /api/health)`.

One-time setup already done (reference only, shouldn't need repeating): host
provisioning (Docker/Compose on `racknerd1`, deploy SSH user, `/opt/archnest`
directory), `/opt/archnest/.env` populated from `.env.example` with real
secrets, `RACKNERD_HOST`/`RACKNERD_USER`/`RACKNERD_SSH_KEY` added as GitHub
Actions secrets, DNS/Nginx Proxy Manager pointed at the host.

## Documentation map

- **`README.md`** (this file) — architecture, tech stack, deployment, page list.
- **`HANDOFF.md`** — current task state, standing workflow rules (git workflow,
  mock-data policy, secrets discipline), and the auth/SSO roadmap. Read this
  before starting any new work session.
- **`design-decisions.md`** — visual/UX conventions (colors, typography, card
  style, animations) plus a detailed, accurate-as-of-now "Page Notes" section
  per page — what's actually rendered and where its data comes from. This is
  the file to update whenever a page's layout or data source changes.
- **`TERMIX_MIGRATION.md`** — phase-by-phase history of how the SSH/Tunnels/
  Files/Containers/Remote Desktop/Host Metrics/Transfer/Data-export feature
  set was built (originally scoped as a migration from a forked Termix
  project, hence the name). Useful for historical "why was it built this
  way" context on those specific features.
- **`.kiro/steering/design-rules.md`** — a condensed duplicate of
  `design-decisions.md`'s Global Rules, auto-injected into every Kiro IDE
  session (the Kiro extension reads `.kiro/steering/*` automatically). If you
  update a global design rule, update both files in the same change —
  `design-decisions.md` is canonical, this one just needs to stay in sync so
  Kiro doesn't steer on stale info.

Three older docs were deleted as part of a documentation cleanup:
`archnest-blueprint.md` and `glance.md` (the original 6-page mockup pitch and
an early Glance-only spec, both describing fictional config files and
placeholder numbers that never matched the real build), and
`.kiro/specs/archnest-dashboard/` (an abandoned Kiro spec — requirements-only,
no `design.md`/`tasks.md` ever followed — describing the same stale 6-page/
80px-sidebar/Zustand-based vision). Their still-accurate content (color
palette, dropdown menu shape, card styling) was folded into
`design-decisions.md` and `.kiro/steering/design-rules.md`; everything else
was superseded by the real, deployed implementation described above.