Commit graph

9 commits

Author SHA1 Message Date
Samuel James
70f88efdc8
Add mesh prerequisite gate (#33)
* Add mesh prerequisite gate (NetBird verification before app config)

Implements the design in docs/mesh-prerequisite-gate.md per the user's
DECIDE A-D answers: a permanent admin override, B1 (reachable) verification
with host mesh IP shown informationally, members allowed in with a notice
instead of being blocked, and mesh.required defaulting off so the live
production instance is unaffected.

- system_config kv table + getConfig/setConfig helpers
- /api/system/mesh-status, /mesh/verify, /mesh/override, /mesh/required
- AuthContext gains a 'needs-mesh' status (admins only) and exposes
  meshStatus for a member-facing banner
- MeshGate page reuses the integration create+test flow to connect NetBird

* Make mesh verification universal (CIDR check, not NetBird-specific)

Replace the NetBird-adapter-based "reachable" check with a vendor-agnostic
one: the admin supplies the mesh's IP range (CIDR), and verification just
confirms this host has an address inside it. Works identically for
NetBird, WireGuard, ZeroTier, Tailscale, or any other mesh tech, with no
integration record or vendor API call required.

* Add reachability fallback for routed meshes (VPC peering, etc.)

A host can be on the mesh's "side" of a routed network (e.g. a VPC peered
into a NetBird/WireGuard mesh) without holding a local IP in the mesh's
own CIDR. Local-IP-in-CIDR stays the primary check; if it fails, the admin
can supply a known peer/gateway IP on the mesh and we verify by pinging
it instead. Adds iputils to the backend image for the ping binary.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-20 17:30:46 -04:00
Samuel James
35fd7fc703
Add Docker-over-SSH management and push-agent monitoring (#31)
Expands the Containers feature with two new ways to see and manage Docker
containers without exposing the Docker Engine TCP socket, plus the docs and
roadmap entries that frame them.

Docker over SSH (management):
- Runs the `docker` CLI on a remote SSH host instead of talking to the Engine
  TCP API, reusing the existing SSH transport (jump-host chaining, host-key
  verification, key/password auth) via connectTarget + execCommand. No dockerd
  socket has to be exposed — the mesh + SSH auth are the gate.
- backend/src/ssh/docker.ts: list/logs/start/stop/restart/pause/unpause/remove
  and an interactive `docker exec` shell builder. Container refs are validated
  against a strict allowlist and single-quoted to prevent command injection;
  action verbs are whitelisted.
- backend/src/routes/dockerSsh.ts: REST routes mirroring the TCP Docker API
  shape (mutating actions gated by adminOnly) + a /api/docker-ssh/exec
  WebSocket modeled on the terminal PTY plumbing.
- Note: the SSH path uses the ssh2 key/password auth; it does not implement the
  OpenSSH-certificate (OPKSSH) fallback that the terminal route has.

Docker push-agent monitoring (self-hosted, read-only):
- A small bash agent (agent/archnest-docker-agent.sh) runs on each Docker VM,
  collects a rich snapshot (docker ps + inspect + a stats snapshot), masks
  secret-looking env values locally, and POSTs it to ArchNest. VMs need
  outbound-only mesh access — no exposed port, no SSH for monitoring.
- backend/src/routes/agents.ts: token-gated ingest
  (POST /api/agents/docker/report, ARCHNEST_AGENT_TOKEN, constant-time compare;
  503 when unset, so it is disabled by default) plus user-auth read endpoints
  (hosts list with staleness flag, per-host containers, single-container
  detail). New docker_agent_reports table (latest report per host).
- Ingest stores data only; it never executes anything from the agent.

Containers page:
- Host selector now spans Docker API, SSH, and Agent sources.
- Intra-page tabs: a Containers list plus dynamic, closeable per-container
  detail tabs opened by clicking a container name. Agent detail shows
  overview/state/stats/ports/networks/mounts/env(masked)/labels; docker/ssh
  degrade gracefully. Agent rows are read-only; docker/ssh keep management.

Docs/roadmap:
- docs/docker-agent-monitoring.md (design doc, written before implementation).
- ROADMAP.md: LXC management (paid), Docker monitoring agent tiering
  (push self-hosted now / pull-agent paid), terminal grid tiering.

Deferred (documented, not built here): the mesh-prerequisite setup gate, the
paid pull-agent (Option 2), per-host tokens, time-series metrics.

Requires ARCHNEST_AGENT_TOKEN in the backend env to enable agent ingest.
Verified: backend `tsc --noEmit` and frontend `tsc -b && vite build` both pass;
agent jq filters, byte conversion, and `bash -n` checked locally.

Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
2026-06-20 16:24:57 -04:00
Samuel James
d863448495
Add auth Phase 3: multi-user accounts with admin/member roles (#28)
Implements Phase 3 of the auth roadmap: multiple user accounts (cap 10),
an admin/member role model, and admin-only gating of config-mutating
routes. Dashboard data stays shared across all users (per the product
decision in HANDOFF.md — this is a household/self-hosted dashboard, not
a multi-tenant app), so there is no per-user data isolation.

Schema (backend/src/db/index.ts):
- Idempotent migration adds `role` (default 'admin') and `active`
  (default 1) columns to `users` when missing. The 'admin' default means
  the pre-existing single user is backfilled to admin on deploy and keeps
  full access; newly created users are inserted explicitly as 'member'.
  Verified against a production-like old schema (columns added, existing
  user backfilled to admin/active).

Auth + access control:
- `/api/setup` creates the first user as admin. Login enforces `active`
  (deactivated accounts get 403) and embeds the live role in the session.
- `app.authenticate` now reads role+active fresh from the DB on every
  request (not from the possibly-stale JWT claim), rejects inactive
  accounts, and stashes the role on req.user.
- New `requireAdmin` (auth + role check) and `adminOnly` (role check for
  routes already behind the plugin-level authenticate hook) decorators.

User management (admin-only, in auth.ts):
- GET/POST/PUT/DELETE /api/users — list, create (admin sets a temp
  password; no public signup), change role, activate/deactivate, delete.
- 10-user cap enforced server-side; guard rails prevent removing the last
  active admin (demote/deactivate/delete) and deleting your own account;
  deactivating or deleting a user drops their sessions immediately.

Admin-only route gating (members get 403):
- integrations create/update/delete/test, tunnels create/delete, data
  export/import. Read routes and tunnel connect/disconnect stay open to
  all authenticated users, as do all the SSH/Docker/RDP tools and
  bookmarks (members are trusted to use the tooling, per product decision).

Frontend:
- api.ts: listUsers/createUser/updateUser/deleteUser + ManagedUser type;
  role+active added to AuthUser.
- Settings: new admin-only "Users" section (create form, role toggle,
  activate/deactivate, delete, 10-cap indicator). Nav filters the Users
  tab by role and guards ?tab= deep-links. Data & Backup shows an
  admin-only notice for members; Integrations shows a read-only banner
  for members. (Backend remains the real enforcement boundary.)

Verified end-to-end against a throwaway backend: role assignment,
member 403s on every admin-only route + 200s on shared/read routes,
admin 200/201s, last-admin guards (409/400), deactivation killing an
active session and blocking re-login (then reactivation restoring it),
and the 10-user cap (409 on the 11th). Both frontend and backend
type-check clean.

Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
2026-06-20 12:43:24 -04:00
Samuel James
2ccc7b82d7
Add auth Phase 2: password change, sessions, and login audit log (#27)
Builds out the Settings → Security tab (previously a "coming soon"
placeholder) and the backend behind it. Still single-user; multi-user
and SSO remain Phases 3-4.

Backend:
- New `sessions` table (id, user_id, user_agent, ip, created_at,
  last_seen_at) and `login_events` table (user_id, username, ip,
  user_agent, success, created_at).
- Login and setup now mint a session row and embed its id as a `sid`
  claim in the JWT. The `authenticate` hook validates that the session
  still exists (and bumps last_seen_at), so revoking a session genuinely
  invalidates its token instead of relying on the JWT signature alone.
  Tokens minted before sessions existed have no `sid` and stay valid
  until expiry, for backward compatibility.
- Every login attempt (success and failure) is recorded in login_events
  for the audit trail.
- New endpoints: PUT /api/auth/password (verifies current via bcrypt,
  hashes new at cost 12, revokes all *other* sessions on success),
  GET /api/auth/sessions, DELETE /api/auth/sessions/:id (can't revoke
  the current one), POST /api/auth/logout (revokes current session),
  GET /api/auth/login-events?limit.
- AuthContext.logout() now calls POST /api/auth/logout best-effort so
  signing out revokes the server session, not just the local token.

Frontend:
- SecuritySection: change-password form (current/new/confirm with
  show/hide and client-side validation), active-sessions list (device
  description from user-agent, IP, last-seen relative time, per-session
  "Sign out" for non-current sessions), and a recent login-activity feed
  (success/failure dot, user, IP, relative time).
- api.ts: changePassword/listSessions/revokeSession/logout/
  listLoginEvents + AuthSession/LoginEvent types.

Verified end-to-end against a throwaway backend instance: session
creation, second-device session, failed-login logging, cross-session
revocation invalidating the revoked token, password change keeping the
current session alive while revoking others, and logout invalidating the
current session. Frontend + backend both type-check clean.

Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
2026-06-20 11:50:56 -04:00
Claude
eaa971bb5a
Phase 2: SSH tunnels (local/remote/dynamic SOCKS5 port forwarding)
- backend/src/ssh/connect.ts: extracted shared SSH-connect logic
  (jump-host chaining, TOFU host-key verification) out of terminal.ts
  so tunnels can reuse it.
- backend/src/tunnels/manager.ts + socks5.ts: in-memory tunnel
  runtime manager supporting local forward (forwardOut), remote
  forward (forwardIn), and dynamic SOCKS5 proxying, with automatic
  reconnect/retry and an auto-start-on-boot option. New `tunnels`
  table persists configs as the saved presets.
- backend/src/routes/tunnels.ts: REST CRUD + connect/disconnect.
- src/pages/Tunnels.tsx: new /tunnels page (sidebar entry added) to
  create, start/stop, and delete tunnels with live status polling.
- Verified end-to-end against a real ssh2 test server handling real
  forwardOut/forwardIn requests and a real upstream TCP echo server -
  all three tunnel modes moved real data, and disconnect correctly
  tore down the local listener.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BbJV5nm8KPVH1oNJYKpnoF
2026-06-19 11:40:59 +00:00
Claude
5d56a1d902
Phase 1b: SSH jump-host chaining, TOFU host-key verification, multi-host Settings UI
Terminal connections can now reference a jumpHostIntegrationId on the SSH
integration config; the backend connects to the jump host first and tunnels
to the real target via ssh2's forwardOut(), rather than connecting directly.

Added an ssh_host_keys table and a hostVerifier callback that accepts and
stores a host's fingerprint on first connect, then hard-rejects on any
mismatch on subsequent connects (trust-on-first-use).

Settings previously only ever showed/edited one integration per type, which
silently prevented configuring more than one SSH host at all. Added a
dedicated multi-host SSH section (per-host Save/Test/Delete, Add SSH Host,
and a Jump Host dropdown) so jump-host chaining is actually usable from the UI.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BbJV5nm8KPVH1oNJYKpnoF
2026-06-19 11:04:46 +00:00
Claude
71f49e0700
Add Phase 1a: core SSH terminal (Termix migration)
Implements the minimal-viable terminal described in TERMIX_MIGRATION.md
Phase 1a: a real interactive SSH session in the browser over a
WebSocket, using xterm.js on the frontend and ssh2 on the backend.
Reuses ArchNest's existing SSH integrations (host/port/username/
password/privateKey/passphrase) instead of introducing a second,
duplicate host-management system the way Termix has one.

Backend: new /api/terminal WebSocket route (registered via
@fastify/websocket) handling connect/input/resize/disconnect messages,
authenticated via a JWT passed as a query param (browsers can't set
custom headers on the WS handshake). Extracted the integration secret
loader out of routes/integrations.ts into db/secrets.ts so the new
terminal route can reuse it without duplicating the decrypt logic.

Frontend: new Terminal.tsx page listing configured SSH hosts and
rendering an xterm.js terminal wired to the WebSocket; wired into
App.tsx at /terminal. vite.config.ts's dev proxy now forwards
WebSocket upgrades (ws: true) so this works under `npm run dev`.

Verified end-to-end against a real (test) ssh2-based SSH server:
connect, shell banner, keystroke echo, and prompt redraw all worked
correctly over the actual WebSocket protocol.

Deliberately deferred to Phase 1b/1c per the migration doc: jump-host
chaining, tab/split-pane UI, terminal theme/font settings, OPKSSH cert
auth, tmux session monitor, session recording.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BbJV5nm8KPVH1oNJYKpnoF
2026-06-19 10:52:04 +00:00
Claude
3b920fcfb2
Replace mock data on Glance and Infrastructure with real backend data
Adds an events table + logEvent helper for a genuine activity log, and
a /api/integrations/resources aggregate endpoint backed by a new optional
listResources adapter method (implemented for Docker via its containers API).
StatusCards, MiddleRow, BottomRow, and Infrastructure now render real
integration/resource/event data instead of hardcoded numbers, with empty
states where no data source exists yet (AWS cost, historical trends).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BbJV5nm8KPVH1oNJYKpnoF
2026-06-18 19:56:10 +00:00
Claude
a0b71c7028
Add backend skeleton: Fastify + SQLite API with auth and integrations
- Single-user JWT auth with a first-run /api/setup endpoint, gated by
  GET /api/system/setup-status, to back an upcoming enrollment page
- SQLite schema for users, integrations, secrets (AES-256-GCM encrypted),
  bookmarks, and bookmark categories
- Integration adapter registry with real health-check adapters for
  Uptime Kuma and Docker, stubs for the rest, wired to
  POST /api/integrations/:id/test
- CRUD routes for integrations and bookmarks
- backend/ as its own Docker service in docker-compose.yml, Vite dev
  proxy for /api, .env.example for required secrets

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BbJV5nm8KPVH1oNJYKpnoF
2026-06-18 19:04:48 +00:00