Commit graph

11 commits

Author SHA1 Message Date
Samuel James
ad4687660c Document the Forgejo CI/CD + racknerd2 setup as the baseline
All checks were successful
Build & Push Images / build (push) Successful in 41s
CI / validate (push) Successful in 51s
Build & Push Images / deploy (push) Successful in 30s
Make the automated pipeline the documented "setup moving forward" and
finish scrubbing the last stale GitHub-Actions/racknerd1 references that
never reached main.

- HANDOFF.md: refresh the stale 2026-06-21 snapshot. New "CI/CD & deploy"
  section (push to main -> build + push to registry.snsnetlabs.com ->
  auto-deploy to racknerd2 over SSH, SHA-pinned, /api/health gate),
  racknerd2 validation-host + SSH-tunnel access notes, Forgejo workflow
  rule, and a current Deployment + orientation section.
- .kiro/steering/project-guide.md: Forgejo-only Git workflow (no gh),
  CI/CD row, registry host, racknerd2 + forgejo-runner SSH entries, and a
  CI/CD pipeline section.
- .kiro/hooks/tunnel-racknerd2-8080.kiro.hook: the "View ArchNest on
  racknerd2" hook (ssh -L 8080:localhost:8080 -N) to view the deployed
  site at http://localhost:8080 (racknerd2's edge only allows port 22).
- src/pages/Settings.tsx: About panel repo URL -> Forgejo.
- .dockerignore: .github -> .forgejo.
- TERMIX_MIGRATION.md / docs/OPEN-SOURCE-RELEASE.md: drop stale
  .github/workflows + "GitHub Actions deploy" references.
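
The `/api/health` gate mentioned above can be pictured as a small polling loop. This is a minimal sketch only: `waitForHealthy`, the `HealthFetcher` type, and the retry counts are hypothetical names and values, not taken from the workflow files. The only detail assumed from the log is that a healthy backend answers `{"ok":true}`.

```typescript
// Hypothetical health gate: poll an endpoint until it reports {"ok": true}.
// The fetcher is injected so the loop can be exercised without a network.
type HealthFetcher = (url: string) => Promise<{ ok?: boolean }>;

async function waitForHealthy(
  url: string,
  fetchHealth: HealthFetcher,
  attempts = 10, // assumed retry budget
  delayMs = 3000, // assumed pause between attempts
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      const body = await fetchHealth(url);
      if (body.ok === true) return true;
    } catch {
      // A refused connection just means "not ready yet"; retry.
    }
    if (i < attempts - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}
```

A deploy step could treat a `false` result as a failed deploy. In a real run the fetcher would probably wrap an HTTP client call against the deployed host.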

Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
2026-06-25 13:37:39 -04:00
Samuel James
00fc3ceed3 Point registry at registry.snsnetlabs.com; record even=dev versioning
Some checks failed
Build & Push Images / build (push) Failing after 29s
CI / validate (push) Successful in 1m12s
The Forgejo container registry now lives on a dedicated unproxied
(DNS-only) host, registry.snsnetlabs.com, so large image layers bypass
Cloudflare's ~100 MB request-body cap (the backend image's 262 MB and
317 MB layers previously hit 413 Payload Too Large through the proxied
forgejo.snsnetlabs.com host). The web UI / packages list stays on
forgejo.snsnetlabs.com behind Cloudflare Access SSO.

- build.yml: REGISTRY -> registry.snsnetlabs.com
- deploy/docker-compose.yml: image refs -> registry.snsnetlabs.com
- deploy/README.md: push/pull/login host -> registry.snsnetlabs.com
  (packages web UI URL kept on forgejo.snsnetlabs.com)

Also record the versioning convention in HANDOFF + steering: development
happens on even major versions, releases on odd; currently developing v2
(prior released line is v1, see the v1.0 git tag). package.json and the
About panel are not yet bumped to v2.
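
The even/odd convention can be expressed as a one-line parity check. A minimal sketch, assuming version strings look like `v2`, `1.0`, or `v1.0`; `channelForVersion` and the `Channel` type are hypothetical names used only for illustration.

```typescript
type Channel = "development" | "release";

// Even major version = development line, odd major version = release line.
function channelForVersion(version: string): Channel {
  const match = /^v?(\d+)/.exec(version.trim());
  if (!match) {
    throw new Error(`Unrecognized version: ${version}`);
  }
  const major = Number(match[1]);
  return major % 2 === 0 ? "development" : "release";
}
```

Under this rule `v2` maps to the development line and the `v1.0` tag to the release line.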

Validated end to end: built both images on the runner host, pushed to
registry.snsnetlabs.com (backend included, no 413), pulled on racknerd2,
brought the stack up, /api/health returns {"ok":true} over the mesh IP.

Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
2026-06-25 10:55:15 -04:00
Samuel James
9f10e8ee6f
Group all integration node tiles by integration except Proxmox (#39)
Generalizes the Uptime Kuma monitor-grouping pattern to every integration:
Node Status now collapses each integration's resources into one tile (e.g.
30 EC2 instances under one "AWS" tile) instead of flooding the grid, with
members listed in Node Detail on selection. Proxmox stays ungrouped since
its VMs/LXCs are managed individually elsewhere in the app.

Adds integrationType to the /api/integrations/resources response so the
frontend can group/exclude by adapter type rather than resource kind (kind
alone can't distinguish Proxmox VMs from AWS VMs, for example).
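
The grouping rule can be sketched as follows. Only the `integrationType` field comes from the commit message; the other fields, the `Tile` shape, and `buildTiles` are hypothetical and exist just to make the example self-contained.

```typescript
interface IntegrationResource {
  id: string;
  name: string;
  integrationId: string; // assumed: which configured integration owns the resource
  integrationType: string; // adapter type, e.g. "aws" or "proxmox"
}

interface Tile {
  label: string;
  members: IntegrationResource[];
}

// Adapter types whose resources keep one tile each.
const UNGROUPED_TYPES = new Set(["proxmox"]);

function buildTiles(resources: IntegrationResource[]): Tile[] {
  const tiles: Tile[] = [];
  const groups = new Map<string, Tile>();
  for (const resource of resources) {
    if (UNGROUPED_TYPES.has(resource.integrationType)) {
      tiles.push({ label: resource.name, members: [resource] });
      continue;
    }
    let tile = groups.get(resource.integrationId);
    if (!tile) {
      tile = { label: resource.integrationType, members: [] };
      groups.set(resource.integrationId, tile);
      tiles.push(tile);
    }
    tile.members.push(resource);
  }
  return tiles;
}
```

Keying on the adapter type rather than the resource kind is what lets a Proxmox VM stay individual while an AWS VM is folded into its integration's tile.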

Documents the grouping rule in HANDOFF.md and adds a paid-tier roadmap
entry for per-integration node tabs that will show every individual node.

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-21 09:35:55 -04:00
Samuel James
07116a0475
Docker setup-script hint + expanded Help page (#35)
* Add mesh prerequisite gate (NetBird verification before app config)

Implements the design in docs/mesh-prerequisite-gate.md per the user's
answers to open decisions A-D: a permanent admin override, B1 (reachable)
verification with the host mesh IP shown for information only, members
allowed in with a notice instead of being blocked, and mesh.required
defaulting to off so the live production instance is unaffected.

- system_config kv table + getConfig/setConfig helpers
- /api/system/mesh-status, /mesh/verify, /mesh/override, /mesh/required
- AuthContext gains a 'needs-mesh' status (admins only) and exposes
  meshStatus for a member-facing banner
- MeshGate page reuses the integration create+test flow to connect NetBird

* Make mesh verification universal (CIDR check, not NetBird-specific)

Replace the NetBird-adapter-based "reachable" check with a vendor-agnostic
one: the admin supplies the mesh's IP range (CIDR), and verification just
confirms this host has an address inside it. Works identically for
NetBird, WireGuard, ZeroTier, Tailscale, or any other mesh tech, with no
integration record or vendor API call required.
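
The CIDR membership test at the heart of this check is plain integer arithmetic. A minimal IPv4-only sketch; `ipv4ToInt` and `isInCidr` are hypothetical names, and the backend's real helper may differ (for example by handling IPv6).

```typescript
// Convert dotted-quad IPv4 text to an unsigned integer.
function ipv4ToInt(ip: string): number {
  const parts = ip.split(".");
  if (parts.length !== 4) throw new Error(`Invalid IPv4 address: ${ip}`);
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part) || Number(part) > 255) {
      throw new Error(`Invalid IPv4 address: ${ip}`);
    }
    value = value * 256 + Number(part);
  }
  return value;
}

// True when `ip` falls inside `cidr` (e.g. "100.64.0.0/10").
function isInCidr(ip: string, cidr: string): boolean {
  const match = /^([\d.]+)\/(\d{1,2})$/.exec(cidr);
  if (!match) throw new Error(`Invalid CIDR: ${cidr}`);
  const [, base = "", prefixText = ""] = match;
  const prefix = Number(prefixText);
  if (prefix > 32) throw new Error(`Invalid CIDR prefix: ${cidr}`);
  // Two addresses share a network when they land in the same block.
  const blockSize = 2 ** (32 - prefix);
  return (
    Math.floor(ipv4ToInt(ip) / blockSize) ===
    Math.floor(ipv4ToInt(base) / blockSize)
  );
}
```

Verification would then pass if any of the host's local addresses satisfies `isInCidr` for the range the admin supplied. The `100.64.0.0/10` range is only an illustrative value.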

* Add reachability fallback for routed meshes (VPC peering, etc.)

A host can be on the mesh's "side" of a routed network (e.g. a VPC peered
into a NetBird/WireGuard mesh) without holding a local IP in the mesh's
own CIDR. Local-IP-in-CIDR stays the primary check; if it fails, the admin
can supply a known peer/gateway IP on the mesh and we verify by pinging
it instead. Adds iputils to the backend image for the ping binary.
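
The primary-then-fallback order described here might look like the sketch below. `verifyMesh`, `MeshVerdict`, and both injected callbacks are hypothetical. The real backend shells out to the `ping` binary, which is replaced by an injected function here so the logic stays testable.

```typescript
type MeshVerdict =
  | { verified: true; method: "local-ip" | "peer-ping" }
  | { verified: false };

async function verifyMesh(
  localIpInCidr: () => boolean, // primary check: host holds an address in the mesh CIDR
  pingPeer: (ip: string) => Promise<boolean>, // fallback: reach a known peer/gateway
  peerIp?: string, // only set when the admin supplied one
): Promise<MeshVerdict> {
  if (localIpInCidr()) {
    return { verified: true, method: "local-ip" };
  }
  if (peerIp !== undefined && (await pingPeer(peerIp))) {
    return { verified: true, method: "peer-ping" };
  }
  return { verified: false };
}
```

The fallback only runs when the primary check fails and a peer IP was supplied, so hosts directly on the mesh never need the ping.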

* Add Mesh section to Settings for configuring/testing the mesh gate

Admins can now toggle mesh.required, run verify/override, and see
current mesh status entirely from the app, without hitting the API
directly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019hu9pZvJY4BgmcQeAw2ugk

* Show a host-specific Docker remote-API setup script in Settings

When adding/editing a Docker integration with a tcp:// or http:// remote
URL, display a copyable systemd override + curl verification script
scoped to the entered host:port, so enabling the daemon's API doesn't
require looking up the steps separately.
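
A generator for such a script could look roughly like this. The log does not show the script the app emits, so the function name and the exact script text are assumptions. The systemd drop-in path, the empty `ExecStart=` reset line, and the `/version` endpoint follow standard Docker daemon conventions.

```typescript
// Build a copyable setup script for the host:port in a tcp:// or http:// URL.
function dockerRemoteApiSetupScript(remoteUrl: string): string {
  const match = /^(?:tcp|http):\/\/([A-Za-z0-9.-]+):(\d{1,5})\/?$/.exec(
    remoteUrl.trim(),
  );
  if (!match) {
    throw new Error(`Expected tcp://host:port or http://host:port, got: ${remoteUrl}`);
  }
  const [, host = "", port = ""] = match;
  return [
    "# Run on the Docker host. This exposes the daemon API without TLS,",
    "# so only use it on a trusted network such as the mesh.",
    "sudo mkdir -p /etc/systemd/system/docker.service.d",
    "sudo tee /etc/systemd/system/docker.service.d/override.conf >/dev/null <<'EOF'",
    "[Service]",
    "ExecStart=",
    `ExecStart=/usr/bin/dockerd -H fd:// -H tcp://${host}:${port}`,
    "EOF",
    "sudo systemctl daemon-reload",
    "sudo systemctl restart docker",
    "# Verify the API answers:",
    `curl -fsS http://${host}:${port}/version`,
  ].join("\n");
}
```

Scoping both the listener and the `curl` check to the entered host:port keeps the generated script consistent with what the integration will actually connect to.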

* Expand Help page with quick-start guide and real-world examples

Adds a quick-start ordering card and per-feature example callouts (with icons) so first-time users see concrete use cases, not just descriptions.

* Update HANDOFF/README for handoff: mesh gate shipped, Docker UX work, no feature queued

Corrects the stale 'mesh gate not built' framing (it shipped across 4 commits, all merged) and documents the Docker setup-script hint + Help page expansion done this session. Leaves a clear next-task list for the next agent: decide whether to merge claude/youthful-cerf-ibvxfb, then check with the user for the next priority.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-21 04:34:59 -04:00
Samuel James
cdd93f204e
docs: sync HANDOFF/README/design-decisions; add mesh-gate design (#32)
Bring the docs in line with what shipped since the auth phases, and hand
off the next planned feature cleanly for another agent to pick up.

- HANDOFF.md: new TL;DR (auth complete; persistent terminals + Docker
  three-ways shipped); prominent "next task = Mesh Prerequisite Gate"
  callout warning not to code before the open decisions are answered;
  corrected standing rules (kiro/<feature> branches, gh-based workflow,
  npm run build over plain tsc, Co-authored-by trailers); architecture
  sections updated for TerminalSessionContext, dockerSsh/agents routes,
  docker_agent_reports table, ssh/docker.ts, and the new agent env vars;
  new "Docker: three ways" section.
- README.md: Containers/Terminal page rows, route-group list, SSH layer,
  agent/ dir, ARCHNEST_AGENT_TOKEN/ARCHNEST_AGENT_STALE_MS, current-state
  paragraph, and doc reading order.
- design-decisions.md: Terminal (persistence) and Containers (three
  sources + detail tab) page notes; backend Docker-transport note; mesh
  gate flagged under Future Integration Notes.
- docs/mesh-prerequisite-gate.md (new): full design with lockout-safety
  invariants and the open decisions (A-D) needed before implementation.

Docs only; no code changed.

Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
2026-06-20 16:42:47 -04:00
Samuel James
b836ac1a02
Keep SSH terminal sessions connected across page navigation (#30)
The Terminal page held all session state (xterm instances and their
WebSockets) in component-local React state. Because it renders as a
`<Route element={<Terminal />}>`, navigating away unmounted it and ran
the xterm cleanup (`term.dispose()` + `ws.close()`), tearing down every
SSH session. Returning to the page reconnected from scratch, losing
scrollback and any running work.

Lift terminal sessions into a `TerminalSessionProvider` mounted above the
router (in `main.tsx`, inside `AuthProvider`). The provider owns each
pane's xterm instance, fit addon, WebSocket, and a persistent wrapper DOM
node. Wrappers live in a hidden container at the app root; the Terminal
page re-parents them into its grid on mount and moves them back to the
hidden root on unmount instead of disposing them, so the xterm instance
and WebSocket keep running in the background across route changes.
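
The park-and-re-parent idea can be shown without React or xterm. In this sketch `TerminalSessionRegistry`, `Container`, and `Session` are hypothetical names. A generic node type stands in for the wrapper DOM element, relying on the fact that `appendChild` on a real DOM node moves it rather than copying it.

```typescript
// Anything with appendChild, e.g. an HTMLElement in the browser.
interface Container<N> {
  appendChild(node: N): unknown;
}

interface Session<N> {
  wrapper: N; // persistent wrapper that hosts the terminal
  close: () => void; // disposes the terminal and closes its socket
}

class TerminalSessionRegistry<N> {
  private readonly sessions = new Map<string, Session<N>>();
  private readonly hiddenRoot: Container<N>;

  constructor(hiddenRoot: Container<N>) {
    this.hiddenRoot = hiddenRoot;
  }

  // New sessions start parked in the hidden root.
  open(id: string, wrapper: N, close: () => void): void {
    this.sessions.set(id, { wrapper, close });
    this.hiddenRoot.appendChild(wrapper);
  }

  // Page mounted: move every wrapper into the visible grid.
  attachAll(grid: Container<N>): void {
    this.sessions.forEach((s) => grid.appendChild(s.wrapper));
  }

  // Page unmounted: park wrappers again; nothing is disposed.
  parkAll(): void {
    this.sessions.forEach((s) => this.hiddenRoot.appendChild(s.wrapper));
  }

  // Closing a tab/pane really ends that session.
  destroy(id: string): void {
    const session = this.sessions.get(id);
    if (!session) return;
    session.close();
    this.sessions.delete(id);
  }

  // Logout: end everything.
  destroyAll(): void {
    this.sessions.forEach((s) => s.close());
    this.sessions.clear();
  }
}
```

The key design point is that unmounting calls `parkAll`, never `destroy`, so route changes no longer reach the cleanup path.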

Disconnect semantics: closing a tab/pane (or shrinking the 1/2/4 grid)
destroys those sessions; logout tears down all sessions. A full browser
reload still drops connections (the WebSocket dies with the page);
sessions persist across in-app navigation only.

Shared terminal constants/types/prefs are split into a non-component
module (`src/lib/terminalPrefs.ts`) so the context file stays a clean
component module.

Also document the terminal window grid-view tiering in ROADMAP.md
(self-hosted = 4-window cap, current; paid = as many as fit on screen,
planned for the AWS deployment), and realign HANDOFF/README/design-docs
to reflect that auth Phase 3 (multi-user) shipped and Phase 4 (SSO) is
deferred to a paid AWS add-on.

Verified with a clean `tsc -b && vite build` (frontend) and
`tsc --noEmit -p .` (backend).

Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
2026-06-20 15:02:50 -04:00
Samuel James
d863448495
Add auth Phase 3: multi-user accounts with admin/member roles (#28)
Implements Phase 3 of the auth roadmap: multiple user accounts (cap 10),
an admin/member role model, and admin-only gating of config-mutating
routes. Dashboard data stays shared across all users (per the product
decision in HANDOFF.md — this is a household/self-hosted dashboard, not
a multi-tenant app), so there is no per-user data isolation.

Schema (backend/src/db/index.ts):
- Idempotent migration adds `role` (default 'admin') and `active`
  (default 1) columns to `users` when missing. The 'admin' default means
  the pre-existing single user is backfilled to admin on deploy and keeps
  full access; newly created users are inserted explicitly as 'member'.
  Verified against a production-like old schema (columns added, existing
  user backfilled to admin/active).
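
An idempotent column migration of this kind usually boils down to "check which columns exist, add only the missing ones". A minimal sketch: `pendingUserMigrations` is a hypothetical name, and the SQL column types are assumptions, since the log gives only the column names and defaults.

```typescript
// Return the ALTER statements still needed, given the columns already present.
// Running it again after applying the result yields an empty list.
function pendingUserMigrations(existingColumns: string[]): string[] {
  const present = new Set(existingColumns);
  const statements: string[] = [];
  if (!present.has("role")) {
    // DEFAULT 'admin' backfills the pre-existing single user as admin.
    statements.push(
      "ALTER TABLE users ADD COLUMN role TEXT NOT NULL DEFAULT 'admin'",
    );
  }
  if (!present.has("active")) {
    statements.push(
      "ALTER TABLE users ADD COLUMN active INTEGER NOT NULL DEFAULT 1",
    );
  }
  return statements;
}
```

Because the default only applies to backfilled rows, new users still need their role written explicitly as 'member' at insert time, as the commit message notes.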

Auth + access control:
- `/api/setup` creates the first user as admin. Login enforces `active`
  (deactivated accounts get 403) and embeds the live role in the session.
- `app.authenticate` now reads role+active fresh from the DB on every
  request (not from the possibly-stale JWT claim), rejects inactive
  accounts, and stashes the role on req.user.
- New `requireAdmin` (auth + role check) and `adminOnly` (role check for
  routes already behind the plugin-level authenticate hook) decorators.
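
The point of reading role and active state from the database on each request is that a token issued earlier cannot outlive a demotion or deactivation. A minimal sketch of that decision, with hypothetical names (`decideAccess`, `TokenClaims`, `UserRow`, `lookupUser`) and outcome labels in place of HTTP status codes:

```typescript
type Role = "admin" | "member";

interface TokenClaims {
  userId: number;
  role: Role; // may be stale; deliberately ignored below
}

interface UserRow {
  id: number;
  role: Role;
  active: boolean;
}

type Decision = "allow" | "unknown-user" | "inactive" | "forbidden";

function decideAccess(
  claims: TokenClaims,
  lookupUser: (id: number) => UserRow | undefined, // fresh DB read per request
  adminOnly: boolean,
): Decision {
  const row = lookupUser(claims.userId);
  if (!row) return "unknown-user";
  if (!row.active) return "inactive";
  if (adminOnly && row.role !== "admin") return "forbidden";
  return "allow";
}
```

With this shape, a user demoted from admin to member is refused on admin-only routes immediately, even while holding a token that still claims the admin role.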

User management (admin-only, in auth.ts):
- GET/POST/PUT/DELETE /api/users — list, create (admin sets a temp
  password; no public signup), change role, activate/deactivate, delete.
- 10-user cap enforced server-side; guard rails prevent removing the last
  active admin (demote/deactivate/delete) and deleting your own account;
  deactivating or deleting a user drops their sessions immediately.
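
The two server-side guard rails can be sketched as pure checks. The names are hypothetical, and whether deactivated accounts count toward the cap is an assumption here (this sketch counts every account).

```typescript
type Role = "admin" | "member";

interface Account {
  id: number;
  role: Role;
  active: boolean;
}

const MAX_USERS = 10;

// Creation is refused once the cap is reached.
function canCreateUser(accounts: Account[]): boolean {
  return accounts.length < MAX_USERS;
}

// True when demoting, deactivating, or deleting `targetId` would leave
// the instance with no active admin.
function wouldRemoveLastActiveAdmin(accounts: Account[], targetId: number): boolean {
  const target = accounts.find((a) => a.id === targetId);
  if (!target || !target.active || target.role !== "admin") return false;
  const otherActiveAdmins = accounts.filter(
    (a) => a.id !== targetId && a.active && a.role === "admin",
  );
  return otherActiveAdmins.length === 0;
}
```

The same predicate covers all three destructive actions, since each removes the target from the set of active admins.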

Admin-only route gating (members get 403):
- integrations create/update/delete/test, tunnels create/delete, data
  export/import. Read routes and tunnel connect/disconnect stay open to
  all authenticated users, as do all the SSH/Docker/RDP tools and
  bookmarks (members are trusted to use the tooling, per product decision).

Frontend:
- api.ts: listUsers/createUser/updateUser/deleteUser + ManagedUser type;
  role+active added to AuthUser.
- Settings: new admin-only "Users" section (create form, role toggle,
  activate/deactivate, delete, 10-cap indicator). Nav filters the Users
  tab by role and guards ?tab= deep-links. Data & Backup shows an
  admin-only notice for members; Integrations shows a read-only banner
  for members. (Backend remains the real enforcement boundary.)

Verified end-to-end against a throwaway backend: role assignment,
member 403s on every admin-only route + 200s on shared/read routes,
admin 200/201s, last-admin guards (409/400), deactivation killing an
active session and blocking re-login (then reactivation restoring it),
and the 10-user cap (409 on the 11th). Both frontend and backend
type-check clean.

Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
2026-06-20 12:43:24 -04:00
Samuel James
ae5142769d
Update handoff docs for deployed state and auth roadmap (#22)
* Add editable display-name field to generic integrations

Lets users set a custom name for Proxmox, Docker, AWS, Remote Desktop,
NetBird, Cloudflare, Uptime Kuma, and Weather integrations, separate
from the host/IP field, mirroring the SSH host rename pattern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016kF4hZWEkRCPPvCZTeXxn4

* Surface the new-integration name field as a labeled input

The name field for new generic integrations was a faint header input
with only placeholder text, easy to miss. Move it into the form grid
as a proper labeled "Name" field next to the other connection fields.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016kF4hZWEkRCPPvCZTeXxn4

* Add file upload for SSH private key and certificate fields

Lets users pick a key file from disk (e.g. ~/.ssh) instead of pasting
its contents into the Private Key / OPKSSH Certificate fields.

* Fix SSH private key paste corrupting multi-line PEM format

Private Key and Certificate fields were single-line <input> elements,
which strip newlines on paste and corrupt PEM-formatted keys (causing
'Unsupported key format' errors). Render them as multi-line textareas
instead so pasted keys keep their line breaks.

* Add JSON-converted bookmark import file for ArchNest data import

Converts homarr-bookmarks.md into the format expected by /api/data/import.

* Auto-populate bookmark icons via favicon service in import JSON

Each bookmark now points to Google's favicon endpoint for its domain
instead of having no icon at all.
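
Deriving an icon URL from a bookmark's domain might look like the sketch below. The commit only says "Google's favicon endpoint", so the exact URL form (`https://www.google.com/s2/favicons` with `domain` and `sz` parameters) and the name `faviconUrlFor` are assumptions.

```typescript
// Build a favicon URL for the host part of a bookmark URL.
function faviconUrlFor(bookmarkUrl: string, size = 64): string {
  // scheme://[userinfo@]host[:port][/path...]
  const match = /^[a-z][a-z0-9+.-]*:\/\/(?:[^/?#@]*@)?([^/:?#]+)/i.exec(
    bookmarkUrl.trim(),
  );
  if (!match) {
    throw new Error(`Cannot extract a domain from: ${bookmarkUrl}`);
  }
  const [, domain = ""] = match;
  return `https://www.google.com/s2/favicons?domain=${encodeURIComponent(
    domain.toLowerCase(),
  )}&sz=${size}`;
}
```

Storing the derived URL in the import JSON means the icon is fetched by the browser at render time rather than bundled with the data.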

* Update handoff docs for deployed state and auth-system work-in-progress

HANDOFF.md and README.md still described deployment as the open task;
the app has been live on racknerd1 for several sessions now. Rewrites
both to reflect current state and lay out the 4-phase auth/SSO plan
(menu fix done, password/sessions/login-log/multi-user/SSO pending) so
the next session can pick up at Phase 2 without re-deriving context.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-20 09:50:06 -04:00
Claude
3d9c4c65c2
Update docs: mark feature work complete, document deploy setup as the only remaining task
HANDOFF.md and TERMIX_MIGRATION.md were stale (pre-dated the full Termix migration). Rewrote HANDOFF.md to reflect the current feature-complete state and point straight at deployment setup. Expanded README's Deployment section into concrete steps (host provisioning, secrets, .env, DNS) since the workflow/compose files already exist and just need configuring. Added a top-level .env.example for the server-side .env that docker-compose.yml expects.
2026-06-19 16:41:32 +00:00
Claude
1d1f98f5aa
Update HANDOFF.md: Proxmox TLS and fast-jwt fixes are done
2026-06-19 10:28:58 +00:00
Claude
70b2ef8a69
Update README and add HANDOFF.md for session handoff
Documents the real backend, all 8 completed integration adapters, known
caveats (Proxmox TLS, fast-jwt vuln, SSH key textarea UX, the
IntegrationType/integrationTypes enum duplication footgun), and what's
explicitly on hold (Terminal/Termix), so another AI session can resume
work with full context.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BbJV5nm8KPVH1oNJYKpnoF
2026-06-18 21:12:50 +00:00