dev_arc_aws/.kiro/steering/code-audit.md

177 lines
6.8 KiB
Markdown
Raw Normal View History

2026-06-24 19:20:18 +00:00
---
inclusion: manual
---
# ArchNest Code Audit — Known Issues & Fix Plan
> Last audited: 2026-06-24. This file tracks known issues by severity.
> Use this to guide fixes before production deploy.
---
## CRITICAL (3) — Must fix before deploy
### 1. Terminal WebSocket session leak
- **File**: `src/lib/TerminalSessionContext.tsx`
- **Problem**: Old WebSocket reference overwritten before cleanup runs on reconnect, creating dangling connections that never close.
- **Fix**: Close existing WS before creating new one. Guard against double-open.
### 2. tmux session name injection
- **File**: `backend/src/routes/terminal.ts`
- **Problem**: Regex allows characters that don't fully prevent shell metacharacter injection when constructing the `tmux attach -t` command.
- **Fix**: Whitelist alphanumeric + dash + underscore only. Reject everything else.
### 3. Docker container ref validation insufficient
- **File**: `backend/src/ssh/docker.ts`
- **Problem**: `CONTAINER_REF_RE` may allow problematic characters through validation gaps.
- **Fix**: Tighten regex to `^[a-zA-Z0-9][a-zA-Z0-9_.-]*$` only.
---
## HIGH (10) — Fix before production
### 4. Unhandled promise in Sidebar
- **File**: `src/components/Sidebar.tsx`
- **Fix**: Add `.catch()` to `api.listIntegrations()` call.
### 5. JWT token in WebSocket URL
- **File**: `src/lib/TerminalSessionContext.tsx`
- **Problem**: Token in query string gets logged in server access logs.
- **Fix**: Send token as first WebSocket message or via subprotocol.
### 6. Path traversal in file operations
- **File**: `backend/src/routes/files.ts`
- **Problem**: No validation against `../` sequences or absolute paths.
- **Fix**: Normalize path, reject if it escapes the allowed root.
### 7. Agent token timing attack
- **File**: `backend/src/routes/agents.ts`
- **Problem**: Early return on length mismatch leaks timing info.
- **Fix**: Use `crypto.timingSafeEqual()` with Buffer padding.
### 8. No data import size limit
- **File**: `backend/src/routes/data.ts`
- **Problem**: Admin can import arbitrarily large JSON → DoS.
- **Fix**: Add `bodyLimit` to the route (e.g., 10MB max).
### 9. SSH connection leak in Docker-over-SSH
- **File**: `backend/src/ssh/docker.ts`
- **Problem**: Promise rejection paths may leave SSH connections open.
- **Fix**: Add `finally` block that calls `client.end()`.
### 10. CORS origin defaults to any
- **File**: `docker-compose.yml` / `server.ts`
- **Problem**: `ARCHNEST_CORS_ORIGIN` falls back to `true` (allow all) when unset.
- **Fix**: Require explicit origin in production. Fail-closed.
### 11. No React error boundary
- **File**: `src/App.tsx`
- **Problem**: Any component crash takes down the entire app.
- **Fix**: Add an ErrorBoundary wrapper around `<Dashboard />`.
### 12. WebSocket JSON.parse unhandled
- **File**: `backend/src/routes/docker.ts`
- **Problem**: No try-catch around JSON.parse in WS message handler.
- **Fix**: Wrap in try-catch, send error frame back to client.
### 13. Session log path traversal
- **File**: `backend/src/routes/terminal.ts`
- **Problem**: `integrationId` used directly in filesystem path without sanitization.
- **Fix**: Validate integrationId is numeric only.
---
## MEDIUM (13) — Should fix
### 14. Auth token in localStorage (XSS risk)
- `src/lib/AuthContext.tsx` — JWT in localStorage. If XSS hits, token is stolen.
- Long-term fix: HttpOnly cookie. Short-term: accept risk, add CSP headers.
### 15. No refresh token mechanism
- `backend/src/routes/auth.ts` — Users forced to re-login on token expiry.
- Fix: Add refresh token rotation endpoint.
### 16. No login rate limiting
- `backend/src/routes/auth.ts` — Brute-force attack possible.
- Fix: Add rate limiter plugin (e.g., `@fastify/rate-limit`, 5 attempts/min).
### 17. Weak binary file detection
- `backend/src/routes/files.ts` — Null byte check insufficient for UTF-8 binary files.
- Fix: Check first 512 bytes for control characters (excluding newline/tab).
### 18. Missing category uniqueness
- `backend/src/db/index.ts``bookmark_categories` allows duplicate names.
- Fix: Add UNIQUE constraint on `(name)` or `(name, tenant_id)` for SaaS.
### 19. IntegrationId validation weak
- `backend/src/routes/docker.ts``Number()` on string param doesn't reject `NaN`.
- Fix: Parse with parseInt, check `isNaN()`, return 400.
### 20. SSH shell without PTY in Docker exec
- `backend/src/routes/dockerSsh.ts` — Output buffering issues in non-interactive mode.
- Fix: Always request PTY for exec sessions.
### 21. No privileged port validation
- `backend/src/routes/tunnels.ts` — Ports <1024 allowed without OS privilege.
- Fix: Warn or reject ports <1024 unless running as root.
### 22. No size validation on data export
- `backend/src/routes/data.ts` — Can generate massive JSON response.
- Fix: Stream response or add a row limit.
### 23. Silent listResources failures
- `backend/src/routes/integrations.ts` — Errors caught but not surfaced.
- Fix: Return partial results + error flag per integration.
### 24. Missing admin action audit logging
- Multiple route files — No before/after state comparison in logs.
- Fix: Log old + new values on integration/user updates.
### 25. Terminal resize race condition
- `backend/src/routes/terminal.ts` — Resize between connect and channel-ready is lost.
- Fix: Queue resize events until channel is confirmed ready.
### 26. Missing database indexes
- `backend/src/db/index.ts` — No indexes on `events.created_at`, `sessions.user_id`.
- Fix: Add indexes for frequently-queried columns.
---
## LOW (14) — Nice to have
27. Missing CSP / X-Frame-Options headers
28. No avatar upload size validation
29. CPU stats edge case (single-core shows 0%)
30. Hardcoded tmux/docker format strings
31. State mutations outside React lifecycle
32. Weak CIDR validation error messages
33. File size limit hardcoded (should be configurable)
34. SSH key format not validated on upload
35. No import audit trail (who imported what, when)
36. Session logging file permissions not restricted
37. Missing Content-Security-Policy header
38. No health check for guacd sidecar
39. Hardcoded 5-second metrics polling interval
40. No graceful shutdown for SSH connections on SIGTERM
---
## Deploy Test Plan
### Phase A: Fix Critical + High
Fix issues 1-13 before any deploy. These represent real security and stability risks.
### Phase B: Test Deploy (AWS)
1. Deploy CloudFormation stack (t4g.small EC2, us-east-1)
2. SSH in, clone repo, create `.env`, `docker compose up -d --build`
3. Run through every page: Glance, Infrastructure, Terminal, Tunnels, Files, Containers, Remote Desktop, Host Metrics, BookNest, Settings, Help
4. Test: SSH connect, Docker list, file upload, bookmark CRUD, metrics polling
5. Budget alarm set at $30/month
### Phase C: Destroy
After successful test:
```bash
aws cloudformation delete-stack --stack-name archnest-test
```
This removes ALL resources (EC2, EIP, SG, budget) — clean slate.