dev_arc_aws/.kiro/specs/code-audit-fixes/requirements.md
Samuel James c9d7462c33
Add code-audit-fixes spec + audit steering
- Spec for fixing 13 Critical + High code audit issues
- requirements.md (14 requirements, EARS format)
- design.md (component fixes + 8 correctness properties)
- tasks.md (8 tasks, parallel execution waves)
- CloudFormation test deploy template planned
- code-audit.md steering file (all 40 issues documented)

Co-authored-by: Samuel James <ssamjame@amazon.com>
Co-authored-by: Kiro <noreply@kiro.dev>
2026-06-24 15:19:32 -04:00

Requirements Document

Introduction

This feature addresses 13 Critical and High severity issues identified during a code audit of the ArchNest self-hosted ops dashboard. The fixes target security vulnerabilities (injection, traversal, timing attacks, token exposure), resource leaks (WebSocket, SSH connections), and stability gaps (missing error boundaries, unhandled exceptions). A CloudFormation template for test deployment is also included to verify fixes in an isolated environment.

Glossary

  • Backend: The Fastify 5 + TypeScript server application in backend/src/
  • Frontend: The React 19 + Vite 8 + TypeScript client application in src/
  • WebSocket_Session: A browser-to-server WebSocket connection used for terminal, Docker exec, or tmux-list operations
  • Terminal_Route: The backend WebSocket endpoint at /api/terminal handling SSH terminal sessions
  • Docker_SSH_Module: The backend/src/ssh/docker.ts module that runs Docker CLI commands over SSH
  • Files_Route: The backend REST endpoint group at /api/files/:integrationId/* for SFTP file operations
  • Agents_Route: The backend endpoint at /api/agents/docker/report for agent token-gated ingest
  • Data_Route: The backend endpoint group at /api/data/import and /api/data/export for backup/restore
  • Docker_Exec_Route: The backend WebSocket endpoint at /api/docker/exec for Docker container exec sessions
  • Error_Boundary: A React class component that catches JavaScript errors in its child component tree and renders a fallback UI
  • CloudFormation_Template: An AWS CloudFormation YAML file that provisions test infrastructure (EC2, security group, budget alarm)
  • Container_Ref: A Docker container name or ID string validated before interpolation into shell commands
  • Integration_ID: A numeric identifier for an SSH/Docker integration stored in the SQLite database

Requirements

Requirement 1: WebSocket Session Leak Prevention

User Story: As an operator, I want terminal reconnections to properly close prior WebSocket connections, so that dangling sessions do not accumulate and exhaust server resources.

Acceptance Criteria

  1. WHEN a new terminal WebSocket connection is initiated for a pane, THE Frontend SHALL close the existing WebSocket (if any) before creating a new WebSocket instance.
  2. WHEN the existing WebSocket readyState is OPEN or CONNECTING, THE Frontend SHALL call ws.close() on the existing reference prior to reassignment.
  3. IF the WebSocket onclose event fires after a new connection has replaced the reference, THEN THE Frontend SHALL not modify state belonging to the new connection.
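
The three criteria above can be sketched as follows. This is a minimal illustration, not the actual Frontend code: `PaneConnection`, `SocketLike`, and the `status` field are hypothetical names, and `SocketLike` only mirrors the parts of the browser `WebSocket` interface that the sketch needs, so that it runs outside a browser.

```typescript
// Minimal socket shape (assumption): the real code would use the DOM WebSocket type.
interface SocketLike {
  readyState: number;
  close(): void;
  onclose: (() => void) | null;
}

// Same numeric values as WebSocket.CONNECTING and WebSocket.OPEN.
const CONNECTING = 0;
const OPEN = 1;

// Hypothetical per-pane holder for the current socket reference.
class PaneConnection {
  private ws: SocketLike | null = null;
  status: "idle" | "connected" | "disconnected" = "idle";

  connect(create: () => SocketLike): SocketLike {
    const previous = this.ws;
    if (previous && (previous.readyState === OPEN || previous.readyState === CONNECTING)) {
      previous.close(); // close the old socket before the reference is reassigned
    }

    const next = create();
    this.ws = next;
    this.status = "connected";

    next.onclose = () => {
      // A stale socket closing late must not touch the newer connection's state.
      if (this.ws !== next) return;
      this.ws = null;
      this.status = "disconnected";
    };
    return next;
  }
}
```

The identity check inside `onclose` is what keeps a late close event from the replaced socket from resetting state that now belongs to the new connection.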

Requirement 2: tmux Session Name Injection Prevention

User Story: As a security engineer, I want tmux session names to be strictly validated, so that shell metacharacter injection through the terminal WebSocket is impossible.

Acceptance Criteria

  1. THE Terminal_Route SHALL validate tmux session names against the pattern ^[A-Za-z0-9_-]{1,64}$.
  2. WHEN a connect message includes a tmuxSession value that does not match the allowed pattern, THE Terminal_Route SHALL treat it as null and open a plain shell instead.
  3. WHEN a valid tmux session name is used in a shell command, THE Terminal_Route SHALL interpolate only the validated value, so that the client cannot append additional characters to the command.
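
A minimal sketch of the validation described above. The helper names `sanitizeTmuxSession` and `buildTmuxCommand` are hypothetical, and the `tmux new-session -A -s` invocation is only an assumed example of where the validated name might be interpolated.

```typescript
// Allowed pattern from acceptance criterion 1.
const TMUX_SESSION_PATTERN = /^[A-Za-z0-9_-]{1,64}$/;

// Returns the name when it matches the pattern, otherwise null.
// A null result means "open a plain shell instead".
function sanitizeTmuxSession(value: unknown): string | null {
  if (typeof value !== "string") return null;
  return TMUX_SESSION_PATTERN.test(value) ? value : null;
}

// Hypothetical command builder: only the validated value is interpolated.
function buildTmuxCommand(tmuxSession: unknown): string | null {
  const name = sanitizeTmuxSession(tmuxSession);
  return name === null ? null : `tmux new-session -A -s ${name}`;
}
```

Because the pattern admits no whitespace, quotes, or shell metacharacters, the interpolated value cannot terminate the argument or start a second command.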

Requirement 3: Docker Container Reference Validation

User Story: As a security engineer, I want container name/ID references to be tightly validated, so that shell command injection through crafted container identifiers is prevented.

Acceptance Criteria

  1. THE Docker_SSH_Module SHALL validate container references against the pattern ^[a-zA-Z0-9][a-zA-Z0-9_.-]{0,127}$.
  2. WHEN a container reference fails validation, THE Docker_SSH_Module SHALL throw an error before any shell command is constructed.
  3. THE Docker_SSH_Module SHALL pass validated container references through single-quote shell escaping before interpolation into commands.
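
The validate-then-quote order above might look like the sketch below. The function names are hypothetical, and `docker logs --tail 100` is just one assumed example of a command that takes a container reference.

```typescript
// Allowed pattern from acceptance criterion 1.
const CONTAINER_REF_PATTERN = /^[a-zA-Z0-9][a-zA-Z0-9_.-]{0,127}$/;

// Throws before any command string is built when the reference is invalid.
function assertContainerRef(ref: string): string {
  if (!CONTAINER_REF_PATTERN.test(ref)) {
    throw new Error("Invalid container reference");
  }
  return ref;
}

// POSIX single-quote escaping: close the quote, emit an escaped quote, reopen.
function shellQuote(value: string): string {
  return `'${value.replace(/'/g, `'\\''`)}'`;
}

// Hypothetical command builder combining both steps.
function dockerLogsCommand(ref: string): string {
  return `docker logs --tail 100 ${shellQuote(assertContainerRef(ref))}`;
}
```

The quoting is redundant for any value that passes the pattern, which is the point: if the pattern is ever loosened, the escaping still holds as a second layer.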

Requirement 4: Sidebar Promise Error Handling

User Story: As a user, I want the sidebar to handle API failures gracefully, so that an unhandled promise rejection does not crash the application or produce console errors.

Acceptance Criteria

  1. WHEN the api.listIntegrations() call in the Sidebar component rejects, THE Frontend SHALL catch the error and leave the integrations state as null.
  2. IF the integrations API call fails, THEN THE Frontend SHALL display the "Checking…" status label rather than an error state.
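
One way to satisfy both criteria is to keep the rejection handling in a small helper, sketched here. `loadIntegrations`, `statusLabel`, and the `Integration` shape are hypothetical, and the label returned for the non-null case is a placeholder rather than the real Sidebar text.

```typescript
// Hypothetical shape; the real type comes from the Frontend API client.
interface Integration {
  id: number;
  name: string;
}

// Resolves to null instead of rejecting, so the caller's state stays null on failure.
async function loadIntegrations(
  listIntegrations: () => Promise<Integration[]>,
): Promise<Integration[] | null> {
  try {
    return await listIntegrations();
  } catch {
    return null;
  }
}

// A null state maps to the "Checking…" label rather than an error state.
function statusLabel(integrations: Integration[] | null): string {
  return integrations === null ? "Checking…" : `${integrations.length} integration(s)`;
}
```

In the Sidebar component, the effect would then call something like `loadIntegrations(() => api.listIntegrations())` and only set state when the result is not null.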

Requirement 5: WebSocket Authentication via First Message

User Story: As a security engineer, I want JWT tokens removed from WebSocket URL query strings, so that tokens are not logged in server access logs or proxy logs.

Acceptance Criteria

  1. THE Frontend SHALL not include the JWT token as a URL query parameter when opening terminal or Docker exec WebSocket connections.
  2. WHEN a terminal WebSocket connection opens, THE Frontend SHALL send an authentication message containing the JWT token as the first message before any other message type.
  3. WHEN the Backend receives a WebSocket connection on the terminal or Docker exec endpoints, THE Backend SHALL require a valid JWT token in the first message before processing connect or list_tmux messages.
  4. IF the first message does not contain a valid JWT token, THEN THE Backend SHALL send an error frame and close the WebSocket connection.
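
The backend side of this handshake can be sketched as a small gate that every incoming frame passes through. Everything here is an assumption for illustration: the `{ type: "auth", token }` message shape, the `createAuthGate` name, and the injected `verifyToken` callback, which stands in for whatever JWT verification the Backend already uses.

```typescript
type GateResult = "authenticated" | "forward" | "reject";

// Parses a frame into a plain object, or null when it is not a JSON object.
function parseObject(raw: string): Record<string, unknown> | null {
  try {
    const parsed: unknown = JSON.parse(raw);
    return typeof parsed === "object" && parsed !== null
      ? (parsed as Record<string, unknown>)
      : null;
  } catch {
    return null;
  }
}

// One gate per WebSocket connection. Until a valid token arrives, every
// frame is rejected; afterwards frames are forwarded to the normal handlers.
function createAuthGate(verifyToken: (token: string) => boolean) {
  let authenticated = false;
  return (raw: string): GateResult => {
    if (authenticated) return "forward";
    const msg = parseObject(raw);
    const token = msg?.["token"];
    if (msg?.["type"] === "auth" && typeof token === "string" && verifyToken(token)) {
      authenticated = true;
      return "authenticated";
    }
    return "reject"; // caller sends an error frame and closes the socket
  };
}
```

On the Frontend, the matching step would be to send the auth message from the socket's `onopen` handler before any `connect` or `list_tmux` message.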

Requirement 6: File Path Traversal Prevention

User Story: As a security engineer, I want file operation paths to be validated against directory traversal, so that attackers cannot access files outside the intended directory scope.

Acceptance Criteria

  1. WHEN a path parameter is received by the Files_Route, THE Files_Route SHALL normalize the path and reject it if any .. segment remains after normalization.
  2. WHEN a path parameter is received by the Files_Route, THE Files_Route SHALL reject absolute paths (paths starting with /).
  3. IF a path fails traversal validation, THEN THE Files_Route SHALL return HTTP 400 with a descriptive error message.
  4. THE Files_Route SHALL apply path validation to all endpoints that accept a user-supplied path: list, content read, content write, mkdir, rename, delete, chmod, download, and upload.
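
A minimal sketch of a shared validator that each listed endpoint could call. `validateRelativePath` and `PathValidationError` are hypothetical names; the sketch uses Node's `posix.normalize` on the assumption that remote SFTP paths use forward slashes.

```typescript
import { posix } from "node:path";

// Hypothetical error type carrying the HTTP status the route should return.
class PathValidationError extends Error {
  readonly statusCode = 400;
}

// Returns the normalized relative path, or throws when the path is absolute
// or still climbs out of the base directory after normalization.
function validateRelativePath(input: string): string {
  if (input.startsWith("/")) {
    throw new PathValidationError("Absolute paths are not allowed");
  }
  const normalized = posix.normalize(input);
  // After normalization, any remaining ".." segments are at the front.
  if (normalized === ".." || normalized.startsWith("../")) {
    throw new PathValidationError("Path escapes the base directory");
  }
  return normalized;
}
```

Checking after normalization matters: `a/../b` is harmless and resolves to `b`, while `a/../../etc` normalizes to `../etc` and is rejected.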

Requirement 7: Agent Token Timing-Safe Comparison

User Story: As a security engineer, I want agent token comparison to use constant-time equality, so that timing side-channel attacks cannot be used to guess the token byte-by-byte.

Acceptance Criteria

  1. THE Agents_Route SHALL compare presented tokens using crypto.timingSafeEqual().
  2. WHEN the presented token length differs from the expected token length, THE Agents_Route SHALL pad both buffers to equal length before performing the constant-time comparison and SHALL still reject the token.
  3. THE Agents_Route SHALL return the same HTTP response (401 Unauthorized) regardless of whether the token length or content mismatched.
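
The comparison could be sketched as below. `tokensMatch` is a hypothetical helper name; `crypto.timingSafeEqual` throws when its two inputs differ in byte length, which is why both buffers are padded first. The length check afterwards keeps zero padding from turning a short token into a match.

```typescript
import { timingSafeEqual } from "node:crypto";

// Constant-time comparison that tolerates inputs of different length.
function tokensMatch(presented: string, expected: string): boolean {
  const a = Buffer.from(presented, "utf8");
  const b = Buffer.from(expected, "utf8");

  // Pad both sides with zero bytes to the longer length so that
  // timingSafeEqual always runs on equal-sized buffers.
  const length = Math.max(a.length, b.length);
  const paddedA = Buffer.alloc(length);
  const paddedB = Buffer.alloc(length);
  a.copy(paddedA);
  b.copy(paddedB);

  const sameBytes = timingSafeEqual(paddedA, paddedB);
  // A length mismatch is always a failure, even if the padded bytes agree.
  return sameBytes && a.length === b.length;
}
```

The route would then answer 401 Unauthorized whenever `tokensMatch` returns false, without distinguishing the reason.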

Requirement 8: Data Import Size Limit

User Story: As an operator, I want data import requests to have a body size limit, so that an admin cannot accidentally or maliciously cause a denial-of-service by importing an extremely large JSON payload.

Acceptance Criteria

  1. THE Data_Route import endpoint SHALL enforce a maximum request body size of 10 MB.
  2. IF a request body exceeds 10 MB, THEN THE Data_Route SHALL reject the request with HTTP 413 before parsing the JSON payload.
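
Fastify accepts a per-route `bodyLimit` option, expressed in bytes, and rejects oversized bodies with HTTP 413 before the body is parsed. A sketch of the limit as a route options object follows; the constant and helper names are hypothetical, and reading "10 MB" as 10 × 1024 × 1024 bytes is an assumption.

```typescript
// 10 MB interpreted as 10 MiB (assumption), in bytes as bodyLimit expects.
const IMPORT_BODY_LIMIT_BYTES = 10 * 1024 * 1024;

// Options object in the shape passed as the second argument to a Fastify route.
const importRouteOptions = { bodyLimit: IMPORT_BODY_LIMIT_BYTES };

// Hypothetical helper for unit tests: true when a declared Content-Length
// is over the limit and the request should be answered with 413.
function exceedsImportLimit(contentLength: number): boolean {
  return contentLength > IMPORT_BODY_LIMIT_BYTES;
}
```

Setting the limit on the import route alone leaves the server-wide default untouched for every other endpoint.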

Requirement 9: SSH Connection Leak Prevention

User Story: As an operator, I want SSH connections used for Docker-over-SSH operations to be closed reliably, so that connection leaks do not exhaust SSH connection limits on remote hosts.

Acceptance Criteria

  1. THE Docker_SSH_Module withSshClient function SHALL close both the primary SSH client and any jump-host client in a finally block after the operation completes.
  2. IF the operation function throws an error, THEN THE Docker_SSH_Module SHALL still close both SSH connections before returning the error result.
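
The cleanup guarantee can be illustrated with a generic wrapper. This is a sketch under assumptions, not the module's real signature: the `connect` callback, the `Closable` interface, and the `{ client, jump }` shape are hypothetical, with `end()` chosen because that is how an `ssh2` client is closed.

```typescript
// Anything with an end() method, which is how an ssh2 Client is closed.
interface Closable {
  end(): void;
}

interface SshConnections<C extends Closable> {
  client: C;
  jump?: Closable;
}

// Runs the operation and closes both connections whether it resolves or throws.
async function withSshClient<C extends Closable, T>(
  connect: () => Promise<SshConnections<C>>,
  operation: (client: C) => Promise<T>,
): Promise<T> {
  const { client, jump } = await connect();
  try {
    return await operation(client);
  } finally {
    // Nested finally: a failure while closing the primary client
    // must not prevent the jump host from being closed.
    try {
      client.end();
    } finally {
      jump?.end();
    }
  }
}
```

Closing the primary client first and the jump host second follows the tunnel order, since the primary connection runs through the jump host.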

Requirement 10: CORS Origin Fail-Closed Default

User Story: As a security engineer, I want CORS to reject all cross-origin requests when no explicit origin is configured, so that a misconfigured deployment does not silently allow any origin.

Acceptance Criteria

  1. WHEN the ARCHNEST_CORS_ORIGIN environment variable is not set, THE Backend SHALL default CORS origin to false (reject all cross-origin requests).
  2. WHEN the ARCHNEST_CORS_ORIGIN environment variable is set to a valid origin string, THE Backend SHALL use that value as the allowed CORS origin.
  3. THE Backend SHALL log a warning at startup when ARCHNEST_CORS_ORIGIN is not configured, indicating that cross-origin requests are blocked.
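
A sketch of the fail-closed lookup. `resolveCorsOrigin` is a hypothetical helper; the environment and the logger are passed in so the function can be tested without touching `process.env`. Treating an empty or whitespace-only value as "not set" is an assumption beyond what the criteria state.

```typescript
// The value handed to the CORS plugin: an origin string, or false to block.
type CorsOrigin = string | false;

function resolveCorsOrigin(
  env: Record<string, string | undefined>,
  warn: (message: string) => void,
): CorsOrigin {
  const configured = env["ARCHNEST_CORS_ORIGIN"]?.trim();
  if (!configured) {
    // Fail closed: no configured origin means no cross-origin access.
    warn("ARCHNEST_CORS_ORIGIN is not set; cross-origin requests are blocked");
    return false;
  }
  return configured;
}
```

At startup this might be called as `resolveCorsOrigin(process.env, (m) => app.log.warn(m))`, where `app` is the assumed Fastify instance, and the result passed as the `origin` option of the CORS plugin.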

Requirement 11: React Error Boundary

User Story: As a user, I want the application to catch rendering errors gracefully, so that a single component crash does not take down the entire dashboard.

Acceptance Criteria

  1. THE Frontend SHALL wrap the Dashboard component tree in an Error_Boundary component.
  2. WHEN a child component throws during rendering, THE Error_Boundary SHALL catch the error and render a fallback UI instead of a blank screen.
  3. THE Error_Boundary fallback UI SHALL display a message indicating an error occurred and provide a way to reload the page.
  4. THE Error_Boundary SHALL log the caught error to the browser console for debugging.

Requirement 12: WebSocket JSON Parse Error Handling

User Story: As an operator, I want malformed WebSocket messages to be handled gracefully, so that invalid JSON from a client does not crash the connection handler.

Acceptance Criteria

  1. WHEN the Docker_Exec_Route receives a WebSocket message that is not valid JSON, THE Docker_Exec_Route SHALL send an error frame with message "Invalid JSON" to the client.
  2. WHEN the Docker_Exec_Route receives invalid JSON, THE Docker_Exec_Route SHALL not close the WebSocket connection (allowing the client to retry).
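
A minimal sketch of the parse step, returning a result instead of throwing so the connection handler never sees an exception. `parseClientMessage` and the `{ type: "error", message }` frame shape are assumptions; only the "Invalid JSON" text comes from the criteria above.

```typescript
type ParseResult =
  | { ok: true; value: unknown }
  | { ok: false; frame: string };

// Never throws: malformed input yields an error frame for the caller to send.
function parseClientMessage(raw: string): ParseResult {
  try {
    return { ok: true, value: JSON.parse(raw) };
  } catch {
    return {
      ok: false,
      frame: JSON.stringify({ type: "error", message: "Invalid JSON" }),
    };
  }
}
```

The handler would send `frame` back on the same socket and return, leaving the connection open so the client can retry with a well-formed message.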

Requirement 13: Session Log Path Traversal Prevention

User Story: As a security engineer, I want the integrationId used in session log file paths to be strictly validated as numeric, so that path traversal through crafted identifiers is impossible.

Acceptance Criteria

  1. WHEN session logging is enabled, THE Terminal_Route SHALL validate that the integrationId is a positive integer before constructing the log file path.
  2. IF the integrationId is not a valid positive integer, THEN THE Terminal_Route SHALL skip session logging for that connection rather than writing to an unvalidated path.
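
The check might be sketched as follows. `parseIntegrationId` and `sessionLogPath` are hypothetical helpers, and the `integration-<id>-<timestamp>.log` file name is an invented layout used only to show where the validated number ends up.

```typescript
import { posix } from "node:path";

// Accepts only positive decimal integers (no sign, no leading zero, no
// fraction), capped at 15 digits to stay within the safe integer range.
function parseIntegrationId(value: unknown): number | null {
  const text = typeof value === "number" ? String(value) : value;
  if (typeof text !== "string" || !/^[1-9][0-9]{0,14}$/.test(text)) return null;
  return Number(text);
}

// Returns null when the id is invalid, which means "skip session logging".
function sessionLogPath(
  logDir: string,
  integrationId: unknown,
  startedAt: Date,
): string | null {
  const id = parseIntegrationId(integrationId);
  if (id === null) return null;
  return posix.join(logDir, `integration-${id}-${startedAt.getTime()}.log`);
}
```

Because the path is built from the parsed number rather than the raw input, no client-supplied characters reach the file name.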

Requirement 14: CloudFormation Test Deploy Template

User Story: As a developer, I want a CloudFormation template that provisions a t4g.small EC2 instance with Docker in us-east-1, so that I can verify all fixes in an isolated environment within a $30/month budget.

Acceptance Criteria

  1. THE CloudFormation_Template SHALL provision a t4g.small EC2 instance in us-east-1 running Amazon Linux 2023.
  2. THE CloudFormation_Template SHALL create a security group allowing inbound SSH (port 22) and HTTP/HTTPS (ports 80, 443) from any source.
  3. THE CloudFormation_Template SHALL install Docker and Docker Compose on the EC2 instance via UserData.
  4. THE CloudFormation_Template SHALL create an AWS Budget alarm at $30/month threshold with email notification.
  5. THE CloudFormation_Template SHALL output the instance public IP and instance ID for SSH access.
  6. THE CloudFormation_Template SHALL accept parameters for the SSH key pair name and notification email address.