# Glance Page — Detailed Specification
> Purpose: The Glance page is the operational heartbeat of ArchNest. It provides an at-a-glance view of system health, resource utilization, security posture, and network connectivity across your homelab infrastructure. Every element answers a simple question: "Is everything okay right now?"
---
## Layout Structure
The Glance page uses this vertical stack (top to bottom):
1. **Top Bar** (sticky, 56px)
2. **Hero Banner** (200px, rounded), with the **Top Row** of KPI status cards overlapping its bottom ~25%
3. **Middle Row** (3 columns: 30% | 40% | 30%)
4. **Bottom Row** (2 columns: 65% | 35%)
**No footer.** The page scrolls naturally without a fixed status bar at the bottom.
---
## Top Bar
**Display:**
- Height: 56px, sticky at top, z-index above all content
- Background: Page color (#0D0E10), no border
- Left: Page title "GLANCE" (18px, bold, uppercase, primary text with subtle gold drop-shadow glow)
- Center-right: Search bar (260px, rounded-full, placeholder "Search resources...", card background)
- Right: Notification bell (with red badge count) + User avatar + dropdown trigger
### User Avatar & Dropdown Menu
**Avatar Display:**
- 32px circle with 2px gold border and subtle gold glow
- Shows initials "AO" in gold
- Adjacent: "ArchNest Ops" (12px, primary) + "Administrator" (9px, secondary)
- Chevron icon (rotates on open)
**Dropdown Menu (on click):**
- Position: Below avatar, aligned right
- Background: Card color with border, rounded-xl, shadow
- Header section: User name + email
- Menu items:
- **Profile** — navigates to Settings > Profile section
- **Appearance** — navigates to Settings > Appearance section
- **Security** — navigates to Settings > Security/Integrations section
- **Help & Support** — opens documentation/wiki link
- Divider
- **Sign Out** — logs out of session (red text, danger action)
---
## Hero Banner
**Display:**
- Full width of content area, 200px height, 12px border radius, overflow hidden
- Image: `archnest-hero-banner.png` from `/public` (local) or CDN
- Image positioning: `object-cover` with `object-position: center 30%` — prioritizes showing the upper portion (skyline/arch) rather than center-cropping
- The bottom ~25% of the banner is overlapped by the KPI status cards (negative margin)
- If image fails to load: shows card background color (#141518), no broken icon
---
## Top Row — Status KPI Cards (4 cards)
### Card Grid Layout
**IMPORTANT — Asymmetric widths:**
- KPI 1 (System Status): **1.3fr** — wider, has progress ring + sparkline
- KPI 2 (Infrastructure): **1fr** — standard width
- KPI 3 (Security): **1fr** — standard width
- KPI 4 (Network): **1.3fr** — wider, has sparkline chart
Grid: `grid-cols-[1.3fr_1fr_1fr_1.3fr]` with 12px gap.
**Card styling (all 4):**
- Background: `#141518` at 95% opacity with backdrop-blur (glass effect over banner)
- Border: 1px solid #1E2025, 12px radius
- Padding: 16px
- Hover: border transitions to gold (0.2s)
---
### 1. System Status
**What it represents:** Overall system health — a composite "green light" that confirms all critical services are reachable and all local packages/security updates are current.
**How it gets its data:**
- **Ping sweep**: Every 5 minutes, the backend pings all endpoints defined in a `systems.config` file (IP addresses of LXC containers, VMs, physical hosts, and key services). If all respond, health = 100%.
- **Package currency check**: Queries `apt` (or equivalent) on monitored hosts for pending security updates. Any pending security update reduces the percentage.
- **Calculation**: `(reachable_hosts / total_hosts) * weight_A + (up_to_date_hosts / total_hosts) * weight_B`. Default weights: 70% reachability, 30% package currency.
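
The calculation above can be sketched as a small function. This is a minimal illustration, not the backend's actual code: `health_score` and its parameter names are hypothetical, and it assumes the weights are expressed as fractions (0.7 and 0.3) with the result scaled to a percentage.

```python
def health_score(reachable: int, up_to_date: int, total: int,
                 weight_reach: float = 0.7, weight_pkg: float = 0.3) -> float:
    """Composite health as a percentage in the range 0-100."""
    if total <= 0:
        raise ValueError("total must be a positive host count")
    reach_ratio = reachable / total    # share of hosts answering the ping sweep
    pkg_ratio = up_to_date / total     # share of hosts with no pending security updates
    return 100.0 * (reach_ratio * weight_reach + pkg_ratio * weight_pkg)


# Example: 10 hosts, one unreachable, two with pending security updates.
score = health_score(reachable=9, up_to_date=8, total=10)
```

With these inputs the score is roughly 87, which is below 100% and above 80%. Because the result is a float, it is probably safest to round it before comparing against the 100% threshold.
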
**Display:**
- Card title: "SYSTEM STATUS" (10px, uppercase, tracking 1.5px, secondary color, font-medium)
- Layout: Title top, then flex row with text left and ring right
- Left content:
- "All Systems" (13px, bold, primary)
- "Operational" (13px, bold, gold, italic)
- Right content: Progress Ring (44px diameter, 3px stroke, gold on dark track)
- Divider: 1px border-top, border/60 opacity
- Below divider: Sparkline (gold line chart, 20px height, no axes, showing last 12 check results)
- Footer text: "Last checked: 2m ago" (9px, secondary)
**Thresholds:**
- 100%: "All Systems Operational" (gold text)
- 80–99%: "Degraded" (orange text)
- Below 80%: "Critical" (red text)
---
### 2. Infrastructure
**What it represents:** Total count of managed resources (LXC containers, VMs, Docker containers, bare-metal hosts) and their operational state.
**How it gets its data:**
- **Config-driven**: Reads from `infra.config` which contains API endpoints, SSH keys, and connection details for each resource (Proxmox API for LXC/VMs, Docker socket/API for containers, SSH for bare-metal).
- **Status polling**: Every 5 minutes, queries each resource's API or SSH to determine if it's running, stopped, or unreachable.
- **Disk usage check**: Pulls primary disk utilization from each resource. Any resource exceeding 70% disk usage triggers a warning signal.
**Display:**
- Card title: "INFRASTRUCTURE" (10px, uppercase, tracking 1.5px, secondary color)
- Icon + number row: Server icon (16px, gold) + "24" (24px, bold, primary)
- Subtitle: "Total Resources" (10px, secondary)
- Divider
- Breakdown row (9px): green dot + "24 Running" | yellow dot + "0 Warning" | red dot + "0 Critical"
**Signal logic:**
- 🟢 Running: Resource is responsive and disk < 70%
- 🟡 Warning: Resource is responsive but disk is ≥ 70% and < 90%, or the resource response is slow (>2s)
- 🔴 Critical: Resource is unreachable or disk ≥ 90%
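
One way to express the signal logic above is to test the critical conditions first, so that a disk at 90% or more is never reported as a mere warning. The function name `resource_signal` and its arguments are assumptions made for this sketch.

```python
from typing import Optional


def resource_signal(reachable: bool, disk_pct: float,
                    response_s: Optional[float] = None) -> str:
    """Map one polling result to 'running', 'warning', or 'critical'."""
    # Critical wins over everything else: unreachable, or disk at 90% or more.
    if not reachable or disk_pct >= 90:
        return "critical"
    # Warning: disk at 70% or more, or a response slower than 2 seconds.
    if disk_pct >= 70 or (response_s is not None and response_s > 2.0):
        return "warning"
    return "running"


signals = [
    resource_signal(True, 42.0, 0.3),   # healthy
    resource_signal(True, 75.0, 0.3),   # disk in the warning band
    resource_signal(True, 30.0, 2.5),   # slow response
    resource_signal(False, 10.0),       # unreachable
]
```

Counting the values in `signals` gives the three numbers shown in the breakdown row.
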
---
### 3. Security
**What it represents:** Active security alerts — failed intrusion attempts, brute-force attacks detected by fail2ban, and outdated security packages.
**How it gets its data:**
- **Fail2ban logs**: Queries fail2ban on each monitored host for currently banned IPs and recent ban events (last 24h).
- **Auth log monitoring**: Parses `/var/log/auth.log` (or equivalent) for failed login attempts exceeding a threshold (e.g., 5+ failures from same IP in 10 minutes).
- **Security package check**: Counts hosts with pending security-specific updates (`apt list --upgradable` filtered to security repo).
- **Alert count**: Sum of active fail2ban bans + hosts with outdated security packages.
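
The auth-log threshold above (for example, 5+ failures from the same IP in 10 minutes) amounts to a sliding-window count per IP. The sketch below assumes the failed logins have already been parsed into `(timestamp, ip)` pairs; the log parsing itself is not shown, and `flag_ips` is a hypothetical name.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def flag_ips(failures, threshold=5, window=timedelta(minutes=10)):
    """Return the set of IPs with at least `threshold` failures inside any `window`."""
    recent = defaultdict(deque)   # ip -> timestamps still inside the window
    flagged = set()
    for ts, ip in sorted(failures):
        q = recent[ip]
        q.append(ts)
        # Drop failures that are older than the window relative to this one.
        while ts - q[0] > window:
            q.popleft()
        if len(q) >= threshold:
            flagged.add(ip)
    return flagged


base = datetime(2026, 1, 1, 12, 0)
burst = [(base + timedelta(minutes=i), "10.0.0.5") for i in range(5)]        # 5 in 4 minutes
sparse = [(base + timedelta(minutes=20 * i), "10.0.0.9") for i in range(5)]  # 5 in 80 minutes
flagged = flag_ips(burst + sparse)
```

Only `10.0.0.5` is flagged here, because the failures from `10.0.0.9` never cluster inside a single 10-minute window.
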
**Display:**
- Card title: "SECURITY" (10px, uppercase, tracking 1.5px, secondary color)
- Icon + number row: Shield icon (16px, gold) + "2" (24px, bold, primary)
- Subtitle: "Active Alerts" (10px, secondary)
- Divider
- Breakdown row (9px): green dot + "2 Low" | yellow dot + "0 Medium" | red dot + "0 High"
**Severity logic:**
- Low (green): Informational — single failed login attempt, package update available
- Medium (yellow): Multiple failed attempts from same IP, security package >7 days overdue
- High (red): Active brute-force attack (10+ attempts in 5 min), critical vulnerability unpatched >14 days
---
### 4. Network
**What it represents:** Local network uptime — confirms internet connectivity and DNS resolution are functioning.
**How it gets its data:**
- **Ping probes**: Every 60 seconds, pings multiple resolvers (e.g., 1.1.1.1, 8.8.8.8, and one internal DNS) from the ArchNest host.
- **Uptime calculation**: Tracks successful/failed pings over a rolling 24-hour window. Uptime % = (successful_pings / total_pings) * 100.
- **Sparkline data**: Stores the last 24 data points (one per hour) for the mini trend chart.
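
The rolling uptime figure and the hourly sparkline points can both be derived from the same list of probe results. This is a sketch under stated assumptions: probes are stored as `(timestamp, ok)` pairs, and the names `uptime_pct` and `hourly_points` are illustrative only.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

Sample = Tuple[datetime, bool]   # (probe time, did the ping succeed?)


def uptime_pct(samples: List[Sample], now: datetime,
               window: timedelta = timedelta(hours=24)) -> Optional[float]:
    """Uptime % over the rolling window, or None if there are no samples."""
    recent = [ok for ts, ok in samples if now - window <= ts <= now]
    if not recent:
        return None
    return 100.0 * sum(recent) / len(recent)


def hourly_points(samples: List[Sample], now: datetime,
                  hours: int = 24) -> List[Optional[float]]:
    """One uptime value per hour, oldest first; None for hours without samples."""
    points = []
    for h in range(hours, 0, -1):
        start = now - timedelta(hours=h)
        end = start + timedelta(hours=1)
        bucket = [ok for ts, ok in samples if start <= ts < end]
        points.append(100.0 * sum(bucket) / len(bucket) if bucket else None)
    return points


# Example: one probe per minute for 24 hours, every tenth probe fails.
now = datetime(2026, 1, 2, 0, 0)
samples = [(now - timedelta(minutes=m), m % 10 != 0) for m in range(1, 1441)]
overall = uptime_pct(samples, now)
spark = hourly_points(samples, now)
```
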
**Display:**
- Card title: "NETWORK" (10px, uppercase, tracking 1.5px, secondary color)
- Icon + number row: Network icon (16px, gold) + "98.7%" (24px, bold, primary)
- Subtitle: "Network Uptime" (10px, secondary)
- Divider
- Below divider: Area sparkline chart (gold gradient fill, 24px height, no axes, 24 data points representing hourly uptime)
**Thresholds:**
- ≥99%: Healthy (no indicator needed)
- 95–99%: Minor instability (orange sparkline highlight)
- <95%: Degraded (red sparkline highlight, triggers alert)
---
## Middle Row — Detail Panels (3 columns: 30% | 40% | 30%)
### 5. Resource Overview (left, 30%)
**What it represents:** Top 5 resource utilization metrics across your infrastructure — a quick view of where capacity is being consumed.
**How it gets its data:**
- Pulls from the same infrastructure polling that feeds the Infrastructure KPI
- **Compute**: Running LXC/VM count vs total allocated slots (from infra.config)
- **Storage**: Sum of used disk across all storage pools vs total capacity (e.g., Proxmox storage API, `df` on hosts)
- **Database**: Active database instances vs provisioned (PostgreSQL/MariaDB connection count or instance count)
- **Network**: Same uptime % from the Network KPI (displayed as a bar for consistency)
- **Containers**: Running Docker containers vs total defined in docker-compose or container configs
**Display:**
- Card title: "RESOURCE OVERVIEW" (uppercase, secondary color)
- Close button (X) in top-right corner
- 5 rows, each containing:
- Icon (category-specific, 16px, secondary color)
- Label (e.g., "Compute", "Storage") — 13px, primary color
- Progress bar (gold fill on dark track, rounded ends)
- Value text (e.g., "18 / 24" or "12.4 / 20 TB") — 12px, secondary color
**Bar color logic:**
- 0–69%: Gold (#C8A434)
- 70–89%: Warning orange (#E67E22)
- 90–100%: Danger red (#E74C3C)
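
The bar color follows directly from the used/total ratio. The sketch below assumes the three bands are 0 to 69%, 70 to 89%, and 90 to 100%, and treats fractional values by their lower bound (so 69.5% is still gold). `bar_color` is a hypothetical helper name.

```python
GOLD = "#C8A434"
WARNING_ORANGE = "#E67E22"
DANGER_RED = "#E74C3C"


def bar_color(used: float, total: float) -> str:
    """Pick the progress-bar fill color from a used/total pair."""
    pct = 100.0 * used / total if total > 0 else 0.0
    if pct >= 90:
        return DANGER_RED
    if pct >= 70:
        return WARNING_ORANGE
    return GOLD


compute_color = bar_color(18, 24)      # 75% used
storage_color = bar_color(12.4, 20)    # 62% used
```
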
---
### 6. Recent Activity (center, 40%)
**What it represents:** A chronological feed of the most recent system events — gives context on what just happened across the infrastructure.
**How it gets its data:**
- **Event aggregation**: Collects events from multiple sources:
- Backup completion notifications (from cron jobs or backup tools like restic/borgbackup)
- Security scan results (from scheduled ClamAV/rkhunter scans)
- Instance state changes (container start/stop/create from Proxmox/Docker APIs)
- Configuration changes (git commits to infra-as-code repos, or config file modification timestamps)
- Authentication events (successful logins from auth.log)
- **Storage**: Events stored in a lightweight local database (SQLite or JSON file) with timestamp, type, title, source, and severity.
**Display:**
- Card title: "RECENT ACTIVITY" (uppercase, secondary color)
- Close button (X) in top-right corner
- 5 items, each containing:
- Icon (event-type-specific: checkmark for completion, shield for security, play for launch, gear for config, user for login) — 14px, in a small rounded container (28px, page background)
- Event title (13px, bold, primary color) — e.g., "Backup completed"
- Source subtitle (11px, secondary color) — e.g., "Database Cluster 01"
- Relative timestamp (11px, secondary color, right-aligned) — e.g., "2m ago"
**Event ordering:** Newest first, max 5 displayed.
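
If the SQLite option is chosen, the event store and the "newest first, max 5" query could look roughly like this. The table name, column names, and helper functions are assumptions for illustration; the spec only fixes the fields (timestamp, type, title, source, severity).

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS events (
    id       INTEGER PRIMARY KEY,
    ts       TEXT NOT NULL,   -- ISO 8601 UTC timestamp, sorts correctly as text
    type     TEXT NOT NULL,   -- e.g. backup, security, instance, config, auth
    title    TEXT NOT NULL,
    source   TEXT NOT NULL,
    severity TEXT NOT NULL
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_event(conn, ts, type_, title, source, severity="info"):
    # "with conn" commits on success and rolls back on error.
    with conn:
        conn.execute(
            "INSERT INTO events (ts, type, title, source, severity) "
            "VALUES (?, ?, ?, ?, ?)",
            (ts, type_, title, source, severity),
        )


def recent_events(conn, limit=5):
    """Newest first; id breaks ties between events with the same timestamp."""
    cur = conn.execute(
        "SELECT ts, type, title, source, severity FROM events "
        "ORDER BY ts DESC, id DESC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()


store = open_store()
for i in range(7):
    record_event(store, f"2026-01-01T00:0{i}:00Z", "backup",
                 f"Backup completed #{i}", "Database Cluster 01")
feed = recent_events(store)
```
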
---
### 7. Top Alerts (right, 30%)
**What it represents:** The most urgent issues requiring attention — prioritized by severity then recency.
**How it gets its data:**
- **Alert aggregation**: Combines alerts from:
- High CPU/RAM usage events (from resource polling — any resource >85% CPU or >90% RAM for 5+ minutes)
- Disk space warnings (from Infrastructure polling — any disk >70%)
- Security events (from fail2ban/auth.log — active attacks)
- SSL certificate expiry (checks cert expiry dates for tracked domains, alerts at 30/14/7 days)
- Service down events (from System Status polling — any unreachable service)
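
For the certificate check, the 30/14/7-day stages reduce to date arithmetic once the expiry date is known. The sketch below does not fetch certificates; it assumes the expiry date has already been obtained, and `cert_alert_stage` is a hypothetical name.

```python
from datetime import date
from typing import Optional, Sequence


def cert_alert_stage(expires_on: date, today: date,
                     stages: Sequence[int] = (30, 14, 7)) -> Optional[int]:
    """Return the tightest stage (in days) already crossed, or None if none."""
    days_left = (expires_on - today).days
    crossed = [s for s in stages if days_left <= s]
    return min(crossed) if crossed else None


today = date(2026, 1, 1)
stage_far = cert_alert_stage(date(2026, 3, 1), today)     # 59 days left
stage_mid = cert_alert_stage(date(2026, 1, 21), today)    # 20 days left
stage_near = cert_alert_stage(date(2026, 1, 6), today)    # 5 days left
```

An already expired certificate has a negative `days_left` and therefore also reports the 7-day stage; a real implementation would likely treat that case as its own, higher severity.
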
**Display:**
- Card title: "TOP ALERTS" (uppercase, secondary color)
- "View all" link (gold, top-right) — navigates to a full alerts view
- Up to 4 items, each containing:
- Severity dot (🔴 red/large for high, 🟡 yellow for medium) — 6px diameter
- Alert title (13px, primary color, font-medium) — e.g., "High CPU Usage"
- Source subtitle (11px, secondary color) — e.g., "App Server 02"
- Relative timestamp (11px, secondary color, right-aligned) — e.g., "2m ago"
**Sort order:** High severity first, then by recency within same severity level.
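
The sort order can be implemented with two stable sorts: first by recency, then by severity, so that recency is preserved within each severity level. The alert dictionary shape and the `top_alerts` name are assumptions for this sketch.

```python
from datetime import datetime

SEVERITY_RANK = {"high": 0, "medium": 1, "low": 2}


def top_alerts(alerts, limit=4):
    """High severity first, newest first within a severity, at most `limit` items."""
    ordered = sorted(alerts, key=lambda a: a["ts"], reverse=True)   # newest first
    ordered.sort(key=lambda a: SEVERITY_RANK[a["severity"]])        # stable sort
    return ordered[:limit]


alerts = [
    {"title": "Disk Space Warning", "severity": "medium", "ts": datetime(2026, 1, 1, 12, 5)},
    {"title": "High CPU Usage", "severity": "high", "ts": datetime(2026, 1, 1, 12, 0)},
    {"title": "Brute-force attempt", "severity": "high", "ts": datetime(2026, 1, 1, 12, 3)},
    {"title": "Certificate expiring", "severity": "medium", "ts": datetime(2026, 1, 1, 11, 0)},
    {"title": "Failed login", "severity": "low", "ts": datetime(2026, 1, 1, 12, 9)},
]
shown = [a["title"] for a in top_alerts(alerts)]
```
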
---
## Bottom Row — Charts & Actions (2 columns: 65% | 35%)
### 8. Network Traffic (left, 65%)
**What it represents:** Visual representation of network throughput over the last 24 hours — helps spot unusual spikes or drops in traffic.
**How it gets its data:**
- **Interface monitoring**: Reads network interface stats (bytes in/out) from the primary gateway or router. Options:
- SNMP polling from router/switch
- `vnstat` on the ArchNest host or gateway
- Proxmox node network stats API
- **Sampling**: Records the interface byte counters (in/out) every 5 minutes and derives the rate (Mbps/Gbps) from the difference between consecutive samples
- **Current values**: Latest sample provides the "Incoming X.XX Gbps" and "Outgoing X.XX Gbps" figures
- **Trend calculation**: Compares current hour's average to same hour yesterday for the percentage change (↑/↓)
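
Turning two counter readings into a rate, and two hourly averages into a trend percentage, could be sketched as follows. The function names are hypothetical, and the wrap-around handling assumes a monotonically increasing counter of known width (a counter reset after a reboot would need separate handling).

```python
from typing import Optional


def rate_mbps(prev_bytes: int, curr_bytes: int, interval_s: float,
              counter_bits: int = 64) -> float:
    """Average throughput in Mbps between two byte-counter readings."""
    delta = curr_bytes - prev_bytes
    if delta < 0:
        # Assume the counter wrapped around once since the previous sample.
        delta += 2 ** counter_bits
    return delta * 8 / interval_s / 1_000_000


def trend_pct(current_avg: float, yesterday_avg: float) -> Optional[float]:
    """Percent change versus the same hour yesterday; None if there is no baseline."""
    if yesterday_avg == 0:
        return None
    return 100.0 * (current_avg - yesterday_avg) / yesterday_avg


# 37.5 GB transferred in one 5-minute (300 s) sample interval.
incoming = rate_mbps(0, 37_500_000_000, 300)
change = trend_pct(1.08, 1.00)
```

Here `incoming` works out to 1000 Mbps (1 Gbps), and `change` is about +8%, which would be rendered with the ↑ arrow.
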
**Display:**
- Card title: "NETWORK TRAFFIC" (uppercase, secondary color)
- Background: Custom background image (`archnest-network-traffic-bg.png`) rendered at ~20% opacity behind the chart, giving the card a unique visual identity
- Area chart (overlaid on background):
- Fill: Gold/amber gradient (top: #C8A434 at ~30% opacity, fading to transparent)
- Line: Gold (#C8A434) for inbound, amber (#E67E22) for outbound
- X-axis: 24-hour span (no visible labels)
- Y-axis: Auto-scaled (no visible labels)
- Stats (right side of chart, vertically stacked):
- "Incoming" label (11px, secondary) + "1.23 Gbps" (18px, bold, primary) + "↓ 12.4%" (11px, red)
- "Outgoing" label (11px, secondary) + "1.08 Gbps" (18px, bold, primary) + "↑ 8.7%" (11px, green)
---
### 9. Shortcuts (right, 35%)
**What it represents:** Quick-action buttons for common administrative tasks — one-click access to frequent operations.
**How it works:**
- Each shortcut triggers a predefined action or navigates to a specific workflow:
- **Add Server**: Opens a form/modal to add a new resource to `infra.config` (host IP, type, credentials)
- **Create Backup**: Triggers an on-demand backup job for a selected resource or all resources
- **Deploy App**: Opens a deployment workflow (e.g., pull latest docker-compose, restart containers)
- **View Logs**: Navigates to a log viewer or opens a terminal session with log tailing
**Display:**
- Card title: "SHORTCUTS" (uppercase, secondary color)
- 4 buttons in a horizontal row:
- Each button: Outlined/stroked icon inside a bordered rounded container (40px)
- Icon style: Lucide outlined icons (18px), secondary color, gold on hover
- Label below icon: 10px, secondary color, centered
- Container: 1px border (#1E2025), 8px radius, hover → gold border + gold icon
---
## Data Refresh & Polling Summary
| Data Source | Poll Interval | Used By |
|-------------|---------------|---------|
| System ping sweep | 5 minutes | System Status KPI |
| Package update check | 1 hour | System Status KPI |
| Infrastructure resource status | 5 minutes | Infrastructure KPI, Resource Overview |
| Disk usage per resource | 5 minutes | Infrastructure KPI, Resource Overview, Top Alerts |
| Fail2ban / auth.log | 2 minutes | Security KPI, Top Alerts |
| Security package check | 1 hour | Security KPI |
| Network ping probes | 60 seconds | Network KPI |
| Network interface throughput | 5 minutes | Network Traffic chart |
| Event log (activity) | Real-time (push) or 1 minute | Recent Activity |
| SSL cert expiry | 24 hours | Top Alerts |
| CPU/RAM per resource | 5 minutes | Top Alerts |
---
## Configuration Dependencies
| Config File | Purpose |
|-------------|---------|
| `systems.config` | List of IPs/hosts to ping for System Status |
| `infra.config` | Resource definitions with API endpoints, SSH keys, connection types |
| `alerts.config` | Threshold definitions (disk %, CPU %, failed login count, cert days) |
| `network.config` | Resolver IPs to ping, interface to monitor for traffic |
---
## Interaction Behaviors
| Element | Action | Result |
|---------|--------|--------|
| X button (Resource Overview) | Click | Hides the card for current session |
| X button (Recent Activity) | Click | Hides the card for current session |
| "View all" (Top Alerts) | Click | Navigates to full alerts list view |
| Shortcut button | Click | Triggers associated action/workflow |
| Status card | Hover | Gold border transition (0.2s) |
| Progress ring | Page load | Animates from 0 to value (1s) |
| Sparkline | Page load | Draws line animation (1s) |
| Progress bars | Page load | Fill animation (0.8s staggered) |
| User avatar/chevron | Click | Opens/closes user dropdown menu |
| Dropdown: Profile | Click | Navigates to Settings > Profile |
| Dropdown: Appearance | Click | Navigates to Settings > Appearance |
| Dropdown: Security | Click | Navigates to Settings > Security/Integrations |
| Dropdown: Help & Support | Click | Opens docs/wiki link |
| Dropdown: Sign Out | Click | Ends session, returns to login |
| Click outside dropdown | Click | Closes the dropdown menu |