4 Models Running

System Overview

System Health: Optimal (all services running normally)
Active Models: 4 (llama3:8b, mistral:7b, codellama:13b, phi3:mini)
Response Time: 1.2s average (last hour)
Memory Usage: 68% (43.5 GB / 64 GB total)
GPU Utilization: 84% (NVIDIA RTX 4090, 20.2 GB / 24 GB)
Cost Savings: 87% vs. cloud API costs (this month)
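The Active Models count above can also be queried programmatically. A minimal sketch, assuming the dashboard fronts a stock Ollama server on its default port 11434 (the base URL is an assumption; the logs below show the Sasha Studio API itself on port 80):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: default Ollama port; the dashboard backend may proxy this

def list_running_models(base_url: str = OLLAMA_URL) -> list[str]:
    """Return names of currently loaded models via Ollama's /api/ps endpoint."""
    with urllib.request.urlopen(f"{base_url}/api/ps") as resp:
        data = json.load(resp)
    return [m["name"] for m in data.get("models", [])]

def summarize(names: list[str]) -> str:
    """Render an 'Active Models' line like the one on the dashboard."""
    return f"{len(names)} models running: {', '.join(names)}"

if __name__ == "__main__":
    print(summarize(list_running_models()))
```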

Recent Alerts

⚠️ High GPU Memory Usage: GPU memory usage at 84% - consider unloading unused models (5 min ago)
ℹ️ Model Update Available: new version of llama3:8b available with improved performance (2 hours ago)

Model Management

Llama 3 8B (Running)
General purpose model - fast and balanced
Parameters: 8.0B | Size: 4.7 GB | Context: 8,192 tokens | Speed: Fast

Mistral 7B (Running)
Long context specialist - great for analysis
Parameters: 7.3B | Size: 4.1 GB | Context: 32,768 tokens | Speed: Medium

CodeLlama 13B (Running)
Code generation and debugging specialist
Parameters: 13.0B | Size: 7.3 GB | Context: 16,384 tokens | Speed: Medium

Llama 3 70B (Stopped)
High-quality responses - requires GPU
Parameters: 70.6B | Size: 39.9 GB | Context: 8,192 tokens | Speed: Slow

Phi-3 Mini (Running)
Ultra-fast responses for simple tasks
Parameters: 3.8B | Size: 2.2 GB | Context: 4,096 tokens | Speed: Very Fast
Add New Model
Download from Ollama library or import custom model

Resource Monitor

System Resources Over Time
[Chart: real-time resource monitoring]

Current Usage

CPU Usage: 23%
Memory Usage: 43.5 GB / 64 GB
GPU Memory: 20.2 GB / 24 GB
GPU Utilization: 84%
Disk Usage: 156 GB / 500 GB
Network I/O: 12.3 MB/s
Temperature: 67°C
Active Requests: 3
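The GPU figures in this panel (memory, utilization, temperature) can be sampled with nvidia-smi's query mode. A sketch; how the dashboard actually collects these numbers is an assumption:

```python
import subprocess

def parse_smi_line(line: str) -> dict:
    """Parse one CSV line of nvidia-smi query output:
    used MiB, total MiB, utilization %, temperature C."""
    used, total, util, temp = (int(x) for x in line.split(", "))
    return {"mem_used_mb": used, "mem_total_mb": total,
            "util_pct": util, "temp_c": temp}

def query_gpu() -> dict:
    """Sample the first GPU via nvidia-smi's machine-readable query interface."""
    out = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total,utilization.gpu,temperature.gpu",
            "--format=csv,noheader,nounits",
        ],
        text=True,
    )
    return parse_smi_line(out.strip().splitlines()[0])

if __name__ == "__main__":
    print(query_gpu())
```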

Configuration

Model Settings

Start essential models on boot

Resource Limits

80%
90%

Security & Access

Require API key for access
Log all API requests

Monitoring & Alerts

Send email alerts

Alerts & System Logs

Active Alerts

⚠️ High GPU Memory Usage: GPU memory at 84% (20.2 GB / 24 GB) - consider stopping unused models (5 minutes ago)
ℹ️ Model Update Available: llama3:8b v2.1 available with 15% performance improvement (2 hours ago)
ℹ️ Scheduled Maintenance: system maintenance window scheduled for tonight, 2:00 AM - 4:00 AM (1 day ago)

System Logs

2024-01-05 14:32:15 [INFO] Ollama service started successfully
2024-01-05 14:32:16 [INFO] Model llama3:8b loaded (4.7 GB)
2024-01-05 14:32:18 [INFO] Model mistral:7b loaded (4.1 GB)
2024-01-05 14:32:20 [INFO] Model codellama:13b loaded (7.3 GB)
2024-01-05 14:32:22 [INFO] Model phi3:mini loaded (2.2 GB)
2024-01-05 14:32:25 [INFO] Sasha Studio API server listening on port 80
2024-01-05 14:35:12 [INFO] Chat request processed (llama3:8b, 1.2s response time)
2024-01-05 14:36:45 [INFO] Chat request processed (mistral:7b, 0.8s response time)
2024-01-05 14:38:23 [WARN] GPU memory usage: 84% (20.2 GB / 24 GB)
2024-01-05 14:39:15 [INFO] Chat request processed (codellama:13b, 1.5s response time)
2024-01-05 14:41:08 [INFO] System metrics collected - all services healthy
2024-01-05 14:42:30 [INFO] Model update check completed - 1 update available
2024-01-05 14:43:15 [INFO] Chat request processed (phi3:mini, 0.3s response time)
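Per-model response times like the 1.2s figure on the overview can be recovered from these log lines. A sketch that parses the "Chat request processed" entries above:

```python
import re

# Sample entries copied from the system log above
LOG = """\
2024-01-05 14:35:12 [INFO] Chat request processed (llama3:8b, 1.2s response time)
2024-01-05 14:36:45 [INFO] Chat request processed (mistral:7b, 0.8s response time)
2024-01-05 14:39:15 [INFO] Chat request processed (codellama:13b, 1.5s response time)
2024-01-05 14:43:15 [INFO] Chat request processed (phi3:mini, 0.3s response time)
"""

# Matches: Chat request processed (model, N.Ns response time)
PATTERN = re.compile(r"Chat request processed \(([\w:.-]+), ([\d.]+)s response time\)")

def response_times(log: str) -> dict[str, float]:
    """Map each model name to its most recent response time in seconds."""
    times: dict[str, float] = {}
    for line in log.splitlines():
        m = PATTERN.search(line)
        if m:
            times[m.group(1)] = float(m.group(2))
    return times

def average_time(log: str) -> float:
    """Average response time across all parsed chat requests."""
    times = response_times(log)
    return sum(times.values()) / len(times)
```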