4. Data Model
Overview
Sasha Studio uses SQLite via better-sqlite3 (synchronous API). Single database file at DB_PATH (default: /app/data/sasha.db in Docker, ../data/sasha.db locally).
Schema source: claudecodeui/server/database/init.sql + migrations in db.js.
Database settings:
PRAGMA foreign_keys = ON
PRAGMA busy_timeout = 10000 (for NFS/EFS compatibility)
journal_mode = DELETE (not WAL -- safer on network filesystems)
RESET_DATABASE=true env var triggers full schema rebuild
Entity Relationship Diagram
users (1) ──────── (N) company_profiles
│ │
│ ├── (N) onboarding_documents
│ └── (N) research_documents
│
├── (N) sessions ──── (N) raw_exchanges ──── (N) usage_events
│ │ │
│ │ ├── (N) response_quality_signals
│ │ └── (N) insights
│ │
│ └── (N) instruction_failures
│
├── (N) api_keys ──── (N) api_key_usage
│
├── (N) schedules ──── (N) execution_history
│
├── (N) executions (unified log)
│
├── (N) hook_events
│
└── (N) feedback_events
cloud_providers (1) ── (N) cloud_connections (1) ── (1) cloud_credentials
│
└── (N) cloud_mounts ── (N) cloud_mount_events
bedrock_config (single row)
bedrock_model_patterns (cache)
bedrock_context_audit (log)
system_config (key-value)
named_secrets (encrypted)
prompt_versions (versioned prompts)
stats_refresh_log (reporting)
hook_ingest_state (checkpoint)
Tables
Core: users
| Column |
Type |
Constraints |
Description |
| id |
INTEGER |
PK AUTOINCREMENT |
|
| username |
TEXT |
UNIQUE NOT NULL |
Login identifier |
| password_hash |
TEXT |
NOT NULL |
bcrypt hash |
| created_at |
DATETIME |
DEFAULT CURRENT_TIMESTAMP |
|
| last_login |
DATETIME |
|
Updated on each login |
| is_active |
BOOLEAN |
DEFAULT 1 |
Account enabled |
| is_admin |
BOOLEAN |
DEFAULT 0 |
Admin privileges |
Indexes: idx_users_username, idx_users_active
Notes: First registered user automatically gets is_admin = 1. No self-registration after first user.
Core: company_profiles
| Column |
Type |
Constraints |
Description |
| id |
INTEGER |
PK AUTOINCREMENT |
|
| user_id |
INTEGER |
FK users(id) NOT NULL |
Owner |
| project_id |
VARCHAR(36) |
UNIQUE |
UUID for multi-project support |
| workspace_path |
TEXT |
|
Filesystem path for project |
| search_term |
TEXT |
|
Company name or URL used for research |
| search_type |
TEXT |
|
How company was identified |
| company_name |
TEXT |
|
|
| company_url |
TEXT |
|
|
| industry |
TEXT |
|
|
| company_size |
TEXT |
|
|
| founded_year |
INTEGER |
|
|
| headquarters |
TEXT |
|
|
| description |
TEXT |
|
|
| key_people |
JSON |
|
|
| key_customers |
JSON |
|
|
| key_partners |
JSON |
|
|
| competitors |
JSON |
|
|
| onboarding_completed |
BOOLEAN |
DEFAULT 0 |
|
| onboarding_method |
TEXT |
DEFAULT 'documents' |
|
| research_status |
TEXT |
DEFAULT 'pending' |
pending → researching → completed |
| created_at |
DATETIME |
DEFAULT CURRENT_TIMESTAMP |
|
| updated_at |
DATETIME |
DEFAULT CURRENT_TIMESTAMP |
|
Core: system_config
Key-value store for application settings.
| Column |
Type |
Description |
| id |
INTEGER |
PK |
| config_key |
TEXT |
UNIQUE NOT NULL |
| config_value |
TEXT |
JSON or plain text |
| updated_at |
DATETIME |
|
| updated_by |
INTEGER |
FK users(id) |
Known keys: AI provider settings, upload limits, general settings.
Auth: password_resets
| Column |
Type |
Description |
| id |
INTEGER |
PK |
| user_id |
INTEGER |
FK users(id) |
| token_hash |
TEXT |
SHA-256 hash of reset token |
| expires_at |
DATETIME |
Token expiration |
| used_at |
DATETIME |
NULL until used (single-use) |
| created_at |
DATETIME |
|
| created_ip |
TEXT |
IP that requested reset |
| user_agent |
TEXT |
Browser user agent |
Auth: api_keys
| Column |
Type |
Description |
| id |
TEXT |
PK, format: sk_... |
| key_hash |
TEXT |
bcrypt hash of full key |
| signing_secret |
TEXT |
HMAC secret for callback verification |
| name |
TEXT |
Human-readable label |
| user_id |
INTEGER |
FK users(id), owner |
| callback_url |
TEXT |
Default webhook callback URL |
| created_at |
TEXT |
|
| last_used_at |
TEXT |
Updated on each use |
| is_active |
INTEGER |
DEFAULT 1 |
Auth: api_key_usage
| Column |
Type |
Description |
| id |
INTEGER |
PK |
| api_key_id |
TEXT |
FK api_keys(id) |
| meeting_id |
TEXT |
|
| event_type |
TEXT |
start, stop, callback_delivery, callback_failure |
| timestamp |
TEXT |
|
| metadata |
TEXT |
JSON |
Activity: sessions
| Column |
Type |
Description |
| id |
TEXT |
PK (UUID) |
| started_at |
TEXT |
ISO timestamp |
| ended_at |
TEXT |
|
| updated_at |
TEXT |
|
| user_id |
INTEGER |
FK users(id) |
| project_id |
TEXT |
|
| model |
TEXT |
Claude model used |
| system_prompt |
TEXT |
System prompt snapshot |
| claude_md_snapshot |
TEXT |
CLAUDE.md content at session start |
| meta_json |
TEXT |
Additional session metadata |
| output_style_name |
TEXT |
Applied output style |
| output_style_version_id |
INTEGER |
|
Activity: raw_exchanges
| Column |
Type |
Description |
| id |
TEXT |
PK (UUID) |
| ts |
TEXT |
ISO timestamp |
| session_id |
TEXT |
FK sessions(id) |
| user_id |
INTEGER |
FK users(id) |
| role |
TEXT |
'user' or 'assistant' |
| content |
TEXT |
Message content |
| tool_names |
TEXT |
Comma-separated tool names used |
| is_correction |
BOOLEAN |
User corrected previous response |
| is_error |
BOOLEAN |
Response was an error |
| bash_exit_code |
INTEGER |
Exit code if bash command |
| files_touched |
TEXT |
Files modified in this exchange |
| tags |
TEXT |
|
| entities |
TEXT |
|
Activity: usage_events
| Column |
Type |
Description |
| id |
INTEGER |
PK |
| user_id |
INTEGER |
FK users(id) |
| session_id |
TEXT |
FK sessions(id) |
| exchange_id |
TEXT |
FK raw_exchanges(id) |
| provider |
TEXT |
'anthropic' or 'bedrock' |
| model_id |
TEXT |
Specific model identifier |
| input_tokens |
INTEGER |
|
| output_tokens |
INTEGER |
|
| thinking_tokens |
INTEGER |
Extended thinking tokens |
| cache_read_tokens |
INTEGER |
Prompt cache reads |
| cache_write_tokens |
INTEGER |
Prompt cache writes |
| cost_usd |
REAL |
Calculated cost |
| latency_ms |
INTEGER |
Response latency |
| created_at |
TEXT |
|
Quality: response_quality_signals
Detailed per-response quality analysis.
| Column |
Type |
Description |
| exchange_id |
TEXT |
FK raw_exchanges(id), UNIQUE |
| session_id |
TEXT |
FK sessions(id) |
| prompt_chars / response_chars |
INTEGER |
Character counts |
| prompt_tokens / response_tokens |
INTEGER |
Token counts |
| hedging_score |
INTEGER |
0-100, how much hedging language |
| certainty_score |
INTEGER |
0-100, confidence level |
| asks_clarification |
INTEGER |
0/1 |
| truncation_detected |
INTEGER |
0/1 |
| code_blocks_count |
INTEGER |
|
| hallucination_risk_score |
INTEGER |
0-100 |
| response_primary_type |
TEXT |
Classification |
| quality_score |
INTEGER |
Overall 0-100 |
| output_style_name |
TEXT |
|
| user_prompt_version_id |
INTEGER |
FK prompt_versions |
Quality: feedback_events
| Column |
Type |
Description |
| id |
TEXT |
PK |
| response_id |
TEXT |
|
| session_id |
TEXT |
|
| feedback |
TEXT |
CHECK: 'up', 'down', 'neutral' |
| reason |
TEXT |
Free-text explanation |
| model_id |
TEXT |
|
| user_id |
INTEGER |
|
| created_at |
TEXT |
|
Constraint: UNIQUE(response_id, user_id) -- one feedback per user per response.
Scheduling: schedules
| Column |
Type |
Description |
| id |
INTEGER |
PK |
| name |
TEXT |
UNIQUE, human-readable |
| prompt_name |
TEXT |
Skill/prompt to execute |
| cron_expression |
TEXT |
NULL = on-demand only |
| timezone |
TEXT |
DEFAULT 'UTC' |
| enabled |
INTEGER |
DEFAULT 1 |
| timeout_ms |
INTEGER |
DEFAULT 300000 (5 min) |
| last_run_at |
TEXT |
|
| last_status |
TEXT |
|
| created_by |
INTEGER |
FK users(id) |
Scheduling: execution_history
| Column |
Type |
Description |
| id |
INTEGER |
PK |
| execution_id |
TEXT |
UNIQUE, deterministic hash |
| schedule_id |
INTEGER |
FK schedules(id) |
| schedule_name |
TEXT |
Denormalized for display |
| status |
TEXT |
running, completed, failed, timeout, stale |
| trigger_type |
TEXT |
scheduled, manual, catchup |
| scheduled_for |
TEXT |
Time slot for this execution |
| started_at |
TEXT |
|
| completed_at |
TEXT |
|
| duration_ms |
INTEGER |
|
| output_file |
TEXT |
Path to output |
| result_json |
TEXT |
Structured result |
| input_tokens / output_tokens |
INTEGER |
|
Idempotency: UNIQUE(schedule_id, scheduled_for) prevents duplicate executions.
Scheduling: executions (unified log)
Unified execution log for all execution types (schedule, meeting, API task).
| Column |
Type |
Description |
| execution_type |
TEXT |
schedule, meeting, api_task |
| name |
TEXT |
Task/meeting name |
| status |
TEXT |
running, completed, failed, timeout, stale, cancelled |
| trigger_type |
TEXT |
scheduled, manual, catchup, api |
| api_key_id |
TEXT |
FK api_keys(id), for API-triggered |
| context_json |
TEXT |
Type-specific context |
| idempotency_key |
TEXT |
UNIQUE WHERE NOT NULL |
Cloud: cloud_providers
| Column |
Type |
Description |
| type |
TEXT |
CHECK: 'onedrive', 'sharepoint', 'gdrive' |
| display_name |
TEXT |
|
| client_id |
TEXT |
OAuth client ID |
| scopes |
TEXT |
OAuth scopes |
Cloud: cloud_connections
| Column |
Type |
Description |
| provider_id |
INTEGER |
FK cloud_providers(id) |
| status |
TEXT |
CHECK: 'pending', 'authorized', 'revoked' |
| scope_level |
TEXT |
CHECK: 'read', 'read_write' |
| integration_mode |
TEXT |
CHECK: 'mount', 'api' |
| created_by |
INTEGER |
FK users(id) |
Cloud: cloud_credentials
| Column |
Type |
Description |
| connection_id |
INTEGER |
PK, FK cloud_connections(id) |
| refresh_token_encrypted |
TEXT |
AES-256-GCM encrypted |
| access_token_encrypted |
TEXT |
AES-256-GCM encrypted |
| access_token_expires_at |
DATETIME |
|
| tenant_id |
TEXT |
Microsoft tenant ID |
| account_email |
TEXT |
|
Cloud: cloud_mounts
| Column |
Type |
Description |
| connection_id |
INTEGER |
FK cloud_connections(id) |
| mount_path |
TEXT |
Local filesystem mount point |
| remote_name |
TEXT |
rclone remote name |
| access_mode |
TEXT |
CHECK: 'read', 'read_write' |
| vfs_profile |
TEXT |
DEFAULT 'balanced' |
| health_status |
TEXT |
unmounted, mounting, mounted, degraded, remounting |
AI Config: bedrock_config
Single-row table (enforced by CHECK (id = 1)).
| Column |
Type |
Description |
| api_key_encrypted |
TEXT |
AES-256-GCM encrypted AWS key |
| region |
TEXT |
DEFAULT 'us-east-1' |
| active_model |
TEXT |
Current model ID |
| enable_1m_context |
BOOLEAN |
Extended context toggle |
Versioning: prompt_versions
| Column |
Type |
Description |
| scope |
TEXT |
CHECK: 'user', 'project', 'specialist', 'output_style' |
| scope_id |
TEXT |
Context-dependent identifier |
| file_path |
TEXT |
File being versioned |
| version |
INTEGER |
Incrementing version number |
| content_hash |
TEXT |
SHA-256 of content |
| content |
TEXT |
Full content snapshot |
| is_active |
INTEGER |
Current active version |
Constraint: UNIQUE(scope, scope_id, file_path, version)
Hook Analytics
Six tables track Claude Code hook events for analytics:
- hook_events -- Raw event records (event_id PK, event type, session, tool)
- hook_tool_execs -- Tool execution details (duration, command, success)
- hook_sessions -- Session metadata (start/end, source, event count)
- hook_subagents -- Subagent tracking (agent type, transcript path)
- hook_file_accesses -- File access audit (op, path, size)
- hook_usage_rollups -- Pre-aggregated stats (bucket: day/hour/session/tool)
- hook_command_rollups -- Command usage patterns (chain analysis)
- hook_ingest_state -- Checkpoint for incremental JSONL processing
Secrets: named_secrets
| Column |
Type |
Description |
| name |
TEXT |
Display name |
| normalized_name |
TEXT |
UNIQUE, lowercased/trimmed |
| value_encrypted |
TEXT |
AES-256-GCM encrypted |
| metadata |
JSON |
Usage context, tags |
| last_accessed_at |
DATETIME |
|
Monitoring: stats_refresh_log
Tracks analytics refresh operations.
| Column |
Type |
Description |
| triggered_by |
TEXT |
ui_button, sasha_internal, api_external, scheduled |
| status |
TEXT |
running, completed, failed |
| hook_events_processed |
INTEGER |
|
| incremental |
BOOLEAN |
Incremental vs full refresh |
Migration System
Migrations are applied in db.js at startup. They use ALTER TABLE and CREATE TABLE IF NOT EXISTS patterns. No down-migration support.
Key migrations:
- Add
is_admin column to users
- Create
usage_events table (replaces deprecated bedrock_usage)
- Create
feedback_events table
- Expand
prompt_versions scope to include 'output_style'
- Add output style tracking columns and indexes
- Make
schedules.cron_expression nullable (on-demand tasks)
- Add extended context columns for Bedrock
Sensitive Fields
| Table |
Field |
Protection |
| users |
password_hash |
bcrypt |
| bedrock_config |
api_key_encrypted |
AES-256-GCM |
| cloud_credentials |
refresh_token_encrypted |
AES-256-GCM |
| cloud_credentials |
access_token_encrypted |
AES-256-GCM |
| named_secrets |
value_encrypted |
AES-256-GCM |
| api_keys |
key_hash |
bcrypt |
| api_keys |
signing_secret |
Plaintext (server-side only) |
| password_resets |
token_hash |
SHA-256 |