Context is Everything logo

4. Data Model

Overview

Sasha Studio uses SQLite via better-sqlite3 (synchronous API). Single database file at DB_PATH (default: /app/data/sasha.db in Docker, ../data/sasha.db locally).

Schema source: claudecodeui/server/database/init.sql + migrations in db.js.

Database settings:

  • PRAGMA foreign_keys = ON
  • PRAGMA busy_timeout = 10000 (for NFS/EFS compatibility)
  • journal_mode = DELETE (not WAL -- safer on network filesystems)
  • RESET_DATABASE=true env var triggers full schema rebuild

Entity Relationship Diagram

users (1) ──────── (N) company_profiles
  │                       │
  │                       ├── (N) onboarding_documents
  │                       └── (N) research_documents
  │
  ├── (N) sessions ──── (N) raw_exchanges ──── (N) usage_events
  │         │                    │
  │         │                    ├── (N) response_quality_signals
  │         │                    └── (N) insights
  │         │
  │         └── (N) instruction_failures
  │
  ├── (N) api_keys ──── (N) api_key_usage
  │
  ├── (N) schedules ──── (N) execution_history
  │
  ├── (N) executions (unified log)
  │
  ├── (N) hook_events
  │
  └── (N) feedback_events

cloud_providers (1) ── (N) cloud_connections (1) ── (1) cloud_credentials
                                │
                                └── (N) cloud_mounts ── (N) cloud_mount_events

bedrock_config (single row)
bedrock_model_patterns (cache)
bedrock_context_audit (log)

system_config (key-value)
named_secrets (encrypted)
prompt_versions (versioned prompts)
stats_refresh_log (reporting)
hook_ingest_state (checkpoint)

Tables

Core: users

Column Type Constraints Description
id INTEGER PK AUTOINCREMENT
username TEXT UNIQUE NOT NULL Login identifier
password_hash TEXT NOT NULL bcrypt hash
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
last_login DATETIME Updated on each login
is_active BOOLEAN DEFAULT 1 Account enabled
is_admin BOOLEAN DEFAULT 0 Admin privileges

Indexes: idx_users_username, idx_users_active

Notes: First registered user automatically gets is_admin = 1. No self-registration after first user.

Core: company_profiles

Column Type Constraints Description
id INTEGER PK AUTOINCREMENT
user_id INTEGER FK users(id) NOT NULL Owner
project_id VARCHAR(36) UNIQUE UUID for multi-project support
workspace_path TEXT Filesystem path for project
search_term TEXT Company name or URL used for research
search_type TEXT How company was identified
company_name TEXT
company_url TEXT
industry TEXT
company_size TEXT
founded_year INTEGER
headquarters TEXT
description TEXT
key_people JSON
key_customers JSON
key_partners JSON
competitors JSON
onboarding_completed BOOLEAN DEFAULT 0
onboarding_method TEXT DEFAULT 'documents'
research_status TEXT DEFAULT 'pending' pending → researching → completed
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
updated_at DATETIME DEFAULT CURRENT_TIMESTAMP

Core: system_config

Key-value store for application settings.

Column Type Description
id INTEGER PK
config_key TEXT UNIQUE NOT NULL
config_value TEXT JSON or plain text
updated_at DATETIME
updated_by INTEGER FK users(id)

Known keys: AI provider settings, upload limits, general settings.

Auth: password_resets

Column Type Description
id INTEGER PK
user_id INTEGER FK users(id)
token_hash TEXT SHA-256 hash of reset token
expires_at DATETIME Token expiration
used_at DATETIME NULL until used (single-use)
created_at DATETIME
created_ip TEXT IP that requested reset
user_agent TEXT Browser user agent

Auth: api_keys

Column Type Description
id TEXT PK, format: sk_...
key_hash TEXT bcrypt hash of full key
signing_secret TEXT HMAC secret for callback verification
name TEXT Human-readable label
user_id INTEGER FK users(id), owner
callback_url TEXT Default webhook callback URL
created_at TEXT
last_used_at TEXT Updated on each use
is_active INTEGER DEFAULT 1

Auth: api_key_usage

Column Type Description
id INTEGER PK
api_key_id TEXT FK api_keys(id)
meeting_id TEXT
event_type TEXT start, stop, callback_delivery, callback_failure
timestamp TEXT
metadata TEXT JSON

Activity: sessions

Column Type Description
id TEXT PK (UUID)
started_at TEXT ISO timestamp
ended_at TEXT
updated_at TEXT
user_id INTEGER FK users(id)
project_id TEXT
model TEXT Claude model used
system_prompt TEXT System prompt snapshot
claude_md_snapshot TEXT CLAUDE.md content at session start
meta_json TEXT Additional session metadata
output_style_name TEXT Applied output style
output_style_version_id INTEGER

Activity: raw_exchanges

Column Type Description
id TEXT PK (UUID)
ts TEXT ISO timestamp
session_id TEXT FK sessions(id)
user_id INTEGER FK users(id)
role TEXT 'user' or 'assistant'
content TEXT Message content
tool_names TEXT Comma-separated tool names used
is_correction BOOLEAN User corrected previous response
is_error BOOLEAN Response was an error
bash_exit_code INTEGER Exit code if bash command
files_touched TEXT Files modified in this exchange
tags TEXT
entities TEXT

Activity: usage_events

Column Type Description
id INTEGER PK
user_id INTEGER FK users(id)
session_id TEXT FK sessions(id)
exchange_id TEXT FK raw_exchanges(id)
provider TEXT 'anthropic' or 'bedrock'
model_id TEXT Specific model identifier
input_tokens INTEGER
output_tokens INTEGER
thinking_tokens INTEGER Extended thinking tokens
cache_read_tokens INTEGER Prompt cache reads
cache_write_tokens INTEGER Prompt cache writes
cost_usd REAL Calculated cost
latency_ms INTEGER Response latency
created_at TEXT

Quality: response_quality_signals

Detailed per-response quality analysis.

Column Type Description
exchange_id TEXT FK raw_exchanges(id), UNIQUE
session_id TEXT FK sessions(id)
prompt_chars / response_chars INTEGER Character counts
prompt_tokens / response_tokens INTEGER Token counts
hedging_score INTEGER 0-100, how much hedging language
certainty_score INTEGER 0-100, confidence level
asks_clarification INTEGER 0/1
truncation_detected INTEGER 0/1
code_blocks_count INTEGER
hallucination_risk_score INTEGER 0-100
response_primary_type TEXT Classification
quality_score INTEGER Overall 0-100
output_style_name TEXT
user_prompt_version_id INTEGER FK prompt_versions

Quality: feedback_events

Column Type Description
id TEXT PK
response_id TEXT
session_id TEXT
feedback TEXT CHECK: 'up', 'down', 'neutral'
reason TEXT Free-text explanation
model_id TEXT
user_id INTEGER
created_at TEXT

Constraint: UNIQUE(response_id, user_id) -- one feedback per user per response.

Scheduling: schedules

Column Type Description
id INTEGER PK
name TEXT UNIQUE, human-readable
prompt_name TEXT Skill/prompt to execute
cron_expression TEXT NULL = on-demand only
timezone TEXT DEFAULT 'UTC'
enabled INTEGER DEFAULT 1
timeout_ms INTEGER DEFAULT 300000 (5 min)
last_run_at TEXT
last_status TEXT
created_by INTEGER FK users(id)

Scheduling: execution_history

Column Type Description
id INTEGER PK
execution_id TEXT UNIQUE, deterministic hash
schedule_id INTEGER FK schedules(id)
schedule_name TEXT Denormalized for display
status TEXT running, completed, failed, timeout, stale
trigger_type TEXT scheduled, manual, catchup
scheduled_for TEXT Time slot for this execution
started_at TEXT
completed_at TEXT
duration_ms INTEGER
output_file TEXT Path to output
result_json TEXT Structured result
input_tokens / output_tokens INTEGER

Idempotency: UNIQUE(schedule_id, scheduled_for) prevents duplicate executions.

Scheduling: executions (unified log)

Unified execution log for all execution types (schedule, meeting, API task).

Column Type Description
execution_type TEXT schedule, meeting, api_task
name TEXT Task/meeting name
status TEXT running, completed, failed, timeout, stale, cancelled
trigger_type TEXT scheduled, manual, catchup, api
api_key_id TEXT FK api_keys(id), for API-triggered
context_json TEXT Type-specific context
idempotency_key TEXT UNIQUE WHERE NOT NULL

Cloud: cloud_providers

Column Type Description
type TEXT CHECK: 'onedrive', 'sharepoint', 'gdrive'
display_name TEXT
client_id TEXT OAuth client ID
scopes TEXT OAuth scopes

Cloud: cloud_connections

Column Type Description
provider_id INTEGER FK cloud_providers(id)
status TEXT CHECK: 'pending', 'authorized', 'revoked'
scope_level TEXT CHECK: 'read', 'read_write'
integration_mode TEXT CHECK: 'mount', 'api'
created_by INTEGER FK users(id)

Cloud: cloud_credentials

Column Type Description
connection_id INTEGER PK, FK cloud_connections(id)
refresh_token_encrypted TEXT AES-256-GCM encrypted
access_token_encrypted TEXT AES-256-GCM encrypted
access_token_expires_at DATETIME
tenant_id TEXT Microsoft tenant ID
account_email TEXT

Cloud: cloud_mounts

Column Type Description
connection_id INTEGER FK cloud_connections(id)
mount_path TEXT Local filesystem mount point
remote_name TEXT rclone remote name
access_mode TEXT CHECK: 'read', 'read_write'
vfs_profile TEXT DEFAULT 'balanced'
health_status TEXT unmounted, mounting, mounted, degraded, remounting

AI Config: bedrock_config

Single-row table (enforced by CHECK (id = 1)).

Column Type Description
api_key_encrypted TEXT AES-256-GCM encrypted AWS key
region TEXT DEFAULT 'us-east-1'
active_model TEXT Current model ID
enable_1m_context BOOLEAN Extended context toggle

Versioning: prompt_versions

Column Type Description
scope TEXT CHECK: 'user', 'project', 'specialist', 'output_style'
scope_id TEXT Context-dependent identifier
file_path TEXT File being versioned
version INTEGER Incrementing version number
content_hash TEXT SHA-256 of content
content TEXT Full content snapshot
is_active INTEGER Current active version

Constraint: UNIQUE(scope, scope_id, file_path, version)

Hook Analytics

Six tables track Claude Code hook events for analytics:

  • hook_events -- Raw event records (event_id PK, event type, session, tool)
  • hook_tool_execs -- Tool execution details (duration, command, success)
  • hook_sessions -- Session metadata (start/end, source, event count)
  • hook_subagents -- Subagent tracking (agent type, transcript path)
  • hook_file_accesses -- File access audit (op, path, size)
  • hook_usage_rollups -- Pre-aggregated stats (bucket: day/hour/session/tool)
  • hook_command_rollups -- Command usage patterns (chain analysis)
  • hook_ingest_state -- Checkpoint for incremental JSONL processing

Secrets: named_secrets

Column Type Description
name TEXT Display name
normalized_name TEXT UNIQUE, lowercased/trimmed
value_encrypted TEXT AES-256-GCM encrypted
metadata JSON Usage context, tags
last_accessed_at DATETIME

Monitoring: stats_refresh_log

Tracks analytics refresh operations.

Column Type Description
triggered_by TEXT ui_button, sasha_internal, api_external, scheduled
status TEXT running, completed, failed
hook_events_processed INTEGER
incremental BOOLEAN Incremental vs full refresh

Migration System

Migrations are applied in db.js at startup. They use ALTER TABLE and CREATE TABLE IF NOT EXISTS patterns. No down-migration support.

Key migrations:

  1. Add is_admin column to users
  2. Create usage_events table (replaces deprecated bedrock_usage)
  3. Create feedback_events table
  4. Expand prompt_versions scope to include 'output_style'
  5. Add output style tracking columns and indexes
  6. Make schedules.cron_expression nullable (on-demand tasks)
  7. Add extended context columns for Bedrock

Sensitive Fields

Table Field Protection
users password_hash bcrypt
bedrock_config api_key_encrypted AES-256-GCM
cloud_credentials refresh_token_encrypted AES-256-GCM
cloud_credentials access_token_encrypted AES-256-GCM
named_secrets value_encrypted AES-256-GCM
api_keys key_hash bcrypt
api_keys signing_secret Plaintext (server-side only)
password_resets token_hash SHA-256