Last updated: Nov 17, 2025, 05:25 PM UTC

AI Costs: Complete Transparency

Understanding Your AI Investment with Sasha


Executive Summary

Zero Markup Pricing

Sasha operates on a direct cost pass-through model. You pay your AI provider (Anthropic or AWS Bedrock) directly at their published rates—no markup, no hidden fees. Your AI costs are completely transparent and under your control.

You provide your own API credentials → You pay your AI provider directly → You see every transaction


How Sasha's AI Pricing Works

API Token-Based Pricing

Unlike traditional software licensing, AI services charge per token—roughly 4 characters or 0.75 words. Every query and response consumes tokens, and you only pay for what you use.
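
As a rough illustration of the rule of thumb above, the sketch below estimates a token count from raw text. The function name and the averaging of the two heuristics are assumptions made for this example; actual counts come from the provider's tokenizer and will differ.

```python
import math

CHARS_PER_TOKEN = 4      # rule of thumb: about 4 characters per token
WORDS_PER_TOKEN = 0.75   # rule of thumb: about 0.75 words per token


def estimate_tokens(text: str) -> int:
    """Rough estimate only: average of the character and word heuristics."""
    by_chars = len(text) / CHARS_PER_TOKEN
    by_words = len(text.split()) / WORDS_PER_TOKEN
    return math.ceil((by_chars + by_words) / 2)


print(estimate_tokens("How much did we spend on AI queries last month?"))
```

Treat the result as a budgeting aid, not a billing figure; the provider console remains the source of truth.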

Why Token-Based Pricing?

  • Pay-per-use: No monthly minimums, no wasted capacity
  • Scalable: Costs grow proportionally with actual usage
  • Predictable: Clear pricing tables from AI providers
  • Optimizable: Multiple cost-saving strategies available

Two Provider Options

Sasha supports two ways to access AI capabilities, both with direct billing:

Option 1: Direct Anthropic API

Best for: Organizations that want the simplest setup and the latest models

  • Setup: Provide your Anthropic API key in Sasha admin settings
  • Billing: Monthly invoice from Anthropic based on token usage
  • Monitoring: Full usage dashboard in Anthropic console
  • Security: Direct API connection, encrypted token storage
  • Access: Immediate access to newest Claude models

Option 2: AWS Bedrock

Best for: Organizations with existing AWS infrastructure and strict data residency requirements

  • Setup: Configure AWS credentials with Bedrock permissions
  • Billing: Included in your AWS monthly bill
  • Monitoring: CloudWatch metrics and AWS Cost Explorer
  • Security: Data processed within your AWS region, never leaves AWS
  • Compliance: Regional deployment options for data sovereignty

Actual 2025 AI Pricing

Pricing Improves Over Time

AI models get better and cheaper every year. Unlike traditional software with fixed pricing, you benefit from:

  • New models released regularly with better performance at lower cost
  • Price reductions as AI providers scale infrastructure (prices have dropped 70% since 2023)
  • Instant control to switch models in Sasha settings—no vendor lock-in
  • Your choice to upgrade when ready or stick with proven models

You're in control: When a better model launches, simply update your settings and start using it immediately at the new (typically lower) price.

Direct Anthropic Pricing (January 2025)

Model                             Input Tokens   Output Tokens   Best For
Claude 3.5 Sonnet (Recommended)   $3.00 / 1M     $15.00 / 1M     Balanced performance and cost
Claude 3.5 Haiku (Economy)        $0.25 / 1M     $1.25 / 1M      High-volume queries, quick responses
Claude Opus 4.1 (Premium)         $15.00 / 1M    $75.00 / 1M     Complex reasoning, critical decisions

Cost-Saving Features:

  • Prompt Caching: Up to 90% savings on repeated context
  • Batch Processing: 50% discount for non-urgent queries
  • Long Context: 200K tokens standard, 1M available with premium pricing

AWS Bedrock Pricing (January 2025)

Model               Input Tokens   Output Tokens   Batch Input / Output per 1M
Claude 3.5 Sonnet   $3.00 / 1M     $15.00 / 1M     $1.50 / $7.50 (50% off)
Claude 3.5 Haiku    $1.00 / 1M     $5.00 / 1M      $0.50 / $2.50 (50% off)
Claude Opus         $15.00 / 1M    $75.00 / 1M     $7.50 / $37.50 (50% off)

Additional AWS Benefits:

  • Consolidated Billing: AI costs on same bill as your other AWS services
  • Volume Discounts: AWS Enterprise agreements may reduce costs further
  • Regional Options: Deploy in specific AWS regions for compliance
  • Provisioned Throughput: Dedicated capacity from $22-44/hour for guaranteed performance
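
To put the hourly provisioned-throughput figure in monthly terms, here is a minimal sketch. It assumes capacity runs continuously for a 30-day month; actual commitment terms and rates depend on your AWS agreement.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day month, running around the clock


def provisioned_monthly_cost(hourly_rate: float) -> float:
    """Monthly cost in USD of dedicated capacity at a flat hourly rate."""
    return hourly_rate * HOURS_PER_MONTH


low = provisioned_monthly_cost(22)
high = provisioned_monthly_cost(44)
print(f"${low:,.0f} to ${high:,.0f} per month")  # $15,840 to $31,680 per month
```

At these levels, provisioned throughput is mainly of interest to deployments whose on-demand token spend already approaches five figures per month.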

Note: Prices vary slightly by AWS region. Check the AWS Bedrock pricing page for your region.


Usage Scenarios: Real Cost Examples

Light Usage: 100-500 Queries/Month

Who: Small teams, occasional research, project-based knowledge access

Example: 300 queries per month, average 2,000 input tokens + 1,000 output tokens per query

Using Claude 3.5 Sonnet:

  • Input: 300 × 2,000 = 600,000 tokens = 0.6M tokens × $3 = $1.80
  • Output: 300 × 1,000 = 300,000 tokens = 0.3M tokens × $15 = $4.50
  • Total Monthly Cost: $1.80 + $4.50 = ~$6.30
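
The same arithmetic can be wrapped in a small helper so you can plug in your own volumes. This is a sketch only: the function name is hypothetical, and the default prices are the Claude 3.5 Sonnet rates quoted in this document.

```python
def monthly_cost(queries, input_tokens, output_tokens,
                 input_price=3.00, output_price=15.00):
    """Return (input_cost, output_cost, total) in USD for one month.

    Token counts are per-query averages; prices are USD per 1M tokens.
    """
    input_cost = queries * input_tokens * input_price / 1_000_000
    output_cost = queries * output_tokens * output_price / 1_000_000
    return input_cost, output_cost, input_cost + output_cost


input_cost, output_cost, total = monthly_cost(300, 2_000, 1_000)
print(f"input ${input_cost:.2f}, output ${output_cost:.2f}, total ${total:.2f}")
# input $1.80, output $4.50, total $6.30
```

Note that output makes up one third of the tokens here but roughly 71% of the cost, which is why response length matters more than query length.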

Medium Usage: 1,000-5,000 Queries/Month

Who: Regular team usage, daily knowledge queries, document analysis

Example: 2,500 queries per month, average 3,000 input + 1,500 output tokens

Using Claude 3.5 Sonnet:

  • Input: 2,500 × 3,000 = 7.5M tokens × $3 = $22.50
  • Output: 2,500 × 1,500 = 3.75M tokens × $15 = $56.25
  • Total Monthly Cost: $22.50 + $56.25 = ~$79

With Prompt Caching (typical 50% cache hit rate):

  • Cached input: 3.75M × $0.30 = $1.13 (90% below the $3 fresh rate)
  • Fresh input: 3.75M × $3 = $11.25
  • Output: 3.75M × $15 = $56.25
  • Optimized Total: ~$69 (about 13% savings versus $78.75 uncached)
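
A minimal sketch of the caching arithmetic for this medium-usage example follows. It assumes a flat cached-read rate of $0.30 per 1M tokens and ignores any surcharge a provider may apply when content is first written to the cache; the function name is hypothetical.

```python
def cost_with_caching(input_mtok, output_mtok, hit_rate,
                      input_price=3.00, cached_price=0.30, output_price=15.00):
    """Monthly cost in USD; token volumes are in millions, hit_rate is 0..1."""
    cached = input_mtok * hit_rate * cached_price       # input served from cache
    fresh = input_mtok * (1 - hit_rate) * input_price   # input billed at full rate
    output = output_mtok * output_price                 # caching does not discount output
    return cached + fresh + output


baseline = cost_with_caching(7.5, 3.75, hit_rate=0.0)
optimized = cost_with_caching(7.5, 3.75, hit_rate=0.5)
savings = 1 - optimized / baseline
print(f"${baseline:.2f} -> ${optimized:.2f} ({savings:.0%} saved)")
```

Because caching only touches input tokens, the overall saving is capped by the input share of the bill; workloads with long context and short answers benefit most.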

Heavy Usage: 10,000+ Queries/Month

Who: Enterprise-wide deployment, multiple teams, continuous AI assistance

Example: 15,000 queries per month, average 4,000 input + 2,000 output tokens

Using Claude 3.5 Sonnet:

  • Input: 15,000 × 4,000 = 60M tokens × $3 = $180
  • Output: 15,000 × 2,000 = 30M tokens × $15 = $450
  • Total Monthly Cost: $180 + $450 = ~$630

With Prompt Caching + Batch Processing:

  • Cached input (50%): 30M × $0.30 = $9
  • Fresh input (50%): 30M × $3 = $90
  • Batch output (40% of queries): 12M × $7.50 = $90
  • Real-time output (60%): 18M × $15 = $270
  • Optimized Total: ~$459 (about 27% savings versus $630)
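
The combined breakdown can be reproduced with the sketch below. It mirrors the split used in the bullets (caching applied to input, the batch rate applied to output only), which is a simplifying assumption; the function and parameter names are hypothetical.

```python
def heavy_usage_breakdown(input_mtok=60, output_mtok=30,
                          hit_rate=0.5, batch_share=0.4,
                          input_price=3.00, cached_price=0.30,
                          output_price=15.00, batch_output_price=7.50):
    """Return a per-line cost breakdown in USD; volumes are in millions of tokens."""
    lines = {
        "cached_input": input_mtok * hit_rate * cached_price,
        "fresh_input": input_mtok * (1 - hit_rate) * input_price,
        "batch_output": output_mtok * batch_share * batch_output_price,
        "realtime_output": output_mtok * (1 - batch_share) * output_price,
    }
    lines["total"] = sum(lines.values())
    return lines


breakdown = heavy_usage_breakdown()
unoptimized = 60 * 3.00 + 30 * 15.00  # $630 with no caching or batching
for name, usd in breakdown.items():
    print(f"{name:>16}: ${usd:,.2f}")
print(f"savings: {1 - breakdown['total'] / unoptimized:.0%}")
```

Raising `batch_share` or `hit_rate` shows how sensitive the total is to each lever for your own workload.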

What Influences AI Costs?

Query Complexity

Simple lookups use ~1,000 tokens. Complex document analysis can use 50,000+ tokens. More context = higher input costs.

Response Length

Short answers cost pennies. Detailed reports with summaries and analysis cost more. Output tokens cost 5× as much as input tokens.

Knowledge Base Size

Larger context windows (more documents) = more input tokens. Strategic document organization reduces costs.

Model Selection

Haiku for speed/cost, Sonnet for balance, Opus for critical tasks. Choosing the right model for each use case optimizes spending.


Practical Cost Optimization Strategies

Strategy 1: Smart Model Selection

Use Haiku ($0.25/$1.25 per 1M) for:

  • Quick lookups and simple Q&A
  • Document search and retrieval
  • Status checks and brief summaries

Use Sonnet ($3/$15 per 1M) for:

  • Complex analysis and reasoning
  • Multi-document synthesis
  • Strategic decision support

Potential Savings: 60-80% on routine queries
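
One way to picture this strategy is a simple routing rule that sends routine task types to Haiku and everything else to Sonnet. The task labels and function name below are hypothetical, chosen only to mirror the lists above; this is not a description of how Sasha selects models internally.

```python
from collections import Counter

# Hypothetical task labels corresponding to the "Use Haiku for" list.
ROUTINE_TASKS = {"lookup", "search", "status_check", "brief_summary"}


def pick_model(task_type: str) -> str:
    """Route routine tasks to the economy model, everything else to Sonnet."""
    return "haiku" if task_type in ROUTINE_TASKS else "sonnet"


tasks = ["lookup", "search", "multi_document_synthesis", "status_check"]
usage = Counter(pick_model(t) for t in tasks)
print(usage)
```

Counting how your real queries would split under such a rule is a quick way to check whether the savings range is realistic for your team.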


Strategy 2: Enable Prompt Caching

Sasha automatically caches:

  • Your knowledge base context
  • Frequently accessed documents
  • Common organizational information

Potential Savings: 50-90% on input tokens


Strategy 3: Batch Non-Urgent Work

For reports, summaries, and scheduled analysis:

  • Queue requests for batch processing
  • Get a 50% discount on all tokens
  • Results delivered within hours instead of seconds

Potential Savings: 50% on background processing


Strategy 4: Optimize Knowledge Base

  • Focus on most-accessed documents
  • Remove redundant information
  • Structure documents for efficient retrieval
  • Use summaries instead of full documents where appropriate

Potential Savings: 20-40% on input tokens


Security & Compliance Impact on Costs

No Additional Security Costs

Unlike other AI solutions, Sasha's security features do not increase AI costs:

Encryption: Your API tokens are encrypted at rest (AES-256-GCM) - no AI cost impact
Private Deployment: AWS Bedrock keeps data in your region - same per-token pricing
Access Controls: Role-based permissions managed locally - zero AI cost
Audit Logging: All tracking happens in Sasha - no AI provider charges

The only AI costs you pay are for actual token consumption—security is free.


Getting Started

Ready to See Your Costs?

30-Day Trial Estimates:

  • Configure your API credentials
  • Use Sasha normally for 30 days
  • Review actual token usage in your provider console
  • Make informed decisions based on real data
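
Once the trial is over, the token totals from your provider console can be turned into a dollar figure with a few lines of code. The sketch below assumes you have copied daily input and output token counts by hand into a list; the `DailyUsage` record is a hypothetical shape, not an export format of any provider.

```python
from dataclasses import dataclass


@dataclass
class DailyUsage:
    """One day of token usage, as read from the provider console (hypothetical shape)."""
    input_tokens: int
    output_tokens: int


def trial_cost(days, input_price=3.00, output_price=15.00):
    """Total USD cost for the recorded days; prices are USD per 1M tokens."""
    total_in = sum(d.input_tokens for d in days)
    total_out = sum(d.output_tokens for d in days)
    return (total_in * input_price + total_out * output_price) / 1_000_000


# Example: 30 days averaging 20,000 input and 10,000 output tokens per day.
trial = [DailyUsage(20_000, 10_000) for _ in range(30)]
print(f"30-day cost: ${trial_cost(trial):.2f}")  # 30-day cost: $6.30
```

Comparing this figure with the invoice from your provider is a simple check that no markup has been applied.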

Most organizations discover:

  • AI costs are 70-80% lower than expected
  • Optimization features reduce costs by 40-60%
  • Direct billing eliminates vendor markup concerns
  • ROI is positive within first month of deployment

support@context-is-everything.com | Schedule a cost analysis call

Sasha AI Knowledge Management - Complete transparency, zero markup, full control