Chapter 6: Spending Limits

Overview

Spending limits provide granular control over API costs by setting per-access-token budgets. This chapter explains how to configure, manage, and monitor spending limits to prevent unexpected charges and control costs across different applications and environments.

What are Spending Limits?

Spending limits are configurable constraints that restrict how much an individual access token can spend over various time periods. They provide:

  • Cost Control: Prevent runaway costs from bugs or misuse
  • Budget Management: Enforce per-project or per-application budgets
  • Security: Limit damage from compromised tokens
  • Testing Safety: Cap costs for development and testing environments
  • Compliance: Meet internal financial controls and approval requirements

Types of Spending Limits

1. Daily Limits

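Maximum amount an access token can spend in a single UTC day. The counter resets at 00:00:00 UTC, so this is a calendar-day budget rather than a rolling 24-hour window.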

Use Cases:

  • Development environments with daily budgets
  • Testing tokens that shouldn’t exceed daily thresholds
  • Rate-limiting expensive operations
  • Preventing daily cost spikes

2. Monthly Limits

Maximum amount an access token can spend in a calendar month (UTC). The counter resets at 00:00:00 UTC on the 1st of each month.

Use Cases:

  • Production applications with monthly budgets
  • Department or team-level cost allocation
  • Long-term cost management
  • Matching financial reporting periods

3. Per-Request Limits

Maximum amount a single API request can cost.

Use Cases:

  • Preventing expensive single operations
  • Limiting LLM token usage (input and output tokens) per call
  • Protecting against infinite loops or recursive calls
  • Enforcing cost-per-operation policies

Spending Limit Architecture

Access Token
    ├── Daily Limit: $10.00
    │   ├── Daily Spent: $3.45
    │   ├── Daily Remaining: $6.55
    │   └── Last Reset: 2024-01-15T00:00:00Z

    ├── Monthly Limit: $250.00
    │   ├── Monthly Spent: $47.30
    │   ├── Monthly Remaining: $202.70
    │   └── Last Reset: 2024-01-01T00:00:00Z

    └── Per-Request Limit: $1.00
        └── Enforced on each request

Request Flow:
1. Check per-request limit
2. Check daily limit
3. Check monthly limit
4. If all pass → Process request
5. If any fail → Reject with 402 Payment Required

Setting Spending Limits

Create or Update Limits

curl -X PUT https://api.polysystems.ai/api/keys/{key_id}/limits \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "daily_limit": 10.00,
    "monthly_limit": 250.00,
    "per_request_limit": 1.00
  }'

Request Parameters:

  • daily_limit (optional): Maximum daily spending in USD
  • monthly_limit (optional): Maximum monthly spending in USD
  • per_request_limit (optional): Maximum cost per single request in USD

Response:

{
  "access_key_id": "key-123e4567-e89b-12d3-a456-426614174000",
  "daily_limit": 10.00,
  "monthly_limit": 250.00,
  "per_request_limit": 1.00,
  "daily_spent": 0.00,
  "monthly_spent": 0.00,
  "daily_remaining": 10.00,
  "monthly_remaining": 250.00
}

Setting Individual Limits

You can set limits independently:

# Set only daily limit
curl -X PUT https://api.polysystems.ai/api/keys/{key_id}/limits \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "daily_limit": 5.00
  }'

# Set only monthly limit
curl -X PUT https://api.polysystems.ai/api/keys/{key_id}/limits \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "monthly_limit": 100.00
  }'

# Set only per-request limit
curl -X PUT https://api.polysystems.ai/api/keys/{key_id}/limits \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "per_request_limit": 0.50
  }'
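The same partial update can be built from Python. The sketch below uses only the standard library and constructs the PUT request without sending it; `build_limits_request` is a hypothetical helper name, and the endpoint and field names are taken from the curl examples above.

```python
import json
import urllib.request

API_BASE = "https://api.polysystems.ai"  # base URL from the curl examples

ALLOWED_FIELDS = {"daily_limit", "monthly_limit", "per_request_limit"}


def build_limits_request(key_id, jwt_token, **limits):
    """Build (but do not send) a PUT request that carries only the given limits."""
    unknown = set(limits) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"Unknown limit fields: {sorted(unknown)}")
    return urllib.request.Request(
        f"{API_BASE}/api/keys/{key_id}/limits",
        data=json.dumps(limits).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {jwt_token}",
            "Content-Type": "application/json",
        },
    )


# Only daily_limit is included in the request body
req = build_limits_request("key-123", "YOUR_JWT_TOKEN", daily_limit=5.00)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen(req)` would send it. Keeping construction separate from sending makes the payload easy to inspect or unit test.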

No Limits (Unlimited)

To leave a limit unlimited, omit the parameter or set it to null:

curl -X PUT https://api.polysystems.ai/api/keys/{key_id}/limits \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "daily_limit": null,
    "monthly_limit": null,
    "per_request_limit": null
  }'
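In Python clients, JSON `null` corresponds to `None`. A minimal sketch of producing the same request body with the standard `json` module (`unlimited_payload` is a hypothetical helper name):

```python
import json


def unlimited_payload():
    """Return a limits payload in which every limit is unset (None -> JSON null)."""
    return {
        "daily_limit": None,
        "monthly_limit": None,
        "per_request_limit": None,
    }


body = json.dumps(unlimited_payload())
print(body)
# {"daily_limit": null, "monthly_limit": null, "per_request_limit": null}
```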

Getting Spending Limits

View Current Limits

curl -X GET https://api.polysystems.ai/api/keys/{key_id}/limits \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

Response:

{
  "access_key_id": "key-123e4567-e89b-12d3-a456-426614174000",
  "daily_limit": 10.00,
  "monthly_limit": 250.00,
  "per_request_limit": 1.00,
  "daily_spent": 3.45,
  "monthly_spent": 47.30,
  "daily_remaining": 6.55,
  "monthly_remaining": 202.70
}

Understanding the Response

  • daily_limit: Maximum daily spending (null = unlimited)
  • monthly_limit: Maximum monthly spending (null = unlimited)
  • per_request_limit: Maximum per-request cost (null = unlimited)
  • daily_spent: Amount spent today
  • monthly_spent: Amount spent this month
  • daily_remaining: Available daily budget
  • monthly_remaining: Available monthly budget
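The remaining fields appear to be the limit minus the amount spent (for example, 10.00 - 3.45 = 6.55 in the response above). The sketch below recomputes them client-side under that assumption. `remaining` is a hypothetical helper, and returning `None` for an unlimited budget is a choice made here, not documented API behavior.

```python
from decimal import Decimal


def remaining(limit, spent):
    """Return limit - spent as a Decimal, or None when the limit is null (unlimited)."""
    if limit is None:
        return None
    # Convert through str() so float inputs such as 3.45 keep their decimal value
    return Decimal(str(limit)) - Decimal(str(spent))


print(remaining(10.00, 3.45))    # 6.55
print(remaining(None, 3.45))     # None
```

Using `Decimal` avoids binary floating-point artifacts when subtracting currency amounts.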

Automatic Reset Behavior

Daily Reset

Daily spending counters reset at 00:00:00 UTC each day.

Day 1 (Jan 15):
├── 00:00:00 UTC: Counter resets to $0.00
├── 08:30:00 UTC: Spent $3.45
├── 14:20:00 UTC: Spent $2.10 (total: $5.55)
└── 23:59:59 UTC: Total spent: $8.23

Day 2 (Jan 16):
└── 00:00:00 UTC: Counter resets to $0.00 ← Automatic reset
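A client that wants to know when its daily budget becomes available again can compute the next reset locally. This is a minimal sketch based only on the 00:00:00 UTC rule stated above; `next_daily_reset` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone


def next_daily_reset(now):
    """Return the next 00:00:00 UTC boundary strictly after `now`."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    now_utc = now.astimezone(timezone.utc)
    midnight = now_utc.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + timedelta(days=1)


moment = datetime(2024, 1, 15, 23, 59, 59, tzinfo=timezone.utc)
print(next_daily_reset(moment).isoformat())  # 2024-01-16T00:00:00+00:00
```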

Monthly Reset

Monthly spending counters reset at 00:00:00 UTC on the 1st of each month.

January 2024:
├── Jan 01 00:00:00 UTC: Counter resets to $0.00
├── Jan 15 12:00:00 UTC: Spent $47.30
└── Jan 31 23:59:59 UTC: Total spent: $184.55

February 2024:
└── Feb 01 00:00:00 UTC: Counter resets to $0.00 ← Automatic reset
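The monthly boundary can be computed the same way. The sketch below assumes only the rule stated above (reset at 00:00:00 UTC on the 1st); `next_monthly_reset` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def next_monthly_reset(now):
    """Return 00:00:00 UTC on the 1st of the month following `now`."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    now_utc = now.astimezone(timezone.utc)
    if now_utc.month == 12:
        year, month = now_utc.year + 1, 1  # December rolls over to January
    else:
        year, month = now_utc.year, now_utc.month + 1
    return datetime(year, month, 1, tzinfo=timezone.utc)


moment = datetime(2024, 1, 31, 23, 59, 59, tzinfo=timezone.utc)
print(next_monthly_reset(moment).isoformat())  # 2024-02-01T00:00:00+00:00
```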

Manual Reset

Reset Spending Counters

Manually reset spending counters to zero:

curl -X POST https://api.polysystems.ai/api/keys/{key_id}/limits/reset \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

Response:

{
  "success": true,
  "message": "Spending counters reset successfully"
}

Note: This resets both daily and monthly counters to zero and updates the last reset timestamps.

Use Cases for Manual Reset:

  • Testing limit enforcement
  • Resetting after accidental spending
  • Starting fresh mid-period
  • Administrative corrections

Removing Spending Limits

Delete All Limits

Remove all spending restrictions from an access token:

curl -X DELETE https://api.polysystems.ai/api/keys/{key_id}/limits \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

Response:

{
  "success": true,
  "message": "Spending limits removed successfully"
}

After deletion, the token has no spending restrictions (limited only by account balance).

Limit Enforcement

How Limits are Checked

For each API request:

1. Calculate request cost
2. Check per-request limit
   ├── If cost > per_request_limit → REJECT
   └── If pass → Continue

3. Check daily limit
   ├── If (daily_spent + cost) > daily_limit → REJECT
   └── If pass → Continue

4. Check monthly limit
   ├── If (monthly_spent + cost) > monthly_limit → REJECT
   └── If pass → Continue

5. Check account balance
   ├── If balance < cost → REJECT
   └── If pass → Continue

6. Process request
7. Deduct cost from balance
8. Update spending counters
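The check order above can be mirrored client-side to predict whether a request would be accepted. This is an illustrative sketch of the documented steps, not the server's implementation; `check_request` and its return shape are hypothetical, and the field names follow the limits response shown earlier.

```python
from decimal import Decimal


def _d(value):
    """Convert a number to Decimal via str() to avoid float artifacts."""
    return Decimal(str(value))


def check_request(cost, limits, balance):
    """Return (allowed, reason), applying the checks in the documented order."""
    cost = _d(cost)

    # Step 2: per-request limit
    per_request = limits.get("per_request_limit")
    if per_request is not None and cost > _d(per_request):
        return False, "per_request_limit"

    # Steps 3 and 4: daily limit, then monthly limit
    for period in ("daily", "monthly"):
        limit = limits.get(f"{period}_limit")
        if limit is not None and _d(limits[f"{period}_spent"]) + cost > _d(limit):
            return False, f"{period}_limit"

    # Step 5: account balance
    if _d(balance) < cost:
        return False, "balance"

    return True, None


limits = {
    "daily_limit": 10.00, "monthly_limit": 250.00, "per_request_limit": 1.00,
    "daily_spent": 3.45, "monthly_spent": 47.30,
}
print(check_request(1.50, limits, balance=100))  # (False, 'per_request_limit')
print(check_request(0.50, limits, balance=100))  # (True, None)
```

Because the comparisons use `>`, a request that brings spending exactly to the limit is still allowed.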

Error Response: Limit Exceeded

When a spending limit is exceeded:

HTTP/1.1 402 Payment Required
Content-Type: application/json

{
  "error": "Spending limit exceeded",
  "message": "This access key has reached its spending limit. Please adjust limits or top up credits."
}

Error Response: Per-Request Limit Exceeded

{
  "error": "Per-request limit exceeded",
  "message": "This request costs $1.50 but the per-request limit is $1.00"
}
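A client can tell the two rejection types apart by inspecting the response body. The sketch below assumes that the per-request error is also returned with status 402 (consistent with the request flow described earlier) and that the `error` strings shown above are stable; this chapter guarantees neither. `classify_limit_error` is a hypothetical helper name.

```python
def classify_limit_error(status_code, body):
    """Map a response to 'per_request', 'spending_limit', 'other_402', or None."""
    if status_code != 402:
        return None  # not a payment or limit rejection
    error = (body or {}).get("error", "")
    if error == "Per-request limit exceeded":
        return "per_request"
    if error == "Spending limit exceeded":
        return "spending_limit"
    return "other_402"


print(classify_limit_error(402, {"error": "Per-request limit exceeded"}))  # per_request
print(classify_limit_error(200, {}))                                       # None
```

Retrying after a reset can only help in the `spending_limit` case. A `per_request` rejection will repeat until the request is made cheaper or the limit is raised.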

Recommended Limit Strategies

Development Tokens

Protect against runaway costs during development:

curl -X PUT https://api.polysystems.ai/api/keys/{dev_key_id}/limits \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "daily_limit": 1.00,
    "monthly_limit": 10.00,
    "per_request_limit": 0.10
  }'

Rationale:

  • Low daily limit prevents daily cost spikes
  • Monthly limit caps total development costs
  • Per-request limit prevents expensive test queries

Testing/CI Tokens

For automated testing and CI/CD pipelines:

curl -X PUT https://api.polysystems.ai/api/keys/{test_key_id}/limits \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "daily_limit": 5.00,
    "monthly_limit": 50.00,
    "per_request_limit": 0.50
  }'

Rationale:

  • Allows reasonable test coverage
  • Prevents infinite loops in tests
  • Caps monthly CI/CD costs

Staging Tokens

For staging/pre-production environments:

curl -X PUT https://api.polysystems.ai/api/keys/{staging_key_id}/limits \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "daily_limit": 25.00,
    "monthly_limit": 500.00,
    "per_request_limit": 2.00
  }'

Rationale:

  • Higher limits for realistic testing
  • Still protected from production-level costs
  • Allows performance testing

Production Tokens

For production applications with monitoring:

curl -X PUT https://api.polysystems.ai/api/keys/{prod_key_id}/limits \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "daily_limit": 100.00,
    "monthly_limit": 2000.00,
    "per_request_limit": 5.00
  }'

Rationale:

  • High enough for normal operation
  • Protects against unexpected spikes
  • Prevents catastrophic cost events
  • Still allows monitoring and alerts

Third-Party Integration Tokens

For external partners or customers:

curl -X PUT https://api.polysystems.ai/api/keys/{partner_key_id}/limits \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "daily_limit": 10.00,
    "monthly_limit": 200.00,
    "per_request_limit": 1.00
  }'

Rationale:

  • Clear budget communication
  • Prevents partner overuse
  • Predictable billing
  • Easy upgrade path

Monitoring Spending

Check Limits Programmatically

import os
import requests

def check_spending_status(key_id):
    """Check spending status for an access key"""
    response = requests.get(
        f'https://api.polysystems.ai/api/keys/{key_id}/limits',
        headers={'Authorization': f'Bearer {os.getenv("PS_JWT_TOKEN")}'},
        timeout=10
    )
    # Fail loudly on HTTP errors instead of parsing an error body as limits
    response.raise_for_status()

    limits = response.json()

    # Calculate usage percentages (a null limit means unlimited and is skipped)
    if limits['daily_limit']:
        daily_percent = (limits['daily_spent'] / limits['daily_limit']) * 100
        print(f"Daily: ${limits['daily_spent']:.2f} / ${limits['daily_limit']:.2f} ({daily_percent:.1f}%)")

    if limits['monthly_limit']:
        monthly_percent = (limits['monthly_spent'] / limits['monthly_limit']) * 100
        print(f"Monthly: ${limits['monthly_spent']:.2f} / ${limits['monthly_limit']:.2f} ({monthly_percent:.1f}%)")

    return limits

# Usage
limits = check_spending_status('key-123')

Alert on High Usage

import os
import requests

def check_spending_alerts(key_id, warning_threshold=0.8):
    """Alert when spending reaches threshold"""
    response = requests.get(
        f'https://api.polysystems.ai/api/keys/{key_id}/limits',
        headers={'Authorization': f'Bearer {os.getenv("PS_JWT_TOKEN")}'},
        timeout=10
    )
    # Fail loudly on HTTP errors instead of parsing an error body as limits
    response.raise_for_status()

    limits = response.json()
    alerts = []

    # Check daily limit (skipped when the limit is null, i.e. unlimited)
    if limits['daily_limit'] and limits['daily_spent'] >= limits['daily_limit'] * warning_threshold:
        alerts.append({
            'type': 'daily',
            'spent': limits['daily_spent'],
            'limit': limits['daily_limit'],
            'percentage': (limits['daily_spent'] / limits['daily_limit']) * 100
        })

    # Check monthly limit (skipped when the limit is null, i.e. unlimited)
    if limits['monthly_limit'] and limits['monthly_spent'] >= limits['monthly_limit'] * warning_threshold:
        alerts.append({
            'type': 'monthly',
            'spent': limits['monthly_spent'],
            'limit': limits['monthly_limit'],
            'percentage': (limits['monthly_spent'] / limits['monthly_limit']) * 100
        })

    for alert in alerts:
        print(f"⚠️  {alert['type'].upper()} SPENDING ALERT")
        print(f"   Spent: ${alert['spent']:.2f} / ${alert['limit']:.2f} ({alert['percentage']:.1f}%)")

    return alerts

# Run periodically
if __name__ == '__main__':
    check_spending_alerts('key-123', warning_threshold=0.8)

Multi-Token Budget Management

Strategy: Multiple Tokens with Different Limits

import requests
import os
 
class TokenBudgetManager:
    def __init__(self, jwt_token):
        self.jwt_token = jwt_token
        self.base_url = "https://api.polysystems.ai"
        self.headers = {
            "Authorization": f"Bearer {jwt_token}",
            "Content-Type": "application/json"
        }
    
    def create_token_with_limits(self, name, daily_limit, monthly_limit, per_request_limit):
        """Create new token and set limits"""
        # Create token
        response = requests.post(
            f"{self.base_url}/api/keys",
            headers=self.headers,
            json={"name": name}
        )
        # Stop here if token creation failed, so no limits call is made with a bad id
        response.raise_for_status()
        token_data = response.json()
        key_id = token_data['id']

        # Set limits
        limits_response = requests.put(
            f"{self.base_url}/api/keys/{key_id}/limits",
            headers=self.headers,
            json={
                "daily_limit": daily_limit,
                "monthly_limit": monthly_limit,
                "per_request_limit": per_request_limit
            }
        )
        limits_response.raise_for_status()

        return {
            'token': token_data,
            'limits': limits_response.json()
        }
    
    def get_all_token_spending(self):
        """Get spending across all active tokens"""
        response = requests.get(
            f"{self.base_url}/api/keys",
            headers=self.headers
        )
        response.raise_for_status()
        # Only active tokens are included in the totals
        active_tokens = [t for t in response.json() if t['is_active']]

        total_daily = 0
        total_monthly = 0

        for token in active_tokens:
            limits_response = requests.get(
                f"{self.base_url}/api/keys/{token['id']}/limits",
                headers=self.headers
            )
            limits_response.raise_for_status()
            limits = limits_response.json()
            total_daily += limits['daily_spent']
            total_monthly += limits['monthly_spent']

        return {
            'total_daily_spent': total_daily,
            'total_monthly_spent': total_monthly,
            'active_tokens': len(active_tokens)
        }
 
# Usage
manager = TokenBudgetManager(os.getenv('PS_JWT_TOKEN'))
 
# Create environment-specific tokens
dev_token = manager.create_token_with_limits(
    name="Development",
    daily_limit=1.00,
    monthly_limit=10.00,
    per_request_limit=0.10
)
 
staging_token = manager.create_token_with_limits(
    name="Staging",
    daily_limit=25.00,
    monthly_limit=500.00,
    per_request_limit=2.00
)
 
prod_token = manager.create_token_with_limits(
    name="Production",
    daily_limit=100.00,
    monthly_limit=2000.00,
    per_request_limit=5.00
)
 
# Monitor total spending
spending = manager.get_all_token_spending()
print(f"Total daily spending: ${spending['total_daily_spent']:.2f}")
print(f"Total monthly spending: ${spending['total_monthly_spent']:.2f}")

Best Practices

1. Always Set Limits for Development

# ✅ Good: Protected development token
curl -s -X POST https://api.polysystems.ai/api/keys \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name": "Development"}' \
  | jq -r '.id' \
  | xargs -I {} curl -X PUT https://api.polysystems.ai/api/keys/{}/limits \
    -H "Authorization: Bearer $JWT_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"daily_limit": 1.00, "per_request_limit": 0.10}'

# ❌ Bad: No limits on development token
curl -X POST https://api.polysystems.ai/api/keys \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name": "Development"}'

2. Increase Limits Gradually

Start conservative, increase as needed:

# Week 1: Conservative limits
curl -X PUT https://api.polysystems.ai/api/keys/{key_id}/limits \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"daily_limit": 10.00, "monthly_limit": 100.00}'

# Week 2: After monitoring usage
curl -X PUT https://api.polysystems.ai/api/keys/{key_id}/limits \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"daily_limit": 25.00, "monthly_limit": 250.00}'

# Week 3: Final production limits
curl -X PUT https://api.polysystems.ai/api/keys/{key_id}/limits \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"daily_limit": 100.00, "monthly_limit": 2000.00}'

3. Use Per-Request Limits for LLM Calls

Prevent expensive single requests:

curl -X PUT https://api.polysystems.ai/api/keys/{key_id}/limits \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "per_request_limit": 1.00
  }'

This prevents:

  • Infinite token generation
  • Unexpectedly large context windows
  • Expensive model calls (GPT-4, etc.)

4. Document Your Limit Strategy

# limits_config.yaml
tokens:
  development:
    daily_limit: 1.00
    monthly_limit: 10.00
    per_request_limit: 0.10
    reason: "Protect against dev mistakes"
  
  testing:
    daily_limit: 5.00
    monthly_limit: 50.00
    per_request_limit: 0.50
    reason: "Allow CI/CD testing"
  
  staging:
    daily_limit: 25.00
    monthly_limit: 500.00
    per_request_limit: 2.00
    reason: "Mirror production scale"
  
  production:
    daily_limit: 100.00
    monthly_limit: 2000.00
    per_request_limit: 5.00
    reason: "Normal operation with safety net"
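A documented strategy can also be checked automatically before it is applied. The sketch below mirrors the YAML above as a Python dict (to avoid depending on a YAML parser) and runs two sanity checks: no negative values, and per-request ≤ daily ≤ monthly. The ordering rule is a convention assumed here, not an API requirement, and `validate_profile` is a hypothetical helper name.

```python
LIMITS_CONFIG = {
    "development": {"daily_limit": 1.00, "monthly_limit": 10.00, "per_request_limit": 0.10},
    "testing": {"daily_limit": 5.00, "monthly_limit": 50.00, "per_request_limit": 0.50},
    "staging": {"daily_limit": 25.00, "monthly_limit": 500.00, "per_request_limit": 2.00},
    "production": {"daily_limit": 100.00, "monthly_limit": 2000.00, "per_request_limit": 5.00},
}

# Fields ordered from the smallest expected value to the largest
ORDERED_FIELDS = ["per_request_limit", "daily_limit", "monthly_limit"]


def validate_profile(name, profile):
    """Return a list of human-readable problems found in one profile."""
    problems = []
    for field in ORDERED_FIELDS:
        value = profile.get(field)
        if value is not None and value < 0:
            problems.append(f"{name}: {field} is negative")

    # Compare each set (non-null) limit with the next larger-scope limit
    set_values = [(f, profile[f]) for f in ORDERED_FIELDS if profile.get(f) is not None]
    for (small_name, small), (large_name, large) in zip(set_values, set_values[1:]):
        if small > large:
            problems.append(f"{name}: {small_name} exceeds {large_name}")
    return problems


for env, profile in LIMITS_CONFIG.items():
    for problem in validate_profile(env, profile):
        print(problem)
```

With the values above, the loop prints nothing because every profile passes both checks.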

5. Monitor and Adjust

#!/bin/bash
# monitor_limits.sh

JWT_TOKEN="$PS_JWT_TOKEN"
API_URL="https://api.polysystems.ai"

echo "=== Spending Limits Report ==="
echo "Generated: $(date)"
echo ""

# Get all tokens
TOKENS=$(curl -s "$API_URL/api/keys" -H "Authorization: Bearer $JWT_TOKEN")

# Report spent / limit for every active token (a limit of "null" means unlimited)
echo "$TOKENS" | jq -r '.[] | select(.is_active == true) | .id' | while read -r KEY_ID; do
  LIMITS=$(curl -s "$API_URL/api/keys/$KEY_ID/limits" -H "Authorization: Bearer $JWT_TOKEN")

  NAME=$(echo "$TOKENS" | jq -r --arg id "$KEY_ID" '.[] | select(.id == $id) | .name')

  echo "Token: $NAME"
  echo "  Daily: $(echo "$LIMITS" | jq -r '.daily_spent') / $(echo "$LIMITS" | jq -r '.daily_limit')"
  echo "  Monthly: $(echo "$LIMITS" | jq -r '.monthly_spent') / $(echo "$LIMITS" | jq -r '.monthly_limit')"
  echo ""
done

Troubleshooting

Problem: Requests Rejected Despite Available Balance

Symptoms: The API returns 402 Payment Required even though the account has sufficient balance.

Solution: Check spending limits:

curl -X GET https://api.polysystems.ai/api/keys/{key_id}/limits \
  -H "Authorization: Bearer $JWT_TOKEN"

If a limit has been exceeded, either:

  1. Wait for the automatic reset (daily or monthly)
  2. Manually reset the counters
  3. Increase the limits
  4. Use a different token
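To see which limit is responsible, the limits response can be compared against the expected cost of the failing request. The sketch below is a client-side diagnosis under the enforcement rules described in this chapter. `blocking_limits` is a hypothetical helper, and the expected cost has to be estimated by the caller.

```python
from decimal import Decimal


def blocking_limits(limits, expected_cost):
    """Return the name of every limit that would reject a request of `expected_cost`."""
    cost = Decimal(str(expected_cost))
    blocked = []

    per_request = limits.get("per_request_limit")
    if per_request is not None and cost > Decimal(str(per_request)):
        blocked.append("per_request_limit")

    for period in ("daily", "monthly"):
        limit = limits.get(f"{period}_limit")
        spent = limits.get(f"{period}_spent", 0)
        if limit is not None and Decimal(str(spent)) + cost > Decimal(str(limit)):
            blocked.append(f"{period}_limit")

    return blocked


limits = {
    "daily_limit": 10.00, "monthly_limit": 250.00, "per_request_limit": 1.00,
    "daily_spent": 9.80, "monthly_spent": 47.30,
}
print(blocking_limits(limits, 0.50))  # ['daily_limit']
```

A limit can block a request before it is fully spent. Here $0.20 of the daily budget remains, but a $0.50 request would exceed it.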

Problem: Can’t Set Negative Limits

Error: “Invalid limit: limit cannot be negative”

Solution: Use null for unlimited or positive values only:

curl -X PUT https://api.polysystems.ai/api/keys/{key_id}/limits \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"daily_limit": null}'  # Not -1 or negative

Problem: Limits Not Enforced

Possible Causes:

  1. Limits not set (all null)
  2. Request cost below limits
  3. Spending counter reset recently

Check:

# Verify limits are set
curl -X GET https://api.polysystems.ai/api/keys/{key_id}/limits \
  -H "Authorization: Bearer $JWT_TOKEN"
 
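# If daily_limit, monthly_limit, and per_request_limit are all null in the response,
# no limits are set and nothing will be enforced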
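The "all limits are null" case is easy to detect in code once the response has been parsed. A minimal sketch, with `limits_are_unset` as a hypothetical helper name and field names taken from the limits response shown earlier:

```python
LIMIT_FIELDS = ("daily_limit", "monthly_limit", "per_request_limit")


def limits_are_unset(limits):
    """Return True when every limit in the response is null (None), i.e. unlimited."""
    return all(limits.get(field) is None for field in LIMIT_FIELDS)


print(limits_are_unset({"daily_limit": None, "monthly_limit": None, "per_request_limit": None}))  # True
print(limits_are_unset({"daily_limit": 10.00, "monthly_limit": None, "per_request_limit": None}))  # False
```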

Summary

In this chapter, you learned:

  • ✅ What spending limits are and why they’re important
  • ✅ Three types of limits: daily, monthly, and per-request
  • ✅ How to set, update, and remove spending limits
  • ✅ Automatic reset behavior and manual reset options
  • ✅ How limits are enforced in the request pipeline
  • ✅ Recommended limit strategies for different environments
  • ✅ Monitoring and alerting on spending
  • ✅ Multi-token budget management
  • ✅ Best practices and troubleshooting

Next Steps