Implemented authenticated demo

Prompt Cost Guard

A lightweight AI governance demo that estimates token volume and cost, returns an allow/review decision, and records the check without calling a paid model.

Status: Implemented demo

AWS focus: Cognito, API Gateway, Lambda, DynamoDB, Budgets

Architecture flow: Prompt text -> Cognito + API Gateway -> Lambda cost estimator -> DynamoDB TTL record -> ALLOW or REVIEW decision

Problem

Generative AI demos can become expensive quickly if public users can send unlimited prompts to paid models. The safer first step is to put the budget decision in front of inference instead of after the bill arrives.

Design

Cognito limits access to approved reviewer identities.
Lambda estimates input and output tokens from the submitted prompt.
A model tier determines the estimated per-thousand-token cost.
The API returns ALLOW for low-cost requests and REVIEW when the prompt exceeds the public demo threshold.
DynamoDB stores a short-lived audit record with the estimate, decision, and selected tier.
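
The estimate-and-decide steps above can be sketched as a small pure function. This is an illustration, not the demo's actual code: the tier names, the per-thousand-token prices, the four-characters-per-token heuristic, the fixed output budget, the cost threshold, and the 24-hour retention are all assumptions chosen for the example.

```python
import math
import time
from dataclasses import dataclass, asdict
from decimal import Decimal
from typing import Optional

# Hypothetical prices in USD per 1,000 tokens; not real Bedrock pricing.
TIER_PRICE_PER_1K = {
    "small": {"input": 0.0005, "output": 0.0015},
    "large": {"input": 0.0030, "output": 0.0150},
}
CHARS_PER_TOKEN = 4             # assumed rough heuristic, not a real tokenizer
ASSUMED_OUTPUT_TOKENS = 500     # assumed fixed output budget per request
REVIEW_THRESHOLD_USD = 0.005    # assumed public demo threshold
RECORD_TTL_SECONDS = 24 * 3600  # assumed retention for the audit record


@dataclass
class Estimate:
    tier: str
    input_tokens: int
    output_tokens: int
    estimated_cost_usd: float
    decision: str
    expires_at: int  # epoch seconds, suitable for a DynamoDB TTL attribute


def estimate_prompt(prompt: str, tier: str, now: Optional[float] = None) -> Estimate:
    """Estimate tokens and cost for a prompt and return ALLOW or REVIEW."""
    price = TIER_PRICE_PER_1K[tier]  # raises KeyError for an unknown tier
    input_tokens = math.ceil(len(prompt) / CHARS_PER_TOKEN)
    output_tokens = ASSUMED_OUTPUT_TOKENS
    cost = (input_tokens / 1000) * price["input"] \
         + (output_tokens / 1000) * price["output"]
    decision = "ALLOW" if cost <= REVIEW_THRESHOLD_USD else "REVIEW"
    now = time.time() if now is None else now
    return Estimate(tier, input_tokens, output_tokens, round(cost, 6),
                    decision, int(now) + RECORD_TTL_SECONDS)


def to_audit_item(user_id: str, estimate: Estimate) -> dict:
    """Build the audit record; the cost is a Decimal because boto3's
    DynamoDB resource API rejects Python floats."""
    item = asdict(estimate)
    item["user_id"] = user_id  # hypothetical key name
    item["estimated_cost_usd"] = Decimal(str(estimate.estimated_cost_usd))
    return item
```

With these assumed numbers, a 400-character prompt on the "small" tier is allowed, while the same prompt on the "large" tier is sent to review because the assumed output budget dominates the cost. The `expires_at` value only causes automatic expiry if it is the attribute configured as the table's TTL attribute.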

Cost controls

No paid model is invoked. Reviewer accounts are limited to three estimates per day, prompts are capped at 4,000 characters, and the estimate record expires automatically. This keeps the live demo effectively free while showing how a real Bedrock gate would be designed.
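
A minimal sketch of the two request-level limits described above, the 4,000-character cap and the three-estimates-per-day quota. The class and function names are hypothetical, and the in-memory counter is only a stand-in: Lambda does not keep state between invocations reliably, so a deployed version would need a persistent counter, for example a DynamoDB item updated conditionally.

```python
from datetime import datetime, timezone
from typing import Optional

MAX_PROMPT_CHARS = 4000    # cap stated in the text
MAX_ESTIMATES_PER_DAY = 3  # daily quota stated in the text


class PromptTooLong(Exception):
    pass


class QuotaExceeded(Exception):
    pass


class DailyQuota:
    """In-memory stand-in for a per-user daily counter (hypothetical)."""

    def __init__(self) -> None:
        self._counts = {}  # (user_id, ISO date) -> estimates used

    def consume(self, user_id: str, now: Optional[datetime] = None) -> int:
        """Use one estimate and return how many remain today (UTC)."""
        now = now or datetime.now(timezone.utc)
        key = (user_id, now.date().isoformat())
        used = self._counts.get(key, 0)
        if used >= MAX_ESTIMATES_PER_DAY:
            raise QuotaExceeded(f"{user_id} has used all estimates for {key[1]}")
        self._counts[key] = used + 1
        return MAX_ESTIMATES_PER_DAY - (used + 1)


def check_request(user_id: str, prompt: str, quota: DailyQuota,
                  now: Optional[datetime] = None) -> int:
    """Validate length first so an oversized prompt does not consume quota."""
    if len(prompt) > MAX_PROMPT_CHARS:
        raise PromptTooLong(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    return quota.consume(user_id, now)
```

Checking the length before touching the counter is a design choice in this sketch: a rejected prompt costs the reviewer nothing, so the quota only counts estimates that were actually produced.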

Next step

A production version could add a Bedrock model allowlist, AWS Budgets alerts, per-user monthly spend records, and prompt template versioning before enabling real inference.