A production-grade RAG system where multiple organizations share the same AI infrastructure — but their data is completely isolated from each other.
Built as a personal project to combine everything I was learning about RAG, AWS, and AI security.
Upload your organization's documents. Ask questions. Get answers — strictly from your documents, never anyone else's.
Multiple companies can use the same deployment simultaneously. Company A cannot see Company B's data. Not just filtered — physically isolated at every layer.
Every chatbot you see built with OpenAI or Claude does this:
User question → API call → Answer
One user, one AI, no concept of ownership or isolation. This project does this instead:
Verified user (JWT + tenant_id)
↓
Their documents only (Pinecone namespace)
↓
AI answers only from those docs
↓
Every query logged to DynamoDB
The difference is ownership and isolation. This system knows who is asking, which company they belong to, what data they're allowed to see, and what the AI said.
User Login
↓
Layer 1 — AWS Cognito: per-tenant user pool + JWT with tenant_id
↓
Layer 2 — API Gateway: JWT validated at the infrastructure level
↓
Layer 3 — S3 + Lambda: per-tenant buckets + Bedrock embeddings
↓
Layer 4 — Pinecone: namespace = tenant_id (two isolation locks)
↓
Layer 5 — AI agent: AWS Bedrock Nova Micro, scoped to tenant docs
↓
Layer 6 — DynamoDB: every query logged for compliance
↓
Layer 7 — Frontend: login + signup + chat UI
- AWS account with the CLI configured (`aws configure`)
- Pinecone account (free tier works)
- Python 3.11

```
pip install fastapi uvicorn mangum boto3 pinecone pypdf \
    python-jose[cryptography] httpx python-dotenv python-multipart
```

Create a `.env` file in each layer folder:
```
POOL_ID=us-east-1_XXXXXXX
CLIENT_ID=your_cognito_app_client_id
TENANT_ID=your-org-name
REGION=us-east-1
PINECONE_API_KEY=your_pinecone_api_key
PINECONE_INDEX=rag-index
```

```
cd layer1-auth
python cognito_setup.py
```

Copy the `POOL_ID` and `CLIENT_ID` printed in the output into your `.env`.
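What makes the tenant claim trustworthy is how the user pool attribute is defined. A minimal sketch of the schema that `cognito_setup.py` plausibly configures (the exact script contents are not shown here; parameter names follow boto3's `cognito-idp` `create_user_pool` API):

```python
def tenant_attribute_schema() -> list[dict]:
    """Custom attribute definition for the Cognito user pool.
    Mutable=False is what makes the tenant claim immutable after signup.
    (A sketch of an assumed setup, not the repo's verbatim script.)"""
    return [{
        "Name": "tenant_id",            # surfaces in JWTs as custom:tenant_id
        "AttributeDataType": "String",
        "Mutable": False,               # cannot be changed after account creation
        "Required": False,
    }]

# boto3.client("cognito-idp").create_user_pool(
#     PoolName="rag-tenant-pool", Schema=tenant_attribute_schema())
```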
```
cd layer3-storage
python s3_setup.py
```

```
cd layer2-gateway
uvicorn lambda_handler:app --reload
```

The server runs at http://127.0.0.1:8000.
Open layer7-frontend/login.html with Live Server in VS Code.
- Sign up with your email, password, and organization name
- Log in
- Upload a PDF or text file
- Ask questions about it
Every request carries a JWT issued by AWS Cognito. The tenant_id is stamped into the token at account creation as an immutable custom attribute, so it cannot be changed after signup.
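A minimal sketch of pulling the tenant out of the token. For illustration this decodes the payload without verifying it; the real handler must first verify the RS256 signature against the Cognito JWKS (e.g. with python-jose's `jwt.decode`) before trusting any claim:

```python
import base64
import json

def jwt_claims_unverified(token: str) -> dict:
    """Decode the JWT payload segment. NOTE: no signature check here;
    in production, verify against the Cognito JWKS before trusting claims."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))

def tenant_from_claims(claims: dict) -> str:
    """Fail closed: a token with no tenant claim gets no access."""
    tenant_id = claims.get("custom:tenant_id")
    if not tenant_id:
        raise PermissionError("JWT carries no custom:tenant_id claim")
    return tenant_id
```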
```python
# On every request: verify the JWT, then extract the tenant
tenant_id = payload.get('custom:tenant_id')
# Never trust client-supplied tenant info — only the verified JWT
```

Documents are stored in per-tenant S3 buckets and indexed in Pinecone under a namespace scoped to the tenant:
```python
# Two independent isolation locks on every query
results = index.query(
    vector=query_embedding,
    namespace=tenant_id,              # Lock 1 — namespace
    filter={'tenant_id': tenant_id},  # Lock 2 — metadata
    top_k=5
)
```

The AI agent is sandboxed via its system prompt: it can only answer from retrieved chunks, which are already scoped to the tenant's namespace.
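The sandboxing can be sketched as a prompt builder that admits only the retrieved, tenant-scoped chunks as context. The wording below is illustrative; the repo's actual prompt is not shown here:

```python
def build_system_prompt(tenant_id: str, chunks: list[str]) -> str:
    """Assemble the sandbox prompt: the model may answer only from the
    retrieved chunks, which retrieval has already scoped to this tenant."""
    context = "\n\n".join(f"[doc {i}] {c}" for i, c in enumerate(chunks, 1))
    return (
        f"You are a document assistant for tenant '{tenant_id}'.\n"
        "Answer ONLY from the context below. If the answer is not in the "
        "context, say you don't know. Never use outside knowledge.\n\n"
        f"Context:\n{context}"
    )
```

Because the chunks come from the tenant's own Pinecone namespace, even a prompt-injection attempt cannot surface another tenant's documents — they were never retrieved in the first place.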
| Method | Endpoint | Auth | Description |
|---|---|---|---|
| POST | /auth/signup | None | Create account + tenant |
| POST | /auth/login | None | Log in + get JWT |
| GET | /api/me | JWT | Get current user + tenant |
| POST | /api/ask | JWT | Ask a question |
| POST | /api/upload | JWT | Upload + index a document |
| GET | /health | None | Health check |
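A sketch of calling the authenticated endpoints from a client, using only the standard library. Note the client never sends a tenant; the server derives it from the verified JWT. The request body field name `question` is an assumption about the API schema:

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # local uvicorn dev server

def ask_request(question: str, jwt_token: str) -> urllib.request.Request:
    """Build the POST /api/ask request with the Cognito JWT as a Bearer token."""
    return urllib.request.Request(
        f"{BASE_URL}/api/ask",
        data=json.dumps({"question": question}).encode(),
        headers={
            "Authorization": f"Bearer {jwt_token}",  # from /auth/login
            "Content-Type": "application/json",
        },
        method="POST",
    )

# With the dev server running:
# with urllib.request.urlopen(ask_request("What does the contract say?", token)) as r:
#     print(json.load(r))
```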
- Security has to be designed in from day one — not added later
- RAG quality depends heavily on chunk size (150 words works better than 500)
- AWS Bedrock is great for portfolio projects — no new accounts, pure IAM
- The hardest part wasn't the AI — it was making the tenant_id flow correctly through every layer
- Anyone can build a chatbot. The interesting problem is what happens when 100 companies share the same AI
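The chunk-size lesson above can be sketched as a simple word-based splitter. The 150-word figure is from the lessons learned; the overlap value here is an illustrative choice, not necessarily what the repo uses:

```python
def chunk_words(text: str, size: int = 150, overlap: int = 30) -> list[str]:
    """Split text into ~size-word chunks with a small overlap so answers
    that straddle a chunk boundary are still retrievable."""
    words = text.split()
    step = size - overlap
    return [
        " ".join(words[i:i + size])
        for i in range(0, max(len(words) - overlap, 1), step)
    ]
```

Smaller chunks keep each embedding focused on one idea, which tends to improve retrieval precision over 500-word chunks that blend several topics into a single vector.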
Near zero. Everything runs within the AWS free tier except Bedrock, which charges per token. For testing and portfolio use, expect less than $1 total.
Full writeup on Medium — https://medium.com/@cayalaso/i-built-a-secure-multi-tenant-rag-system-on-aws-9f1bb7e25f20
MIT — use it, fork it, build on it.