chakrinee/multi-tenant-rag

🥷 Secure Multi-Tenant Agentic RAG on AWS

A production-grade RAG system where multiple organizations share the same AI infrastructure — but their data is completely isolated from each other.

Built as a personal project to combine everything I was learning about RAG, AWS, and AI security.


What it does

Upload your organization's documents. Ask questions. Get answers — strictly from your documents, never anyone else's.

Multiple companies can use the same deployment simultaneously. Company A cannot see Company B's data. Tenants are not just filtered at query time: each one gets its own Cognito user pool, S3 bucket, and Pinecone namespace.


How is this different from a normal AI chatbot?

A typical chatbot built on the OpenAI or Claude API does this:

User question → API call → Answer

One user, one AI, no concept of ownership or isolation. This project does this instead:

Verified user (JWT + tenant_id)
    ↓
Their documents only (Pinecone namespace)
    ↓
AI answers only from those docs
    ↓
Every query logged to DynamoDB

The difference is ownership and isolation. This system knows who is asking, which company they belong to, what data they're allowed to see, and what the AI said.


Architecture

User Login
    ↓
Layer 1: AWS Cognito
    Per-tenant User Pool + JWT with tenant_id
    ↓
Layer 2: API Gateway
    JWT validated at infrastructure level
    ↓
Layer 3: S3 + Lambda
    Per-tenant buckets + Bedrock embeddings
    ↓
Layer 4: Pinecone
    namespace = tenant_id (two isolation locks)
    ↓
Layer 5: AI Agent
    AWS Bedrock Nova Micro, scoped to tenant docs
    ↓
Layer 6: DynamoDB
    Every query logged for compliance
    ↓
Layer 7: Frontend
    Login + Signup + Chat UI
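As an illustration of Layer 6, the sketch below builds one audit record per question as a plain dictionary. The attribute names (`tenant_id`, `query_id`, `asked_at`, `user_email`, `question`, `answer`) and the helper `build_audit_record` are assumptions made for this example, not the project's actual table schema. With boto3, a record like this could be written using a DynamoDB table's `put_item(Item=record)`.

```python
import uuid
from datetime import datetime, timezone


def build_audit_record(tenant_id: str, user_email: str, question: str, answer: str) -> dict:
    """Return one audit-log item for a single question/answer exchange."""
    if not tenant_id:
        # A log entry without a tenant cannot be attributed to anyone.
        raise ValueError("tenant_id is required for every audit record")
    return {
        "tenant_id": tenant_id,                                # hypothetical partition key
        "query_id": str(uuid.uuid4()),                         # hypothetical sort key
        "asked_at": datetime.now(timezone.utc).isoformat(),    # UTC timestamp
        "user_email": user_email,
        "question": question,
        "answer": answer,
    }


record = build_audit_record("acme", "ana@acme.example", "What is our refund policy?", "30 days.")
print(record["tenant_id"], record["asked_at"])
```

Keying the record on the tenant first means a per-tenant compliance export becomes a single-partition read, assuming the table is laid out that way.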

Setup

Prerequisites

  • AWS account with CLI configured (aws configure)
  • Pinecone account (free tier works)
  • Python 3.11

Install dependencies

pip install fastapi uvicorn mangum boto3 pinecone pypdf \
  "python-jose[cryptography]" httpx python-dotenv python-multipart

Environment variables

Create a .env file in each layer folder:

POOL_ID=us-east-1_XXXXXXX
CLIENT_ID=your_cognito_app_client_id
TENANT_ID=your-org-name
REGION=us-east-1
PINECONE_API_KEY=your_pinecone_api_key
PINECONE_INDEX=rag-index
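A missing variable is easier to debug at startup than in the middle of a request. The sketch below reads the six variables listed above and fails fast if any is absent. `Settings` and `load_settings` are hypothetical names for this example; the project may load its configuration differently. If you use python-dotenv, calling `load_dotenv()` beforehand copies the `.env` values into `os.environ`.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED = ("POOL_ID", "CLIENT_ID", "TENANT_ID", "REGION", "PINECONE_API_KEY", "PINECONE_INDEX")


@dataclass(frozen=True)
class Settings:
    pool_id: str
    client_id: str
    tenant_id: str
    region: str
    pinecone_api_key: str
    pinecone_index: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping, failing fast on gaps."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Field names are the lowercase form of the variable names.
    return Settings(**{name.lower(): env[name] for name in REQUIRED})


example_env = {
    "POOL_ID": "us-east-1_XXXXXXX",
    "CLIENT_ID": "your_cognito_app_client_id",
    "TENANT_ID": "your-org-name",
    "REGION": "us-east-1",
    "PINECONE_API_KEY": "your_pinecone_api_key",
    "PINECONE_INDEX": "rag-index",
}
settings = load_settings(example_env)
print(settings.region, settings.pinecone_index)
```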

Running locally

Step 1 — Provision AWS Cognito

cd layer1-auth
python cognito_setup.py

Copy the POOL_ID and CLIENT_ID printed in the output into your .env.

Step 2 — Create S3 bucket

cd layer3-storage
python s3_setup.py

Step 3 — Start the API server

cd layer2-gateway
uvicorn lambda_handler:app --reload

Server runs at http://127.0.0.1:8000

Step 4 — Open the frontend

Open layer7-frontend/login.html with Live Server in VS Code.

  • Sign up with your email, password, and organization name
  • Log in
  • Upload a PDF or text file
  • Ask questions about it

How tenant isolation works

Every request carries a JWT issued by AWS Cognito. The tenant_id is written to the account as an immutable custom attribute at signup, so it arrives as the custom:tenant_id claim in the token and cannot be changed afterwards.

# Every request — verify JWT and extract tenant
tenant_id = payload.get('custom:tenant_id')
# Never trust client-supplied tenant info — only the verified JWT
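To make the rule concrete, here is a minimal sketch of the step that follows signature verification. It assumes `claims` is the already verified token payload (for example, decoded with python-jose against the user pool's public keys). `TenantError`, `tenant_from_claims`, and `resolve_tenant` are hypothetical names, not functions from this repository.

```python
from typing import Any, Mapping, Optional


class TenantError(Exception):
    """Raised when a request cannot be tied to exactly one tenant."""


def tenant_from_claims(claims: Mapping[str, Any]) -> str:
    """Read the tenant from verified JWT claims, rejecting empty values."""
    tenant_id = claims.get("custom:tenant_id")
    if not isinstance(tenant_id, str) or not tenant_id.strip():
        raise TenantError("verified token has no custom:tenant_id claim")
    return tenant_id


def resolve_tenant(claims: Mapping[str, Any], client_supplied: Optional[str] = None) -> str:
    """Return the tenant from the token; a client-sent value is never used."""
    tenant_id = tenant_from_claims(claims)
    # A mismatch is either a bug or a probe, so reject it loudly.
    if client_supplied is not None and client_supplied != tenant_id:
        raise TenantError("client-supplied tenant does not match the verified token")
    return tenant_id


print(resolve_tenant({"custom:tenant_id": "acme"}))
```

The client-supplied value is only ever compared against the token, never returned, so a forged request body cannot widen access.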

Documents are stored in per-tenant S3 buckets and indexed in Pinecone under a namespace scoped to the tenant:

# Two independent isolation locks on every query
results = index.query(
    vector=query_embedding,
    namespace=tenant_id,              # Lock 1 — namespace
    filter={'tenant_id': tenant_id},  # Lock 2 — metadata
    top_k=5
)
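One way to keep both locks from drifting apart is to build the query arguments in a single place. The sketch below does that and adds a defensive re-check of the results. `build_tenant_query` and `keep_tenant_matches` are hypothetical helpers, and matches are modelled as plain dictionaries with a `metadata` field purely for illustration.

```python
from typing import Any, Iterable, List, Mapping, Sequence


def build_tenant_query(tenant_id: str, query_embedding: Sequence[float], top_k: int = 5) -> dict:
    """Return keyword arguments for index.query with both locks set."""
    if not tenant_id:
        # An empty namespace would not point at any tenant's partition.
        raise ValueError("refusing to query without a tenant_id")
    return {
        "vector": list(query_embedding),
        "namespace": tenant_id,                        # Lock 1: namespace
        "filter": {"tenant_id": {"$eq": tenant_id}},   # Lock 2: metadata
        "top_k": top_k,
        "include_metadata": True,
    }


def keep_tenant_matches(tenant_id: str, matches: Iterable[Mapping[str, Any]]) -> List[Mapping[str, Any]]:
    """Drop any match whose metadata does not name this tenant."""
    return [m for m in matches if m.get("metadata", {}).get("tenant_id") == tenant_id]


kwargs = build_tenant_query("acme", [0.1, 0.2, 0.3])
print(kwargs["namespace"], kwargs["filter"])
```

The arguments would then be passed as `index.query(**kwargs)`. The post-filter should never remove anything if both locks hold, which makes it a cheap tripwire for indexing mistakes.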

The AI agent is constrained by its system prompt to answer only from the retrieved chunks, which are already scoped to the tenant's namespace.
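A rough sketch of how such a prompt could be assembled from the retrieved chunks follows. The prompt wording and the helper `build_user_message` are assumptions for illustration; the project's actual prompt may differ. Note that the prompt is a soft constraint: the hard boundary is that only the tenant's own chunks ever reach the model.

```python
from typing import Sequence

# Hypothetical wording, not the project's actual system prompt.
SYSTEM_PROMPT = (
    "You answer questions for one organization. Use only the numbered context "
    "passages provided. If they do not contain the answer, say you do not know."
)


def build_user_message(question: str, chunks: Sequence[str]) -> str:
    """Combine retrieved chunks and the question into one user message."""
    if chunks:
        # Number the passages so the model can refer to them.
        context = "\n\n".join(f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1))
    else:
        context = "(no passages were retrieved)"
    return f"Context:\n{context}\n\nQuestion: {question.strip()}"


message = build_user_message(
    "What is our refund policy?",
    ["Refunds are accepted within 30 days.", "Shipping takes 5 business days."],
)
print(message)
```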


API Endpoints

Method  Endpoint      Auth  Description
POST    /auth/signup  None  Create account + tenant
POST    /auth/login   None  Login + get JWT
GET     /api/me       JWT   Get current user + tenant
POST    /api/ask      JWT   Ask a question
POST    /api/upload   JWT   Upload + index a document
GET     /health       None  Health check
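As a sketch of what a call to a JWT-protected endpoint might look like, the snippet below builds (but does not send) a request to /api/ask using only the standard library. Two details are assumptions, since the request schema is not documented here: that the token is sent as an `Authorization: Bearer` header, and that the JSON body uses a `question` field.

```python
import json
import urllib.request


def build_ask_request(base_url: str, token: str, question: str) -> urllib.request.Request:
    """Prepare a POST to /api/ask that carries the JWT from /auth/login."""
    body = json.dumps({"question": question}).encode("utf-8")  # field name is an assumption
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/ask",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # header scheme is an assumption
            "Content-Type": "application/json",
        },
    )


req = build_ask_request("http://127.0.0.1:8000", "<jwt-from-login>", "What is our refund policy?")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it to the local server started in Step 3. No tenant identifier appears in the body, because the server derives it from the token.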

What I learned

  • Security has to be designed in from day one — not added later
  • RAG quality depends heavily on chunk size (150 words works better than 500)
  • AWS Bedrock is great for portfolio projects — no new accounts, pure IAM
  • The hardest part wasn't the AI — it was making the tenant_id flow correctly through every layer
  • Anyone can build a chatbot. The interesting problem is what happens when 100 companies share the same AI
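For reference, word-based chunking of the kind the chunk-size note refers to can be sketched in a few lines. `chunk_words` is a hypothetical helper and the optional overlap is an addition for illustration; the project's own splitter may work differently.

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 150, overlap: int = 0) -> List[str]:
    """Split text into chunks of at most chunk_size whitespace-separated words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), step)]


document = " ".join(f"word{i}" for i in range(400))
chunks = chunk_words(document, chunk_size=150)
print([len(chunk.split()) for chunk in chunks])
```

Smaller chunks give each embedding a narrower topic, which is one plausible reason the 150-word setting retrieved better than 500.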

Cost

Near zero. Everything fits within the AWS free tier except Bedrock, which charges per token. For testing and portfolio use, expect less than $1 total.


Blog post

Full writeup on Medium — https://medium.com/@cayalaso/i-built-a-secure-multi-tenant-rag-system-on-aws-9f1bb7e25f20


License

MIT — use it, fork it, build on it.
