A production-grade RAG system where multiple organizations share the same AI infrastructure — but their data is completely isolated from each other.
Built as a personal project to combine everything I was learning about RAG, AWS, and AI security.
Upload your organization's documents. Ask questions. Get answers — strictly from your documents, never anyone else's.
Multiple companies can use the same deployment simultaneously. Company A cannot see Company B's data. Not just filtered — physically isolated at every layer.
Every chatbot you see built with OpenAI or Claude does this:
User question → API call → Answer
One user, one AI, no concept of ownership or isolation. This project does this instead:
Verified user (JWT + tenant_id)
↓
Their documents only (Pinecone namespace)
↓
AI answers only from those docs
↓
Every query logged to DynamoDB
The difference is ownership and isolation. This system knows who is asking, which company they belong to, what data they're allowed to see, and what the AI said.
User Login
↓
Layer 1 — AWS Cognito: per-tenant user pool + JWT with tenant_id
↓
Layer 2 — API Gateway: JWT validated at the infrastructure level
↓
Layer 3 — S3 + Lambda: per-tenant buckets + Bedrock embeddings
↓
Layer 4 — Pinecone: namespace = tenant_id (two isolation locks)
↓
Layer 5 — AI agent: AWS Bedrock Nova Micro, scoped to tenant docs
↓
Layer 6 — DynamoDB: every query logged for compliance
↓
Layer 7 — Frontend: login + signup + chat UI
- AWS account with the CLI configured (`aws configure`)
- Pinecone account (free tier works)
- Python 3.11

```
pip install fastapi uvicorn mangum boto3 pinecone pypdf \
    python-jose[cryptography] httpx python-dotenv python-multipart
```

Create a `.env` file in each layer folder:
```
POOL_ID=us-east-1_XXXXXXX
CLIENT_ID=your_cognito_app_client_id
TENANT_ID=your-org-name
REGION=us-east-1
PINECONE_API_KEY=your_pinecone_api_key
PINECONE_INDEX=rag-index
```

```
cd layer1-auth
python cognito_setup.py
```

Copy the `POOL_ID` and `CLIENT_ID` printed in the output into your `.env`.
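What makes the tenant claim trustworthy is how the user pool attribute is defined. A minimal sketch of the schema that `cognito_setup.py` plausibly configures (the exact script contents are not shown here; parameter names follow boto3's `cognito-idp` `create_user_pool` API):

```python
def tenant_attribute_schema() -> list[dict]:
    """Custom attribute definition for the Cognito user pool.
    Mutable=False is what makes the tenant claim immutable after signup.
    (A sketch of an assumed setup, not the repo's verbatim script.)"""
    return [{
        "Name": "tenant_id",            # surfaces in JWTs as custom:tenant_id
        "AttributeDataType": "String",
        "Mutable": False,               # cannot be changed after account creation
        "Required": False,
    }]

# boto3.client("cognito-idp").create_user_pool(
#     PoolName="rag-tenant-pool", Schema=tenant_attribute_schema())
```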
```
cd layer3-storage
python s3_setup.py
```

```
cd layer2-gateway
uvicorn lambda_handler:app --reload
```

The server runs at http://127.0.0.1:8000.
Open layer7-frontend/login.html with Live Server in VS Code.
- Sign up with your email, password, and organization name
- Log in
- Upload a PDF or text file
- Ask questions about it
Every request carries a JWT issued by AWS Cognito. The tenant_id is stamped into the token at account creation as an immutable custom attribute, so it cannot be changed after signup.
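A minimal sketch of pulling the tenant out of the token. For illustration this decodes the payload without verifying it; the real handler must first verify the RS256 signature against the Cognito JWKS (e.g. with python-jose's `jwt.decode`) before trusting any claim:

```python
import base64
import json

def jwt_claims_unverified(token: str) -> dict:
    """Decode the JWT payload segment. NOTE: no signature check here;
    in production, verify against the Cognito JWKS before trusting claims."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))

def tenant_from_claims(claims: dict) -> str:
    """Fail closed: a token with no tenant claim gets no access."""
    tenant_id = claims.get("custom:tenant_id")
    if not tenant_id:
        raise PermissionError("JWT carries no custom:tenant_id claim")
    return tenant_id
```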
```python
# On every request: verify the JWT, then extract the tenant
tenant_id = payload.get('custom:tenant_id')
# Never trust client-supplied tenant info — only the verified JWT
```

Documents are stored in per-tenant S3 buckets and indexed in Pinecone under a namespace scoped to the tenant:
```python
# Two independent isolation locks on every query
results = index.query(
    vector=query_embedding,
    namespace=tenant_id,              # Lock 1 — namespace
    filter={'tenant_id': tenant_id},  # Lock 2 — metadata
    top_k=5
)
```

The AI agent is sandboxed via its system prompt: it can only answer from retrieved chunks, which are already scoped to the tenant's namespace.
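The sandboxing can be sketched as a prompt builder that admits only the retrieved, tenant-scoped chunks as context. The wording below is illustrative; the repo's actual prompt is not shown here:

```python
def build_system_prompt(tenant_id: str, chunks: list[str]) -> str:
    """Assemble the sandbox prompt: the model may answer only from the
    retrieved chunks, which retrieval has already scoped to this tenant."""
    context = "\n\n".join(f"[doc {i}] {c}" for i, c in enumerate(chunks, 1))
    return (
        f"You are a document assistant for tenant '{tenant_id}'.\n"
        "Answer ONLY from the context below. If the answer is not in the "
        "context, say you don't know. Never use outside knowledge.\n\n"
        f"Context:\n{context}"
    )
```

Because the chunks come from the tenant's own Pinecone namespace, even a prompt-injection attempt cannot surface another tenant's documents — they were never retrieved in the first place.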
| Method | Endpoint | Auth | Description |
|---|---|---|---|
| POST | /auth/signup | None | Create account + tenant |
| POST | /auth/login | None | Log in + get JWT |
| GET | /api/me | JWT | Get current user + tenant |
| POST | /api/ask | JWT | Ask a question |
| POST | /api/upload | JWT | Upload + index a document |
| GET | /health | None | Health check |
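A sketch of calling the authenticated endpoints from a client, using only the standard library. Note the client never sends a tenant; the server derives it from the verified JWT. The request body field name `question` is an assumption about the API schema:

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # local uvicorn dev server

def ask_request(question: str, jwt_token: str) -> urllib.request.Request:
    """Build the POST /api/ask request with the Cognito JWT as a Bearer token."""
    return urllib.request.Request(
        f"{BASE_URL}/api/ask",
        data=json.dumps({"question": question}).encode(),
        headers={
            "Authorization": f"Bearer {jwt_token}",  # from /auth/login
            "Content-Type": "application/json",
        },
        method="POST",
    )

# With the dev server running:
# with urllib.request.urlopen(ask_request("What does the contract say?", token)) as r:
#     print(json.load(r))
```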
- Security has to be designed in from day one — not added later
- RAG quality depends heavily on chunk size (150 words works better than 500)
- AWS Bedrock is great for portfolio projects — no new accounts, pure IAM
- The hardest part wasn't the AI — it was making the tenant_id flow correctly through every layer
- Anyone can build a chatbot. The interesting problem is what happens when 100 companies share the same AI
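The chunk-size lesson above can be sketched as a simple word-based splitter. The 150-word figure is from the lessons learned; the overlap value here is an illustrative choice, not necessarily what the repo uses:

```python
def chunk_words(text: str, size: int = 150, overlap: int = 30) -> list[str]:
    """Split text into ~size-word chunks with a small overlap so answers
    that straddle a chunk boundary are still retrievable."""
    words = text.split()
    step = size - overlap
    return [
        " ".join(words[i:i + size])
        for i in range(0, max(len(words) - overlap, 1), step)
    ]
```

Smaller chunks keep each embedding focused on one idea, which tends to improve retrieval precision over 500-word chunks that blend several topics into a single vector.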
Near zero. Everything runs within the AWS free tier except Bedrock, which charges per token. For testing and portfolio use, expect less than $1 total.
Full writeup on Medium — https://medium.com/@cayalaso/i-built-a-secure-multi-tenant-rag-system-on-aws-9f1bb7e25f20
MIT — use it, fork it, build on it.