# Chat with PDF

Query your PDF documents using local LLMs with Ollama.

Chat with PDF is a Retrieval-Augmented Generation (RAG) application that lets you ask questions about PDF documents. All processing happens locally on your machine; no data is sent to external servers.
## Features

- Local PDF text extraction
- Semantic search using Ollama embeddings
- Question answering with local LLMs
- No API keys or external services required
- Save and load document indexes for faster startup
## Requirements

- Python 3.12+
- Ollama (https://ollama.ai)
- Required Ollama models:
  - llama3.3:70b (or another chat model)
  - nomic-embed-text (for embeddings)
## Installation

Install Ollama:

```bash
# Linux
curl -fsSL https://ollama.ai/install.sh | sh

# macOS (or download from https://ollama.ai)
brew install ollama
```

Pull the required models:

```bash
ollama pull llama3.3:70b
ollama pull nomic-embed-text
```

Install the Python dependencies:

```bash
# Install pipenv if needed
pip install pipenv

# Install project dependencies
pipenv install

# Activate environment
pipenv shell
```

## Usage

Ask questions about a PDF:

```bash
python src/main.py --pdf path/to/document.pdf
```

Save the document index for faster startup:

```bash
# First time: process PDF and save index
python src/main.py --pdf document.pdf --save-index
```
```bash
# Next time: load the saved index
python src/main.py --load-index document.index.json
```

Run with verbose logging:

```bash
python src/main.py --pdf document.pdf --verbose
```

## Configuration

Edit config/settings.yaml to customize:
```yaml
llm:
  model: "llama3.2"            # Chat model
  temperature: 0.3             # Creativity
embeddings:
  model: "nomic-embed-text"    # Embedding model
  chunk_size: 500              # Words per chunk
  chunk_overlap: 50            # Overlap between chunks
search:
  top_k: 3                     # Number of relevant chunks
```

## Project Structure

```
chat_with_pdf/
├── data/papers/              # Your PDF files
├── src/
│   ├── __init__.py
│   ├── pdf_loader.py         # PDF text extraction
│   ├── embeddings.py         # Ollama embeddings
│   ├── vector_store.py       # Simple vector storage
│   ├── chat.py               # Chat interface
│   └── main.py               # CLI entry point
├── config/settings.yaml      # Configuration
├── logs/                     # Log files
├── Pipfile                   # Dependencies
└── README.md
```
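The `chunk_size` and `chunk_overlap` settings control how extracted text is split into word chunks before embedding. A minimal sketch of that kind of splitter (the actual implementation in src/embeddings.py may differ; `chunk_words` is a hypothetical name):

```python
def chunk_words(text, chunk_size=500, chunk_overlap=50):
    """Split text into overlapping chunks of whitespace-separated words.

    Each chunk holds up to chunk_size words; consecutive chunks share
    chunk_overlap words so sentences that straddle a boundary remain
    retrievable from both sides.
    """
    words = text.split()
    step = chunk_size - chunk_overlap  # how far the window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # last window already covers the end of the text
    return chunks
```

With the defaults, a 1,200-word document yields three chunks whose boundaries overlap by 50 words.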
## How It Works

1. Load PDF: extract text from the PDF using PyMuPDF
2. Chunk: split the text into overlapping chunks
3. Embed: create an embedding for each chunk using Ollama
4. Store: keep the embeddings in memory (or save them to a file)
5. Query: embed the user's question and find the most similar chunks
6. Answer: generate a response from the retrieved context with the LLM
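The query and answer steps boil down to cosine-similarity ranking (the `search.top_k` setting) plus prompt assembly. A minimal sketch of the idea, assuming the question and chunks have already been embedded — function names and prompt wording are illustrative, not the project's actual code in src/vector_store.py or src/chat.py:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k_chunks(question_vec, chunk_vecs, chunks, k=3):
    """Return the k chunks whose embeddings are most similar to the question."""
    ranked = sorted(zip(chunk_vecs, chunks),
                    key=lambda pair: cosine(question_vec, pair[0]),
                    reverse=True)
    return [chunk for _, chunk in ranked[:k]]

def build_prompt(question, context_chunks):
    """Assemble the context-plus-question prompt sent to the chat model."""
    context = "\n\n".join(context_chunks)
    return (f"Answer using only the context below.\n\n"
            f"Context:\n{context}\n\n"
            f"Question: {question}")
```

A brute-force scan like this is fine at README scale: with a few hundred chunks per document there is no need for an approximate nearest-neighbor index.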
## Hardware Recommendations

| RAM | Recommended Setup |
|---|---|
| 8 GB | llama3.2:1b + nomic-embed-text |
| 16 GB | llama3.2:3b + nomic-embed-text |
| 32+ GB | llama3.1:8b + nomic-embed-text |
GPU acceleration (NVIDIA, Apple Silicon) significantly speeds up processing.
## Comparison with Cloud RAG Services

| Aspect | Chat with PDF (Local) | Cloud RAG Services |
|---|---|---|
| Privacy | Data stays local | Data sent to servers |
| Cost | Free | Per-query pricing |
| Speed | Depends on hardware | Generally fast |
| Internet | Not required | Required |
| Models | Limited by RAM | Latest models available |
## License

MIT License