vhdesai/LOI-Analysis

TermSheet Analysis Tool (Azure OpenAI & Google Gemini)

Disclaimer: This tool performs an analysis of legal documents. It does not provide legal advice, and its use does not establish an attorney-client relationship or attorney-client privilege. If you require attorney-client privilege or legal counsel, please consult a qualified attorney. The results generated by this tool should not be relied upon as a substitute for professional legal advice.

This tool automates the analysis of legal contracts (PDF, Word, Excel, Images) using Large Language Models (LLMs). It supports Azure OpenAI (GPT-4o) and Google Gemini (2.0 Flash), allowing for flexible, high-capacity document review.

Key Features

  • Hybrid Provider Support: Switch between Azure OpenAI and Google Gemini.
  • Dynamic Context Window: Automatically adjusts analysis chunk sizes to the selected model's capacity, targeting roughly 20-30% utilization of the context window ("Vik's Law").
  • Multi-Format Support: Extracts text from PDFs (OCR enabled), Word docs (.docx, .doc), Excel files (.xlsx), and common image formats.
  • Excel Reporting: Generates a structured Excel report (ai_results.xlsx) with analysis results for each contract.
  • GUI & CLI: Modern, user-friendly graphical interface and robust command-line interface.

1. Prerequisites

  • Python 3.10+ (Recommended)
  • Tesseract OCR (for scanned PDFs/Images)
  • Poppler (for PDF processing)
    • Windows: Download from poppler-windows, extract, and add the bin folder to your system PATH.

2. Dependencies

Install the required Python packages:

pip install -r requirements.txt

3. Configuration

Environment Variables (Authentication)

Create a .env file or set these variables in your system:

For Azure OpenAI:

  • AZURE_OPENAI_API_KEY: (Optional) Set this to use API Key auth. If it is not set, the tool attempts Azure AD authentication (CLI, VS Code, Browser).

For Google Gemini:

  • API Key: Set the GOOGLE_API_KEY environment variable.
  • OAuth (Recommended): Download client_secret.json from Google Cloud Console.
    • Create a Project > APIs & Services > Credentials.
    • Create "OAuth 2.0 Client ID" (Desktop App).
    • Download the JSON file and save it (e.g., as client_secret.json).
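The fallback order described above (API key if present, otherwise an interactive sign-in) can be sketched roughly as follows. This is an illustration only, not the code in auth.py: the function name pick_auth_method and the returned labels are hypothetical, and only the two environment variable names come from this README.

```python
import os
from typing import Mapping, Optional


def pick_auth_method(provider: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return a label for the auth method implied by the environment.

    Hypothetical helper for illustration; the labels are not part of the tool.
    """
    env = os.environ if env is None else env
    if provider == "azure":
        # An API key wins if present; otherwise fall back to Azure AD sign-in.
        return "api_key" if env.get("AZURE_OPENAI_API_KEY") else "azure_ad"
    if provider == "google":
        # An API key wins if present; otherwise assume the OAuth flow
        # that uses client_secret.json.
        return "api_key" if env.get("GOOGLE_API_KEY") else "oauth"
    raise ValueError(f"unknown provider: {provider!r}")


if __name__ == "__main__":
    print(pick_auth_method("azure", {"AZURE_OPENAI_API_KEY": "example-key"}))
    print(pick_auth_method("google", {}))
```

Passing the environment as a plain mapping keeps the check easy to exercise without touching real credentials.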

Prompts

  • The tool uses LOI_prompts.yaml (formerly prompts.yaml) to define the analysis questions.
  • Format:
    prompts:
      - column_header: "Parties Involved"
        prompt: "Identify the buyer and seller in this contract."
      - column_header: "Effective Date"
        prompt: "What is the effective date of the agreement?"
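As a rough sketch of how this structure can be checked before a run, the snippet below validates an already-parsed prompts file and maps each column header to its prompt. The function name column_prompts is hypothetical and not part of the tool; the input is assumed to be the Python object a YAML loader would return for the format above.

```python
from typing import Any


def column_prompts(config: Any) -> dict[str, str]:
    """Map each column_header to its prompt, validating the expected shape."""
    if not isinstance(config, dict) or not isinstance(config.get("prompts"), list):
        raise ValueError("expected a top-level 'prompts' list")
    result: dict[str, str] = {}
    for i, entry in enumerate(config["prompts"]):
        if not isinstance(entry, dict):
            raise ValueError(f"entry {i} must be a mapping")
        header = entry.get("column_header")
        prompt = entry.get("prompt")
        if not (isinstance(header, str) and header and isinstance(prompt, str) and prompt):
            raise ValueError(f"entry {i} needs non-empty 'column_header' and 'prompt'")
        if header in result:
            raise ValueError(f"duplicate column_header: {header!r}")
        result[header] = prompt
    return result


# The same content as the YAML example, already parsed into Python objects.
EXAMPLE = {
    "prompts": [
        {"column_header": "Parties Involved",
         "prompt": "Identify the buyer and seller in this contract."},
        {"column_header": "Effective Date",
         "prompt": "What is the effective date of the agreement?"},
    ]
}

if __name__ == "__main__":
    for header, prompt in column_prompts(EXAMPLE).items():
        print(f"{header}: {prompt}")
```

Rejecting duplicate headers is a design choice of this sketch, on the assumption that each header becomes one column in the Excel report.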

4. Usage

Graphical Interface (GUI)

Simply run the script without arguments to launch the GUI:

python LOI_Analysis.py
  • Select Input: Local folder or file.
  • Provider: Choose between Google (Gemini 2.0 Flash - Default) and Azure (GPT-4o).
  • Auth Method:
    • Google: OAuth (Recommended), API Key, or ADC (Application Default Credentials).
    • Azure: RBAC or API Key.

Command Line Interface (CLI)

Run headlessly for automation:

# Analyze a folder using Google Gemini (Default)
python LOI_Analysis.py --input "C:\Contracts" --output "ai_results.xlsx"

# Analyze using Azure OpenAI
python LOI_Analysis.py --input "C:\Contracts" --provider azure

# Use a specific model
python LOI_Analysis.py --input "C:\Contracts" --provider google --model gemini-2.0-flash-001
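For readers scripting around the CLI, the flags used above could be declared with argparse along these lines. This is a minimal sketch, not the parser in LOI_Analysis.py: the flag names come from the examples, Google as the default provider comes from this README, and the ai_results.xlsx default for --output and the help texts are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags shown in the CLI examples."""
    parser = argparse.ArgumentParser(prog="LOI_Analysis.py")
    parser.add_argument("--input", help="Folder or file to analyze")
    # Assumed default, based on the report name mentioned under Key Features.
    parser.add_argument("--output", default="ai_results.xlsx",
                        help="Path of the Excel report to write")
    parser.add_argument("--provider", choices=["google", "azure"], default="google",
                        help="LLM provider to use")
    parser.add_argument("--model", default=None,
                        help="Optional model name, e.g. gemini-2.0-flash-001")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--input", r"C:\Contracts", "--provider", "azure"])
    print(args.input, args.provider, args.output, args.model)
```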

5. Context Window Logic ("Vik's Law")

The tool dynamically calculates the safe amount of text to send based on the model:

  • GPT-4o: ~128k tokens context -> Process ~30k tokens (~100k chars) per chunk.
  • Gemini 2.0 Flash: ~1M tokens context -> Process huge documents in a single pass.
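The arithmetic behind these figures can be sketched as below. This is an illustration under stated assumptions, not the tool's implementation: the function names are hypothetical, 25% is picked from the 20-30% band, and 4 characters per token is a rough rule of thumb, so the numbers differ slightly from the rounded ones above.

```python
def token_budget(context_tokens: int, utilization_pct: int = 25) -> int:
    """Tokens to send per chunk: a fixed percentage of the context window."""
    if not 0 < utilization_pct <= 100:
        raise ValueError("utilization_pct must be in 1..100")
    return context_tokens * utilization_pct // 100


def split_text(text: str, budget_tokens: int, chars_per_token: int = 4) -> list[str]:
    """Split text into fixed-size character chunks that fit the token budget."""
    max_chars = budget_tokens * chars_per_token  # assumed chars-per-token ratio
    if max_chars <= 0:
        raise ValueError("budget must be positive")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


if __name__ == "__main__":
    document = "x" * 500_000  # stand-in for 500k characters of extracted text
    gpt4o_chunks = split_text(document, token_budget(128_000))
    gemini_chunks = split_text(document, token_budget(1_000_000))
    print(len(gpt4o_chunks), len(gemini_chunks))  # prints "4 1"
```

With these assumptions a 500k-character document needs four chunks at a 128k-token context but fits in one pass at a 1M-token context, which matches the behavior described above. A real splitter would likely break on paragraph or clause boundaries rather than fixed offsets.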

6. Project Structure

  • LOI_Analysis.py: Main application logic.
  • auth.py: Authentication handlers for Azure and Google.
  • LOI_prompts.yaml: Analysis questions configuration.
  • requirements.txt: Python dependencies.
  • Contracts/: Default input directory.

7. Building the Executable

You can create a standalone executable of the application using PyInstaller.

  1. Install PyInstaller:

    pip install pyinstaller
  2. Build the Executable:

    pyinstaller --onefile LOI_Analysis.py

    This command will generate a single executable file in the dist folder.

Troubleshooting

  • Tesseract Not Found: Ensure Tesseract is installed and added to PATH. You may need to restart your terminal/IDE.
  • Azure Auth Errors: Try running az login in your terminal if using RBAC.
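  • Google Auth Errors: If using API Key auth, ensure GOOGLE_API_KEY is set correctly. If using OAuth, confirm that client_secret.json was downloaded for an "OAuth 2.0 Client ID" of type Desktop App.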
  • PDF Extraction Issues: If OCR fails, check the logs for Poppler/Tesseract errors.
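A quick way to confirm that the external tools are visible on PATH is a check like the one below. It is a small diagnostic sketch, not part of the tool: the helper name find_tools is hypothetical, and pdftoppm is used as the probe for Poppler because it is one of the command-line utilities Poppler ships.

```python
import shutil
from typing import Dict, Optional, Sequence


def find_tools(names: Sequence[str] = ("tesseract", "pdftoppm")) -> Dict[str, Optional[str]]:
    """Return the resolved path of each executable, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

Run it from the same terminal or IDE session that launches the tool, since PATH changes only apply to sessions started after the change.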

About

AI-powered contract analysis tool for Term Sheets and LOIs. Supports Azure OpenAI and Google Gemini, extracts key terms from Word, PDF, Excel, and image files, and generates structured Excel reports. Features GUI & CLI, robust authentication, and customizable prompts for flexible legal review.
