Smart Vision Gemma

A real-time computer vision application that combines object detection (YOLOv8), face analysis, and hand tracking with generative AI (Google Gemma) to understand and describe the scene.

Features

Object Detection: Uses YOLOv8 to detect objects in real-time.
Face Analysis: Detects faces and analyzes attributes.
Hand Tracking: Tracks hand movements.
Scene Understanding: Uses Google's Gemma model to generate creative, sci-fi inspired descriptions of the detected scene.
Temporal Smoothing: Smooths detection results over time for a stable visualization.
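
The temporal smoothing feature can be pictured with a small sketch. The code in temporal/ is not shown here, so this is an assumption about the general approach rather than the project's implementation: an exponential moving average over one bounding box, where the hypothetical `BoxSmoother` class and its `alpha` parameter are names invented for illustration.

```python
from typing import Optional, Sequence, Tuple

Box = Tuple[float, ...]  # (x1, y1, x2, y2) in pixels


class BoxSmoother:
    """Exponential moving average over one bounding box (illustrative only)."""

    def __init__(self, alpha: float = 0.5) -> None:
        # alpha close to 1.0 follows new detections quickly;
        # alpha close to 0.0 gives a steadier but laggier box.
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self._state: Optional[Box] = None

    def update(self, box: Sequence[float]) -> Box:
        """Blend the new box into the running estimate and return it."""
        if self._state is None:
            # First observation: nothing to blend with yet.
            self._state = tuple(float(v) for v in box)
        else:
            a = self.alpha
            self._state = tuple(
                a * new + (1.0 - a) * old for new, old in zip(box, self._state)
            )
        return self._state


smoother = BoxSmoother(alpha=0.5)
smoother.update((0, 0, 10, 10))
print(smoother.update((10, 10, 20, 20)))
```

A lower `alpha` reduces jitter in the drawn boxes at the cost of lagging behind fast motion, which is presumably why the application lets you toggle smoothing at runtime.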

Prerequisites

Python 3.8+
Webcam

Installation

Clone the repository (or download the files).

Create a virtual environment (recommended):

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies:
```
pip install -r requirements.txt
```
Set up API Key:
- Get a Google AI API key from Google AI Studio.
- Create a .env file in the root directory.
- Add your key:
```
GOOGLE_API_KEY=your_api_key_here
```
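
To check that the key is picked up before launching the application, a standard-library sketch like the following can help. The helper names `load_env_file` and `get_api_key` are hypothetical; the project itself may load the file differently (for example with a package such as python-dotenv), so treat this as an assumption about the file format, not as the code in llm_helper.py.

```python
import os
from pathlib import Path
from typing import Dict


def load_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(path: str = ".env") -> str:
    """Return GOOGLE_API_KEY from the environment, falling back to the .env file."""
    key = os.environ.get("GOOGLE_API_KEY") or load_env_file(path).get(
        "GOOGLE_API_KEY", ""
    )
    if not key or key == "your_api_key_here":
        raise RuntimeError("GOOGLE_API_KEY is missing; add it to the .env file")
    return key
```

Failing early with a clear message is usually easier to debug than an authentication error raised later, once the first scene explanation is requested.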

Usage

Run the main application:

python main.py

Controls

q: Quit the application.
e: Ask Gemma to explain the current scene.
s: Toggle temporal smoothing.
r: Register the current face (saves to face_db/).
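
The controls above could be dispatched roughly as follows. This is a sketch under stated assumptions: `AppState` and `handle_key` are hypothetical names, the default values are guesses, and the key code is assumed to come from something like OpenCV's `cv2.waitKey(1)`, which returns -1 when no key is pressed. The actual loop in main.py may be organized differently.

```python
from dataclasses import dataclass


@dataclass
class AppState:
    running: bool = True
    smoothing: bool = True  # assumed default; the real default is not documented
    explain_requested: bool = False
    register_requested: bool = False


def handle_key(key_code: int, state: AppState) -> AppState:
    """Update the state for one key code; negative codes mean 'no key pressed'."""
    if key_code < 0:
        return state
    key = chr(key_code & 0xFF)  # keep only the low byte, as is common with waitKey
    if key == "q":
        state.running = False
    elif key == "e":
        state.explain_requested = True
    elif key == "s":
        state.smoothing = not state.smoothing
    elif key == "r":
        state.register_requested = True
    return state
```

Keeping the key handling in a plain function, separate from the capture loop, makes it possible to test the controls without a webcam attached.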

Project Structure

main.py: Entry point of the application.
detection/: Contains modules for object detection, face analysis, and hand tracking.
temporal/: Logic for smoothing detections over time.
utils/: Configuration and drawing utilities.
llm_helper.py: Interface for interacting with the Google Gemma API.
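
To illustrate how the detection modules and llm_helper.py might fit together, the sketch below turns detection results into a text prompt for the language model. The function `build_scene_prompt`, its parameters, and the prompt wording are all assumptions made for illustration; the real prompt and the call to the Gemma API are not shown in this README.

```python
from collections import Counter
from typing import Iterable


def build_scene_prompt(labels: Iterable[str], faces: int = 0, hands: int = 0) -> str:
    """Summarize detections as a short instruction for a text-generation model."""
    counts = Counter(labels)
    # Sort by label so the prompt is stable from frame to frame.
    parts = [f"{n} x {name}" for name, n in sorted(counts.items())]
    objects = ", ".join(parts) if parts else "no recognized objects"
    return (
        "Describe this scene in two sentences with a creative, sci-fi tone. "
        f"Detected objects: {objects}. "
        f"Faces visible: {faces}. Hands visible: {hands}."
    )


print(build_scene_prompt(["person", "cup", "person"], faces=1, hands=2))
```

Sending a compact text summary instead of raw frames keeps each request small, though it also means the model only knows what the detectors reported.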
