Skip to content

Commit b7e20e4

Browse files
Add integration with Chat-GPT APIs.
1 parent 0d2cbfd commit b7e20e4

7 files changed

Lines changed: 170 additions & 97 deletions

File tree

SearchEngine/.gitignore

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
.env

SearchEngine/README.md

Lines changed: 60 additions & 34 deletions
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,11 @@
1-
# 🔍 Mini Search Engine with Stack
1+
# 🔍 Mini Search Engine with Stack
22

33
A **Mini Search Engine project** developed as part of my **2nd Semester DSA Lab Project (BS AI, NFC IET Multan)**.
44
It demonstrates **core Data Structures and Algorithms (DSA)** concepts such as:
55
- **Stack** (for search history navigation)
66
- **Inverted Index / Hash Map** (for efficient keyword-based searching)
77
- **String processing & searching algorithms**
8+
- **GPT Integration** → fallback when no local documents match
89

910
---
1011

@@ -15,6 +16,7 @@ It demonstrates **core Data Structures and Algorithms (DSA)** concepts such as:
1516
-**Search History (REDO)** → supports `next` command just like a browser
1617
-**Document Viewer** → open `.txt` files directly from search results
1718
-**Automatic Crawler** → indexes all `.txt` files in the `documents/` folder
19+
-**GPT Fallback** → uses OpenAI GPT-3.5-Turbo when no local results found, saves responses in `gpt_docs/gpt.txt`
1820
-**Clean modular structure** for GitHub
1921

2022
---
@@ -25,34 +27,51 @@ Mini-Search-Engine/
2527
2628
├── stack.py # Stack implementation (push, pop, peek, empty)
2729
├── index.py # Inverted Index implementation
28-
├── search.py # Search Engine logic
30+
├── search.py # Search Engine logic + GPT integration
2931
├── main.py # Entry point for running the project
3032
3133
├── documents/ # Folder containing sample text files
3234
│ ├── doc1.txt
3335
│ ├── doc2.txt
3436
│ └── ...
3537
38+
├── gpt_docs/ # Folder storing GPT responses
39+
│ └── gpt.txt
40+
41+
├── .env # Stores OPENAI_API_KEY
42+
├── requirements.txt # Required Python packages
3643
└── README.md # Project documentation
3744
```
38-
## ⚡ How It Works
39-
- The program scans the `documents/` folder and builds an **inverted index**.
40-
- When the user searches, queries are **cleaned** (lowercased, punctuation removed, split into words).
41-
- Matching documents are **ranked by query word frequency**.
42-
- The query is **pushed onto the Stack (history)**.
43-
- If the user types **back**, the last query is **popped** and the previous one is shown again.
44-
- The user can open a result to see the **full content of the file**.
45+
### ⚡ How It Works
4546

46-
---
47+
+ The program scans the `documents/` folder and builds an inverted index.
48+
49+
+ When the user searches, queries are cleaned (lowercased, punctuation removed, split into words).
50+
51+
+ Matching documents are ranked by query word frequency.
52+
53+
+ The query is pushed onto the Stack (history).
54+
55+
+ If the user types `back`, the last query is popped and the previous one is shown again.
56+
57+
+ If no local document matches, the engine calls GPT, saves the result in `gpt_docs/gpt.txt`, indexes it, and shows it.
58+
59+
+ Users can open a result to see the full content of the file.
4760

4861
## ▶️ Usage
4962

50-
### Run the program:
51-
```bash
63+
1. Go to base Dir:
64+
```batch
65+
cd SearchEngine
66+
```
67+
2. Run
68+
```batch
5269
python main.py
5370
```
54-
### Example Session:
55-
```Loading index...
71+
72+
***Example Session:***
73+
```py
74+
Loading index...
5675
Index built with 5 documents.
5776

5877
Enter search query, 'back','next', 'show', or 'quit': ai
@@ -68,10 +87,16 @@ Enter search query, 'back','next', 'show', or 'quit': cs
6887
[Stack] Pushed: cs
6988

7089
Searching for: 'cs'
71-
Found 1 document(s):
72-
1. doc2.txt | Score: 1
90+
No matches found in local documents. Using ChatGPT...
91+
--- GPT Answer ---
92+
CS is the study of computers and computational systems...
93+
------------------
94+
1. gpt.txt | Score: 1
7395

74-
Enter document number to open, or 'continue': continue
96+
Enter document number to open, or 'continue': 1
97+
--- gpt.txt ---
98+
CS is the study of computers and computational systems...
99+
------------------
75100

76101
Enter search query, 'back','next', 'show', or 'quit': back
77102
[Stack] Popped: cs
@@ -80,41 +105,42 @@ Enter search query, 'back','next', 'show', or 'quit': back
80105
Back to: 'ai'
81106
1. doc5.txt | Score: 2
82107

83-
Enter document number to open, or 'continue': next
84-
Please enter a valid number or 'continue'.
108+
Enter search query, 'back','next', 'show', or 'quit': next
109+
Redo: 'cs'
110+
1. gpt.txt | Score: 1
85111

86112
Enter document number to open, or 'continue': continue
87113
Enter search query, 'back','next', 'show', or 'quit': quit
88114
Goodbye!
89115
```
90-
---
116+
91117
## 🏫 Academic Info
92118

93-
+ 📖 Course: Data Structures & Algorithms (DSA)
119+
📖 Course: Data Structures & Algorithms (DSA)
94120

95-
+ 🎓 Semester: 2nd Semester, BS Artificial Intelligence
121+
🎓 Semester: 2nd Semester, BS Artificial Intelligence
96122

97-
+ 🏛️ University: NFC IET Multan
123+
🏛️ University: NFC IET Multan
98124

99-
+ 👨‍💻 Student: Muawiya Amir
125+
👨‍💻 Student: Muawiya Amir
100126

101-
----
102-
### 👥 Team Members
127+
---
128+
## 👥 Team Members
103129

104-
+ 👨‍💻 Muawiya (Team Leader)
130+
👨‍💻 Muawiya (Team Leader)
105131

106-
+ 👨‍💻 M. Umar
132+
👨‍💻 M. Umar
107133

108134
---
109135

110-
### 🚀 Future Improvements
136+
## 🚀 Future Improvements
111137

112-
> + Add ***synonym & fuzzy*** matching for queries
138+
> Add synonym & fuzzy matching for queries
113139

114-
> + Implement ***OR / NOT*** search operators
140+
> Implement OR / NOT search operators
115141

116-
> + Enhance ranking with ***TF-IDF instead*** of simple counts
142+
> Enhance ranking with TF-IDF instead of simple counts
117143

118-
> + Build a ***GUI or Web-based*** interface
144+
> Build a GUI or Web-based interface
119145

120-
------
146+
> Maintain multiple GPT files (gpt_1.txt, gpt_2.txt, ...) to fully integrate undo/redo
2.48 KB
Binary file not shown.

SearchEngine/gpt_docs/gpt_1.txt

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
AI, or artificial intelligence, is the simulation of human intelligence processes by machines, especially computer systems. It encompasses activities such as learning, reasoning, problem-solving, perception, and language understanding. AI technologies are used in a wide range of applications, from virtual assistants like Siri and Alexa to self-driving cars and advanced medical diagnostics.

SearchEngine/gpt_docs/gpt_2.txt

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
Islam is a monotheistic religion founded by the Prophet Muhammad in the 7th century in the Arabian Peninsula. It is based on the belief in one god, Allah, and the teachings of the Quran, which is considered the holy book of Islam. Followers of Islam are called Muslims and they follow the Five Pillars of Islam, which are the declaration of faith, prayer, fasting during the month of Ramadan, giving to charity, and making pilgrimage to Mecca at least once in a lifetime if possible. Islam is the second largest religion in the world, with over 1.8 billion followers.

SearchEngine/requirements.txt

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
openai>=1.0.0
2+
python-dotenv>=1.0.0
3+
# Python version
4+
python>=3.12
5+
# Core libraries
6+
openai>=1.0.0
7+
python-dotenv>=1.0.0
8+
9+
# Optional but useful
10+
rich>=13.0.0 # for pretty console output
11+
typing-extensions>=4.5.0 # extra type hints if needed
12+
# Development and testing
13+
pytest>=7.0.0 # for testing

0 commit comments

Comments
 (0)