
Lakshya Dubey

GenAI Engineer | LLM Systems | Computer Vision | Distributed AI

🔗 LinkedIn: https://linkedin.com/in/lakshyadubey
📧 lakshya.dubey04@gmail.com


About Me

I am an AI-focused Computer Engineering student building production-oriented systems in:

• Large Language Model (LLM) applications
• Retrieval-Augmented Generation (RAG) pipelines
• Computer Vision & Deep Learning
• Distributed AI systems

My work centers on integrating LLMs with vector databases, structured retrieval, and scalable backend architectures to build practical AI systems rather than isolated models.


Core Focus Areas

Generative AI Systems

  • LLM-based Question Answering pipelines
  • Vector embeddings & semantic retrieval
  • Prompt engineering & model orchestration
  • LLaMA-3 (GROQ), Pinecone, LangChain

Machine Learning & Deep Learning

  • Transfer Learning (Xception, YOLOv8)
  • CNN-based image forensics
  • Classical ML (SVM, classification pipelines)
  • Feature engineering & evaluation metrics

AI Infrastructure & Deployment

  • FastAPI / Flask model serving
  • RESTful AI APIs
  • Cloud integration (Google Cloud, AWS)
  • Distributed similarity systems

Highlight Projects

1️⃣ Automatic Ticket Classification & Document QA System

LLM-powered support automation system integrating:

  • Pinecone vector database
  • GROQ LLaMA-3 inference
  • SVM-based intelligent routing
  • Semantic retrieval + generative response

Focus: End-to-end GenAI workflow with structured routing logic.
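
The retrieval step of a workflow like this can be sketched in a few lines. This is a minimal illustration, not the project's code: the bag-of-words `embed` function is a toy stand-in for a real sentence-embedding model, the in-memory list stands in for a vector database such as Pinecone, and the names `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption: stands in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, documents: list[str], top_k: int = 2) -> str:
    """Assemble a grounded prompt that would then be sent to the LLM."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

In a real pipeline the ranking would be delegated to the vector database and the prompt passed to the hosted model; the sketch only shows how retrieval feeds generation.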


2️⃣ Distributed AST-Based Plagiarism Detection System

Scalable plagiarism detection system using:

  • Abstract Syntax Tree (AST) similarity
  • Google Drive API integration
  • Distributed comparison logic

Focus: Structural code intelligence over naive text matching.
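
To illustrate why structural comparison is harder to fool than text matching, here is a minimal sketch using only the Python standard library. It is an assumption-laden illustration rather than the project's actual algorithm: it compares sequences of AST node type names with `difflib.SequenceMatcher`, and the helper names `node_types` and `ast_similarity` are hypothetical.

```python
import ast
from difflib import SequenceMatcher


def node_types(source: str) -> list[str]:
    """Return the AST node type names of a program, dropping identifiers and literal values."""
    tree = ast.parse(source)
    return [type(node).__name__ for node in ast.walk(tree)]


def ast_similarity(source_a: str, source_b: str) -> float:
    """Score structural similarity in [0, 1] by matching node-type sequences."""
    return SequenceMatcher(None, node_types(source_a), node_types(source_b)).ratio()


original = "def add(a, b):\n    return a + b\n"
renamed = "def total(x, y):\n    return x + y\n"
different = "for i in range(3):\n    print(i)\n"

# Renaming identifiers leaves the tree shape unchanged, so the score stays at its maximum.
print(ast_similarity(original, renamed))
# A structurally different program scores lower.
print(ast_similarity(original, different))
```

Because only node types are compared, renaming variables or functions does not lower the score, which is the kind of disguise that plain text diffing misses.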


3️⃣ Deepfake Detection via Reverse Engineering

Transfer-learning-based detection pipeline using:

  • Xception architecture
  • OpenCV-based feature extraction
  • Metadata & illumination inconsistency analysis

Focus: AI-generated media forensic detection.


4️⃣ OneClouds — Privacy-Preserving Multi-Cloud Platform

Capstone project integrating:

  • Multi-cloud APIs (Google Drive, Dropbox, OneDrive)
  • OAuth2 authentication
  • Encrypted metadata-driven architecture

Focus: Secure cloud abstraction layer with system-level design.


Technical Stack

LLMs & RAG
LangChain · GROQ LLaMA-3 · Pinecone · SentenceTransformers

Deep Learning
TensorFlow · OpenCV · Transfer Learning · CUDA

Backend & APIs
FastAPI · Flask · REST APIs

Databases & Cloud
MongoDB · MySQL · Google Cloud · AWS

Core Foundations
Data Structures & Algorithms · Database Systems · System Design


Research & Publications

Distributed Web-Based Plagiarism Checker Using AST and Google Drive Integration
Published in Shreeshodhamantra International Multidisciplinary Academic Research Journal


Current Direction

I am currently focusing on:

  • Graph-based RAG architectures
  • Hybrid retrieval (vector + structured traversal)
  • LLM system evaluation and hallucination mitigation
  • Scalable AI deployment patterns


💻 Tech Stack:

C++ · HTML5 · Java · R · Python · CSS3 · JavaScript · Google Cloud · Vercel · Firebase · Render · Anaconda · Angular · Angular.js · NVIDIA · Django · Express.js · FastAPI · Flask · Flutter · Swift · NPM · OpenCV · Nodemon · Node.js · Streamlit · MySQL · SQLite · Adobe · Figma · Keras · Matplotlib · NumPy · Pandas · Plotly · PyTorch · scikit-learn · SciPy · TensorFlow · GitHub · Git · Cisco · Arduino · Postman · Power BI · Unity · Canva · MongoDB

Popular repositories

  1. Deepfake_Detection

    A deepfake face detection system using transfer learning with Xception CNN. Trained on real and fake face datasets using data augmentation, mixed precision, and GPU acceleration. Accurately classif…

    Python 3

  2. Distributed_Plagiarism_Checker

    A lightweight web-based tool to detect code plagiarism using AST (Abstract Syntax Tree) similarity. It syncs student code files via Google Drive and compares them to identify structural similaritie…

    Python 1

  3. Customer_Care_Call_Summary_Alert

    Customer care call summarization using Whisper, LangChain with Groq, and Zapier email automation.

    Python 1 1

  4. Automatic_Ticket_Classification_Tool

    Automatic ticket classification and document QA system with Pinecone vector search, GROQ LLaMA-3, and SVM-based department prediction

    Python 1

  5. Support_Chat_Bot_for_your_Website

    Streamlit app that uses LangChain, HuggingFace, and Pinecone to semantically search website content from sitemap.xml URLs using LLMs. Fast, simple, and intelligent web content querying.

    Python 1

  6. Resume_Screening_Assistance

    A Streamlit app that screens resumes by matching them to job descriptions using LangChain, Pinecone, and LLaMA-3. It performs semantic search and summarizes top resumes to assist HR teams.

    Python 1