minhnguyent546/README.md

Hi! I am Minh-Thien Nguyen, an AI Research Engineer specializing in Natural Language Processing (NLP) and Deep Learning.

My research interests include Embedding Models, Image-Text Retrieval for Vietnamese, Optimal Transport, Retrieval-Augmented Generation (RAG), and Image Classification. I also work on personal projects involving distributed training, model training on TPUs, and RAG architectures for multiple-choice QA over complex documents.

My publications (including preprints):

  • ViCLIP-OT: The First Foundation Vision-Language Model for Vietnamese Image–Text Retrieval with Optimal Transport. [arXiv preprint]
  • soups: Leveraging Model Soups to Classify Intangible Cultural Heritage Images from the Mekong Delta. [arXiv]

Some of my best projects include:

  • seas: SEAS - A Smart Enrollment Advisory System for Can Tho University, built with async FastAPI, async SQLAlchemy, and async Qdrant.
  • viettel-ai-race-vbkt: Multiple-Choice Question Answering (MCQA) Pipeline for Complex Technical Documents.
  • medical-llama2: Med-Alpaca-2-7b-chat - A medical chatbot fine-tuned from the LLaMA 2 7B model.
  • pre-training-gpt2: An end-to-end workflow for pre-training a GPT-2 model from scratch, with a focus on scalable training on XLA-enabled devices via PyTorch/XLA (CUDA and TPU).

Algorithmic Articles:

  • Virtual tree/Cây ảo - VNOI Magazine, 2024 (magazine).
  • Subtle Techniques with the Xor Operation/Kỹ thuật tinh tế về phép Xor - VNOI Magazine, 2023 (magazine).
  • Read more on my HackMD blog.

Contact Information:

Pinned

  1. seas (Public)

    SEAS - A Smart Enrollment Advisory System for CTU, built with async FastAPI, async SQLAlchemy, and async Qdrant.

    Python 3

  2. soups (Public)

    Official code for the paper: Leveraging Model Soups to Classify Intangible Cultural Heritage Images from the Mekong Delta.

    Python 1

  3. medical-llama2 (Public)

    Med-Alpaca-2-7b-chat - A medical chatbot built on the LLaMA 2 model.

    Python 1

  4. viettel-ai-race-vbkt (Public)

    Multiple-Choice Question Answering (MCQA) Pipeline for Complex Technical Documents.

    Python 5

  5. pre-training-gpt2 (Public)

    Pre-training a GPT-2 model from scratch on XLA devices (GPUs, TPUs).

    Python 3