Luong Quang Dung

AI Engineer

I'm an AI Engineer with a solid track record of turning advanced AI models into real-world, production-ready systems. I began my career at FPT Software, developing scalable real-time video analytics and OCR systems for edge devices and servers using DeepStream, Triton, and microservices. There, I gained deep expertise in optimizing performance, accuracy, and deployment pipelines for AI solutions.

I now work at Success Software Services, focusing on large language models and conversational AI. My recent work includes building a Retrieval-Augmented Generation (RAG) platform with high-performance vector search and a real-time Voice Agent for healthcare and wellness using WebRTC and streaming inference. I’m passionate about combining high-performance engineering with natural, human-like AI experiences, following best practices in MLOps, observability, and scalability.

Experience

AI Engineer

May 2025 - Present

Success Software Services

  • Designed and deployed a RAG platform with FastAPI, vLLM, and Milvus to deliver precise, context-aware answers.
  • Created data pipelines for PDF ingestion, embeddings, and vector indexing with Airflow and Docker-based microservices.
  • Architected a real-time Voice Agent for healthcare and wellness using WebRTC, integrating STT, LLM reasoning, and TTS modules for natural conversations.
  • Implemented scalable microservice architecture with streaming inference, quantization, and observability to improve speed and reliability.

AI Engineer

June 2022 - May 2025

FPT Software Quy Nhon

  • Built and optimized real-time video analytics systems for Jetson edge devices and servers using NVIDIA DeepStream, Triton, Redis, Kafka, and Docker.
  • Delivered LLM-based chatbots and context-extraction systems integrating semantic retrieval and contextual memory.
  • Implemented OCR and object detection pipelines achieving up to 98% accuracy while improving efficiency and reducing inference time.
  • Conducted performance optimizations and model fine-tuning to improve system accuracy and efficiency.
  • Collaborated with cross-functional teams to deliver high-quality solutions within tight deadlines.

Projects

End-to-End RAG Platform

An AI operations platform for managing LLM deployments with advanced RAG capabilities.

RAG, LLMOps, LLMLite, Langchain, NeMo Guardrails

Codewiki

Codewiki is an intelligent documentation tool that automatically generates wiki-style documentation for your GitHub or local repositories using AI.

LLM, RAG, NextJS

DeepStream YOLOv11

Plug-and-play custom parsers for AI models in the NVIDIA DeepStream SDK. Supports YOLOv11, D-FINE, and SCRFD models.

DeepStream, TensorRT, YOLO, D-FINE, SCRFD

Face Recognition System

Face recognition pipeline powered by Triton Inference Server.

Triton, SCRFD, Arcface, Qdrant

Awards & Recognition

Best Performer of FPT Software 2024

January 2025

FPT Software Quy Nhon

Recognized for dedication and contributions to team success

1st runner-up MLOps Marathon

September 2023

Vietnam MLOps Marathon

A competition focused on solving real-world business challenges and honing AI project deployment skills

Winner HuggingFace Competition

April 2023

HuggingFace

Detecting Ships in Ports to Avoid Congestion and Manage Traffic

Top 4 Finalist, Quy Nhon AI Hackathon

September 2022

FPT Software

Placed second in the qualifying round, advanced to the finals, and finished in the top 4