Ngoc-Son Nguyen
AI Research Resident

Xin chào! Hi, I am Son Nguyen, an AI Research Resident at the FPT Software AI Center, working under the supervision of Dr. Van Nguyen, Prof. Truong-Son Hy, and Prof. Ngan Le. I received my B.Sc. degree in Data Science from the University of Science, Vietnam National University Ho Chi Minh City (HCMUS-VNU HCMC), in 2024, where I was advised by Dr. Tung Le.
My research interests include multimodal learning and generative models, particularly text-to-audio generation, video-to-speech synthesis, and visual voice cloning. I am also interested in medical imaging, with a focus on vision-language models and image synthesis.

Curriculum Vitae (CV)

Education
  • University of Science, Viet Nam National University, Ho Chi Minh City
    B.Sc. in Data Science
    Sep. 2020 - Dec. 2024
Work Experience
  • FPT Software AI Center
    AI Research Resident
    Aug. 2024 - Present
  • University of Science, Viet Nam National University, Ho Chi Minh City
    Research Assistant
    Sep. 2023 - Aug. 2024
Honors & Awards
  • Outstanding Graduate Award, Vietnam National University, Ho Chi Minh City (VNU HCMC)
    2024
  • Outstanding Graduate in Data Science Award, University of Science, VNU HCMC (HCMUS) (Top 2)
    2024
  • Outstanding Achievement Award in Science and Technology Research, University of Science, VNU HCMC (HCMUS)
    2024
  • University Scholarship for Excellent Academic Achievement, University of Science, VNU HCMC (HCMUS)
    2020-2023
Selected Publications
DiFlowDubber: Discrete Flow Matching for Automated Video Dubbing via Cross-Modal Alignment and Synchronization

Ngoc-Son Nguyen, Thanh V. T. Tran, Jeongsoo Choi, Hieu-Nghia Huynh-Nguyen, Truong-Son Hy, Van Nguyen

Findings of the Conference on Computer Vision and Pattern Recognition (Findings CVPR) 2026

DiFlow-TTS: Compact and Low-Latency Zero-Shot Text-to-Speech with Factorized Discrete Flow Matching

Ngoc-Son Nguyen, Thanh V. T. Tran, Hieu-Nghia Huynh-Nguyen, Truong-Son Hy, Van Nguyen

Under Review 2025

LiteGPT: Large Vision-Language Model for Joint Chest X-ray Localization and Classification Task

Khai Le-Duc*, Ryan Zhang*, Ngoc-Son Nguyen*, Tan-Hanh Pham, Anh Dao, Ba Hung Ngo, Anh Totti Nguyen, Truong-Son Hy (* equal contribution)

arXiv Preprint 2024

Advancing Vietnamese Visual Question Answering with Transformer and Convolutional Integration

Ngoc-Son Nguyen, Van Son Nguyen, Tung Le

Computers and Electrical Engineering (Elsevier journal), 2024. Q1, IF = 4.9
