Ngoc Dung Huynh
PhD Researcher · ML Engineer · Multi-modal AI Specialist

Building large-scale vision-language systems and internet-scale data pipelines. First-author publications at ICCV 2025 and CVPR 2026. Engineering consultant at TII Abu Dhabi and PhD researcher at Deakin University.

Melbourne, VIC, Australia
Top 9 Worldwide — Toloka VQA · Top 7 Globally — COVID Detection
4+ Years Experience · 8+ Publications

About

Who I am

I am a PhD researcher at Deakin University and an Engineering Consultant at the Technology Innovation Institute (TII, UAE). I specialise in multi-modal AI — the intersection of vision, language, and speech. My work spans architecting internet-scale data pipelines, training large vision-language models on GCP/AWS, building VQA benchmarks, and publishing at top-tier venues including ICCV and CVPR. I hold a BSc in Mathematics and an MSc in Data Science (GPA 86%) from Deakin University.

Core Competencies

What I work with

🤖

ML / AI

Vision-Language Models (VLMs), LLMs, Visual Question Answering, Multi-modal Reasoning, Speech-Vision-Language

⚙️

Frameworks

PyTorch, TensorFlow, Keras, HuggingFace Transformers, Weights & Biases

🗄️

Data Engineering

OCR Pipelines, ETL, Web Crawling, Deduplication, LLM Filtering, SFT Data Generation, Agent-Based Pipelines

☁️

Infrastructure

AWS, GCP, Docker, Linux, Slurm, Flask, React.js, Elasticsearch

💻

Programming

Python, JavaScript, SQL, R, C++

🎯

Specialties

Annotation Systems, STEM-VQA, Distributed Training, Benchmark Evaluation

Experience

Where I've worked

Engineering Consultant — Multi-modal AI & Data
Technology Innovation Institute (TII) · Abu Dhabi, UAE (Remote)
Jan 2025 – Present
  • Architected production-grade ETL pipelines to crawl, deduplicate, and normalise internet-scale multi-modal datasets supporting Falcon-H training.
  • Processed 3M+ PDFs via OCR, layout parsing, and CV-based structured text extraction with multi-stage cleaning and deduplication.
  • Synthesized large-scale SFT datasets using GPT-4, Gemini, Claude, and Qwen to accelerate Falcon-H model alignment.
  • Unified data from 10+ agent platforms into multi-modal corpora with content filtering and quality-scoring pipelines.
  • Engineered a React-based annotation platform for segmentation and bounding-box labelling.
  • Trained large-scale VLMs on GCP and AWS across distributed Slurm clusters.
  • Built end-to-end VQA training and evaluation pipelines for STEM, charts, equations, and scientific plots.
Research Intern — Multi-modal AI
Technology Innovation Institute (TII) · Abu Dhabi, UAE
Apr 2024 – Jan 2025
  • Led research on multi-modal reasoning across speech, vision, and language, contributing to an ICCV 2025 publication.
  • Developed and integrated ASR, VQA, OCR, and LLM inference components into unified end-to-end pipelines.
  • Contributed to multiple arXiv publications on VLM evaluation.
Research Assistant — Visual Question Answering
Deakin University · Melbourne, Australia
Mar 2022 – Oct 2022
  • Ranked Top 9 worldwide in the Toloka VQA Challenge (WSDM Cup 2023).
  • Achieved Top 7 globally in the COVID Detection Challenge using 3D CT medical imaging.
  • Founded a university-wide AI competition at Deakin to grow the campus ML community.
Software Engineer (Part-time)
Stealth Startup · Singapore (Remote)
Aug 2020 – Nov 2021
  • Designed and delivered a full-stack multi-user annotation platform (Flask + React + Elasticsearch) with RESTful APIs.

Education

Academic background

PhD in Computer Science
Deakin University · Melbourne, Australia
Thesis: Designing Scalable and Interpretable Vision–Language–Speech Systems for Generalised Multi-modal Reasoning
Oct 2022 – Present
MSc in Data Science — GPA: 86%
Deakin University · Melbourne, Australia
Thesis: Speech-to-CDQL — Context Definition and Query Language from Natural Language for Smart Home
Mar 2020 – Mar 2022
BSc in Mathematics
University of Education Hue · Vietnam
Sep 2014 – Sep 2018