
Thái Hoài An
Data Science Student
I’m a Data Science student who enjoys turning messy data and unclear problems into measurable ML outcomes.
My work spans NLP (RAG, IE/NER), time-series forecasting, and computer vision, with an emphasis on clean experimentation, benchmarking, and clear communication of results. I’ve built demo apps to showcase workflows end-to-end and validate ideas quickly. I’ve also participated in competitions and research, including award-winning work on Vietnamese fake-news detection. I’m looking for internship opportunities where I can contribute to real products and iterate fast with a team.
About Me
Education
BSc in Data Science (Expected 2027), University of Economics Ho Chi Minh City (UEH), 2023-Present
GPA: 3.7/4.0
Career Goals
My short-term goal is to join a data/AI team where I can work with real-world datasets, build end-to-end ML solutions, and learn from strong mentorship. In the long term, I plan to pursue a Master’s degree abroad and continue developing impactful AI applications.
Strengths
- Skilled in coding, algorithm exploration, and data handling across team and personal projects
- Proactive in learning new technologies and sharing knowledge through technical blogging
- Strong logical thinking with a focus on efficiency and practical solutions
- Confident in team leadership, task coordination, and delivering clear technical presentations
- Continuously improving personal learning methods to boost performance and adaptability
Education
Bachelor of Science in Data Science
University of Economics Ho Chi Minh City (UEH)
2023 - Present
Academic Achievements
- GPA: 3.7/4.0
- Merit-based Scholarship for Academic Excellence - Semester 2
Relevant Coursework
Academic Achievements
- Consolation Prize - National Excellent Student Contest in Chemistry, Vietnam (2023)
High School Diploma
Lê Quý Đôn High School for the Gifted, Ninh Thuận Province
2020 - 2023
Work & Research
Gallery or list view that lets recruiters scan quickly and lets in-depth readers see the details. Filter by type, switch views flexibly, and sort by recency or impact.

AI Viet Nam – AIO25 Program
Structured training track in AI/ML with hands-on assignments
Participated in the AIO25 program to strengthen fundamentals and applied skills in AI/ML through structured modules and practical exercises.
Outcome
- Completed and ongoing modules with practical assignments and iterative improvement

Breast Cancer Ultrasound CAD (Segmentation + Classification)
Multi-task learning for medical image segmentation and diagnosis support
Computer vision project on BUSI ultrasound images comparing multi-task vs sequential pipelines. Implemented U-Net with EfficientNet encoder and evaluated segmentation (Dice/IoU) alongside classification performance.
Outcome
- Segmentation reached Dice 0.7648 and IoU 0.6233 on BUSI
- Multi-task setup improved classification accuracy to 0.853 (vs. 0.620 for the sequential pipeline)

Vietnamese Fake News Detection (BiLSTM vs PhoBERT vs LLM)
Benchmarking classical deep learning and modern LLM prompting for Vietnamese news
Faculty-level research project on Vietnamese fake news detection using the ReINTEL dataset. Benchmarked BiLSTM, PhoBERT (frozen and fine-tuned), and LLM prompting; reported Accuracy, Macro-F1, and AUC; and performed error analysis and a review of deployment trade-offs.
Outcome
- Best model achieved Accuracy 0.963, Macro-F1 0.929, AUC 0.980 on test set
- Produced benchmarking report with error analysis and inference trade-offs (latency/VRAM)

Vietnam Weather Analytics & Forecasting
Time-series features, dashboards, and weather prediction
End-to-end data analytics project on Vietnam weather data: cleaning, visualization dashboards, and ML-based forecasting/classification. Delivered a Streamlit demo to explore patterns and model outputs.
Outcome
- Binary Rain vs. Not Rain classification achieved Accuracy 0.836 and F1 (Rain class) 0.887
- Delivered interactive dashboard for exploration and model interpretation

VN Bank Stock Analytics
Time-series forecasting and decision support with multi-source data
Data Mining project building a multi-source analytics pipeline for Vietnamese banking stocks. Performed feature engineering for time-series, trained models for return regression and risk forecasting, and summarized insights for investment decision support.
Outcome
- Return regression achieved MAE 0.0938 and RMSE 0.1189
- Risk forecasting achieved correlation 0.9808 (predicted vs. actual)

Vietnamese Medical IE Pipeline (NER + Relation Extraction)
Information extraction workflow with labeling, training, and evaluation
NLP project delivering an end-to-end IE workflow for Vietnamese medical text, including dataset labeling (Label Studio), model training, and evaluation. Explored semi-supervised hybrid relation extraction to improve performance under limited annotations.
Outcome
- Semi-supervised hybrid RE achieved Accuracy 0.8125 and Macro-F1 0.6309
- Established labeling-to-evaluation pipeline for repeatable experimentation

Top 3 – Humanitarian Logistics Hackathon
Smart surplus-food allocation for underserved communities
Collaborated with a cross-university team to build a logistics solution combining data management, ML allocation, and IoT warehouse tracking to reduce food waste.
Outcome
- Top 3 finalist across HCMC universities
- Proposed ML-driven allocation reducing surplus mismatch

MeetMate – AI Meeting Intelligence Platform
Real-time meeting understanding, decision tracking, and post-meeting action automation
MeetMate is an AI-powered meeting intelligence platform developed during the VNPT AI Hackathon. The project addresses a common enterprise pain point: meetings consume significant time but often fail to produce clear decisions, accountable action items, and traceable outcomes. MeetMate integrates automatic speech recognition, real-time context understanding, and post-meeting summarization to transform unstructured conversations into structured minutes, decisions, and follow-up tasks. The solution focuses on enterprise readiness, emphasizing accuracy, explainability, and workflow integration rather than simple note-taking.
Outcome
- Built an end-to-end AI meeting workflow from live transcription to structured post-meeting minutes
- Designed decision and action-item extraction aligned with enterprise meeting templates

Genetic Algorithm for Maximum Network Flow
AI-driven optimizer with interactive flow simulation
Course research project exploring evolutionary optimization for network flow. Built a visual simulator and enhanced genetic operators for faster, more stable convergence on directed graphs.
Outcome
- Interactive simulator to visualize flow networks and GA iterations
- Adaptive mutation and population seeding improved convergence stability
Certificates & Courses

AIO25 - Module 1
AI VIET NAM
Completed: 2025-06-30
Completed Module 1 of the AIO Course 2025, covering Python, OOP, Data Structures, Advanced SQL, Git, LaTeX, and Linux, and built a RAG-based chatbot prototype.
Data Science & AI Program
AI VIET NAM
Status: In Progress
A comprehensive DS-AI program covering data analytics foundations (Python, SQL, BI tools) and AI engineering skills (ML/DL, Transformers, deployment). Designed to prepare learners for roles from Data Analyst to AI Engineer through hands-on projects and real-world applications.