Thái Hoài An

Data Science Student

I’m a Data Science student who enjoys turning messy data and unclear problems into measurable ML outcomes.

My work spans NLP (RAG, IE/NER), time-series forecasting, and computer vision, with an emphasis on clean experimentation, benchmarking, and clear communication of results. I’ve built demo apps to showcase workflows end-to-end and validate ideas quickly. I’ve also participated in competitions and research, including award-winning work on Vietnamese fake-news detection. I’m looking for internship opportunities where I can contribute to real products and iterate fast with a team.

About Me

Education

BSc in Data Science (Expected 2027), University of Economics Ho Chi Minh City (UEH), 2023-Present
GPA: 3.7/4.0

Career Goals

My short-term goal is to join a data/AI team where I can work with real-world datasets, build end-to-end ML solutions, and learn from strong mentorship. In the long term, I plan to pursue a Master’s degree abroad and continue developing impactful AI applications.

Strengths

  • Skilled in coding, algorithm exploration, and data handling across team and personal projects
  • Proactive in learning new technologies and sharing knowledge through technical blogging
  • Strong logical thinking with a focus on efficiency and practical solutions
  • Confident in team leadership, task coordination, and delivering clear technical presentations
  • Continuously improving personal learning methods to boost performance and adaptability

Education

Bachelor of Science in Data Science

University of Economics Ho Chi Minh City (UEH)

2023 - Present

Academic Achievements

  • GPA: 3.7/4.0
  • Merit-based Scholarship for Academic Excellence - Semester 2

Relevant Coursework

Data Structures and Algorithms
Database
Econometrics
Artificial Intelligence
Data Science
Mathematical Statistics
Data Mining
Machine Learning
Data Analytics Programming
Data Visualization
Big Data and Applications
NLP

High School Diploma

Lê Quý Đôn High School for the Gifted, Ninh Thuận Province

2020 - 2023

Academic Achievements

  • Consolation Prize - National Excellent Student Contest in Chemistry, Vietnam (2023)

Work & Research

A gallery or list view that lets recruiters scan quickly and lets interested readers dig into the details. Filter by type, switch views flexibly, and sort by recency or impact.

AI Viet Nam – AIO25 Program
AI/ML
2025–2026

Structured training track in AI/ML with hands-on assignments

Participated in the AIO25 program to strengthen fundamentals and applied skills in AI/ML through structured modules and practical exercises.

Learner

Outcome

  • Completed/ongoing modules with practical assignments and iterative improvement
Machine Learning
Practice
Coursework
Status: ongoing

Breast Cancer Ultrasound CAD (Segmentation + Classification)
Project
Computer Vision
2025

Multi-task learning for medical image segmentation and diagnosis support

Computer vision project on BUSI ultrasound images comparing multi-task vs sequential pipelines. Implemented U-Net with EfficientNet encoder and evaluated segmentation (Dice/IoU) alongside classification performance.

Developer
Research

Outcome

  • Segmentation reached Dice 0.7648 and IoU 0.6233 on BUSI
  • Multi-task setup improved classification accuracy to 0.853 (vs 0.620 sequential)
PyTorch
U-Net
EfficientNet
OpenCV
Medical Imaging
Status: completed

Vietnamese Fake News Detection (BiLSTM vs PhoBERT vs LLM)
Research
AI/ML
2025

Benchmarking classical deep learning and modern LLM prompting for Vietnamese news

Faculty-level research project on Vietnamese fake news detection using ReINTEL. Benchmarked BiLSTM, PhoBERT (frozen/fine-tuned), and LLM prompting; reported Accuracy/Macro-F1/AUC and performed error analysis and deployment trade-off review.

Researcher
Developer

Outcome

  • Best model achieved Accuracy 0.963, Macro-F1 0.929, AUC 0.980 on test set
  • Produced benchmarking report with error analysis and inference trade-offs (latency/VRAM)
PyTorch
Transformers
PhoBERT
BiLSTM
Evaluation
Status: completed

Vietnam Weather Analytics & Forecasting
Project
Data Analytics
2025

Time-series features, dashboards, and weather prediction

End-to-end data analytics project on Vietnam weather data: cleaning, visualization dashboards, and ML-based forecasting/classification. Delivered a Streamlit demo to explore patterns and model outputs.

Developer
Analyst

Outcome

  • Binary rain vs. no-rain classification achieved Accuracy 0.836 and F1 (Rain) 0.887
  • Delivered interactive dashboard for exploration and model interpretation
Python
scikit-learn
Streamlit
Visualization
Status: completed

VN Bank Stock Analytics
Project
AI/ML
2025

Time-series forecasting and decision support with multi-source data

Data Mining project building a multi-source analytics pipeline for Vietnamese banking stocks. Performed feature engineering for time-series, trained models for return regression and risk forecasting, and summarized insights for investment decision support.

Developer
Analyst

Outcome

  • Return regression achieved MAE 0.0938 and RMSE 0.1189
  • Risk forecasting achieved a correlation of 0.9808 between predicted and actual values
Python
XGBoost
Time-series
Feature Engineering
Status: completed

Vietnamese Medical IE Pipeline (NER + Relation Extraction)
Project
NLP
2025

Information extraction workflow with labeling, training, and evaluation

NLP project delivering an end-to-end IE workflow for Vietnamese medical text, including dataset labeling (Label Studio), model training, and evaluation. Explored semi-supervised hybrid relation extraction to improve performance under limited annotations.

Developer
Research

Outcome

  • Semi-supervised hybrid RE achieved Accuracy 0.8125 and Macro-F1 0.6309
  • Established labeling-to-evaluation pipeline for repeatable experimentation
Python
Label Studio
NER
Relation Extraction
BERT
Status: completed

Top 3 – Humanitarian Logistics Hackathon
Competition
Logistics
2025

Smart surplus-food allocation for underserved communities

Collaborated with a cross-university team to build a logistics solution combining data management, ML allocation, and IoT warehouse tracking to reduce food waste.

Product
Research

Outcome

  • Top 3 finalist across HCMC universities
  • Proposed an ML-driven allocation approach to reduce surplus mismatch
Hackathon
Machine Learning
IoT
Teamwork
Status: completed

MeetMate – AI Meeting Intelligence Platform
Competition
AI/ML
2025

Real-time meeting understanding, decision tracking, and post-meeting action automation

MeetMate is an AI-powered meeting intelligence platform developed during the VNPT AI Hackathon. The project addresses a common enterprise pain point: meetings consume significant time but often fail to produce clear decisions, accountable action items, and traceable outcomes. MeetMate integrates automatic speech recognition, real-time context understanding, and post-meeting summarization to transform unstructured conversations into structured minutes, decisions, and follow-up tasks. The solution focuses on enterprise readiness, emphasizing accuracy, explainability, and workflow integration rather than simple note-taking.

Product Owner
System Designer

Outcome

  • Built an end-to-end AI meeting workflow from live transcription to structured post-meeting minutes
  • Designed decision and action-item extraction aligned with enterprise meeting templates
Hackathon
Speech Recognition
NLP
RAG
Product Design
Enterprise Workflow
Status: completed

Genetic Algorithm for Maximum Network Flow
Research
AI/ML
2024

AI-driven optimizer with interactive flow simulation

Course research project exploring evolutionary optimization for network flow. Built a visual simulator and enhanced genetic operators for faster, more stable convergence on directed graphs.

Team Lead
Developer

Outcome

  • Interactive simulator to visualize flow networks and GA iterations
  • Adaptive mutation and population seeding improved convergence stability
Python
PyQt5
Genetic Algorithm
Graph Theory
Status: completed

Certificates & Courses

AIO25 - Module 1

AI VIET NAM

Completed: 2025-06-30

Completed Module 1 of AIO Course 2025, covering Python, OOP, Data Structures, Advanced SQL, Git, LaTeX, and Linux, and built a RAG-based chatbot prototype.

Data Science & AI Program

AI VIET NAM

Status: In Progress

A comprehensive DS-AI program covering data analytics foundations (Python, SQL, BI tools) and AI engineering skills (ML/DL, Transformers, deployment). Designed to prepare learners for roles from Data Analyst to AI Engineer through hands-on projects and real-world applications.
