Highlights

Skills

  • Causal Inference
  • Machine Learning
  • Deep Learning
  • Natural Language Processing
  • Reinforcement Learning
  • Time Series Analysis

Programming Languages

  • Python (PyTorch, Scikit-learn, NumPy, Pandas, Matplotlib, NLTK)
  • R
  • MATLAB
  • SQL
  • HTML

Projects

Continuous Treatment Effects

This project aims to develop a novel hypothesis test for assessing the significance of continuous treatments in causal neural network models. (2024 Poster at Statistics in the Age of AI)

Sentiment Analysis: Review and Rating

This project applies sentiment analysis to online review texts from TripAdvisor, Google Maps, and Yelp, using the Python library NLTK to predict ratings and provide insights for businesses.

Causal Bandit Algorithm

This study reviews adaptive causal bandit algorithms, identifies HAC-UCB as the state of the art, and explores improvements for environments that lack the conditionally benign property.

State Space Models: Logistic Map

This project reviews the use of the Gibbs sampler for Bayesian inference in nonlinear state space models, demonstrating its effectiveness in handling nonlinearity and uncertainties in chaotic systems.

Machine Translation: English-French

This project reviews transformers and BERT in machine translation, comparing their performance in English-to-French translation against a vanilla transformer model.

Clustering Analysis: Conflicts in Africa

This report introduces a novel clustering method for Somali conflicts (2019–2023) using Word2Vec, t-SNE, and weighted location data, outperforming baseline models and providing insights for conflict resolution and prevention.

Covariance Hypotheses Testing

This project compares the power of two two-sample covariance testing methods, a random-matrix-theory-based approach and a graph-based method, in high-dimensional settings.

Linear Mixed Models: Lamb Weights

This project applies linear mixed models to lamb birth weight data, finding that restricted maximum likelihood estimation (RMLE) better supports random effects, shows consistent asymptotic behavior, and is preferable to MLE for this analysis.

LASSO and Adaptive LASSO: Diabetes

This project analyzes the diabetes dataset using Lasso and Adaptive Lasso regression, comparing models selected by AIC and BIC, and finding that Lasso with AIC results in the lowest MSE.

Convex Optimization: Likelihood Ratio Test

This project uses convex optimization to compute the empirical likelihood ratio test statistic for testing population means, with numerical solutions in MATLAB's CVX and insights into hypothesis testing through visualization of results across varying constraints.

Handwritten Digit Recognition: MNIST

This project demonstrates handwritten digit recognition (3, 4, and 5) from the MNIST database using singular value decomposition for matrix factorization to build and evaluate a classifier.

Hidden Markov Models: Sleep Stages

This project uses hidden Markov models to classify observation states, compute transition and emission matrices, and predict sleep stages for improved diagnosis efficiency.

Regression Analysis: Time Management and GPA

This project examines the time management behaviors of NTU students in the 2019 fall semester to explore their impact on academic performance.