Hello!
Welcome to my portfolio, and thank you for your interest! To learn more about me, see the About section or the Blog, which cover my background, my experience, and why I would be a great fit for your team. Below you will find projects that demonstrate my data science skill set. You can connect with me on LinkedIn or send me an email; the logos at the bottom of the page also link to my email, GitHub, and LinkedIn. Thanks!
Project Portfolio
Time Series Forecasting with TensorFlow, ARIMA, and Prophet
May 2021
Technologies used: Python, TensorFlow, LSTM, ARIMA, Prophet
Uses a multivariate weather dataset and a retail sales dataset. Demonstrates the entire time series analysis workflow (data exploration, cleaning, preprocessing, windowing, and prediction). Also applies the ARIMA model and the Facebook Prophet model to the forecasting task.
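As a rough illustration of the windowing step mentioned above (a minimal sketch, not the project's actual code), a series can be split into fixed-width input windows paired with the value to predict. The function name `make_windows`, its parameters, and the sample numbers are all made up for this example.

```python
def make_windows(series, input_width, horizon=1):
    """Split a sequence into (inputs, target) pairs for supervised forecasting.

    input_width: number of past steps fed to the model.
    horizon: how many steps after the window the target lies (1 = next step).
    """
    windows = []
    last_start = len(series) - input_width - horizon
    for start in range(last_start + 1):
        inputs = series[start:start + input_width]
        target = series[start + input_width + horizon - 1]
        windows.append((inputs, target))
    return windows


# Hypothetical daily sales figures, invented for illustration only.
sales = [10, 12, 13, 15, 18, 21]
pairs = make_windows(sales, input_width=3)
for inputs, target in pairs:
    print(inputs, "->", target)
```

Each pair can then be fed to a model such as an LSTM, with the window as the input sequence and the target as the label. A series shorter than `input_width + horizon` simply yields no windows.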
TensorFlow Natural Language Processing and Optical Character Recognition
May 2021
Technologies used: Python, TensorFlow, NLP, LSTM, Transfer Learning, OCR, ktrain, pytesseract
Uses the popular IMDB Reviews dataset of 50,000 movie reviews labeled positive or negative. Demonstrates the entire natural language processing workflow (data loading, preprocessing, tokenization, model tuning, and prediction). Also applies transfer learning using the BERT model and ktrain. Includes optical character recognition and sentiment analysis on the extracted text.
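To give a feel for the tokenization step in that workflow, here is a small pure-Python sketch that maps words to integer ids and pads each review to a fixed length. It is a simplified stand-in for a library tokenizer, not the project's code; the helper names, the regular expression, the reserved ids (0 for padding, 1 for unknown words), and the sample reviews are assumptions made for this example.

```python
import re
from collections import Counter

PAD_ID = 0  # reserved id used to pad short reviews
OOV_ID = 1  # reserved id for words not in the vocabulary


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocab(texts, max_words):
    """Assign ids (starting at 2) to the most frequent words."""
    counts = Counter(word for text in texts for word in tokenize(text))
    return {word: i + 2 for i, (word, _) in enumerate(counts.most_common(max_words))}


def encode(text, vocab, max_len):
    """Convert text to a fixed-length list of ids, truncating or padding at the end."""
    ids = [vocab.get(word, OOV_ID) for word in tokenize(text)][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


# Made-up reviews, for illustration only.
reviews = ["A great movie", "Not a great ending"]
vocab = build_vocab(reviews, max_words=100)
encoded = encode("A great surprise", vocab, max_len=5)
print(encoded)
```

Padding at the end is a choice made here for readability; some libraries pad at the start by default.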
TensorFlow CIFAR-10 Image Classification
May 2021
Technologies used: Python, TensorFlow, Keras, Dropout Regularization, Batch Normalization, Data Augmentation, Model Optimization
Uses the popular CIFAR-10 dataset of 60,000 32x32 color images from 10 classes. Demonstrates the entire computer vision workflow (data loading, preprocessing, model tuning, and prediction).
DrivenData.org Competition - Predicting H1N1 and Seasonal Flu Vaccinations
February 2021
Technologies used: Python, scikit-learn, Logistic Regression, OneVsRest Classifier
DrivenData.org open data science competition to predict whether an individual received the H1N1 or seasonal flu vaccine based on behavioral and demographic features. Major challenges and lessons included working with a multi-label dataset to predict the likelihood of multiple outcomes and building a preprocessing pipeline that handles entirely unseen data. The model placed in the top 20% of all submissions.
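The two ideas above (multi-label prediction and preprocessing that tolerates unseen data) can be sketched with scikit-learn roughly as follows. This is an illustrative toy under stated assumptions, not the competition entry: the feature values, labels, and step names are invented, and the real dataset has many more columns.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Invented categorical features: [age_group, doctor_recommended]
X_train = [
    ["18-34", "no"],
    ["35-54", "yes"],
    ["55+", "yes"],
    ["18-34", "yes"],
    ["55+", "no"],
    ["35-54", "no"],
]
# One row per person, one column per label: [h1n1_vaccine, seasonal_vaccine]
Y_train = np.array([[0, 0], [1, 1], [0, 1], [1, 0], [0, 1], [0, 0]])

model = Pipeline([
    # handle_unknown="ignore" encodes categories never seen in training as all zeros
    ("encode", OneHotEncoder(handle_unknown="ignore")),
    # One independent logistic regression is fitted per label column
    ("classify", OneVsRestClassifier(LogisticRegression())),
])
model.fit(X_train, Y_train)

# "65+" did not appear in training; the pipeline still produces probabilities.
probs = model.predict_proba([["65+", "yes"]])
print(probs)  # one probability per label, in column order
```

Keeping the encoder inside the pipeline means the exact transformation fitted on the training data is reused at prediction time, which is what lets the model score rows containing new category values.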
BrainStation Teaching Assistant Lesson - Hacker Statistics
January 2021
Technologies used: Python, numpy, pandas, matplotlib
A simple lesson I created and delivered for the cohort, demonstrating the central limit theorem and the power of using computer simulation to solve complex problems. Gives an overview of the foundational skill set used in data analysis and machine learning. The lesson was inspired by a DataCamp course I took to practice these skills.
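A short simulation in the spirit of that lesson (a sketch written for this page, not the lesson material itself) shows the central limit theorem at work: averages of repeated die rolls cluster around the true mean even though a single roll is uniformly distributed. The function names, sample sizes, and random seed are arbitrary choices for this example.

```python
import random
import statistics

random.seed(42)  # fixed seed so repeated runs give the same numbers


def roll_die():
    """Simulate one roll of a fair six-sided die."""
    return random.randint(1, 6)


def sample_means(draw, sample_size, n_samples):
    """Return the mean of `sample_size` draws, repeated `n_samples` times."""
    return [
        statistics.fmean([draw() for _ in range(sample_size)])
        for _ in range(n_samples)
    ]


means = sample_means(roll_die, sample_size=50, n_samples=2000)

# Theory: the means centre on 3.5 with a spread of about 1.71 / sqrt(50), roughly 0.24.
print("mean of sample means:", statistics.fmean(means))
print("std of sample means: ", statistics.stdev(means))
```

Plotting a histogram of `means` (for example with matplotlib) gives the familiar bell shape, and increasing `sample_size` makes it narrower.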
BrainStation Diploma Capstone Project - MyGamePass
November 2020
Technologies used: Python, scikit-learn, NLTK, surprise, FunkSVD
Recommendation system for video games. Explores both collaborative filtering and content-based filtering using cosine similarity. Required extensive cleaning, EDA, and preprocessing, as well as matrix factorization, natural language processing, and machine learning.
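For readers unfamiliar with content-based filtering by cosine similarity, the idea can be sketched in a few lines: each game is represented as a feature vector, and the games whose vectors point in the most similar direction are recommended. The game titles, the three-element feature vectors, and the helper names below are invented for illustration and are not taken from the project.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(title, catalog, top_n=2):
    """Rank every other game in the catalog by similarity to `title`."""
    target = catalog[title]
    scores = [
        (other, cosine_similarity(target, vector))
        for other, vector in catalog.items()
        if other != title
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical feature vectors: [action, puzzle, multiplayer]
games = {
    "Game A": [1, 0, 1],
    "Game B": [1, 0, 0],
    "Game C": [0, 1, 0],
}

recommendations = most_similar("Game A", games)
for name, score in recommendations:
    print(f"{name}: {score:.3f}")
```

In practice the vectors would come from text features such as descriptions or tags rather than hand-written flags, but the ranking step works the same way.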