Hello!

Welcome to my portfolio, and thank you for your interest! To get to know me better, please see the About section or the Blog for more on my background, experience, and why I would be a great fit for your team. Below you can find projects demonstrating my data science skill set. You can connect with me on LinkedIn or send me an email; the logos at the bottom of the page also link to my email, GitHub, and LinkedIn. Thanks!

Project Portfolio

Time Series Forecasting with TensorFlow, ARIMA, and Prophet

May 2021

Prophet Forecast

Technologies used: Python, TensorFlow, LSTM, ARIMA, Prophet

Uses a multivariate weather dataset and a retail sales dataset. Demonstrates the entire time series analysis workflow (data exploration, cleaning, preprocessing, windowing, and predictions). Additionally applies the ARIMA model and the Facebook Prophet model for more advanced forecasting.
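
As a rough illustration of the windowing step in that workflow, here is a minimal sketch on a toy series. It is not the project's actual code; the function name `make_windows` and its parameters `input_width` and `horizon` are hypothetical names chosen for this example.

```python
import numpy as np


def make_windows(series, input_width, horizon=1):
    """Split a 1-D series into (inputs, targets) pairs for supervised forecasting.

    Each input row holds `input_width` consecutive values; the matching
    target row holds the `horizon` values that immediately follow them.
    """
    series = np.asarray(series, dtype=float)
    n_windows = len(series) - input_width - horizon + 1
    if n_windows <= 0:
        raise ValueError("series is too short for the requested window sizes")

    inputs = np.stack(
        [series[i:i + input_width] for i in range(n_windows)]
    )
    targets = np.stack(
        [series[i + input_width:i + input_width + horizon] for i in range(n_windows)]
    )
    return inputs, targets


# Toy series 0..9 (an assumption for illustration, not project data).
toy_series = np.arange(10)
X, y = make_windows(toy_series, input_width=3, horizon=1)
print(X.shape, y.shape)
```

Each row of `X` can then be fed to a model such as an LSTM, with the matching row of `y` as the value to predict.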

TensorFlow Natural Language Processing and Optical Character Recognition

May 2021

NLP Workflow

Technologies used: Python, TensorFlow, NLP, LSTM, Transfer Learning, OCR, ktrain, pytesseract

Uses the popular IMDB Reviews dataset containing 50,000 movie reviews labeled positive or negative. Demonstrates the entire natural language processing workflow (data loading, preprocessing, tokenization, model tuning, and predictions). Additionally applies transfer learning with the BERT model and the ktrain workflow. Includes optical character recognition and sentiment analysis on the extracted text.

TensorFlow CIFAR-10 Image Classification

May 2021

TensorFlow CIFAR-10 Learning Curves

Technologies used: Python, TensorFlow, Keras, Dropout Regularization, Batch Normalization, Data Augmentation, Model Optimization

Uses the popular CIFAR-10 dataset containing 60,000 32x32 color images from 10 classes. Demonstrates the entire computer vision workflow (data loading, preprocessing, model tuning, and predictions).

DrivenData.org Competition - Predicting H1N1 and Influenza Vaccinations

February 2021

DrivenData Competition - Predicting H1N1 and Influenza Vaccinations

Technologies used: Python, scikit-learn, Logistic Regression, OneVsRest Classifier

DrivenData.org open data science competition to predict whether an individual would receive the H1N1 or seasonal flu vaccine based on various behavioral and demographic features. Major challenges and learnings included working with a multi-label dataset to predict the likelihood of multiple outcomes, as well as creating a preprocessing pipeline that handles entirely unseen data. The model placed in the top 20% of all submissions.
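
To show the general shape of a multi-label setup with a preprocessing pipeline, here is a minimal sketch using scikit-learn's `OneVsRestClassifier` around `LogisticRegression`. It is not the competition submission: the tiny dataset, the column names `age_group` and `concern_level`, and the two label columns are all made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical training data: one categorical and one numeric feature.
train = pd.DataFrame({
    "age_group": ["18-34", "35-54", "55+", "18-34", "55+", "35-54"],
    "concern_level": [0, 2, 3, 1, 3, 1],
})
# Two binary labels per row (e.g. vaccine A, vaccine B); values are invented.
labels = np.array([
    [0, 0],
    [1, 1],
    [1, 1],
    [0, 1],
    [1, 1],
    [0, 0],
])

preprocess = ColumnTransformer([
    # handle_unknown="ignore" encodes categories never seen in training as all zeros.
    ("categorical", OneHotEncoder(handle_unknown="ignore"), ["age_group"]),
    ("numeric", StandardScaler(), ["concern_level"]),
])

model = Pipeline([
    ("preprocess", preprocess),
    # One logistic regression is fitted per label column.
    ("classifier", OneVsRestClassifier(LogisticRegression(max_iter=1000))),
])
model.fit(train, labels)

# An unseen row containing a category that was absent from the training data.
unseen = pd.DataFrame({"age_group": ["65+"], "concern_level": [2]})
probabilities = model.predict_proba(unseen)
print(probabilities.shape)
```

Keeping the encoder and scaler inside the pipeline means the same fitted transformations are applied to new data, which is one common way to cope with unseen rows.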

BrainStation Teaching Assistant Lesson - Hacker Statistics

January 2021

Hacker Statistics results

Technologies used: Python, numpy, pandas, matplotlib

A simple lesson I created and delivered for the cohort, demonstrating the central limit theorem and the power of using computers to solve complex problems. It gives an overview of the foundational skill set used in data analysis and machine learning. The lesson was inspired by a DataCamp course I took to practice these skills.
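
For readers unfamiliar with the idea, here is a minimal simulation sketch of the central limit theorem with numpy. It is not the lesson material itself; the exponential population, the seed, and the sample sizes are arbitrary choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# A clearly non-normal (right-skewed) population.
population = rng.exponential(scale=2.0, size=100_000)

sample_size = 50    # observations per sample
n_samples = 5_000   # number of repeated samples

# Draw many samples (with replacement) and record each sample's mean.
sample_means = rng.choice(population, size=(n_samples, sample_size)).mean(axis=1)

# The sample means should center on the population mean ...
print(population.mean(), sample_means.mean())
# ... and their spread should be close to sigma / sqrt(n).
print(population.std() / np.sqrt(sample_size), sample_means.std())
```

Plotting a histogram of `sample_means` with matplotlib would show a roughly bell-shaped curve even though the population itself is skewed.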

BrainStation Diploma Capstone Project - MyGamePass

November 2020

MyGamePass title slide

Technologies used: Python, scikit-learn, NLTK, surprise, FunkSVD

Recommendation system for video games. Explored both collaborative filtering and content-based filtering by cosine similarity. Required extensive cleaning, EDA, and preprocessing, as well as matrix factorization, natural language processing, and machine learning.
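
To illustrate content-based filtering by cosine similarity, here is a minimal numpy sketch. It is not the capstone code: the function names and the tiny genre-tag matrix are hypothetical, and a real system would derive its feature vectors from game descriptions or tags.

```python
import numpy as np


def cosine_similarity_matrix(features):
    """Return the pairwise cosine similarity between the rows of `features`."""
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero rows
    unit_rows = features / norms
    return unit_rows @ unit_rows.T


def most_similar(item_index, features, top_n=2):
    """Return the indices of the `top_n` items most similar to `item_index`."""
    similarities = cosine_similarity_matrix(features)[item_index].copy()
    similarities[item_index] = -np.inf  # never recommend the item itself
    return np.argsort(similarities)[::-1][:top_n].tolist()


# Hypothetical tag vectors (columns: action, puzzle, multiplayer).
game_features = [
    [1, 0, 1],  # game 0
    [1, 0, 0],  # game 1
    [0, 1, 0],  # game 2
    [1, 0, 1],  # game 3
]

recommendations = most_similar(0, game_features, top_n=2)
print(recommendations)
```

Here game 3 shares every tag with game 0 and game 1 shares one of them, so those two rank ahead of game 2, which shares none.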