I'm Matthew Wilchek
Data Scientist and Geographer
- Age 32
- Address Annandale, VA. 22003.
- E-mail mwilchek@gmu.edu
- Phone (eight.60) 944-8795
- Freelance Not Available
- Active TS/SCI-CI Poly
Experienced Data Scientist in the federal law enforcement and intelligence industry with practical experience leading and deploying machine learning, data mining, and data visualization solutions. Interested in continuing research for my Ph.D. program in Computer Science, with a focus on computer vision and shared perception systems.
Professional Skills
Portfolio
Graduate School Deep Learning Project that Attempts to Classify Facial Emotions
This project attempted to develop a proof of concept that classifies pictures of people into one of seven emotions: Anger, Contempt, Disgust, Fear, Happy, Sadness, and Surprise. After training a Convolutional Neural Network, we applied our model to images of ourselves, as in the example above. To view our approach, check out our presentation of the project here.
Graduate School Machine Learning Project that Attempts to Predict the Stock Market
This project attempted to develop a proof of concept to predict a stock's performance for the current day and the following day, based on its history and the assumption that news sentiment on a given day has a significant relationship with that performance. The above shows a web application developed in R to visualize the results of our project. To view our approach, check out our presentation of the project here.
Stock and News Web Scraping
To train a useful supervised machine learning algorithm, our project required sufficient data. We extracted stock data using an API from Yahoo Finance and news headline data from Reddit using PRAW (the Python Reddit API Wrapper). Once the data was collected each day, we ran a separate sentiment algorithm on the news data using R (that can be found here) and joined the results to the master dataset for the algorithm. To view our data collection script, click here.
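The join step described above can be sketched as follows. This is a minimal illustration in plain Python, not the project's actual script: the field names (`date`, `close`, `sentiment`), the helper `join_by_date`, and the sample values are all assumptions made up for the example.

```python
from datetime import date


def join_by_date(stock_rows, sentiment_rows):
    """Inner-join two lists of dicts on their 'date' key (hypothetical helper)."""
    sentiment_by_date = {row["date"]: row for row in sentiment_rows}
    joined = []
    for stock in stock_rows:
        sentiment = sentiment_by_date.get(stock["date"])
        if sentiment is None:
            continue  # skip days with no scored headlines
        joined.append({**stock, **sentiment})
    return joined


# Made-up illustrative values, not real market or sentiment data.
stock_rows = [
    {"date": date(2017, 3, 1), "close": 100.0},
    {"date": date(2017, 3, 2), "close": 101.5},
]
sentiment_rows = [
    {"date": date(2017, 3, 1), "sentiment": 0.25},
]

master = join_by_date(stock_rows, sentiment_rows)
```

An inner join is used here so that every row in the master dataset has both a price and a sentiment score; whether the original project dropped or filled days without headlines is not stated.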
Modeling
In order to account for changing sentiment on any given day from our previous sentiment algorithm, we needed to identify the most significant sentiment for the current and predicted following day for our primary models. To do so, we leveraged Python's scikit-learn library for several regression models, such as Random Forest Regressor, Neural Net Regressor, and Support Vector Machine Regressor. Click here to view a plot of one of our results from a single execution.
Finally, after identifying the most significant sentiment on the given day, we were able to refine our models for a better regression fit. As new data is collected and added to our master dataset, our train/test prediction model will be continually refined and its accuracy improved. To check out our final algorithm, click here.
This project investigates whether there is a linear relationship between global average temperature and time, using linear regression.
The team gathered climate data from Berkeley Earth and focused on global land temperatures for the analysis. Data is available from 1750 to 2013. Our analysis and climate predictions are presented through ArcGIS and an R Shiny web application. All source code is available in our GitHub repository here.
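As a rough illustration of the technique, the sketch below fits a straight line to (year, temperature) pairs with ordinary least squares and reports R^2. It is written in plain Python rather than the R used for the project, the function names are hypothetical, and the data points are synthetic placeholders, not Berkeley Earth values.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(xs, ys, intercept, slope):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic, perfectly linear placeholder data (assumption for the example).
years = [2000, 2001, 2002, 2003, 2004]
temps = [9.0, 9.1, 9.2, 9.3, 9.4]

intercept, slope = fit_line(years, temps)
fit_quality = r_squared(years, temps, intercept, slope)
```

With real data the points scatter around the line, so R^2 falls below 1 and the significance of the slope has to be checked separately.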
U.S. Temperatures
Using ArcGIS, I developed two interactive web map applications to help visualize how CONUS temperatures changed over the years in major cities. The shapefiles used for the applications can also be found in the GitHub repository here.
The two applications can be viewed below:
Analyzing Models and Interpreting Results
The 1975-2013 model is Y = -49.148211 + 0.048966X, with both the coefficient and intercept statistically significant at the alpha < 0.05 level. The overall model has a statistically significant F-statistic as well and is a good fit. The adjusted R^2 = 0.7813. While this R^2 is not as good as the first model's, it is still acceptable.
From an R^2 perspective, Model 1 has the best result. However, from an interpretation standpoint, the coefficient for time in Model 1 is close to 0 and thus shows just a small rise in temperature. This model does not provide a good fit to recent temperatures, while Model 4 does. To see our predictions for future temperatures using Model 4, check out our web application here.
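To make the quoted 1975-2013 equation concrete, the short sketch below evaluates it for a given year and converts the slope into a change per decade. Only the two coefficients come from the text above; the function name is hypothetical, and the temperature unit is not stated in the source, so the results are in whatever unit the model was fitted in.

```python
# Coefficients quoted in the text for the 1975-2013 model.
INTERCEPT = -49.148211
SLOPE = 0.048966  # temperature change per year


def predict_temperature(year):
    """Evaluate Y = INTERCEPT + SLOPE * X for a calendar year X."""
    return INTERCEPT + SLOPE * year


fitted_2013 = predict_temperature(2013)  # last year covered by the data
change_per_decade = SLOPE * 10           # about 0.49 units per ten years
```

Years outside 1975-2013 are extrapolations, so their predictions assume the linear trend continues unchanged.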
Learning R - Statistical Computing
This was one of my early assignments as I started my graduate degree in Data Science. The class focused on learning the epicycle of data analysis and using R Statistical Computing to strengthen quantitative analysis skills. To see the results of my analysis, click here.
In this assignment I performed an exploratory data analysis on a population of women who were at least 21 years old, of Pima Indian heritage, and living near Phoenix, Arizona, and who were tested for diabetes according to World Health Organization criteria. The data was collected by the US National Institute of Diabetes and Digestive and Kidney Diseases. It comprises 532 complete records after dropping the (mainly missing) data on serum insulin. More information on the dataset can be found here.
User Account Management for Applications
https://github.com/mwilchek/Application-Account_v4/tree/master/src
This was one of my early Java programs I developed as I was learning Computer Science. For this assignment I created a Java application that leveraged JavaFX for a graphical user interface (GUI). I also leveraged various data structures such as Heap, Priority Queue, Binary Search Tree, and Weighted Graph. For this particular assignment, I also integrated some of the Google API Geo-coding files as well as an Ant Colony Optimization (ACO) Algorithm to increase performance and create more precise directional paths.
The application can also save data locally through a .dat binary file and re-load the same information during a later execution. Feel free to check out the source code to learn more.
Java Programming - Shortcut to Your Favorites
https://github.com/mwilchek/CSC_Homework3/blob/master/OpenFavorites.java
This was one of the early Java programs I developed as I was learning Computer Science. While this program does not have much to do with data analysis, I wanted to include it on my site to share with everyone because it is quite handy.
The first time it is executed, this program prompts the user to paste in the URLs of their favorite websites. It then opens their default web browser with all of the sites the user listed. The next time the user runs the program, it will have saved the first list and can open the favorite sites again immediately.
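The behavior described above can be sketched in a few lines. This is a hedged Python illustration of the idea, not a port of the linked Java source: the JSON storage format, the function names, and the space-separated prompt are all assumptions.

```python
import json
import webbrowser
from pathlib import Path


def load_favorites(path):
    """Return the saved URL list, or an empty list on the first run."""
    path = Path(path)
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def save_favorites(path, urls):
    """Persist the URL list so later runs can skip the prompt."""
    Path(path).write_text(json.dumps(urls, indent=2), encoding="utf-8")


def open_favorites(path, opener=webbrowser.open_new_tab, prompt=input):
    """Prompt for URLs on the first run, then open every saved URL."""
    urls = load_favorites(path)
    if not urls:
        raw = prompt("Paste your favorite URLs separated by spaces: ")
        urls = raw.split()
        save_favorites(path, urls)
    for url in urls:
        opener(url)  # default opener uses the system's default browser
    return urls
```

Calling `open_favorites("favorites.json")` prompts once and reuses the saved list afterwards; the `opener` and `prompt` parameters exist only so the function can be exercised without a browser or keyboard input.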
Work Experience
DHS - Immigration and Customs Enforcement (ICE)
Senior Operations Research Analyst (Federal Position - Permanent)
Support the Systems and Operational Analysis Unit under ICE's Field Operations by designing and editing advanced statistical programs and simulation models using machine learning in R and Python.
U.S. Army - DEVCOM
Data Scientist (Federal Position - Permanent)
Develop augmented reality (AR) experiences with Computer Vision AI for the Integrated Visual Augmentation System (IVAS) (Microsoft HoloLens) for soldiers who command and control various air systems.
U.S. Census Bureau - Geography Division
Geographer (Federal Position - Term)
Supported a research and development branch with updating and innovating Big Data, while improving data visualization methodologies leveraging various programming languages.
USDA - Foreign Agricultural Service
Web/GIS Developer (Federal Position - Term)
Acted as the primary technology liaison/business analyst for the Disaster Assistance division's Office of the Director. Developed and co-authored multiple web GIS Story Map applications used to showcase international projects by the agency.
Northrop Grumman
Web Developer / Intel Analyst
Held multiple positions ranging from web development for classified projects/clients to researching and analyzing competitive business intelligence to support business development and procurement.
Northrop Grumman
Intern
Assisted Senior Solution Architects with the development of open-source intelligence (OSINT) analytical software by performing operational research, planning analysis, and financial intelligence (FININT) analysis such as predictive analytics.
U.S. Treasury - Financial Crimes Enforcement Network (FinCEN)
Intel Analyst Intern
Studied trends and patterns and assisted senior intelligence analysts in identifying intelligence gaps in global financial transaction activities, in support of the prevention of financial crimes, including cryptocurrency trends such as Bitcoin.
Education
Ph.D. in Computer Science
Virginia Polytechnic Institute and State University
Master of Science in Data Science
George Washington University
1 Week Course in Project Management for the PMP Certification
Project Management Institute (PMI)
Associate of Science in Computer Science
Northern Virginia Community College
Bachelor of Arts in International Affairs
George Mason University
My Interests
During my free time, I like to do side projects in machine learning and penetration testing.
I also enjoy traveling and computer gaming, and I have snowboarded since I was little.
- Mountain Biking
- Finishing Grad School
- Skateboarding
- Snowboarding
- Computer Gaming
- Penetration Testing
- Rock Climbing
Contact Me
Feel free to contact me
- Address Falls Church, VA. 22042
- phone (eight.60) 944-8795
- E-mail mwilchek@gmu.edu