Cluster Analysis

Cluster Analysis using ArcGIS


(Note: Several widgets sit at the bottom of the map application. A user can choose which cluster layers to display from the "Layer List". When two layers are selected, the "Swipe" widget lets the user slide left and right to compare them. Users can also bring in additional public data for their own context with the "Add Data" widget.)

Insights from Analysis

The above is an interactive ArcGIS web map application showing a cluster analysis of where criminal incidents occurred in Washington, D.C. from January 1, 2010 to December 31, 2019. Anyone can interact with the map by zooming into neighborhoods of the city and then using the time scale at the bottom right to see how crime has evolved in each area. The neighborhoods are also clickable, and each one shows a brief confidence interval breakdown that indicates how likely an area is to be more dangerous than others.

Viewed at a high level from the center of the city, it appears that for most of the past ten years theft involving a vehicle, such as an electric scooter or car (blue clusters), was on the rise. As 2019 progressed, however, vehicle thefts decreased, while theft without a vehicle, such as stealing phones, wallets, etc. (red clusters), increased.

If we zoom into the campus of Georgetown University, we can see that theft (primarily red clusters) was an issue throughout 2018 and into 2019. What's interesting, however, is that 2019 shows significantly fewer red clusters than 2018, which may indicate that local law enforcement has prioritized making the campus safer. By comparison, George Washington University has remained significantly safer than Georgetown, with almost all white clusters indicating very few crime incidents. Hopefully, local law enforcement is aware of these trends and can continue to patrol these areas or notify people parked around campus to be aware of their surroundings.

Data for Predictive Modeling

Feature Importance

Feature importance refers to techniques that assign a score to each input feature based on how useful it is at predicting a target variable. There are many types and sources of feature importance scores; popular examples include statistical correlation scores, coefficients calculated as part of linear models, decision trees, and permutation importance scores.
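As a rough illustration of the first kind of score mentioned above, the sketch below ranks features by the absolute Pearson correlation between each feature and the target. It uses only the Python standard library and synthetic stand-in data. The names `signal`, `noise`, `pearson`, and `correlation_scores` are hypothetical and chosen for this example; nothing here comes from the DC Crime Incident dataset or the analysis linked on this page.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def correlation_scores(features, target):
    """Score each named feature by its absolute correlation with the target."""
    return {name: abs(pearson(values, target)) for name, values in features.items()}


# Synthetic stand-in data (assumption: NOT the DC crime data).
rng = random.Random(0)
n = 200
signal = [rng.uniform(0, 10) for _ in range(n)]       # drives the target
noise = [rng.uniform(0, 10) for _ in range(n)]        # unrelated to the target
target = [2.0 * s + rng.gauss(0, 1) for s in signal]  # target depends on signal only

scores = correlation_scores({"signal": signal, "noise": noise}, target)

# Print features from most to least useful according to this score.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

A correlation score only captures linear, one-feature-at-a-time relationships, so in practice it is usually a first pass before model-based scores such as tree importances or permutation importance.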

Feature importance scores play a vital role in a predictive modeling project: they provide insight into the data and a basis for feature selection, which can improve the efficiency and effectiveness of a predictive model on the problem. The next section of the crime analysis includes a brief statistical test of how the DC Crime Incident features could be used for predictive modeling, and shows why data visualization is important for understanding the results.

View Feature Importance