Stop and Frisk: Spatial Analysis of Racial Differences

In my last post, I compiled and cleaned publicly available data on over 4.5 million stops over the past 11 years. I also presented preliminary summary statistics showing that Blacks had been consistently stopped 3-6 times more often than Whites over the last decade in NYC. Since the last post, I managed to clean and reformat the …

Stop and Frisk: Blacks stopped 3-6 times more than Whites over 10 years

The NYPD provides publicly available data on stop and frisks with data dictionaries, located here. The data, ranging from 2003 to 2014, contains information on over 4.5 million stops. Several variables such as the age, sex, and race of the person stopped are included. I wrote some R code to clean and compile the data …

Modeling Ebola Contagion Using Airline Networks in R

I first became interested in networks when reading Matthew O. Jackson's 2010 paper describing their application to economics. During the 2014 Ebola outbreak, there was a lot of concern over the disease spreading to the U.S. I was caught up with work and classes at the time, but decided to use airline flight data to at least explore the question. The source …

NYC Motor Vehicle Collisions – Street-Level Heat Map

In this post I will extend a previous analysis that created a borough-level heat map of NYC motor vehicle collisions. The data is from NYC Open Data. In particular, I will go from borough-level to street-level collisions. The code follows much the same process as the previous analysis, with a few more functions that map streets to colors. …

Simulating Endogeneity

Introduction. The topic of this post is endogeneity, which can severely bias regression estimates. I will specifically simulate endogeneity caused by an omitted variable. In future posts in this series, I'll simulate other specification issues such as heteroskedasticity, multicollinearity, and collider bias. The Data-Generating Process. Consider the data-generating process (DGP) of some outcome variable $Y$: …
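The omitted-variable idea in the excerpt above can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the post's own code or DGP (which is truncated here): the linear form Y = beta_x*X + beta_z*Z + noise, the coefficient values, the parameter `rho` linking X to Z, and the helper names `ols_slope` and `simulate` are all hypothetical, and Python's standard library is used purely for illustration.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


def simulate(n=20000, beta_x=1.0, beta_z=2.0, rho=0.5, seed=0):
    """Return (short, long) estimates of the coefficient on X.

    Assumed DGP (hypothetical): X = rho*Z + u, Y = beta_x*X + beta_z*Z + e,
    with Z, u, e independent standard normal draws.
    """
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [rho * zi + rng.gauss(0, 1) for zi in z]
    y = [beta_x * xi + beta_z * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]

    # "Short" regression: Z is omitted, so X absorbs part of Z's effect.
    short = ols_slope(x, y)

    # "Long" regression coefficient on X, obtained by partialling Z out of X
    # (Frisch-Waugh-Lovell) and regressing Y on the residual.
    gamma = ols_slope(z, x)
    x_resid = [xi - gamma * zi for xi, zi in zip(x, z)]
    long = ols_slope(x_resid, y)
    return short, long


if __name__ == "__main__":
    short, long = simulate()
    print(f"Z omitted:  {short:.3f}")
    print(f"Z included: {long:.3f}")
```

Under these assumed values, the estimate with Z omitted should drift toward beta_x + beta_z * Cov(X, Z) / Var(X) = 1 + 2 * 0.5 / 1.25 = 1.8, while the estimate that controls for Z should stay near the true value of 1.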

Visualizing Hubway Trips in Boston

Most Popular Hubway Stations (in order):
1. Post Office Sq. - located in the heart of the financial district.
2. Charles St. & Cambridge - the first Hubway stop after crossing from Cambridge over Longfellow Bridge.
3. Tremont St & West - east side of the Boston Common.
4. South Station
5. Cross St. & Hanover - entrance to North End coming from financial …