Blocked Gibbs Sampling in R for Bayesian Multiple Linear Regression

In a previous post, I derived and coded a Gibbs sampler in R for estimating a simple linear regression. In this post, I will do the same for multiple linear regression. I will derive the conditional posterior distributions necessary for the blocked Gibbs sampler. I will then code the sampler and test it using simulated …

Bayesian Simple Linear Regression with Gibbs Sampling in R

Many introductions to Bayesian analysis use relatively simple didactic examples (e.g., making inference about the probability of success given Bernoulli data). While this makes for a good introduction to Bayesian principles, the extension of these principles to regression is not straightforward. This post will sketch out how these principles extend to simple linear regression. Along …

Fixed Effects, Random Effects, and First Differencing

I came across a Stack Overflow post the other day touching on first differencing and decided to write a quick review of the topic, as well as the related random effects and fixed effects methods. In the end, we'll see that random effects, fixed effects, and first differencing are primarily used to handle unobserved heterogeneity within a …
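
As a rough illustration of why these methods matter, here is a minimal sketch in R under assumptions of my own: a two-period panel in which an unobserved unit effect `alpha` is correlated with the regressor. The data-generating process, the coefficient 1.5, and all variable names are hypothetical and not taken from the post. Pooled OLS ignores `alpha`, while the within (fixed effects) and first-difference estimators both remove it. With exactly two periods and no intercept, those two estimators coincide.

```r
set.seed(7)
n_units <- 2000

# Hypothetical DGP: alpha is an unobserved, time-invariant unit effect
# that is correlated with x in both periods.
alpha <- rnorm(n_units)
x1 <- alpha + rnorm(n_units)
x2 <- alpha + rnorm(n_units)
y1 <- 1.5 * x1 + alpha + rnorm(n_units)
y2 <- 1.5 * x2 + alpha + rnorm(n_units)

panel <- data.frame(id = rep(seq_len(n_units), times = 2),
                    x = c(x1, x2),
                    y = c(y1, y2))

# Pooled OLS: ignores alpha, so the slope on x absorbs part of its effect.
pooled <- unname(coef(lm(y ~ x, data = panel))["x"])

# Fixed effects (within estimator): subtract each unit's own mean.
panel$x_w <- panel$x - ave(panel$x, panel$id)
panel$y_w <- panel$y - ave(panel$y, panel$id)
fe <- unname(coef(lm(y_w ~ 0 + x_w, data = panel))["x_w"])

# First differencing: period 2 minus period 1 cancels alpha.
dx <- x2 - x1
dy <- y2 - y1
fd <- unname(coef(lm(dy ~ 0 + dx))["dx"])

c(pooled = pooled, fixed_effects = fe, first_difference = fd)
```

Random effects is left out of the sketch on purpose: it assumes the unit effect is uncorrelated with the regressor, which this made-up DGP violates by construction.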

Exploring P-values with Simulations in R

The recent flare-up in discussions on p-values inspired me to conduct a brief simulation study. In particular, I wanted to illustrate just how p-values vary with different effect and sample sizes. Here are the details of the simulation. I simulated $latex n $ draws of my independent variable $latex X $: $latex X_n \sim N(100, 400)$ where $latex …
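
A simulation of this kind might be sketched as follows. Only the distribution of the independent variable comes from the excerpt, and I am assuming that the 400 in N(100, 400) is a variance (so the standard deviation is 20). The outcome model, the grid of sample sizes and effect sizes, and the function name `sim_pvalue` are hypothetical choices made for illustration.

```r
set.seed(42)

# p-value for the slope in one simulated regression of y on x.
# Assumptions (not from the post): N(100, 400) is read as variance 400,
# i.e. sd = 20, and the outcome model y = 10 + beta * x + noise is made up.
sim_pvalue <- function(n, beta) {
  x <- rnorm(n, mean = 100, sd = 20)
  y <- 10 + beta * x + rnorm(n, sd = 20)
  summary(lm(y ~ x))$coefficients["x", "Pr(>|t|)"]
}

# Every combination of sample size and effect size.
grid <- expand.grid(n = c(20, 100, 500), beta = c(0, 0.1, 0.5))

# Median p-value over 200 replications per combination.
grid$median_p <- mapply(
  function(n, beta) median(replicate(200, sim_pvalue(n, beta))),
  grid$n, grid$beta
)

grid
```

For a fixed nonzero effect, the median p-value should shrink as the sample size grows. With no effect at all, it should stay near 0.5 regardless of sample size.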

Stop and Frisk: Blacks stopped 3-6 times more than Whites over 10 years

The NYPD provides publicly available data on stop and frisks, along with data dictionaries. The data, ranging from 2003 to 2014, contains information on over 4.5 million stops. Several variables such as the age, sex, and race of the person stopped are included. I wrote some R code to clean and compile the data …

Simulating Endogeneity

Introduction: The topic in this post is endogeneity, which can severely bias regression estimates. I will specifically simulate endogeneity caused by an omitted variable. In future posts in this series, I'll simulate other specification issues such as heteroskedasticity, multicollinearity, and collider bias. The Data-Generating Process: Consider the data-generating process (DGP) of some outcome variable $latex Y $: …
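
To make the omitted-variable case concrete, here is a minimal sketch in R. The DGP below is hypothetical (the post's own DGP is cut off in the excerpt): `y` depends on both `x` and `z`, and `x` is correlated with `z`. Leaving `z` out of the regression pushes its effect into the error term, which is then correlated with `x`.

```r
set.seed(1)
n <- 10000

# Hypothetical DGP, not the one from the post:
# z drives both x and y, so omitting z makes x endogenous.
z <- rnorm(n)
x <- 0.8 * z + rnorm(n)
y <- 1 + 2 * x + 3 * z + rnorm(n)

full  <- lm(y ~ x + z)  # correctly specified model
short <- lm(y ~ x)      # omits z

# Compare the estimated slope on x with the true value of 2.
c(true = 2,
  full = unname(coef(full)["x"]),
  short = unname(coef(short)["x"]))
```

Because `z` raises both `x` and `y` in this setup, the short regression should overstate the slope on `x`, and a larger sample does not remove that bias.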

Using Markov Chains to Model Mortgage Defaults in R

The goal of this post is to blend the material I've been learning in my night class with my day job and R. If we have some object that switches between states over time according to fixed probabilities, we can model the long-term behavior of this object using Markov chains*. A good example is a mortgage. At any …
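
As a rough sketch of the idea in R, consider a hypothetical three-state mortgage chain. The state names and every transition probability below are made up for illustration and are not taken from the post. Each row of the matrix gives the probabilities of moving from one state to each state in the next period, and repeated multiplication gives the distribution over states at longer horizons.

```r
# Hypothetical three-state chain; all probabilities are illustrative only.
states <- c("Current", "Late", "Default")
P <- matrix(c(0.95, 0.04, 0.01,
              0.50, 0.40, 0.10,
              0.00, 0.00, 1.00),
            nrow = 3, byrow = TRUE,
            dimnames = list(states, states))

# Each row must be a probability distribution over next-period states.
stopifnot(all(abs(rowSums(P) - 1) < 1e-12))

# Distribution over states after n periods, starting from `start`.
propagate <- function(start, P, n) {
  dist <- start
  for (i in seq_len(n)) dist <- dist %*% P
  drop(dist)
}

# A mortgage that starts out current.
start <- c(Current = 1, Late = 0, Default = 0)
after_12  <- propagate(start, P, 12)
after_360 <- propagate(start, P, 360)

rbind(after_12, after_360)
```

In this made-up chain, "Default" is an absorbing state (once entered, it is never left), so the probability of being in default can only grow as the horizon lengthens.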