# Using Markov Chains to Model Mortgage Defaults in R

The goal of this post is to blend the material I’ve been learning in my night class with my day job and R.

If we have some object that switches between states over time according to fixed probabilities, we can model the long-term behavior of this object using Markov chains*.

A good example is a mortgage. At any given point in time, a loan has some probability of defaulting, staying current on payments, or getting paid off in full. Collectively, we call these “transition probabilities.” These probabilities are assumed to be fixed over the life of the loan**.

As an example, we’ll look at conventional fixed-rate, 30-year mortgages. Let’s suppose that every loan that is current at time T has a 75% chance of staying current, a 10% chance of defaulting, and a 15% chance of being paid off at time T+1. These transition probabilities are outlined in the graphic above. Obviously, once a loan defaults or gets paid off, it stays defaulted or paid off. We call such states “absorbing states.”
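As a minimal sketch, the probabilities above can be written as a transition matrix in R. The state labels and the object names `states` and `P` are my own choices for illustration, not necessarily what the posted code uses.

```r
# States in a fixed order; the labels are chosen for this sketch.
states <- c("current", "default", "paid_off")

# Row = state at time T, column = state at time T+1.
P <- matrix(
  c(0.75, 0.10, 0.15,   # current  -> current / default / paid off
    0.00, 1.00, 0.00,   # default is absorbing
    0.00, 0.00, 1.00),  # paid off is absorbing
  nrow = 3, byrow = TRUE,
  dimnames = list(from = states, to = states)
)

# Sanity check: every row of a transition matrix must sum to 1.
stopifnot(isTRUE(all.equal(unname(rowSums(P)), c(1, 1, 1))))
```

The two absorbing states show up as rows with a 1 on the diagonal and 0 everywhere else.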

Since we know the transition probabilities, all we need*** is an initial distribution of loans and we can predict the percentage of loans in each state at any given point in the 30-year period. Suppose we start off at T=0 with 100 current loans and no defaulted or paid-off loans. At T=1, we know (according to our transition probabilities) that 75 of these 100 will remain current on their payments, while 15 loans will be paid off and 10 loans will default. Since we assume that the transition probabilities are constant through the loans’ lives, we can use them to find the number of current loans at T=2. Of the 75 loans that were current at T=1, 56.25 will remain current at T=2 (since 75 × 0.75 = 56.25).
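The same arithmetic can be sketched in R as a row vector of loan counts multiplied by the transition matrix. The names `P`, `x0`, `x1`, and `x2` are illustrative assumptions rather than the posted code.

```r
states <- c("current", "default", "paid_off")
P <- matrix(c(0.75, 0.10, 0.15,
              0.00, 1.00, 0.00,
              0.00, 0.00, 1.00),
            nrow = 3, byrow = TRUE,
            dimnames = list(states, states))

# Initial distribution at T=0: 100 current loans, nothing else.
x0 <- c(current = 100, default = 0, paid_off = 0)

# One step forward is a (row vector) x (matrix) product.
x1 <- x0 %*% P   # 75 current, 10 defaulted, 15 paid off
x2 <- x1 %*% P   # 56.25 loans remain current

print(x1)
print(x2)
```

Note that the defaulted and paid-off counts accumulate: loans already sitting in an absorbing state are carried forward and the new arrivals are added to them.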

If we repeat the process 28 more times (which is done in the posted code) and plot the points, we get the time series plotted above. After 30 years, there are effectively no current loans left (the chain leaves about 0.02 of the original 100, and 30-year loans mature at that point anyway). Nearly all of them have either been paid off or defaulted, with more loans paid off than defaulted.
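Here is a rough sketch of how the full 30-period path could be computed and plotted in base R. It is not the posted code: the object names are made up, and only the blue line for current loans follows the chart described in this post (the other colors are arbitrary).

```r
states <- c("current", "default", "paid_off")
P <- matrix(c(0.75, 0.10, 0.15,
              0.00, 1.00, 0.00,
              0.00, 0.00, 1.00),
            nrow = 3, byrow = TRUE,
            dimnames = list(states, states))

n_periods <- 30

# One row per period (T = 0, ..., 30), one column per state.
dist <- matrix(NA_real_, nrow = n_periods + 1, ncol = 3,
               dimnames = list(NULL, states))
dist[1, ] <- c(100, 0, 0)  # all 100 loans start out current

# Apply the same transition matrix at every step.
for (t in seq_len(n_periods)) {
  dist[t + 1, ] <- dist[t, ] %*% P
}

# Plot the three paths over time.
cols <- c("blue", "red", "darkgreen")
matplot(0:n_periods, dist, type = "l", lty = 1, col = cols,
        xlab = "Period", ylab = "Number of loans")
legend("right", legend = states, lty = 1, col = cols)

print(round(dist[n_periods + 1, ], 3))
```

With these probabilities the final row ends up at roughly 40 defaulted and 60 paid-off loans, which is just the 10:15 split between the two ways of leaving the current state.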

*There are many drawbacks to using Markov chains to model mortgages. This model assumes that the transition probabilities are the same for all 100 loans I use in my example. In reality, loans are not identical: the borrower on one loan may have a much higher credit score than the borrower on another, which would give the former a much lower chance of default. Transition probabilities are also not constant throughout the lives of the loans: if interest rates plummet halfway through a loan’s life, the probability that the loan will be paid off skyrockets, since the borrower can refinance at a lower rate. In short, no one actually uses this model because it is too primitive. Interestingly enough, however, I did compare the curves in my plot against empirical data I have at work and the results are strikingly similar.

In industry, survival analysis is used most frequently to model loans (usually implemented using logistic regression with panel data or a proportional hazards model). This is an interesting blend of biostatistics and economics. It’s particularly funny when people apply the biostatistics terminology to mortgages to discuss single monthly mortality (the monthly probability of prepayment), hazards, or survival functions (i.e. the blue line in my chart).

**In this case, these probabilities can be thought of as “hazard rates.” The hazard of default, for example, is the probability that a loan defaults at time T+1 given that it has survived through time T. This is different from a probability of default. The former is a conditional probability whereas the latter is not.
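To make the distinction concrete, here is a small sketch using the numbers from this post. The hazard stays at 10% every period, while the unconditional probability of defaulting in a particular period shrinks, because a loan first has to survive up to that period. Variable names are illustrative.

```r
hazard_default <- 0.10  # P(default at T+1 | current at T)
p_stay_current <- 0.75  # P(current at T+1 | current at T)

periods <- 1:30

# Survival function: probability a loan is still current
# at the start of each period.
survival <- p_stay_current^(periods - 1)

# Unconditional probability of defaulting in exactly that period.
p_default_in_period <- survival * hazard_default

# Cumulative probability of having defaulted by the end of each period.
cum_default <- cumsum(p_default_in_period)

print(round(head(p_default_in_period, 3), 5))
print(round(cum_default[30], 4))
```

The first period’s unconditional probability equals the hazard (every loan is still alive), and the cumulative figure approaches 0.10 / (0.10 + 0.15) = 0.4 as the surviving pool empties out.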

***We don’t technically need an initial condition in this case for mathematical reasons which I won’t get into because time is a scarce resource.

# Using Stochastic Process Simulations to Forecast Stocks

A good alternative to using historical volatility to forecast 35 weeks ahead may be to use an implied volatility from the options market. The market price of the option contract that expires 35 weeks later can be used to “reverse-engineer” the market’s expected volatility over the forecast period. This may be the subject of my next post!
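As a hedged sketch of what that “reverse-engineering” could look like, the code below assumes a European call with no dividends, priced with the Black-Scholes formula, and backs out the volatility with `uniroot`. The pricing model, the function names `bs_call` and `implied_vol`, and all of the input numbers are assumptions for illustration, not market data.

```r
# Black-Scholes price of a European call (no dividends).
# S = spot, K = strike, r = risk-free rate, tau = years to expiry.
bs_call <- function(S, K, r, tau, sigma) {
  d1 <- (log(S / K) + (r + sigma^2 / 2) * tau) / (sigma * sqrt(tau))
  d2 <- d1 - sigma * sqrt(tau)
  S * pnorm(d1) - K * exp(-r * tau) * pnorm(d2)
}

# Find the sigma that makes the model price match the observed price.
implied_vol <- function(price, S, K, r, tau) {
  uniroot(function(sigma) bs_call(S, K, r, tau, sigma) - price,
          interval = c(1e-4, 5), tol = 1e-10)$root
}

# Round-trip check with made-up inputs: price an option at a known
# volatility, then recover that volatility from the price.
tau <- 35 / 52  # 35 weeks, expressed in years
price <- bs_call(S = 100, K = 100, r = 0.01, tau = tau, sigma = 0.25)
iv <- implied_vol(price, S = 100, K = 100, r = 0.01, tau = tau)
print(iv)
```

In practice `price` would be the quoted market price of the contract expiring 35 weeks out, and the recovered `iv` would stand in for the historical volatility in the simulation.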

# The Unstarvable Beast: The Need for more Effective Government Spending

Revealing article by Harvard’s Kenneth Rogoff linked above…

Key Takeaways:

• The government operates in the service industry, which generally exhibits slow productivity growth.
• While productivity growth is slow, the industry is still forced to pay higher wages for relatively the same output because it competes for workers in the same labor market as other, high-productivity industries like finance and telecommunications. This translates into ever-increasing costs.
• This cost issue plagues the service industry. Costs are so high that the industry now accounts for more than 70% of spending.
• While this “cost disease” permeates the industry as a whole, the government suffers more than most players because (a) it’s expected to provide a wide array of services, making it impossible to specialize and reduce costs and (b) government often provides services in areas where there is little competition, making it difficult to lower costs since there is little incentive to innovate.
