Predictive Modeling: A Breakdown

There are many success stories featuring predictive models, but the failures are not as widely reported: mistakes that range from subtle misinterpretations and minor miscues to unvarnished disasters.

Executive Summary

In Part 1 of a series of articles on the pitfalls of predictive modeling, Ira Robbin provides a general overview of the subject and discusses the problems of outliers, missing data, biased samples and hidden variables. For the complete report, download the entire whitepaper, "Predictive Modeling Pitfalls Whitepaper: When to Be Cautious."

This three-part series focuses on the use of predictive models in property/casualty insurance and illustrates several pitfalls. Many of the pitfalls stem not from technical aspects of model construction but from false assumptions about the data or misapplications of model results.

In Part 1, we present a general overview of predictive models.

What Is a Predictive Model?

In this series, the term “predictive model” refers to a generalized linear model (GLM) or another related model. It does not include models such as catastrophe simulation models and econometric time series models, even though these are also used to make predictions. The GLM terminology has been around since the 1970s, when John Nelder and Robert Wedderburn unified several existing linear regression techniques in a “generalized” construct.
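To make the GLM idea concrete, here is a minimal sketch of one common case: a Poisson model with a log link, fitted by iteratively reweighted least squares. The function name fit_poisson_glm, the single rating variable and the claim counts are all hypothetical, chosen only for illustration; in practice a statistical package would be used rather than hand-written code.

```python
import math


def fit_poisson_glm(x, y, iterations=25):
    """Fit log(E[y]) = b0 + b1 * x by iteratively reweighted least squares.

    Illustrative only: one covariate plus an intercept, Poisson errors, log link.
    """
    # Start from a flat model at the overall mean count.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iterations):
        eta = [b0 + b1 * xi for xi in x]   # linear predictor
        mu = [math.exp(e) for e in eta]    # fitted means under the log link
        # Working response and weights for a log link with Poisson variance.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        w = mu
        # Weighted least squares for two coefficients via the normal equations.
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        swz = sum(wi * zi for wi, zi in zip(w, z))
        swxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        det = sw * swxx - swx * swx
        b0 = (swz * swxx - swx * swxz) / det
        b1 = (sw * swxz - swx * swz) / det
    return b0, b1


# Hypothetical data: x flags a risk class (0 or 1), y is the claim count per policy.
risk_class = [0, 0, 0, 1, 1, 1]
claim_counts = [1, 2, 3, 4, 6, 8]

b0, b1 = fit_poisson_glm(risk_class, claim_counts)
print("base frequency:", math.exp(b0))
print("relativity for class 1:", math.exp(b1))
```

With a single 0/1 variable, the fitted base frequency should land near the average count in class 0 and the relativity near the ratio of the two class averages. The log link is what makes the model multiplicative, which is one reason this form is often used for rating factors.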