Can You Predict the Future?
Regression Analysis Basics
Although regression analysis had been applied successfully in many different disciplines, marketers were late to adopt it. Once it was accepted in the 1970s, however, its use slowly began to flourish.
Regression analysis requires data, and few marketers back then maintained data in a usable format, if they had accurate data at all. A predictive model needs two kinds of data: the item we want to predict and the data items that help us predict it. The former is referred to as the dependent variable, because it depends on the latter, the independent variables.
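To make the split between dependent and independent variables concrete, here is a minimal sketch of a regression fit in plain Python. Everything in it is an assumption made for illustration: the two independent variables (prior purchases and months since last order), the dependent variable (next-year spending) and the numbers themselves are invented, and the helper names `fit_weights` and `solve_linear_system` are hypothetical. It uses ordinary least squares, which is one common way to estimate the weights, not necessarily the method any particular modeling package applies.

```python
# Illustrative only: the data and variable meanings are made up for this sketch.

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix so row operations touch both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest entry in this column for stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitute from the last row upward.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_weights(independent_rows, dependent_values):
    """Ordinary least squares via the normal equations (X'X) w = X'y."""
    # Prepend a constant 1.0 so the first weight acts as the intercept.
    x = [[1.0] + list(row) for row in independent_rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(x, dependent_values)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Independent variables: (prior purchases, months since last order).
independent_rows = [(1, 12), (2, 9), (3, 7), (4, 6), (5, 2), (6, 1)]
# Dependent variable: next-year spending, built here as 20 + 15*x1 - 2*x2.
dependent_values = [20 + 15 * p - 2 * m for p, m in independent_rows]

weights = fit_weights(independent_rows, dependent_values)
print([round(w, 6) for w in weights])
```

Because the invented spending figures were generated from a known formula with no noise, the fit should recover an intercept near 20 and weights near 15 and -2. Real marketing data would not fit this cleanly.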
While no set rules exist, unwritten guidelines point to anywhere between five and 15 variables in a regression-modeling algorithm.
It is the job of the analyst to find the "right" variables and the associated weights. Finding the correct variables is challenging. Sometimes a data element that appears in a model is an exact copy of a field in a database. Many times, however, the model developer must recode or transform the data elements to optimize the final model result. Why these recodes may be helpful is beyond the scope of this discussion, but two typical transformations are often employed:
1. Binning: This technique categorizes data elements into buckets. For example, age can be broken up into six bins.
a. Below 24
f. 65 and older
Of course, many other groupings of age can be developed.
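As a rough sketch of binning, the following Python assigns an age to one of six buckets. Only the first and last buckets ("below 24" and "65 and older") come from the text above; the three interior cut points (35, 45 and 55) are placeholders chosen for this example, and the function names `age_bin` and `one_hot` are hypothetical. The 0/1 indicator encoding is one common way to feed bins into a regression, shown here as an assumption rather than as the article's prescribed method.

```python
from bisect import bisect_right

# Lower edges of the second through sixth bins. Only 24 and 65 come from
# the text; 35, 45 and 55 are placeholder cut points chosen for this sketch.
BIN_EDGES = [24, 35, 45, 55, 65]
BIN_LABELS = ["below 24", "24-34", "35-44", "45-54", "55-64", "65 and older"]


def age_bin(age):
    """Return the label of the bin that contains the given age."""
    # bisect_right counts how many edges are <= age, which is the bin index.
    return BIN_LABELS[bisect_right(BIN_EDGES, age)]


def one_hot(age):
    """Encode the bin as 0/1 indicators, one per bin, for use as predictors."""
    label = age_bin(age)
    return [1 if label == candidate else 0 for candidate in BIN_LABELS]


for age in (19, 24, 50, 65, 80):
    print(age, age_bin(age), one_hot(age))
```

When the indicators go into a model that also has an intercept, analysts typically leave one bin out as the baseline, since the six indicators always sum to one.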
2. Mathematical manipulation: This process involves simply applying a standard mathematical function to the data. For example, if customer spending is a potential predictor, the logarithm or the square root of spending may serve as a better predictor or representation of this data element. This sort of transformation may result in a stronger relationship between the predictor and the dependent variable.
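The effect of such a transformation can be sketched by comparing how strongly the raw and the transformed predictor correlate with the dependent variable. The spending figures and the response below are invented, and the response is deliberately constructed to rise with the logarithm of spending, so the example shows the mechanism rather than evidence about real customers. The helper `pearson` is a hypothetical name for a hand-written Pearson correlation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up customer spending, heavily skewed toward a few large spenders.
spending = [10, 20, 50, 100, 200, 500, 1000, 5000]
# Made-up dependent variable, built so that it rises with log10(spending).
response = [2 + 3 * math.log10(s) for s in spending]

raw_r = pearson(spending, response)
sqrt_r = pearson([math.sqrt(s) for s in spending], response)
log_r = pearson([math.log10(s) for s in spending], response)

print(f"raw: {raw_r:.3f}  sqrt: {sqrt_r:.3f}  log: {log_r:.3f}")
```

On this constructed data the raw figures correlate least, the square root does better, and the logarithm fits almost perfectly. With real data, the analyst would try each form and keep whichever one holds up on records not used to build the model.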