Can You Predict the Future?
When to Use Regression and Neural Networks in Models
By Sam Koslowsky
The other day I made an unusual purchase on a credit card I seldom use. I found myself somewhat uncomfortable as I was told, "I'm sorry, but your charge has been rejected by your bank."
The store associate was kind enough to connect me to the financial service organization responsible for maintaining my credit card. I knew the person to whom I spoke wasn't the same person who picked my transaction out of the millions that are being evaluated continually. It was, more than likely, a statistical model that proved to be the culprit. The structure of these "rules," or models, that drive credit purchase acceptance more than likely was a regression- or neural network-supported architecture.
How did this model recognize that this particular transaction was unusual? Having previously examined the transactions of millions of other people, including transactions that were eventually found to be fraudulent, the model generated an algorithm, or a set of rules, that permitted it to separate valid transactions from bad ones. Of course, a model can only flag transactions that appear to be fraudulent. That's why a human typically gets involved to make the final determination. Fortunately for me, I remembered my mother's maiden name and the last four digits of my Social Security number. The transaction eventually was approved.
Thirty years ago, it was difficult to find marketers developing formal targeted models for their mailing campaigns. Rather, a team of managers would decide which criteria made sense for list selection, relying more on hunch and what's-worked-before thinking than on anything scientific.
A statistical model removes the subjectivity from this analysis. The mathematical rigor applied to solving direct marketing problems guarantees objectivity in name/list selection. Regression analysis became the 1970s direct marketer's tool of choice.
Regression Analysis Basics
While this methodology had been successfully applied in many different disciplines, marketers were late to adopt the technology. Once it was accepted in the 1970s, however, it slowly began to flourish.
Regression analysis requires data, and few marketers back then maintained data in a usable format, if they had accurate data at all. A predictive model requires two kinds of data: what we need to predict, and the data items that help us predict it. The former is referred to as the dependent variable, because it depends on the latter, the independent variables.
While no set rules exist, unwritten guidelines point to anywhere between five and 15 variables in a regression-modeling algorithm.
It is the job of the analyst to find the "right" variables and the associated weights. The job of finding the correct variables is challenging. Sometimes a data element that appears in a model is an exact copy of a field that appears on a database. Many times, the model developer must recode or transform the data elements to optimize the final-model result. Why these recodes may be helpful is beyond our discussion, but two typical transformations are worth pointing out:
1. Binning: This technique categorizes data elements into buckets. For example, age can be broken up into six bins.
a. Below 24
f. 65 and older
Of course many other groupings of age can be developed.
2. Mathematical manipulation: This process involves simply applying a standard mathematical function to the data. For example, if customer spending is a potential predictor, the logarithm or square root of spending may serve as a better predictor or representation of this data element. This sort of transformation may result in a more powerful relationship between the predictor and the dependent variable.
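The two transformations above can be sketched in a few lines of Python. This is only an illustration: the article fixes the first bin ("Below 24") and the last ("65 and older"), so the four middle bin edges, the label text and the function names bin_age and log_spending are all assumptions made for the example.

```python
import bisect
import math

# Hypothetical bin edges. Only the first bin ("below 24") and the last
# ("65 and older") come from the text; the four middle bins are assumed.
AGE_EDGES = [24, 35, 45, 55, 65]
AGE_LABELS = ["below 24", "24-34", "35-44", "45-54", "55-64", "65 and older"]


def bin_age(age):
    """Binning: return the label of the bucket that contains `age`."""
    return AGE_LABELS[bisect.bisect_right(AGE_EDGES, age)]


def log_spending(spending):
    """Mathematical manipulation: log-transform spending.

    log1p(x) = log(1 + x) is used so that customers with zero spending
    still get a defined value (an assumption, not the author's choice).
    """
    return math.log1p(spending)


for age in (19, 24, 64, 65, 80):
    print(age, "->", bin_age(age))

for spending in (0, 100, 10000):
    print(spending, "->", log_spending(spending))
```

The binned label or the transformed spending value would then replace the raw field as a candidate predictor in the model.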
Predictive models address two categories of problems. First, there are circumstances where the marketer must establish a prediction of a quantity. Typical questions answered here focus on a customer's potential balance, his or her spending, or orders. Second, "yes/no-type" problems can also be successfully resolved with models. These usually include response (Will a customer respond?) and attrition (Will a customer defect?) issues. The first problem solved results in a prediction of a quantity; the second problem delivers a probability of an event (such as a response) happening.
The formula for the regression model provides its user with three critical pieces of information. First, by direct examination of the algorithm, we notice which data elements play a role in defining the dependent variable.
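Suppose, for example, that the equation for a response model contains three terms: spending, age and the number of mailings. From this, we can conclude that spending, age and the number of mailings influence our prediction of individual response.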
Second, by looking at the sign of each variable in the equation, we can ascertain the direction of the relationship between dependent and independent variables. So "spending" has a positive (+) sign, implying that as spending increases, probability of response also increases. Age and number of mailings are negatively related. That is, as age increases or number of mailings rises, probability of response declines.
Lastly, although not always apparent from examining the algorithm, the relative impact of each predictor on the dependent variable also can be estimated. So, we would be able to rank our three variables by their impact on the likelihood of responding.
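These three readings of a regression equation can be illustrated with a small Python sketch. Everything numeric here is invented: the coefficients are chosen only to match the signs described above, the standard deviations are hypothetical, and the logistic form is one common choice for a response model rather than something the author specifies. Ranking by the size of coefficient times standard deviation is likewise just one simple way to compare predictors measured on different scales.

```python
import math

# Hypothetical coefficients, chosen only to match the signs in the text:
# spending is positive; age and number of mailings are negative.
INTERCEPT = -1.0
COEFFICIENTS = {"spending": 0.004, "age": -0.03, "mailings": -0.15}

# Hypothetical standard deviations of each predictor on the customer file,
# used to put the weights on a comparable scale.
STD_DEV = {"spending": 250.0, "age": 15.0, "mailings": 4.0}


def response_probability(customer):
    """Weighted sum of the predictors, mapped into (0, 1) by the logistic function."""
    score = INTERCEPT + sum(COEFFICIENTS[name] * customer[name] for name in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-score))


def direction(name):
    """Second reading: the sign of the weight gives the direction of the relationship."""
    return "positive" if COEFFICIENTS[name] > 0 else "negative"


def rank_by_impact():
    """Third reading: rank predictors by |coefficient * standard deviation|."""
    return sorted(
        COEFFICIENTS,
        key=lambda name: abs(COEFFICIENTS[name] * STD_DEV[name]),
        reverse=True,
    )


customer = {"spending": 300.0, "age": 40, "mailings": 6}
print("predictors in the model:", list(COEFFICIENTS))  # first reading
for name in COEFFICIENTS:
    print(name, "is", direction(name))
print("ranked by impact:", rank_by_impact())
print("probability of response:", response_probability(customer))
```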
Along Came Neural Networks
While marketers began to embrace regression technologies in the 1970s, artificial neural networks (ANN) became a buzzword only about a decade ago. ANNs are somewhat more complex than regression tools. Essentially they are multi-input, nonlinear models. Weights connect the input and output layers.
The input layer houses the predictors or independent variables. Weights connect this layer to a middle, or hidden, layer, where a nonlinear process occurs. It is this point in the technology that many refer to as the "black box." Even seasoned practitioners cannot always assess what has happened to each of the independent variables in this middle layer. Something clearly has occurred, and its effect on the outcome is significant; we just are not always sure what it is. The next step leads to the output layer, which is what we are trying to predict.
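The flow from inputs through the hidden layer to the output can be sketched as a single forward pass. This is a minimal illustration, not a trained model: the weights and input values are made up, the sigmoid is just one possible nonlinearity, and the function name forward is an assumption for the example.

```python
import math


def sigmoid(z):
    """A common nonlinear function that maps any number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One hidden layer: inputs -> nonlinear hidden units -> a single output."""
    hidden = [
        sigmoid(bias + sum(w * x for w, x in zip(weights, inputs)))
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    output = sigmoid(output_bias + sum(w * h for w, h in zip(output_weights, hidden)))
    return hidden, output


# Hypothetical, untrained weights: three scaled inputs and two hidden units.
inputs = [0.6, -0.2, 0.1]  # e.g. scaled spending, age, number of mailings
hidden_weights = [[1.5, -0.8, 0.3], [-0.4, 0.9, -1.2]]
hidden_biases = [0.1, -0.3]
output_weights = [2.0, -1.0]
output_bias = -0.5

hidden, output = forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias)
print("hidden layer values:", hidden)
print("predicted output:", output)
```

Each hidden value blends all three inputs before the nonlinearity is applied, which is why it is hard to say afterward what has happened to any one predictor.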
While there clearly are many differences that distinguish regression from neural "nets," an unambiguous distinction is the number of weights and associated independent variables contained in the model architectures. Because regression modeling typically contains five to 15 weights, the analyst may have to place added emphasis on assuring that the variables he or she selects are the right ones.
Of course, these "right" ones may include appropriate recodes or transformations. If these transformations are not properly constructed, a best set of final-model variables may not be available. The regression tool will not discern these recodes automatically.
Neural networks, however, may sense intricate interactions among the predictors without having to worry about transforming the inputs or independent variables. If we refer back to our binning example, where we categorized age, a neural network may not need this binning transformation to take place. Rather, it may be able to use actual age as the predictor. It may be able to discover the most appropriate binning classification for each particular variable. While regression modeling may also use actual age, for example, without binning, due diligence requires the analyst to evaluate whether or not binning or some other transformation is required. In regression, the modeler may have to determine how to create these "bins." In neural networks, as part of the nonlinear processing, the hidden layer may take care of that for the analyst.
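To see why a hidden layer may be able to stand in for a bin, consider the hand-built sketch below. Two hidden units take raw age as input: one switches on at the lower edge of an age band and the other at the upper edge, and their difference is close to 1 inside the band and close to 0 outside it. The band edges, the steepness value and the function names are assumptions set by hand for illustration. In a real network, training would have to find such weights, and nothing guarantees that it will.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


STEEPNESS = 4.0  # hypothetical weight on age; larger values give sharper bin edges


def in_age_band(age, low=35.0, high=45.0):
    """Combine two hidden units into a soft indicator for low <= age < high."""
    above_low = sigmoid(STEEPNESS * (age - low))    # hidden unit 1: turns on near `low`
    above_high = sigmoid(STEEPNESS * (age - high))  # hidden unit 2: turns on near `high`
    return above_low - above_high                   # output weights of +1 and -1


for age in (20, 34, 40, 46, 60):
    print(age, "->", round(in_age_band(age), 3))
```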
While the neural network approach is adept at identifying relationships, this may not always be a plus. Because these relationships are uncovered in the black box part of the process, it may not be practical to obtain a clear explanation of how and why the predictors are being incorporated into the model. Yet many marketers demand to know what is going on, and it may not be possible to explain the inner workings of the neural net model to marketing managers.
The Right Approach?
I'm frequently asked which approach to use. The following Q&A section touches on the most common questions direct marketers have about the use of regression analysis and neural networks for modeling.
Here are my thoughts:
Q: Which technology—regression or neural networks—provides the best results?
A: Sometimes, neural networks can outperform a regression model. Many times, they cannot. It makes sense to use both, and assess the performance outcomes of both. If the analyst identifies a significant difference in performance from one technique, then consider using the technique that delivers the better result.
If a difference in performance is not readily apparent, then regression is clearly the way to go. Ease of model interpretation and deployment provides significant advantages to the regression tool.
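The decision rule in this answer can be written down as a short sketch. It assumes both models have been scored on the same holdout sample with a measure where higher is better; the min_gain threshold, its default value and the function name are hypothetical, and what counts as a "significant" difference remains the analyst's call.

```python
def choose_model(regression_score, neural_net_score, min_gain=0.02):
    """Prefer regression unless the neural net wins by at least `min_gain`.

    Scores are assumed to be holdout performance measures where higher
    is better. `min_gain` is a hypothetical threshold for a difference
    large enough to outweigh regression's ease of interpretation.
    """
    if neural_net_score - regression_score >= min_gain:
        return "neural network"
    return "regression"


print(choose_model(0.71, 0.72))  # small gap: stay with regression
print(choose_model(0.70, 0.76))  # clear gain: the neural network earns its place
```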
Q: Do I need a trained statistician and/or analyst when developing my models?
A: Yes. Both tools require exploratory analysis. Marketers who believe they can automatically develop neural network or regression models without the guiding hand of a trained statistician or analyst are making a mistake. Optimal model development occurs when a trained statistician works closely with a seasoned marketer.
Q: Is software available for both regression and neural nets?
A: Yes, both regression and neural network technologies are available either as a tool within a larger statistical or data-mining package, or as a stand-alone product. Just make sure that the hand guiding either software tool is an experienced one.
A More Complete Toolbox
Neural networks are an additional weapon in the analyst's toolbox. No one technology will always result in a superior solution. Therefore, having both techniques available makes the most sense.
Frankly, it is not the tool that makes or breaks an analysis. Rather, it is the synergies of the skilled analyst and the veteran marketer that can provide the real difference.
Sam Koslowsky is vice president of modeling solutions for Harte-Hanks Inc., a worldwide, direct and targeted marketing company that provides direct marketing services and shopper advertising opportunities to a wide range of consumer and B-to-B marketers. Koslowsky can be reached at (212) 520-3259 or via e-mail at firstname.lastname@example.org.