Not all selected variables are equally important in model algorithms, either. More powerful variables are assigned higher weights, and the sum of these weighted values is what we call the model score. Now, non-statisticians who have been slightly allergic to math since the third grade only need to know that the higher the score, the more likely the record in question is to be like the target. To make the matter even simpler, let's just say that you want higher scores over lower scores. If you are a salesperson, just call the high-score prospects first. And would you care how many variables are packed into that score, as long as you get the good "Glengarry Glen Ross" leads on top?
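For readers who do want to peek under the hood, here is a minimal sketch of the idea in Python: a score as a sum of weighted variables, then prospects sorted so the highest scores come first. The variable names, weights, and prospect values are all made up for illustration; a real model would derive its weights from data and would typically use far more variables.

```python
# Hypothetical weights for illustration only; a fitted model would supply these.
WEIGHTS = {"recency": 0.5, "frequency": 0.3, "income": 0.2}


def model_score(record, weights=WEIGHTS):
    """Return the sum of each variable's value multiplied by its weight."""
    return sum(weights[name] * record[name] for name in weights)


def rank_prospects(records, weights=WEIGHTS):
    """Return the records sorted from highest score to lowest."""
    return sorted(records, key=lambda r: model_score(r, weights), reverse=True)


# Made-up prospects, with each variable already scaled to the 0-1 range.
prospects = [
    {"id": "A", "recency": 0.2, "frequency": 0.4, "income": 0.9},
    {"id": "B", "recency": 0.9, "frequency": 0.8, "income": 0.3},
    {"id": "C", "recency": 0.5, "frequency": 0.1, "income": 0.5},
]

# The salesperson's call list: highest score first.
call_order = [p["id"] for p in rank_prospects(prospects)]
print(call_order)
```

Note that prospect A has the highest income of the three yet does not top the list, because the other, more heavily weighted variables pull B ahead. The salesperson never has to look at the individual variables; the ordering by score is all that matters.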
So, let me ask again. Does this sound like something a rudimentary selection rule with two to three variables can beat when it comes to identifying the right target? Maybe someone can get lucky once or twice, but not consistently.
That leads to the next point, "consistency." Because models do not rely on a few popular variables, they are far less volatile than simple selection rules or queries. In this age of Big Data, there are more transaction and behavioral data in the mix than ever, and they are far more volatile than demographic and geo-demographic data. Put simply, people's purchasing behavior and preferences change much faster than their family composition or income, and that volatility factor calls for more statistical work. Plus, all facets of marketing are now more about measurable results (ah, that dreaded ROI, or "Roy," as I call it), and businesses call for consistent hitters over one-hit wonders.
"Revealing hidden patterns in data" is my favorite. When marketers are presented with thousands of variables, I see a majority of them just sticking to a few popular ones all the time. Some basic recency and frequency data are there, and among hundreds of demographic variables, the list often stops after income, age, gender, presence of children, and some regional variables. But seriously, do you think that the difference between a luxury car buyer and an SUV buyer is just income and age? You see, these variables are just the ones that human minds are accustomed to. Mathematics does not have such preconceived notions. Sticking to a few popular variables is like children repeatedly using three favorite colors out of a whole box of crayons.
Stephen H. Yu is a world-class database marketer. He has a proven track record in comprehensive strategic planning and tactical execution, effectively bridging the gap between the marketing and technology world with a balanced view obtained from more than 30 years of experience in best practices of database marketing. Currently, Yu is president and chief consultant at Willow Data Strategy. Previously, he was the head of analytics and insights at eClerx, and VP, Data Strategy & Analytics at Infogroup. Prior to that, Yu was the founding CTO of I-Behavior Inc., which pioneered the use of SKU-level behavioral data. “As a long-time data player with plenty of battle experiences, I would like to share my thoughts and knowledge that I obtained from being a bridge person between the marketing world and the technology world. In the end, data and analytics are just tools for decision-makers; let’s think about what we should be (or shouldn’t be) doing with them first. And the tools must be wielded properly to meet the goals, so let me share some useful tricks in database design, data refinement process and analytics.” Reach him at email@example.com.