Data Deep Dive: The Art of Targeting
Even if you own a sniper rifle (and I'm not judging), if you aim at the wrong place, you will never hit the target. Obvious, right? But that happens all the time in the world of marketing, even when advanced analytics and predictive modeling techniques are routinely employed. How is that possible? Well, the marketing world is not like an Army shooting range where the silhouette of the target is conveniently hung at the predetermined location, but it is more like the "Twilight Zone," where things are not what they seem. Marketers who failed to hit the real target often blame the guns, which in this case are targeting tools, such as models and segmentations. But let me ask, was the target properly defined in the first place?
In my previous columns, I talked about the importance of predictive analytics in modern marketing (refer to "Why Model?") for various reasons, such as targeting accuracy, consistency, deeper use of data, and most importantly in the age of Big Data, concise nature of model scores where tons of data are packed into ready-for-use formats. Now, even the marketers who bought into these ideas often make mistakes by relinquishing the important duty of target definition solely to analysts and statisticians, who do not necessarily possess the power to read the marketers' minds. Targeting is often called "half-art and half-science." And it should be looked at from multiple angles, starting with the marketer's point of view. Therefore, even marketers who are slightly (or, in many cases, severely) allergic to mathematics should come one step closer to the world of analytics and modeling. Don't be too scared, as I am not asking you to be a rifle designer or sniper here; I am only talking about hanging the target in the right place so that others can shoot at it.
Let us start by reviewing what statistical models are: A model is a mathematical expression of "differences" between dichotomous groups; which, in marketing, are often referred to as "targets" and "non-targets." Let's say a marketer wants to target "high-value customers." To build a model to describe such targets, we also need to define "non-high-value customers," as well. In marketing, popular targets are often expressed as "repeat buyers," "responders to certain campaigns," "big-time spenders," "long-term, high-value customers," "troubled customers," etc. for specific products and channels. Now, for all those targets, we also need to define "bizarro" or "anti-" versions of them. One may think that they are just the "remainders" of the target. But, unfortunately, it is not that simple; the definition of the whole universe should be set first to even bring up the concept of the remainders. In many cases, defining "non-buyers" is much more difficult than defining "buyers," because lack of purchase information does not guarantee that the individual in question is indeed a non-buyer. Maybe the data collection was never complete. Maybe he used a different channel to respond. Maybe his wife bought the item for him. Maybe you don't have access to the entire pool of names that represent the "universe."
Remember T, C, & M
That is why we need to examine the following three elements carefully when discussing statistical models with marketers who are not necessarily statisticians:
- Comparison Universe, and
I call them "TCM" in short, so that I don't leave out any element in exploratory conversations. Defining proper target is the obvious first step. Defining and obtaining data for the comparison universe is equally important, but it could be challenging. But without it, you'd have nothing against which you compare the target. Again, a model is an algorithm that expresses differences between two non-overlapping groups. So, yes, you need both Superman and Bizarro-Superman (who always seems more elusive than his counterpart). And that one important variable that differentiates the target and non-target is called "Dependent Variable" in modeling.
The third element in our discussion is the methodology. I am sure you may have heard of terms like logistic regression, stepwise regression, neural net, decision trees, CHAID analysis, genetic algorithm, etc., etc. Here is my advice to marketers and end-users:
- State your goals and usages cases clearly, and let the analyst pick proper methodology that suites your goals.
- Don't be a bad patient who walks into a doctor's office demanding a specific prescription before the doctor even examines you.
Besides, for all intents and purposes, the methodology itself matters the least in comparison with an erroneously defined target and the comparison universes. Differences in methodologies are often measured in fractions. A combination of a wrong target and wrong universe definition ends up as a shotgun, if not an artillery barrage. That doesn't sound so precise, does it? We should be talking about a sniper rifle here.
Clear Goals Leading to Definitions of Target and Comparison
So, let's roll up our sleeves and dig deeper into defining targets. Allow me to use an example, as you will be able to picture the process better that way. Let's just say that, for general marketing purposes, you want to build a model targeting "frequent flyers." One may ask for business or for pleasure, but let's just say that such data are hard to obtain at this moment. (Finding the "reasons" is always much more difficult than counting the number of transactions.) And it was collectively decided that it would be just beneficial to know who is more likely to be a frequent flyer, in general. Such knowledge could be very useful for many applications, not just for the travel industry, but for other affiliated services, such as credit cards or publications. Plus, analytics is about making the best of what you've got, not waiting for some perfect datasets.
Now, here is the first challenge:
- When it comes to flying, how frequent is frequent enough for you? Five times a year, 10 times, 20 times or even more?
- Over how many years?
- Would you consider actual miles traveled, or just number of issued tickets?
- How large are the audiences in those brackets?
If you decided that five times a year is a not-so-big or not-so-small target (yes, sizes do matter) that also fits the goal of the model (you don't want to target only super-elites, as they could be too rare or too distinct, almost like outliers), to whom are they going to be compared? Everyone who flew less than five times last year? How about people who didn't fly at all last year?
Actually, one option is to compare people who flew more than five times against people who didn't fly at all last year, but wouldn't that model be too much like a plain "flyer" model? Or, will that option provide more vivid distinction among the general population? Or, one analyst may raise her hand and say "to hell with all these breaks and let's just build a model using the number of times flown last year as the continuous target." The crazy part is this: None of these options are right or wrong, but each combination of target and comparison will certainly yield very different-looking models.
Then what should a marketer do in a situation like this? Again, clearly state the goal and what is more important to you. If this is for general travel-related merchandizing, then the goal should be more about distinguishing more likely frequent flyers out of the general population; therefore, comparing five-plus flyers against non-flyers—ignoring the one-to-four-time flyers—makes sense. If this project is for an airline to target potential gold or platinum members, using people who don't even fly as comparison makes little or no sense. Of course, in a situation like this, the analyst in charge (or data scientist, the way we refer to them these days), must come halfway and prescribe exactly what target and comparison definitions would be most effective for that particular user. That requires lots of preliminary data exploration, and it is not all science, but half art.
Now, if I may provide a shortcut in defining the comparison universe, just draw the representable sample from "the pool of names that are eligible for your marketing efforts." The key word is "eligible" here. For example, many businesses operate within certain areas with certain restrictions or predetermined targeting criteria. It would make no sense to use the U.S. population sample for models for supermarket chains, telecommunications, or utility companies with designated footprints. If the business in question is selling female apparel items, first eliminate the male population from the comparison universe (but I'd leave "unknown" genders in the mix, so that the model can work its magic in that shady ground). You must remember, however, that all this means you need different models when you change the prospecting universe, even if the target definition remains unchanged. Because the model algorithm is the expression of the difference between T and C, you need a new model if you swap out the C part, even if you left the T alone.
Sometimes it gets twisted the other way around, where the comparison universe is relatively stable (i.e., your prospecting universe is stable) but there could be multiple targets (i.e., multiple Ts, like T1, T2, etc.) in your customer base.
Let me elaborate with a real-life example. A while back, we were helping a company that sells expensive auto accessories for luxury cars. The client, following his intuition, casually told us that he only cares for big spenders whose average order sizes are more than $300. Now, the trouble with this statement is that:
- Such a universe could be too small to be used effectively as a target for models, and
- High spenders do not tend to purchase often, so we may end up leaving out the majority of the potential target buyers in the whole process.
This is exactly why some type of customer profiling must precede the actual target definition. A series of simple distribution reports clearly revealed that this particular client was dealing with a dual-universe situation, where the first group (or segment) is made of infrequent, but high-dollar spenders whose average orders were even greater than $300, and the second group is made of very frequent buyers whose average order sizes are well below the $100 mark. If we had ignored this finding, or worse, neglected to run preliminary reports and just relying on our client's wishful thinking, we would have created a "phantom" target, which is just an average of these dual universes. A model designed for such a phantom target will yield phantom results. The solution? If you find two distinct targets (as in T1 and T2), just bite the bullet and develop two separate models (T1 vs. C and T2 vs. C).
There are still other reasons why you may need multiple models. Let's talk about the case of "target within a target." Some may relate this idea to a "drill-down" concept, and it can be very useful when the prospecting universe is very large, and the marketer is trying to reach only the top 1 percent (which can be still very large, if the pool contains hundreds of millions of people). Correctly finding the top 5 percent in any universe is difficult enough. So what I suggest in this case is to build two models in sequence to get to the "Best of the Best" in a stepwise fashion.
- The first model would be more like an "elimination" model, where obviously not-so-desirable prospects would be removed from the process, and
- The second-step model would be designed to go after the best prospects among survivors of the first step.
Again, models are expressions of differences between targets and non-targets, so if the first model eliminated the bottom 80 percent to 90 percent of the universe and leaves the rest as the new comparison universe, you need a separate model—for sure. And lots of interesting things happen at the later stage, where new variables start to show up in algorithms or important variables in the first step lose steam in later steps. While a bit cumbersome during deployment, the multi-step approach ensures precision targeting, much like a sniper rifle at close range.
I also suggest this type of multi-step process when clients are attempting to use the result of segmentation analysis as a selection tool. Segmentation techniques are useful as descriptive analytics. But as a targeting tool, they are just too much like a shotgun approach. It is one thing to describe groups of people such as "young working mothers," "up-and-coming," and "empty-nesters with big savings" and use them as references when carving out messages tailored toward them. But it is quite another to target such large groups as if the population within a particular segment is completely homogeneous in terms of susceptibility to specific offers or products. Surely, the difference between a Mercedes buyer and a Lexus buyer ain't income and age, which may have been the main differentiator for segmentation. So, in the interest of maintaining a common theme throughout the marketing campaigns, I'd say such segments are good first steps. But for further precision targeting, you may need a model or two within each segment, depending on the size, channel to be employed and nature of offers.
Another case where the multi-step approach is useful is when the marketing and sales processes are naturally broken down into multiple steps. For typical B-to-B marketing, one may start the campaign by mass mailing or email (I'd say that step also requires modeling). And when responses start coming in, the sales team can take over and start contacting responders through more personal channels to close the deal. Such sales efforts are obviously very time-consuming, so we may build a "value" model measuring the potential value of the mail or email responders and start contacting them in a hierarchical order. Again, as the available pool of prospects gets smaller and smaller, the nature of targeting changes as well, requiring different types of models.
This type of funnel approach is also very useful in online marketing, as the natural steps involved in email or banner marketing go through lifecycles, such as blasting, delivery, impression, clickthrough, browsing, shopping, investigation, shopping basket, checkout (Yeah! Conversion!) and repeat purchases. Obviously, not all steps require aggressive or precision targeting. But I'd say, at the minimum, initial blast, clickthrough and conversion should be looked at separately. For any lifetime value analysis, yes, the repeat purchase is a key step; which, unfortunately, is often neglected by many marketers and data collectors.
Inversely Related Targets
More complex cases are when some of these multiple response and conversion steps are "inversely" related. For example, many responders to invitation-to-apply type credit card offers are often people with not-so-great credit. Well, if one has a good credit score, would all these credit card companies have left them alone? So, in a case like that, it becomes very tricky to find good responders who are also credit-worthy in the vast pool of a prospect universe.
I wouldn't go as far as saying that it is like finding a needle in a haystack, but it is certainly not easy. Now, I've met folks who go after the likely responders with potential to be approved as a single target. It really is a philosophical difference, but I much prefer building two separate models in a situation like this:
- One model designed to measure responsiveness, and
- Another to measure likelihood to be approved.
The major benefit for having separate models is that each model will be able employ different types and sources of data variables. A more practical benefit for the users is that the marketers will be able to pick and choose what is more important to them at the time of campaign execution. They will obviously go to the top corner bracket, where both scores are high (i.e., potential responders who are likely to be approved). But as they dial the selection down, they will be able to test responsiveness and credit-worthiness separately.
Mixing Multiple Model Scores
Even when multiple models are developed with completely different intentions, mixing them up will produce very interesting results. Imagine you have access to scores for "High-Value Customer Model" and "Attrition Model." If you cross these scores in a simple 2x2 matrix, you can easily create a useful segment in one corner called "Valuable Vulnerable" (a term that my mentor created a long time ago). Yes, one score is predicting who is likely to drop your service, but who cares if that customer shows little or no value to your business? Take care of the valuable customers first.
This type of mixing and matching becomes really interesting if you have lots of pre-developed models. During my tenure at a large data compiling company, we built more than 120 models for all kinds of consumer characteristics for general use. I remember the real fun began when we started mixing multiple models, like combining a "NASCAR Fan" model with a "College Football Fan" model; a "Leaning Conservative" model with an "NRA Donor" model; an "Organic Food" one with a "Cook for Fun" model or a "Wine Enthusiast" model; a "Foreign Vacation" model with a "Luxury Hotel" model or a "Cruise" model; a "Safety and Security Conscious" model or a "Home Improvement" model with a "Homeowner" model, etc., etc.
You see, no one is one dimensional, and we proved it with mathematics.
No One is One-dimensional
Obviously, these examples are just excerpts from a long playbook for the art of targeting. My intention is to emphasize that marketers must consider target, comparison and methodologies separately; and a combination of these three elements yields the most fitting solutions for each challenge, way beyond what some popular toolsets or new statistical methodologies presented in some technical conferences can acomplish. In fact, when the marketers are able to define the target in a logical fashion with help from trained analysts and data scientists, the effectiveness of modeling and subsequent marketing campaigns increase dramatically. Creating and maintaining an analytics department or hiring an outsourcing analytics vendor aren't enough.
One may be concerned about the idea of building multiple models so casually, but let me remind you that it is the reality in which we already reside, anyway. I am saying this, as I've seen too many marketers who try to fix everything with just one hammer, and the results weren't ideal—to say the least.
It is a shame that we still treat people with one-dimensional tools, such segmentations and clusters, in this age of ubiquitous and abundant data. Nobody is one-dimensional, and we must embrace that reality sooner than later. That calls for rapid model development and deployment, using everything that we've got.
Arguing about how difficult it is to build one or two more models here and there is so last century.
Stephen H. Yu is a world-class database marketer. He has a proven track record in comprehensive strategic planning and tactical execution, effectively bridging the gap between the marketing and technology world with a balanced view obtained from more than 30 years of experience in best practices of database marketing. Currently, Yu is president and chief consultant at Willow Data Strategy. Previously, he was the head of analytics and insights at eClerx, and VP, Data Strategy & Analytics at Infogroup. Prior to that, Yu was the founding CTO of I-Behavior Inc., which pioneered the use of SKU-level behavioral data. “As a long-time data player with plenty of battle experiences, I would like to share my thoughts and knowledge that I obtained from being a bridge person between the marketing world and the technology world. In the end, data and analytics are just tools for decision-makers; let’s think about what we should be (or shouldn’t be) doing with them first. And the tools must be wielded properly to meet the goals, so let me share some useful tricks in database design, data refinement process and analytics.” Reach him at firstname.lastname@example.org.