Database: Close the Loop Properly
Three major analytical gaps that lead to faulty data decisions
January 2008 By Stephen Yu
• Data reconciliation. Collected source code information often does not agree with the master mail file. Information on the master file typically is trusted over manually collected figures.
• Date window. If there are multiple campaigns going on in close proximity in time, rules must be in place to credit proper campaigns for responses. Generally, the latest campaign gets the credit, as long as the response date is not too close to the mail-drop date.
• Allocation rules for unmatched responses. Even with the most sophisticated match programs, there always will be unmatched records. Provided this is a small part of the response universe, business rules must be set to handle the unmatched response records.
• Key report variables. To avoid any redundant data processing, key analytical variables must be defined and maintained throughout the process.
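The date-window rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `credit_campaign`, the campaign keys and the three-day minimum lag (`min_lag_days`) are hypothetical, and each marketer would set the actual window to suit its own mail delivery and response patterns.

```python
from datetime import date, timedelta


def credit_campaign(response_date, campaigns, min_lag_days=3):
    """Pick the campaign that gets credit for one response.

    campaigns maps a campaign key to its mail-drop date.
    The latest campaign wins, but only if it dropped at least
    min_lag_days before the response (an assumed business rule).
    Returns None when no campaign qualifies.
    """
    cutoff = response_date - timedelta(days=min_lag_days)
    eligible = [(drop, key) for key, drop in campaigns.items() if drop <= cutoff]
    if not eligible:
        return None
    # max() on (drop_date, key) tuples selects the most recent drop date.
    return max(eligible)[1]


# Hypothetical campaigns used only for illustration.
campaigns = {"SPRING": date(2008, 3, 1), "SUMMER": date(2008, 3, 20)}
late_response = credit_campaign(date(2008, 4, 1), campaigns)
early_response = credit_campaign(date(2008, 3, 21), campaigns)
```

In this sketch a response dated one day after the second drop is credited to the earlier campaign, because the piece from the later drop could not plausibly have produced it yet.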
The main issue with skipping the matchback process is that analysts rely solely on manually collected data on the response side, which may not agree with the master file or may be missing altogether. In addition, it is important to recognize that analysts cannot measure anecdotal responses to a fraction of a percent. Without the matchback, lack of coverage may force analysts to surrender certain levels of detail, such as segment or name source, which often are the focal points of direct mail response analyses.
The Flaw in Random Merge/Purge
Most of the matchback process is done on the master file, which is suitable for testing creative packages, offers and delivery channels. However, because merge/purge output keeps only one record per household/individual regardless of its origins, the master file has a serious flaw when it comes to list source evaluation. Yet list-level measurement is one of the key metrics in ROI studies, as list cost is the cost that varies the most. Imagine a situation where three list providers sent the same responsive name, and the one lucky winner that survived the so-called “random” merge/purge gets all the credit in every subsequent study.
One may argue that the random allocation mechanism built into merge/purge is totally fair since the rule applies to all list providers. That is simply not the case unless the order sizes are about the same among all list providers. Assume a marketer orders names from only two sources. The marketer orders 1 million names from Vendor A, thanks to a long and successful relationship. The same marketer orders only 10,000 names from Vendor B, as it is relatively new to the industry. There were 5,000 duplicates between the two files. After the “random” allocation, each list will have lost 2,500 names. That loss for Vendor A is only 0.25 percent of the input, but it is a 25 percent loss for Vendor B, enough to influence the validity of the response study. Now, this is when we assume random allocation works properly (often it doesn’t), and that all lists receive the same merge/purge priority (well, that never happens). In reality, where many more lists bump into each other with a far greater number of interfile duplicates, small test files practically are destroyed in terms of statistical validity before the study even begins.
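The arithmetic in the two-vendor example can be checked with a short Python sketch. It assumes, as the example does, that the interfile duplicates are split evenly between the lists; the function name `random_split_loss` is hypothetical.

```python
def random_split_loss(list_sizes, duplicates):
    """Share of each list's input lost when interfile duplicates
    are divided evenly among the lists (the idealized "random" case)."""
    loss_each = duplicates / len(list_sizes)
    return {name: loss_each / size for name, size in list_sizes.items()}


# Figures taken from the example in the text.
losses = random_split_loss({"Vendor A": 1_000_000, "Vendor B": 10_000}, 5_000)
for name, share in losses.items():
    print(f"{name}: {share:.2%} of input lost")
# Vendor A: 0.25% of input lost
# Vendor B: 25.00% of input lost
```

The same 2,500 names are a rounding error for the large list and a quarter of the small one, which is why equal treatment in merge/purge does not mean equal impact.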
This dilemma can be solved easily by using the “input” file to the merge/purge in the matchback process, crediting all name providers for known responders. Instead of matching the responder file to the master file (the typical scenario), the merge/purge “input” file, with all interfile duplicates intact, should be matched to the responder file, which acts as the base file. Through this “reverse” match process, marketers can study response rates list by list, as if each list were the only one in the mailing.
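A minimal sketch of the “reverse” match, written in Python, may make the idea concrete. It assumes every record already carries a normalized match key (a hypothetical stand-in for real name-and-address matching logic) and that the input file is available as (list name, match key) pairs with duplicates intact. The function name and the sample data are illustrative only.

```python
from collections import defaultdict


def reverse_matchback(input_records, responder_keys):
    """Credit every list that supplied a responder.

    input_records: iterable of (list_name, match_key) pairs from the
        merge/purge input file, interfile duplicates intact.
    responder_keys: set of match keys for known responders (the base file).
    Returns {list_name: (names_supplied, responders, response_rate)}.
    """
    supplied = defaultdict(int)
    responded = defaultdict(int)
    for list_name, key in input_records:
        supplied[list_name] += 1
        if key in responder_keys:
            responded[list_name] += 1
    return {
        name: (supplied[name], responded[name], responded[name] / supplied[name])
        for name in supplied
    }


# Hypothetical data: household "k1" was supplied by both lists and responded.
records = [("List A", "k1"), ("List A", "k2"), ("List B", "k1"), ("List B", "k3")]
report = reverse_matchback(records, {"k1"})
```

Here both lists receive credit for the shared responder, whereas a matchback against the merge/purge output would credit only whichever list happened to keep the name.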
Today, most mailers do not employ such a method, simply because larger file sizes on the input side lead to higher processing costs. Many believe the “random” merge/purge will resolve the allocation issue, although it may be seriously flawed when file sizes vary among list providers. With all other factors remaining constant, random allocation tends to favor larger lists. For smaller files, it would be like deciding the outcome of a baseball game after the fourth inning. Such inadequate response study practice should not be justified for the sake of cost savings. The question must be asked: Why are we still using an antiquated “per 1,000” pricing scale for vital functions like matchback, when processing and storage costs are fractions of what they used to be?
Response Rates Are Not Baseball Scores
Many marketers look at results in multidimensional ways to understand the winning combinations of name sources, selection criteria, creative packages, offers and channels. However, when an analyst breaks down the responders into smaller groups using all possible combinations of study elements, the segment size may become too small to yield any meaningful statistics. There may be fewer than 10 responders with income over $75,000 who received letter version No. 2 plus a free shipping offer in a segment called List A. To avoid situations like this, it is more prudent to examine each measurement criterion separately.
Even when only one element is studied at a time, marketers must be aware of the statistical validity issue when comparing multiple segments. For example, there is little statistically meaningful difference between response rates of 1.15 percent and 1.23 percent unless more than 100,000 pieces were mailed in each group. Too often marketers jump to conclusions and treat response rates as the ultimate ranking tool. To be fair, one must be aware of the sample size, confidence level and size of differences to be measured. Without statistical training, you must be careful not to draw conclusions too hastily. After all, those 5,000 merge/purge survivors in segments A and B may not be enough to tell you any story about a difference of less than half a percentage point in response rate. When in doubt, please consult a statistician, or at least download some utility programs off the Internet and plug in the numbers before cleaning up your vendor list.
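For readers who want to plug in the numbers themselves, the following Python sketch applies a standard pooled two-proportion z-test to the figures above. The choice of test is an assumption made here for illustration, since the article does not name a specific method, and the function name `two_proportion_z` is hypothetical.

```python
import math


def two_proportion_z(rate_a, rate_b, n_a, n_b):
    """Pooled two-proportion z-test for two response rates.

    Returns (z, two_sided_p_value). Assumes independent mail groups
    large enough for the normal approximation to hold.
    """
    pooled = (rate_a * n_a + rate_b * n_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# 1.15 percent vs. 1.23 percent, at two different mail quantities per group.
z_small, p_small = two_proportion_z(0.0115, 0.0123, 5_000, 5_000)
z_large, p_large = two_proportion_z(0.0115, 0.0123, 100_000, 100_000)
print(f"5,000 per group:   z = {z_small:.2f}, p = {p_small:.2f}")
print(f"100,000 per group: z = {z_large:.2f}, p = {p_large:.2f}")
```

Under these assumptions, the gap at 5,000 pieces per group yields a z-score of roughly 0.37, which is indistinguishable from noise, while at 100,000 pieces per group it rises to about 1.65, which only just clears a 90 percent confidence threshold.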
Closed-loop marketing is one of the most overused terms in marketing, and yet many marketers do not close the loop properly. Remember that embedding key codes in your mail pieces is just the beginning. Properly analyzing the results and applying the knowledge to the next mailing will complete the circle.
Stephen Yu is vice president of database marketing at infoUSA National Accounts Division, a direct marketing solutions firm in Woodcliff Lakes, N.J. He can be reached at (201) 476-2305.