Database: Close the Loop Properly
One may argue that the random allocation mechanism built into merge/purge is perfectly fair, since the rule applies to all list providers. That is simply not the case unless the order sizes are roughly the same across providers. Suppose a marketer orders names from only two sources: 1 million names from Vendor A, thanks to a long and successful relationship, and only 10,000 names from Vendor B, which is relatively new to the industry. Suppose further that there are 5,000 duplicates between the two files. After the “random” allocation, each list will have lost 2,500 names. That loss is only 0.25 percent of the input for Vendor A, but it is a 25 percent loss for Vendor B, enough to undermine the validity of the response study. And this assumes that random allocation works properly (often it doesn’t) and that all lists receive the same merge/purge priority (that never happens). In reality, where many more lists bump into one another with a far greater number of interfile duplicates, small test files are practically destroyed, in statistical terms, before the study even begins.
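To make the arithmetic concrete, here is a minimal sketch of how such a random split plays out. The vendor names, file sizes, and duplicate count are the hypothetical figures from the example above, not data from any real mailing.

```python
import random

# A minimal sketch of "random" duplicate allocation in a merge/purge.
# All figures are the hypothetical example from the text above.
VENDOR_A_SIZE = 1_000_000
VENDOR_B_SIZE = 10_000
INTERFILE_DUPES = 5_000

random.seed(42)  # fixed seed so the illustration is reproducible

# For each duplicate pair, randomly credit one vendor and strip the
# name from the other; each vendor loses roughly half the duplicates.
lost_a = sum(1 for _ in range(INTERFILE_DUPES) if random.random() < 0.5)
lost_b = INTERFILE_DUPES - lost_a

print(f"Vendor A loses {lost_a:,} names ({lost_a / VENDOR_A_SIZE:.2%} of input)")
print(f"Vendor B loses {lost_b:,} names ({lost_b / VENDOR_B_SIZE:.2%} of input)")
# Typical output: Vendor A loses about 0.25% of its file while Vendor B
# loses about 25% -- the same absolute loss, wildly different impact.
```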
This dilemma can easily be solved by using the merge/purge “input” file in the matchback process, crediting all name providers for known responders. Instead of matching the responder file to the master file (the typical scenario), the merge/purge input file, with all interfile duplicates intact, should be matched against the responder file, which acts as the base file. Through this “reverse” match process, marketers can study response rates list by list, as if each were the only list in the mailing.
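A minimal sketch of that reverse match follows, assuming toy records keyed on a simplified match code. The vendor names and keys are made up for illustration; a real matchback would derive its match key from standardized name-and-address hygiene rather than raw strings.

```python
from collections import Counter

# Merge/purge input file with interfile duplicates left intact, so the
# same match key can appear under more than one list source.
input_file = [
    # (match_key, list_source) -- hypothetical records
    ("JSMITH|123MAIN|60601", "Vendor A"),
    ("JSMITH|123MAIN|60601", "Vendor B"),  # duplicate kept, not purged
    ("MJONES|45OAK|10001",   "Vendor A"),
    ("KLEE|9ELM|94105",      "Vendor B"),
]

# Responder file acts as the base file for the reverse match.
responders = {"JSMITH|123MAIN|60601"}

mailed = Counter(src for _, src in input_file)
credited = Counter(src for key, src in input_file if key in responders)

# Every list that supplied a responder's name gets credit, so each
# list's response rate reads as if it were the only list in the mailing.
for src in sorted(mailed):
    rate = credited[src] / mailed[src]
    print(f"{src}: {credited[src]} responders / {mailed[src]} mailed = {rate:.1%}")
```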
Today, most mailers do not employ such a method, simply because larger file sizes on the input side lead to higher processing costs. Many believe the “random” merge/purge will resolve the allocation issue, though it may be seriously flawed when file sizes vary among list providers. With all other factors held constant, random allocation tends to favor larger lists; for smaller files, it is like deciding the outcome of a baseball game after the fourth inning. Such an inadequate response-study practice should not be justified in the name of cost savings. The question must be asked: Why are we still using an antiquated “per 1,000” pricing scale for vital functions like matchback, when processing and storage costs are fractions of what they used to be?