Direct Selling: Making a Match
Perfecting matchback in an imperfect world
March 2008 By Steve Trollinger
No industry standard exists for matchback processing. That much is true. But in this article, you’ll learn methods that, when used appropriately, just might help you with the all-important task of better identifying the sources of unknown orders.
First, let’s identify the real problem with matchbacks. Working with various companies during the past several years has given us the opportunity to see the outcomes of a number of matchback processes. The results? Most consumer catalogs see match rates between 25 percent and 50 percent for unaccounted-for orders (i.e., orders that cannot be attributed directly to a catalog mailing, e-mail campaign, search marketing or pay-per-click program, or other trackable effort).
B-to-B mailers often see higher hit rates at the company level. But, not surprisingly, at the contact level, match rates often are even lower than with consumer mailers. Why is it that matchback doesn’t work better?
Lack of Address Standardization
Common matchkeys (discussed below) rely on algorithms that cannot process nuances in address entry, such as E instead of East or Trolinger instead of Trollinger, or order records that contain a company name in the last name or address 3 field.
The best way to produce immediate gains in matchback results is to evaluate and fine-tune your methods for address standardization. Every directional, roadway indicator, post office box, apartment or suite number must be represented the same way within the mail file and the response data. Once you know that every effort has been made to build a good foundation, developing a matchkey that accurately links responders to the mail file is critical.
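To make the idea concrete, here is a minimal sketch of address standardization in Python. The lookup tables are hypothetical and deliberately tiny, and the street name is invented for illustration; a production process would more likely lean on postal-certified software or a service bureau. The point is only that both the mail file and the response data pass through the same function.

```python
import re

# Hypothetical, deliberately small lookup tables (assumptions, not a postal standard).
DIRECTIONALS = {"NORTH": "N", "SOUTH": "S", "EAST": "E", "WEST": "W"}
ROADWAYS = {"STREET": "ST", "ROAD": "RD", "AVENUE": "AVE", "DRIVE": "DR"}
UNITS = {"APARTMENT": "APT", "SUITE": "STE"}


def standardize_address(raw: str) -> str:
    """Return an upper-case address line with common tokens written one way."""
    text = raw.upper()
    # Collapse "P.O. Box", "P O Box", "POBOX" and similar into "PO BOX".
    text = re.sub(r"\bP\.?\s*O\.?\s*BOX\b", "PO BOX", text)
    # Drop periods and commas so "E." and "E" compare equal.
    text = re.sub(r"[.,]", " ", text)
    tokens = []
    for token in text.split():
        token = DIRECTIONALS.get(token, token)
        token = ROADWAYS.get(token, token)
        token = UNITS.get(token, token)
        tokens.append(token)
    return " ".join(tokens)


# Two ways of keying the same (invented) address end up identical.
print(standardize_address("5800 East Foxridge Drive, Suite 200"))
print(standardize_address("5800 E. Foxridge Dr. Ste 200"))
```

A simple token lookup like this will also abbreviate a street that happens to be named "North" or "Drive," which is one reason dedicated tools tend to do better.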
The Key That Opens All
This article began by making the point that there is no industry standard for matchback processing, and that fact is perhaps no more apparent than in matchkey development. The matchkey is a string of data that you ultimately use to link records on your order file to records in the mail file. Matchkeys are necessary because exact matches on name, address and ZIP code are extraordinarily rare. The challenge lies in how the matchkey should be developed.
In matchkey development, every character counts. The length of your matchkey is up to you (or your service bureau; I’ll discuss that later), but it should be meaningful and balance the risks of over- and under-matching.
Over-matching is when too many records are indicated as matches given a particular matchkey. If, for example, you used the first two digits of a street number and ZIP code, the key 5866202 could potentially match several records in your mail file.
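One way to see over-matching before it skews your results is to count how many mail-file records share each key. The sketch below assumes a hypothetical mail file held as a list of dictionaries; the field names and records are invented for illustration.

```python
from collections import defaultdict


def loose_key(street_number: str, zip_code: str) -> str:
    """First two digits of the street number followed by the ZIP code."""
    return street_number[:2] + zip_code


def ambiguous_keys(records):
    """Return only the keys that point at more than one mail-file record."""
    groups = defaultdict(list)
    for record in records:
        key = loose_key(record["street_number"], record["zip"])
        groups[key].append(record["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical mail-file records.
mail_file = [
    {"id": 1, "street_number": "5800", "zip": "66202"},
    {"id": 2, "street_number": "5812", "zip": "66202"},
    {"id": 3, "street_number": "58", "zip": "66202"},
    {"id": 4, "street_number": "1200", "zip": "66202"},
]

print(ambiguous_keys(mail_file))  # {'5866202': [1, 2, 3]}
```

If a large share of your keys show up in this report, the key is probably too loose.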
Under-matching is, of course, the opposite. Here’s an example: Mail sent to my attention in Shawnee Mission, Kan., would reach me just as quickly as mail sent to Mission, Kan. If my matchkey incorporated the city name, TRL 5800 MSSN 66202 would not match TRL 5800 SHWN 66202, even though both keys referenced the exact same person. In other words, the key would produce under-matched results.
The balancing of under-matching and over-matching often is achieved through the introduction of multiple matchkeys to a matchback process. By creating two to three keys, you essentially make the statement, “what one key misses, the other will get.” Still, developing individual matchkeys should help you find new hits, not just refind matches you’ve already got.
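The "what one key misses, the other will get" idea can be sketched as a series of passes, tightest key first, where each later key is tried only on orders that are still unmatched. Everything here is an illustrative assumption: the field names, the pre-abbreviated codes and the choice to keep the first mail record found for a key.

```python
def key_with_city(rec):
    return f"{rec['surname_code']} {rec['street_number']} {rec['city_code']} {rec['zip']}"


def key_without_city(rec):
    return f"{rec['surname_code']} {rec['street_number']} {rec['zip']}"


def cascade_match(orders, mail_file, key_funcs):
    """Try each key in turn; later keys only see orders earlier keys missed."""
    matches = {}
    unmatched = list(orders)
    for pass_number, make_key in enumerate(key_funcs, start=1):
        index = {}
        for rec in mail_file:
            # Keep the first mail record per key (a simplifying assumption).
            index.setdefault(make_key(rec), rec["mail_id"])
        still_unmatched = []
        for order in unmatched:
            mail_id = index.get(make_key(order))
            if mail_id is None:
                still_unmatched.append(order)
            else:
                matches[order["order_id"]] = (mail_id, pass_number)
        unmatched = still_unmatched
    return matches, unmatched


# Hypothetical records using the city example above.
mail_file = [
    {"mail_id": "M1", "surname_code": "TRL", "street_number": "5800",
     "city_code": "MSSN", "zip": "66202"},
]
orders = [
    {"order_id": "A", "surname_code": "TRL", "street_number": "5800",
     "city_code": "MSSN", "zip": "66202"},
    {"order_id": "B", "surname_code": "TRL", "street_number": "5800",
     "city_code": "SHWN", "zip": "66202"},
    {"order_id": "C", "surname_code": "SMT", "street_number": "12",
     "city_code": "MSSN", "zip": "66202"},
]

matches, unmatched = cascade_match(orders, mail_file, [key_with_city, key_without_city])
print(matches)    # {'A': ('M1', 1), 'B': ('M1', 2)}
print(unmatched)  # order C is left unmatched
```

Recording which pass produced each match makes it easy to check whether a second or third key is finding new hits or just repeating the first.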
While there is an endless variety of matchkey sequences, all matchkeys should bring in elements of name, address and ZIP, but the order and number of the included elements can vary. A common matchkey will combine elements of surname, street address and ZIP in various ways, perhaps as TRL 5800 FXR 66202: that is, the first three consonants of the surname, the address number, the first three consonants of the street name and the ZIP code. But note that the algorithm can be confounded by street names that incorporate directionals, so another variation of a matchkey might be TRL 5800 66202. This might work for a Trollinger at that street number, but what about all the Smiths? The key SMT 5800 66202 would likely over-match customers.
The more elements you allow into the key, the more restrictive it becomes and the more likely under-matching is to occur. Variations on the above will tighten and loosen the key to match more or fewer records. The key TR 58 FX 662 will match the same records as TRL 5800 FXR 66202, but also would match a Travers at 5802 Foxhole Rd in ZIP code 66208. The point: Keep as tight a matchkey as possible to allow maximum matching without significant false positives.
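The tightening and loosening described here can be sketched as a single key builder with adjustable lengths. This is an illustration under stated assumptions, not a standard: the function and parameter names are hypothetical, "Foxridge" is an invented street name, and the letter Y is treated as a consonant.

```python
VOWELS = set("AEIOU")


def consonants(word: str, n: int) -> str:
    """First n consonants of a word, upper-cased (Y counts as a consonant here)."""
    letters = [c for c in word.upper() if c.isalpha() and c not in VOWELS]
    return "".join(letters[:n])


def matchkey(surname, street_number, street_name, zip_code,
             name_len=3, number_len=4, street_len=3, zip_len=5):
    """Build a space-separated key; shorter lengths make a looser key."""
    parts = [
        consonants(surname, name_len),
        street_number[:number_len],
        consonants(street_name, street_len),
        zip_code[:zip_len],
    ]
    # Skip empty parts so street_len=0 yields a key without the street name.
    return " ".join(part for part in parts if part)


loose = dict(name_len=2, number_len=2, street_len=2, zip_len=3)

print(matchkey("Trollinger", "5800", "Foxridge", "66202"))                # TRL 5800 FXR 66202
print(matchkey("Trollinger", "5800", "Foxridge", "66202", street_len=0))  # TRL 5800 66202
print(matchkey("Trollinger", "5800", "Foxridge", "66202", **loose))       # TR 58 FX 662
print(matchkey("Travers", "5802", "Foxhole", "66208", **loose))           # TR 58 FX 662
```

The last two lines show the false positive: with the loose settings, two different customers collapse into one key, while the default settings keep them apart.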
Still, even tightly defined matchkeys often will produce less than 50 percent match rates, so don’t be discouraged when your perfect key doesn’t open every door.
If Orders Don’t Match, Why Mail So Many Catalogs?
If matchbacks typically produce low hit rates and merely a portion of uncoded orders are being attributed to catalog mailings, it only makes sense that the catalog programs should be scaled back, right? Not so fast.
More than one catalog mailer has learned the hard way that you can’t just cut catalogs because you think they aren’t producing return; sales will most certainly plummet with the circulation cuts. But how can you be sure that the same principles apply to your business? Simple: statistics.
Correlation analysis will quickly shed light on the effects of your mail programs on overall orders and sales, and generally confirms the findings of the USPS and even Britain’s Royal Mail: Catalogs drive the lion’s share of online order activity. To see the effects of your catalog mailings on online ordering, run a correlation analysis using Microsoft Excel’s Analysis ToolPak.
But before you begin, let’s make an assumption. Consider the reason your call-center phones ring with customers wanting to place orders. What, for the most part, causes customers to call and order? Your phones ring because you mail catalogs—so you should be able to correlate the order activity on your Web site with the order activity from your call center and come up with a reliable metric to build upon.
To run the correlation analysis, you’ll need two data sets: order counts by day for the catalog (e.g., call center) and unaccounted-for Web orders by day for the same period. Remember, unaccounted-for Web orders are those that cannot be attributed to a specific marketing activity, such as e-mail campaigns, search marketing and pay-per-click programs, affiliate marketing efforts, etc. Here you’re looking for orders that came directly to your URL or were a result of organic search, particularly on branded keywords.
Put these two columns of data side by side for each day of the period, and allow Excel to run the numbers for you. What results is a modest table with a handful of numbers, the most important of which is the correlation coefficient, a number between negative one and one that indicates the degree to which two variables are linearly related. For phone and Web orders that rise and fall together, expect a positive value. To get the real answer to the question, “How much does my catalog drive Web orders?” you must square the correlation coefficient to produce the coefficient of determination, a measure of the proportion of variability that the two variables share.
Suppose, for example, that the correlation coefficient is 0.9. That indicates a high level of linear relation (the variables “move” the same way), and squaring the coefficient says that 81 percent of the variability is shared between phone and Web orders. So, in this example, roughly 81 percent of Web order activity can be treated as related to phone orders. And if phone orders are driven by the catalog, so is that 81 percent of the Web orders.
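If you would rather not rely on a spreadsheet, the same two numbers can be computed in a few lines of Python. This is a minimal sketch of the standard Pearson correlation; the daily order counts are made-up figures, not results from any real mailing.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length series (-1 to 1)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least two days")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Raises ZeroDivisionError if either series never varies.
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily counts for one week.
phone_orders = [120, 95, 130, 160, 150, 80, 70]
unaccounted_web_orders = [60, 50, 66, 85, 71, 45, 38]

r = pearson_r(phone_orders, unaccounted_web_orders)
r_squared = r ** 2  # coefficient of determination

print(f"correlation coefficient: {r:.3f}")
print(f"coefficient of determination: {r_squared:.3f}")
```

A week of data is only enough to show the mechanics; a longer period that spans several in-home dates would give a steadier figure.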
The findings that you uncover in evaluating your own data will give you two pieces of valuable information: 1) how important your catalog is to your overall business; and 2) the factor to use in allocating unaccounted-for orders back to catalog segments once the matchback itself is complete.
So even if you match and can attribute only 50 percent of the orders to the catalog, you know that you should attribute, in this example, 81 percent of the remaining unaccounted-for orders to the catalog. This is where allocation comes in.
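The arithmetic can be sketched as follows. The order counts and segment names are hypothetical, and spreading the allocated orders across segments in proportion to each segment's matched orders is one reasonable assumption rather than a prescribed method.

```python
def allocate_unmatched(unaccounted_orders: int, matched_orders: int,
                       r_squared: float) -> int:
    """Orders still unmatched after matchback, scaled by the correlation factor."""
    remaining = unaccounted_orders - matched_orders
    return round(remaining * r_squared)


def prorate(allocated: int, matched_by_segment: dict) -> dict:
    """Spread allocated orders across segments in proportion to matched orders."""
    total = sum(matched_by_segment.values())
    return {segment: allocated * count / total
            for segment, count in matched_by_segment.items()}


# Hypothetical campaign: 1,000 unaccounted-for orders, 500 of them matched.
matched_by_segment = {"housefile": 300, "prospects": 200}
allocated = allocate_unmatched(1000, 500, 0.81)

print(allocated)                               # 405
print(prorate(allocated, matched_by_segment))  # {'housefile': 243.0, 'prospects': 162.0}
```

In this example, 500 orders are matched, 405 more are allocated to the catalog, and the remaining 95 stay unattributed.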
Three Levels of Results
The outcome of a matchback analysis and correlation study serves as a guide for analysis of a mail campaign. By tracking order data, you know the number of customers who bought using the catalog. For the rest, we’re guessing … which mailing “tripped their triggers,” whether their typing your company name into Google was a function of having received a catalog, and which combination of letters and numbers will find the most customers on your mail files.
Still, all that work produces a viable set of benchmarks for your analysis. You know what you definitely got from the mailing, what you most likely got and what you can presume you got by incorporating each level of data—namely, tracked, matched and correlation-based pro-rated allocation. Then you can make directional decisions for future campaigns—decisions that likely don’t involve drastic cuts in circulation.
Lastly, Who Does It?
Consider having a reputable service bureau process your matchback for you. The cost is often relatively minimal, particularly when you’re evaluating several mailings at once. Bureaus’ address standardization tools often are superior to anything you might have on your desktop—and many of them already have matchkeys developed and can turn around a project in a fraction of the time it would take if you did it yourself.
Good luck and happy mailing!
Steve Trollinger is executive vice president of J. Schmid & Associates, Mission, Kan. You can reach him at stevet@jschmid.com.