Does the Shoe Fit? How to Select Demographic Data to Enhance Yo
For elements that are particularly important to your business, make sure the vendor can provide information for a significant proportion of your customers. For each element, the field match rate is the percentage of the total sample for which the vendor returns a known value. Table 1 (below) shows example match rates for two vendors. Vendor A has a higher top-level match rate and can identify age on 70 percent of the file, but vendor B is able to identify more customers as married and as homeowners. A furniture cataloger who is interested in home ownership may therefore prefer vendor B.
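As a rough illustration of the field match rate calculation, here is a minimal Python sketch. The function name, the field names, the set of values treated as unknown, and the four sample records are all assumptions made up for this example; a real vendor file will have its own layout and its own codes for missing data.

```python
def field_match_rates(records, fields, unknown_values=(None, "", "unknown")):
    """Return {field: percent of all records carrying a known value}."""
    total = len(records)
    rates = {}
    for field in fields:
        # A record counts as matched on a field only if its value is known.
        known = sum(1 for rec in records if rec.get(field) not in unknown_values)
        rates[field] = 100.0 * known / total if total else 0.0
    return rates


# Hypothetical enhanced sample file (made-up values, not real vendor data).
sample = [
    {"age": 34, "marital_status": "married", "homeowner": "yes"},
    {"age": None, "marital_status": "unknown", "homeowner": "yes"},
    {"age": 51, "marital_status": "unknown", "homeowner": "unknown"},
    {"age": 47, "marital_status": "married", "homeowner": None},
]

rates = field_match_rates(sample, ["age", "marital_status", "homeowner"])
for field, rate in rates.items():
    print(f"{field}: {rate:.0f}% matched")
```

Running the same function on each vendor's returned sample gives figures that can be compared side by side, field by field.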
Vendors summarize information in different ways. One vendor might provide a marital status field with only two levels, married or unknown, which means customers who are not married cannot be told apart from those with no data. Take time to understand the possible values of each element. Look at the distributions of key fields to make sure they are reasonable. For example, if you sell high-priced merchandise, you would not expect vendor data to show most of your customers with an income under $30,000.
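One simple way to look at the distribution of a key field is to tabulate the share of records at each level. The sketch below assumes a hypothetical `income_band` field and invented band labels; actual vendors will define their own levels.

```python
from collections import Counter


def value_distribution(records, field, missing_label="unknown"):
    """Return {level: percent of records}, most common level first."""
    counts = Counter(
        rec.get(field) if rec.get(field) is not None else missing_label
        for rec in records
    )
    total = sum(counts.values())
    return {level: 100.0 * n / total for level, n in counts.most_common()}


# Hypothetical records; the income bands are invented for illustration.
incomes = [
    {"income_band": "under $30,000"},
    {"income_band": "$30,000-$74,999"},
    {"income_band": "$75,000 and over"},
    {"income_band": "$75,000 and over"},
    {"income_band": None},
]

distribution = value_distribution(incomes, "income_band")
for level, pct in distribution.items():
    print(f"{level}: {pct:.0f}%")
```

If the largest share lands in a band that contradicts what you know about your customers, that is a signal to question the vendor about how the field is built.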
Predictive power. If you plan to use demographic data in specific modeling situations, test these uses before you buy. Ask each vendor to return the enhanced sample file to you. Add the demographic elements to your current models and see whether lift improves. Another way to gauge predictive power is to compute averages for each level of a demographic field. For example, compare the average dollars per customer for people marked as homeowners, people marked as unknown, and people with no top-level match. If the homeowners show higher average dollars per customer, then this field could be valuable in distinguishing among your customers.
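The comparison of averages across levels can be sketched in a few lines of Python. The field names, level labels, and dollar amounts below are assumptions invented for the example, not figures from any vendor.

```python
from collections import defaultdict


def average_by_level(records, field, amount_field):
    """Return {level: average amount per customer at that level}."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        level = rec[field]
        totals[level] += rec[amount_field]
        counts[level] += 1
    return {level: totals[level] / counts[level] for level in totals}


# Hypothetical customers with made-up spending figures.
customers = [
    {"homeowner": "homeowner", "dollars": 120.0},
    {"homeowner": "homeowner", "dollars": 80.0},
    {"homeowner": "unknown", "dollars": 60.0},
    {"homeowner": "unknown", "dollars": 40.0},
    {"homeowner": "no match", "dollars": 30.0},
]

averages = average_by_level(customers, "homeowner", "dollars")
for level, avg in averages.items():
    print(f"{level}: ${avg:.2f} per customer")
```

With a real sample file, check how many customers fall in each level before reading much into a difference, since an average built on a handful of records can mislead.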
Data accuracy. Are the overlay values correct? There may be particular elements already on your file to use for comparison. For example, a company that offers credit may be able to judge the accuracy of a vendor's age field based on internal credit application data. If you are particularly concerned about accuracy of certain fields, you could survey several hundred customers and compare their answers to the overlay data. Table 2 (at left) illustrates that vendor A tends to be more accurate on date of birth.
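To put a number on accuracy, one option is an agreement rate: among customers where both the overlay and your trusted source (internal records or survey answers) have a known value, what percentage agree? The sketch below is one possible approach under that assumption; the function name and the dates of birth are invented and do not reproduce Table 2.

```python
def agreement_rate(pairs, unknown_values=(None, "", "unknown")):
    """Percent of comparable (overlay, trusted) pairs whose values agree.

    Pairs where either side is unknown are left out of the comparison.
    Returns None when nothing can be compared.
    """
    comparable = [
        (overlay, trusted)
        for overlay, trusted in pairs
        if overlay not in unknown_values and trusted not in unknown_values
    ]
    if not comparable:
        return None
    matches = sum(1 for overlay, trusted in comparable if overlay == trusted)
    return 100.0 * matches / len(comparable)


# Hypothetical (vendor date of birth, surveyed date of birth) pairs.
vendor_a = [
    ("1960-05-01", "1960-05-01"),
    ("1972-11-23", "1972-11-23"),
    ("1985-02-14", "1985-02-14"),
    (None, "1990-07-04"),
]
vendor_b = [
    ("1960-05-01", "1960-05-01"),
    ("1973-11-23", "1972-11-23"),
    ("1985-02-14", "1985-02-14"),
    ("1991-07-04", "1990-07-04"),
]

rate_a = agreement_rate(vendor_a)
rate_b = agreement_rate(vendor_b)
print(f"Vendor A agreement: {rate_a:.0f}%")
print(f"Vendor B agreement: {rate_b:.0f}%")
```

Accuracy and match rate are separate questions. In this made-up data, vendor A agrees on every record it fills in but leaves one blank, while vendor B fills in every record and agrees on only half.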