Big Data Must Get Smaller
In the business of predictive analytics for marketing, the following three types of data make up three dimensions of a target individual's portrait:
- Descriptive Data
- Transaction Data / Behavioral Data
- Attitudinal Data
In other words, if we get to know all three aspects of a person, it will be much easier to predict what the person is about and/or what the person will do. Why do we need these three dimensions? If an individual has a high income and lives in a highly valued home (a demographic element, which is descriptive), and if he is an avid golfer (a behavioral element, often derived from his purchase history), can we just assume that he is politically conservative (an attitudinal element)? Well, not really, and not all the time. Sometimes we have to stop and ask what the person's attitude and outlook on life are. Now, because it is not practical to ask everyone in the country about every subject, we often build models to predict the attitudinal aspect with available data. If you got a phone call from a political party that "assumes" your political stance, that incident was probably not random or accidental. As I have emphasized many times, analytics is about making the best of what is available, as there is no such thing as a complete dataset, even in this age of ubiquitous data. Nonetheless, each of these three dimensions of the data spectrum occupies a unique and distinct place in the business of predictive analytics.
So, in the interest of obtaining, maintaining and utilizing all possible types of data (or, conversely, reducing the size of data with conviction by knowing what to ignore), let us dig a little deeper:
Generally, demographic data (such as people's income, age, number of children, housing size, dwelling type and occupation) fall under the descriptive category. For B-to-B applications, "firmographic" data (such as number of employees, sales volume, year started and industry type) would be considered descriptive data. These data describe what the targets "look like" and, generally, they are frozen in the present time. Many prominent data compilers (or data brokers, as the U.S. government calls them) collect, compile and refine the data and make hundreds of variables available to users in various industry sectors. They also fill in the blanks using predictive modeling techniques. In other words, the compilers may not know the income range of every household, but using statistical techniques and other available data (such as age, home ownership, housing value and many other variables), they provide their best estimates where values are missing. People often have an allergic reaction to such data compilation practices, citing privacy concerns, but these types of data are not about looking up one person at a time; they are about analyzing and targeting groups (or segments) of individuals and households. In terms of predictive power, they are quite effective and the results are very consistent. The best part is that most of the variables are available for every household in the country, whether they are actual or inferred.
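To make the "fill in the blanks" idea concrete, here is a minimal Python sketch. It is not how any particular compiler does it. It swaps in a much simpler technique, group-median imputation, for the statistical models described above. The household records, the field names (`age_band`, `owns_home`, `income`) and the function `fill_missing_income` are all hypothetical, invented for illustration.

```python
from statistics import median

# Hypothetical household records (invented for illustration).
# None marks an income value that was never collected.
households = [
    {"id": 1, "age_band": "35-44", "owns_home": True,  "income": 95000},
    {"id": 2, "age_band": "35-44", "owns_home": True,  "income": 105000},
    {"id": 3, "age_band": "35-44", "owns_home": True,  "income": None},
    {"id": 4, "age_band": "25-34", "owns_home": False, "income": 48000},
    {"id": 5, "age_band": "25-34", "owns_home": False, "income": 52000},
    {"id": 6, "age_band": "25-34", "owns_home": False, "income": None},
]


def fill_missing_income(records):
    """Return copies of the records with missing incomes estimated.

    Assumption: a household's income resembles the median income of
    households in the same age band with the same home-ownership status.
    Real compilers would use richer models and many more variables.
    """
    # Collect known incomes per (age band, ownership) segment.
    known = {}
    for r in records:
        if r["income"] is not None:
            key = (r["age_band"], r["owns_home"])
            known.setdefault(key, []).append(r["income"])

    # Fallback for segments with no known incomes at all.
    overall = median(v for values in known.values() for v in values)

    filled = []
    for r in records:
        row = dict(r)
        # Flag the value so users can tell actual data from inferred data.
        row["income_inferred"] = r["income"] is None
        if r["income"] is None:
            peers = known.get((r["age_band"], r["owns_home"]))
            row["income"] = median(peers) if peers else overall
        filled.append(row)
    return filled


completed = fill_missing_income(households)
for row in completed:
    print(row["id"], row["income"], "inferred" if row["income_inferred"] else "actual")
```

Keeping the `income_inferred` flag next to each value mirrors the distinction drawn above between actual and inferred variables: every household ends up with a usable value, and the user can still tell which ones are estimates.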
Stephen H. Yu is a world-class database marketer. He has a proven track record in comprehensive strategic planning and tactical execution, effectively bridging the gap between the marketing and technology world with a balanced view obtained from more than 30 years of experience in best practices of database marketing. Currently, Yu is president and chief consultant at Willow Data Strategy. Previously, he was the head of analytics and insights at eClerx, and VP, Data Strategy & Analytics at Infogroup. Prior to that, Yu was the founding CTO of I-Behavior Inc., which pioneered the use of SKU-level behavioral data. “As a long-time data player with plenty of battle experiences, I would like to share my thoughts and knowledge that I obtained from being a bridge person between the marketing world and the technology world. In the end, data and analytics are just tools for decision-makers; let’s think about what we should be (or shouldn’t be) doing with them first. And the tools must be wielded properly to meet the goals, so let me share some useful tricks in database design, data refinement process and analytics.” Reach him at firstname.lastname@example.org.