It’s All About Ranking
To develop models using variables that describe customers, not transactions, we must convert relational or unstructured data into a structure that matches the level at which we want to rank the records, even if only temporarily. As databases get bigger and storage gets cheaper, I'd say that the temporary period could be, well, indefinite. And because the term "data mart" is overused and confusing to many, let me just call that place the "Analytical Sandbox." Sandboxes are fun, and yes, all kinds of fun stuff for marketers and analysts happens there.
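To make the conversion concrete, here is a minimal Python sketch of rolling transaction-level rows up into one descriptor record per customer. The field names (`customer_id`, `date`, `amount`), the sample rows and the `summarize_by_customer` helper are all hypothetical, made up for illustration; a real sandbox would carry far more variables than these three.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction-level records: one row per purchase, not per customer.
transactions = [
    {"customer_id": "C1", "date": date(2023, 1, 5), "amount": 40.0},
    {"customer_id": "C1", "date": date(2023, 3, 9), "amount": 60.0},
    {"customer_id": "C2", "date": date(2023, 2, 1), "amount": 25.0},
]


def summarize_by_customer(rows):
    """Collapse transaction rows into one descriptor record per customer."""
    summary = defaultdict(
        lambda: {"orders": 0, "total_spend": 0.0, "last_purchase": None}
    )
    for row in rows:
        rec = summary[row["customer_id"]]
        rec["orders"] += 1
        rec["total_spend"] += row["amount"]
        # Keep the most recent purchase date seen so far.
        if rec["last_purchase"] is None or row["date"] > rec["last_purchase"]:
            rec["last_purchase"] = row["date"]
    return dict(summary)


# One record per customer: the level at which we want to rank.
customer_view = summarize_by_customer(transactions)
```

The point of the sketch is the change of level: three purchase rows go in, and two customer records come out, each ready to be used as model input.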
The Analytical Sandbox is where samples are created for model development; actual models are built; every record is scored, no matter how many there are, without hiccups; targets are easily sorted and selected by model scores; reports are created in meaningful and consistent ways (consistency is even more important than sheer accuracy in what we do); and analytical languages such as SAS, SPSS or R are spoken without being frowned upon by other computing folks. Here, analysts will spend their time pondering target definitions and methodologies, not database structures and incomplete data fields. Have you heard the fancy term "in-database scoring"? This is where that happens, too.
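As a rough illustration of scoring every record and then selecting targets by score, consider the Python sketch below. The coefficients, the variable names and the choice of a logistic formula are assumptions made for the example, not a real fitted model, and treating a missing input as zero is just one simple convention among many.

```python
import math

# Hypothetical coefficients; a real model would be fitted on a development sample.
INTERCEPT = -2.0
WEIGHTS = {"orders": 0.5, "total_spend": 0.01}


def score(record):
    """Return a score in (0, 1) from a logistic formula.

    Missing inputs are treated as 0.0 (an assumption), so every record
    receives a score.
    """
    z = INTERCEPT + sum(
        weight * (record.get(name) or 0.0) for name, weight in WEIGHTS.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical customer-level records, including one with missing values.
customers = {
    "C1": {"orders": 2, "total_spend": 100.0},
    "C2": {"orders": 1, "total_spend": 25.0},
    "C3": {"orders": None, "total_spend": None},
}

# Score every record, then rank from highest to lowest score.
scores = {cid: score(rec) for cid, rec in customers.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

# Select the top two targets by model score.
top_two = ranked[:2]
```

Once scores exist for every record, target selection reduces to a sort and a cutoff, which is why the ranking level matters so much.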
And what comes out of the Analytical Sandbox and goes back into the world of relational or unstructured databases (IT folks often ask this question) is going to be very simple. Instead of having to move mountains of data back and forth, all the variables will be in the form of model scores, providing answers to marketing questions, without any missing values (by definition, every record can be scored by models). While the scores pack tons of information, each one could be as small as a couple of bytes or even less. Even if you carry over a few hundred affinity scores for 100 million people (or any other type of entity), I wouldn't call the resulting file large, as it would be as small as a few video files, really.
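The size claim is easy to check with back-of-the-envelope arithmetic. The Python sketch below assumes "a few hundred" means 300 scores and ignores keys, headers and compression; the `score_file_size_gb` helper is hypothetical.

```python
def score_file_size_gb(n_records, n_scores, bytes_per_score):
    """Raw size of a score file in gigabytes (1 GB = 10**9 bytes)."""
    return n_records * n_scores * bytes_per_score / 1e9


# 100 million records, 300 scores each (assumed), 1 or 2 bytes per score.
size_one_byte = score_file_size_gb(100_000_000, 300, 1)   # 30.0 GB
size_two_bytes = score_file_size_gb(100_000_000, 300, 2)  # 60.0 GB
```

Under these assumptions the file lands in the tens of gigabytes, which is modest next to the transaction histories that produced the scores.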
Stephen H. Yu is a world-class database marketer. He has a proven track record in comprehensive strategic planning and tactical execution, effectively bridging the gap between the marketing and technology world with a balanced view obtained from more than 30 years of experience in best practices of database marketing. Currently, Yu is president and chief consultant at Willow Data Strategy. Previously, he was the head of analytics and insights at eClerx, and VP, Data Strategy & Analytics at Infogroup. Prior to that, Yu was the founding CTO of I-Behavior Inc., which pioneered the use of SKU-level behavioral data. “As a long-time data player with plenty of battle experiences, I would like to share my thoughts and knowledge that I obtained from being a bridge person between the marketing world and the technology world. In the end, data and analytics are just tools for decision-makers; let’s think about what we should be (or shouldn’t be) doing with them first. And the tools must be wielded properly to meet the goals, so let me share some useful tricks in database design, data refinement process and analytics.” Reach him at firstname.lastname@example.org.