Where Is the Data Movement Going?
Not too long ago, I was helping a start-up company hire a data analyst. During that process, I interviewed a candidate who boldly told me that he thought “Data Mining” did not exist before the turn of the century. He was shocked to hear that someone of my age (i.e., a grownup) actually had a long career making money with data. That was pretty funny at the time, as such a claim is not so different from saying the movie industry did not exist before the invention of computer graphics. Alright, some action sequences did look rudimentary in the olden days, but we supplemented crude special effects with our imagination, didn’t we?
Nonetheless, the up-and-coming generation cannot even imagine living in the Old World, which was without constant connection to the net. Many of them do not own a TV set, nor do they have a land-line. To them, the environment in which the grownups grew up, where there were no means to record anything and everything and share the results instantaneously, might as well be an extension of the Stone Age.
But some of us actually remember all too vividly the commercial application of mainframe computers, the PC revolution, expansion of the net, emergence and domination of wireless devices, and migration of information to the cloud. In fact, by witnessing such an evolution in a relatively short period of time, I can argue that the current way of sharing information through the cloud isn’t so different from the way the mainframe computers worked in the past, minus the fact that ugly green and black monitors were replaced by retina display and wires are about to go extinct.
In any case, data surely existed before the invention of computers and massive storage devices. They were just not in digital format, but contained in scrolls or leatherbound books, taking up large spaces called libraries with limited access. Proper data mining could have taken decades in those days, including a few trips to Alexandria. Even after human beings started digitizing information, data mining (the activity of converting information to insights and applying them to decision-making processes) was not all that glorious for the first few decades.
The first group of people who applied statistical techniques to target marketing were equipped with clunky mainframes that could only read punch cards. That was around the time when the Moon Shot was in full force, so one may think I am talking about ancient history in the age of Big Data. But those brave souls, mostly in the publishing industry, paved the mathematical way for modern-day data scientists.
With computers that were a few million times slower and equally more expensive, they literally had one chance to get the statistical work right for Christmas campaigns. And even in those days, Christmas came around once a year, and even with such antiquated computers, targeting based on statistical modeling actually worked. I would compare such an endeavor to calculating the reentry trajectory of a spaceship by hand. Lest we forget, tools may have changed, but not the mathematics.
Fast-forward to the present day, and we now accumulate more data in a day than our ancestors ever did since the invention of paper. We have the means to collect and store data during just about every glance you take, and every move you make. We now have data mining tools that practically build models with a few clicks and, soon enough, we will be able to set analytical parameters by simply stating the business goals to a computer. Statisticians? Data scientists? Someday in the near future, a machine will replace most of their functions.
Then why is it that most of us are not impressed with the majority of marketing messages? How is it that key performance metrics of marketing efforts did not improve at the rate of increase in computing speed and capacity? Granted that we the consumers are constantly being bombarded with too many mindless sales pitches, but how is it that a typical response rate (not clickthrough or page views, but actual conversion rate) of a typical 1-to-1 campaign still hovers under 1 percent, and at times, way below that mark? Doesn’t that mean that the marketing efforts are failing more than 99 percent of the time? Pioneers of data-mining techniques did better than that with computers barely faster than using an abacus.
Maybe the computers that retain a sick amount of data and tool sets that retrieve information and build models really fast do not help the matter all that much after all. Maybe the way we approach the data is off course. Maybe the users of the tool sets are messing it all up. So, let’s break it down.
About a decade ago, a movie called “Minority Report” starring Tom Cruise came out. Not particularly a great science fiction movie. But since it came out, I have been quoting parts of it as an example of the future of personalized marketing. So did many other data professionals. But allow me to bring it up one more time, as it laid out the steps the marketers and data players must consider in an easy-to-understand fashion.
In that movie, Tom Cruise’s character, a senior detective named John Anderton, runs away from the bad guys and replaces his eyes with someone else’s to distract the omnipresent retina scanners. Then he walks into a department store to be greeted by a computer configured for personalized marketing. Allow me to share the script verbatim, so that we can re-live this scene:
- As Anderton walks in the door and gets his new eyes scanned, we hear a voice say:
- STORE VOICE: “Hello, Mr. Yakamoto! Welcome back to the Gap.”
- Anderton stops cold as a holographic image of a huge Asian man now appears, standing in front of him.
- STORE VOICE: “How did those assorted tank tops work out for you?”
- Anderton stops and stares at the thug-like previous owner of his eyes who is now shown wearing a sweater that changes from color to color.
- STORE VOICE: “Come on in and see how good you look in one of our new winter sweaters.”
Let us suspend our concerns about data privacy for a second (refer to “Don’t Do It Just Because You Can,” where I talked about the difference between being helpful and creepy), and break down what just happened here.
First, the computer (in this case, representing marketers) identified the target individual. Identification of an individual is the most essential step toward personalization. But even with an ample amount of collected data, including personal trails, most marketers do not even try it. By the way, email addresses and cookies do not represent an individual. I am talking about a consistent key that enables mapping of a customer’s complete journey, not some clicks here and there.
Secondly, the computer retrieved the target individual’s past browsing and purchase history in real-time. Such fast retrieval and application means that collected data are properly categorized and tagged around individuals for the purpose of personalization, which must be differentiated from categorization for product taxonomy for inventory management or simple Web display.
Then the machine ran through a product recommendation algorithm in real-time, using behavioral, transactional and environmental data it collected about that individual. The final step of this 1-to-1 marketing is delivering the message to the target individual at the right moment through the right channel (in this case, in the person’s face). And in this futuristic movie, all this happens within a few seconds. Impressive? Well, not really.
Granted, the computers of the future will be far more advanced, even more so than today’s computers are in comparison to the mainframes of the 1960s. Still, it is not that impressive, because these are the same steps that any decent data player and analyst have been following since our predecessors first applied data mining techniques to target marketing. It just took much longer (six months or more, at times) to do all this. But is it all that surprising that one day, a computer will be able to finish these tasks in a second? Is it a big deal that it is showing sweaters in different colors because the new season is around the corner, anyway?
The important lesson is that no matter how much improvement we will gain in computing and analytics in the future, no one should skip steps that are laid out in this example. However, I am concerned that too many new data players act as if some analytical silver bullet will make all the marketers’ dreams come true, when tool sets are, in reality, developed mostly for one major function at a time.
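For readers who think in code, the four steps from the store scene (identify, retrieve, recommend, deliver) can be sketched roughly as below. This is only a toy illustration, not a description of any real system: the lookup tables, the customer key "CUST-001", the identifiers, and the one-line recommendation rule are all made-up assumptions.

```python
from collections import Counter

# All tables below are hypothetical sample data for illustration only.
IDENTITY_MAP = {  # Step 1 input: many identifiers resolve to one consistent key
    "cookie:abc123": "CUST-001",
    "email:y@example.com": "CUST-001",
    "retina:7f9e": "CUST-001",
}

HISTORY = {  # Step 2 input: purchases already categorized around the individual
    "CUST-001": [
        {"category": "tank tops"},
        {"category": "tank tops"},
        {"category": "jeans"},
    ],
}

SEASONAL_OFFERS = {  # Step 3 input: environmental data (the season)
    "winter": {"tank tops": "winter sweaters", "jeans": "lined jeans"},
}


def identify(identifier):
    """Step 1: map a captured identifier to a consistent customer key."""
    return IDENTITY_MAP.get(identifier)


def retrieve_history(customer_key):
    """Step 2: fetch history that is organized by individual."""
    return HISTORY.get(customer_key, [])


def recommend(history, season):
    """Step 3: toy rule, offer the seasonal counterpart of the top category."""
    if not history:
        return None
    counts = Counter(item["category"] for item in history)
    top_category, _ = counts.most_common(1)[0]
    return SEASONAL_OFFERS.get(season, {}).get(top_category)


def deliver(customer_key, offer, channel="in-store display"):
    """Step 4: format the message for the chosen channel."""
    return f"[{channel}] {customer_key}: come see our new {offer}"


def personalize(identifier, season):
    """Run the steps in order; a failure at any step stops the rest."""
    customer_key = identify(identifier)
    if customer_key is None:
        return None  # without identification there is nothing to personalize
    offer = recommend(retrieve_history(customer_key), season)
    if offer is None:
        return None
    return deliver(customer_key, offer)


print(personalize("retina:7f9e", "winter"))
```

Note that each function depends on the output of the one before it, which is the point of the example: if identification or categorization is missing, the later steps have nothing to work with.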
Let’s just say that the data refinement step (Step No. 2, in the example) is skipped over, as happens so often even in organizations that currently want to adopt advanced analytics. Do you think that some analytical software will magically take care of categorization and data hygiene steps, too? One day it may. And even if such a day comes, it will take a different type of machine learning for that specific task, and the initial parameters will have to be set by humans with clear goals. Without a proper training process, the machine will not even understand the target categories when faced with gigabytes of raw data.
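To make the refinement step concrete, here is a minimal sketch of categorization with target categories that a human has defined up front, followed by a summary by individual. The category names, keywords, and transactions are invented for illustration, and simple keyword matching stands in for whatever method a real project would use.

```python
from collections import defaultdict

# Hypothetical target categories, defined by a person with a marketing goal in mind.
CATEGORY_KEYWORDS = {
    "outerwear": ("sweater", "jacket", "coat"),
    "tops": ("tank top", "t-shirt", "shirt"),
}


def tag_category(description):
    """Assign a free-text product description to a predefined category."""
    text = description.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "uncategorized"  # raw data the rules do not understand yet


def summarize_by_customer(transactions):
    """Roll tagged transactions up to one spending summary per individual."""
    summary = defaultdict(lambda: defaultdict(float))
    for customer_key, description, amount in transactions:
        summary[customer_key][tag_category(description)] += amount
    return {key: dict(spend) for key, spend in summary.items()}


# Made-up raw transactions: (customer key, product description, amount)
raw = [
    ("CUST-001", "Assorted Tank Tops 3-pack", 24.0),
    ("CUST-001", "Wool winter SWEATER", 60.0),
    ("CUST-001", "Mens tank top, white", 10.0),
    ("CUST-002", "GFT CRD 50", 50.0),
]

print(summarize_by_customer(raw))
```

The last record lands in "uncategorized" because nobody told the rules what "GFT CRD" means, which is a small-scale version of the problem described above.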
Too many developers, regardless of the company banner under which they work, are completely ignoring old-school disciplines. Many start-ups are acting like they are reinventing data mining all over again. Too many are attempting to develop a super machine that can just take any size of unrefined, unstructured, and uncategorized data and spit out answers, without funnel-like data reduction steps. Sometime in the future, things may just happen that way. In fact, I cheer for anyone who will have that kind of breakthrough.
However, at the risk of sounding like an old-school geezer, I would say even such a machine will have to take the steps laid out by our predecessors. Data mining started a long time ago with much slower computers, and the old-timers had to think about the steps even more carefully, as they could not afford to waste precious machine time. And each necessary step definitely calls for a separate set of goals and different expertise. Data collection, mass storage, rapid retrieval, tagging and categorization, individual identification, data transformation and summary, modeling, scoring, customized message creation, and delivery of the final message to the right person at the right time through the right channel all call for different modules that must work together seamlessly, just like the lunar module and its mother ship in orbit, which were built by separate teams.
One day, machines will perform all these tasks flawlessly. Some steps, such as statistical modeling work, will be automated before others. Nevertheless, let’s not forget that human and mathematical elements are the only factors that remained constant during the evolution of data mining and decision science. That was true when computing time was really precious, and it will be true when computing speed is 1 million times faster than today.
How will the data players stay relevant in the future? The answer, I think, is having the ability to break complex problems into logical steps and raise questions in mathematically sound ways. Machines, no matter how advanced they may become, will not understand illogical requests. Plus, machines will not fully comprehend motivations of humans, whether they are marketers or consumers.
That is why the customer journey has to be mapped by logical humans with the help of smart machines, taking the best-of-both-worlds approach. I believe developers who understand that human element will help us leap toward the next phase. Not the ones who don’t take stepwise approaches, and not the ones who are completely tool set-oriented.
Stephen H. Yu is a world-class database marketer. He has a proven track record in comprehensive strategic planning and tactical execution, effectively bridging the gap between the marketing and technology world with a balanced view obtained from more than 30 years of experience in best practices of database marketing. Currently, Yu is principal and chief product officer at BuyerGenomics. Previously, Yu was the head of analytics and insights at eClerx, and VP, Data Strategy & Analytics at Infogroup. Prior to that, he was the founding CTO of I-Behavior Inc., which pioneered the use of SKU-level behavioral data. “As a long-time data player with plenty of battle experiences, I would like to share my thoughts and knowledge that I obtained from being a bridge person between the marketing world and the technology world. In the end, data and analytics are just tools for decision-makers; let’s think about what we should be (or shouldn’t be) doing with them first. And the tools must be wielded properly to meet the goals, so let me share some useful tricks in database design, data refinement process and analytics.” Reach him at email@example.com.