Freeform Data Are Not Exactly Free
Whenever "Big Data" is mentioned, there follows this sick stat that 2.5 quintillion bytes of data are being collected every day. That number is so bloated because literally everything that is digitized is now considered data. The data could come from simple tweets, Facebook postings, emails, blogs, videos, audio files, Web pages, mobile apps and program downloads. Imagine combining those with every click you made, every page you viewed and every breath you took, literally. If you are connected to a medical device, add every heartbeat too. The same goes for when you wear one of those fancy devices while jogging through your neighborhood. You can see why they say (though I always wonder who "they" are) that we are living in a sea of data. At the dawn of the Internet of Things (or the beginning of Skynet), we should also accept that humans will not be the dominant generators of massive amounts of data in the near future. Hopefully, we will remain in control of the machines, and the collected data will still be mostly about us humans. Relatively speaking, we have only recently become the dominant species on this little planet at the far corner of the Milky Way Galaxy, and we might as well enjoy being at the top of the food chain a little while longer.
Now, some of those massive data are in the form of numbers that we can add or subtract. In the transactional world, that type of data is expressed in dollars, cents, shillings, euros or yuan. If the data are about countable human behavior, they can be expressed in days, hours, minutes, seconds, clicks, views, downloads, meters, yards, miles, heartbeats, breaths, gallons, kilometers per hour, etc. Heck, during the last World Cup, they were already measuring the exact running distance for each player through some wearable device. (They displayed impressive figures: around six to seven miles per player per game, not counting overtime, which is a few times more than a running back would cover in a typical football game.)
Stephen H. Yu is a world-class database marketer. He has a proven track record in comprehensive strategic planning and tactical execution, effectively bridging the gap between the marketing and technology worlds with a balanced view obtained from more than 30 years of experience in best practices of database marketing. Currently, Yu is president and chief consultant at Willow Data Strategy. Previously, he was the head of analytics and insights at eClerx, and VP, Data Strategy & Analytics at Infogroup. Prior to that, Yu was the founding CTO of I-Behavior Inc., which pioneered the use of SKU-level behavioral data. “As a long-time data player with plenty of battle experiences, I would like to share my thoughts and knowledge that I obtained from being a bridge person between the marketing world and the technology world. In the end, data and analytics are just tools for decision-makers; let’s think about what we should be (or shouldn’t be) doing with them first. And the tools must be wielded properly to meet the goals, so let me share some useful tricks in database design, data refinement process and analytics.” Reach him at email@example.com.