Not All Databases Are Created Equal
This is why I insist that data scientists and professionals must be businesspeople first. The developers should never forget that end-users are not trained data experts. And guess what? Even professional analysts would appreciate intuitive variable sets and complete data dictionaries. So, pretty please, with sugar on top, make things easy and simple.
I saved this important item for last for a good reason. Yes, the dollar sign is a very important factor in all business decisions, but it should not be the sole deciding factor when it comes to databases. That means CFOs should not dictate the decisions regarding data or databases without considering the input from CMOs, CTOs, CIOs or CDOs who should be, in turn, concerned about all the other criteria listed in this article.
Playing with the data costs money. And, at times, a lot of money. When you add up all the costs for hardware, software, platforms, tool sets, maintenance and, most importantly, the man-hours for database development and maintenance, the sum becomes very large very fast, even in the age of the open-source environment and cloud computing. That is why many companies outsource the database work to share the financial burden of having to create infrastructures. But even in that case, the quality of the database should be evaluated based on all criteria, not just the price tag. In other words, don't just pick the lowest bidder and hope to God that it will be alright.
When you purchase external data, you can also apply these evaluation criteria. A test-match job with a data vendor will reveal lots of details that are listed here; and metrics, such as match rate and variable fill-rate, along with complete the data dictionary should be carefully examined. In short, what good is lower unit price per 1,000 records, if the match rate is horrendous and even matched data are filled with missing or sub-par inferred values? Also consider that, once you commit to an external vendor and start building models and analytical framework around their its, it becomes very difficult to switch vendors later on.
Stephen H. Yu is a world-class database marketer. He has a proven track record in comprehensive strategic planning and tactical execution, effectively bridging the gap between the marketing and technology world with a balanced view obtained from more than 30 years of experience in best practices of database marketing. Currently, Yu is president and chief consultant at Willow Data Strategy. Previously, he was the head of analytics and insights at eClerx, and VP, Data Strategy & Analytics at Infogroup. Prior to that, Yu was the founding CTO of I-Behavior Inc., which pioneered the use of SKU-level behavioral data. “As a long-time data player with plenty of battle experiences, I would like to share my thoughts and knowledge that I obtained from being a bridge person between the marketing world and the technology world. In the end, data and analytics are just tools for decision-makers; let’s think about what we should be (or shouldn’t be) doing with them first. And the tools must be wielded properly to meet the goals, so let me share some useful tricks in database design, data refinement process and analytics.” Reach him at email@example.com.