10 Tips for Better Big Data Analysis
Want to get the most out of your analysis of Excel data? Here are ten quick tips for working effectively and efficiently with big data.
Consider your work a search for buried treasure
You should view data analysis as a process similar to looking for buried treasure.
In other words, data mining resembles gold mining. You’re pounding your way through the data or sifting through the granularity in search of valuable nuggets. This effort can be painstaking and tedious.
However, with persistence and a bit of luck, you will often be able to find valuable insights into both opportunities and threats you might otherwise have missed.
You want and need to remember that.
Collect more data
You should collect more data . . . and then be good about storing and saving the data you do collect.
In other words, don't sloppily discard, carelessly lose, or foolishly throw away the data you already collect or have. That data could be priceless. And if it isn't priceless today, who knows? It might be at some point in the future.
Face it. The richer the data set, the better the chances some cool insight will jump out at you.
Create more data
Work to create more data.
Okay, that maybe sounds silly. But in some cases, useful data can be created very economically.
Here's a simple example: If you run a business, ask clients how they came to find you. You'll gain great insights into your marketing efforts as a result.
You probably have other interesting ways to create more data.
Regularly run experiments
Data creation methods such as experimenting via A/B testing and pilot studies can economically provide data of extraordinary value.
For example, author Timothy Ferriss in his bestselling book, The 4-Hour Workweek, describes using pay-per-click advertisements to gauge product feasibility. That’s a great idea, and one that in many cases probably results in far more accurate analytical conclusions than a focus group.
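To make the A/B testing idea concrete, here is a minimal sketch of how you might check whether two ad variants really differ. It uses a standard two-proportion z-test with a normal approximation, which is one common choice and not the only one. The function name and the click and conversion counts are hypothetical, made up purely for illustration.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    Uses the pooled-proportion z-test, which assumes reasonably large samples.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the assumption that A and B perform the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical results: ad A converted 40 of 1,000 clicks, ad B converted 65 of 1,000.
z, p_value = two_proportion_z_test(40, 1000, 65, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the gap between the two ads is unlikely to be chance alone. It says nothing about whether the gap is big enough to matter to your business, so look at the raw conversion rates as well.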
Go big (with your datasets and your samples)
If you learned about statistics in the age before computers and their large datasets were widely available and easy to use, you may have a tendency to make judgments and decisions based on small datasets.
Today, that's hard to justify. Whenever possible, go big: use the largest datasets and samples you can get.
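One way to see why sample size matters is to look at how the margin of error on a simple percentage shrinks as the sample grows. The sketch below uses the usual normal-approximation formula for a proportion at roughly 95 percent confidence. It assumes a simple random sample, and the function name and sample sizes are illustrative only.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for an observed proportion p from n samples.

    z=1.96 corresponds to roughly 95 percent confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)

# p = 0.5 is the worst case: it gives the widest margin for any sample size.
for n in (100, 1_000, 10_000, 100_000):
    print(f"n={n:>7,}: +/- {margin_of_error(0.5, n):.2%}")
```

Because the margin shrinks with the square root of the sample size, a sample 100 times larger gives an estimate about 10 times tighter.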
Don't delegate data analysis
From the perspective of many managers or business owners, having some young tech-savvy intern do the work might seem like the best approach to getting really good data analysis performed.
But if you talk with the people doing lots of data analysis, you’re quite likely to hear that what you really want to do is assign the smartest, most experienced team member you can to work on this project. In other words, the people you really want doing this work are the people who probably don’t have time to do it.
Maybe, in fact, you should just do the data analysis yourself if you’re the grand Pooh-Bah.
Again, think about this work as akin to mining for buried treasure. The insights you might uncover could be enormously valuable. As good as some young buck or young doe might be, you for darn sure don’t want them to miss some outstanding opportunity or a potentially catastrophic threat because they lack experience or don’t yet have fully developed strategic thinking skills.
Waste time poring over meaningless data
Here's a silly idea. Maybe you should occasionally waste time poring over seemingly meaningless data: cross-tabulations of time-stamped sales receipts, analytics data from your website, third-party transaction logs, and so forth.
You never know what you’ll find. And sometimes the best insights can come from the most surprising places.
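If a cross-tabulation of time-stamped sales receipts sounds abstract, here is a minimal sketch of one. It counts receipts by weekday and part of day using only the Python standard library. The receipt records, the timestamp format, and the morning, afternoon, and evening cutoffs are all hypothetical; your own data will look different.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical receipts: (timestamp, sale amount).
receipts = [
    ("2024-03-04 09:15", 23.50),
    ("2024-03-04 10:40", 8.00),
    ("2024-03-04 18:05", 41.25),
    ("2024-03-05 13:30", 15.75),
    ("2024-03-05 19:45", 52.10),
]

def crosstab_by_weekday_and_daypart(rows):
    """Count receipts for each (weekday, part of day) combination."""
    counts = Counter()
    for stamp, _amount in rows:
        ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        # Assumed cutoffs: before noon, noon to 5 p.m., and after 5 p.m.
        if ts.hour < 12:
            daypart = "morning"
        elif ts.hour < 17:
            daypart = "afternoon"
        else:
            daypart = "evening"
        counts[(WEEKDAYS[ts.weekday()], daypart)] += 1
    return counts

table = crosstab_by_weekday_and_daypart(receipts)
for (weekday, daypart), count in sorted(table.items()):
    print(f"{weekday:<10}{daypart:<10}{count}")
```

With real data, an unexpectedly busy or quiet cell in a table like this is exactly the kind of surprise worth a second look.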
Inventory internal data sources
A housekeeping item: You probably want to keep an inventory of internal data sources. And the list should probably include more than just the accounting system and your web servers’ analytics files. All sorts of interesting data sources exist, when you start thinking about it. And some of this data will get lost or forgotten if you aren't careful.
Build a library of external raw data sources
A quick reminder: Some of your raw data sources aren’t internal but external. Don’t forget about those.
Even the smallest businesses may have access to third-party payment processing files and transaction lists created by outside web services.
Protect proprietary data sources
Because any proprietary data sources potentially have enormous value, you of course want to carefully protect the asset.
This means that you want to safely store and regularly back up the data, but that's not all. Protecting your proprietary data also means making sure that the data stays proprietary and (maybe even more so) that any insights contained in the data stay internal. Something to think about. . .