Big Data For Dummies

Many companies are in the early stages of exploring big data. Most are experimenting with pilots to see whether they can leverage big data sources to transform decision making. At this stage it is easy to make mistakes that can disrupt your business strategy, so consider these do’s and don’ts as part of your plan.

Do involve all business units in your big data strategy

Big data is not an isolated activity. Rather, it is the way that the business can leverage huge volumes of data to learn more about customers, processes, and events than would be possible with snapshots of data. If executed properly, a big data strategy can have a huge impact on the effectiveness of a business strategy.

Companies that assume out-of-the-norm data is wrong may miss emerging patterns in customer requirements. The business units are often best placed to recognize those patterns, so they can gain significant value when they are brought into the process early.

Do evaluate all delivery models for big data

It is natural to assume that if you are dealing with petabytes of data, the only way to store and manage that data is in the data center. Technology is evolving so that it is possible and necessary to use cloud computing storage and compute resources to manage big data. Evaluate the type of services that are cloud based and determine which ones have the performance that you will need.

Do think about your traditional data sources as part of your big data strategy

Many companies that have found value in big data analytics assume that they no longer have to think about the traditional data warehouse. This is not true. In fact, it is critical that you plan to use the results of your big data analytics in conjunction with your data warehouse. The data warehouse includes the information about the way your company operates.

Therefore, being able to compare the big data results against the benchmarks of your core data is critical for decision making.

Do plan for consistent metadata

When you complete the analysis of a massive data set, it is quite possible that you will come up with data that all matches a pattern. This set of data now can lead your organization to begin analyzing a new issue in depth.

Keep in mind that this data might come from customer service sites or social media environments that have not been cleansed. Therefore, before you trust the data, you have to make sure that you are dealing with a consistent set of metadata so that you can bring this information into your organization and analyze it in concert with the data from your systems of record.
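One way to picture consistent metadata is a shared set of field names that every incoming source is mapped onto before analysis. The following is a minimal sketch in Python, not a prescribed method: the alias table, the field names, and the sample records are all hypothetical and chosen only for illustration.

```python
# Hypothetical mapping from source-specific field names to one shared schema.
FIELD_ALIASES = {
    "cust_id": "customer_id",
    "customerId": "customer_id",
    "user": "customer_id",
    "msg": "text",
    "comment": "text",
    "ts": "timestamp",
    "created_at": "timestamp",
}


def normalize_record(record, aliases=FIELD_ALIASES):
    """Return a copy of record with field names mapped to the shared schema."""
    normalized = {}
    for key, value in record.items():
        canonical = aliases.get(key, key)  # unknown fields keep their name
        if canonical in normalized:
            # Two source fields collapsing into one is a metadata conflict.
            raise ValueError(f"two fields map to {canonical!r}")
        normalized[canonical] = value
    return normalized


# Illustrative records from two uncleansed sources (made-up values).
social_post = {"user": "c-1001", "msg": "Late delivery again",
               "created_at": "2024-03-01T10:15:00"}
support_ticket = {"cust_id": "c-1001", "comment": "Package arrived late",
                  "ts": "2024-03-02T08:00:00"}

records = [normalize_record(social_post), normalize_record(support_ticket)]
```

After normalization both records expose the same field names, so they can be analyzed alongside data from your systems of record. A real deployment would likely also standardize data types, units, and identifiers, which this sketch leaves out.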

Do distribute your big data

When you are dealing with big data, don’t assume that you will be able to manage all this information within a single server. Find out how to use distributed computing techniques such as Hadoop to effectively handle the size, variety, and required speed of your data.
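The idea behind distributing the work can be sketched with the map and reduce pattern that Hadoop popularized. The Python below is an illustration only and does not use Hadoop: it runs on one machine, and the partitions, function names, and sample text are assumptions. In a real cluster each partition would live on a different node and the map step would run there.

```python
from collections import Counter
from functools import reduce


def map_partition(lines):
    """Map step: count words within a single partition of the data."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial word counts into one."""
    return left + right


# Hypothetical partitions standing in for data spread across servers.
partitions = [
    ["big data is big", "data moves fast"],
    ["fast data needs distributed storage"],
]

# Each partition is processed independently, then the results are merged.
partial_counts = [map_partition(p) for p in partitions]
total = reduce(reduce_counts, partial_counts, Counter())
```

Because each map step touches only its own partition, adding servers lets you process more data without any single machine holding all of it.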

Don’t rely on a single approach to big data analytics

So much hype exists in the market around technologies such as Hadoop and MapReduce that you might lose sight of what you are actually trying to accomplish. Many other technologies are available, such as text analytics, predictive analytics, streaming data environments, and spatial data analysis, and any of them may matter for the job you are trying to accomplish.

Spend the time to investigate the variety of technologies that can support you. Experiment and investigate the technology solutions that can make you successful.

Don’t go big with your data before you are ready

You are right to be excited about the potential that big data offers your company. Big data can mean the difference between jumping into an exciting new market before your competitors or being left behind. Walk before you run. You need to start with pilot projects that can allow you to gain some experience. You need to work with experts who can keep you from making mistakes because of inexperience.

Don’t overlook the need to integrate big data

Your big data sources will not be effective if they live in isolation from each other. Good technologies in the market are focused on making it easier to integrate the results of big data analytics with other data sources. Therefore, be prepared not just to analyze but also to integrate.

Don’t forget to manage big data securely

When companies embark on big data analysis, they often forget to maintain the same level of data security and governance that is assumed in traditional data management environments. When you begin doing analysis of several petabytes or more of data, you typically won’t mask out private information at the outset.

However, when you have a subset of that initial data set that is now critical to determining your next best action or your approach to a new market, you need to first secure that data so that it doesn’t put your business at risk. Some of this data will now become corporate intellectual property that has to be secured.

You may also need to manage privacy requirements. This security has to become part of your big data life cycle. In addition, some of the data sources that you are using may come from third-party data sources that require licenses. Make sure that you are allowed to use this data and that you haven’t violated governance rules.
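As one illustration of securing a critical subset, private fields can be replaced with keyed hashes before the data is shared more widely. This is a minimal sketch under stated assumptions: the field names, the key, and the sample row are hypothetical, and keyed hashing is pseudonymization rather than full anonymization, so it may not satisfy every privacy requirement on its own.

```python
import hashlib
import hmac

# Hypothetical list of fields treated as private in this example.
SENSITIVE_FIELDS = {"email", "phone"}


def mask_record(record, secret_key, sensitive_fields=SENSITIVE_FIELDS):
    """Return a copy of record with sensitive values replaced by keyed hashes."""
    masked = {}
    for key, value in record.items():
        if key in sensitive_fields:
            digest = hmac.new(secret_key, str(value).encode("utf-8"),
                              hashlib.sha256)
            masked[key] = digest.hexdigest()
        else:
            masked[key] = value
    return masked


# Placeholder key; in practice it would come from a managed secret store.
key = b"replace-with-a-managed-secret"
row = {"email": "pat@example.com", "phone": "555-0100", "segment": "retail"}
masked_row = mask_record(row, key)
```

The same input and key always produce the same hash, so masked records can still be joined and counted per customer without exposing the original values.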

Don’t overlook the need to manage the performance of your big data

Big data demonstrates that people can make use of more data than ever before, and faster than was possible in the past. This capability to gain more insights is a huge benefit. However, if that data isn’t managed effectively, it will cause huge problems for the company. Therefore, you need to build manageability into your road map and plan for big data.

About This Article

This article is from the book Big Data For Dummies.

About the book authors:

Judith Hurwitz is an expert in cloud computing, information management, and business strategy. Alan Nugent has extensive experience in cloud-based big data solutions. Dr. Fern Halper specializes in big data and analytics. Marcia Kaufman specializes in cloud infrastructure, information management, and analytics.