Typically, companies begin their journey to big data with an organizational experiment to see whether big data can play an important role in defining and shaping business strategy. However, once it becomes clear that big data will have a strategic role as part of the information management environment, you have to make sure that the right structure is in place to support and protect the organization.
Before you establish policies, you first have to know what you are dealing with. For example, are you going to involve transactional systems, social media data, or machine-generated data? Do you intend to combine information from these different sources as part of your data analytics strategy?
If you are planning to move forward with more than an isolated experiment, you will need to update your governance strategy so that you are prepared to manage a new variety of data safely.
Prepare for stewardship and management of big data risk
No matter what your information management strategy is, you need to make sure that you have the right level of oversight. This is simply a best practice in general and does not change when you add big data to the mix. However, you may need to implement data stewardship differently with the addition of big data sources.
For example, you might need to have a different individual monitor social media data because it has a different origin and different structure than traditional relational data. This new data steward role needs to be carefully defined so that the individual selected can work across the business units that find this type of data most relevant to how they are analyzing the business.
The data steward needs to understand, or have access to people who understand, the company’s data retention policy as well as the requirements for masking out personal data no matter where that data originates.
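As an illustration of masking personal data regardless of where it originates, here is a minimal Python sketch. The field names (`user_id`, `customer_id`, `text`, `notes`), the regular expressions, and the salt are all assumptions made for the example, not part of any particular product or policy. Note that salted hashing is pseudonymization rather than full anonymization, so your legal and compliance teams should decide whether it is sufficient.

```python
import hashlib
import re

# Simplified patterns for illustration only; real policies usually need
# broader coverage (names, addresses, account numbers, and so on).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def pseudonymize(value: str, salt: str) -> str:
    """Replace an identifier with a short salted SHA-256 digest."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:12]


def mask_text(text: str) -> str:
    """Redact e-mail addresses and phone numbers found in free text."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


def mask_record(record: dict, salt: str) -> dict:
    """Return a masked copy of a record; the same rules apply to every source."""
    masked = dict(record)
    # Hypothetical identifier fields: hash them so records can still be joined.
    for field in ("user_id", "customer_id"):
        if masked.get(field) is not None:
            masked[field] = pseudonymize(str(masked[field]), salt)
    # Hypothetical free-text fields: redact personal details inside the text.
    for field in ("text", "notes"):
        if isinstance(masked.get(field), str):
            masked[field] = mask_text(masked[field])
    return masked


post = {
    "source": "social_media",
    "user_id": "alice42",
    "text": "Mail me at alice@example.com or 555-123-4567",
}
print(mask_record(post, salt="example-salt"))
```

Because the same `mask_record` function is applied to every record, a steward can review one set of rules instead of a separate routine for each data source.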
Set the right big data governance and quality policies
The way that an organization deals with big data is an ongoing cycle and not a one-time project. The risk to the business can be serious if rules and processes are not applied consistently. Data quality should also be approached from a governance standpoint. When you think about policy, here are some of the key elements that need to be codified to protect your organization:
Determine which best practices your peers have implemented, and document consistent policies so that everyone has the same understanding of what is required.
Compare your policies with the governance requirements for your own business and your industry. Update your policies if you find oversights.
Do you have a policy about the length of time that you must hold on to information? Does this policy apply to the data you are collecting from external sources, such as customer discussion groups and social media sites?
How important are the data sources that you are bringing into the business? Do you have quality standards in place so that a set of data is only used for decision making if it is proven to be clean and well documented?
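The retention and quality questions above can be turned into checks that run before a data set is used. The following Python sketch is one possible way to codify them. The retention periods, the `Dataset` fields, and the 5 percent missing-value threshold are illustrative assumptions; your own values should come from your legal, compliance, and data quality requirements.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods in days, one entry per data source.
RETENTION_DAYS = {
    "transactional": 7 * 365,
    "social_media": 365,
    "machine_generated": 90,
}


@dataclass
class Dataset:
    name: str
    source: str
    collected_on: date
    null_rate: float          # fraction of missing values, 0.0 to 1.0
    has_documentation: bool   # is the data set's origin and meaning recorded?


def is_past_retention(ds: Dataset, today: date) -> bool:
    """True if the data set is older than its source's retention period."""
    limit = RETENTION_DAYS.get(ds.source)
    if limit is None:
        # An external source with no policy is itself a governance gap.
        raise ValueError(f"no retention policy defined for source {ds.source!r}")
    return today - ds.collected_on > timedelta(days=limit)


def approved_for_decisions(ds: Dataset, today: date,
                           max_null_rate: float = 0.05) -> bool:
    """Allow decision making only on retained, documented, clean data."""
    return (
        not is_past_retention(ds, today)
        and ds.has_documentation
        and ds.null_rate <= max_null_rate
    )


today = date(2024, 6, 1)
orders = Dataset("orders", "transactional", date(2023, 1, 1), 0.01, True)
posts = Dataset("forum_posts", "social_media", date(2022, 1, 1), 0.01, True)
print(approved_for_decisions(orders, today))  # True
print(approved_for_decisions(posts, today))   # False: past retention
```

Raising an error for a source with no retention entry is a deliberate choice in this sketch: it forces someone to define a policy before data from a new external source is used.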
It is easy to get caught up in the excitement of leveraging big data to conduct the type of analysis that was never achievable before. But if that analysis leads to incorrect conclusions, your business will be at risk. Even data coming from sensors can be contaminated by extraneous data that causes an organization to come to the wrong conclusion.
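To make the sensor example concrete, the sketch below flags readings that sit far from the rest of a batch so that they can be reviewed before analysis. It uses a modified z-score based on the median absolute deviation, which is one common technique among many and is an assumption of this example. The function name `flag_extraneous` and the threshold of 3.5 are likewise illustrative, and a flagged reading is a candidate for review, not proof of an error.

```python
from statistics import median


def flag_extraneous(readings, threshold=3.5):
    """Return one boolean per reading: True if it looks extraneous.

    Uses the modified z-score, 0.6745 * (x - median) / MAD, where MAD is
    the median absolute deviation of the batch.
    """
    if not readings:
        return []
    med = median(readings)
    mad = median(abs(x - med) for x in readings)
    if mad == 0:
        # At least half the readings equal the median, so the spread cannot
        # be estimated; treat any differing value as suspect.
        return [x != med for x in readings]
    return [abs(0.6745 * (x - med) / mad) > threshold for x in readings]


temperatures = [20.1, 19.8, 20.3, 20.0, 95.0, 19.9]
flags = flag_extraneous(temperatures)
print(flags)  # [False, False, False, False, True, False]
```

The median is used instead of the mean because a single extreme reading, such as the 95.0 above, shifts the mean substantially but barely moves the median.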