Data Protection Options for Big Data
Some big data experts believe that different kinds of data require different forms of protection and that, in some cases in a cloud environment, data encryption might, in fact, be overkill. You could encrypt everything: for example, when you write data to your own hard drive, when you send it to a cloud provider, and when you store it in a cloud provider's database.
Encrypting everything reduces your exposure; however, encryption imposes a performance penalty and adds management overhead. For example, many experts advise managing your own keys rather than letting a cloud provider do so, and that can become complicated. Keeping track of too many keys can be a nightmare. Additionally, encrypting everything can create other issues.
For example, if you're trying to encrypt data in a database, you have to account for the data both while it's moving (point-to-point encryption) and while it's being stored in the database. This procedure can be costly and complicated. Also, even when you think you've encrypted everything and you're safe, that may not be the case.
One of the long-standing weaknesses of encryption strategies is that your data is at risk before it's encrypted and after it's decrypted. For example, in a major data breach at Hannaford Supermarkets in 2008, the hackers hid in the network for months and were able to steal payment data when customers used their credit cards at the point of sale. The theft took place before the data was encrypted.
Maintaining a large number of keys can be impractical, and managing the storing, archiving, and accessing of those keys is difficult. To alleviate this problem, you can generate (compute) encryption keys as they're needed instead of storing every key, which reduces complexity and can improve security.
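One way to read "computing keys as needed" is key derivation: you protect a single master key and recompute each working key from it on demand, so the working keys never have to be stored. The sketch below is only an illustration of that idea under stated assumptions, not a production design. It implements HKDF with SHA-256 (RFC 5869) using Python's standard library; the function name derive_key, the context labels such as "table:orders", and the way the master key is created are all hypothetical choices made for this example.

```python
import hashlib
import hmac
import secrets

HASH_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def derive_key(master_key: bytes, context: str, length: int = 32) -> bytes:
    """Derive a key for `context` from `master_key` using HKDF-SHA256."""
    if not 0 < length <= 255 * HASH_LEN:
        raise ValueError("requested key length is out of range for HKDF")
    # Extract step: condense the master key into a pseudorandom key.
    # An all-zero salt is what RFC 5869 specifies when no salt is supplied.
    prk = hmac.new(b"\x00" * HASH_LEN, master_key, hashlib.sha256).digest()
    # Expand step: chain HMAC blocks bound to the context label.
    info = context.encode("utf-8")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# Assumption: in practice the master key would live in a key manager or HSM;
# here it is simply generated in memory for the demonstration.
master_key = secrets.token_bytes(32)

# Each data set gets its own key, recomputed whenever it is needed.
orders_key = derive_key(master_key, "table:orders")
customers_key = derive_key(master_key, "table:customers")
```

Because the same master key and context label always produce the same derived key, only the master key needs to be stored, archived, and access-controlled. The trade-off is that the master key becomes a single point of failure, so it needs correspondingly strong protection.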
Here are some other available data-safeguarding techniques:
Data anonymization: When data is anonymized, you remove all data that can be uniquely tied to an individual (such as a person's name, Social Security number, or credit card number). Although this technique can protect personal identity, and hence privacy, you need to be careful about how much information you strip out. If you don't remove enough, hackers can still figure out whom the data pertains to.
Tokenization: This technique protects sensitive data by replacing it with random tokens or alias values that mean nothing to someone who gains unauthorized access to this data. This technique decreases the chance that thieves could do anything with the data. Tokenization can protect credit card information, passwords, personal information, and so on. Some experts argue that it's more secure than encryption.
Cloud database controls: In this technique, access controls are built into the database to protect the whole database so that each piece of data doesn’t need to be encrypted.
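The first two techniques in the list above, anonymization and tokenization, can be illustrated with a short sketch. It is a simplified illustration rather than a recommended implementation: the TokenVault class, the "tok_" prefix, the field names in DIRECT_IDENTIFIERS, and the sample record are all hypothetical, and a real token vault would be a separately secured service rather than an in-memory dictionary.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory vault that maps random tokens to real values."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value):
        # Reuse the existing token so one value always maps to one token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        # The token is random, so it reveals nothing about the value.
        token = "tok_" + secrets.token_hex(16)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token):
        # Only code with access to the vault can recover the original value.
        return self._token_to_value[token]


# Assumed names of the fields that directly identify a person.
DIRECT_IDENTIFIERS = {"name", "ssn", "card_number"}


def anonymize(record):
    """Return a copy of `record` without the directly identifying fields."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


# Made-up sample record for the demonstration.
record = {
    "name": "Pat Example",
    "ssn": "000-00-0000",
    "card_number": "4111111111111111",
    "zip": "04101",
    "total": 42.5,
}

vault = TokenVault()
token = vault.tokenize(record["card_number"])  # safe to store downstream
stripped = anonymize(record)                   # keeps only zip and total
```

Note the difference between the two: a token can be mapped back to the original value by anyone who has access to the vault, whereas an anonymized record has had the identifying fields removed outright. As the anonymization item cautions, the fields that remain (here, the ZIP code and purchase total) may still help someone re-identify a person when combined with other data.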