Analysis Tools You Should Know to Get a Job in Big Data - dummies

By Jason Williamson

In the construction industry, it’s often said that the kitchen sells the house. What does this have to do with big data? If the end result is to communicate information to take action, then analysis and visualization tools are the kitchen of big data.

Business analytics or business intelligence tools

Business analytics (BA) or business intelligence (BI) tools can connect directly to data stores, both structured and unstructured, to help you analyze and interpret the data. Appendix A fills you in on some resources for staying up to date on widely used BA and BI tools. Here are a few tools you may want to investigate:

  • Birst: Birst is an on-demand, or cloud-based, business intelligence and analytics tool.

  • IBM Cognos: Cognos is the primary business intelligence product from IBM, which completed its acquisition of Cognos in 2008.

  • Jaspersoft: Jaspersoft is an open-source business intelligence platform.

  • MicroStrategy: MicroStrategy, based in the Washington, D.C., area, is a publicly traded business intelligence software firm that has been in business since 1989.

  • Oracle Business Intelligence: Oracle offers a suite of business intelligence tools, some built in-house and some acquired with Hyperion. The suite can work on any relational database platform.

  • Pentaho: Pentaho is an open-source business intelligence software firm based in Florida, in operation since 2004.

  • QlikView: QlikView is an analytical visualization and business intelligence product from the software firm Qlik.

  • RapidMiner: RapidMiner is a tool for predictive analytics. It provides an environment for machine learning and text analytics and is offered in both a commercial and an open-source model.

  • SAP: SAP is a large multinational software firm based in Germany. Products like SAP HANA allow users to process and analyze big datasets.

  • Tableau: Tableau is a software firm that offers visualization and business intelligence tools.

There is a lot of debate about the terms business analytics (BA) and business intelligence (BI). Both BA tools and BI tools are used to access data, process it, analyze it, and then communicate the results to the end-user. Many vendors and pundits argue over the terms and even use them interchangeably. Currently, business analytics appears to be growing in popularity and the term business intelligence is receding.

Visualization tools

Not all information can be communicated in two-dimensional graphs and charts. Data can be viewed beyond the two dimensions of X and Y. When data is viewed in a third dimension, it can be connected in ways that show various relationships, patterns, and correlations.

Remember graph theory? Here its nodes, edges, and properties become tangible, and this is where visualization tools come into play. These tools can communicate information, connectedness, and correlation in ways that are deep and dynamic.
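To make the graph idea concrete, here is a minimal Python sketch that is not tied to any tool in this article. The customer and product names are made up for illustration. It stores relationships as edges and counts how connected each node is, which is roughly the structure a visualization tool would draw.

```python
from collections import defaultdict

# Hypothetical edges: each pair links a customer to a product they bought.
edges = [
    ("alice", "phone"),
    ("alice", "tablet"),
    ("bob", "phone"),
    ("carol", "phone"),
    ("carol", "laptop"),
]


def build_adjacency(edge_list):
    """Return an undirected adjacency map: node -> set of neighbors."""
    adjacency = defaultdict(set)
    for a, b in edge_list:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


def degree(adjacency):
    """Return node -> number of direct connections."""
    return {node: len(neighbors) for node, neighbors in adjacency.items()}


adjacency = build_adjacency(edges)
degrees = degree(adjacency)

# The node with the most connections is a natural focal point in a drawing.
most_connected = max(degrees, key=degrees.get)
print(most_connected, degrees[most_connected])
```

In this toy data set, the most connected node is the product that the most customers share, which is the kind of pattern a flat table tends to hide.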

Data journalist David McCandless gave a famous TED talk on visualization, in which he said, “By visualizing information, we turn it into a landscape that you can explore with your eyes, a sort of information map. And when you’re lost in information, an information map is kind of useful.” He goes on to give some really great examples of how visualizing big data can reveal insights that might not otherwise have been understood.


The following is a list of great visualization tools:

  • Dygraphs: A free JavaScript-based library for building complex charts in web browsers.

  • Exhibit: Interactive mapping tool created by MIT. Free to use.

  • Google Charts: A Google-powered online charting tool.

  • jQuery Visualize: An open-source charting engine supporting jQuery.

  • Kartograph: A framework for building interactive maps without Google Maps that can run in any browser.

  • Many Eyes: IBM-developed tools for analyzing publicly available data sets.

  • R: A free, open-source environment for statistical analysis and graphing.

  • WolframAlpha: Ask this engine anything. This knowledge engine comes back with information, charts, and data. It also supports an API to programmatically retrieve charting and information.

  • ZingChart: A JavaScript library supporting more than 100 chart types; supports Flash or HTML5.

Sentiment analysis tools

Sentiment analysis tools and processes attempt to measure how people feel about a certain thing, event, or product. For example, if a technology company releases a new phone, the company may be able to measure how people feel about it by combing through people’s tweets, blog posts, Facebook updates, and other social media posts.

The big data challenges include the following:

  • Volume: Sifting through millions of tweets looking for relevant hashtags while mashing that up with Instagram pictures can be time-consuming to say the least.

  • Interpretation: How do you interpret feeling when there is no structured way to communicate? On Twitter, spelling and grammar are almost worthless.

That’s where sentiment analysis helps out. Great advances have been made in text and speech analysis, and innovation continues. But text and speech analysis aren’t the only ways to measure sentiment. You can measure things like followers, retweets, likes, and other properties associated with social media moods.
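As a rough illustration of text-based sentiment scoring, the Python sketch below counts positive and negative words in each post. It is a simplified, lexicon-based approach, not how any particular tool works. The word lists and sample posts are made up, and a real system would need far larger lexicons and would have to handle slang, misspellings, and sarcasm.

```python
import re

# Tiny, made-up word lists for illustration; real lexicons are far larger.
POSITIVE = {"love", "great", "awesome", "good", "fast"}
NEGATIVE = {"hate", "bad", "slow", "broken", "terrible"}


def tokenize(text):
    """Lowercase the text and pull out runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text):
    """Return the count of positive words minus the count of negative words."""
    words = tokenize(text)
    positives = sum(1 for word in words if word in POSITIVE)
    negatives = sum(1 for word in words if word in NEGATIVE)
    return positives - negatives


def label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical posts about a new phone.
posts = [
    "Love the new phone, camera is AWESOME!!!",
    "battery is terrible and the screen is slow",
    "Picked up the new phone today",
]

for post in posts:
    print(label(sentiment_score(post)), "|", post)
```

This sketch looks only at words. Signals such as retweets or likes could be folded in as weights on each post's score, but how to weight them is a design decision this example leaves open.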

Twitter is a common target for sentiment analysis, primarily because it can show near-real-time reactions to events. Other tools you can use include Google Alerts, Hootsuite, and Facebook Insights.

Machine learning

Machine learning is a branch of artificial intelligence (AI), within computer science, that allows computers to learn automatically from data. A very simple example of this is the autocorrect or autocomplete feature on your smartphone. Your personal device “learns” the common words and phrases that you use to help with spelling correction and typing tasks.
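The autocomplete idea can be sketched in a few lines of Python. This is a toy, assuming that "learning" means nothing more than counting how often each word has been typed; real keyboards use far more sophisticated models. The class name, method names, and sample messages are hypothetical.

```python
from collections import Counter


class Autocomplete:
    """Suggest words by prefix, ranked by how often the user has typed them."""

    def __init__(self):
        self.counts = Counter()

    def learn(self, text):
        """Update word counts from a piece of text the user typed."""
        self.counts.update(text.lower().split())

    def suggest(self, prefix, limit=3):
        """Return up to `limit` known words starting with `prefix`."""
        prefix = prefix.lower()
        matches = [
            (word, count)
            for word, count in self.counts.items()
            if word.startswith(prefix)
        ]
        # Most frequent first; ties are broken alphabetically for stable output.
        matches.sort(key=lambda item: (-item[1], item[0]))
        return [word for word, _ in matches[:limit]]


model = Autocomplete()
model.learn("meeting at noon")
model.learn("meeting moved to monday")
model.learn("see you monday morning")

print(model.suggest("m"))  # ['meeting', 'monday', 'morning']
```

The more text the model sees, the more its suggestions reflect the user's own habits. That is the essence of learning from data, even in this stripped-down form.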

Another example is how companies like Shazam are using music data to predict the next hits by learning from historical music patterns, arrangements, and beats. Patterns emerge that tend to produce winning music formulas.

Machine learning jobs are not limited to innovative firms like Shazam. Any company that has to think about frequent user interaction and predicting future patterns can use machine learning specialists.