10 Free Data Science Tools and Applications
Visualizations are a vitally important part of the data scientist’s toolkit for big data. With them, you can leverage the brain’s capacity to quickly absorb visual information. Data visualizations are a very effective means of communicating data insights.
Making custom web-based data visualizations with free R packages
These packages and tools are useful for creating polished, interactive data visualizations, but you need to code in the R statistical programming language to use them.
Getting Shiny by RStudio
With the 2012 launch of RStudio’s Shiny package, both statistical analysis and web-based data visualization can be carried out in the same framework.
If you want to generate a web-based data visualization application from just a few lines of code, R’s Shiny package lets you do that.
Charting with rCharts
To see some examples of data visualizations created using rCharts, check out rCharts Gallery.
Mapping with rMaps
Using rMaps, you can create animated or interactive choropleths, heat maps, or even maps with annotated location droplets.
If you want a spatial data visualization with interactive sliders that let users select the data range they want to see, rMaps is a good fit.
Check out more scraping, collecting, and handling tools
You can use web scraping to derive interesting and unique datasets for your data-driven stories.
Scraping data with Import.io
Import.io is a free desktop application that, with a few clicks of the mouse, you can use to painlessly copy, paste, clean, and format any part of a web page. You can even use Import.io to automatically crawl and extract data from multipage lists.
Using Import.io, you can scrape data from a simple or complicated series of web pages:
To scrape a simple series of web pages, access them through simple hyperlinks in a Page 1, Page 2, Page 3, . . . series.
To scrape a complicated series of web pages, fill in a form or choose from a drop-down list, and then submit your scraping request to the tool.
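If you're curious what scraping a simple Page 1, Page 2, Page 3 series looks like in code, here is a minimal Python sketch. It is not how Import.io works internally. The URL template, the `page_urls` helper, the `LinkCollector` class, and the sample HTML are all hypothetical names and values made up for illustration.

```python
from html.parser import HTMLParser


def page_urls(template, first, last):
    """Build the URLs for a numbered series of pages (hypothetical helper)."""
    return [template.format(page=n) for n in range(first, last + 1)]


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


# Assumed URL pattern for a paginated listing; a real site will differ.
urls = page_urls("https://example.com/listing?page={page}", 1, 3)

# Stand-in for the HTML of one downloaded page, so the sketch runs offline.
sample_html = (
    '<ul><li><a href="/item/1">One</a></li>'
    '<li><a href="/item/2">Two</a></li></ul>'
)
collector = LinkCollector()
collector.feed(sample_html)
print(urls)
print(collector.links)
```

Downloading each URL (for example, with the standard library's `urllib.request`) is left out on purpose so that the sketch runs without a network connection.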
Collecting images with ImageQuilts
The task ImageQuilts performs is deceptively simple to describe but very complex to implement. ImageQuilts makes collages of tens of images and pieces them all together into one “quilt” that’s composed of multiple rows of equal height.
ImageQuilts even allows you to choose the order of images or to randomize them. You can use the tool to drag and drop any image to any place, remove an image, zoom all images at once, or zoom each image individually.
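To get a feel for the "rows of equal height" idea, here is a rough Python sketch. It scales each image to a common row height and packs the scaled widths greedily into rows. This is an assumed, simplified layout strategy, not ImageQuilts' actual algorithm, and the `quilt_rows` function and the image sizes are invented for illustration.

```python
def quilt_rows(sizes, row_height, max_width):
    """Scale (width, height) images to row_height and pack them into rows.

    Returns a list of rows, each a list of scaled image widths in pixels.
    """
    rows, current, used = [], [], 0
    for width, height in sizes:
        # Preserve the aspect ratio while forcing the common row height.
        scaled = round(width * row_height / height)
        # Start a new row when this image would overflow the current one.
        if current and used + scaled > max_width:
            rows.append(current)
            current, used = [], 0
        current.append(scaled)
        used += scaled
    if current:
        rows.append(current)
    return rows


# Made-up image sizes as (width, height) pairs.
sizes = [(400, 200), (300, 300), (200, 400), (600, 200)]
rows = quilt_rows(sizes, row_height=100, max_width=350)
print(rows)  # [[200, 100, 50], [300]]
```

A real quilt would also stretch each row to the same total width. The sketch skips that step to stay short.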
Wrangling data with DataWrangler
The kinds of manipulations you can do with DataWrangler are similar to what you can do in Excel using Visual Basic. An example of this type of task is using DataWrangler or Excel with Visual Basic to copy, paste, and format information from lists on the Internet.
DataWrangler even suggests actions based on your dataset and can repeat complex actions across entire datasets.
Check out more data exploration tools
Visualization is important for clarifying and communicating your data’s meaning, but careful data analysis is even more important.
Talking about Tableau Public
Tableau Public is a free desktop application that aims to be a complete package for chart making. Tableau Public creates three levels of document:
Worksheet: The Worksheet is where you can create individual charts from data you’ve imported from Access, Excel, or a text-format CSV file.
Dashboard: You can use a Tableau Dashboard to combine charts with text annotations or with other data charts.
Story: With a Tableau Story, you can combine several dashboards in a sort of slideshow presentation that shows a linear story in your data.
Getting up to speed in Gephi
Gephi is an open-source software package you can use to create graph layouts and then manipulate them to get the clearest and most effective results. The kinds of connection-based visualizations you can create in Gephi are very useful in all types of network analyses.
This graph shows which characters appear in the same chapter as which other characters in Victor Hugo’s immense novel Les Misérables.
Here’s a hairball graph of the United States power grid that shows the degrees of interconnectedness between thousands of power generation and power distribution facilities.
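A co-appearance network like the Les Misérables graph mentioned above boils down to a weighted edge list: one edge per pair of characters, weighted by the number of chapters they share. Here is a minimal Python sketch of that idea. The character labels and chapter contents are toy data, and the `cooccurrence_edges` function is a hypothetical helper. The CSV uses Source, Target, and Weight columns on the assumption that this matches the edge-list layout Gephi's spreadsheet import expects.

```python
import csv
import io
from collections import Counter
from itertools import combinations


def cooccurrence_edges(chapters):
    """Count how many chapters each pair of characters shares."""
    weights = Counter()
    for characters in chapters:
        # Sort so that (A, B) and (B, A) count as the same edge.
        for a, b in combinations(sorted(set(characters)), 2):
            weights[(a, b)] += 1
    return weights


# Toy data: each inner list holds the characters appearing in one chapter.
chapters = [["A", "B"], ["A", "B", "C"], ["C", "D"]]
edges = cooccurrence_edges(chapters)

# Write the edge list as CSV text (assumed Source/Target/Weight layout).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Source", "Target", "Weight"])
for (source, target), weight in sorted(edges.items()):
    writer.writerow([source, target, weight])
print(buffer.getvalue())
```

Saving that text to a file gives you an edge list you could try loading into a graph tool for layout.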
Machine learning with the WEKA suite
Waikato Environment for Knowledge Analysis (WEKA) is a standalone application that you can use to analyze patterns in your datasets and then visualize those patterns in all sorts of interesting ways. For advanced users, WEKA’s true value is derived from its suite of machine-learning algorithms that you can use to cluster or categorize your data.
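To make "clustering" a bit more concrete, here is a bare-bones k-means sketch in plain Python. It is not WEKA code and does not use WEKA's API. It only illustrates the general technique of assigning points to their nearest centroid and then recomputing the centroids. The `kmeans` function, the sample points, and the starting centroids are all made up for illustration.

```python
def kmeans(points, centroids, iterations=10):
    """Run a fixed number of k-means iterations on tuples of numbers."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum(
                    (a - b) ** 2 for a, b in zip(point, centroids[i])
                ),
            )
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster
            else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Toy 2-D data with two obvious groups, plus hand-picked starting centroids.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, [(1.0, 1.0), (8.0, 8.0)])
print(centroids)
print(clusters)
```

On this toy data, the three points near the origin end up in one cluster and the three points near (8, 8) in the other.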
Check out more web-based visualization tools
You can use a variety of free web apps to easily generate unique and interesting data visualizations.
Getting a little Weave up your sleeve
If your goal is to create visualizations that allow your audience to see and explore the interrelatedness between subsets of your data, then Weave is well suited to the task.
Here’s a demo visualization on Weave’s own server. It depicts every county in the United States, with many columns of data from which to choose.
Checking out Knoema’s data visualization offerings
You can use Knoema’s data-visualization tools to create visualizations that enable your audience to easily explore data, drill down on geographic areas or on different indicators, and automatically produce data-driven timelines.
Here’s a chart and a table that were automatically generated with just two mouse clicks in Knoema.
You can use Knoema to make your own dashboards, too.
You can make dashboards from your own data or from open data in Knoema’s repository.