By Bernard Marr

Part of Big Data for Small Business For Dummies Cheat Sheet

The technical jargon surrounding big data can seem a little daunting at first. Here are the key phrases and terms you’re likely to come across, each with an easy-to-understand definition:

  • Big data: Increasingly, everything you do leaves a digital trace (or data), which you (and others) can use and analyse. The phrase big data refers to that data being collected and the ability to make use of it.

  • Big data analytics: This is the process of collecting, processing and analysing data to generate insights that inform fact-based decision making. In many cases it involves software-based analysis using algorithms.

  • Algorithm: A mathematical formula or statistical process run by software to analyse data. It usually involves multiple calculation steps and can be used to automatically process data or solve problems.

  • Cloud computing: Software that runs, or data that is stored, on remote servers rather than locally. So instead of storing or computing things on your own machine, you can use other computers that are connected to your computer via a network (such as the Internet).

  • Structured data: Any data or information located in a fixed field within a defined record or file, such as a database or spreadsheet. Its inherent structure makes it quick, easy and cheap to analyse.

  • Unstructured data: All the data not easily stored and indexed in traditional formats or databases. It includes email conversations, social media posts, video content, photos, voice recordings, sounds and so on. Its lack of structure makes it more difficult to analyse using traditional computer programs.

  • Semi-structured data: You guessed it: this is a cross between unstructured and structured data. It’s data that has some structure that can be used for analysis but lacks the strict structure found in databases or spreadsheets. For example, a Facebook post can be categorised by author, date, length and even sentiment, but the content is generally unstructured.

  • Internal data: This covers all the data your business currently has or could potentially access or generate in future. It could be structured in format (for example, a customer database) or it could be unstructured (for example, conversational data from customer service calls).

  • External data: Put simply, this is the vast array of information that exists outside your business. It can be publicly available or privately held, and it can be structured or unstructured in format.

  • The Internet of Things: A network that connects devices (the things referred to in the name) so that they can communicate with each other. This encompasses technology like smart televisions, smartphones and sensors, and it’s all possible thanks to the massive increase in connectivity between devices, systems and services.
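
To make the difference between structured, semi-structured and unstructured data a little more concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the customer table, the social media post, the call note and the function names are all assumptions, not real data or a recommended toolset. It simply shows why fixed fields are easy to analyse, while free text needs cruder (or cleverer) techniques.

```python
import csv
import io
import json

# Structured data: fixed fields in a defined record (hypothetical customer table).
CUSTOMERS_CSV = """customer_id,name,total_spend
1,Asha,120.50
2,Ben,80.00
3,Chloe,45.25
"""

# Semi-structured data: some labelled fields plus free text (hypothetical post).
POST_JSON = (
    '{"author": "Asha", "date": "2015-06-01", '
    '"text": "Love the new shop, great service!"}'
)

# Unstructured data: free text with no fields at all (hypothetical call note).
CALL_NOTE = "Customer rang to say the delivery was late but the product is great."


def total_spend(csv_text):
    """Add up one column; easy because every record has the same fields."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(float(row["total_spend"]) for row in reader)


def describe_post(json_text):
    """Read the labelled fields of a post; the text itself stays free-form."""
    post = json.loads(json_text)
    return {
        "author": post["author"],
        "date": post["date"],
        "length": len(post["text"]),
    }


def count_mentions(text, word):
    """A crude analysis of free text: count case-insensitive substring matches."""
    return text.lower().count(word.lower())


if __name__ == "__main__":
    print(total_spend(CUSTOMERS_CSV))
    print(describe_post(POST_JSON))
    print(count_mentions(CALL_NOTE, "late"))
```

Notice that the structured example can be totalled in one line, the semi-structured post gives up its author, date and length but not its meaning, and the unstructured note can only be searched for words. Working out whether the customer was happy would take far more sophisticated analysis.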