Articles & Books From Big Data

Article / Updated 12-01-2023
Getting the most out of your unstructured data is an essential task for any organization these days, especially when considering the disparate storage systems, applications, and user locations. So, it’s not an accident that data orchestration is the term that brings everything together. Bringing all your data together shares similarities with conducting an orchestra.
Cheat Sheet / Updated 04-12-2022
Big data makes big headlines, but it’s much more than just a buzz phrase or the latest business fad. The phenomenon is very real and it’s producing concrete benefits in so many different areas – particularly in business. Here you will get to the heart of big data as a business owner or manager: You will take a look at the key terminology you need to understand, the crucial big data skills for businesses, ten steps to using big data to make better decisions, and tips for communicating insights from data to your colleagues.
Cheat Sheet / Updated 03-10-2022
Summary statistical measures represent the key properties of a sample or population as a single numerical value. This has the advantage of providing important information in a very compact form. It also simplifies comparing multiple samples or populations. Summary statistical measures can be divided into three types: measures of central tendency, measures of dispersion, and measures of association.
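As a rough illustration of the three types, the sketch below computes one measure of each kind using Python's standard library. The function name `summarize` and the sample values are made up for this example and do not come from the cheat sheet; the correlation is the ordinary Pearson coefficient computed by hand.

```python
import statistics


def summarize(xs, ys):
    """Return one summary measure of each type for paired samples xs and ys."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sd_x = statistics.stdev(xs)  # sample standard deviation (n - 1 denominator)
    sd_y = statistics.stdev(ys)
    n = len(xs)
    # Sample covariance, used here to build the Pearson correlation.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    return {
        "mean": mean_x,                       # central tendency
        "median": statistics.median(xs),      # central tendency
        "stdev": sd_x,                        # dispersion
        "correlation": cov / (sd_x * sd_y),   # association between xs and ys
    }


# Hypothetical paired samples, chosen only to keep the arithmetic simple.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
summary = summarize(xs, ys)
print(summary)
```

Because each `ys` value is exactly twice the matching `xs` value, the correlation comes out at (approximately) 1, which is the strongest positive linear association.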
Cheat Sheet / Updated 02-09-2022
To stay competitive today, companies must find practical ways to deal with big data, that is, to learn new ways to capture and analyze growing amounts of information about customers, products, and services. Data is becoming increasingly complex in structured and unstructured ways. New sources of data come from machines, such as sensors; social business sites; and website interaction, such as click-stream data.
Article / Updated 03-26-2016
Statistical software packages are extremely powerful these days, but they cannot overcome poor quality data. The following is a checklist of things you need to do before you go off building statistical models. Check data formats: Your analysis always starts with a raw data file. Raw data files come in many different shapes and sizes.
Article / Updated 03-26-2016
The two basic types of probability distributions are known as discrete and continuous. Discrete distributions describe the properties of a random variable for which every individual outcome is assigned a positive probability. A random variable is actually a function; it assigns numerical values to the outcomes of a random process.
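To make the idea of a random variable as a function concrete, here is a minimal sketch for a discrete case. The setup (two fair coin flips, with the random variable counting heads) and the names `outcomes`, `heads`, and `pmf` are assumptions chosen for illustration, not taken from the article.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Outcomes of the random process: every sequence of two coin flips.
outcomes = list(product("HT", repeat=2))


def heads(outcome):
    """The random variable: maps an outcome to a number (count of heads)."""
    return outcome.count("H")


# Each outcome is assumed equally likely, so the probability of a value
# is the share of outcomes that map to it.
counts = Counter(heads(o) for o in outcomes)
pmf = {value: Fraction(count, len(outcomes)) for value, count in counts.items()}

for value in sorted(pmf):
    print(value, pmf[value])
```

Every value the variable can take receives a positive probability, and the probabilities sum to 1, which is what makes this a discrete distribution.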
Article / Updated 03-26-2016
Security and privacy requirements, layer 1 of the big data stack, are similar to the requirements for conventional data environments. The security requirements have to be closely aligned to specific business needs. Some unique challenges arise when big data becomes part of the strategy: Data access: User access to raw or computed big data has about the same level of technical requirements as non-big data implementations.
Article / Updated 03-26-2016
What does your business now do with all the data in all its forms? Big data requires many different approaches to analysis, traditional or advanced, depending on the problem being solved. Some analyses will use a traditional data warehouse, while other analyses will take advantage of advanced predictive analytics.
Article / Updated 03-26-2016
Before you apply statistical techniques to a dataset, it's important to examine the data to understand its basic properties. You can use a series of techniques that are collectively known as Exploratory Data Analysis (EDA) to analyze a dataset. EDA helps ensure that you choose the correct statistical techniques to analyze and forecast the data.
Article / Updated 03-26-2016
While the worlds of big data and the traditional data warehouse will intersect, they are unlikely to merge anytime soon. Think of a data warehouse as a system of record for business intelligence, much like a customer relationship management (CRM) or accounting system. These systems are highly structured and optimized for specific purposes.