Statistics for Big Data For Dummies

Before performing any statistical analysis, you need to understand the nature of the data being analyzed. Exploratory data analysis (EDA) lets you identify the properties of a dataset so that you can choose the most appropriate statistical methods to apply to it. You can investigate several types of properties with EDA techniques, including the following:

  • The center of the data

  • The spread among the members of the data

  • The skewness of the data

  • The probability distribution the data follows

  • The correlation among the elements in the dataset

  • Whether or not the parameters of the data are constant over time

  • The presence of outliers in the data
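Several of the properties listed above can be checked with a few lines of code. The following is a minimal sketch, not a definitive implementation: it uses only Python's standard `statistics` module, the helper names `summarize` and `pearson_r` and the sample numbers are hypothetical, and the outlier check assumes the common 1.5 × IQR rule, which is only one of several possible conventions.

```python
import statistics


def summarize(values):
    """Return basic EDA measures (center, spread, skewness, outliers)."""
    n = len(values)
    mean = statistics.fmean(values)          # center
    median = statistics.median(values)       # center, robust to outliers
    stdev = statistics.stdev(values)         # spread (sample standard deviation)

    # Skewness: moment-based (population) coefficient, an assumed choice.
    pstdev = statistics.pstdev(values)
    skewness = sum((x - mean) ** 3 for x in values) / n / pstdev ** 3

    # Outliers: values beyond 1.5 * IQR from the quartiles (assumed rule).
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [x for x in values if x < low or x > high]

    return {
        "mean": mean,
        "median": median,
        "stdev": stdev,
        "skewness": skewness,
        "outliers": outliers,
    }


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical sample with one unusually large value.
sample = [2, 3, 3, 4, 4, 4, 5, 5, 6, 40]
result = summarize(sample)
print(result)

# Hypothetical pair of perfectly linearly related variables.
print(pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
```

In this sample the mean is pulled well above the median by the single large value, the skewness is positive, and the 1.5 × IQR rule flags that value as an outlier. Checking the probability distribution or the stability of parameters over time would need additional tools, such as histograms or comparisons across time windows.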

Another key question EDA answers is "Does the data conform to our assumptions?" Identifying the properties of a dataset matters because many statistical procedures are sensitive to the assumptions you make about the data, and they can give misleading results when those assumptions do not hold.

About This Article

This article is from the book: Statistics for Big Data For Dummies

About the book authors:

Alan Anderson, PhD, is a professor of economics and finance at Fordham University and New York University. He's a veteran economist, risk manager, and fixed income analyst.

David Semmelroth is an experienced data analyst, trainer, and statistics instructor who consults on customer databases and database marketing.
