Summary statistical measures represent the key properties of a sample or population as a single numerical value. This has the advantage of providing important information in a very compact form. It also simplifies comparing multiple samples or populations. Summary statistical measures can be divided into three types: measures of central tendency, measures of central dispersion, and measures of association.
The two basic types of probability distributions are known as discrete and continuous. Discrete distributions describe the properties of a random variable for which every individual outcome is assigned a positive probability. Continuous distributions, by contrast, describe a random variable that can take on any value within a range; the probability of any single value is zero, so probabilities are assigned to intervals of values instead.
A random variable is actually a function; it assigns numerical values to the outcomes of a random process.
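As a small illustration (a minimal sketch in Python; the two-coin-toss setup is an invented example), the following defines a discrete random variable, the number of heads in two fair coin tosses, and builds the probability assigned to each of its values:

```python
# A discrete random variable as a function: it maps each outcome of a
# random process (two fair coin tosses) to a numerical value (number of heads).
from itertools import product

outcomes = list(product("HT", repeat=2))            # the sample space of the random process
num_heads = {o: o.count("H") for o in outcomes}     # the random variable: outcome -> number

# Probability mass function: each possible value gets a positive probability.
pmf = {}
for outcome, value in num_heads.items():
    pmf[value] = pmf.get(value, 0) + 1 / len(outcomes)

print(pmf)                 # {2: 0.25, 1: 0.5, 0: 0.25}
print(sum(pmf.values()))   # the probabilities of a discrete distribution sum to 1.0
```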
Hypothesis testing is a statistical technique that is used in a variety of situations. Though the technical details differ from situation to situation, all hypothesis tests use the same core set of terms and concepts. The following descriptions of common terms and concepts refer to a hypothesis test in which the means of two populations are being compared.
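For example, one standard test for comparing the means of two populations is the independent-samples t-test. Here is a minimal sketch using SciPy's stats.ttest_ind; the library choice and the sample values are assumptions made purely for illustration:

```python
# Two-sample hypothesis test: are the means of two populations equal?
# The null hypothesis is that they are; the samples below are invented.
from scipy import stats

sample_a = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3]
sample_b = [11.2, 11.5, 11.1, 11.6, 11.4, 11.3]

t_stat, p_value = stats.ttest_ind(sample_a, sample_b)

# At the usual 5 percent significance level, a p-value below 0.05 leads to
# rejecting the null hypothesis that the two population means are equal.
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```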
Several different types of graphs may be useful for analyzing data. These include stem-and-leaf plots, scatter plots, box plots, histograms, quantile-quantile (QQ) plots, and autocorrelation plots.
A stem-and-leaf plot splits each observation in a data set into a “stem” (the leading digit or digits, which define the categories) and a “leaf” (the trailing digit), so that every individual value in the data set remains visible in the plot.
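A minimal sketch in plain Python (the data values are invented) shows the idea, with the tens digit as the stem and the ones digit as the leaf:

```python
# Build a simple text stem-and-leaf plot: stems are tens digits, leaves are ones digits.
from collections import defaultdict

data = [23, 25, 27, 31, 31, 34, 38, 42, 45, 45, 49, 51]

stems = defaultdict(list)
for value in sorted(data):
    stems[value // 10].append(value % 10)

for stem in sorted(stems):
    print(f"{stem} | {' '.join(str(leaf) for leaf in stems[stem])}")
# 2 | 3 5 7
# 3 | 1 1 4 8
# 4 | 2 5 5 9
# 5 | 1
```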
One important way to draw conclusions about the properties of a population is with hypothesis testing. You can use hypothesis tests to compare a population measure to a specified value, compare measures for two populations, determine whether a population follows a specified probability distribution, and so forth.
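For the first of those use cases, comparing a population mean to a specified value, a one-sample t-test is a common choice. The sketch below uses SciPy's stats.ttest_1samp; the sample data and the hypothesized mean of 12.0 are invented for illustration:

```python
# One-sample hypothesis test: is the population mean equal to a specified value?
from scipy import stats

sample = [11.8, 12.3, 12.1, 11.9, 12.4, 12.2, 12.0, 11.7]

t_stat, p_value = stats.ttest_1samp(sample, popmean=12.0)

# A large p-value means there is no evidence that the mean differs from 12.0.
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```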
Measures of association quantify the strength and the direction of the relationship between two data sets. Here are the two most commonly used measures of association:
Covariance
Correlation
Both measures are used to show how closely two data sets are related to each other. The main difference between them is the units in which they are measured: covariance is expressed in the product of the units of the two data sets, whereas correlation is unitless and always falls between -1 and +1.
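The contrast is easy to see numerically. The following minimal sketch uses NumPy (an assumed library choice, with made-up paired data) to compute both measures for the same pair of data sets:

```python
# Covariance vs. correlation for the same two data sets.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

covariance = np.cov(x, y)[0, 1]          # expressed in the product of x's and y's units
correlation = np.corrcoef(x, y)[0, 1]    # unitless, always between -1 and +1

print(covariance)    # positive: x and y tend to move together
print(correlation)   # close to +1: a strong positive relationship
```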
Measures of central tendency show the center of a data set. Three of the most commonly used measures of central tendency are the mean, median, and mode.
Mean
Mean is another word for average. Here is the formula for computing the mean of a sample:

x̄ = (x₁ + x₂ + ⋯ + xₙ) / n

With this formula, you compute the sample mean by simply adding up all the elements in the sample and then dividing by the number of elements in the sample (n).
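In code, the calculation is a one-liner. Here is a minimal sketch in Python with invented sample values:

```python
# Sample mean: sum the elements, then divide by the number of elements.
sample = [4.0, 7.0, 5.5, 6.5, 7.0]

mean = sum(sample) / len(sample)
print(mean)   # 6.0

# The standard library's statistics module gives the same result.
import statistics
print(statistics.mean(sample))   # 6.0
```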
Measures of central dispersion show how "spread out" the elements of a data set are from the mean. The three most commonly used measures of central dispersion are the following:
Range
Variance
Standard deviation
Range
The range of a data set is the difference between the largest value and the smallest value.
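As a quick sketch, here is how the range, along with the other two measures listed above, might be computed for a small invented sample using Python's standard statistics module:

```python
# Range, sample variance, and sample standard deviation for one small data set.
import statistics

sample = [4.0, 7.0, 5.5, 6.5, 7.0]

data_range = max(sample) - min(sample)    # largest value minus smallest value
variance = statistics.variance(sample)    # sample variance (divides by n - 1)
std_dev = statistics.stdev(sample)        # square root of the sample variance

print(data_range, variance, std_dev)
```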
A box plot is designed to show several key statistics for a dataset in the form of a vertical rectangle or box. The statistics it can show include the following:
Minimum value
Maximum value
First quartile (Q1)
Second quartile (Q2)
Third quartile (Q3)
Interquartile range (IQR)
The first quartile of a dataset is a numerical measure that divides the data into two parts: the smallest 25 percent of the observations and the largest 75 percent of the observations.
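A minimal sketch using NumPy's percentile function (the data values are invented) computes every statistic listed above:

```python
# The key box-plot statistics for a small data set.
import numpy as np

data = np.array([7, 15, 36, 39, 40, 41, 42, 43, 47, 49])

q1, q2, q3 = np.percentile(data, [25, 50, 75])   # first, second (median), third quartiles
iqr = q3 - q1                                    # interquartile range: middle 50 percent of the data

print(data.min(), q1, q2, q3, data.max(), iqr)
```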
For a dataset that consists of observations taken at different points in time (that is, time series data), it's important to determine whether the observations are correlated with each other. This matters because many techniques for modeling time series data are based on the assumption that the observations are uncorrelated with each other (independent).
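One simple check is to compute the correlation between the series and a lagged copy of itself (the autocorrelation). Here is a minimal sketch using pandas; the library choice and the made-up upward-trending series are assumptions for illustration:

```python
# Lag-1 autocorrelation of a time series: correlate the series with itself shifted by one period.
import pandas as pd

series = pd.Series([10.0, 10.4, 10.9, 11.3, 11.8, 12.1, 12.7, 13.0, 13.6, 14.1])

lag1 = series.autocorr(lag=1)
print(lag1)   # close to +1 here: successive observations are clearly not independent
```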