The Relationship between Probability and Inferential Statistics
A statistic is a result that’s derived from performing a mathematical operation on numerical data. In general, you use statistics in decision making. You also tend to encounter statistics of two distinct flavors:

Descriptive statistics: As the name implies, descriptive statistics focus on providing you with a description that illuminates some characteristic of your numerical dataset.

Inferential statistics: Rather than focusing on pertinent descriptions of your dataset, inferential statistics carve out a smaller section of the dataset and attempt to deduce something significant about the larger dataset. Use this type of statistics to get information about some real-world measure in which you’re interested.
It’s true that descriptive statistics describe the characteristics of a numerical dataset, but that doesn’t really tell you much about why you should care about that data. In fact, most data scientists are only interested in descriptive statistics because of what they reveal about the real-world measures they describe. For example, a descriptive statistic is often associated with a degree of accuracy, indicating the statistic’s value as an estimate of the real-world measure.
To better understand this concept, imagine that a business owner wants to estimate his upcoming quarter’s profits. He might take an average of his last few quarters’ profits to use as an estimate of how much he will make during the next quarter. But if the previous quarters’ profits varied widely, a descriptive statistic that estimated the variation of this predicted profit value (the amount by which this dollar estimate could differ from the actual profits he will make) would indicate just how far off the predicted value could be from the actual one. (Not bad information to have.)
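The business owner's calculation can be sketched in a few lines of Python. This is only an illustration: the profit figures are made up, and the sample standard deviation is assumed here as the measure of variation (the text doesn't name a specific one).

```python
from statistics import mean, stdev

# Hypothetical profits (in dollars) for the last four quarters.
# These values are invented purely for illustration.
past_profits = [42_000, 55_000, 38_000, 61_000]

# The average of past quarters serves as the estimate for next quarter.
estimate = mean(past_profits)

# The sample standard deviation is one common way to gauge how far
# the actual profit could plausibly stray from that estimate.
spread = stdev(past_profits)

print(f"Estimated profit for next quarter: ${estimate:,.0f}")
print(f"Typical variation between quarters: ${spread:,.0f}")
```

The wider the spread relative to the estimate, the less weight the owner should put on the single averaged number.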
Like descriptive statistics, inferential statistics also reveal something about the real-world measure in which you’re interested. In contrast to descriptive statistics, however, inferential statistics provide information about a small data selection, so you can use this information to infer something about the larger dataset from which it was taken. In statistics, this smaller data selection is known as a sample, and the larger, complete dataset from which the sample is taken is called the population.
If your dataset is too big to analyze in its entirety, then pull a smaller sample of this dataset, analyze that, and then make inferences about the entire dataset based on what you learn from analyzing the sample. You can also use inferential statistics in situations where you simply can’t afford to collect data for the entire population. In this case, you’d simply use the data you do have to make inferences about the population at large.
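Here is a minimal sketch of that sample-then-infer workflow. Everything in it is an assumption made for illustration: the population is synthetic, the sample size of 500 is arbitrary, and the interval uses a normal approximation (1.96 standard errors) as one simple way to express the uncertainty of the estimate.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 synthetic values standing in for
# a dataset that is too large or too costly to analyze in full.
population = [rng.gauss(50_000, 12_000) for _ in range(100_000)]

# Pull a simple random sample (without replacement) and analyze only that.
sample = rng.sample(population, 500)
sample_mean = mean(sample)

# The standard error estimates how much the sample mean would vary
# from one random sample to the next.
standard_error = stdev(sample) / len(sample) ** 0.5

# Rough 95% interval for the population mean (normal approximation).
low = sample_mean - 1.96 * standard_error
high = sample_mean + 1.96 * standard_error

print(f"Sample mean: {sample_mean:,.0f}")
print(f"Approximate 95% interval: {low:,.0f} to {high:,.0f}")
```

In a real project you wouldn't have the full population list to draw from; it appears here only so the example can run on its own.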
Other times, you may find yourself in situations where complete information for the population is simply not available. In these cases, you can use inferential statistics to estimate values for the missing data based on what you learn from analyzing the data that is available.
Descriptive statistics describe the characteristics of your numerical dataset, while inferential statistics are used to make inferences from subsets of data so you can better understand the larger datasets from which those subsets are taken. To better understand this distinction, imagine that you have a socioeconomic dataset that describes women ages 18 to 34 who live in Philadelphia, Pennsylvania.
Descriptive statistics would allow you to understand the characteristics of the population of women that makes up this dataset. Or, you could use inferential statistics with this dataset to make inferences about the larger population of women ages 18 to 34 who live in any city in the state of Pennsylvania (and not just in Philadelphia).
For an inference to be valid, you must select your sample carefully so that you get a true representation of the population. Even if your sample is representative, the numbers in the sample dataset will always exhibit some noise (random variation, in other words), so the sample statistic is almost never exactly identical to its corresponding population statistic.
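To see this noise in action, you can draw several random samples from the same population and compare each sample mean with the population mean. The sketch below assumes a synthetic population of uniformly distributed values and a sample size of 50; both choices are arbitrary and serve only to illustrate the idea.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 synthetic measurements.
population = [rng.uniform(0, 100) for _ in range(10_000)]
population_mean = mean(population)

# Draw five independent random samples and record each sample mean.
sample_means = [mean(rng.sample(population, 50)) for _ in range(5)]

print(f"Population mean: {population_mean:.2f}")
for number, sample_mean in enumerate(sample_means, start=1):
    error = sample_mean - population_mean
    print(f"Sample {number}: mean = {sample_mean:.2f}, error = {error:+.2f}")
```

Each sample is drawn fairly from the same population, yet the sample means land on slightly different values, and none is expected to match the population mean exactly.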