Published: November 26, 2013

Business Statistics For Dummies

Overview

Make some headway in the notoriously tough subject of business statistics

Business Statistics For Dummies helps you understand the core concepts and principles of business statistics, and how they relate to the business world. This book tracks a typical introductory course offered at the undergraduate level, so you know you’ll find all the content you need to pass your class and get your degree. You’ll get an introduction to statistical problems and processes common to the world of global business and economics. Written in clear and simple language, Business Statistics For Dummies gives you an introduction to probability, sampling techniques and distributions, and drawing conclusions from data. You’ll also discover how to use charts and graphs to visualize the most important properties of a data set.

  • Grasp the core concepts, principles, and methods of business statistics
  • Learn tricky concepts with simplified explanations and illustrative graphs
  • See how statistics applies in the real world, thanks to concrete examples
  • Read charts and graphs for a better understanding of how businesses operate

Business Statistics For Dummies is a lifesaver for students studying business at the college level. This guide is also useful for business professionals looking for a desk reference on this complicated topic.


About The Author

Alan Anderson, PhD, FRM, is a lecturer in the Department of Economics at Fordham University. He is also Adjunct Professor of Finance at NYU, Adjunct Professor of Economics and Mathematics at Purchase College, SUNY, and Adjunct Professor of Finance at Manhattanville College. He has also worked at corporate institutions such as TIAA-CREF and Reuters in quantitative risk management and credit spread analysis.


CHEAT SHEET

Statistics make it possible to analyze real-world business problems with actual data so that you can determine if a marketing strategy is really working, how much a company should charge for its products, or any of a million other practical questions. The science of statistics uses regression analysis, hypothesis testing, sampling distributions, and more to ensure accurate data analysis.

Articles from the Book

When you're working with populations and samples (a subset of a population) in business statistics, you can use three common types of measures to describe the data set: central tendency, dispersion, and association. By convention, the statistical formulas used to describe population measures contain Greek letters, while the formulas used to describe sample measures contain Latin letters.
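As a quick illustration (made-up numbers, not from the book), here's how the sample and population versions of a measure differ in Python; the sample standard deviation s divides by n – 1, while the population standard deviation σ divides by n:

    import statistics

    data = [12, 15, 11, 18, 14]  # hypothetical data set

    sample_mean = statistics.mean(data)       # x-bar, a Latin letter
    sample_std = statistics.stdev(data)       # s: divides by n - 1
    population_std = statistics.pstdev(data)  # sigma: divides by n

    print(sample_mean, sample_std, population_std)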
Hypothesis testing isn't just for population means and standard deviations. You can use this procedure to test many different kinds of propositions. For example, a jury trial can be seen as a hypothesis test with a null hypothesis of "innocent" and an alternative hypothesis of "guilty." One particularly interesting application of hypothesis testing comes from the Royal Mint in England.
In the field of risk management, you can measure the risk of a portfolio with the Value at Risk (VaR) methodology. The standard VaR model (known as the variance-covariance approach) is typically based on the assumption that the returns to a portfolio follow the normal distribution. But the assumption that financial returns are normal when they're not normal can have adverse consequences.
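A minimal sketch of the variance-covariance approach with hypothetical numbers (a single asset and normally distributed returns are assumed; this is not the book's example):

    from statistics import NormalDist

    portfolio_value = 1_000_000   # hypothetical $1M portfolio
    daily_volatility = 0.02       # hypothetical 2% daily return volatility
    confidence = 0.95

    # Under normality, the 95% one-day VaR is the loss at the
    # 5th percentile of the return distribution.
    z = NormalDist().inv_cdf(1 - confidence)         # about -1.645
    var_95 = -z * daily_volatility * portfolio_value
    print(f"One-day 95% VaR: ${var_95:,.0f}")        # about $32,900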
Depending on your school of thought, forecasting market prices can be either a waste of time or the key to financial success. Either way, knowing about each camp is useful as you learn about business statistics. Forecasting is especially important in the field of finance. Investors try to decide which assets to buy or sell based on their own expectations of future market conditions — including stock prices, interest rates, exchange rates, and commodity prices.
Market participants — equity analysts, risk managers, portfolio managers, traders, and economists — must be able to accurately measure and model the risk and return of financial assets. As a starting point, modeling the properties of asset returns requires the choice of an appropriate probability distribution — a statistical function that describes all the possible values and likelihoods that a random variable can take within a given range.
Regression analysis is one of the most important statistical techniques for business applications. It's a statistical methodology that helps estimate the strength and direction of the relationship between two or more variables. For example, an analyst may use regression analysis to determine the actual relationship between a corporation's sales and its profits by looking at data from the past several years.
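A simple least-squares fit of profits on sales might look like this in Python (hypothetical data, not from the book):

    import numpy as np

    sales = np.array([10, 12, 15, 18, 22])         # hypothetical sales, millions
    profits = np.array([1.1, 1.4, 1.9, 2.1, 2.8])  # hypothetical profits, millions

    # Fit profits = b0 + b1 * sales by ordinary least squares
    slope, intercept = np.polyfit(sales, profits, 1)
    print(f"profits = {intercept:.2f} + {slope:.2f} * sales")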
If the variances of two independent populations aren't equal (or you don't have any reason to believe that they're equal) and at least one sample is small (less than 30), the appropriate test statistic is t = (x̄1 – x̄2) / √(s1²/n1 + s2²/n2). In this case, you get the critical values from the t-distribution with degrees of freedom (df) equal to df = (s1²/n1 + s2²/n2)² / [(s1²/n1)²/(n1 – 1) + (s2²/n2)²/(n2 – 1)]. Note that this value isn't necessarily equal to a whole number; if the resulting value contains a fractional part, you must round it to the next closest whole number.
When you're testing hypotheses about two population means, where the variances of the two populations aren't equal and the sizes of both samples are large (30 or greater), the appropriate test statistic is z = (x̄1 – x̄2) / √(s1²/n1 + s2²/n2). This test statistic is based on the standard normal distribution. As an example, say that a restaurant chain is interested in finding out whether the average sale per customer is the same in its domestic and foreign restaurants.
You can determine the relationship between two variables with two measures of association: covariance and correlation. For example, if an investor wants to understand the risk of a portfolio of stocks, then he can use these measures to properly determine how closely the returns on the stocks track each other. Covariance is used to measure the tendency for two variables to rise above their means or fall below their means at the same time.
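A minimal sketch with made-up return data (not from the book):

    import numpy as np

    stock_a = np.array([0.010, -0.020, 0.015, 0.030, -0.010])  # hypothetical daily returns
    stock_b = np.array([0.008, -0.015, 0.020, 0.025, -0.012])

    covariance = np.cov(stock_a, stock_b)[0, 1]        # sample covariance
    correlation = np.corrcoef(stock_a, stock_b)[0, 1]  # always between -1 and 1
    print(covariance, correlation)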
A probability distribution is a formula or a table used to assign probabilities to each possible value of a random variable X. A probability distribution may be either discrete or continuous. A discrete distribution means that X can assume one of a countable (usually finite) number of values, while a continuous distribution means that X can assume one of an infinite (uncountable) number of different values.
When drawing conclusions about a population from randomly chosen samples (a process called statistical inference), you can use two methods: confidence intervals and hypothesis testing. A confidence interval is a range of values that's expected to contain the value of a population parameter with a specified level of confidence (such as 90 percent, 95 percent, 99 percent, and so on).
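For a large sample, a 95 percent confidence interval for the mean can be computed like this (hypothetical numbers, not from the book):

    import math
    from statistics import NormalDist

    n, x_bar, s = 64, 50.0, 8.0  # hypothetical sample size, mean, and std. dev.
    confidence = 0.95

    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96
    margin = z * s / math.sqrt(n)
    print(f"95% CI: ({x_bar - margin:.2f}, {x_bar + margin:.2f})")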
In statistics, hypothesis testing refers to the process of choosing between competing hypotheses about a probability distribution, based on observed data from the distribution. It's a core topic and a fundamental part of the language of statistics. Hypothesis testing is a six-step procedure: stating the null hypothesis, stating the alternative hypothesis, choosing the level of significance, computing the test statistic, finding the critical value or values, and making a decision about the null hypothesis.
You can test hypotheses about two population means where the populations are independent of each other, but have equal size and variance. With equal population variances, the test statistic requires the calculation of a pooled variance — this is the variance that the two populations have in common. You use the Student's t-distribution to find the test statistic and critical values.
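A minimal sketch of the pooled-variance calculation (standard formulas with hypothetical numbers, not quoted from the book):

    import math

    n1, x1, s1_sq = 12, 104.0, 25.0  # hypothetical size, mean, variance of sample 1
    n2, x2, s2_sq = 15, 98.0, 30.0   # hypothetical size, mean, variance of sample 2

    # Pooled variance: a weighted average of the two sample variances
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

    # Test statistic, with n1 + n2 - 2 degrees of freedom
    t = (x1 - x2) / math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    print(t)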
When testing a hypothesis for a small sample where you have to find the appropriate critical left-tail value, this value depends on certain criteria. In addition to being negative, the value also depends on the sample size and whether or not the population standard deviation is known. A left-tailed test is a test to determine if the actual value of the population mean is less than the hypothesized value.
When testing a hypothesis for a small sample where you have to find the appropriate critical right-tail value, this value depends on certain criteria. In addition to being positive, the value also depends on the sample size and whether or not the population standard deviation is known. After you calculate a test statistic, you compare it to one or two critical values, depending on the alternative hypothesis, to determine whether you should reject the null hypothesis.
In statistics, a large sample has a size greater than or equal to 30. When you use a large sample to test a hypothesis about a population mean, the resulting two-tailed critical values from the standard normal distribution equal ±zα/2 (for example, ±1.96 at the 5 percent level of significance). Because you draw these critical values from the standard normal distribution, you don’t have to calculate degrees of freedom.
When you use a small sample to test a hypothesis about a population mean, you take the resulting critical value or values from the Student's t-distribution. For a two-tailed test, the critical value is ±tα/2 with n – 1 degrees of freedom, where n represents the sample size. You look these values up in a table of the Student's t-distribution, whose rows correspond to degrees of freedom and whose columns correspond to tail areas such as t0.10, t0.05, t0.025, and t0.01.
Compared with other types of hypothesis tests, constructing the test statistic for ANOVA is quite complex. The first step in finding the test statistic is to calculate the error sum of squares (SSE). Calculating the SSE enables you to calculate the treatment sum of squares (SSTR) and total sum of squares (SST).
Calculating the treatment sum of squares (SSTR) and the total sum of squares (SST) are two important steps in constructing the test statistic for ANOVA. Once you have calculated the error sum of squares (SSE), you can calculate the SSTR and SST. When you compute SSE, SSTR, and SST, you then find the error mean square (MSE) and treatment mean square (MSTR), from which you can then compute the test statistic.
Regression analysis is a statistical tool used for the investigation of relationships between variables. Usually, the investigator seeks to ascertain the causal effect of one variable upon another — the effect of a price increase upon demand, for example, or the effect of changes in the money supply upon the inflation rate.
Two of the most widely used measures of association are covariance and correlation. These measures are closely related to each other; in fact, you can think of correlation as a modified version of covariance. Correlation is easier to interpret because its value is always between –1 and 1. For example, a correlation of 0.9 indicates a strong positive relationship between two variables.
One of the interesting properties of the t-distribution is that the greater the degrees of freedom, the more closely the t-distribution resembles the standard normal distribution. As the degrees of freedom increases, the area in the tails of the t-distribution decreases while the area near the center increases.
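You can check this convergence numerically (a quick sketch assuming SciPy is available; SciPy isn't mentioned in the book):

    from scipy.stats import norm, t

    # Two-tailed 5% critical values shrink toward the normal value as df grows
    for df in (5, 30, 120):
        print(df, round(t.ppf(0.975, df), 3))
    print("normal:", round(norm.ppf(0.975), 3))  # 1.96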
The geometric distribution is based on the binomial process (a series of independent trials with two possible outcomes). You use the geometric distribution to determine the probability that a specified number of trials will take place before the first success occurs. Alternatively, you can use the geometric distribution to figure the probability that a specified number of failures will occur before the first success takes place.
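A minimal sketch of the trials-until-first-success version (standard formula, hypothetical numbers):

    # P(first success occurs on trial x), given success probability p per trial
    def geometric_pmf(x: int, p: float) -> float:
        return (1 - p) ** (x - 1) * p

    print(geometric_pmf(3, 0.2))  # 0.128: two failures, then a success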
You can use the adjusted coefficient of determination to determine how well a multiple regression equation "fits" the sample data. The adjusted coefficient of determination is closely related to the coefficient of determination (also known as R2) that you use to test the results of a simple regression equation.
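The standard adjustment (not quoted from the book) penalizes R2 for the sample size n and the number of independent variables k; a quick sketch:

    # Adjusted coefficient of determination
    def adjusted_r_squared(r_sq: float, n: int, k: int) -> float:
        return 1 - (1 - r_sq) * (n - 1) / (n - k - 1)

    print(adjusted_r_squared(0.82, 25, 3))  # about 0.794, hypothetical values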
Probability distributions, including the t-distribution, have several moments, including the expected value, variance, and standard deviation (a moment is a summary measure of a probability distribution). The first moment of a distribution is the expected value, E(X), which represents the mean or average value of the distribution.
A frequency distribution shows the number of elements in a data set that belong to each class. In a relative frequency distribution, the value assigned to each class is the proportion of the total data set that belongs in the class. For example, suppose that a frequency distribution is based on a sample of 200 supermarkets.
The uniform distribution is used to describe a situation where all possible outcomes of a random experiment are equally likely to occur. You can use the variance and standard deviation to measure the "spread" among the possible values of the probability distribution of a random variable. For example, suppose that an art gallery sells two types of art work: inexpensive prints and original paintings.
To estimate a time series with regression analysis, the first step is to identify the type of trend (if any) that's present in the data. The type of trend, such as linear or quadratic, determines the exact equation that is estimated. In the case where a time series doesn't increase or decrease over time (no trend), it may instead randomly fluctuate around a constant value.
The Poisson distribution is useful for measuring how many events may occur during a given time horizon, such as the number of customers that enter a store during the next hour, the number of hits on a website during the next minute, and so forth. The Poisson process takes place over time instead of a series of trials; each interval of time is assumed to be independent of all other intervals.
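A minimal sketch of the Poisson probability formula (hypothetical numbers, not from the book):

    import math

    # P(k events in an interval), given an average rate lam per interval
    def poisson_pmf(k: int, lam: float) -> float:
        return math.exp(-lam) * lam ** k / math.factorial(k)

    # Hypothetical store averaging 4 customers per hour
    print(poisson_pmf(6, 4.0))  # about 0.104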
You use the addition rule to compute the probability of the union of two events. Mathematically speaking, for events A and B, the addition rule states that P(A ∪ B) = P(A) + P(B) – P(A ∩ B). This shows that the probability of the union of events A and B equals the sum of the probability of A and the probability of B, from which the probability of both events is subtracted.
Two events are said to be complements if they are mutually exclusive and their union equals the entire sample space. This is represented by the complement rule, which is expressed as follows: P(AC) = 1 – P(A), where AC is the complement of event A. Here's an example: Suppose that an experiment consists of choosing a single card from a standard deck.
As with the binomial and geometric distributions, you can use simple formulas to compute the moments — expected value, variance, and standard deviation — of the Poisson distribution. You can find the expected value of the Poisson distribution by using the formula E(X) = λ, where λ is the average number of occurrences per interval. For example, say that on average three new companies are listed on the New York Stock Exchange (NYSE) each year.
To figure out the probability of the intersection of two events, you use the multiplication rule. This is used to determine the probability that two events are both true. For example, suppose an experiment consists of choosing a card from a standard deck. Event A = "the card is red." Event B = "the card is a king."
To estimate a time series regression model, a trend must be estimated. You begin by creating a line chart of the time series. The line chart shows how a variable changes over time; it can be used to inspect the characteristics of the data, in particular, to see whether a trend exists. For example, suppose you're a portfolio manager and you have reason to believe a linear trend occurs in a time series of returns to Microsoft stock.
You can estimate and predict the value of Y using a multiple regression equation. With multiple regression analysis, the population regression equation may contain any number of independent variables, such as Y = β0 + β1X1 + β2X2 + … + βkXk + ε. In this case, there are k independent variables, indexed from 1 to k. For example, suppose that the Human Resources department of a major corporation wants to determine whether the salaries of its employees are related to the employees' years of work experience and their level of graduate education.
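A minimal sketch of fitting such an equation by least squares (hypothetical data loosely following the example, not from the book):

    import numpy as np

    experience = np.array([2.0, 5.0, 8.0, 12.0, 15.0])  # hypothetical years of experience
    education = np.array([0.0, 2.0, 2.0, 4.0, 4.0])     # hypothetical years of graduate education
    salary = np.array([50.0, 65.0, 72.0, 90.0, 98.0])   # hypothetical salary, thousands

    # Design matrix with a column of ones for the intercept
    X = np.column_stack([np.ones_like(experience), experience, education])
    coefficients, *_ = np.linalg.lstsq(X, salary, rcond=None)
    print(coefficients)  # [b0, b1, b2]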
An unconditional, or marginal, probability is the probability of a single event occurring, without taking any other events into account. When you create a joint probability table, the unconditional probability of an event appears as a row total or a column total. For example, say that you create a joint probability table representing the distribution of students in a business school; you classify them according to major and whether they're working on a bachelor's degree or a master's degree.
You can find the critical values for an ANOVA hypothesis test using the F-table. Because the F-distribution is based on two types of degrees of freedom, there's one table for each possible value of alpha (the level of significance). For example, one table shows the different values of the F-distribution corresponding to a 0.05 level of significance.
Cumulative frequency refers to the total frequency of a given class and all prior classes in a graph. For example, say that you have researched the price of gas at several gas stations in your area and broken the price ranges into classes with a range of $0.25 each; the cumulative frequency of a class is then the running total of the frequencies of that class and every class before it.
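A quick sketch of the running total (hypothetical counts, not from the book):

    # Cumulative frequency: running total of class frequencies
    frequencies = [4, 6, 5, 3]  # hypothetical counts per price class
    cumulative = []
    total = 0
    for f in frequencies:
        total += f
        cumulative.append(total)
    print(cumulative)  # [4, 10, 15, 18]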
Moments are summary measures of a probability distribution, and include the expected value, variance, and standard deviation. The expected value represents the mean or average value of a distribution. The expected value is sometimes known as the first moment of a probability distribution. You calculate the expected value by taking each possible value of the distribution, weighting it by its probability, and then summing the results.
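A minimal sketch with a made-up discrete distribution:

    values = [0, 1, 2, 3]                 # hypothetical values of X
    probabilities = [0.1, 0.4, 0.3, 0.2]  # hypothetical P(X = x); must sum to 1

    # E(X): weight each value by its probability, then sum
    expected_value = sum(x * p for x, p in zip(values, probabilities))
    print(expected_value)  # 1.6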
Moments are summary measures of a probability distribution, and include the expected value, variance, and standard deviation. The moments of the geometric distribution depend on which of the following situations is being modeled: The number of trials required before the first success takes place The number of failures that occur before the first success Just as with the binomial distribution, the geometric distribution has a series of simplified formulas for computing these moments.
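For the trials-until-first-success case, the standard simplified formulas (not quoted from the book) are E(X) = 1/p and variance = (1 – p)/p^2; a quick sketch:

    # Moments of the geometric distribution (trials until the first success)
    p = 0.2                      # hypothetical success probability per trial
    expected_trials = 1 / p      # 5.0
    variance = (1 - p) / p ** 2  # 20.0
    print(expected_trials, variance)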
Compared with other types of hypothesis tests, constructing the test statistic for ANOVA is quite complex. You construct the test statistic (or F-statistic) from the error mean square (MSE) and the treatment mean square (MSTR). In order to calculate the MSE and MSTR, you first have to calculate the error sum of squares (SSE), treatment sum of squares (SSTR), and total sum of squares (SST), followed by the error mean square (MSE) and treatment mean square (MSTR).
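Putting the pieces together with made-up data (a sketch, not the book's example):

    import numpy as np

    # Hypothetical samples from three treatment groups
    groups = [np.array([4.0, 5.0, 6.0]),
              np.array([5.5, 6.5, 7.5]),
              np.array([7.0, 8.0, 9.0])]

    all_data = np.concatenate(groups)
    grand_mean = all_data.mean()

    sse = sum(((g - g.mean()) ** 2).sum() for g in groups)             # within-group variation
    sstr = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)  # between-group variation

    k, n = len(groups), len(all_data)
    mstr = sstr / (k - 1)  # treatment mean square
    mse = sse / (n - k)    # error mean square
    print(mstr / mse)      # the F-statistic, 6.75 here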
Sometimes a data set contains a large number of repeated values. In these situations, you can simplify the process of computing the mean by using weights — the frequencies of a value in a sample or a population. You can compute the arithmetic mean as a weighted average. The formula for computing a weighted arithmetic mean for a sample or a population is x̄ = Σ(wi·Xi) / Σwi. Here, wi represents the weight associated with element Xi; this weight equals the number of times that the element appears in the data set.
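A quick sketch of the weighted mean with hypothetical values and frequencies:

    values = [10, 12, 15]  # hypothetical distinct values
    weights = [3, 5, 2]    # how many times each value appears

    weighted_mean = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    print(weighted_mean)  # 12.0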
When a data set contains a large number of repeated values, you can simplify the process of computing the mean by using weights — the frequencies of a value in a sample or a population. You can then compute the geometric mean as a weighted average. You can calculate the weighted geometric mean in the same way for both samples and populations.
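The standard weighted form (not quoted from the book) raises each value to its weight and then takes the Σwi-th root; a quick sketch:

    # Weighted geometric mean: product of values raised to their weights,
    # then the (sum of weights)-th root
    values = [1.05, 1.10, 0.98]  # hypothetical growth factors
    weights = [2, 5, 3]          # hypothetical frequencies

    product = 1.0
    for x, w in zip(values, weights):
        product *= x ** w
    print(product ** (1 / sum(weights)))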
One way to illustrate the binomial distribution is with a histogram. A histogram shows the possible values of a probability distribution as a series of vertical bars. The height of each bar reflects the probability of each value occurring. A histogram is a useful tool for visually analyzing the properties of a distribution, and (by the way) all discrete distributions may be represented with a histogram.
The uniform distribution is a continuous distribution that assigns only positive probabilities within a specified interval (a, b) — that is, all values between a and b. (a and b are two constants; they may be negative or positive.) A continuous distribution can't be illustrated with a histogram, because this would require an infinite number of bars.
Quartiles split up a data set into four equal parts, each consisting of 25 percent of the sorted values in the data set. Quartiles are related to percentiles like so:

  • First quartile (Q1) = 25th percentile
  • Second quartile (Q2) = 50th percentile
  • Third quartile (Q3) = 75th percentile

Because the second quartile is the 50th percentile, it's also the median of a data set.
When comparing data samples from different populations, two of the most popular measures of association are covariance and correlation. Covariance and correlation show that variables can have a positive relationship, a negative relationship, or no relationship at all. A sample is a randomly chosen selection of elements from an underlying population.
Moments are summary measures of a probability distribution and include the expected value, variance, and standard deviation. You can use these values to measure to what extent the degrees of freedom affect the F-distribution. The expected value is known as the first moment of a probability distribution and represents the mean or average value of a distribution.
A scatter plot (also known as a scatter diagram) shows the relationship between two quantitative (numerical) variables. These variables may be positively related, negatively related, or unrelated. Positively related variables indicate that when one variable increases, the other variable tends to increase, and when one variable decreases, the other variable tends to decrease.
In the binomial formula, you use the combinations formula to count the number of combinations that can be created when choosing x objects from a set of n objects: C(n, x) = n! / [x!(n – x)!]. One distinguishing feature of a combination is that the order of objects is irrelevant. For example, you can use this formula to count the number of ways you can choose two elective classes from a set of eight for the upcoming semester.
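For the electives example, the count works out like this (using Python's built-in combinations function):

    import math

    # Ways to choose 2 electives from 8, where order doesn't matter
    print(math.comb(8, 2))  # 8! / (2! * 6!) = 28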
Relative variation refers to the spread of a sample or a population as a proportion of the mean. Relative variation is useful because it can be expressed as a percentage, and is independent of the units in which the sample or population data are measured. For example, you can use a measure of relative variation to compare the uncertainty or variation associated with the temperature in two different countries, even if one country uses Fahrenheit temperatures and the other uses Celsius temperatures.
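The usual measure of relative variation is the coefficient of variation (standard definition, not quoted from the book); a quick sketch:

    # Coefficient of variation: standard deviation as a percentage of the mean
    def coefficient_of_variation(std_dev: float, mean: float) -> float:
        return std_dev / mean * 100

    print(coefficient_of_variation(2.5, 20.0))  # 12.5 percent, hypothetical values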
You can use the Central Limit Theorem to convert a sampling distribution to a standard normal random variable. Based on the Central Limit Theorem, if you draw samples of size 30 or greater from a population, then the sample mean is (approximately) a normally distributed random variable. To determine probabilities for the sample mean from the standard normal tables, you convert the sample mean to a standard normal random variable with Z = (x̄ – μ) / (σ/√n).
The properties of a probability distribution can be summarized with a set of numerical measures known as moments. One of these moments is called the expected value, or mean. In order to calculate an expected value, you use a summation operator. The summation operator is used to indicate that a set of values should be added together.
Random variables and probability distributions are two of the most important concepts in statistics. A random variable assigns unique numerical values to the outcomes of a random experiment; this is a process that generates uncertain outcomes. A probability distribution assigns probabilities to each possible value of a random variable.
An event is one possible outcome of a random experiment. Events may sometimes be related to each other. Two key ways in which events may be related are known as mutually exclusive and independent. Two events are said to be mutually exclusive if they can't both happen at the same time.
After you estimate the population regression line, you can check whether the regression equation makes sense by using the coefficient of determination, also known as R2 (R squared). This is used as a measure of how well the regression equation actually describes the relationship between the dependent variable (Y) and the independent variable (X).
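One common way to compute it (standard formula, not quoted from the book) uses the error and total sums of squares:

    # R-squared: the share of total variation explained by the regression
    def r_squared(sse: float, sst: float) -> float:
        return 1 - sse / sst

    print(r_squared(20.0, 100.0))  # 0.8, hypothetical values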
In statistics, sampling distributions are the probability distributions of any given statistic based on a random sample, and are important because they provide a major simplification on the route to statistical inference. More specifically, they allow analytical considerations to be based on the sampling distribution of a statistic, rather than on the joint probability distribution of all the individual sample values.
A scatter plot is a special type of graph designed to show the relationship between two variables. With regression analysis, you can use a scatter plot to visually inspect the data to see whether X and Y are linearly related. For example, a scatter plot may reveal that two variables have a nonlinear relationship between them.
The F-distribution is a continuous probability distribution, which means that it is defined for an infinite number of different values. The F-distribution can be used for several types of applications, including testing hypotheses about the equality of two population variances and testing the validity of a multiple regression equation.
