Measures of dispersion provide information about the spread of a variable’s values. There are four key measures of dispersion:
- Range
- Interquartile Range
- Variance
- Standard Deviation
Range is simply the difference between the smallest and largest values in the data. The interquartile range is the difference between the values at the 75th percentile and the 25th percentile of the data.
Variance is the most commonly used measure of dispersion. It is calculated by taking the average of the squared differences between each value and the mean.
Standard deviation, another commonly used statistic, is the square root of the variance.
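The four definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation. The function names and the `scores` data are hypothetical. The quartile method is an assumption, because textbooks and software use slightly different conventions for percentiles.

```python
import math
import statistics


def value_range(values):
    """Difference between the largest and smallest values."""
    return max(values) - min(values)


def interquartile_range(values):
    """Value at the 75th percentile minus the value at the 25th percentile.

    Assumption: quartiles come from statistics.quantiles with
    method="inclusive"; other conventions can give slightly different results.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def variance(values):
    """Average of the squared differences between each value and the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


def standard_deviation(values):
    """Square root of the variance."""
    return math.sqrt(variance(values))


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, mean = 5

print(value_range(scores))          # 7
print(interquartile_range(scores))  # depends on the quartile convention chosen
print(variance(scores))             # 4.0
print(standard_deviation(scores))   # 2.0
```

Note that `variance` divides by the number of values, matching the description above. Many software packages divide by one less than the number of values by default when the data are treated as a sample.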
Skew is a measure of whether some values of a variable are extremely different from the majority of the values. For example, income is skewed because most people make between $0 and $200,000, but a handful of people earn millions. A variable is positively skewed if the extreme values are higher than the majority of values. A variable is negatively skewed if the extreme values are lower than the majority of values.
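One way to put a number on the direction of skew is a moment-based skewness coefficient. The text above does not give a formula, so the coefficient used here is an assumption: it is one common choice, and it is positive when the extreme values sit above the bulk of the data and negative when they sit below. The function name and the data are hypothetical.

```python
def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5.

    Assumes the values are not all identical (otherwise the variance is 0).
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


high_extreme = [1, 2, 2, 3, 3, 3, 50]   # one value far above the rest
low_extreme = [-50, -3, -3, -3, -2, -2, -1]  # one value far below the rest
symmetric = [1, 2, 3, 4, 5]

print(skewness(high_extreme) > 0)  # True: positively skewed
print(skewness(low_extreme) < 0)   # True: negatively skewed
print(skewness(symmetric))         # 0.0: no skew
```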
The incomes of five randomly selected people in the United States are $10,000, $10,000, $45,000, $60,000, and $1,000,000:
Range = 1,000,000 – 10,000 = 990,000
Mean = (10,000 + 10,000 + 45,000 + 60,000 + 1,000,000) / 5 = 225,000
Variance = [(10,000 – 225,000)² + (10,000 – 225,000)² + (45,000 – 225,000)² + (60,000 – 225,000)² + (1,000,000 – 225,000)²] / 5 = 150,540,000,000
Standard Deviation = Square Root (150,540,000,000) ≈ 387,995
Skew = Positive (the $1,000,000 income is far higher than the other four values)
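The arithmetic in this example can be checked with Python's standard `statistics` module. This is a sketch for verification only. `pvariance` and `pstdev` divide by the number of values, which matches the calculation above. `statistics.variance` and `statistics.stdev` divide by one less than the number of values and would give larger results.

```python
import statistics

# The five incomes from the example above.
incomes = [10_000, 10_000, 45_000, 60_000, 1_000_000]

income_range = max(incomes) - min(incomes)
mean = statistics.mean(incomes)
variance = statistics.pvariance(incomes)  # average squared difference from the mean
std_dev = statistics.pstdev(incomes)      # square root of that variance

print(f"Range              = {income_range:,}")    # 990,000
print(f"Mean               = {mean:,.0f}")         # 225,000
print(f"Variance           = {variance:,.0f}")     # 150,540,000,000
print(f"Standard deviation = {std_dev:,.0f}")      # 387,995
```

The mean (225,000) is far above the median (45,000), which is consistent with the positive skew noted above.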