We have met three different averages: the mean, the median and the mode. How do we know which is the best one to choose in a given situation?
The mean is the only calculation which uses the numerical values of all of the data items. This sounds like an advantage, and often is.
But if there is an outlier in your data (i.e. an abnormally high or low value) it will affect your calculation by "dragging up/down" the value of the mean. For example, the mean wage of employees of a firm will be artificially inflated if the salary of the managing director is included.
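The effect of an outlier can be seen in a small sketch. The wage figures below are invented purely for illustration; they are not taken from any real firm.

```python
from statistics import mean, median

# Hypothetical annual wages (in pounds) for nine employees.
staff_wages = [21000, 22000, 22500, 23000, 24000, 24500, 25000, 26000, 28000]
# Hypothetical salary of the managing director: the outlier.
director_wage = 180000

all_wages = staff_wages + [director_wage]

mean_without = mean(staff_wages)      # 24000
mean_with = mean(all_wages)           # 39600
median_without = median(staff_wages)  # 24000
median_with = median(all_wages)       # 24250.0

print("Mean without director:  ", mean_without)
print("Mean with director:     ", mean_with)
print("Median without director:", median_without)
print("Median with director:   ", median_with)
```

With these made-up figures, one extra salary drags the mean up by 15,600 while the median moves by only 250.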
Quite often, the mean will turn out to be a theoretical value which is not possible in practice (a famous example is "2.4 children"). This may prove to be a disadvantage.
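A short sketch shows how such a value can arise. The household figures are hypothetical, chosen only so that the mean comes out at 2.4.

```python
from statistics import mean

# Hypothetical number of children in each of five households.
children = [1, 2, 2, 3, 4]

# Total of 12 children shared over 5 households.
average_children = mean(children)

print(average_children)              # 2.4
print(average_children in children)  # False: no household has 2.4 children
```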
The median is largely 'immune' to outliers and to skew.
It usually yields an answer which is one of the original data items. The exception is a data set with an even number of items, where the median is the mean of the two middle values.
The median is a useful measure of the "central" data value. You will know that, in general, half of the data values in the set will be greater than the median and half will be less.
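As a minimal sketch of the median in practice, the two made-up data sets below show the odd and even cases. With an odd number of items the median is one of the data values; with an even number it is the mean of the two middle values.

```python
from statistics import median

# Hypothetical data sets, for illustration only.
odd_set = [3, 5, 7, 9, 11]   # five items: the middle item is the median
even_set = [3, 5, 7, 9]      # four items: average the two middle items

median_odd = median(odd_set)    # 7
median_even = median(even_set)  # (5 + 7) / 2 = 6.0

# Count how many values lie on each side of the median of odd_set.
below = sum(1 for x in odd_set if x < median_odd)
above = sum(1 for x in odd_set if x > median_odd)

print(median_odd, median_even)  # 7 6.0
print(below, above)             # 2 2
```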
The mode is the only average which is certain to be one of the original data items.
It is also the only one which is suitable for Qualitative (i.e. non-numerical) data.
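For instance, the mode of a set of non-numerical data can be found as in the sketch below. The eye colours are invented for illustration.

```python
from statistics import mode

# Hypothetical survey of eye colours: qualitative (non-numerical) data.
eye_colours = ["brown", "blue", "brown", "green", "brown", "blue"]

# The mode is simply the most frequent item, so no arithmetic is needed.
most_common = mode(eye_colours)

print(most_common)  # brown
```

A mean or median of these colours would be meaningless, since the values cannot be added.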
When the distribution of the data is skewed (see diagram), the mode becomes more a measure of popularity than a central value.
Also, you may find that more than one data item 'ties' for the most frequent. In this case you will have two or more modes.
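A tie can be illustrated with a short sketch. The shoe sizes are hypothetical, and `multimode` is assumed to be available, which requires Python 3.8 or later.

```python
from statistics import multimode

# Hypothetical shoe sizes: 6 and 8 each appear twice, the rest once.
shoe_sizes = [5, 6, 6, 7, 8, 8, 9]

# multimode returns every value that ties for most frequent,
# in the order each first appears in the data.
modes = multimode(shoe_sizes)

print(modes)  # [6, 8]
```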