Descriptive Statistics


  • From: David Hughes
  • Date: 17 Feb 1999
  • Subject: Choosing the best average

We have been told 3 different averages. How do we know the best one to choose in a given situation?


Maths Help suggests:

First, a reminder of the three averages:
Mean: The sum of the data items divided by the number of data items.
Median: The central data item when all data are listed in numerical order (or the mean of the two central items when there is an even number of items).
Mode: The data item(s) which occur most often.
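
As a quick illustration of the three definitions, here is a minimal sketch using Python's standard statistics module. The data list is made up for the example and is not taken from the question.

```python
from statistics import mean, median, multimode

# Made-up example data (an assumption for illustration only)
data = [2, 3, 3, 5, 7, 10]

# Mean: sum of the items divided by how many there are (30 / 6)
data_mean = mean(data)

# Median: the middle of the sorted list; with six items this is
# the mean of the two central items, 3 and 5
data_median = median(data)

# Mode: the item(s) occurring most often; multimode returns a list
data_modes = multimode(data)

print(data_mean, data_median, data_modes)
```

Note that multimode needs Python 3.8 or later.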

Mean

The mean is the only calculation which uses the numerical values of all of the data items. This sounds like an advantage, and often is.

But if there is an outlier in your data (i.e. an abnormally high or low value), it will drag the value of the mean up or down. For example, the mean wage of employees of a firm will be inflated if the salary of the managing director is included.
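
The wage example can be sketched in a few lines of Python. All of the figures below are invented for illustration; they are an assumption, not real data.

```python
from statistics import mean

# Hypothetical annual wages for five employees (invented figures)
staff_wages = [18000, 20000, 22000, 24000, 26000]

# Hypothetical salary of the managing director (invented figure)
director_salary = 150000

# Mean of the ordinary wages alone: 110000 / 5
mean_without = mean(staff_wages)

# Mean once the single outlier is included: 260000 / 6
mean_with = mean(staff_wages + [director_salary])

print(mean_without, round(mean_with, 2))
```

With these figures the mean including the director is higher than every one of the five ordinary wages, so it no longer describes a typical employee.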

Quite often, the mean will be a theoretical value which cannot occur in practice (a famous example is "2.4 children"). This may prove to be a disadvantage.
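
To see how a mean can be a value that never occurs, here is a small sketch. The household figures are invented so that the mean comes out at 2.4; they are an assumption for illustration only.

```python
from statistics import mean

# Invented numbers of children in five households
children = [1, 2, 2, 3, 4]

# Mean: 12 / 5
mean_children = mean(children)

# Is the mean one of the values that actually occurred?
mean_is_data_item = mean_children in children

print(mean_children, mean_is_data_item)
```

No household can have a fractional number of children, so the mean here is not one of the data items.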



Median

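The median is largely unaffected by outliers and by skew.

It usually yields an answer which is one of the original data items. The exception is an even number of data items whose two central values differ, where the median falls between them.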

The median is a useful measure of the "central" data value. You will know that, in general, half of the data values in the set will be greater than the median and half will be less.
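
The sketch below shows both points about the median, using invented wage figures (an assumption for illustration): one very large value barely moves it, and it splits the data in half.

```python
from statistics import median

# Hypothetical wages; the last value is a deliberate outlier
wages = [18000, 20000, 22000, 24000, 26000, 150000]

# Median without the outlier: the middle of five sorted values
median_without = median(wages[:-1])

# Median with the outlier: mean of the two central values,
# 22000 and 24000
median_with = median(wages)

# Count how many values fall on each side of the median
below = sum(1 for w in wages if w < median_with)
above = sum(1 for w in wages if w > median_with)

print(median_without, median_with, below, above)
```

Because there is an even number of values here, the median lies between two data items and is not itself one of them.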

Mode

The mode is the only average which is certain to be one of the original data items.

It is also the only one which is suitable for qualitative (i.e. non-numerical) data.
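
For qualitative data, the mode is simply the most common category. Here is a minimal sketch; the list of car colours is made up for the example.

```python
from collections import Counter
from statistics import mode

# Invented survey of car colours (qualitative data)
colours = ["red", "blue", "red", "green", "blue", "red"]

# Tally how often each category occurs
tally = Counter(colours)

# The mode is the category with the highest count
favourite = mode(colours)

print(tally["red"], tally["blue"], tally["green"], favourite)
```

There is no sensible way to add up or order colours, so neither a mean nor a median can be found for this data.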

[Diagram: a negatively skewed distribution]

When the distribution of the data is skewed (see diagram), the mode becomes more a measure of popularity than a central value.

Also, you may find that more than one data item 'ties' for the most frequent. In this case you will have two or more modes.
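
A tie for most frequent can be sketched with multimode from Python's statistics module (Python 3.8 or later). The data are made up for the example.

```python
from statistics import multimode

# Made-up data in which 2 and 3 each occur twice
scores = [1, 2, 2, 3, 3, 4]

# multimode lists every value that ties for most frequent,
# in the order each first appears in the data
modes = multimode(scores)

print(modes, len(modes))
```

Here the data set has two modes, so no single value can be reported as "the" mode.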

