Although the concept of quartiles is reasonably easy to grasp, different statistical computer programs use slightly different methodologies to calculate it. There are at least 10 different methods which can produce slightly different results.
Excel has a QUARTILE() function which may produce results that don't always look obvious and may differ slightly from other calculations but are nevertheless valid and can be accepted. Attempting to exactly duplicate and check Excels calculation in every circumstance is difficult.
Statistics often involves trying to describe and compare different sets of data. There is often a trade off between usefulness and ease of understanding. Some of the more simple tools which can be readily understood provide a less reliable result.
One very simple tool to describe a set of data is the range of the values. By subtracting the smallest number from the largest, the range can describe the spread of values. e.g. [$F$20]=MAX(B5:B10)-MIN(B5:B10) This measure however can be distorted by extreme values. For example these two sets are similar except for an unusual extreme value in the second set:
|10, 13, 14, 12, 13, 19, 22||11, 14, 16, 88, 13, 20, 15|
One method of eliminating extreme values is to use an interquartile range. This measure excludes the lowest quarter and highest quarter of values and only uses the central values.
The Excel function takes the form QUARTILE(data_range, quart) where quart can be one of 5 values: 0 = minimum, 1 = 1st quartile, 2 = median, 3 = 3rd quartile, 4 = maximum. All data values in a range which are below the 1st quartile function result are within the first quartile.
In the following example the examination results of two different classes are compared. The range values indicate that Class A has a much wider dispersion of values. Although there are some extreme values, most students get similar results and the range is possibly misleading. The interquartile range is smaller and a more useful comparison.
The interquartile range is an absolute measure of dispersion because the values are in the same units as the original data (i.e. exam scores). The interquartile ratio is a relative measure of dispersion because the result is not in units.
[K4] =QUARTILE(B5:C12, 1)
The table above summarises the data from both classes. The pass marks have been based on all 16 students and a decision has been taken that the bottom quartile of students (who score less than 47.75) will fail the exam and the top quartile (who score above 64.5) will be awarded a distinction. The values in columns L, M and N have been calculated using the FREQUENCY() function.
|Get this information as a document
accompanied by Excel worksheets
|Click here for details about
file. It has been rewritten for Excel 2010.
|file: excel-quartiles.htm||Page last updated: May07||© MeadInKent.co.uk 2013||CMIDX S1 P8 Y|