Box-and-Whisker Plots

To understand box-and-whisker plots, you have to understand medians and quartiles of a data set.

The median is the middle number of a sorted set of data, or the average of the two middle numbers (if there is an even number of data points).

The median ($Q_2$) divides the data set into two parts, the upper half and the lower half. The lower quartile ($Q_1$) is the median of the lower half, and the upper quartile ($Q_3$) is the median of the upper half.

Example:

Find $Q_1$, $Q_2$, and $Q_3$ for the following data set, and draw a box-and-whisker plot.

$\{2, 6, 7, 8, 8, 11, 12, 13, 14, 15, 22, 23\}$

There are 12 data points. The middle two are 11 and 12, so the median, $Q_2$, is 11.5.

The "lower half" of the data set is the set $\{2, 6, 7, 8, 8, 11\}$. The median here is 7.5, so $Q_1 = 7.5$.

The "upper half" of the data set is the set $\{12, 13, 14, 15, 22, 23\}$. The median here is 14.5, so $Q_3 = 14.5$.
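
The median-of-halves calculation above can be sketched in Python. This is a minimal illustration, not a standard library routine: the function name `quartiles` is hypothetical, and the handling of an odd number of data points (leaving the middle value out of both halves) is an assumption about the convention in use. Other tools often compute quartiles by interpolation and may give slightly different values.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]
    # Assumption: with an odd count, the middle value belongs to neither half.
    upper = values[half + 1:] if n % 2 else values[half:]
    return median(lower), median(values), median(upper)

q1, q2, q3 = quartiles([2, 6, 7, 8, 8, 11, 12, 13, 14, 15, 22, 23])
print(q1, q2, q3)  # prints 7.5 11.5 14.5
```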

A box-and-whisker plot displays the values $Q_1$, $Q_2$, and $Q_3$, along with the extreme values of the data set (2 and 23, in this case):

A box-and-whisker plot shows a "box" with its left edge at $Q_1$, its right edge at $Q_3$, and a line inside the box at $Q_2$ (the median). The "whiskers" extend from the box to the minimum and maximum values.

Note that the plot divides the data into four parts, each holding about 25% of the data points. The left whisker represents the bottom 25% of the data, the left half of the box represents the second 25%, the right half of the box represents the third 25%, and the right whisker represents the top 25%.

Outliers

If a data value is very far away from the quartiles (either much less than $Q_1$ or much greater than $Q_3$), it is sometimes designated an outlier. Instead of being shown using the whiskers of the box-and-whisker plot, outliers are usually shown as separately plotted points.

The standard definition for an outlier is a number that is less than $Q_1$ or greater than $Q_3$ by more than 1.5 times the interquartile range ($\mathrm{IQR} = Q_3 - Q_1$). That is, an outlier is any number less than $Q_1 - (1.5 \times \mathrm{IQR})$ or greater than $Q_3 + (1.5 \times \mathrm{IQR})$.
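
The 1.5 × IQR rule can be written as a short Python sketch. The names `outlier_fences` and `is_outlier` and the parameter `k` are illustrative choices, not part of any library. The usage lines apply the rule to the quartiles from the first example ($Q_1 = 7.5$, $Q_3 = 14.5$).

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) cutoffs beyond which a value is an outlier."""
    iqr = q3 - q1  # interquartile range
    return q1 - k * iqr, q3 + k * iqr

def is_outlier(x, q1, q3, k=1.5):
    """True if x lies below the lower cutoff or above the upper cutoff."""
    low, high = outlier_fences(q1, q3, k)
    return x < low or x > high

# First example: IQR = 14.5 - 7.5 = 7, so the cutoffs are 7.5 - 10.5 and 14.5 + 10.5.
print(outlier_fences(7.5, 14.5))  # prints (-3.0, 25.0)
print(is_outlier(23, 7.5, 14.5))  # prints False
```

Every value in the first data set lies between 2 and 23, inside these cutoffs, so that data set has no outliers under this rule.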

Example:

Find $Q_1$, $Q_2$, and $Q_3$ for the following data set. Identify any outliers, and draw a box-and-whisker plot.

$\{5, 40, 42, 46, 48, 49, 50, 50, 52, 53, 55, 56, 58, 75, 102\}$

There are 15 values, arranged in increasing order. So $Q_2$ is the 8th data point, 50.

$Q_1$ is the 4th data point, 46, and $Q_3$ is the 12th data point, 56.

The interquartile range IQR is $Q_3 - Q_1$, or $56 - 46 = 10$.

Now we need to find whether there are values less than $Q_1 - (1.5 \times \mathrm{IQR})$ or greater than $Q_3 + (1.5 \times \mathrm{IQR})$.

$Q_1 - (1.5 \times \mathrm{IQR}) = 46 - 15 = 31$

$Q_3 + (1.5 \times \mathrm{IQR}) = 56 + 15 = 71$

Since 5 is less than 31, and 75 and 102 are greater than 71, there are 3 outliers.

The box-and-whisker plot is as shown. Note that 40 and 58, the smallest and largest values that are not outliers, are shown as the ends of the whiskers, with the outliers plotted separately.
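
To check the whole example at once, the following Python sketch gathers the numbers a box-and-whisker plot needs: the three quartiles, the whisker ends, and the outliers. The function name `box_plot_summary` and its dictionary keys are hypothetical. The sketch assumes the median-of-halves convention used in this example, where an odd-sized data set leaves its middle value out of both halves.

```python
from statistics import median

def box_plot_summary(data, k=1.5):
    """Collect quartiles, whisker ends, and outliers for a box-and-whisker plot."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]
    # Assumption: with an odd count, the middle value belongs to neither half.
    upper = values[half + 1:] if n % 2 else values[half:]
    q1, q2, q3 = median(lower), median(values), median(upper)

    iqr = q3 - q1
    low_cut, high_cut = q1 - k * iqr, q3 + k * iqr
    inside = [v for v in values if low_cut <= v <= high_cut]
    outliers = [v for v in values if v < low_cut or v > high_cut]

    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        "whisker_low": min(inside),   # smallest value that is not an outlier
        "whisker_high": max(inside),  # largest value that is not an outlier
        "outliers": outliers,
    }

data = [5, 40, 42, 46, 48, 49, 50, 50, 52, 53, 55, 56, 58, 75, 102]
summary = box_plot_summary(data)
print(summary["q1"], summary["median"], summary["q3"])    # prints 46 50 56
print(summary["whisker_low"], summary["whisker_high"])    # prints 40 58
print(summary["outliers"])                                # prints [5, 75, 102]
```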