# Outliers

An outlier is a value in a data set that is very different from the other values. That is, outliers are values unusually far from the middle.

In most cases, outliers pull the mean toward them, while the median and the mode are affected much less or not at all. Outliers therefore matter mainly because of their effect on the mean.

There is no single rule for identifying outliers. A common convention, used in some books, is to call a value an outlier if it lies more than $1.5$ times the interquartile range below the first quartile or above the third quartile.
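The $1.5$ times interquartile range convention can be sketched in Python as follows. This is a minimal illustration, not a standard routine: the function name `iqr_outliers` and the parameter `k` are hypothetical, and the quartiles are assumed to be the medians of the lower and upper halves of the ordered data (other quartile conventions exist and can give slightly different cut-offs).

```python
from statistics import median


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR beyond the quartiles.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves
    of the sorted data, with the middle value left out when the count is odd.
    Requires at least two values.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + (n % 2):]  # values above the median position
    q1 = median(lower)
    q3 = median(upper)
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    return [x for x in data if x < low_fence or x > high_fence]


print(iqr_outliers([1, 2, 3, 4, 5, 6, 7, 8, 100]))  # prints [100]
```

In the usage line, the quartiles are $2.5$ and $7.5$, so the upper cut-off is $7.5 + 1.5 \times 5 = 15$ and only $100$ is flagged.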

Plotting the data on a number line as a dot plot also helps in identifying outliers.

Example:

Find the outliers of the data set. Also find the mean of the data set including the outliers and excluding the outliers.

$15,75,20,35,25,85,30,30,15,25,30$

First arrange the data set in order.

$15,15,20,25,25,30,30,30,35,75,85$

Plot the data on a number line as a dot plot.
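A rough text version of such a dot plot can be produced with a few lines of Python. This is only a sketch: the helper name `dot_plot_lines` and the `step` parameter are hypothetical, and it assumes every value falls exactly on a tick (here, a multiple of $5$ starting from the smallest value).

```python
from collections import Counter


def dot_plot_lines(values, step=5):
    """Return one text row per tick from min to max, one '*' per data value.

    Assumption: every value lies exactly on a tick; values between ticks
    would not be shown.
    """
    counts = Counter(values)
    ticks = range(min(values), max(values) + 1, step)
    return [f"{t:>3} | {'*' * counts[t]}" for t in ticks]


data = [15, 15, 20, 25, 25, 30, 30, 30, 35, 75, 85]
for line in dot_plot_lines(data):
    print(line)
```

The rows for $40$ through $70$ come out empty, which makes the gap between the main cluster and the two largest values easy to see.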

The values $75$ and $85$ are far off the middle. So, these two values are outliers for the given data set.
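As a cross-check, the $1.5$ times interquartile range convention gives the same answer here. The sketch below assumes the quartiles are taken as the medians of the lower half $15,15,20,25,25$ and the upper half $30,30,35,75,85$ of the ordered data; other quartile conventions exist and can give slightly different values.

$Q_1=20,\quad Q_3=35,\quad \text{IQR}=Q_3-Q_1=15$

$Q_1-1.5\times\text{IQR}=20-22.5=-2.5$

$Q_3+1.5\times\text{IQR}=35+22.5=57.5$

Every value except $75$ and $85$ lies between $-2.5$ and $57.5$, so only those two values are flagged.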

Find the mean of the data including the outliers:

$\text{Mean}=\frac{\text{Sum of the data values}}{\text{Number of data values}}$

$=\frac{15+15+20+25+25+30+30+30+35+75+85}{11}$

$=35$

Find the mean of the data excluding the outliers:

$\text{Mean}=\frac{\text{Sum of the data values}}{\text{Number of data values}}$

$=\frac{15+15+20+25+25+30+30+30+35}{9}$

$=25$

The mean of the given data set is $35$ when outliers are included, but it is $25$ when outliers are excluded.
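The same comparison can be reproduced with Python's standard `statistics` module, which also makes it easy to look at the median and the mode. This is a small sketch: the helper `summarize` and the variable names are hypothetical, and the outliers $75$ and $85$ are simply taken from the example rather than detected by the code.

```python
from statistics import mean, median, mode

data = [15, 75, 20, 35, 25, 85, 30, 30, 15, 25, 30]
outliers = {75, 85}  # taken from the worked example, not computed here
trimmed = [x for x in data if x not in outliers]


def summarize(values):
    """Return the mean, median and mode of a list of numbers."""
    return {"mean": mean(values), "median": median(values), "mode": mode(values)}


with_outliers = summarize(data)
without_outliers = summarize(trimmed)

print("including outliers:", with_outliers)
print("excluding outliers:", without_outliers)
```

For this data set the mean drops from $35$ to $25$ when the outliers are removed, the median moves from $30$ to $25$, and the mode stays at $30$.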