A box plot displays the distribution of a data set in a standardized way using a five-number summary: minimum, first quartile (Q1), median, third quartile (Q3), and maximum. It shows graphically how much the data values vary or spread out. Measures of central tendency alone do not convey this, which is where the box plot helps. It is also a compact form of graphical representation.
It is used to identify:
- Outliers and their values
- Symmetry of the data
- How tightly the data is grouped
- Skewness of the data: whether it is skewed and in which direction
Important Terms of Box Plots
- Median – The mid-value of the data (the vertical line inside the box)
- First quartile – The mid-value between the lowest number and the median (lower quartile)
- Third quartile – The mid-value between the median and the largest value (upper quartile)
- Interquartile range – The range from the 25th percentile to the 75th percentile, that is, third quartile - first quartile
- Whiskers – The two lines outside the box, extending to the highest and lowest observations
- Maximum – Third quartile + 1.5 * (interquartile range); values above this limit are treated as outliers
- Minimum – First quartile - 1.5 * (interquartile range); values below this limit are treated as outliers
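The terms above can be computed directly. The following is a minimal sketch, assuming the quartiles are taken as the medians of the lower and upper halves of the sorted data (other quartile conventions exist and can give slightly different values) and that the data has at least two values. The function name `box_plot_summary` and the sample data are illustrative only.

```python
from statistics import median


def box_plot_summary(values):
    """Return quartiles, IQR, outlier limits, and outliers for a list of numbers."""
    data = sorted(values)
    half = len(data) // 2

    # Quartiles as medians of the lower and upper halves; when the count
    # is odd, the middle value is left out of both halves.
    q1 = median(data[:half])
    q3 = median(data[-half:])
    iqr = q3 - q1

    lower_limit = q1 - 1.5 * iqr  # "Minimum" in the list above
    upper_limit = q3 + 1.5 * iqr  # "Maximum" in the list above

    return {
        "q1": q1,
        "median": median(data),
        "q3": q3,
        "iqr": iqr,
        "lower_limit": lower_limit,
        "upper_limit": upper_limit,
        "outliers": [x for x in data if x < lower_limit or x > upper_limit],
    }


# Made-up sample data with one unusually large value.
summary = box_plot_summary([2, 4, 4, 5, 6, 7, 8, 9, 10, 30])
print(summary)
```

For this sample, the value 30 lies above the upper limit and is reported as an outlier.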
Box and Whisker Plot
A box and whisker plot is a method of summarizing a set of data measured on an interval scale. It is mostly used in data analysis. We use this type of graphical representation to know:
- The shape of the distribution
- Its central value
- Its variability
A box and whisker plot (boxplot) is a chart that shows data from a five-number summary. It does not show the distribution in as much detail as a stem and leaf plot or a histogram does. It is primarily used to indicate whether a distribution is skewed and whether potential unusual observations (also called outliers) are present in the data set. Boxplots are also very useful when a large number of data sets are involved or compared.
Since the centre, spread, and overall range are immediately apparent, boxplots make it easy to compare distributions.
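One way to put a number on the skewness that a boxplot shows visually is to compare the distance from the median to each quartile. The sketch below uses the quartile skewness coefficient (often attributed to Bowley), which is not part of the description above and is offered only as an illustration. It relies on `statistics.quantiles` from the Python standard library with its default quartile method, and assumes the interquartile range is not zero. The function name `quartile_skewness` is hypothetical.

```python
from statistics import quantiles


def quartile_skewness(values):
    """Positive: longer upper half of the box; negative: longer lower half."""
    q1, med, q3 = quantiles(values, n=4)
    # Compare the upper part of the box (Q3 - median) with the
    # lower part (median - Q1), scaled by the interquartile range.
    return ((q3 - med) - (med - q1)) / (q3 - q1)


# Made-up samples for comparison.
right_skewed = [1, 2, 2, 3, 3, 4, 5, 8, 12, 20]
symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]

print(quartile_skewness(right_skewed))
print(quartile_skewness(symmetric))
```

A value near zero suggests the median sits in the middle of the box, while a clearly positive or negative value suggests the median sits closer to one end.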
A box and whisker plot is also used for descriptive data analysis. The plotted graph shows the shape of the distribution, its central value, and its variability.
Box and Whisker Plot Chart
In a box and whisker plot:
- the ends of the box are the upper and lower quartiles, so the box spans the interquartile range
- a vertical line inside the box marks the median
- the two lines outside the box are the whiskers extending to the highest and lowest observations.
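The layout described above can be sketched as a one-line text drawing. This is a rough illustration rather than a plotting routine: it assumes the whiskers run to the lowest and highest observations, uses `statistics.quantiles` with the `inclusive` method for the quartiles, and the function name `ascii_boxplot` and the `width` parameter are made up for the example.

```python
from statistics import quantiles


def ascii_boxplot(values, width=41):
    """Draw a horizontal box and whisker plot as a single line of text."""
    data = sorted(values)
    lo, hi = data[0], data[-1]
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    span = (hi - lo) or 1  # avoid dividing by zero when all values are equal

    def col(x):
        # Map a data value to a character column between 0 and width - 1.
        return round((x - lo) / span * (width - 1))

    row = [" "] * width
    for c in range(col(lo), col(hi) + 1):
        row[c] = "-"  # whiskers
    for c in range(col(q1), col(q3) + 1):
        row[c] = "="  # box spanning the interquartile range
    row[col(lo)] = "|"  # lowest observation
    row[col(hi)] = "|"  # highest observation
    row[col(q1)] = "["  # lower quartile (one end of the box)
    row[col(q3)] = "]"  # upper quartile (other end of the box)
    row[col(med)] = "M"  # median line inside the box
    return "".join(row)


print(ascii_boxplot(range(41)))
```

In the printed line, the brackets mark the ends of the box, `M` marks the median, and the dashes on either side are the whiskers.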