How to Make Box Plot (Box and Whisker Chart) in Excel?

Box plots (also called box and whisker charts) provide a great way to visually summarize a dataset, and gain insights into the distribution of the data.

In this tutorial, we will discuss what a box plot is, how to make a box plot in Microsoft Excel (new and old versions), and how to interpret the results.

What is a Box Plot (Box and Whisker Chart)?

A box plot is used in statistical analysis to visualize the distribution in a set of data.

Box plot chart in Excel
An example of a Box Plot chart created in Excel

The box plot divides numerical data into ‘quartiles’ or four parts.

The main ‘box’ of the box plot is drawn between the first and third quartiles, with an additional line drawn to represent the second quartile, or the ‘median’.

The width of the box basically marks the most concentrated area of the data distribution.

A box plot can also contain ‘whiskers’ which are simply lines that depict the minimum and maximum values that are outside the first and third quartiles. 

Why Use a Box Plot?

A box plot is a great way to visually summarize your data using 5 descriptive indicators.

They can be quite helpful when you want to see how the data is distributed and the range to which the data extends.

In this way, box plots also help us find outliers and can tell if the data is symmetric or skewed.  

Box plots are especially helpful when you want to compare different sample distributions, and since it provides a 5-point summary of the entire dataset, it is really useful when the datasets you are studying are very large.

How to Interpret a Box Plot

Before we see how to create a box plot (also called a box and whisker chart), let us first learn to understand and interpret a box plot.

The image below shows employee salaries of a given company:

Employee salaries dataset

This data can be represented with a box and whisker plot as shown below:

Box plot chart explained

Let us interpret this chart by looking at its 5 indicators:

  • The tip of the top whisker (line) marks the maximum value in the data. From the above chart we can see that the highest salary paid by the company is $40,000.
  • The length of the top whisker represents how far the highest value is from the median of the dataset. This value can help us identify if the dataset is skewed or if there are any potential outliers.
  • The top edge of the box marks the third quartile of the data.
  • The bottom edge of the box marks the first quartile of the data.
  • The line that runs across the middle of the box is the second quartile or the median value of the data.
  • The vertical width of the box represents the spread in the data. In other words, it tells us if there is a large amount of variation in employee salaries or if they are more or concentrated around the median. In the above chart, we can see that the range of salaries is 40,000-10,000 = $30,000.
  • The length of the bottom whisker represents how far the lowest value is from the median of the dataset. This also helps us see if the dataset is skewed.
  • The tip of the bottom whisker marks the minimum value in the data (or the lowest salary paid by the company). From the above chart we can see that the highest salary paid by the company is $10,000.

How to Create a Box Plot Chart in Excel

Now that we understand box plots a little better, let us see how to create them in Excel.

Up until Excel 2016, there was no separate feature in Excel to create box plots.

As such, in older versions, if you need to create a box plot, you need to improvise a stacked chart and convert it into a box plot.

The box plot was included as a separate charting option only from Excel 2016 onwards.

So, in this tutorial, we will show you how to create a box plot in both older as well as newer Excel versions.

We are going to create the box plot based on the dataset shown below:

Dataset for the box plot chart

The above screenshot displays employee salaries for two different companies – Company A and Company B.

Let us see how to create a box plot from the above data in both older and newer Excel versions.

Creating a Box Plot in Newer Excel Versions (2016, 2019, & Office 365)

If you’re using newer versions of Excel-like Excel 2016 or Excel in Office 365, creating a box plot is much simpler.

This is because these versions already have a template to create a box plot. 

So you can generate a chart directly from your original dataset and you don’t need to compute quartiles or differences between the quartiles, etc.

For example, on Office 365, all you need to do is:

  1. Select the range of cells containing your data. In our case, we can directly select the cells in the range A1:A10.
  2. From the Insert tab, click on the icon for Insert Statistic Chart as shown in the image below:
Click on Insert Statistic Chart
  1. This displays a dropdown menu, from where you can select the ‘Box and Whisker’ chart.
Click on Box and Whisker chart
  1. You should now get a Box Plot of your data.
box and whisker chart
  1. By default, the box plot performs quartile calculation with exclusive median. To convert it into inclusive median, select the Series Options icon from your Format Data Series sidebar and check the radio button next to ‘Inclusive median’ under the ‘Quartile Calculation’ category:
select inclusive median

As you can see, in just a few clicks, you get a box plot ready. You can now go ahead and change the Chart title. 

Final box plot in Excel 365

Note: The crosses that you can see in each box indicate the mean of the dataset.

Creating a Box Plot in Older Excel Versions (2013, 2010, 2007)

Older Excel versions do not have chart templates to create a box plot, but you can create one from a stacked bar chart.

To create a box plot in Excel versions 2013 and older, you need to follow the steps shown below:

  1. Compute 5 summary descriptors from the data
  2. Compute the Quartile differences
  3. Create a stacked column chart from the above computed values
  4. Convert the stacked column chart into a box plot.

Let us see how to perform each of these steps in more detail.

Step 1: Compute the 5 Summary Descriptors from the Data

First, we need to compute the 5 summary descriptors.

For this, create a second table based on the main data, using the formulas shown below (Make sure they are in the same order as shown below too):

DescriptorFormula
Minimum Value=MIN(range)
First Quartile=QUARTILE.INC(range,1)
Median=QUARTILE.INC(range,2)
Third Quartile=QUARTILE.INC(range,3)
Maximum Value=MAX(range)

Here are the formulas that we used for our sample dataset:

Calculating 5 stats to make the box plot

Step 2: Compute the Quartile Differences

The next step is to find the differences between each of the above-computed values. These will help us specify the heights for every column segment of our stacked column chart.

Create a third table. In this table, specify the following values:

  • In the first row: The minimum values of your data. In our example, the formulas used are =E3 and =G3.
  • In the second row: The difference between the first quartile and minimum value. In our example, the formulas are =E4-E3 and =G4-G3.
  • In the third row: The difference between the second and first quartiles. In our example, the formulas are =E5-E4 and =G5-G4.
  • In the fourth row: The difference between the third and second quartiles. In our example, the formulas are =E6-E5 and =G6-G5.
  • In the fifth row: The difference between the maximum value and the third quartile. In our example, the formulas are =E7-E6 and =G7-G6.

Here’s a screenshot of the third table, after applying to our sample dataset:

Inter quartle differences

Step 3: Create a Stacked Column Chart from the Computed Values

Now that we have computed all the required differences, we can use these to specify the heights of each column segment for the column chart.

We will later modify this column chart and convert it into a box plot.

  1. Select the range of cells containing the third table (in our case, cell range J2:K7).
Select the third table
  1. From the Insert tab, select the Insert Column Chart button (under the Charts group).
Insert column chart
  1. Select the Stacked Column (under the 2-D Column category) from the dropdown that appears.
Click on the Stacked column option
  1. This creates a stacked column chart based on our selected range of cells. However, notice that this does not yet look like a box plot. This is because the chart was created by selecting data horizontally from your selection, rather than vertically.
Stacked column chart is inserted
  1. To reverse this, right click on the chart and click on ‘Select Data’ from the context menu that appears.
Click on Select Data
  1. Click on the Switch Row/Column button from the dialog box and click OK.
Click on Switch Row and Column

Your stacked column chart should now more or less resemble a box chart, with 5 segments for each column:

Stacked column chart that  looks more like the box plot

You can now go ahead and change the chart title and get rid of the legend if you need to.

Change the chart title

Step 4: Convert the Stacked Column Chart into a Box Plot

We are not done yet. We still need to convert this chart into a proper box plot.

For this, follow the steps shown below:

Remove the bottom segment from all the stacked columns
  1. Select the lowest segment of any of one of the stacked columns. You will notice that clicking on a segment of a single column automatically selects that segment in all the columns. 
Select the lowest segment of the cluster chart
  1. Next, select the Format tab from the Chart Tools tab and click on Format Selection from the Current Selection group. 
Click on Format selection
  1. This opens the Format Data Series sidebar to the right of your Excel window. Under Series options, click on the icon for Fill and line
Select Fill and Line option in the Format Data series pane
  1. Under the Fill category, check the radio button next to No Fill.
Click on No Fill
  1. The lowest segment of all your stacked columns should now disappear.
The lowest segment of the chart becomes invisible
Convert the top and second to bottom segments into whiskers for the box plot
  1. Select the second to bottom segment of the stacked columns.
Select the second to bottom segment
  1. Select the Design tab from the Chart Tools tab and click on Add Chart Element from the Chart Layouts group. 
Click on Add chart element
  1. From the dropdown menu that appears, select Error Bars->Standard Deviation.
Add a standard deviation bar
  1. This will add error bars to all the stacked columns as shown below.
Error bars are added to the columns in the chart
  1. Select the error bar on any one stacked column. It will automatically select the error bars on all the stacked columns. From the Format Error Bars sidebar, click on the icon for Error Bar Options.
Click on Error Bar Options
  1. Under the Direction category, check the radio button next to ‘Minus’. Under the End Style category, check the radio button next to ‘No Cap’ and under the Error Amount category, select the radio button next to Percentage. Change the percentage to 100% in the input box next to it.
Setting to apply to error bars
  1. Finally remove the second to bottom segment, so that only the lower whisker is visible. For this again select the segment, and click on No Fill from the Format Data Series.
whisker has been added to the chart
  1. Repeat the same steps (a. to h.) on the topmost segment of the stacked column chart, to convert it into the top whisker of the box plot.
Select the top segment of the chart
  1. So far, this is how your chart should be looking:
Both sides of the chart have whiskers
Change the fill and outline for the middle 2 sections:
  1. Select the second from top section of the stacked column chart (the yellow segment in our sample chart). 
change the fill and outline of the chart
  1. From the Format Data Series sidebar, select the icon for Fill and Line under Series Options and check the radio button next to Solid Fill
  2. Select an appropriate Fill color and Transparency. For our sample chart, we used the default ‘Blue, Accent 1’ color and a transparency of 60%.
Solid fill the chart
  1. Under the Border category, check the radio button next to Solid line and select an appropriate color and transparency for the box border. We left it at the default settings.
Select the border setting
  1. Repeat the steps a. To d. for the grey segment of your stacked chart too.
Repeat the same steps to give color and border to second segment

Your end result should like an actual box plot, as shown in the screenshot below:

Box and Whisker plot end result

Limitations of the Box Plot

There are a few points to note about box plots:

  • The box plot is good at representing bell shaped or Gaussian distributions, but can hide certain important insights in case of bimodal or other non-Gaussian distributions.
  • Box plots are not very intuitive at first glance. So, people who are not really familiar with concepts like quartiles, etc. might not be able to understand or interpret a box plot correctly.

These limitations aside, if your data is more or less symmetrically distributed around the median, then the box plot can help you understand the distribution at a glance, even if you have a really large dataset. 

Moreover, the box plot can be quite helpful in understanding the distribution and range of data. As such, the merits of the box plot far outnumber the demerits. 

In this tutorial, we showed you how to create a box plot using both older and newer versions of Excel. We hope it was helpful for you.

Other articles you may also like: