Arithmetic Mean and Range

Representative Values

In statistics, the arithmetic mean (AM), or simply the average, is the ratio of the sum of all observations to the total number of observations. The arithmetic mean can also inform or model concepts outside of statistics. In a physical sense, the arithmetic mean can be thought of as a center of gravity. The standard deviation can then be thought of as a measure of the typical distance of the data points from the mean. The square of the standard deviation (i.e. the variance) is analogous to the moment of inertia in the physical model.

Say, for example, you wanted to know the weather in Shimla. On the internet you would find temperatures for a great many days: past records, present readings and predictions for the future. Wouldn’t all this be extremely confusing? Instead of such a long list, mathematicians use representative values, each of which summarizes a wide range of data. Instead of the weather for every particular day, we use terms such as average (arithmetic mean), median and mode to describe the weather over a month or so.

There are several types of representative values used by mathematicians in data handling, namely:

  • Arithmetic Mean (Average)
  • Range
  • Median
  • Mode

Of the four above, mean, median and mode are types of average.

Arithmetic Mean Definition

The arithmetic mean is the number obtained by dividing the sum of the elements of a set by the number of values in the set. So you can use the everyday term “average” or be a little fancier and say “arithmetic mean”. Take your pick: they both mean the same thing.


If a data set consists of the values b1, b2, b3, …, bn, then the arithmetic mean B is defined as:

B = (Sum of all observations) / (Total number of observations)

\(B = \frac{1}{n}\sum_{i=1}^n b_i = \frac{b_1+b_2+b_3+\dots+b_n}{n}\)
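The definition above can be sketched in a few lines of Python. This is only an illustration: the function name `arithmetic_mean` and the sample data are assumptions made up for this example, not part of the original text.

```python
def arithmetic_mean(values):
    """Return the sum of the observations divided by their count."""
    if not values:
        # The mean of an empty data set is undefined (division by zero).
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Made-up illustrative observations b1, ..., bn.
observations = [4, 8, 15, 16, 23, 42]
print(arithmetic_mean(observations))  # 108 / 6 = 18.0
```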

If n observations x1, x2, …, xn have corresponding frequencies f1, f2, …, fn, the arithmetic mean is computed using the formula

\(\bar{x} = \frac{x_{1}f_{1}+x_{2}f_{2}+\dots+x_{n}f_{n}}{N}\)

or, using Sigma notation,

\(\bar{x} = \frac{\sum_{i=1}^{n}x_{i}f_{i}}{N}\)

where N = f1 + f2 + … + fn.

The above formula can also be used to find the weighted arithmetic mean by taking f1, f2,…., fn as the weights of x1, x2,….., xn. 
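As a rough sketch of the frequency (or weighted) form of the mean, the snippet below multiplies each value by its frequency and divides by the total frequency N. The function name `weighted_mean` and the marks and frequencies are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
def weighted_mean(values, weights):
    """Return sum(x_i * f_i) / N, where N is the sum of the weights f_i."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)  # this is N = f1 + f2 + ... + fn
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(x * f for x, f in zip(values, weights)) / total_weight


# Made-up data: the value 10 occurs twice, 20 three times, 30 five times.
marks = [10, 20, 30]
frequencies = [2, 3, 5]
print(weighted_mean(marks, frequencies))  # (20 + 60 + 150) / 10 = 23.0
```

With all frequencies equal to 1, this reduces to the ordinary arithmetic mean.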

When the frequencies divided by N are replaced by probabilities p1, p2, ……,pn we get the formula for the expected value of a discrete random variable.

X = x1p1 + x2p2 + … + xnpn

or, using Sigma notation,

\(X = \sum_{i=1}^{n}x_{i}p_{i}\)
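To make the expected-value formula concrete, here is a small sketch for a fair six-sided die, an example assumed for illustration and not taken from the text above. The helper name `expected_value` is hypothetical, and exact fractions are used so that the result is not affected by floating-point rounding.

```python
import math
from fractions import Fraction


def expected_value(values, probabilities):
    """Return sum(x_i * p_i) for a discrete random variable."""
    if len(values) != len(probabilities):
        raise ValueError("values and probabilities must have the same length")
    # The probabilities of all outcomes should add up to 1.
    if not math.isclose(float(sum(probabilities)), 1.0):
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in zip(values, probabilities))


# Assumed example: a fair die, each face has probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6
print(expected_value(faces, probs))  # 7/2, i.e. 3.5
```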

Representative Values of Data

We see representative values used quite regularly in our daily life. When you ask about the mileage of a car, you are asking for a representative value of the distance traveled per unit of fuel consumed. Similarly, a representative temperature quoted for Shimla doesn’t mean that the temperature in Shimla is constantly at that value, but that overall it averages out to it. Average here means a number that expresses a central or typical value in a set of data, calculated as the sum of the values divided by the number of values.

Arithmetic Mean Explanation using an Example

The arithmetic mean uses two basic mathematical operations, addition and division, to find a central value for a set of values. If you wanted to find the arithmetic mean of the runs scored by Virat Kohli in his last few innings, all you would have to do is add up his runs to obtain the total and then divide it by the number of innings. For example:

The arithmetic mean of Virat Kohli’s batting scores, also called his batting average, is:

Sum of runs scored / Number of innings = 661/10 = 66.1

The arithmetic mean of his scores in the last 10 innings is 66.1. If we add another score to this sum, say from his 11th innings, the arithmetic mean will change accordingly. If the runs scored in the 11th innings are 70, the new average becomes:

(661 + 70)/11 = 731/11 ≈ 66.45
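The effect of an extra innings on the average can be checked with a short sketch. Only the total (661 runs), the number of innings (10) and the new score (70) come from the example above. The function name `mean_after_new_score` is a hypothetical helper introduced for illustration.

```python
def mean_after_new_score(total, count, new_score):
    """Return the mean once one more observation joins the data set."""
    return (total + new_score) / (count + 1)


total_runs = 661  # sum of runs over the last 10 innings
innings = 10

print(total_runs / innings)  # 66.1, the average before the 11th innings
new_average = mean_after_new_score(total_runs, innings, 70)
print(round(new_average, 2))  # 731 / 11, rounded to 66.45
```

Because the new score of 70 is higher than the old average of 66.1, the average rises slightly. A score below 66.1 would have pulled it down.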

Average is a pretty neat tool, but it comes with its own set of problems. Sometimes it doesn’t represent the situation accurately enough. I’ll show you what I mean. Let’s take the results of a class test, for example. Say there are 10 students in a class and they recently took a test out of 100 marks. There are two scenarios here.

First:  50, 53, 50, 51, 48, 93, 90, 92, 91, 90

Second: 71, 72, 70, 75, 73, 74, 75, 70, 74, 72

Why don’t you calculate the arithmetic mean of both the sets above? You will find that the two sets have similar arithmetic means (70.8 for the first and 72.6 for the second) even though the individual values differ hugely. In this respect, relying completely on the arithmetic mean can occasionally be misleading. From the point of view of the students scoring around 50 out of 100 in the first set, the second scenario is quite different, since everyone there scores in the 70s. The same applies to the students scoring around 90: compared with them, the marks in the second set are lower. So the results mean something different for the two classes, yet the averages are nearly the same. In the first class the performance varies widely, with some students doing very well and some not so well, whereas in the other class the performance is fairly uniform. Therefore we need an extra representative value to help reduce this ambiguity.
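If you would rather check the two means with a few lines of code, the following sketch does the arithmetic for the two sets of marks listed above. The variable names are illustrative assumptions.

```python
first = [50, 53, 50, 51, 48, 93, 90, 92, 91, 90]
second = [71, 72, 70, 75, 73, 74, 75, 70, 74, 72]

# Arithmetic mean = sum of the marks / number of students.
mean_first = sum(first) / len(first)
mean_second = sum(second) / len(second)

print(mean_first)   # 708 / 10 = 70.8
print(mean_second)  # 726 / 10 = 72.6
```

The two means differ by less than two marks, even though the first set mixes marks near 50 with marks near 90 while the second set stays in the 70s.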


Range

Range, as the word suggests, represents the difference between the largest and the smallest value in the data. This helps us determine the interval over which the data is spread. Take the previous example once again: there are 10 students in the class and they recently took a test out of 100 marks. There are two scenarios here.

First:  50, 53, 50, 51, 48, 93, 90, 92, 91, 90

Second: 71, 72, 70, 75, 73, 74, 75, 70, 74, 72

The range in the first scenario is the difference between the largest value, 93, and the smallest value, 48. The range therefore is:

Range in First set = 93 – 48 = 45

In the second scenario, the range is the difference between the largest value, 75, and the smallest value, 70.

Range in the second set = 75 – 70 = 5

The difference in the value of the range between the two scenarios lets us estimate how widely the values are spread: the larger the range, the further apart the values lie. This gives us the extra information that we do not get from the average.
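The two range calculations can be reproduced with a minimal sketch. The helper name `data_range` is an assumption made for this example. The marks are the two sets used above.

```python
def data_range(values):
    """Return the difference between the largest and the smallest value."""
    if not values:
        raise ValueError("the range of an empty data set is undefined")
    return max(values) - min(values)


first = [50, 53, 50, 51, 48, 93, 90, 92, 91, 90]
second = [71, 72, 70, 75, 73, 74, 75, 70, 74, 72]

print(data_range(first))   # 93 - 48 = 45
print(data_range(second))  # 75 - 70 = 5
```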