What is statistics? – Mean

Statistics can be explained with formal statistical and mathematical terminology, or without it. Here, we will try to make some statistical terms simple and understandable. LaTeX, a markup language for typesetting math, will be used wherever it is needed.

Defining some Statistical terms
Population: The word “Population” refers to the complete group being studied, e.g. all the people who live in a city or all the birds that live in a region.
Sample: A sample is a part of the whole population, e.g. 100 people from a city that has 10000 inhabitants.

[Figure: Population vs. Sample]
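As a small illustration of the two terms, here is a minimal Python sketch. It assumes a hypothetical city whose 10000 inhabitants are identified by the numbers 1 to 10000, and it assumes the 100 people are picked at random; the names `population` and `sample` are chosen only for this example.

```python
import random

random.seed(0)  # fixed seed, only so that the sketch is repeatable

# Hypothetical population: every inhabitant of the city, numbered 1..10000.
population = list(range(1, 10001))

# A sample is a part of the population: here, 100 inhabitants picked at random.
sample = random.sample(population, 100)

print(len(population))  # size of the population
print(len(sample))      # size of the sample
```

Every member of the sample also belongs to the population, but the sample is much smaller and therefore easier to measure.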

Central Tendency: Central tendency refers to the middle point, or center, of a set of numbers (a dataset). The arithmetic mean, the mode, and the median are the measures used most often, but other measures of central tendency also exist.
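The three measures named above can be compared on one small dataset. This is only a sketch: the numbers in `data` are made up for illustration, and the functions come from Python's standard `statistics` module.

```python
import statistics

# Hypothetical dataset, invented only for this illustration.
data = [1, 2, 2, 3, 7]

mean_value = statistics.mean(data)      # sum of the values divided by their count
median_value = statistics.median(data)  # middle value of the sorted data
mode_value = statistics.mode(data)      # most frequent value

print(mean_value)    # 3
print(median_value)  # 2
print(mode_value)    # 2
```

Here the large value 7 pulls the mean up to 3, while the median and the mode both stay at 2, which is why the three measures can give different "centers" for the same data.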

Definition of the Mean / average – Theoretical definition
The (arithmetic) mean is defined as (1) the sum of all the numbers in a set or series, (2) divided by the size of that set / series. Note that the (arithmetic) mean is a measure of central tendency.

Statistical definition of the mean / average
The statistical symbol used for the (arithmetic) mean of a sample is “\bar{x}”, and the statistical symbol used for the (arithmetic) mean of a population is “\mu”. Specifically, the sample mean is defined statistically as: \bar{x}=\frac{x_{1}+x_{2}+x_{3}+\dots+x_{n}}{n} or, equivalently:

\scriptsize{\bar{x}=\frac{1}{n}\sum_{i=1}^{n}x_{i}=\frac{1}{n}(x_{1}+\dots+x_{n})}

Note that \sum indicates a summation: all the elements that follow it, here x_{1} up to x_{n}, must be added together.

The symbol n shows the size of the sample set or series, e.g. the number of daily calorie measurements (calories that you take in or lose) recorded over a year. The year is the set that includes all the numbers, while the index in x_{1}, x_{2}, x_{3} shows the position of each number inside this set / series, e.g. 1 for the calories of the 1st day, 2 for the 2nd day, 3 for the 3rd day, etc.
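The formula can also be written as a short Python function. This is a minimal sketch, not a library routine: the name `arithmetic_mean` is chosen only for this example, and raising an error for an empty dataset is an assumption made here, since dividing by n = 0 is undefined.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by their count n."""
    n = len(values)  # n: the size of the set / series
    if n == 0:
        # Assumption for this sketch: an empty dataset has no mean.
        raise ValueError("the mean of an empty dataset is undefined")
    return sum(values) / n  # (x_1 + x_2 + ... + x_n) / n


print(arithmetic_mean([1, 2, 3]))  # 2.0
```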

Statistical Example I
You have measured how many kilometers you have run per day, for five (5) days. You would like to calculate the mean or “the average” of your running distance per day. In order to find this:

i) You must add the distances that you have run each day:
\small{x_{1}=1.5,\ x_{2}=2,\ x_{3}=2,\ x_{4}=1.5,\ x_{5}=3 \Rightarrow \sum x_{i}=1.5+2+2+1.5+3=10} and then:

ii) You must divide by the size of your dataset. Here, it is “5 days”: \small{n}=5.

iii) By applying these numbers into the appropriate positions in the mean formula, the following result is found: \bar{x}=\frac{1.5+2+2+1.5+3}{5}=\frac{10}{5}=2

iv) This means that you have run, on average, a distance of 2 kilometers per day.
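The same steps can be checked with a few lines of Python. This is a sketch that uses the five distances from the example; the variable names are chosen only for illustration.

```python
# Distances run per day, in kilometers, over five days (from the example).
distances_km = [1.5, 2, 2, 1.5, 3]

total_km = sum(distances_km)   # step i: add the distances
n_days = len(distances_km)     # step ii: the size of the dataset
mean_km = total_km / n_days    # step iii: apply the mean formula

print(total_km)  # 10.0
print(n_days)    # 5
print(mean_km)   # 2.0
```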

[Figure: mean running distance per day]

Statistical Example II
You observed how many gulls were flying each hour, over 7 hours, in your region and you would like to calculate the average number (mean) of your observations:

i) You must add all your observations about how many gulls were flying per hour:
\small{x_{1}=2,\ x_{2}=0,\ x_{3}=3,\ x_{4}=2,\ x_{5}=2,\ x_{6}=0,\ x_{7}=3 \Rightarrow \sum x_{i}=2+0+3+2+2+0+3=12} and then:

ii) You must divide by the size of your dataset. Here, it is “7 hours”: n=7.

iii) By applying these numbers into the appropriate positions in the mean formula, the following result is found: \bar{x}=\frac{2+0+3+2+2+0+3}{7}=\frac{12}{7}, therefore \bar{x}\approx 1.71.
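As a quick check of this example, here is a short Python sketch using the seven hourly counts listed in step i; the variable names are chosen only for illustration. Adding the seven counts gives a total of 12 gulls.

```python
# Gulls observed flying in each of the seven hours (from the example).
gulls_per_hour = [2, 0, 3, 2, 2, 0, 3]

total_gulls = sum(gulls_per_hour)     # step i: add the observations
n_hours = len(gulls_per_hour)         # step ii: the size of the dataset
mean_gulls = total_gulls / n_hours    # step iii: apply the mean formula

print(total_gulls)           # 12
print(n_hours)               # 7
print(round(mean_gulls, 2))  # 1.71
```

Note that the mean does not have to be a value that can actually be observed: nobody sees 1.71 gulls, but it still describes the center of the counts.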

[Figure: mean number of gulls per hour]

Note that many types of “mean” exist, such as the harmonic mean or the geometric mean. They are also measures of central tendency. Each “mean” is useful in different situations in statistics.
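To show that these means really differ, here is a small Python sketch. The dataset is made up for illustration, and the functions come from Python's standard `statistics` module (`geometric_mean` needs Python 3.8 or newer).

```python
import statistics

# Hypothetical dataset, invented only for this comparison.
data = [1, 2, 4]

arithmetic = statistics.mean(data)           # (1 + 2 + 4) / 3
geometric = statistics.geometric_mean(data)  # cube root of 1 * 2 * 4
harmonic = statistics.harmonic_mean(data)    # 3 / (1/1 + 1/2 + 1/4)

print(arithmetic, geometric, harmonic)
```

For this dataset the arithmetic mean is the largest of the three and the harmonic mean is the smallest, with the geometric mean in between.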

Resources
Online LaTeX code translator
Wikipedia: Mean