What is statistics? – Standard Deviation and Variance for Grouped data

Standard Deviation and Variance for Grouped Data
Statistical Formula
The Population Variance for Grouped data can be calculated by using the following formula:

\sigma ^{2}=\frac{1}{\sum(f)}*[(\sum fX^2)-\frac{(\sum fX)^2}{\sum f}]

while the Sample Variance for Grouped data can be calculated by using the following formula:

s^{2}=\frac{1}{\sum(f)-1}*[(\sum fX^2)-\frac{(\sum fX)^2}{\sum f}]
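
As a short sketch of why the bracketed term appears in both formulas: it is the sum of squared distances of the middle points from the Mean, weighted by the frequencies. Here \mu stands for the Mean of the grouped data, which is an assumption of notation at this point in the text. Expanding the square gives:

```latex
\mu = \frac{\sum fX}{\sum f}

\sum f(X-\mu)^2 = \sum fX^2 - 2\mu\sum fX + \mu^2\sum f
                = \sum fX^2 - \frac{(\sum fX)^2}{\sum f}
```

Dividing this quantity by \sum f gives the Population Variance, and dividing it by (\sum f)-1 gives the Sample Variance.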

Symbol explanation
The \sum means that the results of the math operation written after it must be added together.
The \sigma^{2} and the s^{2} are the symbols for the Population Variance and the Sample Variance, respectively.
The f is the Frequency, that is, the number of observations in each range.
The X is the middle point of each range. For example, the age group of 20-30 has the middle point \frac{20+30}{2}=25 and the age group of 31-40 has the middle point \frac{31+40}{2}=35.5.
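
The two formulas and the middle point rule can be sketched in Python as follows. This is a minimal illustration, not a reference implementation; the function names `midpoint` and `grouped_variance` and the `sample` flag are hypothetical names chosen here.

```python
def midpoint(lower, upper):
    """Middle point X of a range, e.g. the age group 20-30."""
    return (lower + upper) / 2


def grouped_variance(midpoints, freqs, sample=False):
    """Variance of grouped data from middle points X and frequencies f.

    sample=False divides by sum(f) (Population Variance);
    sample=True divides by sum(f) - 1 (Sample Variance).
    """
    n = sum(freqs)                                               # sum of f
    sum_fx = sum(f * x for f, x in zip(freqs, midpoints))        # sum of f*X
    sum_fx2 = sum(f * x ** 2 for f, x in zip(freqs, midpoints))  # sum of f*X^2
    divisor = n - 1 if sample else n
    return (sum_fx2 - sum_fx ** 2 / n) / divisor


print(midpoint(20, 30))  # 25.0
print(midpoint(31, 40))  # 35.5
```

Keeping the divisor as the only difference between the two cases mirrors the formulas above, where the bracketed term is identical.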

Statistical Example Ι
The following table shows grouped data about the times that you walked your dog outside the house, grouped into three-hour ranges within a day. For example, you took your dog for a walk six (6) times between 1 and 3 o’clock and ten (10) times between 4 and 6 o’clock.

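| Hour Groups | X (middle point of the hour group) | f | fX or f*X | X^2 | f(X^2) or f*(X^2) |
|---|---|---|---|---|---|
| 1 - 3 | 2 | 6 | 12 | 4 | 24 |
| 4 - 6 | 5 | 10 | 50 | 25 | 250 |
| 7 - 9 | 8 | 8 | 64 | 64 | 512 |
| 10 - 12 | 11 | 4 | 44 | 121 | 484 |
| Total --> | | 28 | 170 | 214 | 1270 |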

Replacing the symbols with the numbers of Example I

The \sum(f)=28 is the sum of all the frequencies. That is, it is the total number of times that you went for a walk with your dog between 1 and 12 o’clock in a day.
The \sum(fX) means that the products f*X of each hour group must be added together, that is: \sum(fX)=6*2+10*5+8*8+4*11=12+50+64+44=170.
The \sum(fX^2) means that the products f*(X^2) of each hour group must be added together, that is: \sum(fX^2)=6*4+10*25+8*64+4*121=24+250+512+484=1270.
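
The three sums can be checked with a few lines of Python. This is a small sketch using the middle points and frequencies from the table; the variable names are chosen here for illustration.

```python
# Middle points X and frequencies f of the four hour groups in the table.
midpoints = [2, 5, 8, 11]
freqs = [6, 10, 8, 4]

sum_f = sum(freqs)                                           # sum of f
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))        # sum of f*X
sum_fx2 = sum(f * x ** 2 for f, x in zip(freqs, midpoints))  # sum of f*X^2

print(sum_f, sum_fx, sum_fx2)  # 28 170 1270
```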

[Figure: Dogs_Variance_example]

Results for Example I
By placing these numbers into the appropriate positions of the formulas for the Population Variance for Grouped data / Grouped frequencies and the Sample Variance for Grouped data / Grouped frequencies, we get:

Population Variance:
\sigma ^{2}=\frac{1}{28}*[(1270)-\frac{(170)^2}{28}]=\frac{1}{28}*[(1270)-\frac{28900}{28}]
\sigma ^{2}=\frac{1}{28}*[(1270)-1032]=\frac{1}{28}*(238)=8.5

Sample Variance:
s^{2}=\frac{1}{28-1}*[(1270)-\frac{(170)^2}{28}]=\frac{1}{27}*[(1270)-\frac{28900}{28}]
s^{2}=\frac{1}{27}*[(1270)-1032]=\frac{1}{27}*(238)=8.8

Therefore, if we assume that the current dataset includes all the possible observations for the subject of interest, the Population Variance is \sigma ^{2}=8.5. If instead these observations are only a part of the population, that is, one of many samples that could be taken from the target population, then the Sample Variance is s^2=8.8.
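
The arithmetic can be reproduced in Python as a quick check. This sketch keeps full precision, so the second decimal differs slightly from the one-decimal values quoted in the text; the variable names are chosen here for illustration.

```python
# Totals taken from the table of the example.
sum_f, sum_fx, sum_fx2 = 28, 170, 1270

# Bracketed term shared by both formulas: 1270 - 28900/28.
bracket = sum_fx2 - sum_fx ** 2 / sum_f

pop_var = bracket / sum_f         # Population Variance, divides by sum(f)
samp_var = bracket / (sum_f - 1)  # Sample Variance, divides by sum(f) - 1

print(round(bracket, 2), round(pop_var, 2), round(samp_var, 2))
# 237.86 8.49 8.81
```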

Calculating Standard Deviation for Grouped Frequencies
The Standard Deviation for a Population and for a Sample of grouped data is the square root of the Population Variance and of the Sample Variance for Grouped Frequencies, respectively. The Mean for grouped data is also included below. Therefore:

Standard Deviation and Mean for Population:
\sqrt{\sigma ^{2}}=\sqrt{8.5}=2.92

\mu=\frac{\sum fX}{\sum f}=\frac{170}{28}=6.07

Standard Deviation and Mean for Sample:
\sqrt{s^{2}}=\sqrt{8.8}=2.97

\bar{X}=\frac{\sum fX}{\sum f}=\frac{170}{28}=6.07

Note that the (\sum f)-1 divisor applies only to the Sample Variance. The Mean is divided by \sum f for both the Population and the Sample.

Interpretation of the Results for Example I
The Variance and the Standard Deviation show how far the values of a dataset are spread around their Mean. We also calculated the Mean for Grouped Data because it is usually reported together with the Standard Deviation in the following form: 6.07 \pm 2.97. Therefore, if you go one Standard Deviation below the Mean and one Standard Deviation above the Mean, \bar{X} \pm 1s, you get the range from 3.10 to 9.04 or, more simply, from about 3 to 9 o’clock. Note that these are values on the hour scale, not frequencies: they suggest that a typical walk took place between about 3 and 9 o’clock. If you look back at the table, the hour groups 4 - 6 and 7 - 9 fall inside this range and together hold 10 + 8 = 18 of the 28 walks.
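
As a rough check of this interpretation, the sketch below computes the Mean and the Sample Standard Deviation from the table and then the range of one Standard Deviation around the Mean. It assumes the Mean of grouped data is \sum fX divided by \sum f; the variable names are chosen here for illustration.

```python
import math

# Middle points X and frequencies f of the four hour groups in the table.
midpoints = [2, 5, 8, 11]
freqs = [6, 10, 8, 4]

n = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
sum_fx2 = sum(f * x ** 2 for f, x in zip(freqs, midpoints))

mean = sum_fx / n                                               # grouped Mean
sample_sd = math.sqrt((sum_fx2 - sum_fx ** 2 / n) / (n - 1))    # sqrt of s^2

low, high = mean - sample_sd, mean + sample_sd  # Mean +/- 1 Standard Deviation
print(f"{mean:.2f} +/- {sample_sd:.2f}")
print(f"range: {low:.2f} to {high:.2f}")
```

The range is expressed in hours, so it should be compared with the Hour Groups column of the table and not with the f column.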

Interpretation of Results for Example Ι: Your friend
Let’s suppose that your friend has also written down how many times he went for a walk with his dog, using the same hour ranges as you, on the same day as you did. The only result that he reported to you was the Variance, s^2=25, and thus you can suggest that his Standard Deviation was s=5. No other information was reported to you by him.

Note that these values are higher than the values you obtained. Therefore, you can suggest that his walks were more spread out across the hour ranges than yours, that is, they lie farther from their Mean on average. Note, however, that the Variance alone does not tell you whether your friend walked his dog earlier or later in the day than you did. In order to suggest this, you must also know the Mean of his grouped data and compare it with the Mean of your grouped data. Likewise, it does not tell you whether he went out more or fewer times than you; for that you would need his total frequency \sum f.