Section 1.2: Descriptive Statistics
Learning Objectives
At the end of this section you should be able to answer the following questions:
- What is the concept of central tendency?
- What is the concept of dispersion?
So, let’s say you had measured the height of everyone you know. All of those responses by themselves don’t tell you much, beyond the height of each individual.
However, what we are after is a way to explain what the height of a typical person within this group might be. That is the concept of central tendency.
For example, let’s say you wanted to work out the average height of all your friends. The most obvious way to do that is to look at the mean. The mean is simply all of the numbers (their height measurements) added together and then divided by the number of responses (the number of friends whose heights you have collected). In contrast, if the numbers were listed in ascending order (from the shortest person to the tallest person), the number in the middle would be the median; with an even number of responses, the median is the mean of the two middle numbers. Another measure of central tendency is the mode, which is the number that appears most often. Finally, the range is the difference between the lowest and highest values. Unlike the mean, median and mode, the range describes how far the data extend rather than where the typical value lies.
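As a rough illustration of these four statistics, here is a minimal Python sketch using the standard library's statistics module. The heights are invented for the example (they are not real data), and the variable names are only placeholders.

```python
from statistics import mean, median, multimode

# Hypothetical heights in centimetres, invented for illustration only.
heights_cm = [160, 165, 165, 170, 172, 178, 180]

# Mean: add all the values together, then divide by how many there are.
mean_height = mean(heights_cm)

# Median: the middle value once the data are sorted in ascending order.
median_height = median(heights_cm)

# Mode: the value(s) that appear most often. multimode returns a list,
# because a data set can have more than one mode.
modes = multimode(heights_cm)

# Range: the difference between the highest and lowest values.
height_range = max(heights_cm) - min(heights_cm)

print(mean_height, median_height, modes, height_range)
```

With these made-up heights the mean and the median happen to be the same, which will not generally be true of real data.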
These measures cover the main ways to look at the central numbers of a data set, but how can we tell how the data is distributed across the range? Are the responses all relatively close together, or are they spread widely apart? What we want is a way to describe the general scattering of individual responses across the range. That is the concept of variability or dispersion.
A common way to determine how much the responses differ from each other is to look at the standard deviation. To work out the standard deviation for the height of all your friends, you would first calculate the mean. Then, for each individual response, you would subtract the mean and square the result, giving a squared difference from the mean. Next, you would calculate the mean of those squared differences (this value is called the variance). Finally, you would take the square root of that value.
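The steps above can be sketched in Python as follows. This is only an illustration: the function name population_sd and the heights are assumptions made up for the example. Because the steps divide by the number of responses, the result matches what is usually called the population standard deviation. Many statistical programs instead divide by one less than the number of responses (the sample standard deviation) by default, so their output may differ slightly.

```python
import math
from statistics import pstdev

# Hypothetical heights in centimetres, invented for illustration only.
heights_cm = [160, 165, 165, 170, 172, 178, 180]


def population_sd(values):
    """Standard deviation following the steps described in the text."""
    # Step 1: calculate the mean.
    m = sum(values) / len(values)
    # Step 2: subtract the mean from each value and square the result.
    squared_diffs = [(x - m) ** 2 for x in values]
    # Step 3: take the mean of the squared differences (the variance).
    variance = sum(squared_diffs) / len(values)
    # Step 4: take the square root of the variance.
    return math.sqrt(variance)


sd = population_sd(heights_cm)

# The standard library's pstdev performs the same calculation.
library_sd = pstdev(heights_cm)

print(round(sd, 2), round(library_sd, 2))
```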
If you did these calculations using paper and a pen, it could take a while if you have a lot of friends. A statistical program can perform all of these calculations for you!
The standard deviation tells you how much the responses generally vary from the mean. A low standard deviation means that most of the numbers are close to the mean. A high standard deviation means that the numbers are more spread out. The standard error of the mean is found by dividing the standard deviation by the square root of the total number of responses. It tells you how accurate the mean of any given sample from a population is likely to be when compared to the true population mean: the smaller the standard error, the closer the sample mean is likely to be to the population mean.
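The standard error calculation can be sketched in a few lines of Python. The function name standard_error and the heights are assumptions made up for this example. The sketch also assumes the data are a sample from a larger population, so it uses the sample standard deviation (statistics.stdev, which divides by one less than the number of responses). That is a choice made here for illustration, not something the text above specifies.

```python
import math
from statistics import stdev

# Hypothetical heights in centimetres, invented for illustration only.
heights_cm = [160, 165, 165, 170, 172, 178, 180]


def standard_error(values):
    """Standard deviation divided by the square root of the sample size.

    Assumption: the sample standard deviation (stdev) is used here.
    """
    return stdev(values) / math.sqrt(len(values))


se = standard_error(heights_cm)
print(round(se, 2))
```

Because the standard deviation is divided by the square root of the number of responses, collecting more responses tends to shrink the standard error even when the spread of the data stays about the same.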
These types of descriptive statistics are the basic information you would provide for describing your data.