A measure of central tendency specifies where the data are centred. For a return series, a measure of central tendency shows where the empirical distribution of returns is centred; it is essentially a measure of the “expected” return based on the observed sample.
Measures of location include not only measures of central tendency (mean, median, and mode) but also other measures that illustrate other aspects of the location or distribution of the data.
Measures of Central Tendency
The arithmetic mean is the sum of the values of the observations in a dataset divided by the number of observations.
The sample mean or average, $\bar{X}$, is the arithmetic mean value of a sample:
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
where $X_i$ is the $i$th observation and n is the number of observations in the sample.
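As a minimal sketch of the definition above, the sample mean can be computed directly as a sum divided by a count. The function name `sample_mean` and the return figures are illustrative assumptions, not values from the source.

```python
def sample_mean(values):
    """Arithmetic mean: sum of the observations divided by their number."""
    if not values:
        raise ValueError("the sample must contain at least one observation")
    return sum(values) / len(values)


# Hypothetical annual returns, in percent.
returns_pct = [2.0, -1.0, 3.0, 4.0, -3.0]
print(sample_mean(returns_pct))
```

Python's standard library offers `statistics.mean` for the same calculation; the hand-written version is shown only to mirror the formula.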
The median is the value of the middle item of a dataset that has been sorted into ascending or descending order. In an odd-numbered sample of n observations, the median is the value of the observation that occupies the (n + 1)/2 position. In an even-numbered sample, we define the median as the mean of the values of the observations occupying the n/2 and (n + 2)/2 positions (the two middle observations).
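The position rule described above can be sketched as follows. The function name `sample_median` and the sample values are assumptions made for illustration; positions in the text are 1-based, so each is shifted by one for Python's 0-based indexing.

```python
def sample_median(values):
    """Median using the (n + 1)/2 rule for odd n and the
    mean of positions n/2 and (n + 2)/2 for even n."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the sample must contain at least one observation")
    if n % 2 == 1:
        # Odd n: the single middle item at 1-based position (n + 1)/2.
        return ordered[(n + 1) // 2 - 1]
    # Even n: average the two middle items at positions n/2 and (n + 2)/2.
    lower = ordered[n // 2 - 1]
    upper = ordered[(n + 2) // 2 - 1]
    return (lower + upper) / 2


print(sample_median([5, 1, 3]))      # odd-numbered sample
print(sample_median([4, 1, 3, 2]))   # even-numbered sample
```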
The mode is the most frequently occurring value in a dataset. A dataset can have more than one mode, or even no mode. When a dataset has a single value that is observed most frequently, its distribution is said to be unimodal. If a dataset has two most frequently occurring values, then it has two modes and its distribution is referred to as bimodal. When all the values in a dataset are different, the distribution has no mode because no value occurs more frequently than any other value.
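The cases described above (unimodal, bimodal, no mode) can be illustrated with a short sketch. The function name `modes` is hypothetical, and the code follows the convention stated in the text that a dataset whose values are all different has no mode.

```python
from collections import Counter


def modes(values):
    """Return the most frequently occurring values, sorted.

    An empty list means the dataset has no mode (every value occurs once).
    """
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # All values are different, so no value occurs more often than another.
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([1, 2, 2, 3]))     # one mode: unimodal
print(modes([1, 1, 2, 2, 3]))  # two modes: bimodal
print(modes([1, 2, 3]))        # no mode
```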
Dealing with Outliers
Three options exist for dealing with extreme values:
Option 1: Do nothing; use the data without any adjustment.
Option 2: Delete all the outliers.
Option 3: Replace the outliers with another value.
A trimmed mean, which corresponds to Option 2, is the arithmetic mean computed after excluding a stated small percentage of the lowest and highest values.
A winsorised mean, which corresponds to Option 3, is calculated after assigning one specified low value to a stated percentage of the lowest values in the dataset and one specified high value to a stated percentage of the highest values in the dataset.
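The two adjusted means can be compared in a small sketch. Several details here are assumptions rather than part of the source: the function names, the parameter `tail_percent` (an integer percentage applied to each tail), rounding the number of affected observations down, and using the nearest remaining observation as the replacement value when winsorising. The data are invented for illustration.

```python
def _tail_count(n, tail_percent):
    """Number of observations affected in EACH tail (rounded down)."""
    k = n * tail_percent // 100
    if tail_percent < 0 or 2 * k >= n:
        raise ValueError("tail_percent leaves no observations to average")
    return k


def trimmed_mean(values, tail_percent):
    """Mean after dropping the lowest and highest tail_percent of values."""
    ordered = sorted(values)
    n = len(ordered)
    k = _tail_count(n, tail_percent)
    kept = ordered[k:n - k]
    return sum(kept) / len(kept)


def winsorised_mean(values, tail_percent):
    """Mean after replacing each tail with the nearest remaining value."""
    ordered = sorted(values)
    n = len(ordered)
    k = _tail_count(n, tail_percent)
    low, high = ordered[k], ordered[n - k - 1]
    # Clamp every observation into the range [low, high].
    adjusted = [min(max(v, low), high) for v in ordered]
    return sum(adjusted) / n


data = [-50, 1, 2, 3, 4, 5, 6, 7, 9, 100]  # two extreme values
print(sum(data) / len(data))      # unadjusted mean (Option 1)
print(trimmed_mean(data, 10))     # drops -50 and 100
print(winsorised_mean(data, 10))  # replaces -50 with 1 and 100 with 9
```

Trimming shrinks the sample, whereas winsorising keeps the sample size and only pulls the extremes inward, which is why the two results generally differ.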
Measures of Location
Quartiles divide the distribution into quarters, quintiles into fifths, deciles into tenths, and percentiles into hundredths. The interquartile range (IQR) is the difference between the third quartile and the first quartile, or $\text{IQR} = Q_3 - Q_1$.
In a box and whisker plot, the box spans from the lower bound of the second quartile to the upper bound of the third quartile, with the median or arithmetic average noted as a measure of central tendency of the entire distribution. The whiskers are the lines that run from the box and are bounded by the “fences”, which represent the lowest and highest values of the distribution.
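As a rough illustration of quartiles and the interquartile range, the sketch below uses `statistics.quantiles` from the Python standard library (Python 3.8 or later). The source does not say how quantile positions are interpolated, so the choice of the `"exclusive"` method is an assumption, and other conventions can give slightly different cut points. The function name `quartiles_and_iqr` and the data are hypothetical.

```python
from statistics import quantiles


def quartiles_and_iqr(values):
    """Return (Q1, Q2, Q3, IQR) for a sample."""
    # n=4 asks for the three cut points that split the data into quarters.
    q1, q2, q3 = quantiles(values, n=4, method="exclusive")
    return q1, q2, q3, q3 - q1


sample = [1, 2, 3, 4, 5, 6, 7]
q1, q2, q3, iqr = quartiles_and_iqr(sample)
print(q1, q2, q3, iqr)
```

Passing `n=5`, `n=10`, or `n=100` to the same function call yields quintile, decile, or percentile cut points, respectively.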










