Showing posts with label skew. Show all posts

Thursday, September 16, 2010

Median

The median is the middle point of the dataset. An equal number of items lie below and above this value.

The dataset must be ordered before the median can be determined.

The number of items (n) will determine where the median is located.

median rank = (n+1)/2

For example:

1 , 1 , 2 , 4 , 6 , 2 , 9 , 3 , 7 , 5 , 2 , 5 , 9 , 6

Ordered:
1 , 1 , 2 , 2 , 2 , 3 , 4 , 5 , 5 , 6 , 6 , 7 , 9 , 9

Total count:
n=14

Median rank = (14+1)/2 = 7.5

Since this dataset has an even number of items (n=14), the median rank is not a whole number, so the median lies between the 7th and 8th positions. The 7th value is 4 and the 8th value is 5. The value in between is (4+5)/2 = 4.5, so the median is 4.5. There are 7 items on either side of this value.

For datasets with an odd number of items, the median rank is a whole number, and the median is the value at that position.

To illustrate:

1 , 1 , 2 , 4 , 6 , 2 , 9 , 3 , 7 , 5 , 2 , 5 , 9 , 6, 4

Ordered:
1 , 1 , 2 , 2 , 2 , 3 , 4 , 4 , 5 , 5 , 6 , 6 , 7 , 9 , 9

Median rank = (15+1)/2 = 8

The value in the 8th position is 4, so the median is 4. There are 7 items on either side of it.
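The rank rule above can be sketched in a few lines of Python. This is only an illustration of the (n+1)/2 procedure from this post; the function name `median_by_rank` and the variable names are made up for the example.

```python
def median_by_rank(values):
    """Return the median using the (n + 1) / 2 rank rule."""
    ordered = sorted(values)          # the data must be ordered first
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")
    rank = (n + 1) / 2                # 1-based median rank
    if n % 2 == 1:
        # Odd n: the rank is a whole number, so take that position directly.
        return ordered[int(rank) - 1]
    # Even n: the rank ends in .5, so average the two neighbouring positions.
    lower = ordered[int(rank) - 1]    # e.g. the 7th value when the rank is 7.5
    upper = ordered[int(rank)]        # e.g. the 8th value when the rank is 7.5
    return (lower + upper) / 2


even_data = [1, 1, 2, 4, 6, 2, 9, 3, 7, 5, 2, 5, 9, 6]   # n = 14
odd_data = even_data + [4]                               # n = 15

print(median_by_rank(even_data))  # 4.5
print(median_by_rank(odd_data))   # 4
```

Python's standard library also provides `statistics.median`, which should give the same results for these two datasets.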

The mean, median, and mode are all measures of central tendency. The skew can be determined by comparing these three measures.




Mean

The mean is the average value in the dataset.

It is calculated by adding up the data values (x), then dividing by the number of items (n).

The mean of a sample is traditionally labelled x-bar. The mean of a population is labelled µ (mu).

sum(x)/n = x-bar

For example, find the mean of the following sample dataset:

10
12
1
16
10
11
13
6
15
6

sum(x) = 10+12+1+16+10+11+13+6+15+6 = 100

n=10

x-bar = 100/10 = 10

The mean is 10.
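As a quick check, the same calculation can be written in Python. This is a minimal sketch of sum(x)/n; the function name `sample_mean` is just an illustrative choice.

```python
def sample_mean(values):
    """Return sum(x) / n for a non-empty list of numbers."""
    total = sum(values)   # sum(x)
    n = len(values)       # number of items
    return total / n      # x-bar


sample = [10, 12, 1, 16, 10, 11, 13, 6, 15, 6]

print(sum(sample))          # 100
print(len(sample))          # 10
print(sample_mean(sample))  # 10.0
```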

The mean is also the "center" of the data, in the sense that the differences between each value and the mean sum to zero. This is because the positive differences exactly cancel the negative ones.

Check this, using the above example:

10 - 10 = 0
12 - 10 = 2
1 - 10 = -9
16 - 10 = 6
10 - 10 = 0
11 - 10 = 1
13 - 10 = 3
6 - 10 = -4
15 - 10 = 5
6 - 10 = -4


0 + 2 + -9 + 6 + 0 + 1 + 3 + -4 + 5 + -4 = 0
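The same check can be done in Python. This sketch reuses the sample from this post; the variable names are illustrative only. With other datasets, floating-point rounding can leave a tiny non-zero remainder, so comparing with `math.isclose` is a safer habit than testing for exact equality.

```python
import math

sample = [10, 12, 1, 16, 10, 11, 13, 6, 15, 6]
x_bar = sum(sample) / len(sample)          # mean of the sample, 10.0

# Difference of each value from the mean.
deviations = [x - x_bar for x in sample]

print(deviations)       # [0.0, 2.0, -9.0, 6.0, 0.0, 1.0, 3.0, -4.0, 5.0, -4.0]
print(sum(deviations))  # 0.0

# Tolerant comparison, in case rounding leaves a tiny remainder.
print(math.isclose(sum(deviations), 0.0, abs_tol=1e-9))  # True
```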

The mean, median, and mode are all measures of central tendency. The skew can be determined by comparing these three measures.

Tuesday, September 14, 2010

Mode

Mode is the value in a dataset that appears the most frequently.

For example:

In the following sample, the mode is 5:

1
1
2
5
6
2
9
5
7
5
2
5
9
5

Count the number of times 5 appears. It appears the most, so it is the mode.

Some datasets have more than one mode.

If there is a single mode, the term 'unimodal' is used. The example above is unimodal: there are five 5's, and no other value appears as often. Had there also been five 2's, the example would no longer be unimodal. Both 5 and 2 would then be modes, and the dataset would be called bimodal.
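Counting by hand gets tedious for larger datasets, so here is a small Python sketch that finds every mode. The function name `modes` is invented for this example, and the second dataset is a hypothetical variation of the sample (two extra 2's added) to show the two-mode case described above.

```python
from collections import Counter


def modes(values):
    """Return all values that appear most frequently, in ascending order."""
    counts = Counter(values)          # value -> number of appearances
    highest = max(counts.values())    # the largest count (fails on empty input)
    return sorted(v for v, c in counts.items() if c == highest)


sample = [1, 1, 2, 5, 6, 2, 9, 5, 7, 5, 2, 5, 9, 5]
print(Counter(sample)[5])   # 5 (there are five 5's)
print(modes(sample))        # [5]

# Hypothetical variation: two more 2's, so there are five 2's and five 5's.
two_modes = sample + [2, 2]
print(modes(two_modes))     # [2, 5]
```

The standard library's `statistics.multimode` (Python 3.8 and later) does a similar job, returning modes in the order they are first encountered.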

The mean, median, and mode are all measures of central tendency. The skew can be determined by comparing these three measures.
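One common way to make that comparison is a rule of thumb: if the mean is greater than the median, the data tends to be skewed to the right; if the mean is less than the median, it tends to be skewed to the left. The Python sketch below applies that rule to the mode example. The function name `skew_hint` is invented for this illustration, and the rule is only a rough guide that can mislead for small or multimodal datasets.

```python
import statistics


def skew_hint(values):
    """Rough skew direction from comparing the mean with the median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "right"   # mean pulled above the median by high values
    if mean < median:
        return "left"    # mean pulled below the median by low values
    return "none"        # mean equals median: no skew suggested


sample = [1, 1, 2, 5, 6, 2, 9, 5, 7, 5, 2, 5, 9, 5]

print(statistics.mean(sample))       # 64/14, roughly 4.57
print(statistics.median(sample))     # 5.0
print(statistics.multimode(sample))  # [5]
print(skew_hint(sample))             # left
```

Here the mean sits slightly below the median and mode, which hints at a mild skew to the left.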