Simple statistics you NEED to know - the median and long tails


If you ever hear survival times quoted, you need to understand that what the doctor or researcher is usually telling you is not their personal opinion, but what the evidence shows about survival.

The evidence is almost always quoted as the average time people survive in these circumstances.

The type of average that is used is the median.

It is not the same as the average you may be thinking of (called the mean) where you add up scores and divide by the number of scores.

The median is simply the score of the guy in the middle when everyone is lined up in order of their scores.

The median in each of the lines of numbers below is 48:

47 48 49

12 24 36 48 60 72 84

6 12 18 24 30 36 42 48 54 60 66 72 78 84 90

That's because when you put the numbers in order, the one in the middle is 48.
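For anyone who wants to check this, here is a minimal sketch in Python using the standard library's statistics module. The variable names (rows, medians) are just illustrative choices, and the numbers are the three lines above.

```python
from statistics import median

# The three evenly spaced lines of numbers from the text above.
rows = [
    [47, 48, 49],
    [12, 24, 36, 48, 60, 72, 84],
    [6, 12, 18, 24, 30, 36, 42, 48, 54, 60, 66, 72, 78, 84, 90],
]

# median() sorts the values and returns the middle one
# (each line here has an odd count, so there is a single middle value).
medians = [median(row) for row in rows]
print(medians)  # [48, 48, 48]
```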

But each of the lines of numbers above is unusual - the numbers are evenly spaced.

The real world is not like that.

Survival in the real world is more like the line below:

6 12 18 33 37 40 42 48 60 72 84 108 156 204 252

When these numbers are lined up, the middle guy still has 48, so the median is still 48.

But as we move to the right of the middle guy, the numbers spread out.

Because of the way this looks when we graph it, we say it has a "long tail".

(Statisticians may say it is "skewed to the right".)
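One rough way to see the long tail without drawing a graph is to compare the median with the mean. The sketch below, again in Python with illustrative variable names, uses the 15 made-up survival times from the line above. When the mean is pulled well above the median, that is a sign of a long tail to the right.

```python
from statistics import mean, median

# The 15 example survival times (in months) from the text above.
survival_months = [6, 12, 18, 33, 37, 40, 42, 48, 60, 72, 84, 108, 156, 204, 252]

middle = median(survival_months)   # the middle guy: 48
average = mean(survival_months)    # add them up (1172) and divide by 15

print(middle)             # 48
print(round(average, 1))  # 78.1
```

The few very long survival times drag the mean up to about 78 months, while the median stays at 48.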

So, if these men had been told that expected survival in their circumstances was 48 months, they might not have appreciated that:

  • 20% of them (3 of the 15) would live 13 years or longer; and
  • the last guy would die 21 years later.
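Those two figures can be checked against the same 15 example numbers. This is only a sketch on the made-up data above, not real survival statistics, and the names (threshold_months, share, longest_years) are chosen just for illustration.

```python
# The 15 example survival times (in months) from the text above.
survival_months = [6, 12, 18, 33, 37, 40, 42, 48, 60, 72, 84, 108, 156, 204, 252]

threshold_months = 13 * 12  # 13 years = 156 months

# Everyone who lived 13 years or longer.
long_survivors = [m for m in survival_months if m >= threshold_months]
share = len(long_survivors) / len(survival_months)

# The longest survival time, converted from months to years.
longest_years = max(survival_months) / 12

print(long_survivors)   # [156, 204, 252]
print(f"{share:.0%}")   # 20%
print(longest_years)    # 21.0
```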

What this means for prostate cancer survivors is this:

  • It is not unrealistic to hope to survive longer than the median - half the men will - and because survival curves often have long tails, much longer survival is possible.
  • You should be wary of men who claim to have unexpectedly long survival because of {... insert a treatment, diet, belief, practice here ...}. They may well just be lucky to be in that long tail.

