# Why an expected value is not always expected

When you read a poll or see its outcome reported on TV, the result is almost always accompanied by a statement of its uncertainty. A typical political poll quotes an uncertainty of around 3 or 4 percentage points. What this uncertainty means can be interpreted in different ways. The most basic definition is that if the poll were truly unbiased and repeated, the same result would be obtained, to within the quoted uncertainty, in about 95 out of 100 tries (95% being the conventional confidence level). So if the poll said that 70% of a given population were expected to vote a certain way, with a quoted uncertainty of 3%, then if the poll were redone 100 times using a perfectly random sample each time, about 95 of those 100 samples would be expected to give a result between 67% and 73%, assuming the true value is close to 70%. Here the result is the share of respondents planning to vote the same way as the 70% in the original sample.

Another way of describing the uncertainty is to say that if the true population value were found by asking everyone (if the total population could be surveyed at all), then the true value would fall within the range estimated from the sample poll 95% of the time under identical conditions. The expected range here is the sample value plus or minus the uncertainty, as described above.
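As a rough illustration of where a figure like 3% can come from, the sketch below uses the common normal-approximation formula for the margin of error of a sample proportion, z * sqrt(p(1 - p) / n). The function name `margin_of_error`, the sample size of 897 and the 95% multiplier of 1.96 are assumptions made for this example; real polling organisations often apply further adjustments that are not shown here.

```python
import math


def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Approximate margin of error for a sample proportion.

    Uses the normal approximation z * sqrt(p(1 - p) / n), where z = 1.96
    corresponds to the conventional 95% confidence level.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must lie between 0 and 1")
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


# Hypothetical poll: 70% of 897 randomly chosen respondents agree.
p_hat, n = 0.70, 897
moe = margin_of_error(p_hat, n)
print(f"{p_hat:.0%} +/- {moe:.1%}")  # prints: 70% +/- 3.0%
```

Under these assumptions, a simple random sample of roughly 900 people is enough to give the 3% uncertainty used in the example above.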

The uncertainty comes primarily from the random sampling nature of the measurement. If the sampling is truly random, obtaining exactly the true population value from a random subsample is not a reasonable expectation. What is reasonable is that as the sample size increases, the sample will become a better approximation for the population as a whole. Generally with a sample taken from a population, it is expected that the sample average may approximate the population value but will likely give a value somewhat larger or smaller than the true population value. Calculating the uncertainty for a given sample as it relates to the population requires college level algebra, probability and statistics and so is not generally a trivial matter.
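The idea that a random sample rarely lands exactly on the true value, yet usually lands close to it, can be illustrated with a small simulation. This is only a sketch: the true proportion of 0.70, the sample size of 897, the number of repeats and the helper names `simulate_polls` and `fraction_within` are all assumptions chosen for the example.

```python
import random


def simulate_polls(true_p, sample_size, repeats, seed=0):
    """Simulate repeated unbiased polls of a population with a known true proportion."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    results = []
    for _ in range(repeats):
        # Each respondent independently answers "yes" with probability true_p.
        yes = sum(1 for _ in range(sample_size) if rng.random() < true_p)
        results.append(yes / sample_size)
    return results


def fraction_within(results, centre, tolerance):
    """Fraction of poll results lying within +/- tolerance of centre."""
    hits = sum(1 for r in results if abs(r - centre) <= tolerance)
    return hits / len(results)


if __name__ == "__main__":
    polls = simulate_polls(true_p=0.70, sample_size=897, repeats=1000)
    print("lowest and highest poll result:", min(polls), max(polls))
    print("fraction within 3 points of the true value:",
          fraction_within(polls, 0.70, 0.03))
```

Individual simulated polls scatter above and below 0.70, while the fraction landing within three points of the true value should come out near 0.95.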

All measurements have some uncertainty associated with them. Even the most precise measurement of length using a micrometer is limited by the number of digits it is able to display. Technically, infinite precision would require infinite digits. Even if a micrometer could be made with infinite digits, each measurement of the same length would have to be performed identically to give an identical result. Each user would also have a slightly different technique, producing different values even when measuring the same length. Similarly, as different objects are measured (even by the same user), the particular difficulties of each measurement introduce some specific bias and imprecision into each value.

This is not to discourage measurement of any kind but rather to recognize the quality of a measurement. Knowing whether you have a high or low quality measurement is very important information. Random samples require large numbers in the sample to obtain higher precision of a proportion estimate. Material property measurements such as density, mass, length and temperature generally require high quality instrumentation to obtain results with small uncertainty.

Put another way, in the realms of science, engineering and technology, measurement is gospel. This is not the end of the story by any means as a measurement without an understanding of its uncertainty is meaningless.

There are two types of uncertainty that need to be considered. One type is bias, where there is a consistent shift in the same direction for a measurement. This could be a tendency to consistently overestimate a value, or alternatively to consistently underestimate it, by a constant amount. The other type of uncertainty is the random component, which will sometimes be greater and sometimes less than the true value but is generally more likely to be close to the true value than far from it. How far a measurement can plausibly fall from the expected average depends on a number of variables, such as the sample size and the inherent uncertainty in any given single value.
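The difference between the two types can be sketched with simulated readings. The model below, reading = true value + constant bias + Gaussian noise, is a simplifying assumption, and the function name `simulate_measurements` and all of the numbers are hypothetical.

```python
import random
import statistics


def simulate_measurements(true_value, bias, noise_sd, n, seed=0):
    """Simulate n readings that share a constant bias and have independent random noise."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]


# Hypothetical instrument: reads 0.2 units too high, with random scatter of 0.5 units.
true_value = 10.0
readings = simulate_measurements(true_value, bias=0.2, noise_sd=0.5, n=10_000)

# Averaging many readings shrinks the random component but leaves the bias in place.
print(f"average error of the readings: {statistics.mean(readings) - true_value:+.3f}")
print(f"scatter of single readings:    {statistics.stdev(readings):.3f}")
```

In this sketch the average error settles near the bias of 0.2 however many readings are taken, which is why bias has to be found and corrected by calibration rather than by collecting more data.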

The expected variation in the mean of a sample is generally inversely proportional to the square root of the sample size. This means that, to a first approximation, you have to increase your sample size by a factor of 4 to decrease the uncertainty in the sample mean by a factor of approximately 2. Similarly, increasing the sample size by a factor of 9 would be required to reduce the uncertainty in the sample mean by a factor of approximately 3.
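The square-root relationship can be checked numerically with the textbook standard error of the mean, sigma / sqrt(n), which assumes independent observations. The function name `standard_error`, the single-value spread of 1.0 and the base sample size of 100 are assumptions made for this sketch.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of a sample mean, assuming n independent observations."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sigma / math.sqrt(n)


sigma = 1.0   # assumed spread of a single measurement
base_n = 100  # assumed starting sample size

for factor in (1, 4, 9):
    n = base_n * factor
    print(f"n = {n:4d}: standard error = {standard_error(sigma, n):.4f}")
```

Going from 100 to 400 observations halves the standard error, and going from 100 to 900 cuts it to a third.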
