You know from earlier lectures that the estimator bo in a simple model depends on how error is defined (count of errors = mode, sum of absolute differences = median, sum of squared differences = mean). This chapter evaluates the different estimators under reasonable assumptions about error.
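As a quick illustration of how the error definition picks the estimator, the sketch below searches a grid of candidate values for bo and keeps the one that minimizes each kind of error. The data values, the function names, and the grid search itself are assumptions made only for this example; they do not come from the text.

```python
import statistics

# Hypothetical sample, chosen only for illustration.
data = [1, 2, 2, 3, 5, 9, 13]

def count_errors(b0, ys):
    """Number of observations that the estimate fails to match exactly."""
    return sum(1 for y in ys if y != b0)

def sum_abs_errors(b0, ys):
    """Sum of absolute differences between each observation and the estimate."""
    return sum(abs(y - b0) for y in ys)

def sum_sq_errors(b0, ys):
    """Sum of squared differences (SSE) between each observation and the estimate."""
    return sum((y - b0) ** 2 for y in ys)

# Candidate estimates from 1.00 to 13.00 in steps of 0.01.
candidates = [i / 100 for i in range(100, 1301)]

best_count = min(candidates, key=lambda b: count_errors(b, data))
best_abs = min(candidates, key=lambda b: sum_abs_errors(b, data))
best_sq = min(candidates, key=lambda b: sum_sq_errors(b, data))

print("minimizes count of errors:", best_count, "mode:", statistics.mode(data))
print("minimizes absolute error: ", best_abs, "median:", statistics.median(data))
print("minimizes squared error:  ", best_sq, "mean:", statistics.mean(data))
```

For this sample each grid winner should coincide with the corresponding summary statistic. With an even number of observations the absolute-error minimum is not unique, which is why an odd-sized sample is used here.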
If the population has a true value ßo and we take a sample from that population, we would like to estimate ßo with bo. ßo could be the mode, median, or mean. The mode is so inconsistent that it is easily eliminated from the rest of the discussion. The remainder of this chapter focuses on whether the median or the mean is the better estimator.
To determine which is better, we construct sampling distributions of medians and means and use those distributions to evaluate the effectiveness of each.
In sampling distributions, the peaks of both the median and mean are at ßo, so both are unbiased.
The distribution of the mean is a bit tighter around its peak, so it is a bit more efficient.
With larger sample sizes, both sampling distributions become tighter around ßo, so both are consistent; the mean continues to be the more efficient of the two.
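A small simulation can make these comparisons concrete. The sketch below builds approximate sampling distributions of the mean and the median by repeatedly drawing samples with normally distributed errors. The true value (50), the error standard deviation (10), the sample sizes (20 and 100), the number of replications, and the helper name sampling_distribution are all assumptions chosen for illustration, not values from the text.

```python
import random
import statistics

def sampling_distribution(estimator, n, reps, beta0, sigma, rng):
    """Draw `reps` samples of size `n` from Normal(beta0, sigma) and
    return the list of estimates, one per sample."""
    return [
        estimator([rng.gauss(beta0, sigma) for _ in range(n)])
        for _ in range(reps)
    ]

rng = random.Random(0)                    # fixed seed so the run is repeatable
BETA0, SIGMA, REPS = 50.0, 10.0, 2000     # assumed values, illustration only

results = {}
for n in (20, 100):
    means = sampling_distribution(statistics.fmean, n, REPS, BETA0, SIGMA, rng)
    medians = sampling_distribution(statistics.median, n, REPS, BETA0, SIGMA, rng)
    results[n] = {
        # Center of each sampling distribution (compare with BETA0 for bias).
        "mean_center": statistics.fmean(means),
        "median_center": statistics.fmean(medians),
        # Spread of each sampling distribution (smaller = more efficient).
        "mean_spread": statistics.stdev(means),
        "median_spread": statistics.stdev(medians),
    }
    print(n, results[n])
```

Under these assumptions, both centers should land close to 50 (unbiasedness), the spread of the means should be smaller than the spread of the medians at each sample size (efficiency), and both spreads should shrink as n grows from 20 to 100 (consistency).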
Unbiasedness, consistency, and efficiency are desirable attributes for our estimators, but many estimators have these properties. The mean is the most efficient if we assume that ERRORS are normally distributed.
The values for the median are in the text.
If the distribution of errors is normal, then the mean is the most efficient unbiased estimator. However, when we adopt the mean as the model's estimate, we also adopt the sum of squared errors (SSE) as the measure of error.
We make other assumptions about error,
Exercise