Simple Models: Models of ERROR and Sampling Distributions.

This chapter is concerned with the error term in the MODEL. The full MODEL might be
MODEL: Yi = β0 + Ei
The model says that, were it not for the random perturbations (misses) in the data that Ei captures, every Yi would equal β0.

You know from earlier lectures that the estimate b0 in a simple model depends on how error is defined: counting the errors leads to the mode, summing the absolute errors leads to the median, and summing the squared errors leads to the mean. This chapter evaluates these estimators under reasonable assumptions about error.
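The link between the definition of error and the resulting estimate can be checked numerically. The sketch below is only an illustration: the five data values, the function names, and the grid of candidate estimates are all made up for this example. It searches the grid for the candidate that minimizes each kind of error and compares the winners with the mode, median, and mean.

```python
import statistics

# Hypothetical sample, invented purely for illustration.
data = [2, 3, 3, 5, 12]

def count_errors(b0, ys):
    """Error as a count: how many observations the estimate misses."""
    return sum(1 for y in ys if y != b0)

def sum_abs_errors(b0, ys):
    """Error as the sum of absolute differences."""
    return sum(abs(y - b0) for y in ys)

def sum_sq_errors(b0, ys):
    """Error as the sum of squared differences (SSE)."""
    return sum((y - b0) ** 2 for y in ys)

# Candidate estimates from 2.00 to 12.00 in steps of 0.01.
candidates = [i / 100 for i in range(200, 1201)]

best_count = min(candidates, key=lambda b: count_errors(b, data))
best_abs = min(candidates, key=lambda b: sum_abs_errors(b, data))
best_sq = min(candidates, key=lambda b: sum_sq_errors(b, data))

print("count of errors   ->", best_count, "mode   =", statistics.mode(data))
print("absolute errors   ->", best_abs, "median =", statistics.median(data))
print("squared errors    ->", best_sq, "mean   =", statistics.mean(data))
```

A grid search is used here only because it makes the idea visible; it is not how these estimates are computed in practice.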

If the population has a true value β0 and we take a sample from it, we would like to estimate β0 with b0. The estimate b0 could be the mode, median, or mean. The mode is so inconsistent that it is easily eliminated from the rest of the discussion. The remainder of this chapter focuses on whether the median or the mean is the better estimator.
To determine which is better, we construct sampling distributions of medians and means and use those distributions to evaluate the effectiveness of each.
In the sampling distributions, the peaks for both the median and the mean are at β0, so both are unbiased.
The distribution of the mean is a bit tighter around its peak, so it is a bit more efficient.
With larger sample sizes, both sampling distributions become narrower around β0, so both estimators are consistent; the mean continues to be the more efficient of the two.
Unbiasedness, consistency, and efficiency are desirable attributes for our estimators, but many estimators have these properties. The mean is the most efficient if we assume that ERRORS are normally distributed.
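The comparison of sampling distributions described above can be sketched with a small simulation. Everything specific here is an assumption made for illustration: the true value of 50, the error standard deviation of 10, the sample sizes, the number of replications, and the function name sampling_distribution. Errors are drawn from a normal distribution, which is the assumption under which the text says the mean is most efficient.

```python
import random
import statistics

def sampling_distribution(estimator, n, beta0=50.0, sigma=10.0,
                          reps=2000, seed=1):
    """Draw `reps` samples of size n from Yi = beta0 + Ei, with normal Ei,
    and return the estimate computed from each sample."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        sample = [beta0 + rng.gauss(0.0, sigma) for _ in range(n)]
        estimates.append(estimator(sample))
    return estimates

results = {}
for n in (10, 40):
    # The same seed is used for both estimators, so they see identical samples.
    means = sampling_distribution(statistics.fmean, n)
    medians = sampling_distribution(statistics.median, n)
    results[n] = {
        "mean_center": statistics.fmean(means),
        "mean_spread": statistics.stdev(means),
        "median_center": statistics.fmean(medians),
        "median_spread": statistics.stdev(medians),
    }
    print(n, results[n])
```

In runs of this sketch, both centers should sit close to the assumed true value (unbiasedness), the spread for the mean should be smaller than the spread for the median (efficiency), and both spreads should shrink as n goes from 10 to 40 (consistency).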

Is the normal distribution of errors a good assumption?

Sampling distribution for mean.

The values for the median are in the text.
If the distribution of errors is normal, then the mean is the most efficient unbiased estimator. However, when we adopt the mean as the estimate in the model, we also adopt the sum of squared errors (SSE) as our measure of error.
We also make other assumptions about the errors. They are:

  1. independent
  2. identically distributed (homogeneity of variance, or homoscedasticity)
  3. unbiased (their mean is zero).

In the simple model, the MSE (mean squared error) has a special name: the variance. MSE = SSE / (n - p), where n = number of subjects and p = number of parameters in the MODEL. In the simple model p = 1, so MSE = SSE / (n - 1). The MSE also estimates ERROR in the more complex models to be discussed.
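As a small check on the relationship between MSE and variance, the sketch below assumes the usual definition MSE = SSE / (n - p), with n the number of subjects and p the number of parameters. The data values and the helper name mse are hypothetical. With the mean as the single parameter (p = 1), the result should match the sample variance.

```python
import statistics

# Hypothetical sample, invented purely for illustration.
data = [2, 3, 3, 5, 12]

def mse(ys, predictions, p):
    """Mean squared error: SSE divided by (n - p), where p is the
    number of parameters estimated by the model."""
    n = len(ys)
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, predictions))
    return sse / (n - p)

# Simple model: every prediction is the same estimate b0 (here the mean).
b0 = statistics.fmean(data)
simple_mse = mse(data, [b0] * len(data), p=1)

print("MSE of simple model:", simple_mse)
print("sample variance:    ", statistics.variance(data))
```

Passing p as an argument keeps the same helper usable for models with more parameters, where only the predictions and p change.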

Brain Exercise

After reading Chapter 4 in Judd and McClelland, answer the following, assuming realistic errors.
Which b0 is best?
Give three one- or two-word reasons why: