The data displayed here come from a random sample of the 1990 census data for people in Sonoma County, California, who are between the ages of 30 and 60. Each cell in the table at right shows the number of people who fall into that cell; for example, there were five people with no college education who fell into the "Richer" category.
"Richer" means an income greater than 23,000 (the median income for the sample).
You can change the frequency for a cell by dragging the point at the top of the blue bar in that cell. As you drag, the expected values and the chi-square statistic are recomputed. At right, you can see that the chi-square statistic for this situation is 23.7 and that this is high enough that the p-value is very close to zero.
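As a rough check on the claim that a chi-square of 23.7 gives a p-value very close to zero, here is a minimal Python sketch. It assumes the table has 2 rows and 3 columns, and therefore 2 degrees of freedom; that layout is inferred from the questions below and is not stated outright. The function name `p_value_df2` is made up for this illustration, and the sketch itself may compute its p-value differently.

```python
import math


def p_value_df2(chi_square):
    """Upper-tail p-value of a chi-square statistic with 2 degrees of freedom.

    Assumption: a 2 x 3 table, so df = (2 - 1) * (3 - 1) = 2.
    For df = 2 the upper-tail probability has the closed form exp(-x / 2).
    """
    return math.exp(-chi_square / 2)


p = p_value_df2(23.7)
print(f"p-value for chi-square 23.7 with df = 2: {p:.2e}")
```

Under that assumption the result is well below 0.0001, which agrees with the description of the p-value as very close to zero.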
The point of this sketch is that the chi-square algorithm has a geometric visualization. Each cell contributes to the total chi-square an amount equal to (actual − expected)² / expected. That amount is the width of a rectangle whose area equals the square of the difference between the actual and expected values and whose length equals the expected value.
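The calculation behind the rectangles can be sketched in a few lines of Python. The counts below are hypothetical: only the 5 in the "Richer, no college" cell comes from the text, and the 2 x 3 layout (two income rows, three education columns) is an assumption. The function name `chi_square_contributions` is invented for this example.

```python
def chi_square_contributions(observed):
    """Return (expected, contributions) for a table of observed counts.

    expected[i][j]      = row_total[i] * col_total[j] / grand_total
    contributions[i][j] = (observed - expected)^2 / expected
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    contributions = [
        [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(observed, expected)
    ]
    return expected, contributions


# Hypothetical counts: rows are Poorer, Richer; columns are three education levels.
observed = [
    [20, 15, 10],
    [5, 15, 25],
]

expected, contributions = chi_square_contributions(observed)
chi_square = sum(sum(row) for row in contributions)

for obs_row, exp_row, con_row in zip(observed, expected, contributions):
    for o, e, c in zip(obs_row, exp_row, con_row):
        # The rectangle for this cell has area (o - e)^2 and length e, so its width is c.
        print(f"observed={o:3d}  expected={e:6.2f}  contribution={c:6.3f}")
print(f"chi-square = {chi_square:.3f}")
```

Each printed contribution is the width of that cell's rectangle, and the chi-square statistic is the sum of the widths. Editing the numbers in `observed` and rerunning the script plays the same role as dragging a bar in the sketch.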
- Is there any way to adjust the cell frequencies so that the chi-square equals 0? Describe the conditions under which this can be true.
- Is it possible to have exactly one cell frequency different from its expected value? Why or why not?
- Is it possible to have both frequencies in one column different from their expected values while the other four frequencies all equal their expected values? Why or why not?
- Under what conditions is the chi-square very large? (See if you can produce a chi-square greater than 80.)
- Suppose two cells have the same difference between observed and expected values, but one has a high expected value and the other has a low expected value. Will they both contribute the same amount to the chi-square? Why or why not?