# R Squared

#### Description

As in the Least Squares visualization, six data points are arranged in the coordinate plane and may be moved about at will. The least squares regression line for the six points is drawn.

Two buttons control hiding and showing of:

- the squares of the deviations of the points from the mean of their *y*-values (the red squares), and
- the squares of the errors, or residuals, from the points to the regression line (the blue squares).

The sum of the areas of the red squares is shown as a large red square: the total squared deviation that we are trying to explain with a linear model. Likewise, the sum of the areas of the blue squares is shown as a large blue square: the total squared error that remains unexplained by the linear model.

The ratio of these two areas (blue / red) is the proportion of the total squared deviation that remains unexplained. One minus this ratio is the proportion that is explained by the linear model, which is the *r*² value.
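The area arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not the visualization's own code: the function name `r_squared` and the six sample points are assumptions made up for the example, and the sketch assumes the points do not all share the same *x*-value or the same *y*-value (otherwise a division by zero occurs).

```python
def r_squared(points):
    """Return (red_area, blue_area, r2) for a list of (x, y) pairs.

    red_area  = sum of squared deviations of y from the mean of y
    blue_area = sum of squared residuals from the least squares line
    Assumes the x-values are not all equal and the y-values are not all equal.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n

    # Least squares slope and intercept.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Big red square: total squared deviation from the mean of y.
    red_area = sum((y - mean_y) ** 2 for _, y in points)
    # Big blue square: total squared error left over after fitting the line.
    blue_area = sum((y - (intercept + slope * x)) ** 2 for x, y in points)

    return red_area, blue_area, 1 - blue_area / red_area


# Six hypothetical points, chosen only for illustration.
points = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 6), (6, 7)]
red_area, blue_area, r2 = r_squared(points)
print(f"red (total) area:        {red_area:.4f}")
print(f"blue (unexplained) area: {blue_area:.4f}")
print(f"r squared:               {r2:.4f}")
```

Changing the sample points and rerunning the script plays roughly the same role as dragging points in the visualization.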

#### Questions

- Why is it that the big blue square can never have more area than the big red square?
- Drag point *P*₁ along the regression line until it becomes an outlier. Notice that as it gets farther and farther from the other five points, the *r*² value becomes closer and closer to one. Explain why this is happening and comment on how a single point can produce a high correlation.
- Drag the six points until they all have approximately the same *y*-values. Notice that the *r*² value becomes nearly zero. Explain why this has to be so.
- Find two other arrangements of the six points that produce a zero *r*² value and explain why they do so in terms of the red and blue squares.
- Arrange the six points so that the *r*² value is one. Explain why your arrangement works.
- Is there any arrangement of the six points that has an *r*² value of one but in which the points are not all on a line? Why, or why not?
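For readers without access to the visualization, the outlier experiment from the questions can be approximated numerically. The sketch below is a rough stand-in, not a reproduction of the applet: the five clustered points, the helper `r_squared_with_outlier`, and the choice to move the sixth point along the fixed direction (*t*, *t*) rather than along the changing regression line are all assumptions made for illustration.

```python
def r_squared(points):
    """Return 1 - (blue area / red area) for a list of (x, y) pairs.

    Assumes the x-values are not all equal and the y-values are not all equal.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    red_area = sum((y - mean_y) ** 2 for _, y in points)
    blue_area = sum((y - (intercept + slope * x)) ** 2 for x, y in points)
    return 1 - blue_area / red_area


# Five hypothetical points with very little linear relationship.
cluster = [(1, 2), (2, 1), (3, 3), (1, 3), (3, 1)]


def r_squared_with_outlier(t):
    """r squared of the cluster plus a sixth point placed at (t, t)."""
    return r_squared(cluster + [(t, t)])


# Move the sixth point farther and farther from the cluster.
for t in (2, 5, 10, 100, 1000):
    print(f"sixth point at ({t}, {t}): r squared = {r_squared_with_outlier(t):.4f}")
```

Comparing the printed values for small and large `t` gives a starting point for the explanation the question asks for.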