If you are learning linear regression, you need a clear understanding of the coefficient of determination R^{2} and the adjusted coefficient of determination R^{2}_{adj}.

I am going to explain these concepts in a very easy way.

We are going to cover the following:

- What is Coefficient of Determination
- Properties of Coefficient of Determination
- Adjusted Coefficient of Determination
- Final Notes

### 1. What is Coefficient of Determination?

The coefficient of determination measures the proportion of the variation in one variable that is predictable from the other variable.

Look at the table below. What do you think is the relationship between X and Y?

It seems that Y equals X/2. But looking carefully at the table, we see that this is not exactly true for two of the data points.

But we can say that 80% of the time, Y is X/2. Informally, this suggests a coefficient of determination of roughly 0.8 (80%).

| Y | X |
| --- | --- |
| 1.0 | 2.0 |
| 2.0 | 4.0 |
| 3.0 | 7.0 |
| 4.0 | 8.0 |
| 5.0 | 10.0 |
| 6.0 | 12.3 |
| 7.0 | 14.0 |
| 8.0 | 16.0 |
| 9.0 | 18.0 |
| 10.0 | 20.0 |

**Table 1:** For 8 out of the 10 points, Y = X/2.

The coefficient of determination is a measure of how certain we are in making predictions from a certain model.

It determines the ratio of the explained variation to the total variation.

The value of R^{2} ranges from 0 to 1, that is:

0 ≤ R^{2} ≤ 1

It denotes the strength of the linear association between x and y. When we use a line of best fit, the coefficient of determination represents, loosely speaking, the percentage of the variation in the data that is accounted for by the line of best fit.

For example, if R = 0.89 then R^{2} = 0.792, which means that 79.2% of the total variation in y can be explained by the linear relationship between y and x (as described by the regression equation; in our case it is y = x/2).

The other 20.8% of the variation remains unexplained.

So we can say that the coefficient of determination is a measure of how well the regression line represents the data.

**The formula for R^{2} is given by:**

R^{2} = 1 − SS_{res} / SS_{tot}

where SS_{res} is the residual (unexplained) sum of squares and SS_{tot} is the total sum of squares, so R^{2} is exactly the ratio of the explained variation to the total variation.
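As a quick check, we can compute R^{2} = 1 − SS_{res}/SS_{tot} for the data in Table 1, using Y = X/2 as the prediction. Here is a minimal Python sketch (the function and variable names are my own, not from any particular library):

```python
# Compute the coefficient of determination: R^2 = 1 - SS_res / SS_tot.
def r_squared(y, y_pred):
    y_mean = sum(y) / len(y)
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)               # total variation
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))  # unexplained variation
    return 1 - ss_res / ss_tot

# Data from Table 1, with Y = X/2 as the predicted values.
x = [2.0, 4.0, 7.0, 8.0, 10.0, 12.3, 14.0, 16.0, 18.0, 20.0]
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y_pred = [xi / 2 for xi in x]

print(round(r_squared(y, y_pred), 4))  # 0.9967
```

Note that the exact value is close to 1, higher than the informal 80% count above: the two off-line points deviate only slightly, and the formula weighs *how far* each point falls from the prediction, while counting points on the line is only an intuition aid.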

### 2. Properties of Coefficient of Determination

Let’s now outline some of the properties of R^{2} that you need to know. To get used to these properties, take some time to write them out in your notes.

- 0 ≤ R^{2} ≤ 1 when f(X) = r(X) = E(Y | X) (the regression function).
- If X and Y are independent, then R^{2} = 0.
- If Y = f(X) (Y is an exact function of X), then R^{2} = 1.
- If f(X) = aX + b (the theoretical linear regression), then R^{2} = (R(X,Y))^{2}, the squared correlation coefficient.
- If the joint distribution of X and Y is normal, then R^{2} = (R(X,Y))^{2} as well.
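The fourth property can be checked numerically: when we fit a line by least squares, the resulting R^{2} equals the squared correlation coefficient. A small sketch (the data and names here are illustrative, not from the source):

```python
import math

def fit_and_compare(x, y):
    """Fit y = a*x + b by least squares, then compare R^2 with corr(x, y)^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx                      # slope
    b = my - a * mx                    # intercept
    ss_res = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    r2 = 1 - ss_res / syy              # coefficient of determination
    corr = sxy / math.sqrt(sxx * syy)  # Pearson correlation R(X, Y)
    return r2, corr ** 2

r2, corr2 = fit_and_compare([2.0, 4.0, 7.0, 8.0, 10.0], [1.0, 2.0, 3.0, 4.0, 5.0])
print(abs(r2 - corr2) < 1e-12)  # True: the two quantities agree
```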

### 3. Adjusted Coefficient of Determination R^{2}_{adj}

Just like the coefficient of determination, the adjusted coefficient of determination R^{2}_{adj} is used to determine how well a multiple regression equation fits the sample data.

The difference between R^{2} and R^{2}_{adj} is that R^{2} increases automatically as new independent variables are added to the regression equation, even if they contribute no new explanatory power.

However, R^{2}_{adj} increases ONLY IF the newly added independent variables increase the explanatory power of the regression equation. This makes R^{2}_{adj} more reliable for measuring how well a multiple regression equation fits the sample data.
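The usual formula is R^{2}_{adj} = 1 − (1 − R^{2})(n − 1)/(n − k − 1), where n is the sample size and k is the number of independent variables. The sketch below shows the penalty at work; the example numbers are made up for illustration:

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with k independent variables fit on n samples."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical scenario: adding a third variable nudges R^2 up from 0.800
# to 0.803, but the adjusted value goes DOWN, penalising the extra variable.
before = adjusted_r2(0.800, n=30, k=2)
after = adjusted_r2(0.803, n=30, k=3)
print(round(before, 4), round(after, 4))  # 0.7852 0.7803
print(after < before)                     # True
```

This is exactly the behaviour described above: a useless (or nearly useless) variable raises R^{2} slightly, yet lowers R^{2}_{adj}.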

### 4. Final Notes

I hope this brief discussion has helped you understand the concepts of the coefficient of determination and the adjusted coefficient of determination as they apply to regression analysis. Take special note of the difference between the two, as it often appears in statistics quizzes and exams.

Thank you for reading, and remember to leave a comment below if you have any trouble following the explanation.