## Definition

The **coefficient of determination ($R^2$)** measures the proportion of variance in the target variable explained by the model. It is the standard goodness-of-fit metric for regression.

## Formula

$$
R^2 = 1 - \frac{\text{SS}_{\text{res}}}{\text{SS}_{\text{tot}}} = 1 - \frac{\sum_i (y_i - \hat y_i)^2}{\sum_i (y_i - \bar y)^2}
$$

- $\text{SS}_{\text{res}}$: sum of squared residuals (model errors).
- $\text{SS}_{\text{tot}}$: total sum of squares (proportional to the variance of the target).

## Interpretation

- **$R^2 = 1$**: perfect fit.
- **$R^2 = 0$**: the model predicts no better than the mean of $y$.
- **$R^2 < 0$**: possible on held-out data; the model is *worse* than predicting the mean. A common sign of overfitting or distribution shift.

## A Common Misconception

$R^2$ is sometimes described as "the correlation squared". It equals the squared Pearson correlation between $x$ and $y$ only for univariate linear regression (least squares with an intercept) evaluated on the training set. For other cases (multivariate models, non-linear models, held-out evaluations), $R^2$ measures *explained variance*, not correlation.

## Adjusted $R^2$

On the training set, the plain $R^2$ of a least-squares fit never decreases when features are added, even useless ones. **Adjusted $R^2$** penalises model complexity:

$$
R^2_{\text{adj}} = 1 - (1 - R^2) \cdot \frac{n - 1}{n - p - 1}
$$

with $n$ samples and $p$ features. Use it when comparing models with different numbers of features.

## Out-of-Sample $R^2$ Can Be Negative

On the training set, $R^2 \in [0, 1]$ for a least-squares linear model with an intercept. On held-out data, $R^2$ can be negative: the model performs *worse* than a constant prediction at the mean $\bar y$ of the evaluated targets. Treat negative $R^2$ as a red flag: probable overfitting, wrong feature engineering, or distribution shift.

## When Not to Use $R^2$

- **Non-linear relationships**: $R^2$ still measures linear-style variance explained and can be misleading.
- **Heteroscedastic data**: $R^2$ is a single summary number and doesn't reflect noise that varies across the data.
- **Time-series with trends**: naive baselines (just predict the trend) can give high $R^2$ for trivial reasons.

In those cases, report MSE/RMSE *and* a domain-appropriate metric.

## Related

- [[MSE MAE RMSE]]
- [[Linear Regression]]
- [[Bias-Variance Tradeoff]]
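
## Worked Example

The formula, the adjusted variant, and the negative held-out case described in this note can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the helper names (`r2_score`, `adjusted_r2`, `fit_line`) and the toy data are assumptions made up for this example, and $\bar y$ is taken as the mean of whichever targets are being evaluated.

```python
from statistics import fmean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = fmean(y_true)  # mean of the targets being evaluated
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n samples and p features (needs n > p + 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def fit_line(x, y):
    """Least-squares fit of y = a + b * x for a single feature."""
    x_bar, y_bar = fmean(x), fmean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


# Hypothetical training data: roughly y = 2x with small noise.
x_train = [1.0, 2.0, 3.0, 4.0, 5.0]
y_train = [2.1, 3.9, 6.2, 7.8, 10.1]

# Hypothetical held-out data where the relationship has reversed
# (an exaggerated distribution shift).
x_test = [6.0, 7.0, 8.0]
y_test = [5.0, 4.0, 3.0]

a, b = fit_line(x_train, y_train)
pred_train = [a + b * xi for xi in x_train]
pred_test = [a + b * xi for xi in x_test]

r2_train = r2_score(y_train, pred_train)
r2_train_adj = adjusted_r2(r2_train, n=len(y_train), p=1)
r2_test = r2_score(y_test, pred_test)

print(f"train R^2:          {r2_train:.4f}")
print(f"train adjusted R^2: {r2_train_adj:.4f}")
print(f"held-out R^2:       {r2_test:.4f}")  # negative for this data
```

On the training data the fit is close to perfect and the adjusted value sits slightly below the plain one. On the shifted held-out data the squared errors far exceed the spread of the targets around their own mean, so $R^2$ comes out strongly negative.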