Interpretable ML 1 — Don’t trust linear models
From her vantage point high above the cave, the Adventurer Statistician watched with a mixture of disbelief and pity as the cohort of data scientists descended into the flickering light of the sacrificial fire. Gathered around the gleaming golden idol of linear models, they chanted in unison, their voices echoing through the cavernous space, “Linear models are interpretable, linear models are interpretable.” The Adventurer Statistician shook her head, her heart sinking as she witnessed their blind devotion to a flawed methodology. She could only hope they would not take their naive beliefs into the real world, making causal decisions that could have dire consequences for the organization.
Are you curious about why the Adventurer Statistician made that remark? I was fortunate enough to have a brief conversation with her. Here’s what she had to say.
🤔 What is interpretability?
Well, there is no single agreed-upon definition, but a reasonable, non-mathematical one is:
the ability to predict what would happen to the model’s output if you changed the input.
I don’t find the alternative definition (see [1]) satisfactory:
the degree to which a human can consistently predict the model’s result
which is closer to a notion of “computational feasibility” and offers little insight into a model’s internal mechanisms. Let’s illustrate interpretability in terms of how changes in the inputs would impact the model’s output. The authors of ESL and ISL provide a masterful explanation of this subtle yet critical distinction in [6].
Just as you can predict that a car won’t start if you put diesel in its gas tank, you can also anticipate the trajectory of a ball thrown with varying force. These examples illustrate the concept of interpretability in the sense of being able to predict the consequences of changing input variables. Understanding these relationships is essential for making informed decisions.
🥼 Is interpretability the same as explainability?
There is a subtle but crucial distinction between these two concepts. One could, informally, define explainability as the
ability to expose the reasons behind this change and to understand the internal mechanics (models that can summarize the causes behind specific behaviors).
The crucial aspect of this definition is the notion of why, of cause. Can you identify the underlying reasons behind the car’s failure to start or the ball’s trajectory? If not, you can merely interpret the model’s predictions but not fully explain its behavior. Explainability goes a step further by uncovering the causal structure of the model, enabling us to quantify its reasoning. For instance, you can fit a parabola to the ball’s trajectory, which serves as an interpretation of the model’s output. However, to truly explain the trajectory, you would need to derive it from Newton’s laws, revealing the model’s underlying causal mechanisms.
For a much more comprehensive discussion, please refer to [4]. Do you consider yourself as entirely interpretable, given that your decisions are influenced by your brain’s billions of neurons, personal biases, and incomplete knowledge?
📈 Are linear models interpretable?
With this definition in mind, let’s don our statistical hats and investigate whether linear models are truly interpretable. I’ll employ an example with a known data generating process (DGP) and walk you through the process step by step, revealing the challenges of interpreting coefficients. This exploration draws heavily from [3] and [6].
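The original code and figures are not reproduced here; below is a minimal sketch of a comparable setup. The coefficient values and the noise level are assumptions chosen to be consistent with the discussion (a true coefficient of 5 for X_3, and an X_4 term that also matters); the actual DGP used in the original example may differ.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 1_000

# Hypothetical DGP with four independent predictors and Gaussian noise.
# The true coefficient of X_3 is 5; the other values are illustrative.
X = pd.DataFrame(rng.normal(size=(n, 4)), columns=["X_1", "X_2", "X_3", "X_4"])
beta = np.array([7.0, -1.0, 5.0, 3.0])  # assumed true coefficients
y = X.values @ beta + rng.normal(scale=2.0, size=n)
```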
We can now proceed to fit a linear model to the data, step by step. First, let’s fit the model using only X_3.
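A sketch of this first fit with statsmodels, continuing the simulated example above (the estimates you obtain will differ from the numbers quoted in the text, which come from the original data):

```python
import statsmodels.api as sm

# Regress y on X_3 alone (with an intercept).
fit_x3 = sm.OLS(y, sm.add_constant(X[["X_3"]])).fit()
print(fit_x3.params)      # estimated intercept and slope for X_3
print(fit_x3.conf_int())  # 95% confidence intervals
```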
At this point, it’s tempting to infer that if we change X_3 by one unit, Y will change by 5.6 units, or at the very least, that Y will increase more rapidly than X_3.
What if we add other variables to the model?
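Continuing the sketch, the full fit might look like this (still hypothetical code, not the original):

```python
# With independent predictors, the estimate for X_3 barely moves
# when the other variables enter the model.
fit_all = sm.OLS(y, sm.add_constant(X)).fit()
print(fit_all.params)
print(fit_all.conf_int())
```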
The coefficient for X_3 remains essentially unchanged compared to the first fit; the two estimates lie within each other's confidence intervals. So, what's the fuss about linear models? Well, the preceding example is a rare occurrence in real-world scenarios, arising primarily in well-designed experiments (typically conducted in a laboratory). When we collect data, the DGP is typically unknown and the predictors are correlated. This correlation can be either genuine or spurious, and it has a significant impact on interpretability, as I'll demonstrate in the revised example.
What if the predictors are correlated?
I’ll revisit the steps above, introducing mild and strong correlations.
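One way to introduce such correlations in the simulated example is to draw the predictors from a multivariate normal distribution. The correlation matrix below is an assumption; the only entry taken from the text is corr(X_1, X_3) = -0.9, plus a milder corr(X_2, X_3) = 0.3 for illustration.

```python
# Correlated predictors: strong negative correlation between X_1 and X_3,
# mild positive correlation between X_2 and X_3, X_4 independent.
corr = np.array([
    [ 1.0,  0.0, -0.9,  0.0],
    [ 0.0,  1.0,  0.3,  0.0],
    [-0.9,  0.3,  1.0,  0.0],
    [ 0.0,  0.0,  0.0,  1.0],
])
Xc = pd.DataFrame(
    rng.multivariate_normal(mean=np.zeros(4), cov=corr, size=n),
    columns=["X_1", "X_2", "X_3", "X_4"],
)
yc = Xc.values @ beta + rng.normal(scale=2.0, size=n)  # same assumed coefficients
```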
Let’s first assume we gathered data solely on X_3.
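In the sketch, this is the same single-predictor regression as before, only on the correlated data:

```python
# Regressing yc on X_3 alone: the omitted X_1, strongly and negatively
# correlated with X_3, biases the estimate; the sign can even flip.
fit_c_x3 = sm.OLS(yc, sm.add_constant(Xc[["X_3"]])).fit()
print(fit_c_x3.params)
print(fit_c_x3.conf_int())
```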
Surprisingly, the coefficient is entirely off, and even the sign is reversed. As before, we might be tempted to interpret the linear model and inform management: “The response is decreasing with X_3.” This would lead to an erroneous causal business decision, as the true coefficient equals 5. This artifact arises from the correlation corr(X_1, X_3) = -0.9. It implies that we cannot alter X_1 by one unit without simultaneously affecting X_3, and vice versa. The notion of “holding all other predictors constant” is forbidden by the underlying data structure: changing one predictor alone would take us outside the data distribution. It would be analogous to changing a person’s height without modifying their body weight in a medical study; two-meter-tall individuals weighing fifty kilograms are exceedingly uncommon.
If you are unaware of the true DGP and if you fail to collect all relevant predictors, which is often the case in data science, you’ll encounter this issue. Additionally, observe the confidence interval; it is surprisingly broad.
Unfortunately, this is not the only artifact we will encounter. Let’s examine what occurs when we incorporate one more predictor.
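Continuing the sketch: which predictor was added in the original example is not shown here, so X_1 (the one strongly correlated with X_3) is used below purely for illustration.

```python
# Adding X_1, which is strongly correlated with X_3, changes the estimate
# for X_3 and the width of its confidence interval.
fit_c_x13 = sm.OLS(yc, sm.add_constant(Xc[["X_1", "X_3"]])).fit()
print(fit_c_x13.params)
print(fit_c_x13.conf_int())
```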
The value of the beta_3 coefficient depends on which predictor(s) are included in the model, and so does the width of its confidence interval!
We are fortunate to have precise knowledge of the DGP but reflect on whether this applies to your applications.
🕳️ Interpretability pitfalls — summary
Look over your shoulder, can you see the boulder?
Relying solely on observational data, where we cannot manipulate the system on which the data are collected (data-based multicollinearity), and engineering features that are highly correlated with existing ones, such as x² alongside x (structural multicollinearity), can wreak havoc on linear regression models (one common way to diagnose this is the variance inflation factor; see the sketch after this list). These issues manifest in the following ways:
- The estimated coefficient of any one variable depends on which other predictors are included in the model.
- The confidence interval of the estimated regression coefficients broadens as more predictors are added to the model, leading to decreased precision.
- The marginal contribution of any predictor variable in reducing the error sum of squares depends on which other predictors are already in the model.
- Hypothesis tests for coefficient = 0 may yield different conclusions depending on which predictors are in the model.

Is that the end of it? Unfortunately not. Other issues may arise, such as:
- Heteroskedasticity (non-constant variance), which invalidates statistical tests and can lead to inaccurate inferences.
- Autocorrelation (correlation between successive error terms), which can make it difficult to interpret the model and assess its predictive power.
- Latent variables, which are unobserved factors that influence the dependent variable but are not directly measurable. These variables can create bias and make it challenging to accurately assess the relationships between variables.
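A standard diagnostic for multicollinearity is the variance inflation factor (VIF). Here is a short sketch on the correlated design from the earlier example, using statsmodels; the usual rules of thumb flag values above roughly 5–10, though these thresholds are conventions, not hard laws.

```python
from statsmodels.stats.outliers_influence import variance_inflation_factor

# VIF for each predictor of the correlated design Xc (the intercept is
# included in the design matrix but not reported).
exog = sm.add_constant(Xc)
vif = {
    col: variance_inflation_factor(exog.values, i)
    for i, col in enumerate(exog.columns)
    if col != "const"
}
print(vif)
```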
🤨 Yes, but this is precisely what regularization was designed for…
It helps to prevent overfitting, reduce the variance of the estimates, and perform feature selection. However, it’s not flawless.
- Ridge regression handles correlated features reasonably well. It keeps all predictors in the model and spreads the coefficient weight among the correlated ones.
- Lasso regression tends to pick one predictor from among the correlated ones, more or less arbitrarily, and sets the coefficients of the rest to zero. The selected variable can change as the data or the regularization strength change. When predictors are strongly correlated, this behaves less gracefully than ridge regression.
Let’s employ ridge regression to demonstrate its effect.
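A sketch with scikit-learn’s Ridge on the correlated data from above, starting with X_3 alone (the penalty strength alpha is an arbitrary choice, not a value from the original example):

```python
from sklearn.linear_model import Ridge

# Ridge on X_3 alone: with a single predictor, the penalty only shrinks the
# coefficient slightly; it cannot undo the bias from the omitted X_1.
ridge_x3 = Ridge(alpha=1.0).fit(Xc[["X_3"]], yc)
print(ridge_x3.coef_, ridge_x3.intercept_)
```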
And now with some other predictors, as for the regular OLS fits.
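The corresponding sketch with several predictors, deliberately leaving X_4 out to mimic the partial-information setting discussed next:

```python
# Ridge on X_1, X_2, X_3 (X_4 left out): the penalty spreads weight across
# the correlated predictors and stabilizes the estimates, but the resulting
# coefficients need not match the true values of the DGP.
ridge_partial = Ridge(alpha=1.0).fit(Xc[["X_1", "X_2", "X_3"]], yc)
print(dict(zip(["X_1", "X_2", "X_3"], ridge_partial.coef_)))
```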
Ridge regression helps mitigate the collinearity but does not perfectly recover the DGP. In the example above, we have only partial information, not knowing that X_4 is important in the DGP.
For more details, see the comprehensive scikit-learn review [5].
😨 Is it all bad news?
Not really, at least not from a data science perspective. While the different models remain predictive, they are not easily interpretable and cannot be used to make direct causal business decisions. This requires further analysis.
In fact, the predictive power of the model remains high even with partial information. Fitting a regular linear regression using X_1, X_2, and X_3 only provides an incomplete glimpse into the underlying, unknown DGP.
And if we were Maxwell’s demons and knew the DGP perfectly, we would get almost perfect results.
However, the coefficients (and therefore the interpretability) might not mirror the exact structure of the original DGP.
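As a rough illustration on the simulated data (how large the gap is depends entirely on the assumed coefficients and noise level), one can compare the in-sample R² of the partial and full models:

```python
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Partial information: X_4 is missing from the model.
partial = LinearRegression().fit(Xc[["X_1", "X_2", "X_3"]], yc)
# "Maxwell's demon": every predictor of the assumed DGP is available.
full = LinearRegression().fit(Xc, yc)

print("R^2 partial:", r2_score(yc, partial.predict(Xc[["X_1", "X_2", "X_3"]])))
print("R^2 full:   ", r2_score(yc, full.predict(Xc)))
```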
💭 Is interpretability a myth? Is it useful in the first place?
This is a fair question to ask. Is interpretability just a passing trend or a crucial aspect of machine learning and analytics? For understanding causal relationships, shouldn’t we utilize dedicated models and approaches like those in econometrics?
- What if we have many features (10, 100, 10⁶) that are correlated, with overlapping predictive power? Will we be able to condense them into a comprehensible verbal summary, even if the model is linear? What value would this bring to the table?
- Is the underlying process stationary? If not, what is the timescale, and how often should we retrain our model?
- What if two vastly different models (GLM and GBM, for instance) predict identical values but offer conflicting interpretations? Does this render interpretation pointless?
- Even if we can interpret the model, does it provide any additional insight into causal relationships beyond our current understanding? (Correlation does not imply causation.)
- Machine learning and inference are inherently iterative processes, and so is interpretation.
- Does the interpretation change when we incorporate additional information that truly matters? If so, can we rely on the interpretation?
- Are our own decision-making processes interpretable? Are we rational beings?
Interpretability is valuable but should be clearly communicated and explained to decision-makers. When properly executed, it reveals the behavior of the model but does not guarantee insights into causal relationships. Therefore, exercise caution when making business decisions based on interpretability alone. Conduct further analyses or even behavioral experiments to validate your findings.
💡 What works then?
If your goal is to understand how the model produces its numerical output, current IML methods (which I’ll explain in an upcoming post) can be helpful. In the case of a linear model, it’s acceptable to use the coefficients. However, this doesn’t provide a complete understanding of the model. You cannot use them to make causal claims such as “If I increase X_1 by one unit, Y will increase by b_1 units,” which is incorrect and potentially harmful, as I’ve demonstrated previously.
If your objective is to make causal decisions, then rely on causal statistical inference, at least as a metamodel built on top of your predictive model.
🏁 Conclusion
- Critically evaluate claims like “linear models are white box”. Apply your statistical expertise to assess their validity. Ask yourself: “Is this claim accurate? Are there quantitative references to support it?”
- Remember that correlation does not imply causation.
- If you require causal inferences, employ causal models.
- Don’t make causal business decisions based solely on models without conducting a thorough causal analysis.
- Understand the limitations and strengths of various models. There is no one-size-fits-all solution.
- Use models for their intended purposes. Some are designed for prediction, while others are intended to reveal causal relationships.
- Communicate results to management with caution.
- Being able to compute a numerical output is not the same as interpretability. Any deterministic model is a mapping y = f(x) and will always produce a numerical output value; that alone tells you nothing about how changes in x propagate to y.
- Interpretability is not as valuable as it might seem at first glance unless its purpose is well defined beforehand. Is it for marketing, regulation, common knowledge, or reassurance because the math was skipped?
References
- [1] Interpretable machine learning
- [2] Causal inference in statistics: An overview (ucla.edu)
- [3] Multicollinearity & Other Regression Pitfalls
- [4] The Mythos of Model Interpretability
- [5] Scikit-learn common pitfalls in the interpretation of coefficients of linear models
- [6] ISL book — Multiple Linear Regression and Interpreting Regression Coefficients