The Ultimate Guide to SHAP for Model Explainability

Artificial Intelligence (AI) is table stakes for any modern business—from healthcare and finance to retail and logistics. The call for transparent, understandable models is louder than ever. While black-box models offer high performance, the growing necessity for ethical and transparent decision-making is leading data scientists and stakeholders alike to prioritize model interpretability and explainability.

SHAP (SHapley Additive exPlanations), a Python package for calculating Shapley values, uses game theory to provide consistent and locally accurate feature attributions. SHAP bridges the gap between high performance and explainability, helping to unravel the complex decision-making processes of black-box machine learning models.

The purpose of this guide is to provide a hands-on introduction to using the SHAP library for model explainability. Whether you're a data scientist seeking to explain your models or an AI enthusiast interested in ethical AI, this guide aims to equip you with the tools and understanding needed to leverage SHAP in your projects for improved transparency and decision-making.

The Need For Understanding Your Model

For AI systems that influence critical decisions, such as medical diagnoses or loan approvals, the consequences of misunderstood or biased algorithms can be severe. Model interpretability and explainability are vital for several reasons:

  1. Transparency: Stakeholders, whether they are doctors, policymakers, or consumers, need to understand how decisions affecting them are made.
  2. Ethical Considerations: Opaque decision-making can raise ethical concerns, especially when algorithms are used in healthcare, criminal justice, or social services.
  3. Regulatory Compliance: Many industries have regulations requiring transparent decision-making. Opaque models can lead to legal issues.
  4. Debugging and Improvement: Understanding a model's reasoning can help identify errors and areas for improvement.

Real-world Consequences of Black-Box Models

Let's look at a few examples where the lack of transparency in AI modeling has real-world consequences:

  1. Healthcare: In a predictive model for patient risk, an incomprehensible model might fail to consider key symptoms, resulting in incorrect diagnoses.
  2. Financial Industry: Black-box models used for credit scoring can unfairly discriminate against certain groups, perpetuating systemic biases.
  3. Criminal Justice: Biased data can inadvertently influence algorithms used for sentencing or bail decisions, leading to unjust outcomes.

These examples lie on the more extreme end of the spectrum of consequences, but they demonstrate the attention required when using black-box models to ensure fair and unbiased results.

The Role of SHAP in Addressing Model Explainability

SHAP offers a mathematically grounded approach to decompose any model's predictions into understandable parts, making it easier to identify, diagnose and rectify issues. With its ability to provide both global interpretability (understanding model behavior as a whole) and local interpretability (explaining individual predictions), SHAP brings us one step closer to the ideal of fully transparent, responsible AI.

By demystifying parts of the black-box, SHAP not only satisfies the intellectual curiosity of data scientists but also fulfills the ethical responsibility of making AI understandable and equitable for everyone involved.

What is SHAP and How Does It Work?

SHAP is a Python package for calculating Shapley values to provide consistent and locally accurate feature attributions for any model. By breaking down the prediction of any machine learning model into individual feature contributions, it helps to unravel the complex decision-making process of the black-box.

Introducing Shapley Values

Shapley values, originally formulated for cooperative game theory, provide a way to fairly allocate rewards among players based on their individual contributions to the game. In the context of machine learning, these "players" are the features, and the "reward" is the model's prediction. (For a detailed look at Shapley values, see our previous post.)
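To make the "fair allocation" idea concrete, here is a minimal sketch that computes exact Shapley values by averaging each player's marginal contribution over every possible ordering. The three player names and the payoff table are hypothetical values chosen for illustration, and this brute-force approach is only practical for a handful of players; it is not how the SHAP library computes values internally.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {player: 0.0 for player in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for player in order:
            with_player = coalition | {player}
            # Marginal contribution of this player when joining the coalition
            totals[player] += value(with_player) - value(coalition)
            coalition = with_player
    return {player: total / len(orderings) for player, total in totals.items()}


# Hypothetical game: the payoff earned by every possible coalition.
payoffs = {
    frozenset(): 0.0,
    frozenset({"age"}): 10.0,
    frozenset({"chol"}): 20.0,
    frozenset({"cp"}): 30.0,
    frozenset({"age", "chol"}): 40.0,
    frozenset({"age", "cp"}): 50.0,
    frozenset({"chol", "cp"}): 60.0,
    frozenset({"age", "chol", "cp"}): 90.0,
}

phi = shapley_values(["age", "chol", "cp"], payoffs.__getitem__)
print(phi)
```

The individual allocations always add up to the payoff of the full coalition, which is the property SHAP relies on when it splits a prediction among features.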

Why is SHAP Considered a Powerful Tool for Explainability?

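  1. Consistency: If a model changes so that a feature's marginal contribution increases or stays the same, the importance attributed to that feature does not decrease.
  2. Local Accuracy: For every individual prediction, the feature attributions sum to the difference between the model's output and the baseline, allowing for faithful case-by-case explanations.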
  3. Global Interpretability: It can also offer an overall view of feature importance across the model.
  4. Versatility: SHAP can be applied to any machine learning model, making it widely applicable.
  5. Regulatory Acceptance: Thanks to its strong theoretical underpinnings, SHAP is frequently used to help document and justify model decisions in regulated sectors like finance and healthcare, although using SHAP does not by itself guarantee compliance.

Key Concepts: Dive Deeper into SHAP Values

SHAP values measure the impact of each feature on a specific prediction in relation to the model's baseline prediction. Here's how they work:

  1. Baseline Value: The model's average prediction over the background dataset (typically the training data), which serves as a starting point.
  2. Feature Contributions: Each feature pushes the prediction either above or below this baseline. Positive SHAP values indicate an increase, and negative values indicate a decrease.
  3. Summation to Prediction: The sum of all SHAP values and the baseline prediction should equal the actual prediction for a specific instance.

By providing a detailed breakdown of each feature's contribution, SHAP values offer a granular level of explainability that is invaluable for understanding, debugging, and improving machine learning models.
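The three pieces above (baseline, feature contributions, and summation) can be checked by hand on a toy model. The sketch below assumes a hypothetical linear model with independent features, in which case each feature's SHAP value reduces to its weight times the distance of the feature value from its mean. The weights, background rows, and instance are made up for illustration.

```python
# Hypothetical linear model: f(x) = bias + sum(weight_i * x_i)
weights = {"ca": 0.8, "cp": 0.5, "thalach": -0.02}
bias = 0.1


def predict(row):
    return bias + sum(weights[name] * row[name] for name in weights)


# Hypothetical background data that defines the baseline.
background = [
    {"ca": 0, "cp": 1, "thalach": 170},
    {"ca": 1, "cp": 3, "thalach": 150},
    {"ca": 2, "cp": 4, "thalach": 130},
]

feature_means = {
    name: sum(row[name] for row in background) / len(background)
    for name in weights
}

# 1. Baseline value: the average prediction over the background data
baseline = sum(predict(row) for row in background) / len(background)

# 2. Feature contributions for one instance (linear model, independent features)
instance = {"ca": 2, "cp": 4, "thalach": 120}
contributions = {
    name: weights[name] * (instance[name] - feature_means[name])
    for name in weights
}

# 3. Summation to prediction: baseline + contributions equals the prediction
reconstructed = baseline + sum(contributions.values())
print(contributions)
print(reconstructed, predict(instance))
```

For non-linear models the contributions are no longer this simple to derive, but the same summation property still holds.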

Practical Steps to Leveraging SHAP

Now that we've covered the theoretical foundations of SHAP at a high level, let's get some hands-on experience actually implementing SHAP to explain our model.

Setting Up Your Environment

Before diving into SHAP, you'll need to set up your environment. Start by installing the necessary libraries:

pip install shap pandas xgboost scikit-learn

(Note: xgboost, scikit-learn, and pandas are used for this walkthrough but are not required if you prefer to use different libraries for handling and modeling your data.)

Loading and Preparing the Dataset

For this example, let's use the popular UCI Statlog Heart dataset. This dataset has the following columns:

  • Age: Age in years
  • Sex: Gender (1 = Male; 0 = Female)
  • Chest Pain Type (cp): Type of chest pain experienced by the patient (Value 1: typical angina, Value 2: atypical angina, Value 3: non-anginal pain, Value 4: asymptomatic)
  • Resting Blood Pressure (trestbps): Blood pressure at rest in mm Hg
  • Serum Cholesterol (chol): Serum cholesterol in mg/dl
  • Fasting Blood Sugar (fbs): Fasting blood sugar > 120 mg/dl (1 = true; 0 = false)
  • Resting ECG (restecg): Results of electrocardiogram at rest (Value 0: normal, Value 1: having ST-T wave abnormality, Value 2: showing probable or definite left ventricular hypertrophy)
  • Max Heart Rate Achieved (thalach): Maximum heart rate achieved during the stress test
  • Exercise-Induced Angina (exang): Angina induced by exercise (1 = yes; 0 = no)
  • Oldpeak: ST depression induced by exercise relative to rest
  • Slope: The slope of the peak exercise ST segment (Value 1: upsloping, Value 2: flat, Value 3: downsloping)
  • Number of Major Vessels (ca): Number of major vessels (0-3) colored by fluoroscopy
  • Thal: Thalassemia (3 = normal; 6 = fixed defect; 7 = reversible defect)
  • Presence of Heart Disease (target): Diagnosis of heart disease (0 = absence of heart disease, 1 = presence of heart disease)

We can load the data as follows:

import pandas as pd

heart_df = pd.read_csv(...)  # pass the path or URL of the heart dataset CSV
heart_df.pop("thal") # drop categorical feature thal
X, y = heart_df, heart_df.pop("target")

Training a Machine Learning Model

We'll use an XGBoost classifier as our machine learning model:

from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = XGBClassifier().fit(X_train, y_train)

Applying SHAP to Explain Model Predictions

With our model trained, we can now use SHAP to explain the model and its predictions. First, let's do a global analysis of the model across all data points. Since we used XGBoost, we'll use the TreeExplainer, which is a high-speed exact algorithm for tree ensemble methods:

import shap

explainer = shap.TreeExplainer(model)
shap_values = explainer(X_test)

# Bar chart of the mean absolute SHAP value per feature
shap.plots.bar(shap_values)

Feature importance summary plot using SHAP bar chart for our trained XGBoost model.

The summary plot provides a high-level overview of feature importance. We can tell that ca (number of major blood vessels colored by fluoroscopy), cp (chest pain type), and oldpeak (ST depression induced by exercise relative to rest) are the three most impactful features globally for our model when predicting the presence of heart disease.

This impact, however, is the mean absolute SHAP value, which means that we cannot determine from just this chart how each feature impacts the prediction based on the actual value of the feature -- we can only determine the average magnitude of the change from the base prediction.
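To see why the mean absolute SHAP value hides direction, consider this small sketch. The SHAP values are hypothetical numbers chosen so that one feature has large contributions with mixed signs while the other has smaller contributions that always point the same way.

```python
# Hypothetical SHAP values: one row per data point, one entry per feature.
shap_rows = [
    {"ca": 0.9, "cp": -0.4},
    {"ca": -0.8, "cp": -0.5},
    {"ca": 1.0, "cp": -0.3},
    {"ca": -1.1, "cp": -0.4},
]


def mean_of(rows, feature, transform=lambda value: value):
    return sum(transform(row[feature]) for row in rows) / len(rows)


features = ("ca", "cp")
# What the bar chart ranks by: average magnitude, ignoring sign
mean_abs = {name: mean_of(shap_rows, name, abs) for name in features}
# Plain average: positive and negative contributions cancel out
mean_signed = {name: mean_of(shap_rows, name) for name in features}

print(mean_abs)
print(mean_signed)
```

Here ca ranks first by mean absolute value even though its signed contributions cancel to roughly zero, which is exactly the information the bar chart cannot show.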

To better understand each feature's impact, let's take a look at the beeswarm chart:

shap.plots.beeswarm(shap_values)

We can take a deeper look using the beeswarm chart to understand the directional impact of different feature values for each feature.

The beeswarm chart helps us better understand how different values for these features impact the prediction by showing us the SHAP value for each feature for each data point. For example, we can see that lower values for ca generally reduce the likelihood of heart disease, and higher values of cp generally increase the likelihood.

Now let's look at how we can visualize and explain the prediction for an individual data point using a force plot:

# shap.initjs() + matplotlib=False will produce an
# interactive chart in a Notebook
shap.plots.force(shap_values[0], matplotlib=True)

The force chart shows the impact of each feature value relative to the average prediction to arrive at the final prediction of the model.

For our example above, we can see that the values for sex, restecg, age, and ca are all increasing the likelihood of heart disease; however, the values for cp, slope, oldpeak, and exang force a decrease in likelihood that outweighs the increase from the other features, resulting in a prediction that heart disease is not likely. Remember that the SHAP value for each feature is the change relative to the base average prediction of the model.

Using Insights From SHAP To Improve Model Performance

First, let's take a look at how our model is currently performing using AUROC, which computes the area under the receiver operating characteristic curve. This metric is useful for understanding the model's ability to "discriminate" between classes. For example, an AUROC of 0.8 indicates that 80% of the time the model will correctly assign a higher probability of the positive class (1 in our case) to a randomly selected example from the positive class than to a randomly selected example from the negative class (0 in our case). Thus an AUROC of 1.0 would be a perfect score, whereas a score of 0.5 would indicate random guessing.

from sklearn.metrics import roc_auc_score

preds = model.predict_proba(X_test)[:, 1]
auroc = roc_auc_score(y_test.values, preds)
print(f"AUROC: {auroc}")

# AUROC: 0.8850
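The pairwise interpretation of AUROC described above can be verified by hand. The sketch below counts, over every (positive, negative) pair, how often the positive example receives the higher score, with ties counted as half. The labels and scores are hypothetical toy values, and pairwise_auroc is a helper written for this illustration rather than a library function.

```python
def pairwise_auroc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    positives = [score for label, score in zip(labels, scores) if label == 1]
    negatives = [score for label, score in zip(labels, scores) if label == 0]
    wins = 0.0
    for positive_score in positives:
        for negative_score in negatives:
            if positive_score > negative_score:
                wins += 1.0
            elif positive_score == negative_score:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and predicted probabilities
toy_labels = [1, 1, 1, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.2]

toy_auroc = pairwise_auroc(toy_labels, toy_scores)
print(f"Pairwise AUROC: {toy_auroc:.4f}")
```

Five of the six positive/negative pairs are ordered correctly in this toy data, so the result is 5/6.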

Using the insights we've gathered, we can do the following to try to improve our model's performance:

  1. Feature Selection: Use only the most impactful features according to the SHAP summary bar chart. Removing features with low or negligible SHAP values may improve performance.
  2. Feature Engineering: Based on the SHAP values, we could create interaction terms between the most important features.
  3. Parameter Tuning: Use grid or randomized search with cross validation to fine-tune the hyperparameters of our model. For XGBoost, this could include parameters such as max_depth or learning_rate. We will skip this step since it does not directly leverage our insights from SHAP.

We can implement feature selection quite easily by following the same steps as above with only the desired features:

top_features = ["ca", "cp", "oldpeak", "thalach"]
X_train_selected = X_train[top_features]
X_test_selected = X_test[top_features]

feature_selected_model = XGBClassifier().fit(X_train_selected, y_train)

preds = feature_selected_model.predict_proba(X_test_selected)[:, 1]
auroc = roc_auc_score(y_test.values, preds)
print(f"Feature Selected AUROC: {auroc}")

# Feature Selected AUROC: 0.9091

Selecting only the top four features resulted in an improvement to our metrics! Next, let's engineer a new interaction term and see if we can improve our results even further. Taking a look at our SHAP values, we can see that oldpeak and slope, which are both related to exercise, are in our top features. We can try out multiplying the two features together and removing oldpeak:

def add_interaction(df):
  df_copy = df.copy()
  df_copy["oldpeak_x_slope"] = df["oldpeak"].values * df["slope"].values
  return df_copy[["oldpeak_x_slope", "ca", "cp", "thalach"]]

X_train_interaction = add_interaction(X_train)
X_test_interaction = add_interaction(X_test)

interaction_model = XGBClassifier(random_state=42).fit(X_train_interaction, y_train)

preds = interaction_model.predict_proba(X_test_interaction)[:, 1]
auroc = roc_auc_score(y_test.values, preds)
print(f"Feature Interaction AUROC: {auroc}")

# Feature Interaction AUROC: 0.9412

Through feature selection and feature engineering, we were able to improve our model's performance on the held-out test set from 0.8850 ➝ 0.9091 ➝ 0.9412 using our insights from SHAP.

Model-Agnostic SHAP Usage: PyTorch

For our example we used XGBoost and the SHAP TreeExplainer. But what if we want to use SHAP for any machine learning model? Let's take a look at how we can use the SHAP Explainer on a SOTAI Calibrated Linear model (PyTorch):

import sotai
import torch

# Create a SOTAI Pipeline 
pipeline = sotai.Pipeline(
    features=["ca", "cp", "oldpeak", "thalach"],
    name="Heart Pipeline",

# Train a default Calibrated Linear model
trained_model = pipeline.train(heart_df)

# Wrapper function for converting numpy -> tensor and calling model
def f(x):
    model = trained_model.model
    with torch.no_grad():
        output = model(torch.DoubleTensor(x))
        return output.detach().numpy()

# Get test data as numpy
test_data_csv =
test_data_csv.prepare(trained_model.model.features, None)
test_data_numpy = list(

# Run SHAP Explainer
explainer = shap.Explainer(f, test_data_numpy, feature_names=features)
shap_values = explainer(test_data_numpy)

The resulting SHAP values have the same form as before, so you can use the same charting and analysis. Note that the generic Explainer treats f as a black box and estimates the values by calling it repeatedly, which is typically slower than the TreeExplainer. The primary trick to remember here is the wrapper function. Since SHAP passes numpy inputs, we need to convert them to tensors, call the model, and return the model outputs as numpy. The same pattern also works for TensorFlow models.
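To build intuition for what "model-agnostic" means, here is a simplified sketch of a Monte Carlo Shapley estimate that only needs a prediction function. It samples random feature orderings and switches features from a background row to the instance's values one at a time. This illustrates the general idea under simplifying assumptions (a single background row, a hypothetical black_box function); it is not the SHAP library's implementation.

```python
import random


def sampled_shapley(predict, instance, background, n_permutations=200, seed=0):
    """Estimate Shapley values for one instance by sampling feature orderings."""
    rng = random.Random(seed)
    n_features = len(instance)
    totals = [0.0] * n_features
    for _ in range(n_permutations):
        order = list(range(n_features))
        rng.shuffle(order)
        current = list(background)  # start from the background row
        previous_output = predict(current)
        for j in order:
            current[j] = instance[j]  # reveal feature j's actual value
            output = predict(current)
            totals[j] += output - previous_output
            previous_output = output
    return [total / n_permutations for total in totals]


# Hypothetical black-box model with an interaction between features 1 and 2
def black_box(row):
    return 2.0 * row[0] + row[1] * row[2]


instance = [1.0, 2.0, 3.0]
background = [0.0, 0.0, 0.0]

estimates = sampled_shapley(black_box, instance, background)
print(estimates)
print(sum(estimates), black_box(instance) - black_box(background))
```

Because every sampled ordering ends at the full instance, the estimates always sum to the difference between the prediction and the background prediction, no matter what the wrapped model looks like.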

Real-World Applications of SHAP

Now that you know how to implement SHAP, let's take a look at how you can leverage SHAP for some real-world applications.

Healthcare: Explaining Medical Diagnoses

Imagine a scenario where a patient consults a doctor about their risk of heart failure. The doctor uses a black-box machine learning model to assess this risk. The model flags the patient as high-risk, but how should the doctor proceed?

Without further insights, simply telling the patient they are at high risk would raise more questions than answers. Both the doctor and the patient would be left wondering, "Why?" And without a deeper understanding of the model's decision, the doctor might hesitate to rely on it for critical healthcare choices.

Now apply SHAP. The doctor can see that elevated cholesterol and high blood pressure are the primary contributors to the patient's risk. With this information, the doctor can now offer targeted advice, such as recommending lifestyle changes to lower these specific health metrics. The patient leaves not only understanding their risk but also with actionable steps to improve their health. SHAP transformed the black-box model into a valuable tool for the doctor to diagnose and educate patients.

Finance: Explaining Credit Risk

Consider a loan officer evaluating an application for a business loan. They utilize a complex machine learning model to gauge the risk associated with granting the loan. The model classifies the applicant as "high-risk," but what should the loan officer do next?

Simply denying the loan based on a "high-risk" label would likely prompt questions from the applicant, such as, "Why was I deemed high-risk?" Without a transparent model, the loan officer is left without a concrete explanation and may also question the reliability of the model's prediction.

Now apply SHAP. The loan officer can see that a low credit score and high debt-to-income ratio are the primary factors contributing to the high-risk assessment. With this information in hand, the loan officer can confidently communicate the reasons for the decision to the applicant. Moreover, they can suggest specific areas for improvement, such as raising the credit score or lowering debt, before reapplying for the loan. SHAP has opened up the black-box model, turning it into a useful tool to enhance both decision-making and customer relations for the loan officer.

But what if the applicant raises their credit score and receives an even higher risk assessment? This would not be fair to the applicant. Unfortunately SHAP only provides us the means to explain a model's predictions -- we have no guarantees on expected or required behavior. Beyond just explaining the model, we want to make sure that increasing an applicant's credit score will always lower the predicted risk. SOTAI models enable such guarantees through feature constraints, which we discuss in more detail here.
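One lightweight way to probe for the problem described above is to sweep a single feature over a grid while holding everything else fixed and check that the prediction never moves in the wrong direction. The sketch below uses two hypothetical risk functions and a made-up applicant; it is an empirical spot check on one input, not a guarantee, and it does not represent how SOTAI enforces constraints.

```python
def is_monotonically_decreasing(predict, base_row, feature, grid):
    """Check that the prediction never increases as `feature` increases."""
    previous = None
    for value in grid:
        row = dict(base_row, **{feature: value})
        current = predict(row)
        if previous is not None and current > previous + 1e-12:
            return False
        previous = current
    return True


# Hypothetical risk model that always decreases with credit score
def constrained_risk(row):
    return 1.0 / (1.0 + row["credit_score"] / 100.0) + 0.5 * row["debt_to_income"]


# Hypothetical risk model with an unfair bump between 650 and 700
def unconstrained_risk(row):
    bump = 0.2 if 650 <= row["credit_score"] < 700 else 0.0
    return constrained_risk(row) + bump


applicant = {"credit_score": 600, "debt_to_income": 0.4}
score_grid = range(500, 851, 50)

constrained_ok = is_monotonically_decreasing(
    constrained_risk, applicant, "credit_score", score_grid
)
unconstrained_ok = is_monotonically_decreasing(
    unconstrained_risk, applicant, "credit_score", score_grid
)
print(constrained_ok, unconstrained_ok)
```

A check like this can reveal a violation, but passing it for one applicant says nothing about others, which is why constraints built into the model are the stronger option.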

Impact of SHAP on Decision-Making and Model Trust

SHAP provides a granular view of how each feature contributes to a model's predictions, enabling stakeholders to pinpoint key factors that affect outcomes. By making predictions more transparent and understandable, SHAP helps to demystify and build trust in the decision-making process. This trust is vital for gaining stakeholder buy-in, which is crucial for the broader acceptance and deployment of AI solutions for making decisions.

Best Practices and Pitfalls to Avoid

There are some best practices and potential pitfalls you should be aware of to get the most out of your SHAP analyses.

Tips for Effectively Using SHAP

Handling Categorical Variables

Categorical variables often require special attention when interpreting models using SHAP. A common approach during modeling is to one-hot encode each category, which results in a separate feature for each category value (i.e. 1 if that category and 0 otherwise). While this increases the dimensionality of the feature space, it provides a more granular view of how each category influences the model's predictions. For a more holistic view, you may want to aggregate one-hot encoded features back into a single feature when charting SHAP values.
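Because SHAP values are additive, aggregating one-hot encoded columns back into a single feature can be as simple as summing their SHAP values. Here is a minimal sketch for a single data point. The column names (thal_3, thal_6, thal_7) and SHAP values are hypothetical, and aggregate_one_hot is a helper written for this illustration.

```python
def aggregate_one_hot(shap_row, groups):
    """Sum the SHAP values of one-hot columns back into their parent feature."""
    column_to_parent = {
        column: parent for parent, columns in groups.items() for column in columns
    }
    aggregated = {}
    for column, value in shap_row.items():
        # Columns that are not part of any group keep their own name
        parent = column_to_parent.get(column, column)
        aggregated[parent] = aggregated.get(parent, 0.0) + value
    return aggregated


# Hypothetical SHAP values for one data point with a one-hot encoded thal feature
shap_row = {"thal_3": -0.20, "thal_6": 0.05, "thal_7": 0.45, "ca": 0.30}
groups = {"thal": ["thal_3", "thal_6", "thal_7"]}

aggregated = aggregate_one_hot(shap_row, groups)
print(aggregated)
```

The total across all features is unchanged by the aggregation, so the baseline plus the aggregated values still adds up to the prediction.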

Dealing with Correlated Features

SHAP values can sometimes be misleading when features are highly correlated. In such cases, one feature might "steal" the importance of another, leading to skewed interpretations. It might be beneficial to either combine correlated features into a single feature or exclude one to ensure a more accurate attribution.
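Before interpreting SHAP values, it can help to flag highly correlated feature pairs. The sketch below computes Pearson correlations by hand and reports pairs above a threshold. The column names, data, and the 0.9 threshold are hypothetical choices for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def correlated_pairs(columns, threshold=0.9):
    """Return (feature_a, feature_b, r) for every pair with |r| >= threshold."""
    names = sorted(columns)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(columns[first], columns[second])
            if abs(r) >= threshold:
                flagged.append((first, second, r))
    return flagged


# Hypothetical data: the same weight recorded in two different units, plus age
columns = {
    "weight_kg": [60, 72, 85, 90, 100],
    "weight_lb": [132, 159, 187, 198, 220],
    "age": [63, 41, 58, 35, 50],
}

flagged = correlated_pairs(columns)
print(flagged)
```

Pairs flagged this way are candidates for combining or dropping before you rely on their individual SHAP values.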

Common Mistakes to Avoid When Analyzing SHAP Values

Overconfidence in Low-Importance Features

A low SHAP value doesn't necessarily mean the feature is unimportant across all data points. It could be crucial for a subset of predictions. Always consider the context and domain knowledge when interpreting SHAP values.
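A quick way to spot features that matter only for a subset of predictions is to compare the average magnitude of their SHAP values with the largest magnitude. The values below are hypothetical and chosen so that one feature is negligible for most rows but dominant for a single one.

```python
# Hypothetical SHAP values per feature across six data points
shap_by_feature = {
    "fbs": [0.01, -0.02, 0.00, 0.01, 1.20, -0.01],
    "age": [0.20, -0.25, 0.22, -0.18, 0.21, -0.24],
}

summary = {
    name: {
        "mean_abs": sum(abs(value) for value in values) / len(values),
        "max_abs": max(abs(value) for value in values),
    }
    for name, values in shap_by_feature.items()
}

for name, stats in summary.items():
    print(name, stats)
```

In this toy data fbs ranks below age on average, yet it produces by far the largest single contribution, so dropping it based on the global ranking alone would hurt the one prediction where it matters.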

Ignoring Feature Interactions

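While each SHAP value summarizes a feature's contribution as a single number, it might not capture the entire story when features interact significantly, because interaction effects are shared among the features involved. Be aware that more advanced SHAP plots or additional analyses, such as dependence plots or SHAP interaction values, might be required to uncover these interactions.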

Ensuring Ethical Considerations in Model Explanations

While SHAP provides a pathway to model explainability, the ethical responsibility of using the model judiciously still lies with the practitioner. Ensure that your model doesn't reinforce existing biases in the data and be transparent about the limitations of your SHAP analyses.

By following these best practices and remaining aware of potential pitfalls, you'll be better equipped to use SHAP for meaningful and responsible model explainability.

Future Trends in AI Interpretability and Explainability

Evolving Landscape of Interpretable and Explainable AI

  1. Interpretable Models: The traditional trade-off between model complexity and interpretability is being constantly re-evaluated in the AI community. On one hand, there's a resurgence in the use of simpler, transparent models like linear regression and decision trees, which offer inherent interpretability at the cost of potentially lower performance. On the other hand, emerging approaches like calibrated lattice models are gaining attention for their ability to deliver high performance without sacrificing interpretability. These models bridge the gap between complexity and transparency, offering a compelling alternative to traditional black-box models.
  2. Ethical AI: As AI systems become more integrated into decision-making processes across various sectors, the ethical implications of these systems are coming to the forefront. There's a growing push towards creating models that not only perform well but also adhere to ethical standards.
  3. Regulatory Requirements: With regions like the European Union setting guidelines for AI transparency, we can expect similar legislation to emerge globally. Such regulations will necessitate more robust and explainable models, further driving advancements in interpretability techniques.
  4. Human-in-the-Loop Systems: Combining human expertise with AI's data-crunching capabilities is becoming increasingly popular. Explainability tools like SHAP will be crucial in facilitating effective communication between humans and AI, thereby leading to more informed decisions.
  5. Customized Interpretability: As businesses adopt AI solutions tailored to their specific needs, there's a growing demand for customized interpretability tools that can explain complex, domain-specific models.

Emerging Tools and Techniques

  1. Automated Interpretability: Automation in interpretability could provide real-time explanations for AI decisions, making the technology more accessible and easier to trust.
  2. Multi-Model Interpretability: As ensemble models and multi-modal learning gain popularity, interpretability tools will need to evolve to handle the complexity of these systems.
  3. Interactive Visualizations: Advanced visualization tools that allow users to interactively explore model decisions will become commonplace, offering intuitive ways to understand complex algorithms.
  4. Explainability-as-a-Service: With the increasing adoption of cloud-based AI solutions, we might see the rise of cloud platforms offering explainability as an added service.
  5. Integration of LLMs for Explainability: To make AI explanations more relatable to humans, future techniques might integrate large language models into interpretability methods, aiming to explain AI decisions in a way that humans can more naturally understand.


The demand for machine learning models that are not just highly performant but also interpretable is more pressing than ever. SHAP (SHapley Additive exPlanations) stands out as a key tool in the data scientist's current arsenal for achieving this goal. Built upon the robust mathematical foundation of Shapley values, SHAP offers a valuable framework for understanding the complex decision-making processes of black-box models. It equips you with the ability not just to make predictions but to explain them in a way that's both consistent and locally accurate.

If you're a budding data scientist or an established professional looking to bring transparency to your models, we encourage you to further explore SHAP. Resources like the SHAP library, coupled with the increasing wealth of tutorials and community support, make this endeavor feasible and rewarding.

As the AI landscape continues to evolve, responsible AI development that incorporates ethical considerations and transparency will only grow in importance. SHAP serves as a pivotal tool in this journey, helping us build not just more intelligent systems but more understandable and ethical ones.

Additional Resources

References and Recommended Reading

  • "A Unified Approach to Interpreting Model Predictions" by Scott Lundberg and Su-In Lee. This foundational paper introduced SHAP and is a must-read for those interested in the theoretical underpinnings.
  • "Interpretable Machine Learning" by Christoph Molnar. This book provides a comprehensive overview of interpretability in the machine learning context, including a section on SHAP.

Links to SHAP Documentation and Tutorials

  • SHAP GitHub Repository: This is the official GitHub repository containing the SHAP library, examples, and more.
  • SHAP Documentation: Comprehensive documentation that covers everything from installation to advanced usage.
  • Tutorials on Medium: Various tutorials are available that walk you through practical implementations of SHAP in Python.
  • Check out SOTAI's SHAPshot for easily managing, viewing, explaining, and sharing your SHAP package results.