79583922

Date: 2025-04-21 01:38:30
Score: 0.5

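I suspect the issue is that the format of shap_values depends on the model and on the SHAP version. For a binary XGBoost model it is typically a 2D array of shape (n_samples, n_features), whereas for RandomForestClassifier recent SHAP versions return a 3D array of shape (n_samples, n_features, n_classes), which summary_plot cannot use directly.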

You can generate the SHAP plots by slicing shap_values down to a single class, so that what you pass to summary_plot is a 2D (n_samples, n_features) array.

Since I don't have your data, I generated a sample dataset as an example for your reference:

import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Generate sample data
np.random.seed(42)
features = pd.DataFrame({
    "feature_1": np.random.randint(18, 70, size=100),
    "feature_2": np.random.randint(30000, 100000, size=100),
    "feature_3": np.random.randint(1, 4, size=100), 
    "feature_4": np.random.randint(300, 850, size=100),
    "feature_5": np.random.randint(1000, 50000, size=100)
})
target = np.random.randint(0, 2, size=100)
features_names = features.columns.tolist()

# The following code is just like your example.
X_train, X_test, y_train, y_test = train_test_split(features, target, test_size=0.2, random_state=42)
rf_model = RandomForestClassifier(n_estimators=100, random_state=42)
rf_model.fit(X_train, y_train)
y_pred = rf_model.predict(X_test)
explainer = shap.TreeExplainer(rf_model)
shap_values = explainer.shap_values(X_test)

# Keep only the values for class 0 so that summary_plot receives a 2D (n_samples, n_features) array.
shap.summary_plot(shap_values[:,:,0], X_test, feature_names=features_names)
shap.summary_plot(shap_values[:,:,0], X_test, feature_names=features_names, plot_type="bar")

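With the above, the SHAP analysis runs once shap_values is replaced by shap_values[:,:,0].
The third dimension of shap_values for RandomForestClassifier indexes the output classes: [:,:,0] selects the contributions to class 0 and [:,:,1] those to class 1. For a binary classifier the two slices have the same magnitudes with opposite signs, so the bar plot looks the same for either class, while the beeswarm plot is mirrored. If you care about the positive class, use shap_values[:,:,1].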
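
If the code has to work across SHAP versions, a small helper can normalize the output before plotting. This is only a sketch: select_class_shap is a hypothetical name of my own, not part of SHAP, and it assumes the three layouts I have seen (a list with one 2D array per class, a single 3D array, or a plain 2D array). It is demonstrated on dummy NumPy arrays rather than on real explainer output.

```python
import numpy as np

def select_class_shap(shap_values, class_index=0):
    """Return a 2D (n_samples, n_features) array for one class.

    Hypothetical helper, not a SHAP API. Assumed input layouts:
      - list of 2D arrays, one per class (older SHAP output for classifiers)
      - 3D array of shape (n_samples, n_features, n_classes)
      - 2D array of shape (n_samples, n_features), returned unchanged
    """
    if isinstance(shap_values, list):
        return np.asarray(shap_values[class_index])
    values = np.asarray(shap_values)
    if values.ndim == 3:
        return values[:, :, class_index]
    if values.ndim == 2:
        return values
    raise ValueError(f"Unexpected shap_values with {values.ndim} dimensions")

# Dummy data standing in for explainer output: 20 samples, 5 features, 2 classes.
rng = np.random.default_rng(0)
as_3d = rng.normal(size=(20, 5, 2))
as_list = [as_3d[:, :, 0], as_3d[:, :, 1]]
as_2d = as_3d[:, :, 1]

print(select_class_shap(as_3d, 1).shape)    # (20, 5)
print(select_class_shap(as_list, 1).shape)  # (20, 5)
print(select_class_shap(as_2d).shape)       # (20, 5)
```

With the example above you would then call shap.summary_plot(select_class_shap(shap_values, 1), X_test, feature_names=features_names).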

Posted by: 陳俊方