paddlets.xai.post_hoc.shap_explainer

class ShapExplainer(model: Optional[Union[PaddleBaseModel, Pipeline]], background_data: TSDataset, background_sample_number: Union[None, int] = None, shap_method: str = 'kernel', task_type: str = 'regression', seed: int = 123, use_paddleloader: bool = False, **kwargs)[source]

Bases: BaseExplainer

SHAP explainer. This class currently supports only regression models for forecasting tasks. It uses SHAP values to quantify how much each model input contributes to the model output. For details on SHAP, see https://github.com/slundberg/shap.
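To make "contribution of model input to model output" concrete, the following is a minimal sketch of exact Shapley values for a tiny model. It is not how ShapExplainer or the shap library computes its values (they use approximations such as the kernel method); `exact_shapley` and `toy_model` are hypothetical names introduced only for illustration. The point it shows is additivity: the contributions sum to the difference between the prediction and the baseline prediction.

```python
from itertools import permutations


def exact_shapley(predict, x, baseline):
    """Exact Shapley values: average marginal contribution over all orderings."""
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)        # start from the reference input
        previous = predict(current)
        for i in order:
            current[i] = x[i]           # reveal feature i
            value = predict(current)
            phi[i] += value - previous  # marginal contribution of feature i
            previous = value
    return [p / len(orderings) for p in phi]


def toy_model(inputs):
    # Hypothetical stand-in for a forecaster: a weighted sum of three lagged values.
    return 0.5 * inputs[0] + 0.3 * inputs[1] + 0.2 * inputs[2]


x = [10.0, 20.0, 30.0]
baseline = [0.0, 0.0, 0.0]
phi = exact_shapley(toy_model, x, baseline)

# Additivity: contributions account for the whole change in the prediction.
gap = toy_model(x) - toy_model(baseline)
print(phi, sum(phi), gap)
```

Enumerating every ordering is only feasible for a handful of inputs, which is why practical explainers sample instead.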

Parameters

model (PaddleBaseModel|Pipeline) – A model object that provides a predict function.
background_data (TSDataset) – A TSDataset used to train the SHAP explainer.
background_sample_number (int) – Number of instances sampled from background_data.
shap_method (str) – The SHAP method to apply. One of {‘kernel’, ‘deep’}.
task_type (str) – Task type of the model. Only ‘regression’ is supported.
seed (int) – Random seed.
use_paddleloader (bool) – Only takes effect when the model is a PaddleBaseModel.
kwargs – Optional additional keyword arguments passed to shap_method.
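The constructor arguments above can be put together as in the sketch below. It uses only the signatures shown on this page; the wrapper name `explain_latest_sample` and the value chosen for background_sample_number are illustrative assumptions, and the model is assumed to be already fitted.

```python
def explain_latest_sample(model, background_data, foreground_data):
    """Build a kernel-method explainer and explain the most recent sample.

    Assumes `model` is a fitted PaddleBaseModel or Pipeline and that both
    datasets are TSDataset objects.
    """
    # Imported lazily so the sketch can be read without PaddleTS installed.
    from paddlets.xai.post_hoc.shap_explainer import ShapExplainer

    explainer = ShapExplainer(
        model,
        background_data,
        background_sample_number=100,  # illustrative value, not a recommendation
        shap_method="kernel",
        task_type="regression",
        seed=123,
    )
    shap_values = explainer.explain(
        foreground_data,
        nsamples=100,           # documented default
        sample_start_index=-1,  # documented default: the latest sample
        sample_num=1,
    )
    return explainer, shap_values
```

Returning the explainer alongside the values keeps it available for the plotting methods documented further down.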

explain(foreground_data: TSDataset, nsamples: int = 100, sample_start_index: int = - 1, sample_num: int = 1, **kwargs) → ndarray[source]

Calculate the explanatory (SHAP) values of the test samples.

Parameters

foreground_data (TSDataset) – Test data.
nsamples (int) – Number of times to re-evaluate the model when explaining each prediction. More samples give lower-variance estimates of the SHAP values. Only used when shap_method=‘kernel’. Default: 100.
sample_start_index (int) – Start index of the samples in the test data. Defaults to -1, the latest sample.
sample_num (int) – Number of samples taken from the test data. Default: 1.
kwargs – Optional additional keyword arguments passed to shap.explainer.shap_values.

Returns

np.ndarray of shape (out_chunk_len, samples, in_chunk_len + out_chunk_len (known_cov input), feature dims)
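The remark that a larger nsamples lowers the variance of the estimate can be illustrated with a small sampling experiment. The sketch below uses plain permutation sampling rather than the weighted regression of the kernel method, so it only shows the qualitative effect of the sampling budget; `toy_model`, `sampled_shapley`, and `estimate_spread` are hypothetical names.

```python
import random
import statistics


def toy_model(inputs):
    # Hypothetical model with an interaction term, so the ordering matters.
    return inputs[0] * inputs[1] + inputs[2]


def sampled_shapley(predict, x, baseline, feature, nsamples, rng):
    """Estimate one Shapley value from `nsamples` random feature orderings."""
    total = 0.0
    indices = list(range(len(x)))
    for _ in range(nsamples):
        rng.shuffle(indices)
        current = list(baseline)
        for i in indices:
            if i == feature:
                break
            current[i] = x[i]  # features that precede `feature` in this ordering
        without = predict(current)
        current[feature] = x[feature]
        total += predict(current) - without
    return total / nsamples


def estimate_spread(nsamples, repeats=200, seed=123):
    """Variance of the estimate for feature 0 across repeated runs."""
    rng = random.Random(seed)
    estimates = [
        sampled_shapley(toy_model, [1.0, 2.0, 3.0], [0.0, 0.0, 0.0], 0, nsamples, rng)
        for _ in range(repeats)
    ]
    return statistics.pvariance(estimates)


low_budget = estimate_spread(nsamples=5)
high_budget = estimate_spread(nsamples=200)
print(low_budget, high_budget)  # the larger budget gives the smaller spread
```

Each extra sample costs additional model evaluations, so nsamples trades run time against the stability of the explanation.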

get_explanation(out_chunk_index: int = 1, sample_index: int = 0) → ndarray[source]

Get the explanatory output for a given time point within the prediction length.

Parameters

out_chunk_index (int) – The time point within the prediction length. Default: 1.
sample_index (int) – The sample index of the explanatory value. Defaults to 0, the first sample.

Returns

np.ndarray of shape (in_chunk_len + out_chunk_len (known_cov input), feature dims)

plot(method: Optional[Union[str, List[str]]] = None, sample_index: int = 0, **kwargs) → None[source]

Display the SHAP values along different dimensions: ‘OI’ (output time dimension vs. input time dimension), ‘OV’ (output time dimension vs. variable dimension), ‘IV’ (input time dimension vs. variable dimension), ‘I’ (input time dimension), and ‘V’ (variable dimension).

Parameters

method (str|List[str]) – Display method. One or more of {‘OI’, ‘OV’, ‘IV’, ‘I’, ‘V’}.
sample_index (int) – The sample index of the explanatory value. Defaults to 0, the first sample.
kwargs – Other parameters.

Returns

None
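Because method accepts either a single string or a list of strings, a caller may want to normalize and check the value before plotting. The helper below is a hypothetical convenience, not part of PaddleTS; it only encodes the five method names listed above and leaves None untouched, since this page does not say what plot does by default.

```python
from typing import List, Optional, Union

# The five display methods listed in the documentation above.
DOCUMENTED_METHODS = ("OI", "OV", "IV", "I", "V")


def normalize_methods(method: Optional[Union[str, List[str]]]) -> Optional[List[str]]:
    """Return `method` as a list, rejecting names the documentation does not list."""
    if method is None:
        return None  # leave the default behaviour to plot()
    methods = [method] if isinstance(method, str) else list(method)
    unknown = [m for m in methods if m not in DOCUMENTED_METHODS]
    if unknown:
        raise ValueError(f"unknown display method(s): {unknown}")
    return methods


print(normalize_methods("OI"))
print(normalize_methods(["IV", "V"]))
```

With an explainer on which explain has already been called, the result could then be passed on, for example as explainer.plot(method=normalize_methods(["OI", "V"])).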

force_plot(out_chunk_indice: Optional[Union[int, List[int]]] = 1, sample_index: int = 0, **kwargs) → None[source]

Visualize the given SHAP values with an additive force layout.

Parameters

out_chunk_indice (int|List[int]) – The time point or time points within the prediction length. Default: 1.
sample_index (int) – The sample index of the explanatory value. Defaults to 0, the first sample.
kwargs – Optional additional keyword arguments passed to shap.force_plot.

Returns

None

summary_plot(out_chunk_indice: Optional[Union[int, List[int]]] = 1, sample_index: int = 0, **kwargs) → None[source]

Create a SHAP feature importance plot from previously explained samples.

Parameters

out_chunk_indice (int|List[int]) – The time point or time points within the prediction length. Default: 1.
sample_index (int) – The sample index of the explanatory value. Defaults to 0, the first sample.
kwargs – Optional additional keyword arguments passed to shap.summary_plot.

Returns

None