Date	Team	Opponent	Goal Scored	Ball Possession %	Attempts	On-Target	Off-Target	Blocked	Corners	Offsides	Free Kicks	Saves	Pass Accuracy %	Passes	Distance Covered (Kms)	Fouls Committed	Yellow Card	Yellow & Red	Red	Man of the Match	1st Goal	Round	PSO	Goals in PSO	Own goals	Own goal Time
14-06-2018	Russia	Saudi Arabia	5	40	13	7	3	3	6	3	11	0	78	306	118	22	0	0	0	Yes	12	Group Stage	No	0
14-06-2018	Saudi Arabia	Russia	0	60	6	0	3	3	2	1	25	2	86	511	105	10	0	0	0	No		Group Stage	No	0
15-06-2018	Egypt	Uruguay	0	43	8	3	3	2	0	1	7	3	78	395	112	12	2	0	0	No		Group Stage	No	0
15-06-2018	Uruguay	Egypt	1	57	14	4	6	4	5	1	13	3	86	589	111	6	0	0	0	Yes	89	Group Stage	No	0
15-06-2018	Morocco	Iran	0	64	13	3	6	4	5	0	14	2	86	433	101	22	1	0	0	No		Group Stage	No	0	1	90

2. Data preprocessing

2.1 Separating features and labels


df_X    Goal Scored  Ball Possession %  Attempts  ...  Yellow & Red  Red  Goals in PSO
0            5                 40        13  ...             0    0             0
1            0                 60         6  ...             0    0             0
2            0                 43         8  ...             0    0             0
3            1                 57        14  ...             0    0             0
4            0                 64        13  ...             0    0             0
 
[5 rows x 18 columns]
df_y 0     True
1    False
2    False
3     True
4    False
Name: Man of the Match, dtype: bool
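
The output above is consistent with keeping only the integer-typed columns as features and turning `Man of the Match` into a boolean label. A minimal sketch of that step, assuming pandas; the frame below is hand-built from five rows and a few columns of the table at the top, and the column-selection rule is an assumption rather than something the source confirms:

```python
import numpy as np
import pandas as pd

# Small stand-in for the full table: five rows, a subset of the columns.
df = pd.DataFrame({
    "Team": ["Russia", "Saudi Arabia", "Egypt", "Uruguay", "Morocco"],
    "Goal Scored": [5, 0, 0, 1, 0],
    "Ball Possession %": [40, 60, 43, 57, 64],
    "Attempts": [13, 6, 8, 14, 13],
    "1st Goal": [12, np.nan, np.nan, 89, np.nan],  # missing values make this column float
    "Man of the Match": ["Yes", "No", "No", "Yes", "No"],
})

# Label: True when the team had the Man of the Match.
df_y = df["Man of the Match"] == "Yes"

# Features (assumed rule): keep only the integer-typed columns.
feature_names = [c for c in df.columns if pd.api.types.is_integer_dtype(df[c])]
df_X = df[feature_names]

print(df_X.shape)
print(df_y.tolist())
```

Under this rule, columns with missing entries such as `1st Goal` are stored as floats and drop out of the feature set, as do text columns such as `Team`.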

3. Model building and training

3.1 Splitting the dataset
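
The split code is not shown here. One possible sketch, assuming scikit-learn's `train_test_split`; the toy `df_X` / `df_y` and the `test_size` and `random_state` values are illustrative assumptions:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in for df_X / df_y: 20 rows, two integer features.
df_X = pd.DataFrame({
    "Goal Scored": np.arange(20) % 5,
    "Attempts": np.arange(20) + 5,
})
df_y = df_X["Goal Scored"] >= 2

# test_size and random_state are assumed values, fixed for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    df_X, df_y, test_size=0.25, random_state=1
)

print(X_train.shape, X_test.shape)
```

The split shuffles rows but keeps their original index labels, so a row in the test set can still be traced back to the full table.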
3.2 Model training
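
The model type is not stated in this section. As a hedged illustration only, the sketch below fits a scikit-learn `RandomForestClassifier` (an assumption) on toy data; any classifier exposing `predict_proba` would fit the later steps equally well:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Toy stand-in for df_X / df_y: 40 rows, two integer features.
df_X = pd.DataFrame({
    "Goal Scored": np.arange(40) % 5,
    "Attempts": np.arange(40) % 7 + 5,
})
df_y = df_X["Goal Scored"] >= 2

X_train, X_test, y_train, y_test = train_test_split(df_X, df_y, random_state=1)

# Assumed model choice and hyperparameters.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

print(model.classes_)               # class order used by predict_proba
print(model.score(X_test, y_test))  # accuracy on the held-out rows
```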

4. Explaining and visualizing model feature importance

4.1 Explaining and visualizing a single sample with SHAP values

(1) Pick one sample and convert it to array format


Current test sample: 5
 Goal Scored                 2
Ball Possession %          38
Attempts                   13
On-Target                   7
Off-Target                  4
Blocked                     2
Corners                     6
Offsides                    1
Free Kicks                 18
Saves                       1
Pass Accuracy %            69
Passes                    399
Distance Covered (Kms)    148
Fouls Committed            25
Yellow Card                 1
Yellow & Red                0
Red                         0
Goals in PSO                3
Name: 118, dtype: int64
True label of the current test sample: False
Predicted probabilities for the current test sample: [[0.29 0.71]]


Current test sample: 7
 Goal Scored                 0
Ball Possession %          53
Attempts                   16
On-Target                   4
Off-Target                 10
Blocked                     2
Corners                     7
Offsides                    1
Free Kicks                 20
Saves                       1
Pass Accuracy %            77
Passes                    466
Distance Covered (Kms)    107
Fouls Committed            23
Yellow Card                 1
Yellow & Red                0
Red                         0
Goals in PSO                0
Name: 35, dtype: int64
True label of the current test sample: False
Predicted probabilities for the current test sample: [[0.56 0.44]]
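
The two printouts above can be reproduced in shape, though not in values, by a sketch like the following. It assumes a scikit-learn classifier and toy data; `row_to_show`, the model choice, and the feature columns are illustrative assumptions:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Toy stand-in for df_X / df_y: 40 rows, two integer features.
df_X = pd.DataFrame({
    "Goal Scored": np.arange(40) % 5,
    "Attempts": np.arange(40) % 7 + 5,
})
df_y = df_X["Goal Scored"] >= 2
X_train, X_test, y_train, y_test = train_test_split(df_X, df_y, random_state=1)

# Fit on plain arrays so that predicting on an array later is consistent.
model = RandomForestClassifier(random_state=0)
model.fit(X_train.to_numpy(), y_train.to_numpy())

row_to_show = 5                      # position within the test set
sample = X_test.iloc[row_to_show]    # a Series; sample.name is the original index label
sample_array = sample.to_numpy().reshape(1, -1)  # 2-D array with a single row

proba = model.predict_proba(sample_array)

print("Current test sample:", row_to_show, "\n", sample)
print("True label:", y_test.iloc[row_to_show])
print("Predicted probabilities:", proba)
```

The number after "Current test sample" is the row's position in the test set, while the `Name:` line in the printout is that row's index label in the full table. For a scikit-learn classifier, the columns of `predict_proba` follow `model.classes_`, here `[False, True]`. Read that way, sample 5 gets 0.71 for `True` although its real label is `False`, and sample 7 gets 0.56 for `False`, which matches its label.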