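ML: LoR: Developing a General-Purpose Credit Risk Scorecard Model with Logistic Regression (LoR) on a Credit Card Dataset, a Full Walkthrough Using the scorecardpy Framework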


睫毛高大 2022-09-19 10:23:47 51242
Category: News


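Contents

Full walkthrough: developing a general-purpose credit risk scorecard model with logistic regression (LoR) on a credit card dataset

1. Define the dataset

1.1. View a sample of the data

1.2. Summarize variable types, counts, and related information

2. Data preprocessing

2.1. Variable filtering

2.2. WoE binning of variables

T1. Automatic binning: use the woebin() function

T2. Manual binning: pass a custom breaks_list argument

2.3. Visualize the binned variables and check for monotonicity

2.4. Apply the WoE transformation to the variables

3. Model training

3.1. Split the dataset

3.2. Separate independent and dependent variables

3.3. Build, train, and predict with a logistic regression model

3.4. Model evaluation

4. Model deployment and monitoring

4.1. Model inference: compute credit scores

4.2. Online model evaluation: score stability with PSI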


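Related articles
ML: LoR: Developing a General-Purpose Credit Risk Scorecard Model with Logistic Regression (LoR) on a Credit Card Dataset, a Full Walkthrough Using the scorecardpy Framework
ML: LoR: Developing a General-Purpose Credit Risk Scorecard Model with Logistic Regression (LoR) on a Credit Card Dataset, a Full Walkthrough Using the scorecardpy Framework, Code Implementation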

Full walkthrough: developing a general-purpose credit risk scorecard model with logistic regression (LoR) on a credit card dataset

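1. Define the dataset

Load the German credit dataset. Each record describes a debtor through a set of attributes and labels that debtor as a good or bad credit risk.
Dataset source: UCI Machine Learning Repository: Data Set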

1.1. View a sample of the data

# | status.of.existing.checking.account | duration.in.month | credit.history | purpose | credit.amount | savings.account.and.bonds | present.employment.since | installment.rate.in.percentage.of.disposable.income | personal.status.and.sex | other.debtors.or.guarantors | present.residence.since | property | age.in.years | other.installment.plans | housing | number.of.existing.credits.at.this.bank | job | number.of.people.being.liable.to.provide.maintenance.for | telephone | foreign.worker | creditability
0 | ... < 0 DM | 6 | critical account/ other credits existing (not at this bank) | radio/television | 1169 | unknown/ no savings account | ... >= 7 years | 4 | male : divorced/separated | none | 4 | real estate | 67 | none | own | 2 | skilled employee / official | 1 | yes, registered under the customers name | yes | good
1 | 0 <= ... < 200 DM | 48 | existing credits paid back duly till now | radio/television | 5951 | ... < 100 DM | 1 <= ... < 4 years | 2 | male : divorced/separated | none | 2 | real estate | 22 | none | own | 1 | skilled employee / official | 1 | none | yes | bad
2 | no checking account | 12 | critical account/ other credits existing (not at this bank) | education | 2096 | ... < 100 DM | 4 <= ... < 7 years | 2 | male : divorced/separated | none | 3 | real estate | 49 | none | own | 1 | unskilled - resident | 2 | none | yes | good
3 | ... < 0 DM | 42 | existing credits paid back duly till now | furniture/equipment | 7882 | ... < 100 DM | 4 <= ... < 7 years | 2 | male : divorced/separated | guarantor | 4 | building society savings agreement/ life insurance | 45 | none | for free | 1 | skilled employee / official | 2 | none | yes | good
4 | ... < 0 DM | 24 | delay in paying off in the past | car (new) | 4870 | ... < 100 DM | 1 <= ... < 4 years | 3 | male : divorced/separated | none | 4 | unknown / no property | 53 | none | for free | 2 | skilled employee / official | 2 | none | yes | bad
5 | no checking account | 36 | existing credits paid back duly till now | education | 9055 | unknown/ no savings account | 1 <= ... < 4 years | 2 | male : divorced/separated | none | 4 | unknown / no property | 35 | none | for free | 1 | unskilled - resident | 2 | yes, registered under the customers name | yes | good
6 | no checking account | 24 | existing credits paid back duly till now | furniture/equipment | 2835 | 500 <= ... < 1000 DM | ... >= 7 years | 3 | male : divorced/separated | none | 4 | building society savings agreement/ life insurance | 53 | none | own | 1 | skilled employee / official | 1 | none | yes | good
7 | 0 <= ... < 200 DM | 36 | existing credits paid back duly till now | car (used) | 6948 | ... < 100 DM | 1 <= ... < 4 years | 2 | male : divorced/separated | none | 2 | car or other, not in attribute Savings account/bonds | 35 | none | rent | 1 | management/ self-employed/ highly qualified employee/ officer | 1 | yes, registered under the customers name | yes | good
8 | no checking account | 12 | existing credits paid back duly till now | radio/television | 3059 | ... >= 1000 DM | 4 <= ... < 7 years | 2 | male : divorced/separated | none | 4 | real estate | 61 | none | own | 1 | unskilled - resident | 1 | none | yes | good
9 | 0 <= ... < 200 DM | 30 | critical account/ other credits existing (not at this bank) | car (new) | 5234 | ... < 100 DM | unemployed | 4 | male : divorced/separated | none | 2 | car or other, not in attribute Savings account/bonds | 28 | none | own | 2 | management/ self-employed/ highly qualified employee/ officer | 1 | none | yes | bad
10 | 0 <= ... < 200 DM | 12 | existing credits paid back duly till now | car (new) | 1295 | ... < 100 DM | ... < 1 year | 3 | male : divorced/separated | none | 1 | car or other, not in attribute Savings account/bonds | 25 | none | rent | 1 | skilled employee / official | 1 | none | yes | bad
11 | ... < 0 DM | 48 | existing credits paid back duly till now | business | 4308 | ... < 100 DM | ... < 1 year | 3 | male : divorced/separated | none | 4 | building society savings agreement/ life insurance | 24 | none | rent | 1 | skilled employee / official | 1 | none | yes | bad
12 | 0 <= ... < 200 DM | 12 | existing credits paid back duly till now | radio/television | 1567 | ... < 100 DM | 1 <= ... < 4 years | 1 | male : divorced/separated | none | 1 | car or other, not in attribute Savings account/bonds | 22 | none | own | 1 | skilled employee / official | 1 | yes, registered under the customers name | yes | good
13 | ... < 0 DM | 24 | critical account/ other credits existing (not at this bank) | car (new) | 1199 | ... < 100 DM | ... >= 7 years | 4 | male : divorced/separated | none | 4 | car or other, not in attribute Savings account/bonds | 60 | none | own | 2 | unskilled - resident | 1 | none | yes | bad
14 | ... < 0 DM | 15 | existing credits paid back duly till now | car (new) | 1403 | ... < 100 DM | 1 <= ... < 4 years | 2 | male : divorced/separated | none | 4 | car or other, not in attribute Savings account/bonds | 28 | none | rent | 1 | skilled employee / official | 1 | none | yes | good
15 | ... < 0 DM | 24 | existing credits paid back duly till now | radio/television | 1282 | 100 <= ... < 500 DM | 1 <= ... < 4 years | 4 | male : divorced/separated | none | 2 | car or other, not in attribute Savings account/bonds | 32 | none | own | 1 | unskilled - resident | 1 | none | yes | bad
16 | no checking account | 24 | critical account/ other credits existing (not at this bank) | radio/television | 2424 | unknown/ no savings account | ... >= 7 years | 4 | male : divorced/separated | none | 4 | building society savings agreement/ life insurance | 53 | none | own | 2 | skilled employee / official | 1 | none | yes | good
17 | ... < 0 DM | 30 | no credits taken/ all credits paid back duly | business | 8072 | unknown/ no savings account | ... < 1 year | 2 | male : divorced/separated | none | 3 | car or other, not in attribute Savings account/bonds | 25 | bank | own | 3 | skilled employee / official | 1 | none | yes | good
18 | 0 <= ... < 200 DM | 24 | existing credits paid back duly till now | car (used) | 12579 | ... < 100 DM | ... >= 7 years | 4 | male : divorced/separated | none | 2 | unknown / no property | 44 | none | for free | 1 | management/ self-employed/ highly qualified employee/ officer | 1 | yes, registered under the customers name | yes | bad
19 | no checking account | 24 | existing credits paid back duly till now | radio/television | 3430 | 500 <= ... < 1000 DM | ... >= 7 years | 3 | male : divorced/separated | none | 2 | car or other, not in attribute Savings account/bonds | 31 | none | own | 1 | skilled employee / official | 2 | yes, registered under the customers name | yes | good

1.2. Summarize variable types, counts, and related information

  <class 'pandas.core.frame.DataFrame'>
  RangeIndex: 1000 entries, 0 to 999
  Data columns (total 21 columns):
   #   Column                                                    Non-Null Count  Dtype
  ---  ------                                                    --------------  -----
   0   status.of.existing.checking.account                       1000 non-null   category
   1   duration.in.month                                         1000 non-null   int64
   2   credit.history                                            1000 non-null   category
   3   purpose                                                   1000 non-null   object
   4   credit.amount                                             1000 non-null   int64
   5   savings.account.and.bonds                                 1000 non-null   category
   6   present.employment.since                                  1000 non-null   category
   7   installment.rate.in.percentage.of.disposable.income       1000 non-null   int64
   8   personal.status.and.sex                                   1000 non-null   category
   9   other.debtors.or.guarantors                               1000 non-null   category
   10  present.residence.since                                   1000 non-null   int64
   11  property                                                  1000 non-null   category
   12  age.in.years                                              1000 non-null   int64
   13  other.installment.plans                                   1000 non-null   category
   14  housing                                                   1000 non-null   category
   15  number.of.existing.credits.at.this.bank                   1000 non-null   int64
   16  job                                                       1000 non-null   category
   17  number.of.people.being.liable.to.provide.maintenance.for  1000 non-null   int64
   18  telephone                                                 1000 non-null   category
   19  foreign.worker                                            1000 non-null   category
   20  creditability                                             1000 non-null   object
  dtypes: category(12), int64(7), object(2)
  memory usage: 84.0+ KB

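2. Data preprocessing

2.1. Variable filtering

Use the var_filter function to screen variables by missing rate, IV (information value), and identical-value rate, specifying the target variable y.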

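  var_filter(dt, y, x=None, iv_limit=0.02, missing_limit=0.95,
             identical_limit=0.95, var_rm=None, var_kp=None,
             return_rm_reason=False, positive='bad|1')
  '''
  Purpose: a variable is dropped when its IV is below iv_limit (0.02), its missing rate is above missing_limit (95%), or its identical-value rate (excluding missing values) is above identical_limit (95%).
  Parameters (see the function's own documentation for details):
  var_rm: variables to force-remove; default None.
  var_kp: variables to force-keep; default None.
  return_rm_reason: whether to also return the reason each variable was removed; default False.
  positive: the label value(s) that mark bad samples; default 'bad|1'.
  '''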
age.in.years | other.debtors.or.guarantors | savings.accoun
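The three thresholds in this filtering rule can be sketched in plain Python. This illustrates the rule only and is not scorecardpy's implementation: the helper names, the toy data, and the 0.5 substitution for empty cells are assumptions, and binning of numeric variables is left out. IV is taken here as the sum over levels of (bad share - good share) * ln(bad share / good share).

```python
import math
from collections import Counter


def encode_target(labels, positive="bad|1"):
    """Map raw labels to 1 (bad) / 0 (good) using a 'bad|1'-style spec."""
    positives = set(positive.split("|"))
    return [1 if str(v) in positives else 0 for v in labels]


def information_value(x, y):
    """IV of a categorical variable x against binary labels y (1 = bad)."""
    bad = Counter(v for v, t in zip(x, y) if t == 1)
    good = Counter(v for v, t in zip(x, y) if t == 0)
    total_bad, total_good = sum(bad.values()), sum(good.values())
    iv = 0.0
    for level in set(x):
        # Assumption: use 0.5 in place of an empty cell so the log is defined.
        p_bad = (bad[level] or 0.5) / total_bad
        p_good = (good[level] or 0.5) / total_good
        iv += (p_bad - p_good) * math.log(p_bad / p_good)
    return iv


def missing_rate(x):
    """Share of entries that are None."""
    return sum(v is None for v in x) / len(x)


def identical_rate(x):
    """Share of the most frequent value among non-missing entries."""
    present = [v for v in x if v is not None]
    if not present:  # all missing: the missing-rate check handles this case
        return 0.0
    return max(Counter(present).values()) / len(present)


def removal_reasons(x, y, iv_limit=0.02, missing_limit=0.95, identical_limit=0.95):
    """Return the reasons a variable would be dropped (empty list = keep)."""
    reasons = []
    if information_value(x, y) < iv_limit:
        reasons.append("iv below limit")
    if missing_rate(x) > missing_limit:
        reasons.append("missing rate above limit")
    if identical_rate(x) > identical_limit:
        reasons.append("identical rate above limit")
    return reasons


# Toy data (hypothetical): eight applicants and two candidate variables.
labels = ["bad", "bad", "bad", "good", "good", "good", "good", "bad"]
y = encode_target(labels)
x_info = ["a", "a", "a", "a", "b", "b", "b", "b"]  # separates bad from good
x_const = ["k"] * 8                                # one value everywhere

print(removal_reasons(x_info, y))   # []
print(removal_reasons(x_const, y))  # ['iv below limit', 'identical rate above limit']
```

The informative variable survives all three checks, while the constant one fails on both IV and identical-value rate. Returning the reasons as a list mirrors the idea behind return_rm_reason.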


Article link: https://www.xckfsq.com/news/show.html?id=1753