DL DNN Optimization Techniques: Training a Custom MultiLayerNet (5*100 + ReLU) on MNIST to Compare the Performance of Three Weight Initializations (std=0.01, Xavier, He)




Overview
Approach: experiment with different weight initializations (std=0.01, the Xavier initial value, and the He initial value) to observe how strongly the choice of initial weights affects neural network learning. (A minimal code sketch of the three scalings follows this overview.)
Conclusion: with std=0.01, learning fails completely. The values passed along in forward propagation are tiny (concentrated around 0), so the gradients computed during backpropagation are also tiny and the weights are barely updated. By contrast, with the Xavier and He initial values learning proceeds smoothly, and learning with the He initial value progresses somewhat faster.
Summary: weight initialization matters greatly in neural network training; it often determines whether learning succeeds at all. Its importance is easy to overlook, yet the beginning (the initial value) of anything is always critical.
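For reference, here is a minimal sketch of the three scalings being compared. The helper name init_weight and its signature are illustrative, not taken from the article's code; n_in is the number of input units feeding the layer:

    import numpy as np

    def init_weight(n_in, n_out, mode):
        """Return an (n_in, n_out) weight matrix drawn with the given scaling."""
        if mode == 'std=0.01':
            scale = 0.01                  # small fixed std: activations collapse toward 0
        elif mode == 'Xavier':
            scale = np.sqrt(1.0 / n_in)   # Xavier initial value, suited to sigmoid/tanh
        elif mode == 'He':
            scale = np.sqrt(2.0 / n_in)   # He initial value, suited to ReLU
        else:
            raise ValueError(mode)
        return scale * np.random.randn(n_in, n_out)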

Contents

Output

Design Approach

Core Code


Output

  ===========iteration:0===========
  std=0.01:2.302533896615576
  Xavier:2.301592862642649
  He:2.452819600404312
  ===========iteration:100===========
  std=0.01:2.3021427450183882
  Xavier:2.2492771742332085
  He:1.614645290697084
  ===========iteration:200===========
  std=0.01:2.3019226530108763
  Xavier:2.142875264754691
  He:0.8883226546097108
  ===========iteration:300===========
  std=0.01:2.3021797231413514
  Xavier:1.801154569414849
  He:0.5779849031641334
  ===========iteration:400===========
  std=0.01:2.3012695247928474
  Xavier:1.3899007227604079
  He:0.41014765063844627
  ===========iteration:500===========
  std=0.01:2.3007728429528314
  Xavier:0.9069490262118367
  He:0.33691702821838565
  ===========iteration:600===========
  std=0.01:2.298961977446477
  Xavier:0.7562167106493611
  He:0.3818234934485747
  ===========iteration:700===========
  std=0.01:2.3035037771527715
  Xavier:0.5636724725221689
  He:0.21607562992114449
  ===========iteration:800===========
  std=0.01:2.3034607224422023
  Xavier:0.5658840865099287
  He:0.33168882912900743
  ===========iteration:900===========
  std=0.01:2.305051548224051
  Xavier:0.588201820904584
  He:0.2569635828759095
  ===========iteration:1000===========
  std=0.01:2.2994594023429755
  Xavier:0.4185962336886156
  He:0.20020701131406038
  ===========iteration:1100===========
  std=0.01:2.2981894831572904
  Xavier:0.3963740567004913
  He:0.25746657996551603
  ===========iteration:1200===========
  std=0.01:2.2953607843932193
  Xavier:0.41330568558866765
  He:0.2796398422265146
  ===========iteration:1300===========
  std=0.01:2.2964967978545396
  Xavier:0.39618376387851506
  He:0.2782019670206384
  ===========iteration:1400===========
  std=0.01:2.299861702734514
  Xavier:0.24832216447348573
  He:0.1512273585162205
  ===========iteration:1500===========
  std=0.01:2.3006214773891234
  Xavier:0.3596899255315174
  He:0.2719352219860638
  ===========iteration:1600===========
  std=0.01:2.298109767745866
  Xavier:0.35977950572647455
  He:0.2650267112104039
  ===========iteration:1700===========
  std=0.01:2.301979953517381
  Xavier:0.23664052932406424
  He:0.13415720105707601
  ===========iteration:1800===========
  std=0.01:2.299083895357553
  Xavier:0.2483172887982285
  He:0.14187181238369628
  ===========iteration:1900===========
  std=0.01:2.305385198129093
  Xavier:0.3655424067819445
  He:0.21497438379944553

Design Approach

Core Code

  import numpy as np
  # The layer classes (Affine, ReLU, SoftmaxWithLoss) and the standalone
  # numerical_gradient helper are assumed to be imported from a book-style
  # common/ package; they are not shown in this listing.

  class MultiLayerNet:
      '……'  # docstring and constructor elided in the original listing

      def predict(self, x):
          for layer in self.layers.values():
              x = layer.forward(x)
          return x

      def loss(self, x, t):
          # cross-entropy loss plus L2 weight decay over all weight matrices
          y = self.predict(x)
          weight_decay = 0
          for idx in range(1, self.hidden_layer_num + 2):
              W = self.params['W' + str(idx)]
              weight_decay += 0.5 * self.weight_decay_lambda * np.sum(W ** 2)
          return self.last_layer.forward(y, t) + weight_decay

      def accuracy(self, x, t):
          y = self.predict(x)
          y = np.argmax(y, axis=1)
          if t.ndim != 1:
              t = np.argmax(t, axis=1)
          # compute the accuracy and return it
          accuracy = np.sum(y == t) / float(x.shape[0])
          return accuracy

      def numerical_gradient(self, x, t):
          # T1. numerical_gradient(): compute the gradient by numerical differentiation
          loss_W = lambda W: self.loss(x, t)
          grads = {}
          for idx in range(1, self.hidden_layer_num + 2):
              grads['W' + str(idx)] = numerical_gradient(loss_W, self.params['W' + str(idx)])
              grads['b' + str(idx)] = numerical_gradient(loss_W, self.params['b' + str(idx)])
          return grads

      def gradient(self, x, t):
          # backpropagation: one forward pass, then propagate dout back through the layers
          self.loss(x, t)
          dout = 1
          dout = self.last_layer.backward(dout)
          layers = list(self.layers.values())
          layers.reverse()
          for layer in layers:
              dout = layer.backward(dout)
          grads = {}
          for idx in range(1, self.hidden_layer_num + 2):
              grads['W' + str(idx)] = self.layers['Affine' + str(idx)].dW + self.weight_decay_lambda * self.layers['Affine' + str(idx)].W
              grads['b' + str(idx)] = self.layers['Affine' + str(idx)].db
          return grads

  networks = {}
  train_loss = {}
  for key, weight_type in weight_init_types.items():
      networks[key] = MultiLayerNet(input_size=784, hidden_size_list=[100, 100, 100, 100],
                                    output_size=10, weight_init_std=weight_type)
      train_loss[key] = []

  for i in range(max_iterations):
      # define x_batch and t_batch: draw a random mini-batch
      batch_mask = np.random.choice(train_size, batch_size)
      x_batch = x_train[batch_mask]
      t_batch = t_train[batch_mask]
      for key in weight_init_types.keys():
          grads = networks[key].gradient(x_batch, t_batch)
          optimizer.update(networks[key].params, grads)
          loss = networks[key].loss(x_batch, t_batch)
          train_loss[key].append(loss)
      if i % 100 == 0:
          print("===========" + "iteration:" + str(i) + "===========")
          for key in weight_init_types.keys():
              loss = networks[key].loss(x_batch, t_batch)
              print(key + ":" + str(loss))
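The training loop above assumes that the dataset, hyperparameters, optimizer, and the weight_init_types mapping have already been prepared; those definitions are not part of the listing. The following is a minimal sketch of that setup, modeled on the conventions of the "Deep Learning from Scratch" codebase. The load_mnist and SGD helpers and the exact hyperparameter values are assumptions, not taken from this article:

    import numpy as np
    from dataset.mnist import load_mnist   # assumed helper from a book-style codebase
    from common.optimizer import SGD       # assumed helper from a book-style codebase

    # Load MNIST as flattened, normalized vectors (labels as integer classes).
    (x_train, t_train), (x_test, t_test) = load_mnist(normalize=True)

    train_size = x_train.shape[0]   # 60000 training samples
    batch_size = 128                # assumed mini-batch size
    max_iterations = 2000           # consistent with the 0..1900 iterations logged above

    # Map each experiment key to the weight_init_std argument of MultiLayerNet:
    # a numeric value is used directly as the standard deviation, while an
    # activation name selects Xavier ('sigmoid') or He ('relu') scaling per layer.
    weight_init_types = {'std=0.01': 0.01, 'Xavier': 'sigmoid', 'He': 'relu'}

    optimizer = SGD(lr=0.01)        # plain SGD; the article does not name the optimizer

    # After training, the loss histories collected in train_loss can be compared:
    import matplotlib.pyplot as plt
    for key in weight_init_types.keys():
        plt.plot(np.arange(max_iterations), train_loss[key], label=key)
    plt.xlabel("iterations")
    plt.ylabel("loss")
    plt.legend()
    plt.show()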
