写给程序员的机器学习入门 (二) - pytorch 与矩阵计算入门（二）

风晓 2023-12-31 09:49:15  53478 赞同 0 反对 0

分类：资源

接上一篇写给程序员的机器学习入门 (二) - pytorch 与矩阵计算入门（一）

pytorch 的损失计算器封装 (loss function)

pytorch 提供了几种常见的损失计算器的封装，我们最开始看到的也称 L1 损失 (L1 Loss)，表示所有预测输出与正确输出的相差的绝对值的平均 (有的场景会有多个输出)，以下是使用 L1 损失的例子：

# 定义参数
>>> w = torch.tensor(1.0, requires_grad=True)
>>> b = torch.tensor(0.0, requires_grad=True)

# 定义输入和输出的 tensor
# 注意 pytorch 提供的损失计算器要求预测输出和正确输出均为浮点数，所以定义输入与输出的时候也需要用浮点数
>>> x = torch.tensor(2.0)
>>> y = torch.tensor(5.0)

# 创建损失计算器
>>> loss_function = torch.nn.L1Loss()

# 计算预测输出
>>> p = x * w + b
>>> p
tensor(2., grad_fn=<AddBackward0>)

# 计算损失
# 等同于 (p - y).abs().mean()
>>> l = loss_function(p, y)
>>> l
tensor(3., grad_fn=<L1LossBackward>)

而计算相差值的平方作为损失称为 MSE 损失 (Mean Squared Error)，有的地方又称 L2 损失，以下是使用 MSE 损失的例子：

# 定义参数
>>> w = torch.tensor(1.0, requires_grad=True)
>>> b = torch.tensor(0.0, requires_grad=True)

# 定义输入和输出的 tensor
>>> x = torch.tensor(2.0)
>>> y = torch.tensor(5.0)

# 创建损失计算器
>>> loss_function = torch.nn.MSELoss()

# 计算预测输出
>>> p = x * w + b
>>> p
tensor(2., grad_fn=<AddBackward0>)

# 计算损失
# 等同于 ((p - y) ** 2).mean()
>>> l = loss_function(p, y)
>>> l
tensor(9., grad_fn=<MseLossBackward>)

方便叭🙂️，如果你想看更多的损失计算器可以参考以下地址：

https://pytorch.org/docs/stable/nn.html#loss-functions

pytorch 的参数调整器封装 (optimizer)

pytorch 还提供了根据导函数值调整参数的调整器封装，我们在这两篇文章中看到的方法 (随机初始化参数值，然后根据导函数值 * 学习比率调整参数减少损失) 又称随机梯度下降法 (Stochastic Gradient Descent)，以下是使用封装好的调整器的例子：

# 定义参数
>>> w = torch.tensor(1.0, requires_grad=True)
>>> b = torch.tensor(0.0, requires_grad=True)

# 定义输入和输出的 tensor
>>> x = torch.tensor(2.0)
>>> y = torch.tensor(5.0)

# 创建损失计算器
>>> loss_function = torch.nn.MSELoss()

# 创建参数调整器
# 需要传入参数列表和指定学习比率，这里的学习比率是 0.01
>>> optimizer = torch.optim.SGD([w, b], lr=0.01)

# 计算预测输出
>>> p = x * w + b
>>> p
tensor(2., grad_fn=<AddBackward0>)

# 计算损失
>>> l = loss_function(p, y)
>>> l
tensor(9., grad_fn=<MseLossBackward>)

# 从损失自动微分求导函数值
>>> l.backward()

# 确认参数的导函数值
>>> w.grad
tensor(-12.)
>>> b.grad
tensor(-6.)

# 使用参数调整器调整参数
# 等同于:
# with torch.no_grad():
#     w -= w.grad * learning_rate
#     b -= b.grad * learning_rate
optimizer.step()

# 清空导函数值
# 等同于:
# w.grad.zero_()
# b.grad.zero_()
optimizer.zero_grad()

# 确认调整后的参数
>>> w
tensor(1.1200, requires_grad=True)
>>> b
tensor(0.0600, requires_grad=True)
>>> w.grad
tensor(0.)
>>> b.grad
tensor(0.)

SGD 参数调整器的学习比率是固定的，如果我们想在学习过程中自动调整学习比率，可以使用其他参数调整器，例如 Adam 调整器。此外，你还可以开启冲量 (momentum) 选项改进学习速度，该选项开启后可以在参数调整时参考前一次调整的方向 (正负)，如果相同则调整更多，而不同则调整更少。

如果你对 Adam 调整器的实现和冲量的实现有兴趣，可以参考以下文章 (需要一定的数学知识):

https://mlfromscratch.com/optimizers-explained

如果你想查看 pytorch 提供的其他参数调整器可以访问以下地址：

https://pytorch.org/docs/stable/optim.html

使用 pytorch 实现上一篇文章的例子

好了，学到这里我们应该对 pytorch 的基本操作有一定了解，现在我们来试试用 pytorch 实现上一篇文章最后的例子。

上一篇文章最后的例子代码如下：

# 定义参数
weight = 1
bias = 0

# 定义学习比率
learning_rate = 0.01

# 准备训练集，验证集和测试集
traning_set = [(2, 5), (5, 11), (6, 13), (7, 15), (8, 17)]
validating_set = [(12, 25), (1, 3)]
testing_set = [(9, 19), (13, 27)]

# 记录 weight 与 bias 的历史值
weight_history = [weight]
bias_history = [bias]

for epoch in range(1, 10000):
    print(f"epoch: {epoch}")

    # 根据训练集训练并修改参数
    for x, y in traning_set:
        # 计算预测值
        predicted = x * weight + bias
        # 计算损失
        diff = predicted - y
        loss = diff ** 2
        # 打印除错信息
        print(f"traning x: {x}, y: {y}, predicted: {predicted}, loss: {loss}, weight: {weight}, bias: {bias}")
        # 计算导函数值
        derivative_weight = 2 * diff * x
        derivative_bias = 2 * diff
        # 修改 weight 和 bias 以减少 loss
        # diff 为正时代表预测输出 > 正确输出，会减少 weight 和 bias
        # diff 为负时代表预测输出 < 正确输出，会增加 weight 和 bias
        weight -= derivative_weight * learning_rate
        bias -= derivative_bias * learning_rate
        # 记录 weight 和 bias 的历史值
        weight_history.append(weight)
        bias_history.append(bias)

    # 检查验证集
    validating_accuracy = 0
    for x, y in validating_set:
        predicted = x * weight + bias
        validating_accuracy += 1 - abs(y - predicted) / y
        print(f"validating x: {x}, y: {y}, predicted: {predicted}")
    validating_accuracy /= len(validating_set)

    # 如果验证集正确率大于 99 %，则停止训练
    print(f"validating accuracy: {validating_accuracy}")
    if validating_accuracy > 0.99:
        break

# 检查测试集
testing_accuracy = 0
for x, y in testing_set:
    predicted = x * weight + bias
    testing_accuracy += 1 - abs(y - predicted) / y
    print(f"testing x: {x}, y: {y}, predicted: {predicted}")
testing_accuracy /= len(testing_set)
print(f"testing accuracy: {testing_accuracy}")

# 显示 weight 与 bias 的变化
from matplotlib import pyplot
pyplot.plot(weight_history, label="weight")
pyplot.plot(bias_history, label="bias")
pyplot.legend()
pyplot.show()

使用 pytorch 实现后代码如下:

# 引用 pytorch
import torch

# 定义参数
weight = torch.tensor(1.0, requires_grad=True)
bias = torch.tensor(0.0, requires_grad=True)

# 创建损失计算器
loss_function = torch.nn.MSELoss()

# 创建参数调整器
optimizer = torch.optim.SGD([weight, bias], lr=0.01)

# 准备训练集，验证集和测试集
traning_set = [
    (torch.tensor(2.0), torch.tensor(5.0)),
    (torch.tensor(5.0), torch.tensor(11.0)),
    (torch.tensor(6.0), torch.tensor(13.0)),
    (torch.tensor(7.0), torch.tensor(15.0)),
    (torch.tensor(8.0), torch.tensor(17.0))
]
validating_set = [
    (torch.tensor(12.0), torch.tensor(25.0)),
    (torch.tensor(1.0), torch.tensor(3.0))
]
testing_set = [
    (torch.tensor(9.0), torch.tensor(19.0)),
    (torch.tensor(13.0), torch.tensor(27.0))
]

# 记录 weight 与 bias 的历史值
weight_history = [weight.item()]
bias_history = [bias.item()]

for epoch in range(1, 10000):
    print(f"epoch: {epoch}")

    # 根据训练集训练并修改参数
    for x, y in traning_set:
        # 计算预测值
        predicted = x * weight + bias
        # 计算损失
        loss = loss_function(predicted, y)
        # 打印除错信息
        print(f"traning x: {x}, y: {y}, predicted: {predicted}, loss: {loss}, weight: {weight}, bias: {bias}")
        # 从损失自动微分求导函数值
        loss.backward()
        # 使用参数调整器调整参数
        optimizer.step()
        # 清空导函数值
        optimizer.zero_grad()
        # 记录 weight 和 bias 的历史值
        weight_history.append(weight.item())
        bias_history.append(bias.item())

    # 检查验证集
    validating_accuracy = 0
    for x, y in validating_set:
        predicted = x * weight.item() + bias.item()
        validating_accuracy += 1 - abs(y - predicted) / y
        print(f"validating x: {x}, y: {y}, predicted: {predicted}")
    validating_accuracy /= len(validating_set)

    # 如果验证集正确率大于 99 %，则停止训练
    print(f"validating accuracy: {validating_accuracy}")
    if validating_accuracy > 0.99:
        break

# 检查测试集
testing_accuracy = 0
for x, y in testing_set:
    predicted = x * weight.item() + bias.item()
    testing_accuracy += 1 - abs(y - predicted) / y
    print(f"testing x: {x}, y: {y}, predicted: {predicted}")
testing_accuracy /= len(testing_set)
print(f"testing accuracy: {testing_accuracy}")

# 显示 weight 与 bias 的变化
from matplotlib import pyplot
pyplot.plot(weight_history, label="weight")
pyplot.plot(bias_history, label="bias")
pyplot.legend()
pyplot.show()

输出如下:

epoch: 1
traning x: 2.0, y: 5.0, predicted: 2.0, loss: 9.0, weight: 1.0, bias: 0.0
traning x: 5.0, y: 11.0, predicted: 5.659999847412109, loss: 28.515602111816406, weight: 1.1200000047683716, bias: 0.05999999865889549
traning x: 6.0, y: 13.0, predicted: 10.090799331665039, loss: 8.463448524475098, weight: 1.6540000438690186, bias: 0.16679999232292175
traning x: 7.0, y: 15.0, predicted: 14.246713638305664, loss: 0.5674403309822083, weight: 2.0031042098999023, bias: 0.22498400509357452
traning x: 8.0, y: 17.0, predicted: 17.108564376831055, loss: 0.011786224320530891, weight: 2.1085643768310547, bias: 0.24004973471164703
validating x: 12.0, y: 25.0, predicted: 25.33220863342285
validating x: 1.0, y: 3.0, predicted: 2.3290724754333496
validating accuracy: 0.8815345764160156
epoch: 2
traning x: 2.0, y: 5.0, predicted: 4.420266628265381, loss: 0.3360907733440399, weight: 2.0911941528320312, bias: 0.2378784418106079
traning x: 5.0, y: 11.0, predicted: 10.821391105651855, loss: 0.03190113604068756, weight: 2.1143834590911865, bias: 0.24947310984134674
traning x: 6.0, y: 13.0, predicted: 13.04651165008545, loss: 0.002163333585485816, weight: 2.132244348526001, bias: 0.25304529070854187
traning x: 7.0, y: 15.0, predicted: 15.138755798339844, loss: 0.019253171980381012, weight: 2.1266629695892334, bias: 0.25211507081985474
traning x: 8.0, y: 17.0, predicted: 17.107236862182617, loss: 0.011499744839966297, weight: 2.1072371006011963, bias: 0.24933995306491852
validating x: 12.0, y: 25.0, predicted: 25.32814598083496
validating x: 1.0, y: 3.0, predicted: 2.3372745513916016
validating accuracy: 0.8829828500747681
epoch: 3
traning x: 2.0, y: 5.0, predicted: 4.427353858947754, loss: 0.32792359590530396, weight: 2.0900793075561523, bias: 0.24719521403312683
traning x: 5.0, y: 11.0, predicted: 10.82357406616211, loss: 0.0311261098831892, weight: 2.112985134124756, bias: 0.2586481273174286
traning x: 6.0, y: 13.0, predicted: 13.045942306518555, loss: 0.002110695466399193, weight: 2.1306276321411133, bias: 0.26217663288116455
traning x: 7.0, y: 15.0, predicted: 15.137059211730957, loss: 0.018785227090120316, weight: 2.1251144409179688, bias: 0.2612577974796295
traning x: 8.0, y: 17.0, predicted: 17.105924606323242, loss: 0.011220022104680538, weight: 2.105926036834717, bias: 0.2585166096687317
validating x: 12.0, y: 25.0, predicted: 25.324134826660156
validating x: 1.0, y: 3.0, predicted: 2.3453762531280518
validating accuracy: 0.8844133615493774

省略途中的输出

epoch: 202
traning x: 2.0, y: 5.0, predicted: 4.950470924377441, loss: 0.0024531292729079723, weight: 2.0077908039093018, bias: 0.9348894953727722
traning x: 5.0, y: 11.0, predicted: 10.984740257263184, loss: 0.00023285974748432636, weight: 2.0097720623016357, bias: 0.9358800649642944
traning x: 6.0, y: 13.0, predicted: 13.003972053527832, loss: 1.5777208318468183e-05, weight: 2.0112979412078857, bias: 0.9361852407455444
traning x: 7.0, y: 15.0, predicted: 15.011855125427246, loss: 0.00014054399798624218, weight: 2.0108213424682617, bias: 0.9361057877540588
traning x: 8.0, y: 17.0, predicted: 17.00916290283203, loss: 8.39587883092463e-05, weight: 2.0091617107391357, bias: 0.9358686804771423
validating x: 12.0, y: 25.0, predicted: 25.028034210205078
validating x: 1.0, y: 3.0, predicted: 2.9433810710906982
validating accuracy: 0.9900028705596924
testing x: 9.0, y: 19.0, predicted: 19.004947662353516
testing x: 13.0, y: 27.0, predicted: 27.035730361938477
testing accuracy: 0.9992080926895142

同样的训练成功了😼。你可能会发现输出的值和前一篇文章的值有一些不同，这是因为 pytorch 默认使用 32 位浮点数 (float32) 进行运算，而 python 使用的是 64 位浮点数 (float64), 如果你把参数定义的部分改成这样：

# 定义参数
weight = torch.tensor(1.0, dtype=torch.float64, requires_grad=True)
bias = torch.tensor(0.0, dtype=torch.float64, requires_grad=True)

然后计算损失的部分改成这样，则可以得到和前一篇文章一样的输出：

# 计算损失
loss = loss_function(predicted, y.double())

使用矩阵乘法实现批次训练

前面的例子虽然使用 pytorch 实现了训练，但还是一个一个值的计算，我们可以用矩阵乘法来实现批次训练，一次计算多个值，以下修改后的代码：

# 引用 pytorch
import torch

# 定义参数
weight = torch.tensor([[1.0]], requires_grad=True) # 1 行 1 列
bias = torch.tensor(0.0, requires_grad=True)

# 创建损失计算器
loss_function = torch.nn.MSELoss()

# 创建参数调整器
optimizer = torch.optim.SGD([weight, bias], lr=0.01)

# 准备训练集，验证集和测试集
traning_set_x = torch.tensor([[2.0], [5.0], [6.0], [7.0], [8.0]]) # 5 行 1 列，代表有 5 组，每组有 1 个输入
traning_set_y = torch.tensor([[5.0], [11.0], [13.0], [15.0], [17.0]]) # 5 行 1 列，代表有 5 组，每组有 1 个输出
validating_set_x = torch.tensor([[12.0], [1.0]]) # 2 行 1 列，代表有 2 组，每组有 1 个输入
validating_set_y = torch.tensor([[25.0], [3.0]]) # 2 行 1 列，代表有 2 组，每组有 1 个输出
testing_set_x = torch.tensor([[9.0], [13.0]]) # 2 行 1 列，代表有 2 组，每组有 1 个输入
testing_set_y = torch.tensor([[19.0], [27.0]]) # 2 行 1 列，代表有 2 组，每组有 1 个输出

# 记录 weight 与 bias 的历史值
weight_history = [weight[0][0].item()]
bias_history = [bias.item()]

for epoch in range(1, 10000):
    print(f"epoch: {epoch}")

    # 根据训练集训练并修改参数

    # 计算预测值
    # 5 行 1 列的矩阵乘以 1 行 1 列的矩阵，会得出 5 行 1 列的矩阵
    predicted = traning_set_x.mm(weight) + bias
    # 计算损失
    loss = loss_function(predicted, traning_set_y)
    # 打印除错信息
    print(f"traning x: {traning_set_x}, y: {traning_set_y}, predicted: {predicted}, loss: {loss}, weight: {weight}, bias: {bias}")
    # 从损失自动微分求导函数值
    loss.backward()
    # 使用参数调整器调整参数
    optimizer.step()
    # 清空导函数值
    optimizer.zero_grad()
    # 记录 weight 和 bias 的历史值
    weight_history.append(weight[0][0].item())
    bias_history.append(bias.item())

    # 检查验证集
    with torch.no_grad(): # 禁止自动微分功能
        predicted = validating_set_x.mm(weight) + bias
        validating_accuracy = 1 - ((validating_set_y - predicted).abs() / validating_set_y).mean()
    print(f"validating x: {validating_set_x}, y: {validating_set_y}, predicted: {predicted}")

    # 如果验证集正确率大于 99 %，则停止训练
    print(f"validating accuracy: {validating_accuracy}")
    if validating_accuracy > 0.99:
        break

# 检查测试集
with torch.no_grad(): # 禁止自动微分功能
    predicted = testing_set_x.mm(weight) + bias
    testing_accuracy = 1 - ((testing_set_y - predicted).abs() / testing_set_y).mean()
print(f"testing x: {testing_set_x}, y: {testing_set_y}, predicted: {predicted}")
print(f"testing accuracy: {testing_accuracy}")

# 显示 weight 与 bias 的变化
from matplotlib import pyplot
pyplot.plot(weight_history, label="weight")
pyplot.plot(bias_history, label="bias")
pyplot.legend()
pyplot.show()

输出如下:

epoch: 1
traning x: tensor([[2.],
        [5.],
        [6.],
        [7.],
        [8.]]), y: tensor([[ 5.],
        [11.],
        [13.],
        [15.],
        [17.]]), predicted: tensor([[2.],
        [5.],
        [6.],
        [7.],
        [8.]], grad_fn=<AddBackward0>), loss: 47.79999923706055, weight: tensor([[1.]], requires_grad=True), bias: 0.0
validating x: tensor([[12.],
        [ 1.]]), y: tensor([[25.],
        [ 3.]]), predicted: tensor([[22.0200],
        [ 1.9560]])
validating accuracy: 0.7663999795913696
epoch: 2
traning x: tensor([[2.],
        [5.],
        [6.],
        [7.],
        [8.]]), y: tensor([[ 5.],
        [11.],
        [13.],
        [15.],
        [17.]]), predicted: tensor([[ 3.7800],
        [ 9.2520],
        [11.0760],
        [12.9000],
        [14.7240]], grad_fn=<AddBackward0>), loss: 3.567171573638916, weight: tensor([[1.8240]], requires_grad=True), bias: 0.13199999928474426
validating x: tensor([[12.],
        [ 1.]]), y: tensor([[25.],
        [ 3.]]), predicted: tensor([[24.7274],
        [ 2.2156]])
validating accuracy: 0.8638148307800293

省略途中的输出

epoch: 1103
traning x: tensor([[2.],
        [5.],
        [6.],
        [7.],
        [8.]]), y: tensor([[ 5.],
        [11.],
        [13.],
        [15.],
        [17.]]), predicted: tensor([[ 4.9567],
        [10.9867],
        [12.9966],
        [15.0066],
        [17.0166]], grad_fn=<AddBackward0>), loss: 0.0004764374461956322, weight: tensor([[2.0100]], requires_grad=True), bias: 0.936755359172821
validating x: tensor([[12.],
        [ 1.]]), y: tensor([[25.],
        [ 3.]]), predicted: tensor([[25.0564],
        [ 2.9469]])
validating accuracy: 0.99001544713974
testing x: tensor([[ 9.],
        [13.]]), y: tensor([[19.],
        [27.]]), predicted: tensor([[19.0265],
        [27.0664]])
testing accuracy: 0.998073160648346

嗯？这回怎么用了 1103 次才训练成功？这是因为 weight 和 bias 调整的方向始终都是一致的，所以只用一个批次训练反而会更慢。在之后的文章中，我们会用更多的参数 (神经元) 来训练，而它们可以有不同的调整方向，所以不会出现这个例子中的问题。当然，业务上有的时候会出现因为参数调整方向全部一致导致训练很慢，或者根本无法收敛的问题，这个时候我们可以通过更换模型，或者切分多个批次来解决。

划分训练集，验证集和测试集的例子

上面的例子定义训练集，验证集和测试集的时候都是一个个 tensor 的定义，有没有觉得很麻烦？我们可以通过 pytorch 提供的 tensor 操作来更方便的划分它们：

# 原始数据集
>>> dataset = [(1, 3), (2, 5), (5, 11), (6, 13), (7, 15), (8, 17), (9, 19), (12, 25), (13, 27)]

# 转换原始数据集到 tensor，并且指定数值类型为浮点数
>>> dataset_tensor = torch.tensor(dataset, dtype=torch.float32)
>>> dataset_tensor
tensor([[ 1.,  3.],
        [ 2.,  5.],
        [ 5., 11.],
        [ 6., 13.],
        [ 7., 15.],
        [ 8., 17.],
        [ 9., 19.],
        [12., 25.],
        [13., 27.]])

# 给随机数生成器分配一个初始值，使得每次运行都可以生成相同的随机数
# 这是为了让训练过程可重现，你也可以选择不这样做
>>> torch.random.manual_seed(0)
<torch._C.Generator object at 0x10cc03070>

# 生成随机索引值, 用于打乱数据顺序防止分布不均
>>> dataset_tensor.shape
torch.Size([9, 2])
>>> random_indices = torch.randperm(dataset_tensor.shape[0])
>>> random_indices
tensor([8, 0, 2, 3, 7, 1, 4, 5, 6])

# 计算训练集，验证集和测试集的索引值列表
# 60 % 的数据划分到训练集，20 % 的数据划分到验证集，20 % 的数据划分到测试集
>>> traning_indices = random_indices[:int(len(random_indices)*0.6)]
>>> traning_indices
tensor([8, 0, 2, 3, 7])
>>> validating_indices = random_indices[int(len(random_indices)*0.6):int(len(random_indices)*0.8):]
>>> validating_indices
tensor([1, 4])
>>> testing_indices = random_indices[int(len(random_indices)*0.8):]
>>> testing_indices
tensor([5, 6])

# 划分训练集，验证集和测试集
>>> traning_set_x = dataset_tensor[traning_indices][:,:1] # 第一维度不截取，第二维度截取索引值小于 1 的元素
>>> traning_set_y = dataset_tensor[traning_indices][:,1:] # 第一维度不截取，第二维度截取索引值大于或等于 1 的元素
>>> traning_set_x
tensor([[13.],
        [ 1.],
        [ 5.],
        [ 6.],
        [12.]])
>>> traning_set_y
tensor([[27.],
        [ 3.],
        [11.],
        [13.],
        [25.]])
>>> validating_set_x = dataset_tensor[validating_indices][:,:1]
>>> validating_set_y = dataset_tensor[validating_indices][:,1:]
>>> validating_set_x
tensor([[2.],
        [7.]])
>>> validating_set_y
tensor([[ 5.],
        [15.]])
>>> testing_set_x = dataset_tensor[testing_indices][:,:1]
>>> testing_set_y = dataset_tensor[testing_indices][:,1:]
>>> testing_set_x
tensor([[8.],
        [9.]])
>>> testing_set_y
tensor([[17.],
        [19.]])

写成代码如下：

# 原始数据集
dataset = [(1, 3), (2, 5), (5, 11), (6, 13), (7, 15), (8, 17), (9, 19), (12, 25), (13, 27)]

# 转换原始数据集到 tensor
dataset_tensor = torch.tensor(dataset, dtype=torch.float32)

# 给随机数生成器分配一个初始值，使得每次运行都可以生成相同的随机数
torch.random.manual_seed(0)

# 切分训练集，验证集和测试集
random_indices = torch.randperm(dataset_tensor.shape[0])
traning_indices = random_indices[:int(len(random_indices)*0.6)]
validating_indices = random_indices[int(len(random_indices)*0.6):int(len(random_indices)*0.8):]
testing_indices = random_indices[int(len(random_indices)*0.8):]
traning_set_x = dataset_tensor[traning_indices][:,:1]
traning_set_y = dataset_tensor[traning_indices][:,1:]
validating_set_x = dataset_tensor[validating_indices][:,:1]
validating_set_y = dataset_tensor[validating_indices][:,1:]
testing_set_x = dataset_tensor[testing_indices][:,:1]
testing_set_y = dataset_tensor[testing_indices][:,1:]

注意改变数据分布可以影响训练速度，你可以试试上面的代码经过多少次训练可以训练成功 (达到 99 % 的正确率)。不过，数据越多越均匀，分布对训练速度的影响就越少。

定义模型类 (torch.nn.Module)

如果我们想把自己写好的模型提供给别人用，或者用别人写好的模型，应该怎么办呢？pytorch 提供了封装模型的基础类 torch.nn.Module，上面例子中的模型可以改写如下：

# 引用 pytorch 和显示图表使用的 matplotlib
import torch
from matplotlib import pyplot

# 定义模型
# 模型需要定义 forward 函数接收输入并返回预测输出
# add_history 和 show_history 是自定义函数，它们仅用于帮助我们理解机器学习的原理，实际不需要这样做
class MyModle(torch.nn.Module):
    def __init__(self):
        # 初始化基类
        super().__init__()
        # 定义参数
        # 需要使用 torch.nn.Parameter 包装，requires_grad 不需要设置 (会统一帮我们设置)
        self.weight = torch.nn.Parameter(torch.tensor([[1.0]]))
        self.bias = torch.nn.Parameter(torch.tensor(0.0))
        # 记录 weight 与 bias 的历史值
        self.weight_history = [self.weight[0][0].item()]
        self.bias_history = [self.bias.item()]

    def forward(self, x):
        # 计算预测值
        predicted = x.mm(self.weight) + self.bias
        return predicted

    def add_history(self):
        # 记录 weight 和 bias 的历史值
        self.weight_history.append(self.weight[0][0].item())
        self.bias_history.append(self.bias.item())

    def show_history(self):
        # 显示 weight 与 bias 的变化
        pyplot.plot(self.weight_history, label="weight")
        pyplot.plot(self.bias_history, label="bias")
        pyplot.legend()
        pyplot.show()

# 创建模型实例
model = MyModle()

# 创建损失计算器
loss_function = torch.nn.MSELoss()

# 创建参数调整器
# 调用 parameters 函数可以自动递归获取模型中的参数列表 (注意是递归获取，嵌套模型也能支持)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# 原始数据集
dataset = [(1, 3), (2, 5), (5, 11), (6, 13), (7, 15), (8, 17), (9, 19), (12, 25), (13, 27)]

# 转换原始数据集到 tensor
dataset_tensor = torch.tensor(dataset, dtype=torch.float32)

# 给随机数生成器分配一个初始值，使得每次运行都可以生成相同的随机数
# 这是为了让训练过程可重现，你也可以选择不这样做
torch.random.manual_seed(0)

# 切分训练集，验证集和测试集
random_indices = torch.randperm(dataset_tensor.shape[0])
traning_indices = random_indices[:int(len(random_indices)*0.6)]
validating_indices = random_indices[int(len(random_indices)*0.6):int(len(random_indices)*0.8):]
testing_indices = random_indices[int(len(random_indices)*0.8):]
traning_set_x = dataset_tensor[traning_indices][:,:1]
traning_set_y = dataset_tensor[traning_indices][:,1:]
validating_set_x = dataset_tensor[validating_indices][:,:1]
validating_set_y = dataset_tensor[validating_indices][:,1:]
testing_set_x = dataset_tensor[testing_indices][:,:1]
testing_set_y = dataset_tensor[testing_indices][:,1:]

# 开始训练过程
for epoch in range(1, 10000):
    print(f"epoch: {epoch}")

    # 根据训练集训练并修改参数
    # 切换模型到训练模式，将会启用自动微分，批次正规化 (BatchNorm) 与 Dropout
    model.train()

    # 计算预测值
    predicted = model(traning_set_x)
    # 计算损失
    loss = loss_function(predicted, traning_set_y)
    # 打印除错信息
    print(f"traning x: {traning_set_x}, y: {traning_set_y}, predicted: {predicted}, loss: {loss}, weight: {model.weight}, bias: {model.bias}")
    # 从损失自动微分求导函数值
    loss.backward()
    # 使用参数调整器调整参数
    optimizer.step()
    # 清空导函数值
    optimizer.zero_grad()
    # 记录 weight 和 bias 的历史值
    model.add_history()

    # 检查验证集
    # 切换模型到验证模式，将会禁用自动微分，批次正规化 (BatchNorm) 与 Dropout
    model.eval()
    predicted = model(validating_set_x)
    validating_accuracy = 1 - ((validating_set_y - predicted).abs() / validating_set_y).mean()
    print(f"validating x: {validating_set_x}, y: {validating_set_y}, predicted: {predicted}")

    # 如果验证集正确率大于 99 %，则停止训练
    print(f"validating accuracy: {validating_accuracy}")
    if validating_accuracy > 0.99:
        break

# 检查测试集
predicted = model(testing_set_x)
testing_accuracy = 1 - ((testing_set_y - predicted).abs() / testing_set_y).mean()
print(f"testing x: {testing_set_x}, y: {testing_set_y}, predicted: {predicted}")
print(f"testing accuracy: {testing_accuracy}")

# 显示 weight 与 bias 的变化
model.show_history()

定义和使用模型类需要注意以下几点：

必须在构造函数 __init__ 中调用 super().__init__() 初始化基类 (一般 python 继承类也需要这样做)
必须定义 forward 函数接收输入并返回预测输出
模型中定义参数需要使用 torch.nn.Parameter 包装，requires_grad 不需要设置 (会统一帮我们设置)
调用 model.parameters() 可以递归获取参数列表 (支持嵌套模型)，创建参数调整器时需要这个参数列表
在训练前调用 model.train() 开启自动微分等功能
在验证或者使用训练好的模型前调用 model.eval 关闭自动微分等功能

我们在后面继续使用 pytorch 进行机器学习时，代码的结构会基本和上面的例子一样，只是模型和检查验证集测试集的部分不同。此外，批次正规化与 Dropout 等功能会在后面的文章中介绍。

本篇就到此结束了，相信看到这里你已经掌握了用 pytorch 进行机器学习的基本模式😼。

如果您发现该资源为电子书等存在侵权的资源或对该资源描述不正确等，可点击“私信”按钮向作者进行反馈；如作者无回复可进行平台仲裁，我们会在第一时间进行处理！

评价 0 条