Deep Learning with PyTorch (Part 1)

1. Create a basic neural network model with PyTorch

Dataset: Iris, from the UCI Machine Learning Repository

Architecture: a fully connected network


    # %%
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # %%
    # Create a model class that inherits from nn.Module (note: Module, not Model)
    class Model(nn.Module):
        # Input layer (4 features of the flower) -->
        # hidden layer 1 (h1 neurons) -->
        # hidden layer 2 (h2 neurons) --> output (3 classes of iris flowers)
        def __init__(self, in_features=4, h1=8, h2=9, out_features=3):
            super().__init__()  # initialize the parent nn.Module
            self.fc1 = nn.Linear(in_features=in_features, out_features=h1)
            self.fc2 = nn.Linear(in_features=h1, out_features=h2)
            self.out = nn.Linear(in_features=h2, out_features=out_features)

        # Forward pass: push the input through each layer in turn
        def forward(self, x):
            # ReLU (rectified linear unit): keeps values above 0, sets values below 0 to 0
            x = F.relu(self.fc1(x))
            x = F.relu(self.fc2(x))
            x = self.out(x)
            return x

    # %%
    # Network weights are initialized randomly, so set a manual seed before creating
    # the model. Starting from the same seed gives nearly the same outputs on every run.
    torch.manual_seed(seed=41)
    # Create an instance of the model
    model = Model()
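
As a quick sanity check of the architecture above, the sketch below redefines the same `Model` class so that it runs on its own, passes a random batch through it, and confirms two things: the output holds one raw score (logit) per iris class, and a fully connected layer such as `fc1` computes `x @ W.T + b`. The batch of 5 random samples is an assumption made for illustration, not real Iris data.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Model(nn.Module):
    """Same architecture as above: 4 -> 8 -> 9 -> 3, fully connected."""

    def __init__(self, in_features=4, h1=8, h2=9, out_features=3):
        super().__init__()
        self.fc1 = nn.Linear(in_features, h1)
        self.fc2 = nn.Linear(h1, h2)
        self.out = nn.Linear(h2, out_features)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.out(x)


torch.manual_seed(41)
model = Model()

# Illustrative input: 5 made-up samples with 4 features each (not real Iris rows).
batch = torch.rand(5, 4)
logits = model(batch)
print(logits.shape)  # one row per sample, one column per class

# A fully connected (linear) layer is a matrix multiply plus a bias.
manual_fc1 = batch @ model.fc1.weight.T + model.fc1.bias
fc1_matches = torch.allclose(manual_fc1, model.fc1(batch), atol=1e-6)
print(fc1_matches)

# Count trainable parameters: (4*8 + 8) + (8*9 + 9) + (9*3 + 3) = 151
n_params = sum(p.numel() for p in model.parameters())
print(n_params)
```

Every input feature connects to every neuron in the next layer, which is why each layer's weight matrix has `in_features * out_features` entries plus one bias per output neuron.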

2. Load data and train the neural network model


Reference: torch.optim (PyTorch 2.2 documentation)

1. optimizer.zero_grad()

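  • Purpose: reset the gradients to zero. When training a neural network, the gradients must be cleared before each parameter update. Otherwise the newly computed gradients are added to the ones already stored. This is a deliberate PyTorch design decision, which helps with architectures such as RNNs that compute gradients several times within one loop.

  • How it works: during backpropagation (backward), PyTorch accumulates gradients instead of replacing the current values. If they are not cleared manually, the gradient values keep growing and training goes wrong.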

2. loss.backward()

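  • Purpose: compute the gradients. This step computes the gradient of the loss function with respect to each model parameter. In a neural network, the loss function measures the difference between the model output and the true labels, and backpropagation yields the gradient of that loss for every parameter.

  • How it works: backpropagation is an efficient algorithm for computing gradients. It first computes the gradient at the output layer, then propagates it backward layer by layer toward the input layer. The process relies on the chain rule and is central to deep learning training.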

3. optimizer.step()

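  • Purpose: update the parameters. Using the computed gradients, this step updates the model parameters. It is where the optimization algorithm (SGD, Adam, and so on) actually runs, adjusting the parameter values according to the gradient direction and the chosen learning rate so that the loss decreases.

  • How it works: the optimizer updates the model parameters by gradient descent (or another optimization algorithm). The gradient points in the direction in which the loss grows fastest, so moving the parameters in the opposite direction gradually reduces the model's prediction error.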
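
The three steps above can be traced by hand on a toy problem. The sketch below is a minimal illustration, not the training code of this post: it assumes a single parameter `w`, a made-up loss `(w * x - y)^2`, and plain SGD instead of Adam so that the arithmetic is easy to follow.

```python
import torch

# A single trainable parameter w and the toy loss L(w) = (w * x - y)^2.
w = torch.tensor([1.0], requires_grad=True)
x, y = torch.tensor([2.0]), torch.tensor([10.0])
optimizer = torch.optim.SGD([w], lr=0.1)  # plain SGD, chosen for simple arithmetic


def compute_loss():
    return ((w * x - y) ** 2).sum()


# loss.backward(): dL/dw = 2 * (w*x - y) * x = 2 * (2 - 10) * 2 = -32
compute_loss().backward()
grad_first = w.grad.clone()

# Calling backward() again WITHOUT zero_grad() adds to the stored gradient.
compute_loss().backward()
grad_accumulated = w.grad.clone()  # -32 + (-32) = -64

# optimizer.zero_grad() clears the stored gradient before the next backward pass.
optimizer.zero_grad()
compute_loss().backward()
grad_after_reset = w.grad.clone()  # back to -32

# optimizer.step(): plain SGD moves w against the gradient, w <- w - lr * grad.
optimizer.step()
w_after_step = w.detach().clone()  # 1.0 - 0.1 * (-32) = 4.2

print(grad_first, grad_accumulated, grad_after_reset, w_after_step)
```

The doubled gradient after the second `backward()` call is the accumulation described in step 1. In a training loop it would make every update larger than intended.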

    # %%
    import pandas as pd
    import matplotlib.pyplot as plt
    %matplotlib inline

    # %%
    # url = 'https://gist.githubusercontent.com/curran/a08a1080b88344b0c8a7/raw/0e7a9b0a5d22642a06d3d5b9bcbad9890c8ee534/iris.csv'
    my_df = pd.read_csv('dataset/iris.csv')

    # %%
    # Change the last column from strings to integer class labels
    my_df['species'] = my_df['species'].map({'setosa': 0, 'versicolor': 1, 'virginica': 2})
    my_df
    # my_df.head()
    # my_df.tail()

    # %%
    # Separate the features X from the labels y
    X = my_df.drop('species', axis=1)  # drop the label column
    y = my_df['species']

    # %%
    # Convert these to NumPy arrays
    X = X.values
    y = y.values
    # X

    # %%
    # Train/test split
    from sklearn.model_selection import train_test_split
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=41)

    # %%
    # Convert X features to float tensors
    X_train = torch.FloatTensor(X_train)
    X_test = torch.FloatTensor(X_test)
    # Convert y labels to long tensors
    y_train = torch.LongTensor(y_train)
    y_test = torch.LongTensor(y_test)

    # %%
    # Set the criterion used to measure the error: how far off the predictions are from the labels
    criterion = nn.CrossEntropyLoss()
    # Choose the Adam optimizer; lr = learning rate. If the error does not go down after
    # a number of iterations (epochs), lower the learning rate. The lower the learning
    # rate, the longer training takes.
    optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
    # The parameters passed in are those of fc1, fc2 and out
    # model.parameters

    # %%
    # Train the model
    # Epoch: one run of all the training data through the network
    epochs = 100
    losses = []
    for i in range(epochs):
        # Go forward and get a prediction
        y_pred = model(X_train)  # calling the model runs forward()
        # Measure the loss/error; it will be high at first
        loss = criterion(y_pred, y_train)  # predicted values vs. y_train
        # Keep track of the losses.
        # detach() stops tracking the tensor in the computation graph; numpy() converts
        # the PyTorch tensor to a NumPy array, the format that many scientific Python
        # libraries and functions expect as input.
        losses.append(loss.detach().numpy())
        # Print every 10 epochs
        if i % 10 == 0:
            print(f'Epoch: {i} and loss: {loss}')
        # Backpropagation: take the error from the forward pass and feed it back
        # through the network to fine-tune the weights.
        # optimizer.zero_grad() clears the gradients, ready for the new gradient computation.
        # loss.backward() computes the gradients by differentiating the loss with respect to the parameters.
        # optimizer.step() updates the parameters from the gradients and the learning rate to minimize the loss.
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # %%
    # Graph it out
    plt.plot(range(epochs), losses)
    plt.ylabel("loss/error")
    plt.xlabel("Epoch")
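
The training code above passes the raw output of `self.out` to `nn.CrossEntropyLoss` and stores the labels as long tensors. The sketch below illustrates why that pairing works: the criterion expects unnormalized scores (logits) plus integer class indices, and applies log-softmax internally. The logits and labels are made-up values for illustration, not results from the Iris model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Made-up logits for 2 samples and 3 classes, plus their integer class labels.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
labels = torch.tensor([0, 2])  # integer tensors default to torch.int64 (long)

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, labels)

# The same computation by hand: log-softmax, pick the log-probability
# of the true class for each sample, negate, and average.
log_probs = F.log_softmax(logits, dim=1)
manual_loss = -log_probs[torch.arange(2), labels].mean()

losses_match = torch.allclose(loss, manual_loss)
print(loss.item(), manual_loss.item(), losses_match)

# The predicted class is the index of the largest logit.
predicted = logits.argmax(dim=1)
print(predicted)
```

Because the softmax is folded into the loss, the model's `forward` can return the output of the last linear layer unchanged, as it does above.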

