
Implementing a Multilayer Perceptron

Contents

Multilayer Perceptron:

Introduction:

Code Implementation:

Results:

Q&A:

Linear vs. Nonlinear Transformations

Parameter Meanings

Why Clear the Gradients?

The Role of Backpropagation

Why Update the Weights?


Multilayer Perceptron:

Introduction:

Abbreviated MLP, the multilayer perceptron is an artificial neural network consisting of an input layer, one or more hidden layers, and an output layer, each layer made up of multiple nodes (neurons). In an MLP, connections between nodes run only forward, with no recurrent connections, which makes it a type of feedforward neural network. Each node applies an activation function, such as sigmoid or ReLU, to introduce nonlinearity, allowing the network to fit complex functions and data distributions.

Code Implementation:

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Step 1: Define the MLP model
class SimpleMLP(nn.Module):
    def __init__(self):
        super(SimpleMLP, self).__init__()
        self.fc1 = nn.Linear(784, 128)  # Input layer to hidden layer
        self.fc2 = nn.Linear(128, 64)   # Hidden layer to another hidden layer
        self.fc3 = nn.Linear(64, 10)    # Hidden layer to output layer
        self.relu = nn.ReLU()

    def forward(self, x):
        x = x.view(-1, 784)  # Flatten the input from 28x28 to 784
        x = self.relu(self.fc1(x))
        x = self.relu(self.fc2(x))
        x = self.fc3(x)
        return x

# Step 2: Load MNIST dataset
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.1307,), (0.3081,))])
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)

# Step 3: Define loss function and optimizer
model = SimpleMLP()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Step 4: Train the model
num_epochs = 5
for epoch in range(num_epochs):
    for batch_idx, (data, target) in enumerate(train_loader):
        optimizer.zero_grad()  # Clear gradients left over from the previous batch
        output = model(data)
        loss = criterion(output, target)
        loss.backward()        # Backpropagate to compute gradients
        optimizer.step()       # Update the weights
        if batch_idx % 100 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))

# Step 5: Evaluate the model on the test set (optional)
model.eval()  # Switch to evaluation mode
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
    print('Accuracy of the network on the 10000 test images: {} %'.format(100 * correct / total))
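Once trained, the model can also classify a single image. The sketch below rebuilds the same architecture as an equivalent nn.Sequential stack (left untrained here, so the printed digit is arbitrary); with the trained `model` from the script above, the `with torch.no_grad()` block applies unchanged.

```python
import torch
import torch.nn as nn

# Equivalent of SimpleMLP above, expressed as a Sequential stack
model = nn.Sequential(
    nn.Flatten(),                      # 1x28x28 -> 784, same as x.view(-1, 784)
    nn.Linear(784, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 10),
)
model.eval()
image = torch.randn(1, 1, 28, 28)      # stand-in for one normalized MNIST image
with torch.no_grad():
    logits = model(image)              # shape (1, 10): one score per digit class
    predicted = logits.argmax(dim=1).item()  # index of the highest score
print(predicted)                       # predicted digit, 0-9
```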

Results:

Q&A:

Linear vs. Nonlinear Transformations

In neural networks:

A linear transformation usually refers to multiplying the input by a weight matrix and then adding a bias vector. Mathematically, for an input vector x with weight matrix W and bias b, the transformation is y = Wx + b.
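The distinction can be made concrete with a small sketch (layer sizes here are hypothetical): nn.Linear computes Wx + b, and ReLU then applies an elementwise nonlinearity. Without an activation in between, stacking two Linear layers would collapse into a single linear map, since W2(W1 x + b1) + b2 = (W2 W1)x + (W2 b1 + b2).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(1, 4)     # an input vector (batch of one)
linear = nn.Linear(4, 3)  # linear transformation: y = W x + b
y = linear(x)
z = torch.relu(y)         # nonlinear transformation: max(0, y) elementwise
print(y)                  # may contain negative entries
print(z)                  # negative entries clipped to 0
```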
