
NLP Study Notes 1: Basic PyTorch Operations and Implementations of the Perceptron and FF Networks

Some of my own NLP study notes.

I: Basic PyTorch operations

1 Creating tensors

import torch
import numpy as np

x = torch.Tensor(2, 3)   # create an uninitialized tensor with 2 rows and 3 columns
print(x.type())          # type() is a method of the Tensor class; it returns a Python string
# torch.FloatTensor is the default type for real numbers and is generally handled well by GPUs
x = torch.rand(2, 3)     # uniform distribution
x = torch.randn(2, 3)    # normal distribution
x = torch.zeros(2, 3)    # all-zero tensor
x = torch.ones(2, 3)     # all-one tensor
x.fill_(5)               # fill the whole tensor with a single value
# Tensor from list
x = torch.Tensor([[1, 2, 3], [4, 5, 6]])   # build a tensor from a nested list
# From numpy to torch
a = np.random.rand(2, 3)
x = torch.from_numpy(a)                           # convert a numpy array to a tensor with from_numpy
x = torch.from_numpy(a).type(torch.FloatTensor)   # use type() to specify the data type
y = torch.from_numpy(a).type_as(x)                # use type_as() to match another tensor's data type
# Data types and conversions; the default is FloatTensor
z = x.long()                                      # convert to long

2 Basic tensor operations

# element-wise sum
print(torch.add(x, x))
print(torch.sum(x, dim=0))   # sum along dim 0 (column-wise)
# element-wise product
print(torch.mul(x, x))
print(x * x)
# range tensor
print(torch.arange(6))
# return a view of the tensor with a different shape
print(x.view(3, 2))
x1 = torch.arange(6).view(2, 3)
# indexing + sum
x2 = torch.ones(3, 2).long()
x2[:, 1] += 1
print('x1 =', x1)
print('x2 =', x2)
# matrix multiplication
print(torch.mm(x1, x2))

3 Checking the hardware available to PyTorch

import torch

print(torch.cuda.is_available())      # is a CUDA GPU available?
print(torch.cuda.current_device())    # index of the currently selected device
print(torch.cuda.device(0))           # device context object for GPU 0
print(torch.cuda.device_count())      # number of visible GPUs
print(torch.cuda.get_device_name(0))  # human-readable name of GPU 0
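
A common next step (my own addition, not part of the original notes) is to select a device once and move tensors and models onto it, so the same code runs with or without a GPU; a minimal sketch:

import torch

# pick the GPU if one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

x = torch.randn(2, 3).to(device)  # move a tensor to the chosen device
print(x.device)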

4 Automatic differentiation in PyTorch

x = torch.ones(1, requires_grad=True)
print(x)
y = x + 42
print(y)
z = 3 * y * y
print(z)
z.backward()    # compute gradients
print(x.grad)   # dz/dx = 6(x + 42) = 6 * 43 = 258
print(y.grad)   # None: y is a non-leaf tensor, so its gradient is not retained by default
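
If you do want the gradient of an intermediate (non-leaf) tensor such as y, you can ask autograd to retain it. A minimal sketch (my own addition, not in the original notes):

x = torch.ones(1, requires_grad=True)
y = x + 42
y.retain_grad()   # keep the gradient of the non-leaf tensor y
z = 3 * y * y
z.backward()
print(y.grad)     # dz/dy = 6 * y = 6 * 43 = 258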

II: The Perceptron

import torch
import torch.nn as nn

# nn.Module is the base class of all neural networks
class Perceptron(nn.Module):
    """Our perceptron class"""

    def __init__(self, input_dim):
        """
        Constructor
        """
        super().__init__()
        self.fc = nn.Linear(input_dim, 1)
        self.sigmoid = torch.nn.Sigmoid()

    def forward(self, x_in):
        # squeeze unwraps the result from the singleton list
        return self.sigmoid(self.fc(x_in))  # .squeeze()

print(Perceptron(10).forward(torch.ones(10)))
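
For reference, the model above computes \hat{y}=\sigma(w\cdot x+b), where w and b are the parameters of the nn.Linear layer and \sigma is the sigmoid activation described next.
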
Activation functions

Sigmoid: f(x)=\frac{1}{1+e^{-x}}

Tanh: f(x)=\frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}

ReLU: f(x)=\max(0,x)
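
As a quick illustration (my own addition, not in the original notes), the same activations are available as functions in torch and can be applied element-wise to a tensor:

import torch

x = torch.linspace(-2, 2, 5)
print(torch.sigmoid(x))  # values squashed into (0, 1)
print(torch.tanh(x))     # values squashed into (-1, 1)
print(torch.relu(x))     # negative values clamped to 0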

Loss function

MSE Loss: L(y,\hat{y})=\frac{1}{n}\sum_{i=1}^{n}(y_{i}-\hat{y}_{i})^{2}

import torch
import torch.nn as nn

mse_loss = nn.MSELoss()
produced = torch.randn(2, 4, requires_grad=True)
print(produced)
expected = torch.randn(2, 4)
print(expected)
loss = mse_loss(produced, expected)
print(loss)

Categorical cross-entropy loss

L(y,\hat{y})=-\sum_{i=1}^{n}y_{i}\log(\hat{y}_{i})

import torch
import torch.nn as nn

ce_loss = nn.CrossEntropyLoss()  # for binary classification, we can use nn.BCELoss()
# CrossEntropyLoss expects raw (unnormalized) scores; it applies log-softmax internally
produced = torch.randn(2, 4, requires_grad=True)  # 2x4, normal distribution
print(produced)
# the target is an index for each vector indicating the correct category/class
expected = torch.tensor([1, 0], dtype=torch.int64)
loss = ce_loss(produced, expected)
print(loss)

III: Language classification with the Perceptron

1 Setup

from random import randint
import torch
from torch.utils.data import Dataset, DataLoader
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

2 Data Preparation

We define a LanguageRecognitionDataset class that processes the raw data and builds the dataset needed to train our language classifier.

class LanguageRecognitionDataset(Dataset):
    """An automatically generated dataset for our language classification task."""

    def _get_bigrams(self, sentence_list):
        bigrams = {}
        # for each sentence
        for s in sentence_list:
            # for each bigram
            for k in range(len(s)-1):
                bigrams[s[k:k+2]] = 1.0
        return bigrams.keys()

    def _get_bigram_vector(self, sentence):
        sent_bigrams = self._get_bigrams([sentence])
        vector = []
        for bigram in self.bigrams:
            vector.append(1.0 if bigram in sent_bigrams else 0.0)
        return vector

    def __init__(self, sample, training_bigrams=None):
        """
        Args:
            sample: List of sentences with their classification (True/False)
        """
        self.num_samples = len(sample)
        if not training_bigrams:
            self.bigrams = self._get_bigrams([x for x, _ in sample])
        else:
            self.bigrams = training_bigrams
        self.data = []
        for sentence, gold_label in sample:
            sentence = sentence.lower()
            item = {'inputs': torch.tensor(self._get_bigram_vector(sentence)),
                    'outputs': torch.tensor([gold_label])}
            self.data.append(item)

    def __len__(self):
        return self.num_samples

    def __getitem__(self, idx):
        return self.data[idx]

LanguageRecognitionDataset([("ciao ciao pippo", 1), ("la casa si trova in collina", 1)])[1]

3 Building a simple dataset

training_sentences = [
    ("Scienziata italiana scopre la più grande esplosione nell’Universo.", 1.0),
    ("Nell’ammasso di galassie di Ofiuco, distante 390 milioni di anni luce.", 1.0),
    ("Ha rilasciato una quantità di energia 5 volte più grande della precedente che deteneva il primato.", 1.0),
    ("Syria war: Turkey says thousands of migrants have crossed to EU.", 0.0),
    ("Turkey could no longer deal with the amount of people fleeing Syria's civil war, he added.", 0.0),
    ("Greece says it has blocked thousands of migrants from entering illegally from Turkey.", 0.0),
    ("Tutto perfetto? Non proprio. Ci sono elementi problematici che vanno considerati.", 1.0),
    ("Il primo è l’autonomia degli studenti, che devono essere in grado di gestire la tecnologia.", 1.0),
    ("Il secondo, è la durata e la cadenza delle lezioni.", 1.0),
    ("Per motivi di connessione, di competenze, di strumenti.", 1.0),
    ("Serve un’assistenza dedicata.", 1.0),
    ("Potremmo completare l’anno scolastico in versione virtuale?", 1.0),
    ("Siamo preparati per affiancare la didattica tradizionale a quella virtuale, ma non siamo pronti per sostituirla", 1.0),
    ("Various architectures of recurrent neural networks have been successful.", 0.0),
    ("They perform tasks relating to sequence measuring", 0.0),
    ("The networks operate by processing input components sequentially", 0.0),
    ("They retain a hidden vector between iterations", 0.0),
    ("It is constantly used and modified throughout the sequence.", 0.0),
    ("They are able to model arbitrarily complicated programs.", 0.0),
    ("L’Istituto, che raccoglie studenti di liceo scientifico, linguistico e tecnico economico, è l’esempio ideale.", 1.0),
]

validation_sentences = [
    ("L’Istituto superiore di sanità ha confermato tutti i casi esaminati.", 1.0),
    ("Measures announced after an emergency cabinet meeting also include the cancellation of the Paris half-marathon which was to be held on Sunday.", 0.0),
    ("Lavagne in condivisione, documenti scaricabili sulla piattaforma gratuita, esercizi collaborativi.", 1.0),
    ("Each encoder consists of two major components", 0.0),
]

test_sentences = [
    ("Il ministro della Salute francese ha raccomandato di salutarsi mantenendo le distanze, mentre l’Organizzazione mondiale della sanità alza l’allerta a molto alta.", 1.0),
    ("Possiamo riammalarci ma in questo caso si parla di ricaduta.", 1.0),
    ("The vast majority of infections and deaths are in China, where the virus originated late last year.", 0.0),
    ("France has banned all indoor gatherings of more than 5,000 people, as part of efforts to contain the country's coronavirus outbreak", 0.0),
]

def test_dataset_class():
    simple_dataset = LanguageRecognitionDataset(training_sentences)
    print('Dataset test:')
    for i in range(len(training_sentences)):
        print(f' sample {i}: {simple_dataset[i]}')

test_dataset_class()

4 Model training 

We build a Trainer class that wraps the following parts:

  • training loop: iterates over the dataset with the model to fit it to our task
  • evaluation function: assesses how well the model is learning
  • prediction function: returns the model's outputs

For the model to actually learn, we need a loss function that measures how far the model's output is from the gold labels, and an optimizer that updates the model parameters based on that loss.

class Trainer():
    """Utility class to train and evaluate a model."""

    def __init__(self, model, loss_function, optimizer):
        """
        Args:
            model: the model we want to train.
            loss_function: the loss_function to minimize.
            optimizer: the optimizer used to minimize the loss_function.
        """
        self.model = model
        self.loss_function = loss_function
        self.optimizer = optimizer

    def train(self, train_dataset, valid_dataset, epochs=1):
        """
        Args:
            train_dataset: a Dataset or DatasetLoader instance containing
                the training instances.
            valid_dataset: a Dataset or DatasetLoader instance used to evaluate
                learning progress.
            epochs: the number of times to iterate over train_dataset.
        Returns:
            avg_train_loss: the average training loss on train_dataset over
                epochs.
        """
        assert epochs >= 1 and isinstance(epochs, int)
        print('Training...')
        train_loss = 0.0
        for epoch in range(epochs):
            print(' Epoch {:03d}'.format(epoch + 1))
            epoch_loss = 0.0
            for step, sample in enumerate(train_dataset):
                inputs = sample['inputs']
                labels = sample['outputs']
                # we need to set the gradients to zero before starting backpropagation
                # because PyTorch accumulates the gradients on subsequent backward passes
                self.optimizer.zero_grad()
                predictions = self.model(inputs)
                sample_loss = self.loss_function(predictions, labels)
                # print("Before BP:", list(model.parameters()))
                sample_loss.backward()
                self.optimizer.step()
                # print("After BP:", list(model.parameters()))
                # sample_loss is a Tensor, tolist returns a float (alternative: use float() instead of .tolist())
                epoch_loss += sample_loss.tolist()
                print('  [E: {:2d} @ step {}] current avg loss = {:0.4f}'.format(epoch, step, epoch_loss / (step + 1)))
            avg_epoch_loss = epoch_loss / len(train_dataset)
            train_loss += avg_epoch_loss
            print('  [E: {:2d}] train loss = {:0.4f}'.format(epoch, avg_epoch_loss))
            valid_loss = self.evaluate(valid_dataset)
            print('  [E: {:2d}] valid loss = {:0.4f}'.format(epoch, valid_loss))
        print('... Done!')
        avg_epoch_loss = train_loss / epochs
        return avg_epoch_loss

    def evaluate(self, valid_dataset):
        """
        Args:
            valid_dataset: the dataset to use to evaluate the model.
        Returns:
            avg_valid_loss: the average validation loss over valid_dataset.
        """
        valid_loss = 0.0
        # no gradient updates here
        with torch.no_grad():
            for sample in valid_dataset:
                inputs = sample['inputs']
                labels = sample['outputs']
                predictions = self.model(inputs)
                sample_loss = self.loss_function(predictions, labels)
                valid_loss += sample_loss.tolist()
        return valid_loss / len(valid_dataset)

    def predict(self, x):
        """
        Returns: hopefully the right prediction.
        """
        return self.model(x).tolist()

5 Finally: define the datasets, set up the Trainer, and train the model

training_dataset = DataLoader(LanguageRecognitionDataset(training_sentences), batch_size=6)
validation_dataset = DataLoader(LanguageRecognitionDataset(validation_sentences, training_dataset.dataset.bigrams), batch_size=2)
test_dataset = DataLoader(LanguageRecognitionDataset(test_sentences, training_dataset.dataset.bigrams), batch_size=2)

print("Number of input dimensions", len(training_dataset.dataset.bigrams))

model = Perceptron(len(training_dataset.dataset.bigrams))
trainer = Trainer(
    model,
    loss_function=nn.MSELoss(),
    optimizer=optim.SGD(model.parameters(), lr=0.01)
)
avg_epoch_loss = trainer.train(training_dataset, validation_dataset, epochs=50)

6 Evaluation

Let's check whether the model has actually learned something.

trainer.evaluate(test_dataset)
for step, batch in enumerate(test_dataset):
    print(step, trainer.predict(batch['inputs']), batch['outputs'])
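
As a rough sanity check (my own addition, not in the original notes), the sigmoid outputs can be thresholded at 0.5 to obtain hard labels and a simple accuracy score over the test set:

with torch.no_grad():
    correct, total = 0, 0
    for batch in test_dataset:
        outputs = model(batch['inputs'])       # sigmoid outputs in (0, 1)
        predicted = (outputs > 0.5).float()    # threshold at 0.5 -> hard 0/1 labels
        correct += (predicted == batch['outputs']).sum().item()
        total += batch['outputs'].numel()
    print('accuracy =', correct / total)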

IV: Language classification with a Feedforward Neural Network

1 Model definition

class LanguageRecognitionFF(nn.Module):
    """A simple model that classifies language"""

    def __init__(self, input_dim, hparams):
        super().__init__()
        # Hidden layer: transforms the input vector into
        # a hidden vector representation.
        self.fc1 = nn.Linear(input_dim, hparams.hidden_size)
        self.relu = nn.ReLU()
        # Output layer: transforms the hidden vector representation
        # into a single score, squashed into (0, 1) by the sigmoid.
        self.fc2 = nn.Linear(hparams.hidden_size, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        hidden = self.fc1(x)
        relu = self.relu(hidden)
        result = self.fc2(relu)
        return self.sigmoid(result)

2 Model Building

Try to keep the hyperparameters separate from the model definition, so that we can change them without touching the model itself.

class HParams():
    hidden_size = 16

Instantiate the model:

model_ff = LanguageRecognitionFF(len(training_dataset.dataset.bigrams), HParams)

3 Model Training

trainer = Trainer(
    model=model_ff,
    loss_function=nn.MSELoss(),
    optimizer=optim.SGD(model_ff.parameters(), lr=1e-5)
)
trainer.train(training_dataset, validation_dataset, 50)

4 Model Evaluation

trainer.evaluate(test_dataset)
for step, batch in enumerate(test_dataset):
    print(trainer.predict(batch['inputs']), batch['outputs'])