
Sentiment Classification with BERT + BiLSTM (with Source Code)


Contents

1. Dataset

2. Data Cleaning and Splitting

2.1 Installing Dependencies

2.2 Cleaning and Splitting

3. Downloading the BERT Model

4. Training and Testing the Model


This article implements sentiment classification based on BERT and a BiLSTM. It draws on several blog posts; see the reference links at the end for details.

The source code has been uploaded to Gitee: bert-bilstm-in-Sentiment-classification

1. Dataset

The dataset consists of Weibo comments posted during the COVID-19 epidemic, about 100,000 entries in total. It can be downloaded here: Weibo nCoV Data.

2. Data Cleaning and Splitting

2.1 Installing Dependencies

Install the following dependencies; if PyPI is slow, switch pip to a domestic mirror (see the install sketch after the list).

pandas
scikit-learn
matplotlib
seaborn
torch
torchvision
datasets
transformers
pytorch_lightning
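
A one-line install sketch; the Tsinghua PyPI mirror used here is just one example of a domestic source, any mirror (or the default index) works:

pip install pandas scikit-learn matplotlib seaborn torch torchvision datasets transformers pytorch_lightning -i https://pypi.tuna.tsinghua.edu.cn/simple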

2.2 Cleaning and Splitting

The implementation is shown below.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split

# Read data
df = pd.read_csv('data/nCoV_100k_train.labled.csv')
# Only the text and the label are needed
df = df[['微博中文内容', '情感倾向']]
df = df.rename(columns={'微博中文内容': 'text', '情感倾向': 'label'})
print(df)

# Check how balanced the labels are
print(df.label.value_counts())
print(df.label.value_counts() / df.shape[0] * 100)
plt.figure(figsize=(8, 4))
sns.countplot(x='label', data=df)
plt.show()

# print(df_train[df_train.label > 5.0])
# print(df_train[(df_train.label < -1.1)])
# # discarding outliers
# df_train.drop(df_train[(df_train.label < -1.1) | (df_train.label > 5)].index, inplace=True, axis=0)
# df_train.reset_index(inplace=True, drop=True)
# print(df_train.label.value_counts())
# sns.countplot(x='label', data=df_train)
# plt.show()

# Drop rows whose label is not one of -1 / 0 / 1
df.drop(df[(df.label == '4') |
           (df.label == '-') |
           (df.label == '·') |
           (df.label == '-2') |
           (df.label == '10') |
           (df.label == '9')].index, inplace=True, axis=0)
df.reset_index(inplace=True, drop=True)
print(df.label.value_counts())
sns.countplot(x='label', data=df)
plt.show()

# Check for empty (null) values
print(df.isnull().sum())
# Delete rows with empty values
df.dropna(axis=0, how='any', inplace=True)
df.reset_index(inplace=True, drop=True)
print(df.isnull().sum())

# Examine duplicate rows
print(df.duplicated().sum())
print(df[df.duplicated() == True])
# Delete duplicate rows
index = df[df.duplicated() == True].index
df.drop(index, axis=0, inplace=True)
df.reset_index(inplace=True, drop=True)
print(df.duplicated().sum())

# Also handle rows where the text is identical but the label differs
print(df['text'].duplicated().sum())
print(df[df['text'].duplicated() == True])
# View some examples
print(df[df['text'] == df.iloc[1473]['text']])
print(df[df['text'] == df.iloc[1814]['text']])
# Remove rows where the text is identical but the label differs
index = df[df['text'].duplicated() == True].index
df.drop(index, axis=0, inplace=True)
df.reset_index(inplace=True, drop=True)
# Check
print(df['text'].duplicated().sum())  # 0
print(df)

# Inspect shape and tail
print("======data-clean======")
print(df.tail())
print(df.shape)
# View text lengths, longest last
print(df['text'].str.len().sort_values())

# Split the dataset: 0.6 / 0.2 / 0.2
train, test = train_test_split(df, test_size=0.2)
train, val = train_test_split(train, test_size=0.25)
print(train.shape)
print(test.shape)
print(val.shape)
train.to_csv('./data/clean/train.csv', index=None)
val.to_csv('./data/clean/val.csv', index=None)
test.to_csv('./data/clean/test.csv', index=None)
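
As a quick sanity check, a minimal sketch (assuming the file paths written above) that reloads the three splits and confirms that only the labels -1, 0 and 1 remain, which is what the training code later relies on when it shifts labels with labels + 1:

import pandas as pd

# Reload each cleaned split and print its shape and the set of labels it contains.
for name in ('train', 'val', 'test'):
    part = pd.read_csv(f'./data/clean/{name}.csv')
    print(name, part.shape, sorted(part['label'].unique()))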

3. Downloading the BERT Model

Download the model files from Hugging Face: google-bert/bert-base-chinese. Model files downloaded from ModelScope (魔搭社区) may fail at runtime with the following error.

safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooLarge

In that case, download the model files from Hugging Face instead.
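
A minimal download sketch using huggingface_hub (which ships as a dependency of transformers); the local_dir below is an assumption chosen to match the path the training code loads from:

from huggingface_hub import snapshot_download

# Fetch google-bert/bert-base-chinese from Hugging Face into the directory
# that the tokenizer and BertModel load from in the training script.
snapshot_download(repo_id="google-bert/bert-base-chinese",
                  local_dir="./model/bert-base-chinese")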

4. Training and Testing the Model

The training and testing code is shown below.

# https://www.kaggle.com/code/isseyice/sentiment-classification-based-on-bert-and-lstm
# https://github.com/iceissey/issey_Kaggle/blob/main/Bert_BiLSTM/BiLSTM_lighting.py#L36
# https://www.cnblogs.com/chuanzhang053/p/17653381.html
import torch
import datasets
import pandas as pd
from datasets import load_dataset  # Hugging Face datasets
from torch.utils.data import Dataset
from torch.utils.data import DataLoader
import torch.nn as nn
from transformers import BertTokenizer, BertModel
import torch.optim as optim
from torch.nn.functional import one_hot
import pytorch_lightning as pl
from pytorch_lightning import Trainer
from torchmetrics.functional import accuracy, recall, precision, f1_score  # metrics used with Lightning
from pytorch_lightning.callbacks.early_stopping import EarlyStopping
from pytorch_lightning.callbacks import ModelCheckpoint

# Hyperparameters
batch_size = 16
epochs = 5
dropout = 0.1
rnn_hidden = 768
rnn_layer = 1
class_num = 3
lr = 0.001
PATH = './model'  # model checkpoint path; point this at the saved .ckpt file when testing without retraining

# Tokenizer
token = BertTokenizer.from_pretrained('./model/bert-base-chinese')


# Custom dataset
class MydataSet(Dataset):
    def __init__(self, path, split):
        # self.dataset = load_dataset('csv', data_files=path, split=split)  # TypeError: read_csv() got an unexpected keyword argument 'mangle_dupe_cols'.
        self.df = pd.read_csv(path)
        self.dataset = datasets.Dataset.from_pandas(self.df)

    def __getitem__(self, item):
        text = self.dataset[item]['text']
        label = self.dataset[item]['label']
        return text, label

    def __len__(self):
        return len(self.dataset)


# Batch collate function: tokenize and pad one batch of (text, label) pairs
def collate_fn(data):
    sents = [i[0] for i in data]
    labels = [i[1] for i in data]
    # Tokenize and encode
    data = token.batch_encode_plus(
        batch_text_or_text_pairs=sents,  # single sentences are encoded
        truncation=True,       # truncate sentences longer than max_length
        padding='max_length',  # always pad to max_length
        max_length=300,
        return_tensors='pt',   # return PyTorch tensors (other options: tf, np; default is a Python list)
        return_length=True,
    )
    # input_ids: token ids after encoding
    # attention_mask: 0 at padded positions, 1 elsewhere
    input_ids = data['input_ids']
    attention_mask = data['attention_mask']
    token_type_ids = data['token_type_ids']  # for sentence pairs: 0 for the first sentence and special tokens, 1 for the second sentence
    labels = torch.LongTensor(labels)  # labels of this batch
    # print(data['length'], data['length'].max())
    return input_ids, attention_mask, token_type_ids, labels


# Model: pretrained BERT upstream, a bidirectional LSTM downstream, and a final fully connected layer
class BiLSTMClassifier(nn.Module):
    def __init__(self, drop, hidden_dim, output_dim):
        super(BiLSTMClassifier, self).__init__()
        self.drop = drop
        self.hidden_dim = hidden_dim
        self.output_dim = output_dim
        # Load the pretrained Chinese BERT model
        self.embedding = BertModel.from_pretrained('./model/bert-base-chinese')
        # Freeze the upstream parameters (BERT is not fine-tuned)
        for param in self.embedding.parameters():
            param.requires_grad_(False)
        # Downstream BiLSTM and fully connected layer
        self.lstm = nn.LSTM(input_size=768, hidden_size=self.hidden_dim, num_layers=2, batch_first=True,
                            bidirectional=True, dropout=self.drop)
        self.fc = nn.Linear(self.hidden_dim * 2, self.output_dim)
        # No activation on the output: CrossEntropyLoss already combines softmax, log, and NLLLoss.

    def forward(self, input_ids, attention_mask, token_type_ids):
        embedded = self.embedding(input_ids=input_ids, attention_mask=attention_mask, token_type_ids=token_type_ids)
        embedded = embedded.last_hidden_state  # the embedding we need is embedding[0], i.e. last_hidden_state
        out, (h_n, c_n) = self.lstm(embedded)
        # Concatenate the final forward and backward hidden states of the BiLSTM
        output = torch.cat((h_n[-2, :, :], h_n[-1, :, :]), dim=1)
        output = self.fc(output)
        return output


# PyTorch Lightning wrapper
class BiLSTMLighting(pl.LightningModule):
    def __init__(self, drop, hidden_dim, output_dim):
        super(BiLSTMLighting, self).__init__()
        self.model = BiLSTMClassifier(drop, hidden_dim, output_dim)  # set up model
        self.criterion = nn.CrossEntropyLoss()  # set up loss function
        self.train_dataset = MydataSet('./data/clean/train.csv', 'train')
        self.val_dataset = MydataSet('./data/clean/val.csv', 'train')
        self.test_dataset = MydataSet('./data/clean/test.csv', 'train')

    def configure_optimizers(self):
        optimizer = optim.AdamW(self.parameters(), lr=lr)
        return optimizer

    def forward(self, input_ids, attention_mask, token_type_ids):  # forward(self, x)
        return self.model(input_ids, attention_mask, token_type_ids)

    def train_dataloader(self):
        train_loader = DataLoader(dataset=self.train_dataset, batch_size=batch_size, collate_fn=collate_fn,
                                  shuffle=True, num_workers=3)
        return train_loader

    def training_step(self, batch, batch_idx):
        input_ids, attention_mask, token_type_ids, labels = batch  # x, y = batch
        y = one_hot(labels + 1, num_classes=3)  # shift labels from -1/0/1 to 0/1/2 before one-hot encoding
        # Convert the one-hot labels to float
        y = y.to(dtype=torch.float)
        # Forward pass
        y_hat = self.model(input_ids, attention_mask, token_type_ids)
        # y_hat = y_hat.squeeze()  # squeeze [128, 1, 3] to [128, 3]
        loss = self.criterion(y_hat, y)  # criterion(input, target)
        self.log('train_loss', loss, prog_bar=True, logger=True, on_step=True, on_epoch=True)  # show the loss in the console
        return loss  # the loss must be returned for logging to take effect

    def val_dataloader(self):
        val_loader = DataLoader(dataset=self.val_dataset, batch_size=batch_size, collate_fn=collate_fn,
                                shuffle=False, num_workers=3)
        return val_loader

    def validation_step(self, batch, batch_idx):
        input_ids, attention_mask, token_type_ids, labels = batch
        y = one_hot(labels + 1, num_classes=3)
        y = y.to(dtype=torch.float)
        # Forward pass
        y_hat = self.model(input_ids, attention_mask, token_type_ids)
        # y_hat = y_hat.squeeze()
        loss = self.criterion(y_hat, y)
        self.log('val_loss', loss, prog_bar=False, logger=True, on_step=True, on_epoch=True)
        return loss

    def test_dataloader(self):
        test_loader = DataLoader(dataset=self.test_dataset, batch_size=batch_size, collate_fn=collate_fn,
                                 shuffle=False, num_workers=3)
        return test_loader

    def test_step(self, batch, batch_idx):
        input_ids, attention_mask, token_type_ids, labels = batch
        target = labels + 1  # used for calculating accuracy and F1-score below
        y = one_hot(target, num_classes=3)
        y = y.to(dtype=torch.float)
        # Forward pass
        y_hat = self.model(input_ids, attention_mask, token_type_ids)
        # y_hat = y_hat.squeeze()
        pred = torch.argmax(y_hat, dim=1)
        acc = (pred == target).float().mean()
        loss = self.criterion(y_hat, y)
        self.log('loss', loss)
        # task: Literal["binary", "multiclass", "multilabel"], i.e. binary, multiclass, or multilabel classification
        # average=None: outputs scores for each class separately; otherwise the scores are averaged.
        re = recall(pred, target, task="multiclass", num_classes=class_num, average=None)
        pre = precision(pred, target, task="multiclass", num_classes=class_num, average=None)
        f1 = f1_score(pred, target, task="multiclass", num_classes=class_num, average=None)

        def log_score(name, scores):
            for i, score_class in enumerate(scores):
                self.log(f"{name}_class{i}", score_class)

        log_score("recall", re)
        log_score("precision", pre)
        log_score("f1", f1)
        self.log('acc', accuracy(pred, target, task="multiclass", num_classes=class_num))
        self.log('avg_recall', recall(pred, target, task="multiclass", num_classes=class_num, average="weighted"))
        self.log('avg_precision', precision(pred, target, task="multiclass", num_classes=class_num, average="weighted"))
        self.log('avg_f1', f1_score(pred, target, task="multiclass", num_classes=class_num, average="weighted"))


# Training
def train():
    # Callback that keeps the best model
    checkpoint_callback = ModelCheckpoint(
        monitor='val_loss',                           # monitored metric: 'val_loss'
        dirpath='./model/checkpoints/',               # path to save the model
        filename='model-{epoch:02d}-{val_loss:.2f}',  # name of the best model
        save_top_k=1,                                 # save only the best one
        mode='min'                                    # best = lowest value of the monitored metric
    )
    # Trainer helps with debugging, e.g. fast runs, using only a small subset of data, sanity checks, etc.
    # See the official docs: https://lightning.ai/docs/pytorch/latest/debug/debugging_basic.html
    # devices="auto" adapts to the number of available GPUs
    trainer = Trainer(max_epochs=epochs, log_every_n_steps=10, accelerator='gpu', devices="auto",
                      fast_dev_run=False, callbacks=[checkpoint_callback])
    model = BiLSTMLighting(drop=dropout, hidden_dim=rnn_hidden, output_dim=class_num)
    trainer.fit(model)
    return model


# Testing
def test(model=None):
    # Load the parameters of the previously trained best model
    if model is None:
        model = BiLSTMLighting.load_from_checkpoint(checkpoint_path=PATH,
                                                    drop=dropout, hidden_dim=rnn_hidden, output_dim=class_num)
    trainer = Trainer(fast_dev_run=False)
    result = trainer.test(model)
    print(result)


if __name__ == '__main__':
    model = train()
    test(model)
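
The script above only trains and evaluates. For single-sentence prediction, a hypothetical inference sketch could look like the following; the checkpoint filename is a placeholder for whatever ModelCheckpoint actually saved, and the mapping of class indices back to -1/0/1 mirrors the labels + 1 shift used above:

# Hypothetical inference helper (not part of the original script).
def predict_sentiment(text, ckpt_path='./model/checkpoints/best.ckpt'):  # placeholder checkpoint path
    model = BiLSTMLighting.load_from_checkpoint(checkpoint_path=ckpt_path,
                                                drop=dropout, hidden_dim=rnn_hidden, output_dim=class_num)
    model.eval()
    # Encode a single sentence exactly as collate_fn does for batches.
    encoded = token(text, truncation=True, padding='max_length', max_length=300, return_tensors='pt')
    with torch.no_grad():
        logits = model(encoded['input_ids'], encoded['attention_mask'], encoded['token_type_ids'])
    label = torch.argmax(logits, dim=1).item() - 1  # shift class index 0/1/2 back to -1/0/1
    return {-1: 'negative', 0: 'neutral', 1: 'positive'}[label]

# Example usage: print(predict_sentiment('今天心情很好'))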

Run the following command.

python main.py

Sample output:

root@dsw-398300-bf64cb7b7-f28cl:/mnt/workspace/bert-bilstm-in-sentiment-classification# python main.py
2024-07-13 13:18:25.494250: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-07-13 13:18:25.933228: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-07-13 13:18:27.151707: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Some weights of the model checkpoint at ./model/bert-base-chinese were not used when initializing BertModel: ['cls.seq_relationship.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.seq_relationship.weight', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Missing logger folder: /mnt/workspace/bert-bilstm-in-sentiment-classification/lightning_logs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name      | Type             | Params
-----------------------------------------------
0 | model     | BiLSTMClassifier | 125 M
1 | criterion | CrossEntropyLoss | 0
-----------------------------------------------
23.6 M    Trainable params
102 M     Non-trainable params
125 M     Total params
503.559   Total estimated model params size (MB)
Epoch 4: 100%|██████████| 4506/4506 [13:41<00:00, 5.48it/s, loss=0.52, v_num=0, train_loss_step=0.654, train_loss_epoch=0.566]`Trainer.fit` stopped: `max_epochs=5` reached.
Epoch 4: 100%|██████████| 4506/4506 [13:42<00:00, 5.48it/s, loss=0.52, v_num=0, train_loss_step=0.654, train_loss_epoch=0.566]
root@dsw-398300-bf64cb7b7-f28cl:/mnt/workspace/bert-bilstm-in-sentiment-classification# python main.py
2024-07-13 15:10:59.831283: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-07-13 15:10:59.868774: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-07-13 15:11:00.420343: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Some weights of the model checkpoint at ./model/bert-base-chinese were not used when initializing BertModel: ['cls.predictions.transform.dense.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight', 'cls.predictions.bias']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name      | Type             | Params
-----------------------------------------------
0 | model     | BiLSTMClassifier | 125 M
1 | criterion | CrossEntropyLoss | 0
-----------------------------------------------
23.6 M    Trainable params
102 M     Non-trainable params
125 M     Total params
503.559   Total estimated model params size (MB)
Epoch 4: 100%|██████████| 4506/4506 [13:41<00:00, 5.48it/s, loss=0.544, v_num=1, train_loss_step=0.435, train_loss_epoch=0.568]`Trainer.fit` stopped: `max_epochs=5` reached.
Epoch 4: 100%|██████████| 4506/4506 [13:41<00:00, 5.48it/s, loss=0.544, v_num=1, train_loss_step=0.435, train_loss_epoch=0.568]
GPU available: True (cuda), used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py:1764: PossibleUserWarning: GPU available but not used. Set `accelerator` and `devices` using `Trainer(accelerator='gpu', devices=1)`.
  rank_zero_warn(
Testing DataLoader 0:  61%|██████▏   | 690/1127 [46:08<29:13, 4.01s/it]
Testing DataLoader 0: 100%|██████████| 1127/1127 [1:17:28<00:00, 4.12s/it]
┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┓
┃    Test metric     ┃    DataLoader 0     ┃
┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━┩
│ acc                │ 0.7424814105033875  │
│ avg_f1             │ 0.7335338592529297  │
│ avg_precision      │ 0.7671085596084595  │
│ avg_recall         │ 0.7424814105033875  │
│ f1_class0          │ 0.5128293633460999  │
│ f1_class1          │ 0.7843108177185059  │
│ f1_class2          │ 0.6716415286064148  │
│ loss               │ 0.5851244330406189  │
│ precision_class0   │ 0.637470543384552   │
│ precision_class1   │ 0.7558891773223877  │
│ precision_class2   │ 0.7077022790908813  │
│ recall_class0      │ 0.4741062521934509  │
│ recall_class1      │ 0.8350556492805481  │
│ recall_class2      │ 0.6949878334999084  │
└────────────────────┴─────────────────────┘
[{'loss': 0.5851244330406189, 'recall_class0': 0.4741062521934509, 'recall_class1': 0.8350556492805481, 'recall_class2': 0.6949878334999084, 'precision_class0': 0.637470543384552, 'precision_class1': 0.7558891773223877, 'precision_class2': 0.7077022790908813, 'f1_class0': 0.5128293633460999, 'f1_class1': 0.7843108177185059, 'f1_class2': 0.6716415286064148, 'acc': 0.7424814105033875, 'avg_recall': 0.7424814105033875, 'avg_precision': 0.7671085596084595, 'avg_f1': 0.7335338592529297}]
root@dsw-398300-bf64cb7b7-f28cl:/mnt/workspace/bert-bilstm-in-sentiment-classification#

References:

[1] 使用huggingface实现BERT+BILSTM情感3分类(附数据集源代码), CSDN blog

[2] https://huggingface.co/google-bert/bert-base-chinese/tree/main

[3] 【NLP实战】基于Bert和双向LSTM的情感分类【上篇】, CSDN blog

[4] 【NLP实战】基于Bert和双向LSTM的情感分类【中篇】, CSDN blog

[5] https://github.com/iceissey/issey_Kaggle/tree/main/Bert_BiLSTM

[6] https://www.kaggle.com/code/isseyice/sentiment-classification-based-on-bert-and-lstm#Part-2:-Training-and-Evaluating-the-Model

[7] https://www.kaggle.com/datasets/liangqingyuan/chinese-text-multi-classification?resource=download

[8] 千言 (LUGE): 全面的中文开源数据集合
