While installing torchtext, pip reported the following dependency-resolver error:
ERROR: After October 2020 you may experience errors when installing or updating packages. This is because pip will change the way that it resolves dependency conflicts.
We recommend you use --use-feature=2020-resolver to test your packages with the new resolver before it becomes the default.
torchvision 0.4.2 requires torch==1.3.1, but you'll have torch 1.8.0 which is incompatible.
Collecting dataclasses; python_version < "3.7"
Using cached dataclasses-0.8-py3-none-any.whl (19 kB)
Installing collected packages: dataclasses, torch, torchtext
Attempting uninstall: torch
Found existing installation: torch 1.3.1
Can't uninstall 'torch'. No files were found to uninstall.
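One workaround I would try (my own suggestion, based on the version note further down in these notes, not something pip told me): pin mutually compatible releases explicitly so pip does not have to resolve the conflict itself, e.g. pip install torch==1.8.0 torchvision==0.9.0 torchtext==0.9.0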
# -*- coding: utf-8 -*-
"""
Created on 2019
@author: fancp
"""
import torch
import torch.nn as nn

w = torch.empty(3, 5)

# 1. Uniform distribution - U(a, b)
# torch.nn.init.uniform_(tensor, a=0.0, b=1.0)
print(nn.init.uniform_(w))
# =============================================================================
# tensor([[0.9160, 0.1832, 0.5278, 0.5480, 0.6754],
#         [0.9509, 0.8325, 0.9149, 0.8192, 0.9950],
#         [0.4847, 0.4148, 0.8161, 0.0948, 0.3787]])
# =============================================================================

# 2. Normal distribution - N(mean, std)
# torch.nn.init.normal_(tensor, mean=0.0, std=1.0)
print(nn.init.normal_(w))
# =============================================================================
# tensor([[ 0.4388,  0.3083, -0.6803, -1.1476, -0.6084],
#         [ 0.5148, -0.2876, -1.2222,  0.6990, -0.1595],
#         [-2.0834, -1.6288,  0.5057, -0.5754,  0.3052]])
# =============================================================================

# 3. Constant - fixed value val
# torch.nn.init.constant_(tensor, val)
print(nn.init.constant_(w, 0.3))
# =============================================================================
# tensor([[0.3000, 0.3000, 0.3000, 0.3000, 0.3000],
#         [0.3000, 0.3000, 0.3000, 0.3000, 0.3000],
#         [0.3000, 0.3000, 0.3000, 0.3000, 0.3000]])
# =============================================================================

# 4. All ones
# torch.nn.init.ones_(tensor)
print(nn.init.ones_(w))
# =============================================================================
# tensor([[1., 1., 1., 1., 1.],
#         [1., 1., 1., 1., 1.],
#         [1., 1., 1., 1., 1.]])
# =============================================================================

# 5. All zeros
# torch.nn.init.zeros_(tensor)
print(nn.init.zeros_(w))
# =============================================================================
# tensor([[0., 0., 0., 0., 0.],
#         [0., 0., 0., 0., 0.],
#         [0., 0., 0., 0., 0.]])
# =============================================================================

# 6. Ones on the diagonal, zeros everywhere else
# torch.nn.init.eye_(tensor)
print(nn.init.eye_(w))
# =============================================================================
# tensor([[1., 0., 0., 0., 0.],
#         [0., 1., 0., 0., 0.],
#         [0., 0., 1., 0., 0.]])
# =============================================================================

# 7. Xavier uniform initialization
# torch.nn.init.xavier_uniform_(tensor, gain=1.0)
# From: Understanding the difficulty of training deep feedforward neural networks - Glorot & Bengio 2010
print(nn.init.xavier_uniform_(w, gain=nn.init.calculate_gain('relu')))
# =============================================================================
# tensor([[-0.1270,  0.3963,  0.9531, -0.2949,  0.8294],
#         [-0.9759, -0.6335,  0.9299, -1.0988, -0.1496],
#         [-0.7224,  0.2181, -1.1219,  0.8629, -0.8825]])
# =============================================================================

# 8. Xavier normal initialization
# torch.nn.init.xavier_normal_(tensor, gain=1.0)
print(nn.init.xavier_normal_(w))
# =============================================================================
# tensor([[ 1.0463,  0.1275, -0.3752,  0.1858,  1.1008],
#         [-0.5560,  0.2837,  0.1000, -0.5835,  0.7886],
#         [-0.2417,  0.1763, -0.7495,  0.4677, -0.1185]])
# =============================================================================

# 9. Kaiming uniform initialization
# torch.nn.init.kaiming_uniform_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')
# From: Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification - He Kaiming 2015
print(nn.init.kaiming_uniform_(w, mode='fan_in', nonlinearity='relu'))
# =============================================================================
# tensor([[-0.7712,  0.9344,  0.8304,  0.2367,  0.0478],
#         [-0.6139, -0.3916, -0.0835,  0.5975,  0.1717],
#         [ 0.3197, -0.9825, -0.5380, -1.0033, -0.3701]])
# =============================================================================

# 10. Kaiming normal initialization
# torch.nn.init.kaiming_normal_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')
print(nn.init.kaiming_normal_(w, mode='fan_out', nonlinearity='relu'))
# =============================================================================
# tensor([[-0.0210,  0.5532, -0.8647,  0.9813,  0.0466],
#         [ 0.7713, -1.0418,  0.7264,  0.5547,  0.7403],
#         [-0.8471, -1.7371,  1.3333,  0.0395,  1.0787]])
# =============================================================================

# 11. (Semi-)orthogonal matrix
# torch.nn.init.orthogonal_(tensor, gain=1)
# From: Exact solutions to the nonlinear dynamics of learning in deep linear neural networks - Saxe 2013
print(nn.init.orthogonal_(w))
# =============================================================================
# tensor([[-0.0346, -0.7607, -0.0428,  0.4771,  0.4366],
#         [-0.0412, -0.0836,  0.9847,  0.0703, -0.1293],
#         [-0.6639,  0.4551,  0.0731,  0.1674,  0.5646]])
# =============================================================================

# 12. Sparse matrix
# torch.nn.init.sparse_(tensor, sparsity, std=0.01)
# From: Deep learning via Hessian-free optimization - Martens 2010
print(nn.init.sparse_(w, sparsity=0.1))
# =============================================================================
# tensor([[ 0.0000,  0.0000, -0.0077,  0.0000, -0.0046],
#         [ 0.0152,  0.0030,  0.0000, -0.0029,  0.0005],
#         [ 0.0199,  0.0132, -0.0088,  0.0060,  0.0000]])
# =============================================================================
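A usage note of my own (not from the snippet above): in a real model these initializers are usually applied to each layer's parameters through Module.apply. A minimal sketch:

import torch.nn as nn

def init_weights(m):
    # Kaiming-uniform for every Linear layer's weight, zeros for its bias
    if isinstance(m, nn.Linear):
        nn.init.kaiming_uniform_(m.weight, nonlinearity='relu')
        nn.init.zeros_(m.bias)

net = nn.Sequential(nn.Linear(5, 3), nn.ReLU(), nn.Linear(3, 1))
net.apply(init_weights)  # applies init_weights recursively to every submodule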
I looked at three classic examples:
A. the German-to-English translation example on GitHub
B. the blog post 【Pytorch】【torchtext(二)】Field详解 ("Field explained in detail")
C. the text-generation example on Bilibili
B.
1. Field: think of it as the power tool here — it can add … to the data, pad everything to the same length, and define how the text is tokenized.
TEXT = Field(sequential=True, lower=True, fix_length=10,tokenize=str.split,batch_first=True)
LABEL = Field(sequential=False, use_vocab=False)
2. Hand this tool and the raw data to Example together, and the data comes back fully processed:
for text, label in zip(corpus, labels):
    example = Example.fromlist([text, label], fields=fields)
    examples.append(example)
3. Feed the processed datasets to BucketIterator.splits, as shown below.
(Note: this step is not from the same project as the two above — it comes from the German-English translation example, while the earlier steps are from the blog walkthrough.)
train_iterator, valid_iterator, test_iterator = BucketIterator.splits(
    (train_data, valid_data, test_data),
    batch_size = BATCH_SIZE,
    device = device)
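For context, a sketch of what you do with these iterators (my own addition; the attribute names src/trg are the field names used in that seq2seq tutorial, so adjust them to whatever your own fields list declares): each iteration yields a Batch object whose attributes are the numericalized fields.

for batch in train_iterator:
    src = batch.src   # source-language tensor, shape (src_len, batch_size) by default
    trg = batch.trg   # target-language tensor
    break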
Source code:

# Legacy torchtext API (0.8 and earlier); on 0.9+ these live in torchtext.legacy.data
from torchtext.data import Field, Example

# 1. Data
corpus = ["D'aww! He matches this background colour",
          "Yo bitch Ja Rule is more succesful then",
          "If you have a look back at the source"]
labels = [0, 1, 0]

# 2. Define the different Fields
TEXT = Field(sequential=True, lower=True, fix_length=10, tokenize=str.split, batch_first=True)
LABEL = Field(sequential=False, use_vocab=False)
fields = [("comment", TEXT), ("label", LABEL)]

# 3. Convert the data into a list of Example objects
examples = []
for text, label in zip(corpus, labels):
    example = Example.fromlist([text, label], fields=fields)
    examples.append(example)
print(type(examples[0]))
print(examples[0].comment)
print(examples[0].label)

# 4. Build the vocabulary
new_corpus = [example.comment for example in examples]
TEXT.build_vocab(new_corpus)
print(TEXT.process(new_corpus))
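For reference (my understanding of the legacy API, worth double-checking): with fix_length=10 and batch_first=True, the final TEXT.process(new_corpus) call should return a LongTensor of shape (3, 10) — three numericalized sentences, each padded with the pad-token index or truncated to exactly ten tokens.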
Rough workflow:
1. Read the file
# Declare the column headers; columns not needed for training map to None
fields = [('score', None), ('id', None), ('date', None), ('query', None),
          ('name', None), ('tweet', TWEET), ('category', None), ('label', LABEL)]
# Read the data
twitterDataset = data.TabularDataset(
    path = 'training-processed.csv',
    format = 'CSV',
    fields = fields,
    skip_header = False
)
# Split into train, test, val
train, test, val = twitterDataset.split(split_ratio=[0.8, 0.1, 0.1], stratified=True, strata_field='label')
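To sanity-check what was loaded, one thing I would try (my own addition; legacy torchtext Examples keep their fields as plain attributes, so vars() shows one processed row):

print(len(train), len(val), len(test))   # sizes after the 80/10/10 split
print(vars(train.examples[0]))           # e.g. {'tweet': [...tokens...], 'label': '...'}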
2. Build the vocabulary, set up tokenization, and so on
(Note these two Field objects must already exist before step 1's fields list references them; the notes just present the steps in this order.)
LABEL = data.LabelField()        # the label field
TWEET = data.Field(lower=True)   # the content/text field
# Build the vocabularies
vocab_size = 20000
TWEET.build_vocab(train, max_size=vocab_size)
LABEL.build_vocab(train)
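A quick way to inspect the result (my own addition; itos, stoi, and freqs are standard attributes of the legacy torchtext Vocab):

print(len(TWEET.vocab))        # should be vocab_size + 2, counting <unk> and <pad>
print(TWEET.vocab.itos[:10])   # the most frequent tokens
print(LABEL.vocab.stoi)        # label-string -> index mapping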
3. OK, I can't describe this part clearly yet — leaving it here for now, I'll come back to it later.
In short, torchtext is a very handy package for text preprocessing.
It kept failing on my machine and I still can't fix it, so I can't run the code; for now I can only read the tutorials and learn how it is used.
An introduction to the torchtext.data.Field parameters:
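Since I could not run this locally, here is my annotated summary of the commonly used Field parameters (legacy API; the comments are my understanding — check the official docs):

TEXT = Field(
    sequential=True,     # the data is a token sequence, so tokenization applies
    use_vocab=True,      # numericalize via a Vocab built with build_vocab()
    lower=True,          # lowercase all text
    tokenize=str.split,  # callable that splits raw text into tokens
    fix_length=10,       # pad/truncate every example to exactly this length
    batch_first=True,    # produced tensors are (batch_size, seq_len)
    init_token=None,     # optional token to prepend, e.g. '<sos>'
    eos_token=None,      # optional token to append, e.g. '<eos>'
    pad_token='<pad>',   # token used for padding
)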
Later I found that a Kaggle notebook does not have this problem — the code runs there, on both CPU and GPU!
So it is probably a version mismatch between my PyTorch and torchtext.
The GitHub page mentions this: the repository only works with torchtext 0.9 or later, which requires PyTorch 1.8 or later.
After converting every word in a sentence into a word vector, take the mean of those vectors (or their sum, or their max) to get a single fixed-size sentence representation.
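A minimal sketch of this idea (my own illustration; the vocabulary size, embedding dimension, and token ids are made-up values):

import torch
import torch.nn as nn

embedding = nn.Embedding(num_embeddings=100, embedding_dim=8)
token_ids = torch.tensor([3, 17, 42, 5])     # one tokenized sentence, 4 words
vectors = embedding(token_ids)               # shape: (4, 8), one row per word

sentence_mean = vectors.mean(dim=0)          # mean pooling -> shape (8,)
sentence_sum = vectors.sum(dim=0)            # sum pooling  -> shape (8,)
sentence_max = vectors.max(dim=0).values     # max pooling  -> shape (8,)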
In the GitHub German-to-English translation example, spaCy is used to define the tokenizer functions.
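Presumably something like the following (a sketch from memory of that tutorial, not its exact code; it assumes the de_core_news_sm and en_core_web_sm models are installed, e.g. via python -m spacy download de_core_news_sm):

import spacy
from torchtext.data import Field  # torchtext.legacy.data on 0.9+

spacy_de = spacy.load('de_core_news_sm')
spacy_en = spacy.load('en_core_web_sm')

def tokenize_de(text):
    # split German text into a list of token strings
    return [tok.text for tok in spacy_de.tokenizer(text)]

def tokenize_en(text):
    # split English text into a list of token strings
    return [tok.text for tok in spacy_en.tokenizer(text)]

SRC = Field(tokenize=tokenize_de, init_token='<sos>', eos_token='<eos>', lower=True)
TRG = Field(tokenize=tokenize_en, init_token='<sos>', eos_token='<eos>', lower=True)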