'''
Description: object has no attribute 'get_vocab'
Author: 365JHWZGo
Date: 2021-12-07 11:45:13
LastEditors: 365JHWZGo
LastEditTime: 2021-12-07 12:45:34
'''
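While writing a news classification example today, I found that the code from the Heima (黑马) course no longer runs.
1. text_classification no longer exists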
train_dataset, test_dataset = text_classification.DATASETS['AG_NEWS'](root=load_data_path)
Solution
train_dataset, test_dataset = torchtext.datasets.AG_NEWS(root=path,split=('train',"test"))
2. There is no get_vocab method
Solution
First, look at the declaration of AG_NEWS.
Open ag_news.py:
def AG_NEWS(root, split):
    path = download_from_url(URL[split], root=root,
                             path=os.path.join(root, split + ".csv"),
                             hash_value=MD5[split],
                             hash_type='md5')
    return _RawTextIterableDataset(DATASET_NAME, NUM_LINES[split],
                                   _create_data_from_csv(path))
Next, open the _RawTextIterableDataset class, which is defined in datasets_utils.py:
class _RawTextIterableDataset(torch.utils.data.IterableDataset):
    """Defines an abstraction for raw text iterable datasets.
    """

    def __init__(self, description, full_num_lines, iterator):
        """Initiate the dataset abstraction.
        """
        super(_RawTextIterableDataset, self).__init__()
        self.description = description
        self.full_num_lines = full_num_lines
        self._iterator = iterator
        self.num_lines = full_num_lines
        self.current_pos = None

    def __iter__(self):
        return self

    def __next__(self):
        if self.current_pos == self.num_lines - 1:
            raise StopIteration
        item = next(self._iterator)
        if self.current_pos is None:
            self.current_pos = 0
        else:
            self.current_pos += 1
        return item

    def __len__(self):
        return self.num_lines

    def pos(self):
        """
        Returns current position of the iterator. This returns None
        if the iterator hasn't been used yet.
        """
        return self.current_pos

    def __str__(self):
        return self.description
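One consequence of the class above is that the dataset is a single-pass iterator: `__iter__` returns `self`, and once `current_pos` reaches `num_lines - 1` every later `__next__` call raises `StopIteration`. The sketch below reproduces only that iteration logic in a simplified stand-in. The name `OnePassDataset` and the two sample rows are hypothetical and are not part of torchtext.

```python
class OnePassDataset:
    """Hypothetical stand-in that copies only the iteration logic shown above."""

    def __init__(self, iterator, num_lines):
        self._iterator = iterator
        self.num_lines = num_lines
        self.current_pos = None

    def __iter__(self):
        # The dataset is its own iterator, so there is no way to restart it.
        return self

    def __next__(self):
        if self.current_pos == self.num_lines - 1:
            raise StopIteration
        item = next(self._iterator)
        if self.current_pos is None:
            self.current_pos = 0
        else:
            self.current_pos += 1
        return item


# Made-up (label, text) rows, only for illustration.
rows = [(3, "Wall St. rallies"), (4, "New chip announced")]
ds = OnePassDataset(iter(rows), len(rows))

first_pass = list(ds)   # reads both rows
second_pass = list(ds)  # nothing left to read
print(len(first_pass), len(second_pass))
```

Any helper that walks the whole dataset therefore leaves it exhausted, so the dataset has to be created again before it is used for training.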
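There really is no get_vocab function here, so let's implement its functionality ourselves!
First, be clear about what it needs to do: count the total number of distinct words in train_dataset.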
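# Added as a method of _RawTextIterableDataset in datasets_utils.py.
# Requires `import string` at the top of that file.
def get_vocab(self):
    """Return the number of distinct words in the dataset.

    Note: this reads every remaining item, so the dataset is exhausted afterwards.
    """
    # Translation table that deletes all ASCII punctuation.
    remove = str.maketrans("", "", string.punctuation)
    vocab = set()
    for _ in range(self.num_lines):
        # Each item is a (label, text) pair; index 1 is the text.
        sub_content = self.__next__()[1].lower()
        vocab.update(sub_content.translate(remove).split())
    return len(vocab)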
Run the test code:
print(train_dataset.get_vocab())
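Editing the installed torchtext source works, but the change is lost whenever the package is reinstalled. As an alternative, here is a minimal sketch of the same counting logic as a standalone function. It assumes only that the dataset yields (label, text) pairs, as the `[1]` index in get_vocab implies. The name `count_vocab` and the sample rows are hypothetical.

```python
import string


def count_vocab(samples):
    """Count distinct lowercase, punctuation-free words in (label, text) pairs."""
    # Translation table that deletes all ASCII punctuation.
    remove_punct = str.maketrans("", "", string.punctuation)
    vocab = set()
    for _label, text in samples:
        vocab.update(text.lower().translate(remove_punct).split())
    return len(vocab)


# Made-up rows standing in for the real dataset.
samples = [
    (3, "Stocks rise, stocks fall."),
    (4, "New chip: faster, cheaper!"),
]
vocab_size = count_vocab(samples)
print(vocab_size)
```

With the real data this would be called as `count_vocab(train_dataset)` on a freshly created train_dataset, since the call reads the iterator to the end.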