This post summarizes 22 datasets.
ORB (An Open Reading Benchmark) is an evaluation server that tests a single reading comprehension model's performance on diverse datasets. It contains a suite of seven existing datasets (DROP, ROPES, SQuAD1.1, SQuAD2.0, Quoref, NewsQA, NarrativeQA) and synthetic augmentations from various adversarial models, which test a model's capability to handle various linguistic artifacts in a single unified model.
A Large-Scale Person-Centered Cloze Dataset. We have constructed a new "Who-did-What" dataset of over 200,000 fill-in-the-gap (cloze) multiple choice reading comprehension problems constructed from the LDC English Gigaword newswire corpus. The WDW dataset has a variety of novel features. First, in contrast with the CNN and Daily Mail datasets (Hermann et al., 2015), we avoid using article summaries for question formation. Instead, each problem is formed from two independent articles: an article given as the passage to be read and a separate article on the same events used to form the question. Second, we avoid anonymization: each choice is a person named entity. Third, the problems have been filtered to remove a fraction that are easily solved by simple baselines, while remaining 84% solvable by humans. We report performance benchmarks of standard systems and propose the WDW dataset as a challenge task for the community.
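To make that structure concrete, here is a purely hypothetical record (the passage, question, and names are invented for illustration and are not actual WDW data):

```python
# Hypothetical illustration of a Who-did-What cloze problem (invented data, not from WDW).
# The question comes from one article, the passage from a second article on the same
# events, and every choice is a person named entity (no anonymization).
wdw_example = {
    "passage": "Full text of the article shown to the reader ...",
    "question": "XXX met the visiting delegation in the capital on Tuesday.",  # XXX marks the blank
    "choices": ["Person A", "Person B", "Person C"],
    "answer_index": 1,  # index of the correct person entity
}
```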
GLUE: https://gluebenchmark.com/

Name | Metric
---|---
The Corpus of Linguistic Acceptability | Matthew's Corr
The Stanford Sentiment Treebank | Accuracy
Microsoft Research Paraphrase Corpus | F1 / Accuracy
Semantic Textual Similarity Benchmark | Pearson-Spearman Corr
Quora Question Pairs | F1 / Accuracy
MultiNLI Matched | Accuracy
MultiNLI Mismatched | Accuracy
Question NLI | Accuracy
Recognizing Textual Entailment | Accuracy
Winograd NLI | Accuracy
Diagnostics Main | Matthew's Corr
SuperGLUE: https://super.gluebenchmark.com/

Name | Identifier | Metric
---|---|---
Broadcoverage Diagnostics | AX-b | Matthew's Corr
CommitmentBank | CB | Avg. F1 / Accuracy
Choice of Plausible Alternatives | COPA | Accuracy
Multi-Sentence Reading Comprehension | MultiRC | F1a / EM
Recognizing Textual Entailment | RTE | Accuracy
Words in Context | WiC | Accuracy
The Winograd Schema Challenge | WSC | Accuracy
BoolQ | BoolQ | Accuracy
Reading Comprehension with Commonsense Reasoning | ReCoRD | F1 / Accuracy
Winogender Schema Diagnostics | AX-g | Gender Parity / Accuracy
example:

    {
      "question_type": "YES_NO",
      "question": "上海迪士尼可以带吃的进去吗",
      "documents": [
        { "paragraphs": ["text paragraph 1", "text paragraph 2"] }
      ],
      "answers": [
        "完全密封的可以,其它不可以。",                                           // answer 1
        "可以的,不限制的。只要不是易燃易爆的危险物品,一般都可以带进去的。",          // answer 2
        "罐装婴儿食品、包装完好的果汁、水等饮料及包装完好的食物都可以带进乐园,但游客自己在家制作的食品是不能入园,因为自制食品有一定的安全隐患。"  // answer 3
      ],
      "yesno_answers": [
        "Depends",   // corresponding to answer 1
        "Yes",       // corresponding to answer 2
        "Depends"    // corresponding to answer 3
      ]
    }
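A minimal Python sketch of reading records in this format (the inline comments in the example above are annotations and would not appear in the real JSON; the file name below is hypothetical, assuming one record per line):

```python
import json

# Hypothetical file with one JSON record per line, in the format shown above.
with open("mrc_sample.jsonl", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        print(record["question_type"], record["question"])
        # Pair each free-text answer with its Yes / No / Depends label.
        for answer, label in zip(record["answers"], record["yesno_answers"]):
            print(f"  [{label}] {answer}")
```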
This paper presents the first Chinese legal reading comprehension dataset. It contains roughly 10,000 documents, mainly first-instance civil and criminal judgments collected from China Judgements Online (中国裁判文书网). The fact-description portion of each judgment (the "经审理查明" findings-of-fact section or the "原告诉称" plaintiff's-claims section) is extracted and annotated with questions, yielding about 50,000 question-answer pairs. The dataset covers several question types, including span-extraction questions, yes/no questions, and unanswerable questions, with the goal of covering most question types encountered in real-world scenarios. We hope the dataset will further advance research on legal-domain tasks such as element extraction, question answering, and recommendation. Taking element extraction as an example, the traditional approach requires predefining a large number of labels, and the variety of judgment types and causes of action (charges) makes label definition laborious; reading comprehension techniques can alleviate this problem to some extent.
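As a purely hypothetical illustration of the three question types (the actual release format of the dataset is not shown here), records in a SQuAD-2.0-style layout might look like this:

```python
# Hypothetical records (invented, not actual dataset content) illustrating the three
# question types: span extraction, yes/no, and unanswerable.
examples = [
    {"question_type": "span",         "question": "...?", "answer": {"text": "...", "start": 42}},
    {"question_type": "yes_no",       "question": "...?", "answer": "NO"},
    {"question_type": "unanswerable", "question": "...?", "answer": "", "is_impossible": True},
]
```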
The main contributions of this work are as follows:
TensorFlow Datasets is a collection of datasets ready to use with TensorFlow or other Python machine learning frameworks such as Jax. All datasets are exposed as tf.data.Datasets, enabling easy-to-use and high-performance input pipelines. To get started, see the guide and the list of datasets.
Note: The datasets documented here are from HEAD and so not all are available in the current tensorflow-datasets package. They are all accessible in our nightly package tfds-nightly.
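A minimal usage sketch ("mnist" is chosen only as an example from the catalog below; datasets that are only in HEAD may require tfds-nightly, as noted above):

```python
import tensorflow_datasets as tfds

# Load the train split as a tf.data.Dataset of feature dictionaries.
ds = tfds.load("mnist", split="train", shuffle_files=True)

for example in ds.take(1):
    image, label = example["image"], example["label"]
    print(image.shape, label.numpy())
```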
Audio
common_voice
crema_d
dementiabank
fuss
groove
librispeech
libritts
ljspeech
nsynth
savee
speech_commands
tedlium
vctk
voxceleb
voxforge
Image
abstract_reasoning
aflw2k3d
arc
binarized_mnist
celeb_a
celeb_a_hq
clevr
clic
coil100
div2k
downsampled_imagenet
dsprites
duke_ultrasound
flic
lost_and_found
lsun
nyu_depth_v2
scene_parse150
shapes3d
the300w_lp
Image classification
beans
bigearthnet
binary_alpha_digits
caltech101
caltech_birds2010
caltech_birds2011
cars196
cassava
cats_vs_dogs
cifar10
cifar100
cifar10_1
cifar10_corrupted
citrus_leaves
cmaterdb
colorectal_histology
colorectal_histology_large
curated_breast_imaging_ddsm
cycle_gan
deep_weeds
diabetic_retinopathy_detection
dmlab
dtd
emnist
eurosat
fashion_mnist
food101
geirhos_conflict_stimuli
horses_or_humans
i_naturalist2017
imagenet2012
imagenet2012_corrupted
imagenet2012_real
imagenet2012_subset
imagenet_a
imagenet_resized
imagenet_v2
imagenette
imagewang
kmnist
lfw
malaria
mnist
mnist_corrupted
omniglot
oxford_flowers102
oxford_iiit_pet
patch_camelyon
pet_finder
places365_small
plant_leaves
plant_village
plantae_k
quickdraw_bitmap
resisc45
rock_paper_scissors
smallnorb
so2sat
stanford_dogs
stanford_online_products
stl10
sun397
svhn_cropped
tf_flowers
uc_merced
vgg_face2
visual_domain_decathlon
Object detection
Question answering
Structured
Summarization
aeslc
big_patent
billsum
cnn_dailymail
covid19sum
gigaword
multi_news
newsroom
opinion_abstracts
opinosis
reddit
reddit_tifu
samsum
scientific_papers
wikihow
xsum
Text
anli
blimp
c4
cfq
civil_comments
clinc_oos
cos_e
definite_pronoun_resolution
eraser_multi_rc
esnli
gap
glue
goemotions
imdb_reviews
irc_disentanglement
librispeech_lm
lm1b
math_dataset
movie_rationales
multi_nli
multi_nli_mismatch
openbookqa
pg19
qa4mre
reddit_disentanglement
scan
scicite
snli
super_glue
tiny_shakespeare
wiki40b
wikipedia
wikipedia_toxicity_subtypes
winogrande
wordnet
xnli
yelp_polarity_reviews
Translate
flores
opus
para_crawl
ted_hrlr_translate
ted_multi_translate
wmt14_translate
wmt15_translate
wmt16_translate
wmt17_translate
wmt18_translate
wmt19_translate
wmt_t2t_translate
Video