
Common KBQA Datasets: ComplexWebQuestions (CWQ)


Contents

1. Paper

2. Dataset overview

      2.1 Contents

      2.2 Statistics

3. Model performance comparison


1. Paper

      ComplexWebQuestions [Talmor and Berant 2018b]

      Source paper: The Web as a Knowledge-base for Answering Complex Questions

      Dataset: https://www.dropbox.com/sh/7pkwkrfnwqhsnpo/AACuu4v3YNkhirzBOeeaHYala

      Leaderboard: Leaderboard | tau-nlp

2. Dataset overview

    2.1 Contents

      CWQ (ComplexWebQuestions) is built on the Freebase knowledge base. The dataset consists of question files and web snippet files.

      The question files contain the following fields:

Field | Description
ID | The unique ID of the example.
webqsp_ID | The original WebQuestionsSP ID from which the question was constructed.
webqsp_question | The WebQuestionsSP question from which the question was constructed.
machine_question | The artificial complex question, before paraphrasing.
question | The natural language complex question.
sparql | Freebase SPARQL query for the question. Note that the SPARQL was constructed for the machine question; the actual question after paraphrasing may differ from the SPARQL.
compositionality_type | An estimate of the compositionality type, one of {composition, conjunction, comparative, superlative}. The estimate has not been manually verified, and the question after paraphrasing may differ from it.
answers | A list of answers, each containing answer (the actual answer), answer_id (the Freebase answer ID), and aliases (aliases for the answer extracted from Freebase).
created | Creation time.
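To make the field list concrete, here is a minimal sketch of reading a question record and collecting the answer strings that could count as correct. It assumes the question file is a JSON list of objects with the fields described above; the sample record and all of its values are made up for illustration, and `load_questions` and `accepted_answers` are hypothetical helper names.

```python
import json

# Hypothetical record that mirrors the documented fields; every value is invented.
SAMPLE_JSON = """
[
  {
    "ID": "example-0001",
    "webqsp_ID": "example-webqsp-0001",
    "webqsp_question": "what is the capital of exampleland",
    "machine_question": "what is the capital of the country that borders exampleland",
    "question": "Which city is the capital of the country next to Exampleland?",
    "sparql": "SELECT ?x WHERE { }",
    "compositionality_type": "composition",
    "answers": [
      {"answer": "Sample City", "answer_id": "m.000000", "aliases": ["Sample Town"]}
    ],
    "created": "2018-01-01T00:00:00"
  }
]
"""


def load_questions(text):
    """Parse question-file content, assumed to be a JSON list of records."""
    return json.loads(text)


def accepted_answers(example):
    """Return the lowercased answer strings and aliases of one example."""
    forms = set()
    for ans in example["answers"]:
        if ans.get("answer"):
            forms.add(ans["answer"].lower())
        forms.update(alias.lower() for alias in ans.get("aliases", []))
    return forms


questions = load_questions(SAMPLE_JSON)
first = questions[0]
print(first["question"], "->", sorted(accepted_answers(first)))
```

Because the SPARQL query and the compositionality type were produced for the machine question, it is safer to treat them as noisy hints rather than ground truth for the paraphrased question.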

       The web snippet files contain the following fields:

Field | Description
question_ID | The ID of the related question. Each ID appears in at least 3 records (full question, split1, split2).
question | The natural language complex question.
web_query | The query sent to the search engine.
split_source | 'noisy supervision split' or 'ptrnet split'. Train on examples containing 'ptrnet split' when comparing to Split+Decomp from https://arxiv.org/abs/1807.09623.
split_type | 'full_question', 'split_part1', or 'split_part2'. When training a reading comprehension model on splits, as in Split+Decomp from https://arxiv.org/abs/1807.09623, use 'composition_answer' for questions of type composition with split_type 'split_part1'; in all other cases use the original answer.
web_snippets | About 100 web snippets per query. Each snippet includes a Title and a Snippet.
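The selection rules in the split_source and split_type descriptions can be sketched as two small helpers. This is an illustration under stated assumptions, not the authors' preprocessing code: it assumes both fields are plain strings spelled exactly as quoted above, that 'composition_answer' is a single string stored on the question record, and the helper names and sample records are hypothetical.

```python
def is_ptrnet_split(snippet_record):
    """Keep only records usable for a comparison with Split+Decomp."""
    return snippet_record.get("split_source") == "ptrnet split"


def training_answers(question, snippet_record):
    """Pick the supervision target for one (question, snippet record) pair.

    For composition questions, the first split is trained against the
    intermediate 'composition_answer'; everything else uses the original answers.
    """
    use_intermediate = (
        question.get("compositionality_type") == "composition"
        and snippet_record.get("split_type") == "split_part1"
    )
    if use_intermediate:
        # Assumption: composition_answer is one string, wrapped here in a list.
        return [question["composition_answer"]]
    return [a["answer"] for a in question["answers"]]


# Hypothetical records; all values are invented.
question = {
    "ID": "example-0001",
    "compositionality_type": "composition",
    "composition_answer": "Neighborland",
    "answers": [{"answer": "Sample City", "answer_id": "m.000000", "aliases": []}],
}
records = [
    {"question_ID": "example-0001", "split_source": "ptrnet split", "split_type": "full_question"},
    {"question_ID": "example-0001", "split_source": "ptrnet split", "split_type": "split_part1"},
    {"question_ID": "example-0001", "split_source": "noisy supervision split", "split_type": "split_part1"},
]

kept = [r for r in records if is_ptrnet_split(r)]
targets = [training_answers(question, r) for r in kept]
print(targets)
```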

    2.2 Statistics

Question file splits
Split | Questions
Train | 27,734
Dev | 3,480
Test | 3,475
Total | 34,689

Web snippet file splits
Split | Snippets
Train | 10,035,571
Dev | 1,350,950
Test | 1,339,468
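As a quick sanity check on these figures, the short sketch below recomputes the total, the share of each split, and the average number of snippets per question. The numbers are copied from the statistics above; the variable names are only illustrative.

```python
# Counts copied from the statistics above.
question_counts = {"train": 27_734, "dev": 3_480, "test": 3_475}
snippet_counts = {"train": 10_035_571, "dev": 1_350_950, "test": 1_339_468}

total_questions = sum(question_counts.values())
total_snippets = sum(snippet_counts.values())
print(f"questions: {total_questions:,}  snippets: {total_snippets:,}")

for split, n in question_counts.items():
    share = 100 * n / total_questions
    per_question = snippet_counts[split] / n
    print(f"{split}: {share:.1f}% of questions, {per_question:.0f} snippets per question")
```

The split is close to 80/10/10, and each question comes with roughly 360 to 390 snippets, which is consistent with at least three queries per question at about 100 snippets each.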

3. Model performance comparison

Performance of models on ComplexWebQuestions

Metrics covered: Accuracy, Precision, Hit@1, F1. Each paper reports only a subset of these, so scores from different rows are not necessarily the same metric.

Model | Year | Reported scores | Paper | Code
TextRay | 2019 | 40.83, 33.87 | Learning to Answer Complex Questions over Knowledge Bases with Query Composition | GitHub: umich-dbgroup/TextRay-Release
PullNet | 2019 | 47.2 | PullNet: Open Domain Question Answering with Iterative Retrieval on Knowledge Bases and Text | -
FullModel | 2019 | 39.3, 36.5 | Knowledge Base Question Answering with Topic Units | -
HSP | 2019 | 66.18 | Complex Question Decomposition for Semantic Parsing | https://github.com/cairohy/hsp
QGG | 2020 | 44.1, 40.4 | Query Graph Generation for Answering Multi-hop Complex Questions from Knowledge Bases | GitHub: lanyunshi/Multi-hopComplexKBQA
SPARQA | 2020 | 31.57 | SPARQA: Skeleton-based Semantic Parsing for Complex Questions over Knowledge Bases | GitHub: nju-websoft/SPARQA
MULTIQUE | 2020 | 41.23, 34.62 | Answering Complex Questions by Combining Information from Curated and Extracted Knowledge Bases | -
Rigel-intersect | 2021 | 48.7 | Expanding End-to-End Question Answering on Differentiable Knowledge Graphs with Intersection | -
TransferNet | 2021 | 48.6 | TransferNet: An Effective and Transparent Framework for Multi-hop Question Answering over Relation Graph | GitHub: shijx12/TransferNet
NSM | 2021 | 47.6 | Improving Multi-hop Knowledge Base Question Answering by Learning Intermediate Supervision Signals | https://github.com/RichardHGL/WSDM2021_NSM
BERT-Large | 2021 | 66.4, 68.2 | Unseen Entity Handling in Complex Question Answering over Knowledge Base via Language Generation | -
shrink KB | 2021 | 46.2 | Improving Query Graph Generation for Complex Question Answering over Knowledge Base | -
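For readers unfamiliar with the metrics named in the comparison, here is a minimal sketch of how Hit@1 and F1 are commonly computed for KBQA, averaged over questions. It is not the official evaluation script, and individual papers may define these metrics differently (for example in how empty predictions are handled); the function names and the toy data are hypothetical.

```python
def hits_at_1(ranked_predictions, gold):
    """Return 1.0 if the top-ranked prediction is a gold answer, else 0.0."""
    return float(bool(ranked_predictions) and ranked_predictions[0] in gold)


def f1(predicted, gold):
    """Set-based F1 between the predicted and gold answer sets of one question."""
    predicted, gold = set(predicted), set(gold)
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy data: (ranked predictions, gold answers) per question; values are invented.
examples = [
    (["a", "b"], {"a", "c"}),
    (["x"], {"y"}),
]

mean_hits = sum(hits_at_1(p, g) for p, g in examples) / len(examples)
mean_f1 = sum(f1(p, g) for p, g in examples) / len(examples)
print(f"Hit@1 = {mean_hits:.2f}, F1 = {mean_f1:.2f}")
```

Hit@1 only looks at the single top prediction, while F1 rewards returning the full answer set, which is one reason the two numbers for the same model can differ noticeably.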

 

This post will be updated over time. Additions and corrections in the comments are welcome.
